|
|
Created:
8 years ago by please use gerrit instead Modified:
8 years ago Reviewers:
groby-ooo-7-16 CC:
chromium-reviews, rpetterson Base URL:
https://chromium.googlesource.com/chromium/deps/hunspell.git@master Visibility:
Public. |
DescriptionFix array-out-of-bounds error in hunspell
If you invoke SuggestMgr::forgotchar_utf() with wl=99, then the method will
write past the candidate_utf[MAXSWL] array. Here's a step through of what
happens:
int wl = 99; // word length is 99 charachters.
w_char candidate_utf[MAXSWL]; // buffer size is 100 chars.
w_char * p = candidate_utf + wl; // p = candidate_utf + 99.
*(p + 1) = *p; // writing to p + 1, which is candidate_utf + 100.
The fix is to reduce maximum length of spellchecked words from 99 to 98 characters.
Corresponding upstream bug report:
https://sourceforge.net/tracker/?func=detail&aid=3595024&group_id=143754&atid=756395
BUG=130128
Committed: 174476
Patch Set 1 #
Total comments: 3
Patch Set 2 : Better fix #Patch Set 3 : Update google patch #Patch Set 4 : Address comments #
Total comments: 5
Patch Set 5 : Merge with master #
Messages
Total messages: 24 (0 generated)
Groby: PTAL.
https://codereview.chromium.org/11442040/diff/1/src/hunspell/suggestmgr.cxx File src/hunspell/suggestmgr.cxx (right): https://codereview.chromium.org/11442040/diff/1/src/hunspell/suggestmgr.cxx#n... src/hunspell/suggestmgr.cxx:849: char candidate[MAXSWUTF8L + 4]; does this work for 100 character pastes now? And 101? ;) (I.e. where do we clip wl to be < MAXSWL)?
https://codereview.chromium.org/11442040/diff/1/src/hunspell/suggestmgr.cxx File src/hunspell/suggestmgr.cxx (right): https://codereview.chromium.org/11442040/diff/1/src/hunspell/suggestmgr.cxx#n... src/hunspell/suggestmgr.cxx:849: char candidate[MAXSWUTF8L + 4]; On 2012/12/12 01:19:38, groby wrote: > does this work for 100 character pastes now? And 101? ;) > > (I.e. where do we clip wl to be < MAXSWL)? Words of 100 characters or more do not get passed into this method from experience. As for the exact location of clipping wl to be < MAXSWL, I will step through the code tomorrow, because browsing the code was not fruitful.
https://codereview.chromium.org/11442040/diff/1/src/hunspell/suggestmgr.cxx File src/hunspell/suggestmgr.cxx (right): https://codereview.chromium.org/11442040/diff/1/src/hunspell/suggestmgr.cxx#n... src/hunspell/suggestmgr.cxx:849: char candidate[MAXSWUTF8L + 4]; On 2012/12/12 05:33:53, Rouslan Solomakhin wrote: > On 2012/12/12 01:19:38, groby wrote: > > does this work for 100 character pastes now? And 101? ;) > > > > (I.e. where do we clip wl to be < MAXSWL)? > > Words of 100 characters or more do not get passed into this method from > experience. As for the exact location of clipping wl to be < MAXSWL, I will step > through the code tomorrow, because browsing the code was not fruitful. The clip of wl to be < MAXSWL happens on line 140 in hunspell.cxx: if (*nc >= MAXWORDLEN) return 0; (MAXWORDLEN is defined to be 100 in atypes.hxx.) This sets wl=0 in Hunspell::suggest(), which prevents calling SuggestMgr::suggest().
Wondering if a better fix would be to change that line to if (*nc *>* MAXWORDLEN) return 0; (i.e. instead of less-or-equal). After all, if *any* other places use MAXSWL, we'd potentially need to add a +1 there too, while changing the if statement cuts it off higher in the call graph. On Wed, Dec 12, 2012 at 1:00 PM, <rouslan@chromium.org> wrote: > > https://codereview.chromium.**org/11442040/diff/1/src/** > hunspell/suggestmgr.cxx<https://codereview.chromium.org/11442040/diff/1/src/hunspell/suggestmgr.cxx> > File src/hunspell/suggestmgr.cxx (right): > > https://codereview.chromium.**org/11442040/diff/1/src/** > hunspell/suggestmgr.cxx#**newcode849<https://codereview.chromium.org/11442040/diff/1/src/hunspell/suggestmgr.cxx#newcode849> > src/hunspell/suggestmgr.cxx:**849: char candidate[MAXSWUTF8L + 4]; > On 2012/12/12 05:33:53, Rouslan Solomakhin wrote: > >> On 2012/12/12 01:19:38, groby wrote: >> > does this work for 100 character pastes now? And 101? ;) >> > >> > (I.e. where do we clip wl to be < MAXSWL)? >> > > Words of 100 characters or more do not get passed into this method >> > from > >> experience. As for the exact location of clipping wl to be < MAXSWL, I >> > will step > >> through the code tomorrow, because browsing the code was not fruitful. >> > > > The clip of wl to be < MAXSWL happens on line 140 in hunspell.cxx: > > if (*nc >= MAXWORDLEN) return 0; > > (MAXWORDLEN is defined to be 100 in atypes.hxx.) > > This sets wl=0 in Hunspell::suggest(), which prevents calling > SuggestMgr::suggest(). > > https://codereview.chromium.**org/11442040/<https://codereview.chromium.org/1... >
On 2012/12/12 21:05:45, groby wrote: > Wondering if a better fix would be to change that line to > > if (*nc *>* MAXWORDLEN) return 0; > > (i.e. instead of less-or-equal). After all, if *any* other places use > MAXSWL, we'd potentially need to add a +1 there too, while changing the if > statement cuts it off higher in the call graph. Such change would make us accept words that are 100 chars in length, but we'll still crash in SuggestMgr::forgotchar_utf() when we try to write to 100th position of 100-char buffer for character insertion (candidate_utf). If we go with this change, we'd have to increase candidate_utf by 2 characters. Perhaps you meant: if (*nc >= MAXWORDLEN - 1) return 0; That would limit checked words to 98 chars, which the existing 100-char buffers for character insertion will handle fine.
That's what I meant. Clearly, my brain is still a bit flu-y ;) - rachel On Wed, Dec 12, 2012 at 1:11 PM, <rouslan@chromium.org> wrote: > On 2012/12/12 21:05:45, groby wrote: > >> Wondering if a better fix would be to change that line to >> > > if (*nc *>* MAXWORDLEN) return 0; >> > > (i.e. instead of less-or-equal). After all, if *any* other places use >> >> MAXSWL, we'd potentially need to add a +1 there too, while changing the if >> statement cuts it off higher in the call graph. >> > > Such change would make us accept words that are 100 chars in length, but > we'll > still crash in SuggestMgr::forgotchar_utf() when we try to write to 100th > position of 100-char buffer for character insertion (candidate_utf). > > If we go with this change, we'd have to increase candidate_utf by 2 > characters. > > Perhaps you meant: > > if (*nc >= MAXWORDLEN - 1) return 0; > > That would limit checked words to 98 chars, which the existing 100-char > buffers > for character insertion will handle fine. > > https://codereview.chromium.**org/11442040/<https://codereview.chromium.org/1... >
Would you mind reducing the max word length from 99 to 98?
On 2012/12/13 00:28:47, Rouslan Solomakhin wrote: > Would you mind reducing the max word length from 99 to 98? I don't really have an issue with that, but ultimately, that should be cleared with the hunspell people. Also, until it's upstreamed, that change should be in google.patch so it gets re-applied if we roll a new hunspell.
Perhaps a cleaner solution would be to reduce MAXWORDLEN to 99.
On 2012/12/13 16:18:25, Rouslan Solomakhin wrote: > Perhaps a cleaner solution would be to reduce MAXWORDLEN to 99. No, it would not, because MAXWORDLEN is used in many places to allocate buffers.
Groby: PTAL. I've regenerated google.patch using the difference between our hunspell and hunspell-1.3.2 from sourceforge. I am not sure why the difference is so huge.
Well, for one, it seems that the google patch diffs against hunspell 1.41 :) I'd ask previous committers to the patch what exactly they did.
Hironori: How did you generate google.patch originally? It does not appear to be a diff from hunspell-1.3.2, which is what README.chromium says.
Greetings, If I recall correctly, it is a diff between hunspell.cvs.sourceforge.net (the cvs repository of the original hunspell) and our code. (Even though I have updated this file when I rolled hunspell 1.3.2, I have not monitored all hunspell changes and this file may become out of sync.) By the way, for a change to a third-party library, I strongly recommend to upstream it first and merge the upstreamed change to Chromium unless there is a resson why we should not upstream it. (We have not upstreamed the code that reads/writes BDICT files because it is heavily dependent on the Chromium code and we cannot compmile it without the Chromium code.) If it is better not to upstream this change, I recommend to enclose your modified code with #ifdef HUNSPELL_CHROME_CLIENT and #endif to tell we should leave it when merging upstream changes. Regards, Hironori Bono
On 2012/12/16 05:32:45, Hironori Bono wrote: > By the way, for a change to a third-party library, I strongly recommend to > upstream it first and merge the upstreamed change to Chromium unless there is a > resson why we should not upstream it. Hironori: I think this should be upstreamed. The upstream bug-report with patch is 3 days old without reply from maintainer. How long does upstreaming take? Is bug-tracking correct way of contacting the maintainer?
On 2012/12/16 20:29:22, Rouslan Solomakhin wrote: > On 2012/12/16 05:32:45, Hironori Bono wrote: > > By the way, for a change to a third-party library, I strongly recommend to > > upstream it first and merge the upstreamed change to Chromium unless there is > a > > resson why we should not upstream it. > Greetings, Unfortunately, I do not have ideas about when your upstream change is landed to hunspell. (Even though hunspell people checks upstream patches pretty often, they may take holidays now.) When it takes time to upstream a change, it is pretty reasonable to land an interim change and to revert it when the upstream change is landed. (I recommend to add a comment to refer the URL of an upstream bug in this case.) Regards, Hironori Bono
Groby: PTAL. - Added a comment linking to the upstream change next to the interim fix. - Diffed against Hunspell in CVS from March 23 2012 to reduce the number of changes to google.patch.
LGTM, but you might want to do the suggested .patch update first. https://codereview.chromium.org/11442040/diff/24001/google.patch File google.patch (right): https://codereview.chromium.org/11442040/diff/24001/google.patch#newcode911 google.patch:911: wordbreak = pAMgr->get_breaktable(); This one and the free statement a bit later are odd additions. Can you verify that they're from previous CLs that forgot to update the .patch, just to make sure? https://codereview.chromium.org/11442040/diff/24001/google.patch#newcode1214 google.patch:1214: + bdict_reader = reader; This shouldn't be part of the patch either. The purist in me says that you should first land a CL without your changes, that just brings the .patch file up to date...
https://codereview.chromium.org/11442040/diff/24001/google.patch File google.patch (right): https://codereview.chromium.org/11442040/diff/24001/google.patch#newcode1214 google.patch:1214: + bdict_reader = reader; On 2012/12/20 22:56:28, groby wrote: > This shouldn't be part of the patch either. The purist in me says that you > should first land a CL without your changes, that just brings the .patch file up > to date... SGTM, let me do that first.
https://codereview.chromium.org/11442040/diff/24001/google.patch File google.patch (right): https://codereview.chromium.org/11442040/diff/24001/google.patch#newcode1214 google.patch:1214: + bdict_reader = reader; On 2012/12/20 22:58:00, Rouslan Solomakhin wrote: > On 2012/12/20 22:56:28, groby wrote: > > This shouldn't be part of the patch either. The purist in me says that you > > should first land a CL without your changes, that just brings the .patch file > up > > to date... > > SGTM, let me do that first. CL to bring google.patch file up to date: https://codereview.chromium.org/11639053/
https://codereview.chromium.org/11442040/diff/24001/google.patch File google.patch (right): https://codereview.chromium.org/11442040/diff/24001/google.patch#newcode911 google.patch:911: wordbreak = pAMgr->get_breaktable(); On 2012/12/20 22:56:28, groby wrote: > This one and the free statement a bit later are odd additions. Can you verify > that they're from previous CLs that forgot to update the .patch, just to make > sure? "wordbreak = pAMgr->get_breaktable()" itself is not an addition. A diff of a diff is confusing :-) The actual change is below, surrounding pSMgr instantiation with an #ifdef HUNSPELL_CHROME_CLIENT.
Groby: PTAL. I've merged with master, which makes google.patch much more reasonable.
LGTM |