|
|
DescriptionGuard against undefined shift.
This Cl fixes the CFDE_XMLSyntaxParser::ParseTextChar() to handle entities
where the value goes negative. Currently this could cause an undefined-shift
as due to the (ch << 4) calls. Instead, detect if the value has gone negative
and return a space character.
BUG=chromium:603489
Committed: https://pdfium.googlesource.com/pdfium/+/22eeccb34f91f9932f7cec295bcaf641ba249e3a
Patch Set 1 #Patch Set 2 : Make windows happy? #Patch Set 3 : Special case windows #Patch Set 4 : sigh #
Total comments: 3
Patch Set 5 : Handle overflow by length #Patch Set 6 : Update windows result #
Total comments: 4
Patch Set 7 : Do calculation in uint32 space #
Total comments: 2
Patch Set 8 : cleanup #Messages
Total messages: 50 (31 generated)
The CQ bit was checked by dsinclair@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
dsinclair@chromium.org changed reviewers: + thestig@chromium.org, weili@chromium.org
PTAL.
The CQ bit was checked by dsinclair@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_xfa_rel_gyp on master.tryserver.client.pdfium (JOB_FAILED, https://build.chromium.org/p/tryserver.client.pdfium/builders/win_xfa_rel_gyp...)
The CQ bit was checked by dsinclair@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_xfa_rel_gyp on master.tryserver.client.pdfium (JOB_FAILED, https://build.chromium.org/p/tryserver.client.pdfium/builders/win_xfa_rel_gyp...)
The CQ bit was checked by dsinclair@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... File xfa/fde/xml/fde_xml_imp.cpp (right): https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... xfa/fde/xml/fde_xml_imp.cpp:1871: if (ch < 0) { Should we check overflow instead of being negative? Maybe we can check the number of non-zero wchar, which should be no more than 4, right?
https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... File xfa/fde/xml/fde_xml_imp.cpp (right): https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... xfa/fde/xml/fde_xml_imp.cpp:1871: if (ch < 0) { On 2016/08/08 22:38:52, Wei Li wrote: > Should we check overflow instead of being negative? > Maybe we can check the number of non-zero wchar, which should be no more than 4, > right? I can switch to using overflow checking, will change the code tomorrow. I think the windows test will still need to be special cased as it doesn't overflow a windows wchar_t, but this code will be nicer. Not sure what you mean by checking the non-zero wchar?
Patchset #5 (id:80001) has been deleted
https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... File xfa/fde/xml/fde_xml_imp.cpp (right): https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... xfa/fde/xml/fde_xml_imp.cpp:1871: if (ch < 0) { On 2016/08/08 22:38:52, Wei Li wrote: > Should we check overflow instead of being negative? > Maybe we can check the number of non-zero wchar, which should be no more than 4, > right? I tried the checked numeric but I'd also still need the < 0 check. The wchar_t signedness is undefined by the spec. So, it happens that Linux leaves it as signed, so the CheckedNumeric doesn't overflow, but the shift still fails.
On 2016/08/09 14:33:33, dsinclair wrote: > https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... > File xfa/fde/xml/fde_xml_imp.cpp (right): > > https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... > xfa/fde/xml/fde_xml_imp.cpp:1871: if (ch < 0) { > On 2016/08/08 22:38:52, Wei Li wrote: > > Should we check overflow instead of being negative? > > Maybe we can check the number of non-zero wchar, which should be no more than > 4, > > right? > > > I tried the checked numeric but I'd also still need the < 0 check. The wchar_t > signedness is undefined by the spec. So, it happens that Linux leaves it as > signed, so the CheckedNumeric doesn't overflow, but the shift still fails. Would checking the length work? The entity should be '#wwww;' or '#xwwww;' or loosely '#0*wwww;' or '#0*xwwww;'? Would the length exceeds four? If not, it should not overflow since we already check each individual char's validity.
On 2016/08/09 17:08:37, Wei Li wrote: > On 2016/08/09 14:33:33, dsinclair wrote: > > > https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... > > File xfa/fde/xml/fde_xml_imp.cpp (right): > > > > > https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... > > xfa/fde/xml/fde_xml_imp.cpp:1871: if (ch < 0) { > > On 2016/08/08 22:38:52, Wei Li wrote: > > > Should we check overflow instead of being negative? > > > Maybe we can check the number of non-zero wchar, which should be no more > than > > 4, > > > right? > > > > > > I tried the checked numeric but I'd also still need the < 0 check. The wchar_t > > signedness is undefined by the spec. So, it happens that Linux leaves it as > > signed, so the CheckedNumeric doesn't overflow, but the shift still fails. > > Would checking the length work? The entity should be '#wwww;' or '#xwwww;' > or loosely '#0*wwww;' or '#0*xwwww;'? Would the length exceeds four? If not, it > should not overflow since we already check each individual char's validity. So, I thought about that, but looking at the XML spec, I don't see it specifying that the maximum entity length is 4 characters. It's possible I missed it? If it says that in the spec somewhere I'm happy to use the length check as it's a lot cleaner. I'll look again and see if I can find anything.
On 2016/08/09 17:19:03, dsinclair wrote: > On 2016/08/09 17:08:37, Wei Li wrote: > > On 2016/08/09 14:33:33, dsinclair wrote: > > > > > > https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... > > > File xfa/fde/xml/fde_xml_imp.cpp (right): > > > > > > > > > https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... > > > xfa/fde/xml/fde_xml_imp.cpp:1871: if (ch < 0) { > > > On 2016/08/08 22:38:52, Wei Li wrote: > > > > Should we check overflow instead of being negative? > > > > Maybe we can check the number of non-zero wchar, which should be no more > > than > > > 4, > > > > right? > > > > > > > > > I tried the checked numeric but I'd also still need the < 0 check. The > wchar_t > > > signedness is undefined by the spec. So, it happens that Linux leaves it as > > > signed, so the CheckedNumeric doesn't overflow, but the shift still fails. > > > > Would checking the length work? The entity should be '#wwww;' or '#xwwww;' > > or loosely '#0*wwww;' or '#0*xwwww;'? Would the length exceeds four? If not, > it > > should not overflow since we already check each individual char's validity. > > > So, I thought about that, but looking at the XML spec, I don't see it specifying > that the maximum entity length is 4 characters. It's possible I missed it? If it > says that in the spec somewhere I'm happy to use the length check as it's a lot > cleaner. I'll look again and see if I can find anything. Not sure whether it says in spec, a quick search of 'numeric character reference' shows https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references Plus, if it is longer than 4 (from the non-zero leading wchar), it will overflow since it shift 4 bits every time, right?
On 2016/08/09 17:25:27, Wei Li wrote: > On 2016/08/09 17:19:03, dsinclair wrote: > > On 2016/08/09 17:08:37, Wei Li wrote: > > > On 2016/08/09 14:33:33, dsinclair wrote: > > > > > > > > > > https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... > > > > File xfa/fde/xml/fde_xml_imp.cpp (right): > > > > > > > > > > > > > > https://codereview.chromium.org/2223823003/diff/60001/xfa/fde/xml/fde_xml_imp... > > > > xfa/fde/xml/fde_xml_imp.cpp:1871: if (ch < 0) { > > > > On 2016/08/08 22:38:52, Wei Li wrote: > > > > > Should we check overflow instead of being negative? > > > > > Maybe we can check the number of non-zero wchar, which should be no more > > > than > > > > 4, > > > > > right? > > > > > > > > > > > > I tried the checked numeric but I'd also still need the < 0 check. The > > wchar_t > > > > signedness is undefined by the spec. So, it happens that Linux leaves it > as > > > > signed, so the CheckedNumeric doesn't overflow, but the shift still fails. > > > > > > Would checking the length work? The entity should be '#wwww;' or '#xwwww;' > > > or loosely '#0*wwww;' or '#0*xwwww;'? Would the length exceeds four? If not, > > it > > > should not overflow since we already check each individual char's validity. > > > > > > So, I thought about that, but looking at the XML spec, I don't see it > specifying > > that the maximum entity length is 4 characters. It's possible I missed it? If > it > > says that in the spec somewhere I'm happy to use the length check as it's a > lot > > cleaner. I'll look again and see if I can find anything. > > Not sure whether it says in spec, a quick search of 'numeric character > reference' shows > https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references > > Plus, if it is longer than 4 (from the non-zero leading wchar), it will overflow > since > it shift 4 bits every time, right? It says: The nnnn or hhhh may be any number of digits and may include leading zeros. So, you could have { which would be valid (I think) and won't overflow. I could check the length after trimming leading zero's though and go that route, as you're right, it shifts 4 each time so after 4 chars it would overflow.
The CQ bit was checked by dsinclair@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
PTAL.
And, I broke Windows again. Will ping when the tests pass on Win.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: win_xfa_rel_gyp on master.tryserver.client.pdfium (JOB_FAILED, https://build.chromium.org/p/tryserver.client.pdfium/builders/win_xfa_rel_gyp...)
The CQ bit was checked by dsinclair@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
PTAL.
https://codereview.chromium.org/2223823003/diff/120001/xfa/fde/xml/fde_xml_im... File xfa/fde/xml/fde_xml_imp.cpp (right): https://codereview.chromium.org/2223823003/diff/120001/xfa/fde/xml/fde_xml_im... xfa/fde/xml/fde_xml_imp.cpp:1886: if (ch < 0) { Could you also use length based checking here? If so, the test result would be same across platforms.
https://codereview.chromium.org/2223823003/diff/120001/xfa/fde/xml/fde_xml_im... File xfa/fde/xml/fde_xml_imp.cpp (right): https://codereview.chromium.org/2223823003/diff/120001/xfa/fde/xml/fde_xml_im... xfa/fde/xml/fde_xml_imp.cpp:1886: if (ch < 0) { On 2016/08/10 17:28:30, Wei Li wrote: > Could you also use length based checking here? If so, the test result would be > same across platforms. I don't think so. This check doesn't do the shifts, but *10 each time. We could do an initial check for length, but that wouldn't help for the case where the value is INT_MAX + 1 that we're parsing. The length would be correct, but we're going to overflow anyway.
https://codereview.chromium.org/2223823003/diff/120001/xfa/fde/xml/fde_xml_im... File xfa/fde/xml/fde_xml_imp.cpp (right): https://codereview.chromium.org/2223823003/diff/120001/xfa/fde/xml/fde_xml_im... xfa/fde/xml/fde_xml_imp.cpp:1886: if (ch < 0) { On 2016/08/10 17:31:04, dsinclair wrote: > On 2016/08/10 17:28:30, Wei Li wrote: > > Could you also use length based checking here? If so, the test result would be > > same across platforms. > > > I don't think so. This check doesn't do the shifts, but *10 each time. We could > do an initial check for length, but that wouldn't help for the case where the > value is INT_MAX + 1 that we're parsing. The length would be correct, but we're > going to overflow anyway. Sorry for the back and forth. I am trying to find a solution consistent across platforms. Maybe the following one works and simpler? use unsigned value for calculation for both hex and decimal cases, and check whether this value falls in [0, 10FFFF] (https://www.w3.org/TR/xml/#wf-Legalchar then points to https://www.w3.org/TR/xml/#NT-Char). WDYT?
The CQ bit was checked by dsinclair@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2223823003/diff/120001/xfa/fde/xml/fde_xml_im... File xfa/fde/xml/fde_xml_imp.cpp (right): https://codereview.chromium.org/2223823003/diff/120001/xfa/fde/xml/fde_xml_im... xfa/fde/xml/fde_xml_imp.cpp:1886: if (ch < 0) { On 2016/08/10 20:18:48, Wei Li wrote: > On 2016/08/10 17:31:04, dsinclair wrote: > > On 2016/08/10 17:28:30, Wei Li wrote: > > > Could you also use length based checking here? If so, the test result would > be > > > same across platforms. > > > > > > I don't think so. This check doesn't do the shifts, but *10 each time. We > could > > do an initial check for length, but that wouldn't help for the case where the > > value is INT_MAX + 1 that we're parsing. The length would be correct, but > we're > > going to overflow anyway. > > Sorry for the back and forth. I am trying to find a solution consistent across > platforms. Maybe the following one works and simpler? > > use unsigned value for calculation for both hex and decimal cases, > and check whether this value falls in [0, 10FFFF] > (https://www.w3.org/TR/xml/#wf-Legalchar then points to > https://www.w3.org/TR/xml/#NT-Char). > > WDYT? > No need to apologize, code working same on all platforms is definitely better. How does this look?
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
lgtm https://codereview.chromium.org/2223823003/diff/140001/xfa/fde/xml/fde_xml_im... File xfa/fde/xml/fde_xml_imp.cpp (right): https://codereview.chromium.org/2223823003/diff/140001/xfa/fde/xml/fde_xml_im... xfa/fde/xml/fde_xml_imp.cpp:1865: if (iLen - i <= 4) { Now, you no longer need the length checking.
https://codereview.chromium.org/2223823003/diff/140001/xfa/fde/xml/fde_xml_im... File xfa/fde/xml/fde_xml_imp.cpp (right): https://codereview.chromium.org/2223823003/diff/140001/xfa/fde/xml/fde_xml_im... xfa/fde/xml/fde_xml_imp.cpp:1865: if (iLen - i <= 4) { On 2016/08/10 21:03:58, Wei Li wrote: > Now, you no longer need the length checking. Done.
The CQ bit was checked by dsinclair@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from weili@chromium.org Link to the patchset: https://codereview.chromium.org/2223823003/#ps160001 (title: "cleanup")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== Guard against undefined shift. This Cl fixes the CFDE_XMLSyntaxParser::ParseTextChar() to handle entities where the value goes negative. Currently this could cause an undefined-shift as due to the (ch << 4) calls. Instead, detect if the value has gone negative and return a space character. BUG=chromium:603489 ========== to ========== Guard against undefined shift. This Cl fixes the CFDE_XMLSyntaxParser::ParseTextChar() to handle entities where the value goes negative. Currently this could cause an undefined-shift as due to the (ch << 4) calls. Instead, detect if the value has gone negative and return a space character. BUG=chromium:603489 Committed: https://pdfium.googlesource.com/pdfium/+/22eeccb34f91f9932f7cec295bcaf641ba24... ==========
Message was sent while issue was closed.
Committed patchset #8 (id:160001) as https://pdfium.googlesource.com/pdfium/+/22eeccb34f91f9932f7cec295bcaf641ba24... |