Issue 2414423002: Traverse PDF page tree only once in CPDF_Document

npm

The CQ bit was checked by npm@chromium.org to run a CQ dry run

4 years, 2 months ago (2016-10-14 16:04:57 UTC) #1

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2414423002/1

4 years, 2 months ago (2016-10-14 16:05:01 UTC) #2

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 2 months ago (2016-10-14 16:09:51 UTC) #3

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: linux_xfa on master.tryserver.client.pdfium (JOB_FAILED, https://build.chromium.org/p/tryserver.client.pdfium/builders/linux_xfa/builds/2468) win_xfa on ...

4 years, 2 months ago (2016-10-14 16:09:51 UTC) #4

npm

npm@chromium.org changed reviewers: + dsinclair@chromium.org, thestig@chromium.org, tsepez@chromium.org

4 years, 2 months ago (2016-10-14 18:21:28 UTC) #5

npm

PTAL. I think corpus tests will pass after https://codereview.chromium.org/2419793004/ So please ignore red bots for ...

4 years, 2 months ago (2016-10-14 18:21:28 UTC) #6

Tom Sepez

https://codereview.chromium.org/2414423002/diff/1/core/fpdfapi/parser/cpdf_document.cpp File core/fpdfapi/parser/cpdf_document.cpp (right): https://codereview.chromium.org/2414423002/diff/1/core/fpdfapi/parser/cpdf_document.cpp#newcode507 core/fpdfapi/parser/cpdf_document.cpp:507: int nPages = pKid->GetIntegerFor("Count"); I think we have to ...

4 years, 2 months ago (2016-10-14 19:01:16 UTC) #7

npm

https://codereview.chromium.org/2414423002/diff/1/core/fpdfapi/parser/cpdf_document.cpp File core/fpdfapi/parser/cpdf_document.cpp (right): https://codereview.chromium.org/2414423002/diff/1/core/fpdfapi/parser/cpdf_document.cpp#newcode507 core/fpdfapi/parser/cpdf_document.cpp:507: int nPages = pKid->GetIntegerFor("Count"); On 2016/10/14 19:01:16, Tom Sepez ...

4 years, 2 months ago (2016-10-14 19:31:25 UTC) #8

dsinclair

https://codereview.chromium.org/2414423002/diff/1/core/fpdfapi/parser/cpdf_document.cpp File core/fpdfapi/parser/cpdf_document.cpp (right): https://codereview.chromium.org/2414423002/diff/1/core/fpdfapi/parser/cpdf_document.cpp#newcode507 core/fpdfapi/parser/cpdf_document.cpp:507: int nPages = pKid->GetIntegerFor("Count"); On 2016/10/14 19:31:25, npm wrote: ...

4 years, 2 months ago (2016-10-17 13:26:37 UTC) #9

npm

The CQ bit was checked by npm@chromium.org to run a CQ dry run

4 years, 2 months ago (2016-10-17 20:55:36 UTC) #10

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2414423002/40001

4 years, 2 months ago (2016-10-17 20:55:46 UTC) #11

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 2 months ago (2016-10-17 21:10:10 UTC) #12

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

4 years, 2 months ago (2016-10-17 21:10:11 UTC) #13

npm

4 years, 2 months ago (2016-10-17 21:10:23 UTC) #14

PTAL

https://codereview.chromium.org/2414423002/diff/1/core/fpdfapi/parser/cpdf_do...
File core/fpdfapi/parser/cpdf_document.cpp (right):

https://codereview.chromium.org/2414423002/diff/1/core/fpdfapi/parser/cpdf_do...
core/fpdfapi/parser/cpdf_document.cpp:507: int nPages = pKid->GetIntegerFor("Count");
On 2016/10/17 13:26:37, dsinclair wrote:
> On 2016/10/14 19:31:25, npm wrote:
> > On 2016/10/14 19:01:16, Tom Sepez wrote:
> > > I think we have to deal with the situation where Count does not agree with
> > > sizeof /Kids array, which is why they had a separate parameter.
> > 
> > Well, it is not clear to me how we should behave in that case.
> > But in fact I think this will behave similarly in that case:
> > 
> > if Count > sizeOf, some page indices will be skipped
> > if Count < sizeOf, all the pages in the Kids array will be used.
> > But then iPage only increases by Count. So if there's anything after,
> > the m_PageList values for those will be overridden.
> > 
> > Before this CL:
> > if Count > sizeOf, some page indices will be skipped
> > if Count < sizeOf, the excess pages will be ignored.
> 
> 
> Should we increase iPage by the min(count, sizeof Kids array) to fix the issue
> where we either skip or ignore?

We were not doing this before, and I think it might not be the right thing to
do: we should use Count to keep track of the page indices, since that is its
purpose.
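
To make the two mismatch cases above concrete, here is a toy model (not the
PDFium code). ToyPagesNode and AssignPageIndices are hypothetical names, every
kid is assumed to be a leaf page, and an unset slot is represented as 0. Every
kid is written starting at the node's first page index, but the running index
only advances by the declared Count.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for an intermediate /Pages node whose kids are all leaf pages.
struct ToyPagesNode {
  int count;                  // Declared /Count value, assumed non-negative.
  std::vector<int> kid_objs;  // Object numbers of the leaf pages in /Kids.
};

// Assigns object numbers to page indices. All kids of a node are written
// starting at that node's first index, but the running index advances by the
// declared count. Slots that are never written stay 0.
std::vector<int> AssignPageIndices(const std::vector<ToyPagesNode>& nodes) {
  std::vector<int> page_list;
  size_t first_index = 0;
  for (const ToyPagesNode& node : nodes) {
    for (size_t j = 0; j < node.kid_objs.size(); ++j) {
      size_t slot = first_index + j;
      if (slot >= page_list.size())
        page_list.resize(slot + 1, 0);
      page_list[slot] = node.kid_objs[j];
    }
    first_index += static_cast<size_t>(node.count);
  }
  return page_list;
}
```

In this model, a node with Count 2 and three kids followed by a one-page node
has its third kid overwritten by the later page, and a node with Count 3 and
two kids leaves one index unset.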

https://codereview.chromium.org/2414423002/diff/1/core/fpdfapi/parser/cpdf_do...
core/fpdfapi/parser/cpdf_document.cpp:539: TraversePDFPages(pPages, 0, 0);
On 2016/10/17 13:26:37, dsinclair wrote:
> This may have to do more work than needed in some cases. If we have a 10k page
> document and we just want page 1 we'll have to walk all 10k pages the first
> time instead of just the first page.

Changed to traverse only until needed.
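
A rough sketch of what stopping early buys, using a toy tree rather than the
real CPDF_Document types. ToyNode and FindPage are hypothetical names, a node
with no kids is treated as a leaf page, and pages_to_go counts the leaves still
to pass including the target (that convention is an assumption of this sketch).

```cpp
#include <memory>
#include <vector>

// Toy page tree node. A node with no kids is treated as a leaf page.
struct ToyNode {
  int obj_num = 0;  // Only meaningful for leaves.
  std::vector<std::unique_ptr<ToyNode>> kids;
};

// Depth-first search that stops as soon as the requested page is reached.
// |pages_to_go| is the number of leaves still to pass, target included, so
// the target is the leaf reached while it equals 1. |visited| counts the
// leaves touched, to show how much of the tree was walked.
const ToyNode* FindPage(const ToyNode& node, int* pages_to_go, int* visited) {
  if (node.kids.empty()) {
    ++*visited;
    if (*pages_to_go == 1)
      return &node;
    --*pages_to_go;
    return nullptr;
  }
  for (const auto& kid : node.kids) {
    if (const ToyNode* found = FindPage(*kid, pages_to_go, visited))
      return found;
  }
  return nullptr;
}
```

Asking for the first page touches a single leaf here, however many pages the
tree holds. Asking for a page past the end walks every leaf and returns
nullptr.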

https://codereview.chromium.org/2414423002/diff/40001/core/fpdfapi/parser/cpd...
File core/fpdfapi/parser/cpdf_document.cpp (right):

https://codereview.chromium.org/2414423002/diff/40001/core/fpdfapi/parser/cpd...
core/fpdfapi/parser/cpdf_document.cpp:491: if (nPagesToGo != 1)
Before this change, nPagesToGo was the number of pages left minus 1, so we now
compare with 1 instead of 0.

https://codereview.chromium.org/2414423002/diff/40001/core/fpdfapi/parser/cpd...
core/fpdfapi/parser/cpdf_document.cpp:502: bool shouldFinish =
pPages->GetIntegerFor("Count") <= nPagesToGo;
This will make sure that the page is popped properly even when the number
of leaves does not match Count.

https://codereview.chromium.org/2414423002/diff/40001/core/fpdfapi/parser/cpd...
core/fpdfapi/parser/cpdf_document.cpp:504: for (size_t i = lastProc->second + 1; i < pKidList->GetCount(); i++) {
lastProc->second is the index of the last completely processed kid, which
initially is -1, and should be pKidList->GetCount()-1 when done.
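
For reference, a minimal sketch of the bookkeeping described above, written
against a toy tree and not the code in this CL. ToyNode, LazyPageList, GetPage
and leaves_seen are hypothetical names, and a node with no kids is treated as a
leaf. Each stack entry pairs a node with the index of its last completely
processed kid, starting at -1, so a later request resumes where the previous
one stopped.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy page tree node. A node with no kids is treated as a leaf page.
struct ToyNode {
  int obj_num = 0;  // Only meaningful for leaves.
  std::vector<std::unique_ptr<ToyNode>> kids;
};

class LazyPageList {
 public:
  // |root| is assumed to be non-null and to outlive this object.
  explicit LazyPageList(const ToyNode* root) { stack_.push_back({root, -1}); }

  // Returns the object number of page |index|, or 0 if there is no such page.
  // Walks only as far as needed and keeps its position for the next call.
  int GetPage(size_t index) {
    while (page_list_.size() <= index && !stack_.empty())
      Step();
    return index < page_list_.size() ? page_list_[index] : 0;
  }

  size_t leaves_seen() const { return page_list_.size(); }

 private:
  void Step() {
    const ToyNode* node = stack_.back().first;
    size_t next = static_cast<size_t>(stack_.back().second + 1);
    if (next >= node->kids.size()) {
      // Node finished: pop it and mark it as processed in its parent.
      stack_.pop_back();
      if (!stack_.empty())
        ++stack_.back().second;
      return;
    }
    const ToyNode* kid = node->kids[next].get();
    if (kid->kids.empty()) {
      page_list_.push_back(kid->obj_num);
      ++stack_.back().second;
    } else {
      stack_.push_back({kid, -1});  // Descend; no kid processed yet.
    }
  }

  // (node, index of last completely processed kid); -1 means none yet.
  std::vector<std::pair<const ToyNode*, int>> stack_;
  std::vector<int> page_list_;  // Object numbers found so far, by page index.
};
```

A node's index reaches kids.size() - 1 right before it is popped, which
matches the invariant stated above for a node that is done.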

dsinclair

Can you update the description with some numbers on the performance gain? https://codereview.chromium.org/2414423002/diff/40001/core/fpdfapi/parser/cpdf_document.cpp File core/fpdfapi/parser/cpdf_document.cpp ...

4 years, 2 months ago (2016-10-18 13:36:00 UTC) #15

npm

Description was changed from ========== Traverse PDF page tree only once in CPDF_Document In our ...

4 years, 2 months ago (2016-10-18 14:12:58 UTC) #16

npm

Updated with performance https://codereview.chromium.org/2414423002/diff/40001/core/fpdfapi/parser/cpdf_document.cpp File core/fpdfapi/parser/cpdf_document.cpp (right): https://codereview.chromium.org/2414423002/diff/40001/core/fpdfapi/parser/cpdf_document.cpp#newcode491 core/fpdfapi/parser/cpdf_document.cpp:491: if (nPagesToGo != 1) On 2016/10/18 ...

4 years, 2 months ago (2016-10-18 14:26:31 UTC) #17

npm

Description was changed from ========== Traverse PDF page tree only once in CPDF_Document In our ...

4 years, 2 months ago (2016-10-18 14:30:59 UTC) #18

npm

The CQ bit was checked by npm@chromium.org to run a CQ dry run

4 years, 2 months ago (2016-10-18 14:51:25 UTC) #19

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2414423002/60001

4 years, 2 months ago (2016-10-18 14:51:32 UTC) #20

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 2 months ago (2016-10-18 15:00:21 UTC) #21

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

4 years, 2 months ago (2016-10-18 15:00:22 UTC) #22

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2414423002/60001

4 years, 2 months ago (2016-10-18 17:01:28 UTC) #25

commit-bot: I haz the power

Description was changed from ========== Traverse PDF page tree only once in CPDF_Document In our ...

4 years, 2 months ago (2016-10-18 17:38:23 UTC) #26

commit-bot: I haz the power

Committed patchset #4 (id:60001) as https://pdfium.googlesource.com/pdfium/+/7c29e27dae139a205755c1a29b7f3ac8b36ec0da

4 years, 2 months ago (2016-10-18 17:38:24 UTC) #27

dsinclair

4 years, 2 months ago (2016-10-20 17:23:32 UTC) #28

Message was sent while issue was closed.

A revert of this CL (patchset #4 id:60001) has been created in
https://codereview.chromium.org/2430313006/ by dsinclair@chromium.org.

The reason for reverting is: Possible cause of crbug.com/657897; reverting to
find out.

BUG=657897.

Issue 2414423002: Traverse PDF page tree only once in CPDF_Document (Closed)

Patch Set 1 #

Patch Set 2 : Traverse until needed #

Patch Set 3 : Nits #

Patch Set 4 : Move methods from namespace to private #