|
|
Created:
4 years, 2 months ago by ahaas Modified:
4 years, 1 month ago Reviewers:
Benedikt Meurer, titzer, bradn CC:
v8-reviews_googlegroups.com Target Ref:
refs/pending/heads/master Project:
v8 Visibility:
Public. |
Description[wasm] Trim graph before scheduling.
The scheduler expects a trimmed graph, so we have to trim the graph
before scheduling.
R=titzer@chromium.org, bmeurer@chromium.org
TEST=cctest/test-run-wasm/RunWasmCompiled_GraphTrimming
Committed: https://crrev.com/990236825974e683640f9ca56ddd53e3e831a278
Cr-Commit-Position: refs/heads/master@{#40446}
Patch Set 1 #
Total comments: 3
Patch Set 2 : Remove dead code. #Patch Set 3 : Fix pair operations on 32 bit platforms. #Patch Set 4 : The first projection node of a pair has to exist. #Patch Set 5 : Also make the high word projection optional. #
Total comments: 9
Patch Set 6 : Comments addressed #Patch Set 7 : Use proper temp registers if Projection(1) does not exist. #
Created: 4 years, 2 months ago
Messages
Total messages: 48 (28 generated)
The CQ bit was checked by ahaas@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
LGTM once nit addressed. https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc File src/compiler/pipeline.cc (right): https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc#ne... src/compiler/pipeline.cc:1653: // Run<LateGraphTrimmingPhase>(); Nit: Remove this dead code then.
The CQ bit was checked by ahaas@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc File src/compiler/pipeline.cc (right): https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc#ne... src/compiler/pipeline.cc:1653: // Run<LateGraphTrimmingPhase>(); On 2016/10/17 at 07:18:46, Benedikt Meurer wrote: > Nit: Remove this dead code then. Done
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_linux64_rel_ng on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_rel_ng/builds/14657) v8_linux64_rel_ng_triggered on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_rel_ng_triggered...) v8_linux_nodcheck_rel_ng on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux_nodcheck_rel_ng/bu...) v8_linux_nodcheck_rel_ng_triggered on master.tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux_nodcheck_rel_ng_tr...)
On 2016/10/17 07:23:57, ahaas wrote: > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc > File src/compiler/pipeline.cc (right): > > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc#ne... > src/compiler/pipeline.cc:1653: // Run<LateGraphTrimmingPhase>(); > On 2016/10/17 at 07:18:46, Benedikt Meurer wrote: > > Nit: Remove this dead code then. > > Done lgtm
https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc File src/compiler/pipeline.cc (right): https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc#ne... src/compiler/pipeline.cc:1653: // Run<LateGraphTrimmingPhase>(); Thx!
The CQ bit was checked by ahaas@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by ahaas@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2016/10/17 at 08:38:14, bmeurer wrote: > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc > File src/compiler/pipeline.cc (right): > > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc#ne... > src/compiler/pipeline.cc:1653: // Run<LateGraphTrimmingPhase>(); > Thx! PTAL: There was a problem with the instruction selection for Int32PairXXX nodes on 32 bit platforms. Graph trimming may delete the projection nodes of Int32PairXXX nodes, which means that we cannot reliably use the projection nodes for register allocation. I use the following observation to deal with the problem: As far as I can see, if a projection node is deleted, then it is always the projection node of the high word. This means that for Int32PairSub, Int32PairAdd, and Int32PairMul, if the projection node of the high word gets deleted, we can just emit code for normal 32 bit Add, Sub, and Mul. For shift instructions, if the projection node of the high word gets deleted, we just allocate a temp register instead of an output register.
On 2016/10/17 12:13:20, ahaas wrote: > On 2016/10/17 at 08:38:14, bmeurer wrote: > > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc > > File src/compiler/pipeline.cc (right): > > > > > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc#ne... > > src/compiler/pipeline.cc:1653: // Run<LateGraphTrimmingPhase>(); > > Thx! > > PTAL: There was a problem with the instruction selection for Int32PairXXX nodes > on 32 bit platforms. Graph trimming may delete the projection nodes of > Int32PairXXX nodes, which means that we cannot reliably use the projection nodes > for register allocation. I use the following observation to deal with the > problem: As far as I can see, if a projection node is deleted, then it is always > the projection node of the high word. This means that for Int32PairSub, > Int32PairAdd, and Int32PairMul, if the projection node of the high word gets > deleted, we can just emit code for normal 32 bit Add, Sub, and Mul. For shift > instructions, if the projection node of the high word gets deleted, we just > allocate a temp register instead of an output register. The assumption that only the high word gets deleted isn't sound in general. E.g. if there is a Int32PairAdd flowing into a shift-right by 32 with no other uses, then only the high projection will be live.
On 2016/10/17 at 12:21:14, titzer wrote: > On 2016/10/17 12:13:20, ahaas wrote: > > On 2016/10/17 at 08:38:14, bmeurer wrote: > > > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc > > > File src/compiler/pipeline.cc (right): > > > > > > > > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc#ne... > > > src/compiler/pipeline.cc:1653: // Run<LateGraphTrimmingPhase>(); > > > Thx! > > > > PTAL: There was a problem with the instruction selection for Int32PairXXX nodes > > on 32 bit platforms. Graph trimming may delete the projection nodes of > > Int32PairXXX nodes, which means that we cannot reliably use the projection nodes > > for register allocation. I use the following observation to deal with the > > problem: As far as I can see, if a projection node is deleted, then it is always > > the projection node of the high word. This means that for Int32PairSub, > > Int32PairAdd, and Int32PairMul, if the projection node of the high word gets > > deleted, we can just emit code for normal 32 bit Add, Sub, and Mul. For shift > > instructions, if the projection node of the high word gets deleted, we just > > allocate a temp register instead of an output register. > > The assumption that only the high word gets deleted isn't sound in general. E.g. if there is a Int32PairAdd flowing into a shift-right by 32 with no other uses, then only the high projection will be live. That's true, but we haven't implemented that optimization yet.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
On 2016/10/17 12:36:28, ahaas wrote: > On 2016/10/17 at 12:21:14, titzer wrote: > > On 2016/10/17 12:13:20, ahaas wrote: > > > On 2016/10/17 at 08:38:14, bmeurer wrote: > > > > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc > > > > File src/compiler/pipeline.cc (right): > > > > > > > > > > > > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc#ne... > > > > src/compiler/pipeline.cc:1653: // Run<LateGraphTrimmingPhase>(); > > > > Thx! > > > > > > PTAL: There was a problem with the instruction selection for Int32PairXXX > nodes > > > on 32 bit platforms. Graph trimming may delete the projection nodes of > > > Int32PairXXX nodes, which means that we cannot reliably use the projection > nodes > > > for register allocation. I use the following observation to deal with the > > > problem: As far as I can see, if a projection node is deleted, then it is > always > > > the projection node of the high word. This means that for Int32PairSub, > > > Int32PairAdd, and Int32PairMul, if the projection node of the high word gets > > > deleted, we can just emit code for normal 32 bit Add, Sub, and Mul. For > shift > > > instructions, if the projection node of the high word gets deleted, we just > > > allocate a temp register instead of an output register. > > > > The assumption that only the high word gets deleted isn't sound in general. > E.g. if there is a Int32PairAdd flowing into a shift-right by 32 with no other > uses, then only the high projection will be live. > > That's true, but we haven't implemented that optimization yet. We will, then we'll have revisit this issue again. I think it's better to just go ahead and handle the general case.
The CQ bit was checked by ahaas@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2016/10/17 at 12:44:11, titzer wrote: > On 2016/10/17 12:36:28, ahaas wrote: > > On 2016/10/17 at 12:21:14, titzer wrote: > > > On 2016/10/17 12:13:20, ahaas wrote: > > > > On 2016/10/17 at 08:38:14, bmeurer wrote: > > > > > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc > > > > > File src/compiler/pipeline.cc (right): > > > > > > > > > > > > > > > > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc#ne... > > > > > src/compiler/pipeline.cc:1653: // Run<LateGraphTrimmingPhase>(); > > > > > Thx! > > > > > > > > PTAL: There was a problem with the instruction selection for Int32PairXXX > > nodes > > > > on 32 bit platforms. Graph trimming may delete the projection nodes of > > > > Int32PairXXX nodes, which means that we cannot reliably use the projection > > nodes > > > > for register allocation. I use the following observation to deal with the > > > > problem: As far as I can see, if a projection node is deleted, then it is > > always > > > > the projection node of the high word. This means that for Int32PairSub, > > > > Int32PairAdd, and Int32PairMul, if the projection node of the high word gets > > > > deleted, we can just emit code for normal 32 bit Add, Sub, and Mul. For > > shift > > > > instructions, if the projection node of the high word gets deleted, we just > > > > allocate a temp register instead of an output register. > > > > > > The assumption that only the high word gets deleted isn't sound in general. > > E.g. if there is a Int32PairAdd flowing into a shift-right by 32 with no other > > uses, then only the high projection will be live. > > > > That's true, but we haven't implemented that optimization yet. > > We will, then we'll have revisit this issue again. I think it's better to just go ahead and handle the general case. Apparently not having Projection(0) is not a big deal because for register allocation we use the pair node itself and not Projection(0). This means we handle the case with no Projection(0) node correctly already.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
On 2016/10/18 at 13:49:57, ahaas wrote: > On 2016/10/17 at 12:44:11, titzer wrote: > > On 2016/10/17 12:36:28, ahaas wrote: > > > On 2016/10/17 at 12:21:14, titzer wrote: > > > > On 2016/10/17 12:13:20, ahaas wrote: > > > > > On 2016/10/17 at 08:38:14, bmeurer wrote: > > > > > > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc > > > > > > File src/compiler/pipeline.cc (right): > > > > > > > > > > > > > > > > > > > > https://codereview.chromium.org/2428443002/diff/1/src/compiler/pipeline.cc#ne... > > > > > > src/compiler/pipeline.cc:1653: // Run<LateGraphTrimmingPhase>(); > > > > > > Thx! > > > > > > > > > > PTAL: There was a problem with the instruction selection for Int32PairXXX > > > nodes > > > > > on 32 bit platforms. Graph trimming may delete the projection nodes of > > > > > Int32PairXXX nodes, which means that we cannot reliably use the projection > > > nodes > > > > > for register allocation. I use the following observation to deal with the > > > > > problem: As far as I can see, if a projection node is deleted, then it is > > > always > > > > > the projection node of the high word. This means that for Int32PairSub, > > > > > Int32PairAdd, and Int32PairMul, if the projection node of the high word gets > > > > > deleted, we can just emit code for normal 32 bit Add, Sub, and Mul. For > > > shift > > > > > instructions, if the projection node of the high word gets deleted, we just > > > > > allocate a temp register instead of an output register. > > > > > > > > The assumption that only the high word gets deleted isn't sound in general. > > > E.g. if there is a Int32PairAdd flowing into a shift-right by 32 with no other > > > uses, then only the high projection will be live. > > > > > > That's true, but we haven't implemented that optimization yet. > > > > We will, then we'll have revisit this issue again. I think it's better to just go ahead and handle the general case. > > Apparently not having Projection(0) is not a big deal because for register allocation we use the pair node itself and not Projection(0). This means we handle the case with no Projection(0) node correctly already. PTAL
https://codereview.chromium.org/2428443002/diff/80001/src/compiler/ia32/instr... File src/compiler/ia32/instruction-selector-ia32.cc (right): https://codereview.chromium.org/2428443002/diff/80001/src/compiler/ia32/instr... src/compiler/ia32/instruction-selector-ia32.cc:743: outputs[output_count++] = g.DefineAsFixed(projection1, edx); This doesn't seem quite right. Isn't edx overwritten, even if the result is not used? https://codereview.chromium.org/2428443002/diff/80001/src/compiler/instructio... File src/compiler/instruction-selector.cc (right): https://codereview.chromium.org/2428443002/diff/80001/src/compiler/instructio... src/compiler/instruction-selector.cc:939: MarkAsWord32(projection0); If projection0 is not found, do you need to mark {node} as word32? https://codereview.chromium.org/2428443002/diff/80001/test/cctest/compiler/te... File test/cctest/compiler/test-run-machops.cc (right): https://codereview.chromium.org/2428443002/diff/80001/test/cctest/compiler/te... test/cctest/compiler/test-run-machops.cc:4164: TEST(RunInt32PairAddUseOnlyHighWord) { Can you add tests for all the Int32Pair* operations? https://codereview.chromium.org/2428443002/diff/80001/test/cctest/wasm/test-r... File test/cctest/wasm/test-run-wasm-64.cc (right): https://codereview.chromium.org/2428443002/diff/80001/test/cctest/wasm/test-r... test/cctest/wasm/test-run-wasm-64.cc:150: WASM_EXEC_TEST(I64AddUseOnlyLowWord) { Can you also add tests for using only the high word?
The CQ bit was checked by ahaas@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2428443002/diff/80001/src/compiler/ia32/instr... File src/compiler/ia32/instruction-selector-ia32.cc (right): https://codereview.chromium.org/2428443002/diff/80001/src/compiler/ia32/instr... src/compiler/ia32/instruction-selector-ia32.cc:743: outputs[output_count++] = g.DefineAsFixed(projection1, edx); On 2016/10/19 at 09:20:42, titzer wrote: > This doesn't seem quite right. Isn't edx overwritten, even if the result is not used? We always use edx because it is one of the input registers. If projection1 exists, then we also define edx as an output register. https://codereview.chromium.org/2428443002/diff/80001/src/compiler/instructio... File src/compiler/instruction-selector.cc (right): https://codereview.chromium.org/2428443002/diff/80001/src/compiler/instructio... src/compiler/instruction-selector.cc:939: MarkAsWord32(projection0); On 2016/10/19 at 09:20:42, titzer wrote: > If projection0 is not found, do you need to mark {node} as word32? I mark {node} now always as word32. https://codereview.chromium.org/2428443002/diff/80001/test/cctest/compiler/te... File test/cctest/compiler/test-run-machops.cc (right): https://codereview.chromium.org/2428443002/diff/80001/test/cctest/compiler/te... test/cctest/compiler/test-run-machops.cc:4164: TEST(RunInt32PairAddUseOnlyHighWord) { On 2016/10/19 at 09:20:42, titzer wrote: > Can you add tests for all the Int32Pair* operations? Done. https://codereview.chromium.org/2428443002/diff/80001/test/cctest/wasm/test-r... File test/cctest/wasm/test-run-wasm-64.cc (right): https://codereview.chromium.org/2428443002/diff/80001/test/cctest/wasm/test-r... test/cctest/wasm/test-run-wasm-64.cc:150: WASM_EXEC_TEST(I64AddUseOnlyLowWord) { On 2016/10/19 at 09:20:42, titzer wrote: > Can you also add tests for using only the high word? I cannot do that in a WebAssembly test at the moment because there is no instruction which accesses just the high word. I will do it in test-run-machops.cc
lgtm modulo last comment https://codereview.chromium.org/2428443002/diff/80001/src/compiler/ia32/instr... File src/compiler/ia32/instruction-selector-ia32.cc (right): https://codereview.chromium.org/2428443002/diff/80001/src/compiler/ia32/instr... src/compiler/ia32/instruction-selector-ia32.cc:743: outputs[output_count++] = g.DefineAsFixed(projection1, edx); On 2016/10/19 09:59:19, ahaas wrote: > On 2016/10/19 at 09:20:42, titzer wrote: > > This doesn't seem quite right. Isn't edx overwritten, even if the result is > not used? > > We always use edx because it is one of the input registers. If projection1 > exists, then we also define edx as an output register. That's what I mean. Shouldn't we always mark edx as clobbered so that the register allocator knows?
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by ahaas@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2016/10/19 at 11:07:08, titzer wrote: > lgtm modulo last comment > > https://codereview.chromium.org/2428443002/diff/80001/src/compiler/ia32/instr... > File src/compiler/ia32/instruction-selector-ia32.cc (right): > > https://codereview.chromium.org/2428443002/diff/80001/src/compiler/ia32/instr... > src/compiler/ia32/instruction-selector-ia32.cc:743: outputs[output_count++] = g.DefineAsFixed(projection1, edx); > On 2016/10/19 09:59:19, ahaas wrote: > > On 2016/10/19 at 09:20:42, titzer wrote: > > > This doesn't seem quite right. Isn't edx overwritten, even if the result is > > not used? > > > > We always use edx because it is one of the input registers. If projection1 > > exists, then we also define edx as an output register. > > That's what I mean. Shouldn't we always mark edx as clobbered so that the register allocator knows? Done.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by ahaas@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from bmeurer@chromium.org, titzer@chromium.org Link to the patchset: https://codereview.chromium.org/2428443002/#ps120001 (title: "Use proper temp registers if Projection(1) does not exist.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Committed patchset #7 (id:120001)
Message was sent while issue was closed.
bradnelson@google.com changed reviewers: + bradnelson@google.com
Message was sent while issue was closed.
Note this regresses performance for asm->wasm on Fasta, Primes, Box2d, Corrections, and others by approximately double the runtime: https://chromeperf.appspot.com/graph_csv?test_path=internal.client.v8/Atom_x6... Looking into why...
Message was sent while issue was closed.
Oops, wrong link. Here's the right one: https://chromeperf.appspot.com/report?sid=fd3bfa4c0ff033c9fae08631292e2bd681c... Filling an issue: https://bugs.chromium.org/p/v8/issues/detail?id=5555
Message was sent while issue was closed.
Description was changed from ========== [wasm] Trim graph before scheduling. The scheduler expects a trimmed graph, so we have to trim the graph before scheduling. R=titzer@chromium.org, bmeurer@chromium.org TEST=cctest/test-run-wasm/RunWasmCompiled_GraphTrimming ========== to ========== [wasm] Trim graph before scheduling. The scheduler expects a trimmed graph, so we have to trim the graph before scheduling. R=titzer@chromium.org, bmeurer@chromium.org TEST=cctest/test-run-wasm/RunWasmCompiled_GraphTrimming Committed: https://crrev.com/990236825974e683640f9ca56ddd53e3e831a278 Cr-Commit-Position: refs/heads/master@{#40446} ==========
Message was sent while issue was closed.
Patchset 7 (id:??) landed as https://crrev.com/990236825974e683640f9ca56ddd53e3e831a278 Cr-Commit-Position: refs/heads/master@{#40446} |