Issue 2395063002: [wasm] Fix wasm instantiation flakes

Mircea Trofin

The CQ bit was checked by mtrofin@chromium.org to run a CQ dry run

4 years, 2 months ago (2016-10-06 05:54:12 UTC) #1

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2395063002/1

4 years, 2 months ago (2016-10-06 05:54:14 UTC) #2

Mircea Trofin

Description was changed from ========== [wasm] fix flaky asm-wasm BUG= ========== to ========== [wasm] fix ...

4 years, 2 months ago (2016-10-06 06:04:31 UTC) #3

Mircea Trofin

mtrofin@chromium.org changed reviewers: + bradnelson@chromium.org, titzer@chromium.org

4 years, 2 months ago (2016-10-06 06:04:32 UTC) #4

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 2 months ago (2016-10-06 06:32:59 UTC) #5

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

4 years, 2 months ago (2016-10-06 06:33:00 UTC) #6

Mircea Trofin

Description was changed from ========== [wasm] fix flaky asm-wasm The spurious failures were caused by ...

4 years, 2 months ago (2016-10-06 06:43:26 UTC) #7

titzer

https://codereview.chromium.org/2395063002/diff/1/src/wasm/wasm-module.cc File src/wasm/wasm-module.cc (right): https://codereview.chromium.org/2395063002/diff/1/src/wasm/wasm-module.cc#newcode1215 src/wasm/wasm-module.cc:1215: WeakCell* tmp = original->ptr_to_weak_owning_instance(); Why do we have a ...

4 years, 2 months ago (2016-10-07 12:40:32 UTC) #8

Mircea Trofin

https://codereview.chromium.org/2395063002/diff/1/src/wasm/wasm-module.cc File src/wasm/wasm-module.cc (right): https://codereview.chromium.org/2395063002/diff/1/src/wasm/wasm-module.cc#newcode1215 src/wasm/wasm-module.cc:1215: WeakCell* tmp = original->ptr_to_weak_owning_instance(); On 2016/10/07 12:40:32, titzer wrote: ...

4 years, 2 months ago (2016-10-07 14:44:24 UTC) #9

titzer

https://codereview.chromium.org/2395063002/diff/1/src/wasm/wasm-module.cc File src/wasm/wasm-module.cc (right): https://codereview.chromium.org/2395063002/diff/1/src/wasm/wasm-module.cc#newcode1221 src/wasm/wasm-module.cc:1221: code_table = factory->CopyFixedArray(old_code_table); On 2016/10/07 14:44:23, Mircea Trofin wrote: ...

4 years, 2 months ago (2016-10-07 15:37:42 UTC) #10

Mircea Trofin

On 2016/10/07 15:37:42, titzer wrote: > https://codereview.chromium.org/2395063002/diff/1/src/wasm/wasm-module.cc > File src/wasm/wasm-module.cc (right): > > https://codereview.chromium.org/2395063002/diff/1/src/wasm/wasm-module.cc#newcode1221 > ...

4 years, 2 months ago (2016-10-07 15:49:43 UTC) #11

Mircea Trofin

The CQ bit was checked by mtrofin@chromium.org to run a CQ dry run

4 years, 2 months ago (2016-10-07 18:15:15 UTC) #12

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2395063002/20001

4 years, 2 months ago (2016-10-07 18:15:19 UTC) #13

Mircea Trofin

Description was changed from ========== [wasm] fix flaky asm-wasm The spurious failures were caused by ...

4 years, 2 months ago (2016-10-07 18:27:05 UTC) #14

Description was changed from

==========
[wasm] fix flaky asm-wasm

The spurious failures were caused by finalization of compiled module
template between time we get its code object array and time we 
determine if it was owned. In turn, this meant that we didn't use the
correct address for patching references to globals.

I plan to make a more substantial change to improve the robustness
of the instance management, by upfronting all "data collection" phases. 
For now, this change should remove the flake.

BUG=v8:5451
==========

to

==========
[wasm] fix flaky asm-wasm

The spurious failures were caused by the compiled module
template and its corresponding owning object getting out of
sync due to memory allocations (which may trigger GC) between the points each
were fetched. 

Specifically, the {original} was first obtained; then a GC 
may happen when cloning the {code_table}. At this point, 
the {original}'s owner may have been collected, getting us
down the path of not cloning. When time comes to patch up
globals, we incorrectly try to patch them assuming the
global start is at 0 (nullptr), which in fact it isn't.

This change roots early, in a GC-free area, both objects. 
Additionally, it avoids publishing to the instances chain 
the new instance until the very end. This way:
- the objects used to create the new instance offer a 
consistent view
- the instances chain does not see the object we try to
form. If something fails, we can safely retry.
- since the owner is rooted, the state of the front of the 
instances chain stays unchanged - with the same compiled
module we started from. So the early belief that we needed
to clone is not invalidated by any interspersed GC.

This situation suffers from a sub-optimality discussed in
the design document, in that, in a memory constrained 
system, the following snippet may surprisingly fail:


var m = new WebAssembly.Module(...);
var i1 = new WebAssembly.Instance(m);
i1 = null;
var i2 = new WebAssembly.Instance(m); //may fail.

This will be addressed subsequently.

BUG=v8:5451
==========
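
The race in the new description can be illustrated with a toy model. This is a minimal sketch, not V8 code: `makeTemplate`, `simulateGC`, `patchGlobals`, `instantiate` and the addresses are all hypothetical names and values invented for illustration. It simplifies the real code path (it always clones) and captures only the symptom: if the owner is read after a GC has cleared it, the old globals start is assumed to be 0 and no reference gets patched.

```javascript
// Toy model only: every name here is hypothetical, none of it is V8's API.

// A compiled-module template whose code embeds the owner's globals address
// and which points at its owning instance through a weak cell.
function makeTemplate(owner) {
  return {
    codeTable: [{ globalsRef: owner.globalsStart }],
    weakOwner: { target: owner },
  };
}

// Simulated GC: clears the weak cell unless its target is in `roots`.
function simulateGC(template, roots) {
  const cell = template.weakOwner;
  if (cell.target !== null && !roots.includes(cell.target)) {
    cell.target = null;
  }
}

// Rewrites references to oldStart so they point at newStart.
function patchGlobals(codeTable, oldStart, newStart) {
  let patched = 0;
  for (const code of codeTable) {
    if (code.globalsRef === oldStart) {
      code.globalsRef = newStart;
      patched++;
    }
  }
  return patched;
}

function instantiate(template, newGlobalsStart, rootEarly) {
  // The fix in this model: take a strong reference before anything allocates.
  const rooted = rootEarly ? template.weakOwner.target : null;
  const roots = rooted === null ? [] : [rooted];

  // Cloning allocates, and an allocation may trigger a GC.
  const codeTable = template.codeTable.map((code) => ({ ...code }));
  simulateGC(template, roots);

  // Reading the owner only now is what goes wrong without rooting.
  const owner = template.weakOwner.target;
  const oldStart = owner === null ? 0 : owner.globalsStart;
  const patched = patchGlobals(codeTable, oldStart, newGlobalsStart);
  return { codeTable, patched };
}

const racy = instantiate(makeTemplate({ globalsStart: 0x1000 }), 0x2000, false);
const rooted = instantiate(makeTemplate({ globalsStart: 0x1000 }), 0x2000, true);
console.log(racy.patched, racy.codeTable[0].globalsRef.toString(16));     // prints: 0 1000
console.log(rooted.patched, rooted.codeTable[0].globalsRef.toString(16)); // prints: 1 2000
```

In the unrooted run the cloned code still refers to the old globals address, which matches the failure the description attributes to patching from a start of 0.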

Mircea Trofin

Description was changed from ========== [wasm] fix flaky asm-wasm The spurious failures were caused by ...

4 years, 2 months ago (2016-10-07 18:28:09 UTC) #15

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 2 months ago (2016-10-07 18:52:31 UTC) #16

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

4 years, 2 months ago (2016-10-07 18:52:32 UTC) #17

Mircea Trofin

The CQ bit was checked by mtrofin@chromium.org to run a CQ dry run

4 years, 2 months ago (2016-10-07 20:27:57 UTC) #18

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2395063002/40001

4 years, 2 months ago (2016-10-07 20:27:59 UTC) #19

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 2 months ago (2016-10-07 21:06:58 UTC) #22

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

4 years, 2 months ago (2016-10-07 21:06:59 UTC) #23

Mircea Trofin

The CQ bit was checked by mtrofin@chromium.org to run a CQ dry run

4 years, 2 months ago (2016-10-07 21:12:12 UTC) #24

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2395063002/60001

4 years, 2 months ago (2016-10-07 21:12:15 UTC) #25

Mircea Trofin

Description was changed from ========== [wasm] fix flaky asm-wasm The spurious failures were caused by ...

4 years, 2 months ago (2016-10-07 21:19:45 UTC) #26

Description was changed from

==========
[wasm] fix flaky asm-wasm

The spurious failures were caused by the compiled module
template and its corresponding owning object getting out of
sync due to memory allocations (which may trigger GC)
between the points each were fetched. 

Specifically, the {original} was first obtained; then a GC 
may happen when cloning the {code_table}. At this point, 
the {original}'s owner may have been collected, getting us
down the path of not cloning. When time comes to patch up
globals, we incorrectly try to patch them assuming the
global start is at 0 (nullptr), which in fact it isn't.

This change roots early, in a GC-free area, both objects. 
Additionally, it avoids publishing to the instances chain 
the new instance until the very end. This way:
- the objects used to create the new instance offer a 
consistent view
- the instances chain does not see the object we try to
form. If something fails, we can safely retry.
- since the owner is rooted, the state of the front of the 
instances chain stays unchanged - with the same compiled
module we started from. So the early belief that we needed
to clone is not invalidated by any interspersed GC.

This situation suffers from a sub-optimality discussed in
the design document, in that, in a memory constrained 
system, the following snippet may surprisingly fail:


var m = new WebAssembly.Module(...);
var i1 = new WebAssembly.Instance(m);
i1 = null;
var i2 = new WebAssembly.Instance(m); //may fail.

This will be addressed subsequently.

BUG=v8:5451
==========

to

==========
[wasm] Fix wasm instantiation flakes

The spurious failures were caused by the compiled module
template and its corresponding owning object getting out of
sync due to memory allocations (which may trigger GC)
between the points each were fetched. 

Specifically, the {original} was first obtained; then a GC 
may happen when cloning the {code_table}. At this point, 
the {original}'s owner may have been collected, getting us
down the path of not cloning. When time comes to patch up
globals, we incorrectly try to patch them assuming the
global start is at 0 (nullptr), which in fact it isn't.

This change roots early, in a GC-free area, both objects. 
Additionally, it avoids publishing to the instances chain 
the new instance until the very end. This way:
- the objects used to create the new instance offer a 
consistent view
- the instances chain does not see the object we try to
form. If something fails, we can safely retry.
- since the owner is rooted, the state of the front of the 
instances chain stays unchanged - with the same compiled
module we started from. So the early belief that we needed
to clone is not invalidated by any interspersed GC.

This situation suffers from a sub-optimality discussed in
the design document, in that, in a memory constrained 
system, the following snippet may surprisingly fail:


var m = new WebAssembly.Module(...);
var i1 = new WebAssembly.Instance(m);
i1 = null;
var i2 = new WebAssembly.Instance(m); //may fail.

This will be addressed subsequently.

BUG=v8:5451
==========
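
The "publish at the very end" point in the description can be sketched the same way. This is a toy model under stated assumptions: `instantiateAndPublish`, the `chain` array and the build steps are hypothetical and stand in for the instances chain and the instantiation phases; they are not taken from the patch.

```javascript
// Toy model only: names are hypothetical, not V8's API.

// Builds an instance step by step and links it into the chain only after
// every step has succeeded.
function instantiateAndPublish(chain, buildSteps) {
  const instance = { parts: [] };
  for (const step of buildSteps) {
    step(instance); // may throw; the chain has not been touched yet
  }
  chain.unshift(instance); // publish the fully formed instance at the front
  return instance;
}

const chain = [];
const addCode = (instance) => instance.parts.push('code');
const failingStep = () => {
  throw new Error('simulated allocation failure');
};

let failed = false;
try {
  instantiateAndPublish(chain, [addCode, failingStep]);
} catch (e) {
  failed = true;
}
const lengthAfterFailure = chain.length; // the half-built instance was never linked

// Because the chain is unchanged, a retry starts from the same state.
const retried = instantiateAndPublish(chain, [addCode]);
```

Since nothing is linked until the last line of `instantiateAndPublish`, a failure in any step leaves the chain exactly as it was, which is what makes the retry safe in this model.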

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 2 months ago (2016-10-07 21:45:43 UTC) #27

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

4 years, 2 months ago (2016-10-07 21:45:44 UTC) #28

titzer

On 2016/10/07 15:49:43, Mircea Trofin wrote: > On 2016/10/07 15:37:42, titzer wrote: > > https://codereview.chromium.org/2395063002/diff/1/src/wasm/wasm-module.cc ...

4 years, 2 months ago (2016-10-08 09:39:57 UTC) #29

Mircea Trofin

Gentle reminder - ptal. I re-ran the stress tests a few times. All green, consistently, ...

4 years, 2 months ago (2016-10-10 06:09:20 UTC) #30

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2395063002/60001

4 years, 2 months ago (2016-10-10 14:25:54 UTC) #33

commit-bot: I haz the power

Description was changed from ========== [wasm] Fix wasm instantiation flakes The spurious failures were caused ...

4 years, 2 months ago (2016-10-10 14:53:52 UTC) #34

Message was sent while issue was closed.

commit-bot: I haz the power

Description was changed from ========== [wasm] Fix wasm instantiation flakes The spurious failures were caused ...

4 years, 2 months ago (2016-10-10 14:54:02 UTC) #36

Message was sent while issue was closed.

Description was changed from

==========
[wasm] Fix wasm instantiation flakes

The spurious failures were caused by the compiled module
template and its corresponding owning object getting out of
sync due to memory allocations (which may trigger GC)
between the points each were fetched. 

Specifically, the {original} was first obtained; then a GC 
may happen when cloning the {code_table}. At this point, 
the {original}'s owner may have been collected, getting us
down the path of not cloning. When time comes to patch up
globals, we incorrectly try to patch them assuming the
global start is at 0 (nullptr), which in fact it isn't.

This change roots early, in a GC-free area, both objects. 
Additionally, it avoids publishing to the instances chain 
the new instance until the very end. This way:
- the objects used to create the new instance offer a 
consistent view
- the instances chain does not see the object we try to
form. If something fails, we can safely retry.
- since the owner is rooted, the state of the front of the 
instances chain stays unchanged - with the same compiled
module we started from. So the early belief that we needed
to clone is not invalidated by any interspersed GC.

This situation suffers from a sub-optimality discussed in
the design document, in that, in a memory constrained 
system, the following snippet may surprisingly fail:


var m = new WebAssembly.Module(...);
var i1 = new WebAssembly.Instance(m);
i1 = null;
var i2 = new WebAssembly.Instance(m); //may fail.

This will be addressed subsequently.

BUG=v8:5451
==========

to

==========
[wasm] Fix wasm instantiation flakes

The spurious failures were caused by the compiled module
template and its corresponding owning object getting out of
sync due to memory allocations (which may trigger GC)
between the points each were fetched.

Specifically, the {original} was first obtained; then a GC
may happen when cloning the {code_table}. At this point,
the {original}'s owner may have been collected, getting us
down the path of not cloning. When time comes to patch up
globals, we incorrectly try to patch them assuming the
global start is at 0 (nullptr), which in fact it isn't.

This change roots early, in a GC-free area, both objects.
Additionally, it avoids publishing to the instances chain
the new instance until the very end. This way:
- the objects used to create the new instance offer a
consistent view
- the instances chain does not see the object we try to
form. If something fails, we can safely retry.
- since the owner is rooted, the state of the front of the
instances chain stays unchanged - with the same compiled
module we started from. So the early belief that we needed
to clone is not invalidated by any interspersed GC.

This situation suffers from a sub-optimality discussed in
the design document, in that, in a memory constrained
system, the following snippet may surprisingly fail:

var m = new WebAssembly.Module(...);
var i1 = new WebAssembly.Instance(m);
i1 = null;
var i2 = new WebAssembly.Instance(m); //may fail.

This will be addressed subsequently.

BUG=v8:5451

Committed: https://crrev.com/b75a0c4a555278de8c59695e55f26a4e2ea6c862
Cr-Commit-Position: refs/heads/master@{#40126}
==========
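
For anyone who wants to try the snippet from the description, here is a minimal runnable sketch. The module bytes are an assumption: the original elides the constructor argument, so this uses the 8-byte header of an empty module in the current binary format (magic `\0asm`, version 1), which is not necessarily the format V8 accepted at the time of this review. The `try`/`catch` is also an addition, there only to show where the failure mentioned in the description would surface.

```javascript
// Assumption: an empty module in the current wasm binary format
// (magic "\0asm" followed by version 1). The description elides the bytes.
const emptyModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic
  0x01, 0x00, 0x00, 0x00, // version
]);

const m = new WebAssembly.Module(emptyModuleBytes);
let i1 = new WebAssembly.Instance(m);
i1 = null; // drop the only reference to the first instance

let i2 = null;
try {
  // The call the description says may fail on a memory-constrained system.
  i2 = new WebAssembly.Instance(m);
} catch (e) {
  console.log('second instantiation failed:', e.message);
}
```

Whether the second instantiation fails depends on the engine build and on memory pressure; the sketch only shows the shape of the scenario.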

commit-bot: I haz the power

4 years, 2 months ago (2016-10-10 14:54:03 UTC) #37

Message was sent while issue was closed.

Patchset 3 (id:??) landed as
https://crrev.com/b75a0c4a555278de8c59695e55f26a4e2ea6c862
Cr-Commit-Position: refs/heads/master@{#40126}

Issue 2395063002: [wasm] Fix wasm instantiation flakes (Closed)

Patch Set 1

Patch Set 2: better fix

Patch Set 3: better fix