Created: 6 years, 9 months ago by jvoung (off chromium)
Modified: 6 years, 9 months ago
CC: native-client-reviews_googlegroups.com
Base URL: http://git.chromium.org/native_client/pnacl-llvm.git@master
Visibility: Public.
Description
Add self-scheduling to threaded translation (vs static)
For the x86-64 sandboxed translator it makes about a 5%
difference over a set of 13 pexes from the web / webstore.
For python.pexe this reduces the translation time w/ 4
threads from 43 seconds to 37 seconds. Most of the other
web benchmarks only get slight improvements (if any).
For the ARM sandboxed translator, it's about a 4.5%
aggregate difference on the A15. Again, most of the
apps are a wash except for python, gamecake and the
gameboy emulator.
For the unsandboxed translator, this is often a wash,
except on a few benchmarks.
Testing on spec2k: Same or better for sandboxed translator.
Notably, crafty translation time drops from 1.59 secs to
1.19 secs w/ CPU utilization going from 1.8 to 2.7
w/ 4 threads. With two threads it's 2.1 secs to 1.6 secs
and CPU utilization from 1.2 to 1.7. Crafty was one of
the benchmarks that regressed after switching from LLVM 3.3
to 3.4, and perhaps part of that was bad scheduling.
The gcc and eon benchmarks also get a nice boost.
Similar on ARM: crafty drops from 5.7 secs to 4.8 secs w/ 2 threads.
Use a ChunkSize of 8 to reduce the synchronization
overhead somewhat, hopefully without hurting streaming
translation. The chunk size tapers off to 1 when there
are few functions remaining, to attempt finer-grained
load balancing near the end.
For some idea of how this affects streaming translation:
the average function size for the benchmarks ranges
from 160 bytes to 1700 bytes, so 8 times that is
about 1.3KB to 14KB. The geomean of the average function
sizes is 570 bytes; at that geomean, 4 threads times 8
functions would grab ~18KB at a time.
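
For illustration, here is a minimal sketch of a chunk-size heuristic along these lines. The class and method names mirror the ThreadedFunctionQueue discussed in the review below, but the exact tapering formula is an assumption, not the committed code:

  #include <algorithm>

  // Sketch: hand out chunks of up to 8 functions to amortize the
  // synchronization cost, and taper toward 1 as the remaining work
  // shrinks so the tail can be load-balanced more finely.
  class ThreadedFunctionQueueSketch {
  public:
    ThreadedFunctionQueueSketch(unsigned NumThreads, unsigned NumFunctions)
        : NumThreads(NumThreads), NumFunctions(NumFunctions),
          CurrentFunction(0) {}

    unsigned RecommendedChunkSize() const {
      unsigned Remaining =
          NumFunctions > CurrentFunction ? NumFunctions - CurrentFunction : 0;
      unsigned PerThread = Remaining / std::max(1u, NumThreads);
      return std::max(1u, std::min(8u, PerThread));
    }

  private:
    const unsigned NumThreads;
    const unsigned NumFunctions;
    unsigned CurrentFunction; // index of the next unclaimed function
  };

With 4 threads, a queue like this would return 8 while at least 32 functions remain and taper down to 1 as the queue drains, which matches the behavior described above.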
BUG= http://code.google.com/p/nativeclient/issues/detail?id=3777
R=jfb@chromium.org, stichnot@chromium.org
Committed: https://gerrit.chromium.org/gerrit/gitweb?p=native_client/pnacl-llvm.git;a=commit;h=91ec3c7
Patch Set 1
Patch Set 2 : cleanup
Patch Set 3 : lower chunk size just in case
Total comments: 15
Patch Set 4 : jim review
Patch Set 5 : two dots
Patch Set 6 : disallow copy and assign
Total comments: 21
Patch Set 7 : JFs review
Patch Set 8 : add assert and comment, etc.
Patch Set 9 : reinstate check
Messages
Total messages: 12 (0 generated)
https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/Threaded...
File tools/pnacl-llc/ThreadedFunctionQueue.h (right):

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:68: unsigned RecommendedChunkSize() {
declare as const

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:81: unsigned SplitModuleCount;
Make SplitModuleCount and NumFunctions const?

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
File tools/pnacl-llc/pnacl-llc.cpp (right):

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:135: NoSplitModuleDynamic("no-split-module-dynamic",
Instead of naming a variable with a negative meaning, could we use something like SplitModuleDynamic or SplitModuleStatic?

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:458: while(FuncIndex < NextIndex) {
while( ==> while (
http://llvm.org/docs/CodingStandards.html#spaces-before-parentheses

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:464: if (I == E) {
Maybe add "&& I != E" to the "while" condition instead of "if(...)break;"?

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:469: while(FuncIndex < NextIndex) {
while( ==> while (
http://llvm.org/docs/CodingStandards.html#spaces-before-parentheses

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:472: if (I == E) {
Maybe add "&& I != E" to the "while" condition instead of "if(...)break;"?
Thanks!

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/Threaded...
File tools/pnacl-llc/ThreadedFunctionQueue.h (right):

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:68: unsigned RecommendedChunkSize() {
On 2014/03/18 20:49:49, stichnot wrote:
> declare as const
Done.

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:81: unsigned SplitModuleCount;
On 2014/03/18 20:49:49, stichnot wrote:
> Make SplitModuleCount and NumFunctions const?
Done.

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
File tools/pnacl-llc/pnacl-llc.cpp (right):

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:135: NoSplitModuleDynamic("no-split-module-dynamic",
On 2014/03/18 20:49:49, stichnot wrote:
> Instead of naming a variable with a negative meaning, could we use something
> like SplitModuleDynamic or SplitModuleStatic?
Changed to small enum -- is that what you had in mind?

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:458: while(FuncIndex < NextIndex) {
On 2014/03/18 20:49:49, stichnot wrote:
> while( ==> while (
> http://llvm.org/docs/CodingStandards.html#spaces-before-parentheses
Done.

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:464: if (I == E) {
On 2014/03/18 20:49:49, stichnot wrote:
> Maybe add "&& I != E" to the "while" condition instead of "if(...)break;"?
Done. Well at least I didn't use a goto =)

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:469: while(FuncIndex < NextIndex) {
On 2014/03/18 20:49:49, stichnot wrote:
> while( ==> while (
> http://llvm.org/docs/CodingStandards.html#spaces-before-parentheses
Done.

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:472: if (I == E) {
On 2014/03/18 20:49:49, stichnot wrote:
> Maybe add "&& I != E" to the "while" condition instead of "if(...)break;"?
Done.
lgtm

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
File tools/pnacl-llc/pnacl-llc.cpp (right):

https://codereview.chromium.org/196793026/diff/30001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:135: NoSplitModuleDynamic("no-split-module-dynamic",
On 2014/03/18 22:01:17, jvoung (cr) wrote:
> On 2014/03/18 20:49:49, stichnot wrote:
> > Instead of naming a variable with a negative meaning, could we use something
> > like SplitModuleDynamic or SplitModuleStatic?
>
> Changed to small enum -- is that what you had in mind?
I was just thinking of naming it without using the word "No", but this is cool. :)
https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
File tools/pnacl-llc/ThreadedFunctionQueue.h (right):

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:23: : SplitModuleCount(SplitModuleCount),
Silly corner case, but checking that SplitModuleCount isn't 0 seems like a good idea.

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:35: return FuncIndex % SplitModuleCount == ThreadIndex;
IIUC this won't work if SplitModuleCount < ThreadIndex? Say you have 16 threads and you statically split modules 8 ways, then each function will get translated twice?

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:57: return __sync_bool_compare_and_swap(&CurrentFunction, Cur, NextIndex);
The failure case here doesn't actually set NextIndex at the right location: say ChunkSize was 8 for this thread and 4 for another, the other thread consumes first, then this thread will fail consuming, but its NextIndex will be advanced by 8 instead of 4.

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:71: std::max((int)(NumFunctions - CurrentFunction), 1);
Converting an unsigned that's not representable as signed is undefined behavior. I'd just have everything be signed from the start :-)

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:76: return std::max(1U, DynamicChunkSize);
return max(1, min(8, DynamicChunkSize));

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:86: void operator=(const ThreadedFunctionQueue&) LLVM_DELETED_FUNCTION;
Not that it matters since it's deleted, but operator= usually returns T&.

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-ll...
File tools/pnacl-llc/pnacl-llc.cpp (right):

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:453: if (SplitModuleSched == SplitModuleStatic) {
It would be nicer if this were a switch on SplitModuleSched, since adding a new scheduling type then wouldn't require changing if/else/else if, and may give a compiler warning on unhandled enum value.

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:470: while (FuncIndex < NextIndex && I != E) {
Can it happen that I == E? Shouldn't we have failed to get that many functions?
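
As an aside, a minimal sketch of the switch-on-enum dispatch suggested in the last comment, assuming a two-value scheduling enum (names are illustrative, not the patch); omitting the default case is what lets the compiler warn about unhandled enum values:

  enum SplitModuleSchedKind { SplitModuleStatic, SplitModuleDynamic };

  static void translateWithSched(SplitModuleSchedKind Sched) {
    switch (Sched) { // no default: -Wswitch flags newly added enum values
    case SplitModuleStatic:
      // each thread statically keeps functions where FuncIndex % NumThreads == Tid
      break;
    case SplitModuleDynamic:
      // threads repeatedly grab chunks of functions from the shared queue
      break;
    }
  }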
https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
File tools/pnacl-llc/ThreadedFunctionQueue.h (right):

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:23: : SplitModuleCount(SplitModuleCount),
On 2014/03/19 04:47:05, JF wrote:
> Silly corner case, but checking that SplitModuleCount isn't 0 seems like a good
> idea.
Done.

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:35: return FuncIndex % SplitModuleCount == ThreadIndex;
On 2014/03/19 04:47:05, JF wrote:
> IIUC this won't work if SplitModuleCount < ThreadIndex? Say you have 16 threads
> and you statically split modules 8 ways, then each function will get translated
> twice?
It's pretty unlikely SplitModuleCount < ThreadIndex, or if it is we would have a different parallelization strategy.

That would mean it's safe to share the same module between threads. If that was true, then we would just share the single global module between every thread instead of having the overhead of cloning global variables and memcpy'ing bitcode stream bytes for the different modules.

However, this probably won't work with SplitModuleCount > # Threads either. It would leave some functions out because there wouldn't be any threads to work on some of the modules. In order to support that, the module sharding would need to be organized a bit differently. E.g., some modules would take functions [0, NumFuncs/N), some would take [NumFuncs/N, 2 * NumFuncs/N), ... Then those module sets would be sharded further between the threads, e.g., with this modulo style of sharding.

Overall, I think this class is only concerned about sharding things between # of threads, and the range of functions. For now, the range of functions is [0, NumFuncs). Perhaps SplitModuleCount should be renamed to NumThreads to be more accurate then.

If we ever change the sharding so that SplitModuleCount != NumThreads, then we only need to change the function range assumption away from [0, NumFuncs). The linker will also complain loudly when we make such changes to the assumptions without having fixed up each part.

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:57: return __sync_bool_compare_and_swap(&CurrentFunction, Cur, NextIndex);
On 2014/03/19 04:47:05, JF wrote:
> The failure case here doesn't actually set NextIndex at the right location: say
> ChunkSize was 8 for this thread and 4 for another, the other thread consumes
> first, then this thread will fail consuming, but its NextIndex will be advanced
> by 8 instead of 4.
Good catch -- I'll see if the cmpxchange succeeded or not. If it failed then re-read CurrentFunction into NextIndex.

Otherwise, perhaps just using a mutex would have been more clean. The perf is actually about the same w/ a mutex vs cmpxchange (at least on my linux x86 workstation)...

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:71: std::max((int)(NumFunctions - CurrentFunction), 1);
On 2014/03/19 04:47:05, JF wrote:
> Converting an unsigned that's not representable as signed is undefined behavior.
> I'd just have everything be signed from the start :-)
Note that NumFunctions was unsigned to begin with because .size() returns a size_t (unsigned). So if I make it signed, then I'd have to add a check that size() returns something within range. Once I do that check in the ctor, I guess I wouldn't need to do any checks after though.

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:76: return std::max(1U, DynamicChunkSize);
On 2014/03/19 04:47:05, JF wrote:
> return max(1, min(8, DynamicChunkSize));
Done.

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded...
tools/pnacl-llc/ThreadedFunctionQueue.h:86: void operator=(const ThreadedFunctionQueue&) LLVM_DELETED_FUNCTION;
On 2014/03/19 04:47:05, JF wrote:
> Not that it matters since it's deleted, but operator= usually returns T&.
LLVM's style is to have void, e.g., include/llvm/IR/Instruction.h:
void operator=(const Instruction &) LLVM_DELETED_FUNCTION;
So... I'm just following that style.

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-ll...
File tools/pnacl-llc/pnacl-llc.cpp (right):

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:453: if (SplitModuleSched == SplitModuleStatic) {
On 2014/03/19 04:47:05, JF wrote:
> It would be nicer if this were a switch on SplitModuleSched, since adding a new
> scheduling type then wouldn't require changing if/else/else if, and may give a
> compiler warning on unhandled enum value.
Done.

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:470: while (FuncIndex < NextIndex && I != E) {
On 2014/03/19 04:47:05, JF wrote:
> Can it happen that I == E? Shouldn't we have failed to get that many functions?
It happened back when I was experimenting with ChunkSize tapering and did not necessarily have ChunkSize taper down to 1 near the end. E.g., taper down to 2 instead.

Now that it tapers down to 1, we can assume "I" will hit "E" exactly when FuncIndex hits NextIndex. If we ever change the tapering, then we'd have to readjust.
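
To make the two sharding schemes in this exchange concrete, here is a small sketch: the modulo check is the one quoted from the patch, while the range-based variant is only the hypothetical reorganization described above, not anything implemented:

  // Static modulo sharding, as in the quoted check: thread T owns every
  // function whose index is congruent to T modulo SplitModuleCount.
  static bool ownsFunctionStatic(unsigned FuncIndex, unsigned ThreadIndex,
                                 unsigned SplitModuleCount) {
    return FuncIndex % SplitModuleCount == ThreadIndex;
  }

  // Hypothetical range sharding for SplitModuleCount != NumThreads: module M
  // would take the contiguous range [M * NumFuncs / N, (M + 1) * NumFuncs / N),
  // which could then be sharded further among the threads.
  static bool inModuleRange(unsigned FuncIndex, unsigned ModuleIndex,
                            unsigned SplitModuleCount, unsigned NumFuncs) {
    unsigned Begin = ModuleIndex * NumFuncs / SplitModuleCount;
    unsigned End = (ModuleIndex + 1) * NumFuncs / SplitModuleCount;
    return FuncIndex >= Begin && FuncIndex < End;
  }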
https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded... File tools/pnacl-llc/ThreadedFunctionQueue.h (right): https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded... tools/pnacl-llc/ThreadedFunctionQueue.h:35: return FuncIndex % SplitModuleCount == ThreadIndex; On 2014/03/19 18:44:37, jvoung (cr) wrote: > On 2014/03/19 04:47:05, JF wrote: > > IIUC this won't work if SplitModuleCount < ThreadIndex? Say you have 16 > threads > > and you statically split modules 8 ways, then each function will get > translated > > twice? > > It's pretty unlikely SplitModuleCount < ThreadIndex, or if it is we would have a > different parallelization strategy. > > That would mean it's safe to share the same module between threads. If that was > true, then we would just share the single global module between every thread > instead of having the overhead of cloning global variables and memcpy'ing > bitcode stream bytes for the different modules. > > > However, this probably won't work with SplitModuleCount > # Threads either. It > would leave some functions out because there wouldn't be any threads to work on > some of the modules. In order to support that, the module sharding would need to > be organized a bit differently. E.g., some modules would take functions [0, > NumFuncs/N), some would take [NumFuncs/N, 2 * NumFuncs/N), ... Then those module > sets would be sharded further between the threads, e.g., with this modulo style > of sharding. > > Overall, I think this class is only concerned about sharding things between # of > threads, and the range of functions. For now, the range of functions is [0, > NumFuncs). Perhaps SplitModuleCount should be renamed to NumThreads to be more > accurate then. > > If we ever change the sharding so that SplitModuleCount != NumThreads, then we > only need to change the function range assumption away from [0, NumFuncs). The > linker will also complain loudly when we make such changes to the assumptions > without having fixed up each part. > > It would be useful to assert on this value then, with a comment explaining this tradeoff. https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded... tools/pnacl-llc/ThreadedFunctionQueue.h:57: return __sync_bool_compare_and_swap(&CurrentFunction, Cur, NextIndex); On 2014/03/19 18:44:37, jvoung (cr) wrote: > On 2014/03/19 04:47:05, JF wrote: > > The failure case here doesn't actually set NextIndex at the right location: > say > > ChunkSize was 8 for this thread and 4 for another, the other thread consumes > > first, then this thread will fail consuming, but its NextIndex will be > advanced > > by 8 instead of 4. > > Good catch -- I'll see if the cmpxchange succeeded or not. If it failed then > re-read CurrentFunction into NextIndex. > > Otherwise, perhaps just using a mutex would have been more clean. The perf is > actually about the same w/ a mutex vs cmpxchange (at least on my linux x86 > workstation)... That works, but it's cleaner to just use: if (Cur == (Index = __sync_compare_and_swap)) { return true; } NextIndex = Index; return false; You remove one contended read in that case.
https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded... File tools/pnacl-llc/ThreadedFunctionQueue.h (right): https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded... tools/pnacl-llc/ThreadedFunctionQueue.h:35: return FuncIndex % SplitModuleCount == ThreadIndex; On 2014/03/19 20:26:08, JF wrote: > On 2014/03/19 18:44:37, jvoung (cr) wrote: > > On 2014/03/19 04:47:05, JF wrote: > > > IIUC this won't work if SplitModuleCount < ThreadIndex? Say you have 16 > > threads > > > and you statically split modules 8 ways, then each function will get > > translated > > > twice? > > > > It's pretty unlikely SplitModuleCount < ThreadIndex, or if it is we would have > a > > different parallelization strategy. > > > > That would mean it's safe to share the same module between threads. If that > was > > true, then we would just share the single global module between every thread > > instead of having the overhead of cloning global variables and memcpy'ing > > bitcode stream bytes for the different modules. > > > > > > However, this probably won't work with SplitModuleCount > # Threads either. It > > would leave some functions out because there wouldn't be any threads to work > on > > some of the modules. In order to support that, the module sharding would need > to > > be organized a bit differently. E.g., some modules would take functions [0, > > NumFuncs/N), some would take [NumFuncs/N, 2 * NumFuncs/N), ... Then those > module > > sets would be sharded further between the threads, e.g., with this modulo > style > > of sharding. > > > > Overall, I think this class is only concerned about sharding things between # > of > > threads, and the range of functions. For now, the range of functions is [0, > > NumFuncs). Perhaps SplitModuleCount should be renamed to NumThreads to be more > > accurate then. > > > > If we ever change the sharding so that SplitModuleCount != NumThreads, then we > > only need to change the function range assumption away from [0, NumFuncs). The > > linker will also complain loudly when we make such changes to the assumptions > > without having fixed up each part. > > > > > > It would be useful to assert on this value then, with a comment explaining this > tradeoff. Done. https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/Threaded... tools/pnacl-llc/ThreadedFunctionQueue.h:57: return __sync_bool_compare_and_swap(&CurrentFunction, Cur, NextIndex); On 2014/03/19 20:26:08, JF wrote: > On 2014/03/19 18:44:37, jvoung (cr) wrote: > > On 2014/03/19 04:47:05, JF wrote: > > > The failure case here doesn't actually set NextIndex at the right location: > > say > > > ChunkSize was 8 for this thread and 4 for another, the other thread consumes > > > first, then this thread will fail consuming, but its NextIndex will be > > advanced > > > by 8 instead of 4. > > > > Good catch -- I'll see if the cmpxchange succeeded or not. If it failed then > > re-read CurrentFunction into NextIndex. > > > > Otherwise, perhaps just using a mutex would have been more clean. The perf is > > actually about the same w/ a mutex vs cmpxchange (at least on my linux x86 > > workstation)... > > That works, but it's cleaner to just use: > if (Cur == (Index = __sync_compare_and_swap)) { > return true; > } > NextIndex = Index; > return false; > > You remove one contended read in that case. Done.
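
For reference, a minimal sketch of the corrected grab using the value-returning builtin, so a failed exchange resynchronizes NextIndex with whatever another thread installed. Function and parameter names are illustrative, not the exact patch:

  // CurrentFunction is the shared index of the next unclaimed function.
  // Returns true if this thread now owns [Cur, NextIndex); on failure,
  // NextIndex is set to the winner's value so the caller retries from the
  // true current position instead of over-advancing by its own ChunkSize.
  static bool grabChunk(unsigned *CurrentFunction, unsigned Cur,
                        unsigned ChunkSize, unsigned &NextIndex) {
    NextIndex = Cur + ChunkSize;
    unsigned Observed =
        __sync_val_compare_and_swap(CurrentFunction, Cur, NextIndex);
    if (Observed == Cur)
      return true;
    NextIndex = Observed; // lost the race; resync rather than guess
    return false;
  }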
lgtm
https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-ll...
File tools/pnacl-llc/pnacl-llc.cpp (right):

https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-ll...
tools/pnacl-llc/pnacl-llc.cpp:470: while (FuncIndex < NextIndex && I != E) {
On 2014/03/19 18:44:37, jvoung (cr) wrote:
> On 2014/03/19 04:47:05, JF wrote:
> > Can it happen that I == E? Shouldn't we have failed to get that many
> > functions?
>
> It happened back when I was experimenting with ChunkSize tapering and did not
> necessarily have ChunkSize taper down to 1 near the end. E.g., taper down to 2
> instead.
>
> Now that it tapers down to 1, we can assume "I" will hit "E" exactly when
> FuncIndex hits NextIndex. If we ever change the tapering, then we'd have to
> readjust.
Looks like not all threads agree on how many functions and declarations there should be:

Thread 1 finished at 115
Thread 3 finished at 116
Thread 0 finished at 116
Thread 2 finished at 116

This can happen in small programs if some thread is "lucky" and never sees a function w/ a function call... when they run the ResolvePNaClIntrinsics.cpp function pass, they won't end up adding intrinsic declarations to the module because that only happens when visitCall() is used. Meanwhile other threads will probably see a function call and add an intrinsic declaration. Adding an intrinsic declaration adds to the set of things that can be iterated over...

If the threads don't agree on where the end should be, then one of the threads may be in a position where they call GrabFunctionDynamic() w/ say FuncIndex = TrueEndForThread - 50, lose the race and are told that NextIndex shall be TrueEndForThread + NumExtraDecls (because TrueEndForThread + NumExtraDecls == TrueEndForSomeOtherThread).

Going to add back the check for now.

I think ultimately, our intrinsic resolving passes should add needed declarations up front early and consistently, so that there isn't this type of difference. Then we can remove the check for I != E. Also, by adding the decls early, the NumFunctions used to initialize FuncQueue will be accurate and include the added decls, instead of being an underapproximation.
On 2014/03/20 15:50:30, jvoung (cr) wrote: > https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-ll... > File tools/pnacl-llc/pnacl-llc.cpp (right): > > https://codereview.chromium.org/196793026/diff/80001/tools/pnacl-llc/pnacl-ll... > tools/pnacl-llc/pnacl-llc.cpp:470: while (FuncIndex < NextIndex && I != E) { > On 2014/03/19 18:44:37, jvoung (cr) wrote: > > On 2014/03/19 04:47:05, JF wrote: > > > Can it happen that I == E? Shouldn't we have failed to get that many > > functions? > > > > It happened back when I was experimenting with ChunkSize tapering and did not > > necessarily have ChunkSize taper down to 1 near the end. E.g., taper down to 2 > > instead. > > > > Now that it tapers down to 1, we can assume "I" will hit "E" exactly when > > FuncIndex hits NextIndex. If we ever change the tapering, then we'd have to > > readjust. > > Looks like not all threads agree on how many functions and declarations there > should be: > > Thread 1 finished at 115 > Thread 3 finished at 116 > Thread 0 finished at 116 > Thread 2 finished at 116 > > This can happen in small programs if some thread is "lucky" and never sees a > function w/ a function call... when they run the ResolvePNaClIntrinsics.cpp > function pass, they won't end up adding intrinsic declarations to the module > because that only happens when visitCall() is used. Meanwhile other threads will > probably see a function call and add an intrinsic declaration. Adding an > intrinsic declaration adds to the set of things that can be iterated over... > > If the threads don't agree on where the end should be, then one of the threads > may be in a position where they call GrabFunctionDynamic() w/ say FuncIndex = > TrueEndForThread - 50, lose the race and are told that NextIndex shall be > TrueEndEndForThread + NumExtraDecls (because TrueEndForThread + NumExtraDecls == > TrueEndForSomeOtherThread). > > Going to add back the check for now. > > I think ultimately, our intrinsic resolving passes should add needed > declarations up front early and consistently, so that there isn't this type of > difference. Then we can remove the check for I != E. Also, by adding the decls > early, the NumFunctions used to initialized FuncQueue will be accurate and > include the added decls, instead of being an underapproximation. Sounds good. Could you leave a comment that summarizes this?
Message was sent while issue was closed.
Committed patchset #9 manually as r91ec3c7 (presubmit successful).