Issue 11231021: Zygote: Initialize NSS fully

Issue 11231021: Zygote: Initialize NSS fully (Closed)

Created:
8 years, 2 months ago by jln (very slow on Chromium)

Modified:
8 years, 1 month ago

Reviewers:
wtc, ddorwin, Ryan Sleevi

CC:
chromium-reviews, joi+watch-content_chromium.org, darin-cc_chromium.org, jam, Jorge Lucangeli Obes, Ami GONE FROM CHROMIUM, Tom Finegan

Base URL:
svn://svn.chromium.org/chrome/trunk/src

Visibility:
Public.

More Reviews

Description

Linux: Initialize NSS fully in the Zygote. We fully initialize NSS in the Zygote. We do so after the setuid sandbox initialization, in order to assert that initializing NSS won't do something we don't like "behind our back". This allows to initialize NSS only once for every renderer. This also guarantees that we get initialized before the seccomp-bpf sandbox starts which in turn will allow us to tighten it in the future. BUG=156864

Patch Set 1 #

Total comments: 10

Patch Set 2 : #

Total comments: 6

Patch Set 3 : Update comment. #

Patch Set 4 : rebase #

Created: 8 years, 2 months ago

Download [raw] [tar.bz2]

		Unified diffs	Side-by-side diffs	Delta from patch set	Stats (+18 lines, -0 lines)			Patch
	M	content/zygote/zygote_main_linux.cc	View	1 2 3	1 chunk	+18 lines, -0 lines	0 comments	Download

Messages

Total messages: 15 (0 generated)

Expand Messages | Collapse Messages

Ryan Sleevi

I appreciate the quick turn around on the change, but I'm still not sure I ...

8 years, 2 months ago (2012-10-19 23:50:10 UTC) #2

I appreciate the quick turn around on the change, but I'm still not sure I
understand the issue(s), which is why I'm still somewhat opposed to this.

I'm all for being able to tighten the sandbox, but to my knowledge, the only
hole that has had to be punched for NSS is

1) Ensure the .sos can be loaded. Doesn't call any functions in them, just
ensures they're loaded
2) Ensure NSS has access to /dev/urandom - which arguably is of benefit to any
zygote'd process

Your replies both on the bug and comments here suggest there are some greater
holes that have been punched, and I'd like to understand those. NSS doesn't
(IIRC) cache the urandom FD, and does periodically try to access it (admittedly,
roughly around after 2^48 bytes of data have been read, although it can be
induced to do so sooner), so I'm not sure this will in any way prevent #2 from
being necessary.

https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
File content/zygote/zygote_main_linux.cc (right):

https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
content/zygote/zygote_main_linux.cc:502: crypto::DisableNSSForkCheck();
Is modifying the |env| like this permitted at this time?

For some reason I was thinking this needed to be done before the first layer.

https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
content/zygote/zygote_main_linux.cc:504: // in turn attempts to open a file.
drop the comment on 503-504. It doesn't matter what function - ANY attempt to
open NSS will attempt to open /dev/urandom - which I'd argue the sandbox should
allow (hence the libc overrides)

Or are you seeing a different file?

https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
content/zygote/zygote_main_linux.cc:506: // leak descriptors to private files.
What files are being opened? What descriptors are you worried about? I'm not
sure I can make sense of this comment?

https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
content/zygote/zygote_main_linux.cc:509: crypto::EnsureNSSInit();
s/benefit/suffer/ :)

It's not clear to me what holes that are punched can be closed with this, thus
it's not clear to me the benefit this brings, compared to the overhead.

ddorwin

This works for me in both chrome and content_browsertests. I can't speak to the correctness, ...

8 years, 2 months ago (2012-10-20 00:45:32 UTC) #3

jln (very slow on Chromium)

Hopefully that's more clear. https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_linux.cc File content/zygote/zygote_main_linux.cc (right): https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_linux.cc#newcode502 content/zygote/zygote_main_linux.cc:502: crypto::DisableNSSForkCheck(); On 2012/10/19 23:50:10, Ryan ...

8 years, 2 months ago (2012-10-20 00:51:29 UTC) #4

Hopefully that's more clear.

https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
File content/zygote/zygote_main_linux.cc (right):

https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
content/zygote/zygote_main_linux.cc:502: crypto::DisableNSSForkCheck();
On 2012/10/19 23:50:10, Ryan Sleevi wrote:
> Is modifying the |env| like this permitted at this time?
> 
> For some reason I was thinking this needed to be done before the first layer.

I didn't now that the env was affected, but I don't see any reason why that
would be a problem, no.

https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
content/zygote/zygote_main_linux.cc:504: // in turn attempts to open a file.
On 2012/10/19 23:50:10, Ryan Sleevi wrote:
> drop the comment on 503-504. It doesn't matter what function - ANY attempt to
> open NSS will attempt to open /dev/urandom - which I'd argue the sandbox
should
> allow (hence the libc overrides)
> 
> Or are you seeing a different file?

Done.

https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
content/zygote/zygote_main_linux.cc:506: // leak descriptors to private files.
On 2012/10/19 23:50:10, Ryan Sleevi wrote:
> What files are being opened? What descriptors are you worried about? I'm not
> sure I can make sense of this comment?

I modified it to make it more generic. What we don't want, is as you explained,
a module to load private data to memory or to open file descriptors and keep
them open.

https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
content/zygote/zygote_main_linux.cc:509: crypto::EnsureNSSInit();
On 2012/10/19 23:50:10, Ryan Sleevi wrote:
> s/benefit/suffer/ :)
> 
> It's not clear to me what holes that are punched can be closed with this, thus
> it's not clear to me the benefit this brings, compared to the overhead.

When this function is called, NSS will make some system calls that we want to
restrict (they bring attack surface).

For instance: advanced scheduling system calls and sysinfo / uname.

This is why crbug.com/156864 exists: the seccomp-bpf sandbox (that restricts the
kernel's API) was started before NSS.

In renderers, this also happens, except, unlike for PPAPI, in renderers we
allowed sysinfo / uname and some of the advanced scheduling operations (this is
the "hole" I'm talking about). I'm now thinking of closing it.

I'm trying to understand the impact of EnsureNSSInit(). Am I correct that if
anything calls, say, crypto::HMAC, this will have the same impact that
EnsureNSSInit() ?

I would think calling it once and for all from the Zygote is a good thing. All
processes forking from the Zygote (renderers, PPAPI), wouldn't have to go
through it again.

Ryan Sleevi

On 2012/10/20 00:51:29, Julien Tinnes wrote: > Hopefully that's more clear. > > https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_linux.cc > ...

8 years, 2 months ago (2012-10-20 01:06:30 UTC) #5

On 2012/10/20 00:51:29, Julien Tinnes wrote:
> Hopefully that's more clear.
> 
>
https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
> File content/zygote/zygote_main_linux.cc (right):
> 
>
https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
> content/zygote/zygote_main_linux.cc:502: crypto::DisableNSSForkCheck();
> On 2012/10/19 23:50:10, Ryan Sleevi wrote:
> > Is modifying the |env| like this permitted at this time?
> > 
> > For some reason I was thinking this needed to be done before the first
layer.
> 
> I didn't now that the env was affected, but I don't see any reason why that
> would be a problem, no.
> 
>
https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
> content/zygote/zygote_main_linux.cc:504: // in turn attempts to open a file.
> On 2012/10/19 23:50:10, Ryan Sleevi wrote:
> > drop the comment on 503-504. It doesn't matter what function - ANY attempt
to
> > open NSS will attempt to open /dev/urandom - which I'd argue the sandbox
> should
> > allow (hence the libc overrides)
> > 
> > Or are you seeing a different file?
> 
> Done.
> 
>
https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
> content/zygote/zygote_main_linux.cc:506: // leak descriptors to private files.
> On 2012/10/19 23:50:10, Ryan Sleevi wrote:
> > What files are being opened? What descriptors are you worried about? I'm not
> > sure I can make sense of this comment?
> 
> I modified it to make it more generic. What we don't want, is as you
explained,
> a module to load private data to memory or to open file descriptors and keep
> them open.
> 
>
https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li...
> content/zygote/zygote_main_linux.cc:509: crypto::EnsureNSSInit();
> On 2012/10/19 23:50:10, Ryan Sleevi wrote:
> > s/benefit/suffer/ :)
> > 
> > It's not clear to me what holes that are punched can be closed with this,
thus
> > it's not clear to me the benefit this brings, compared to the overhead.
> 
> When this function is called, NSS will make some system calls that we want to
> restrict (they bring attack surface).
> 
> For instance: advanced scheduling system calls and sysinfo / uname.
> 
> This is why crbug.com/156864 exists: the seccomp-bpf sandbox (that restricts
the
> kernel's API) was started before NSS.
> 
> In renderers, this also happens, except, unlike for PPAPI, in renderers we
> allowed sysinfo / uname and some of the advanced scheduling operations (this
is
> the "hole" I'm talking about). I'm now thinking of closing it.
> 
> I'm trying to understand the impact of EnsureNSSInit(). Am I correct that if
> anything calls, say, crypto::HMAC, this will have the same impact that
> EnsureNSSInit() ?

Yes.

However, it's presently *only* called by the Chromoting renderer process (rarely
used relative to other processes), the CDM out of process (not at all used yet?
but still, presumably very rarely relative to other processes), and the NACL
validator (as well, rarely used relative to other procseses)

I *don't* like that this adds overhead to every renderer process startup, unless
I'm misunderstanding. What I currently understand is that this code is run every
time the zygote spawns into another sub-process (meaning once per renderer).
Even if it was once-per-zygote, I'd be concerned about the extra overhead of
loading the various NSS structures, error tables, etc into memory, when the vast
majority of the time, they will not be used.

> 
> I would think calling it once and for all from the Zygote is a good thing. All
> processes forking from the Zygote (renderers, PPAPI), wouldn't have to go
> through it again.

If I'm understanding this, this means it's pre-fork init. I'm not sure how well
NSS will handle that (today, we're doing post-fork init AIUI).

I'm admit, my ignorance here is no doubt impeding the review here, I'm still
unclear as to how crbug.com/156864 relates to the seccomp-bpf sandbox. The
backtrace that fischman@ included pointed not at seccomp-bpf, but at urandom not
being primed for the process type.

I'd rather split the discussion / change into two: IF it's just that the CDM
needs urandom, then the CDM/OOP pepper plugin should set up that hole, and
possibly in the zygote. This is a known and easy to reason about change, and
that's why I don't see any hangups.

Then we can begin a separate discussion about how to close down the seccomp-bpf
policies, including possible changes to NSS to make it friendlier to sandboxing
under these policies. For example, the only reason it's calling sysinfo is
because it's unable to open the rand FD (which on CrOS we've modified to
intentionally crash, to catch situations where people forget to prime urandom).
So if we solve the random issue, perhaps we solve your issue with sysinfo /
uname.

jln (very slow on Chromium)

https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_linux.cc File content/zygote/zygote_main_linux.cc (right): https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_linux.cc#newcode493 content/zygote/zygote_main_linux.cc:493: #if defined(USE_NSS) On 2012/10/20 00:45:32, ddorwin wrote: > If ...

8 years, 2 months ago (2012-10-20 01:14:08 UTC) #6

Ryan Sleevi

On 2012/10/20 01:14:08, Julien Tinnes wrote: > https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_linux.cc > File content/zygote/zygote_main_linux.cc (right): > > https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_linux.cc#newcode493 ...

8 years, 2 months ago (2012-10-20 01:18:38 UTC) #7

Ryan Sleevi

As noted in http://src.chromium.org/viewvc/chrome?view=rev&revision=78633 , the original intent was that NSS would only be initialized ...

8 years, 2 months ago (2012-10-20 01:20:19 UTC) #8

jln (very slow on Chromium)

On 2012/10/20 01:06:30, Ryan Sleevi wrote: > > I'm trying to understand the impact of ...

8 years, 2 months ago (2012-10-20 01:40:33 UTC) #9

On 2012/10/20 01:06:30, Ryan Sleevi wrote:

> > I'm trying to understand the impact of EnsureNSSInit(). Am I correct that if
> > anything calls, say, crypto::HMAC, this will have the same impact that
> > EnsureNSSInit() ?
> 
> Yes.
> 
> However, it's presently *only* called by the Chromoting renderer process
(rarely
> used relative to other processes), the CDM out of process (not at all used
yet?
> but still, presumably very rarely relative to other processes), and the NACL
> validator (as well, rarely used relative to other procseses)
> 
> I *don't* like that this adds overhead to every renderer process startup,
unless
> I'm misunderstanding. What I currently understand is that this code is run
every
> time the zygote spawns into another sub-process (meaning once per renderer).
> Even if it was once-per-zygote, I'd be concerned about the extra overhead of
> loading the various NSS structures, error tables, etc into memory, when the
vast
> majority of the time, they will not be used.

Yes, you're misunderstanding that part: ZygoteMain() is called only once per
instance of Chrome.

Your point about the memory overhead is understood, but I would argue that it
may not be a problem since at least these structures will be shared between all
processes (at fork()).

Moreover, we currently DO call EnsureNSSInit() when creating every renderer it
seems, as David pointed out.

> > 
> > I would think calling it once and for all from the Zygote is a good thing.
All
> > processes forking from the Zygote (renderers, PPAPI), wouldn't have to go
> > through it again.
> 
> If I'm understanding this, this means it's pre-fork init. I'm not sure how
well
> NSS will handle that (today, we're doing post-fork init AIUI).

This is indeed pre-fork init.

> I'm admit, my ignorance here is no doubt impeding the review here, I'm still
> unclear as to how crbug.com/156864 relates to the seccomp-bpf sandbox. The
> backtrace that fischman@ included pointed not at seccomp-bpf, but at urandom
not
> being primed for the process type.

This is the result of a gdb attach. Maybe it's inaccurate ? fischman@ ?

There *definitely* is a sandbox-bpf violation happening.

> I'd rather split the discussion / change into two: IF it's just that the CDM
> needs urandom, then the CDM/OOP pepper plugin should set up that hole, and
> possibly in the zygote. This is a known and easy to reason about change, and
> that's why I don't see any hangups.

I don't know about the urandom issue (nor do I understand why it would happen,
any process forked from the Zygote should have access to this).

There is an additional problem with sandbox-bpf though.

What you say about them being related is interesting.

Note that the urandom "hole" is very fragile, it relies on hooking open() via
the dynamic linker. Maybe that fails for PPAPI plugins?

> Then we can begin a separate discussion about how to close down the
seccomp-bpf
> policies, including possible changes to NSS to make it friendlier to
sandboxing
> under these policies. For example, the only reason it's calling sysinfo is
> because it's unable to open the rand FD (which on CrOS we've modified to
> intentionally crash, to catch situations where people forget to prime
urandom).
> So if we solve the random issue, perhaps we solve your issue with sysinfo /
> uname.

I'm happy to consider the seccomp-bpf holes a separate issue and to treat them
as such.

But so far it seems to me that the correct fix for this bug is this CL, unless
of course everything only happens because urandom is not available for some
reason as you suggest it might.

But then again, if initializing NSS once and for all from the Zygote fixes it,
and if it is an improvement over doing it from every renderer independantly,
shouldn't we do that?

wtc

Review comments on patch set 2: I also have concerns about this CL because I ...

8 years, 2 months ago (2012-10-23 18:52:28 UTC) #10

jln (very slow on Chromium)

https://chromiumcodereview.appspot.com/11231021/diff/10001/content/zygote/zygote_main_linux.cc File content/zygote/zygote_main_linux.cc (right): https://chromiumcodereview.appspot.com/11231021/diff/10001/content/zygote/zygote_main_linux.cc#newcode494 content/zygote/zygote_main_linux.cc:494: // Do some extra NSS initialization. We don't want ...

8 years, 2 months ago (2012-10-23 19:06:50 UTC) #11

Ryan Sleevi

On 2012/10/23 19:06:50, Julien Tinnes wrote: > I think that is the most important part. ...

8 years, 2 months ago (2012-10-23 19:12:28 UTC) #12

jln (very slow on Chromium)

On 2012/10/23 19:12:28, Ryan Sleevi wrote: > On 2012/10/23 19:06:50, Julien Tinnes wrote: > > ...

8 years, 2 months ago (2012-10-23 23:35:49 UTC) #13

Ryan Sleevi

As mentioned on the bug, I think we want to change this to be after ...

8 years, 1 month ago (2012-11-15 01:34:14 UTC) #14

Ryan Sleevi

8 years, 1 month ago (2012-11-15 01:57:32 UTC) #15

Note - from studying
http://mxr.mozilla.org/security/source/nsprpub/pr/src/pthreads/ptthread.c , NSS
is fully capable of handling systems where it lacks permissions to schedule
threads.

If the seccomp-bpf sandbox wasn't exploding on the policy violation, and instead
just blocked it (logging silently?), initializing NSS post-sandbox should be
fine.

Expand Messages | Collapse Messages