|
|
Created:
8 years, 2 months ago by jln (very slow on Chromium) Modified:
8 years, 1 month ago CC:
chromium-reviews, joi+watch-content_chromium.org, darin-cc_chromium.org, jam, Jorge Lucangeli Obes, Ami GONE FROM CHROMIUM, Tom Finegan Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionLinux: Initialize NSS fully in the Zygote.
We fully initialize NSS in the Zygote. We do so after the setuid sandbox
initialization, in order to assert that initializing NSS won't do something
we don't like "behind our back".
This allows to initialize NSS only once for every renderer. This also
guarantees that we get initialized before the seccomp-bpf sandbox starts
which in turn will allow us to tighten it in the future.
BUG=156864
Patch Set 1 #
Total comments: 10
Patch Set 2 : #
Total comments: 6
Patch Set 3 : Update comment. #Patch Set 4 : rebase #Messages
Total messages: 15 (0 generated)
I appreciate the quick turn around on the change, but I'm still not sure I understand the issue(s), which is why I'm still somewhat opposed to this. I'm all for being able to tighten the sandbox, but to my knowledge, the only hole that has had to be punched for NSS is 1) Ensure the .sos can be loaded. Doesn't call any functions in them, just ensures they're loaded 2) Ensure NSS has access to /dev/urandom - which arguably is of benefit to any zygote'd process Your replies both on the bug and comments here suggest there are some greater holes that have been punched, and I'd like to understand those. NSS doesn't (IIRC) cache the urandom FD, and does periodically try to access it (admittedly, roughly around after 2^48 bytes of data have been read, although it can be induced to do so sooner), so I'm not sure this will in any way prevent #2 from being necessary. https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... File content/zygote/zygote_main_linux.cc (right): https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... content/zygote/zygote_main_linux.cc:502: crypto::DisableNSSForkCheck(); Is modifying the |env| like this permitted at this time? For some reason I was thinking this needed to be done before the first layer. https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... content/zygote/zygote_main_linux.cc:504: // in turn attempts to open a file. drop the comment on 503-504. It doesn't matter what function - ANY attempt to open NSS will attempt to open /dev/urandom - which I'd argue the sandbox should allow (hence the libc overrides) Or are you seeing a different file? https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... content/zygote/zygote_main_linux.cc:506: // leak descriptors to private files. What files are being opened? What descriptors are you worried about? I'm not sure I can make sense of this comment? https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... content/zygote/zygote_main_linux.cc:509: crypto::EnsureNSSInit(); s/benefit/suffer/ :) It's not clear to me what holes that are punched can be closed with this, thus it's not clear to me the benefit this brings, compared to the overhead.
This works for me in both chrome and content_browsertests. I can't speak to the correctness, though. https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... File content/zygote/zygote_main_linux.cc (right): https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... content/zygote/zygote_main_linux.cc:493: #if defined(USE_NSS) If we take this patch, should we remove http://code.google.com/searchframe#OAMlx_jo-ck/src/chrome/renderer/chrome_ren... ? Should the #if check match that one?
Hopefully that's more clear. https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... File content/zygote/zygote_main_linux.cc (right): https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... content/zygote/zygote_main_linux.cc:502: crypto::DisableNSSForkCheck(); On 2012/10/19 23:50:10, Ryan Sleevi wrote: > Is modifying the |env| like this permitted at this time? > > For some reason I was thinking this needed to be done before the first layer. I didn't now that the env was affected, but I don't see any reason why that would be a problem, no. https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... content/zygote/zygote_main_linux.cc:504: // in turn attempts to open a file. On 2012/10/19 23:50:10, Ryan Sleevi wrote: > drop the comment on 503-504. It doesn't matter what function - ANY attempt to > open NSS will attempt to open /dev/urandom - which I'd argue the sandbox should > allow (hence the libc overrides) > > Or are you seeing a different file? Done. https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... content/zygote/zygote_main_linux.cc:506: // leak descriptors to private files. On 2012/10/19 23:50:10, Ryan Sleevi wrote: > What files are being opened? What descriptors are you worried about? I'm not > sure I can make sense of this comment? I modified it to make it more generic. What we don't want, is as you explained, a module to load private data to memory or to open file descriptors and keep them open. https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... content/zygote/zygote_main_linux.cc:509: crypto::EnsureNSSInit(); On 2012/10/19 23:50:10, Ryan Sleevi wrote: > s/benefit/suffer/ :) > > It's not clear to me what holes that are punched can be closed with this, thus > it's not clear to me the benefit this brings, compared to the overhead. When this function is called, NSS will make some system calls that we want to restrict (they bring attack surface). For instance: advanced scheduling system calls and sysinfo / uname. This is why crbug.com/156864 exists: the seccomp-bpf sandbox (that restricts the kernel's API) was started before NSS. In renderers, this also happens, except, unlike for PPAPI, in renderers we allowed sysinfo / uname and some of the advanced scheduling operations (this is the "hole" I'm talking about). I'm now thinking of closing it. I'm trying to understand the impact of EnsureNSSInit(). Am I correct that if anything calls, say, crypto::HMAC, this will have the same impact that EnsureNSSInit() ? I would think calling it once and for all from the Zygote is a good thing. All processes forking from the Zygote (renderers, PPAPI), wouldn't have to go through it again.
On 2012/10/20 00:51:29, Julien Tinnes wrote: > Hopefully that's more clear. > > https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... > File content/zygote/zygote_main_linux.cc (right): > > https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... > content/zygote/zygote_main_linux.cc:502: crypto::DisableNSSForkCheck(); > On 2012/10/19 23:50:10, Ryan Sleevi wrote: > > Is modifying the |env| like this permitted at this time? > > > > For some reason I was thinking this needed to be done before the first layer. > > I didn't now that the env was affected, but I don't see any reason why that > would be a problem, no. > > https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... > content/zygote/zygote_main_linux.cc:504: // in turn attempts to open a file. > On 2012/10/19 23:50:10, Ryan Sleevi wrote: > > drop the comment on 503-504. It doesn't matter what function - ANY attempt to > > open NSS will attempt to open /dev/urandom - which I'd argue the sandbox > should > > allow (hence the libc overrides) > > > > Or are you seeing a different file? > > Done. > > https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... > content/zygote/zygote_main_linux.cc:506: // leak descriptors to private files. > On 2012/10/19 23:50:10, Ryan Sleevi wrote: > > What files are being opened? What descriptors are you worried about? I'm not > > sure I can make sense of this comment? > > I modified it to make it more generic. What we don't want, is as you explained, > a module to load private data to memory or to open file descriptors and keep > them open. > > https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... > content/zygote/zygote_main_linux.cc:509: crypto::EnsureNSSInit(); > On 2012/10/19 23:50:10, Ryan Sleevi wrote: > > s/benefit/suffer/ :) > > > > It's not clear to me what holes that are punched can be closed with this, thus > > it's not clear to me the benefit this brings, compared to the overhead. > > When this function is called, NSS will make some system calls that we want to > restrict (they bring attack surface). > > For instance: advanced scheduling system calls and sysinfo / uname. > > This is why crbug.com/156864 exists: the seccomp-bpf sandbox (that restricts the > kernel's API) was started before NSS. > > In renderers, this also happens, except, unlike for PPAPI, in renderers we > allowed sysinfo / uname and some of the advanced scheduling operations (this is > the "hole" I'm talking about). I'm now thinking of closing it. > > I'm trying to understand the impact of EnsureNSSInit(). Am I correct that if > anything calls, say, crypto::HMAC, this will have the same impact that > EnsureNSSInit() ? Yes. However, it's presently *only* called by the Chromoting renderer process (rarely used relative to other processes), the CDM out of process (not at all used yet? but still, presumably very rarely relative to other processes), and the NACL validator (as well, rarely used relative to other procseses) I *don't* like that this adds overhead to every renderer process startup, unless I'm misunderstanding. What I currently understand is that this code is run every time the zygote spawns into another sub-process (meaning once per renderer). Even if it was once-per-zygote, I'd be concerned about the extra overhead of loading the various NSS structures, error tables, etc into memory, when the vast majority of the time, they will not be used. > > I would think calling it once and for all from the Zygote is a good thing. All > processes forking from the Zygote (renderers, PPAPI), wouldn't have to go > through it again. If I'm understanding this, this means it's pre-fork init. I'm not sure how well NSS will handle that (today, we're doing post-fork init AIUI). I'm admit, my ignorance here is no doubt impeding the review here, I'm still unclear as to how crbug.com/156864 relates to the seccomp-bpf sandbox. The backtrace that fischman@ included pointed not at seccomp-bpf, but at urandom not being primed for the process type. I'd rather split the discussion / change into two: IF it's just that the CDM needs urandom, then the CDM/OOP pepper plugin should set up that hole, and possibly in the zygote. This is a known and easy to reason about change, and that's why I don't see any hangups. Then we can begin a separate discussion about how to close down the seccomp-bpf policies, including possible changes to NSS to make it friendlier to sandboxing under these policies. For example, the only reason it's calling sysinfo is because it's unable to open the rand FD (which on CrOS we've modified to intentionally crash, to catch situations where people forget to prime urandom). So if we solve the random issue, perhaps we solve your issue with sysinfo / uname.
https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... File content/zygote/zygote_main_linux.cc (right): https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... content/zygote/zygote_main_linux.cc:493: #if defined(USE_NSS) On 2012/10/20 00:45:32, ddorwin wrote: > If we take this patch, should we remove > http://code.google.com/searchframe#OAMlx_jo-ck/src/chrome/renderer/chrome_ren... > ? Should the #if check match that one? This comment is misleading on Linux. On Linux the setuid sandbox is already engaged at that point since this is called from a process within the Zygote. See content/app/content_main_runner.cc, in RunZygote(), it calls ContentClientInitializer::Set(), which will in turn call CreateContentRendererClient which will create the observer. You're probably correct that I need to remove this on Linux, even though it would most likely exit quickly. Ryan: this seems to indicate that we're already paying the cost for this in every renderer anyway, isn't it?
On 2012/10/20 01:14:08, Julien Tinnes wrote: > https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... > File content/zygote/zygote_main_linux.cc (right): > > https://codereview.chromium.org/11231021/diff/1/content/zygote/zygote_main_li... > content/zygote/zygote_main_linux.cc:493: #if defined(USE_NSS) > On 2012/10/20 00:45:32, ddorwin wrote: > > If we take this patch, should we remove > > > http://code.google.com/searchframe#OAMlx_jo-ck/src/chrome/renderer/chrome_ren... > > ? Should the #if check match that one? > > This comment is misleading on Linux. On Linux the setuid sandbox is already > engaged at that point since this is called from a process within the Zygote. > > See content/app/content_main_runner.cc, in RunZygote(), it calls > ContentClientInitializer::Set(), which will in turn call > CreateContentRendererClient which will create the observer. > > You're probably correct that I need to remove this on Linux, even though it > would most likely exit quickly. Ryan: this seems to indicate that we're already > paying the cost for this in every renderer anyway, isn't it? It looks so, but that looks to be a bug, not a feature. There used to be a switch that would only do that when instantiating chromoting.
As noted in http://src.chromium.org/viewvc/chrome?view=rev&revision=78633 , the original intent was that NSS would only be initialized *after* the sandbox was closed, and the only thing pre-sandbox that was needed was loading the NSS libraries (and punching the hole for urandom)
On 2012/10/20 01:06:30, Ryan Sleevi wrote: > > I'm trying to understand the impact of EnsureNSSInit(). Am I correct that if > > anything calls, say, crypto::HMAC, this will have the same impact that > > EnsureNSSInit() ? > > Yes. > > However, it's presently *only* called by the Chromoting renderer process (rarely > used relative to other processes), the CDM out of process (not at all used yet? > but still, presumably very rarely relative to other processes), and the NACL > validator (as well, rarely used relative to other procseses) > > I *don't* like that this adds overhead to every renderer process startup, unless > I'm misunderstanding. What I currently understand is that this code is run every > time the zygote spawns into another sub-process (meaning once per renderer). > Even if it was once-per-zygote, I'd be concerned about the extra overhead of > loading the various NSS structures, error tables, etc into memory, when the vast > majority of the time, they will not be used. Yes, you're misunderstanding that part: ZygoteMain() is called only once per instance of Chrome. Your point about the memory overhead is understood, but I would argue that it may not be a problem since at least these structures will be shared between all processes (at fork()). Moreover, we currently DO call EnsureNSSInit() when creating every renderer it seems, as David pointed out. > > > > I would think calling it once and for all from the Zygote is a good thing. All > > processes forking from the Zygote (renderers, PPAPI), wouldn't have to go > > through it again. > > If I'm understanding this, this means it's pre-fork init. I'm not sure how well > NSS will handle that (today, we're doing post-fork init AIUI). This is indeed pre-fork init. > I'm admit, my ignorance here is no doubt impeding the review here, I'm still > unclear as to how crbug.com/156864 relates to the seccomp-bpf sandbox. The > backtrace that fischman@ included pointed not at seccomp-bpf, but at urandom not > being primed for the process type. This is the result of a gdb attach. Maybe it's inaccurate ? fischman@ ? There *definitely* is a sandbox-bpf violation happening. > I'd rather split the discussion / change into two: IF it's just that the CDM > needs urandom, then the CDM/OOP pepper plugin should set up that hole, and > possibly in the zygote. This is a known and easy to reason about change, and > that's why I don't see any hangups. I don't know about the urandom issue (nor do I understand why it would happen, any process forked from the Zygote should have access to this). There is an additional problem with sandbox-bpf though. What you say about them being related is interesting. Note that the urandom "hole" is very fragile, it relies on hooking open() via the dynamic linker. Maybe that fails for PPAPI plugins? > Then we can begin a separate discussion about how to close down the seccomp-bpf > policies, including possible changes to NSS to make it friendlier to sandboxing > under these policies. For example, the only reason it's calling sysinfo is > because it's unable to open the rand FD (which on CrOS we've modified to > intentionally crash, to catch situations where people forget to prime urandom). > So if we solve the random issue, perhaps we solve your issue with sysinfo / > uname. I'm happy to consider the seccomp-bpf holes a separate issue and to treat them as such. But so far it seems to me that the correct fix for this bug is this CL, unless of course everything only happens because urandom is not available for some reason as you suggest it might. But then again, if initializing NSS once and for all from the Zygote fixes it, and if it is an improvement over doing it from every renderer independantly, shouldn't we do that?
Review comments on patch set 2: I also have concerns about this CL because I don't fully understand how NSS works across a fork() call. I summarized what I know about this subject below. http://codereview.chromium.org/11231021/diff/10001/content/zygote/zygote_main... File content/zygote/zygote_main_linux.cc (right): http://codereview.chromium.org/11231021/diff/10001/content/zygote/zygote_main... content/zygote/zygote_main_linux.cc:494: // Do some extra NSS initialization. We don't want to do this pre-sandbox Nit: change "Do some extra NSS initialization" to "Initialize NSS". The word "extra" seems to imply an incremental initialization, but crypto::EnsureNSSInit() fully initializes NSS. http://codereview.chromium.org/11231021/diff/10001/content/zygote/zygote_main... content/zygote/zygote_main_linux.cc:499: // of sandbox, e.g. seccomp-bpf. In general a multithreaded program on Unix should avoid forking without doing exec*(). I guess zygote calls fork() without calling exec*()? http://codereview.chromium.org/11231021/diff/10001/content/zygote/zygote_main... content/zygote/zygote_main_linux.cc:502: crypto::DisableNSSForkCheck(); I am not familiar with what it takes to make NSS work correctly across a fork() call. I remember it is supported, with two modes (the "PKCS #11 compliant" mode which does the fork check and the original mode). I would avoid initializing NSS before forking. Here is a NSS bug query for all bugs that mention "fork" in the summary: https://bugzilla.mozilla.org/buglist.cgi?list_id=4755732;short_desc_type=allw...
https://chromiumcodereview.appspot.com/11231021/diff/10001/content/zygote/zyg... File content/zygote/zygote_main_linux.cc (right): https://chromiumcodereview.appspot.com/11231021/diff/10001/content/zygote/zyg... content/zygote/zygote_main_linux.cc:494: // Do some extra NSS initialization. We don't want to do this pre-sandbox On 2012/10/23 18:52:28, wtc wrote: > > Nit: change "Do some extra NSS initialization" to > "Initialize NSS". The word "extra" seems to imply an > incremental initialization, but crypto::EnsureNSSInit() > fully initializes NSS. Done. https://chromiumcodereview.appspot.com/11231021/diff/10001/content/zygote/zyg... content/zygote/zygote_main_linux.cc:499: // of sandbox, e.g. seccomp-bpf. On 2012/10/23 18:52:28, wtc wrote: > > In general a multithreaded program on Unix should avoid > forking without doing exec*(). I guess zygote calls fork() > without calling exec*()? Yes, but we don't have threads at this point. If we had thread, it would be a major issue. https://chromiumcodereview.appspot.com/11231021/diff/10001/content/zygote/zyg... content/zygote/zygote_main_linux.cc:502: crypto::DisableNSSForkCheck(); On 2012/10/23 18:52:28, wtc wrote: > > I am not familiar with what it takes to make NSS work correctly > across a fork() call. I remember it is supported, with two > modes (the "PKCS #11 compliant" mode which does the fork > check and the original mode). I would avoid initializing NSS > before forking. > > Here is a NSS bug query for all bugs that mention "fork" in > the summary: > https://bugzilla.mozilla.org/buglist.cgi?list_id=4755732;short_desc_type=allw... I think that is the most important part. If you and Ryan can't be comfortable with this, then unfortunately we should switch to a model where we initialize NSS in the renderer / PPAPI process, but before we enable the sandbox.
On 2012/10/23 19:06:50, Julien Tinnes wrote: > I think that is the most important part. If you and Ryan can't be comfortable > with this, then unfortunately we should switch to a model where we initialize > NSS in the renderer / PPAPI process, but before we enable the sandbox. I think more importantly would be not requiring any initialization of NSS pre-sandbox (other than loading the .sos, which I'm not as concerned about because those are shared), which was how it was supposed to be working :) If we can figure out what in NSS Init is requiring the sandbox to be down, and fix that, then it should just be a matter of lazy initialization of NSS - post sandboxing - for any process that cares. Which is how I'd like it to work, and how I thought it was. Can we investigate what syscalls are being made that you'd like to remove, along with a backtrace?
On 2012/10/23 19:12:28, Ryan Sleevi wrote: > On 2012/10/23 19:06:50, Julien Tinnes wrote: > > I think that is the most important part. If you and Ryan can't be comfortable > > with this, then unfortunately we should switch to a model where we initialize > > NSS in the renderer / PPAPI process, but before we enable the sandbox. > > I think more importantly would be not requiring any initialization of NSS > pre-sandbox (other than loading the .sos, which I'm not as concerned about > because those are shared), which was how it was supposed to be working :) > > If we can figure out what in NSS Init is requiring the sandbox to be down, and > fix that, then it should just be a matter of lazy initialization of NSS - post > sandboxing - for any process that cares. Which is how I'd like it to work, and > how I thought it was. > > Can we investigate what syscalls are being made that you'd like to remove, along > with a backtrace? I'll reply on the bug.
As mentioned on the bug, I think we want to change this to be after forking, after the suid sandbox, and before the seccomp-bpf sandbox. Julien, is that possible?
Note - from studying http://mxr.mozilla.org/security/source/nsprpub/pr/src/pthreads/ptthread.c , NSS is fully capable of handling systems where it lacks permissions to schedule threads. If the seccomp-bpf sandbox wasn't exploding on the policy violation, and instead just blocked it (logging silently?), initializing NSS post-sandbox should be fine. |