Index: documentation/filesystem_access.txt |
diff --git a/documentation/filesystem_access.txt b/documentation/filesystem_access.txt |
new file mode 100644 |
index 0000000000000000000000000000000000000000..2fc702d277eb8f1c28383eb2eeaa99c69b1bde66 |
--- /dev/null |
+++ b/documentation/filesystem_access.txt |
@@ -0,0 +1,145 @@ |
+# Safe filesystem access |
+ |
+## Background |
+ |
+Native Client is a phenomenally useful library for restricted code execution, |
+and its usefulness extends beyond the browser. There are all sorts of cases |
+where one might want to run untrusted code in a system disconnected from |
+Chromium or the PPAPI. |
+ |
+In these cases, it is additionally useful to allow restricted filesystem |
+access while still running untrusted code. Prior to this change, `sel_ldr` did |
+not allow for safe and restricted filesystem access. |
+ |
+## Requirements |
+ |
+* Very low-overhead file I/O |
+* Give users the ability to have selective read-only or read/write access |
+* Give users control over the entire filesystem layout presented to the |
+ untrusted application |
+ |
+## Feature user experience |
+ |
+This feature adds one additional command-line argument, `-m`, to `sel_ldr`. |
+`-m` takes a path to the directory to be used as the root by the untrusted |
+application. |
+ |
+**IMPORTANT WARNING**: The user should take great pains to make sure the tree |
+under the `chroot` path contains no relative symlinks at all and no absolute |
+symlinks that point outside of the `chroot` path. This is easy to get wrong, |
+and it may be safer to refuse to start with a path that has any symlinks |
+inside it at all. |
+ |
+All of the given requirements can be satisfied by this `chroot`-style |
+interface: |
+ |
+ * The only overhead for file I/O is in adding (and sanitizing) a path prefix |
+ to absolute paths passed through to the host. For relative paths, an |
+ additional `Getcwd` call is necessary. |
+ * Read-only or read/write access can be controlled using normal filesystem |
+ permissions for the user running the `sel_ldr` process. |
+ * Using host-side filesystem primitives such as Linux bind mounts, users can |
+ map disparate paths from the host into the untrusted process's root. |
+ |
+## Implementation |
+ |
+For the most part, this feature is simple and relatively straightforward to |
+add. When `-m` is given (even if `-a` is not), file operations will be enabled |
+and the provided chroot path will be added as a path prefix to all absolute |
+paths passed through to the host. Paths will be sanitized to make sure no |
+escapes are made with `..` path elements. |
+ |
+Relative paths are slightly more complex. Instead of prepending the path |
+prefix, the host's CWD is checked to make sure it resides in the chroot path. |
+The CWD is then prepended to the relative path, and everything after the |
+chroot path in the resulting path is sanitized. |
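
A minimal Python sketch of the translation described above, for illustration
only; the actual change is in `sel_ldr`'s C code. The names `to_host_path` and
`lexical_clean` are hypothetical, the host CWD is passed in as an argument
rather than read with `getcwd`, and `PermissionError` is a stand-in for
whatever error the real syscall returns.

```python
def lexical_clean(path: str) -> str:
    """Stand-in for the sanitizer: drop '', '.', and resolve '..' lexically."""
    kept = []
    for element in path.split("/"):
        if element in ("", "."):
            continue
        if element == "..":
            if kept:
                kept.pop()  # assumption: '..' at the root stays at the root
            continue
        kept.append(element)
    return "/" + "/".join(kept)


def to_host_path(root: str, host_cwd: str, path: str) -> str:
    """Map an untrusted path to a host path under root (no trailing slash)."""
    if path.startswith("/"):
        # Absolute path: sanitize, then add the path prefix.
        return root + lexical_clean(path)
    # Relative path: the host CWD must already reside in the chroot path.
    if host_cwd != root and not host_cwd.startswith(root + "/"):
        raise PermissionError("CWD is outside the untrusted root")
    # Prepend the CWD, then sanitize only the part after the chroot path.
    inside = host_cwd[len(root):] + "/" + path
    return root + lexical_clean(inside)
```

With a root of `/jail` and a host CWD of `/jail/work`, the relative path
`../../../etc/passwd` maps to `/jail/etc/passwd` in this sketch.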
+ |
+Given that strategy, the following syscall changes were straightforward: |
+ |
+ * `NaClSysOpen` |
+ * `NaClSysStat` |
+ * `NaClSysMkdir` |
+ * `NaClSysRmdir` |
+ * `NaClSysUnlink` |
+ * `NaClSysTruncate` |
+ * `NaClSysLink` |
+ * `NaClSysRename` |
+ * `NaClSysChmod` |
+ * `NaClSysAccess` |
+ * `NaClSysUtimes` |
+ |
+### Path sanitization |
+ |
+Path sanitization happens lexically, in the sense that no disk I/O happens |
+while attempting to put the path in canonical form. Essentially, the path |
+elements are parsed, and double path separators, '.' path elements, and '..' |
+path elements are all handled and cleaned up prior to disk access. |
+ |
+While this approach (assuming the algorithm is correct) is a secure way to |
+eliminate parent folder references, it does result in a slight change of POSIX |
+semantics. "/a/.." does not always refer to the same inode as "/" (for |
+example, when "/a" is a symlink), even if in practice it almost always does. |
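
The lexical cleanup can be sketched in a few lines of Python. This is an
illustration under stated assumptions, not the `sel_ldr` implementation: the
name `sanitize` is hypothetical, and a `..` element at the root is assumed to
stay at the root rather than cause a failure.

```python
def sanitize(path: str) -> str:
    """Canonicalize an absolute path lexically, without any disk I/O."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    elements = []
    for element in path.split("/"):
        if element == "" or element == ".":
            # Empty elements come from doubled separators; '.' is a no-op.
            continue
        if element == "..":
            # Drop the previous element; at the root there is nothing to drop.
            if elements:
                elements.pop()
            continue
        elements.append(element)
    return "/" + "/".join(elements)
```

For instance, `sanitize("/a//b/./c/../d")` returns `"/a/b/d"`, and
`sanitize("/a/..")` returns `"/"` whether or not `/a` is a symlink on disk,
which is exactly the change in semantics noted above.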
+ |
+### Symlinks |
+ |
+Ideally, symlinks should behave as if we had just chrooted into the path given |
+to `-m` (which means they would resolve relative to the untrusted root). |
+ |
+However, it's not straightforward to have `sel_ldr` call `chroot` itself to |
+make the kernel do appropriate symlink resolution in a cross-platform way. So |
+our other option to support symlinks is to do an Lstat on every path lookup |
+and try to resolve symlinks ourselves. This would be a pretty big performance |
+hit and would add a significant amount of complexity that needs auditing. We |
+wouldn't be able to just pass sanitized paths through to the host's open call |
+without first making sure the untrusted app hadn't created a symlink pointing |
+elsewhere to jump out of jail. |
+ |
+So instead, for v1 we're just not supporting symlinks. `Readlink` and `Lstat` |
+don't allow the untrusted process access to any information the chroot |
+filesystem creator didn't give it access to, so those are allowed when `-m` is |
+given, but the symlink creation call will fail. |
+ |
+Further, the tree under the path passed to `-m` might have symlinks in it, so |
+it is up to the creator of that filesystem to sanitize it for bad (or even |
+relative!) symlinks before `sel_ldr` can be run safely. This is a sharp edge |
+and a potential source of insecurity, so it might be worth just failing to |
+start if a symlink exists anywhere under the path given to `-m`. For now, we |
+are going to allow the sharpness of this interface. |
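
As a rough illustration of the check a filesystem creator might run before
passing a tree to `-m`, the following Python sketch lists every symlink under
a directory. The helper name `find_symlinks` is hypothetical and this is not
part of `sel_ldr`. It also does not check whether the root path itself, or
one of its parents, is a symlink.

```python
import os


def find_symlinks(root: str) -> list[str]:
    """Return the sorted paths of all symlinks found under root."""
    found = []
    # followlinks=False: symlinked directories are reported but not entered.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                found.append(full)
    return sorted(found)
```

An empty result means no symlinks were found at the time of the scan. Links
created afterwards by other host processes would not be caught.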
+ |
+The following syscalls relate to symlinks. |
+ |
+ * `NaClSysLstat` - pass through (with appropriate path changes) |
+ * `NaClSysSymlink` - always fails |
+ * `NaClSysReadlink` - pass through (with appropriate path changes) |
+ |
+### Current working directory |
+ |
+Reading the current working directory is (aside from `readlink`, previously |
+discussed) the only syscall in which a path goes from the host back to the |
+untrusted application. All other syscalls are one way - paths go from the |
+untrusted application out to the host. |
+ |
+The current working directory, unless we call the host's `chdir` at startup to |
+the path given to `-m`, might not be inside of the untrusted application's |
+filesystem. |
+ |
+This affects the following two syscalls: |
+ |
+ * `NaClSysChdir` - always adds the path prefix |
+ * `NaClSysGetcwd` - checks if the path is sanitized and starts with the path |
+ prefix. If it doesn't start with the path prefix, this syscall fails. |
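
The `Getcwd` check can be sketched as follows. This is a hedged illustration
only: the name `untrusted_cwd` is hypothetical, `PermissionError` stands in
for whatever error the syscall actually returns, and stripping the prefix
before handing the path back to the untrusted application is an assumption
that the text above does not spell out.

```python
def untrusted_cwd(root: str, host_cwd: str) -> str:
    """Map the host CWD to the path the untrusted application should see."""
    if host_cwd == root:
        return "/"
    # Compare against root + '/' so that '/jailbreak' does not match '/jail'.
    if host_cwd.startswith(root + "/"):
        return host_cwd[len(root):]
    raise PermissionError("CWD is outside the untrusted root")
```

With a root of `/jail`, a host CWD of `/jail/a/b` is reported as `/a/b`, and a
host CWD of `/home/user` makes the call fail.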
+ |
+### Filehandle syscalls |
+ |
+These syscalls require no changes and are safe to enable when `-m` is passed, |
+just as they would be with `-a`: |
+ |
+ * `NaClSysGetdents` - `..` might reference an inode outside of the untrusted |
+ root, but that's okay because the untrusted application can't do anything |
+ with it. |
+ * `NaClSysFstat` - no different than `stat`, which is already allowed. |
+ |
+For both of these calls, we might need to make sure there are no file |
+descriptors in the untrusted application that reference files outside of the |
+untrusted root, but even if we don't, there's very little the untrusted |
+application could do with the data these calls return. |