| Index: vpython/README.md
|
| diff --git a/vpython/README.md b/vpython/README.md
|
| new file mode 100644
|
| index 0000000000000000000000000000000000000000..29ddb1ae50ec5d1a15f7c1638ab8293966a01fa4
|
| --- /dev/null
|
| +++ b/vpython/README.md
|
| @@ -0,0 +1,235 @@
|
| +## vpython - simple and easy VirtualEnv Python
|
| +
|
| +`vpython` is a tool, written in Go, which enables the simple and easy invocation
|
| +of Python code in [VirtualEnv](https://virtualenv.pypa.io/en/stable/)
|
| +environments.
|
| +
|
| +`vpython` is a simple Python bootstrap which (almost) transparently wraps a
|
| +Python interpreter invocation to run in a tailored VirtualEnv environment. The
|
| +environment is expressed by a script-specific configuration file. This allows
|
| +each Python script to trivially express its own package-level dependencies and
|
| +run in a hermetic world consisting of just those dependencies.
|
| +
|
| +When invoking such a script via `vpython`, the tool downloads its dependencies
|
| +and prepares an immutable VirtualEnv containing them. It then invokes the
|
| +script, now running in that VirtualEnv, through the preferred Python
|
| +interpreter.
|
| +
|
| +`vpython` does its best not to use hacky mechanisms to achieve this. It uses
|
| +an unmodified VirtualEnv package, standard setup methods, and local system
|
| +resources. The result is transparent, canonical VirtualEnv environment
|
| +bootstrapping. `vpython` is also safe for concurrent invocation, using safe
|
| +filesystem-level locking to perform any environment setup and management.
|
| +
|
| +`vpython` itself is very fast. The wheel downloads and VirtualEnvs may also be
|
| +cached and re-used, optimally limiting the runtime overhead of `vpython` to just
|
| +one initial setup per unique environment.
|
| +
|
| +### Setup and Invocation
|
| +
|
| +For the standard case, employing `vpython` is as simple as:
|
| +
|
| +First, create and upload Python wheels for all of the packages that you will
|
| +need. This is done in an implementation-specific way (e.g., upload wheels as
|
| +packages to CIPD).
|
| +
|
| +Once the packages are available:
|
| +
|
| +* Add `vpython` to `PATH`.
|
| +* Write an environment specification naming packages.
|
| +* Change tool invocation from `python` to `vpython`.
|
| +
|
| +Using `vpython` offers several benefits over direct Python invocation, especially
|
| +when vendoring packages. Notably, with `vpython`:
|
| +
|
| +* It trivially enables hermetic Python everywhere.
|
| +* No `sys.path` manipulation is needed to load vendored or imported packages.
|
| +* Any tool can define which package(s) it needs without requiring coordination
|
| + or cooperation from other tools. (Note that the package must be made available
|
| + for download first).
|
| +* Adding new Python dependencies to a project is non-invasive and immediate.
|
| +* Package downloading and deployment are baked into `vpython` and built on
|
| + fast and secure Google Cloud Platform technologies.
|
| +* No more custom bootstraps. Several projects and tools, including multiple
|
| + within the infra code base, have bootstrap scripts that vendor packages or
|
| + mimic a VirtualEnv. These are at best repetitive and, at worst, buggy and
|
| + insecure.
|
| +* Dependencies are explicitly stated, not assumed.
|
| +
|
| +### Why VirtualEnv?
|
| +
|
| +VirtualEnv offers several benefits over system Python. Primarily, it is the
|
| +
|
| +By using the same environment everywhere, Python invocations become
|
| +reproducible. A tool run on a developer's system will load the same versions
|
| +of the same libraries as it will on a production system. A production system
|
| +will no longer fail because it is missing a package, or because it has the
|
| +wrong version.
|
| +
|
| +A direct mechanism for vendoring, `sys.path` manipulation, is nuanced, buggy,
|
| +and unsupported by the Python community. It is difficult to get right on all
|
| +platforms in all environments for all packages. A notorious example of this is
|
| +`protobuf` and other domain-bound packages, which actively fight `sys.path`
|
| +inclusion. Using VirtualEnv means that any compliant Python package can
|
| +trivially be included into a project.
|
| +
|
| +### Why CIPD?
|
| +
|
| +[CIPD](https://github.com/luci/luci-go/tree/master/cipd) is a cross-platform
|
| +service and associated tooling and packages used to securely fetch and deploy
|
| +immutable "packages" (~= zip files) into the local file system. Unlike "package
|
| +managers" it avoids platform-specific assumptions, executable "hooks", or the
|
| +complexities of dependency resolution. `vpython` uses this as a mechanism for
|
| +housing and deploying wheels.
|
| +
|
| +infrastructure package deployment system. It is simple, accessible, fast, and
|
| +backed by resilient systems such as Google Storage and AppEngine.
|
| +
|
| +Unlike `pip`, a CIPD package is defined by its content, enabling precise package
|
| +matching instead of fuzzy version matching (e.g., `numpy >= 1.2`, and
|
| +`numpy == 1.2` both can match multiple `numpy` packages in `pip`).
|
| +
|
| +CIPD also supports ACLs, enabling privileged Python projects to easily vendor
|
| +sensitive packages.
|
| +
|
| +### Why wheels?
|
| +
|
| +A Python [wheel](https://www.python.org/dev/peps/pep-0427/) is a simple binary
|
| +distribution of Python code. A wheel can be generic (pure Python) or system-
|
| +and architecture-bound (e.g., 64-bit Mac OSX).
|
| +
|
| +Wheels are preferred over eggs because they come packaged with compiled binaries.
|
| +This makes their deployment simple (unpack via `pip`) and reduces system
|
| +requirements and variation, since local compilation is not needed.
|
| +
|
| +The increased management burden of maintaining separate wheels for the same
|
| +package, one for each architecture, is handled naturally by CIPD, removing the
|
| +only real pain point.
|
| +
|
| +## Wheel Guidance
|
| +
|
| +This section contains recommendations for building or uploading wheel CIPD
|
| +packages, including platform-specific guidance.
|
| +
|
| +CIPD wheel packages are CIPD packages that contain Python wheels. A given CIPD
|
| +package can contain multiple wheels for multiple platforms, but should only
|
| +contain one version of any given package for any given architecture/platform.
|
| +
|
| +For example, you can bundle a Windows, Linux, and Mac OSX version of `numpy` and
|
| +`coverage` in the same CIPD package, but you should not bundle `numpy==1.11` and
|
| +`numpy==1.12` in the same package.
|
| +
|
| +The reason for this is that `vpython` identifies which wheels to install by
|
| +scanning the contents of the CIPD package, and if multiple versions appear,
|
| +there is no clear guidance about which should be used.
|
| +
|
| +### Mac OSX
|
| +
|
| +Use the `m` ABI suffix and the `macosx_...` platform. `vpython` installs wheels
|
| +with the `--force` flag, so slight binary incompatibilities (e.g., specific OSX
|
| +versions) can be glossed over.
|
| +
|
| + coverage-4.3.4-cp27-cp27m-macosx_10_10_x86_64.whl
|
| +
|
| +### Linux
|
| +
|
| +Use wheels with the `mu` ABI suffix and the `manylinux1` platform. For example:
|
| +
|
| + coverage-4.3.4-cp27-cp27mu-manylinux1_x86_64.whl
|
| +
|
| +### Windows
|
| +
|
| +Use wheels with the `cp27m` or `none` ABI tag. For example:
|
| +
|
| + coverage-4.3.4-cp27-cp27m-win_amd64.whl
|
| +
|
| +
|
| +## Setup and Invocation
|
| +
|
| +`vpython` can be invoked by replacing `python` in the command-line with
|
| +`vpython`.
|
| +
|
| +`vpython` works with a default Python environment out of the box. To add
|
| +vendored packages, you need to define an environment specification file that
|
| +describes which wheels to install.
|
| +
|
| +An environment specification file is a text protobuf defined as `Spec`
|
| +[here](./api/env/spec.proto). An example is:
|
| +
|
| +```
|
| +# Any 2.7 interpreter will do.
|
| +python_version: "2.7"
|
| +
|
| +# Include "numpy" for the current architecture.
|
| +wheel {
|
| + path: "infra/python/wheels/numpy/${platform}-${arch}"
|
| + version: "version:1.11.0"
|
| +}
|
| +
|
| +# Include "coverage" for the current architecture.
|
| +wheel {
|
| + path: "infra/python/wheels/coverage/${platform}-${arch}"
|
| + version: "version:4.1"
|
| +}
|
| +```
|
| +
|
| +This specification can be supplied in one of four ways:
|
| +
|
| +* Explicitly, as a command-line option to `vpython` (`-spec`).
|
| +* Implicitly, as a file alongside your entry point. For example, if you are
|
| + running `test_runner.py`, `vpython` will look for `test_runner.py.vpython`
|
| + next to it and load the environment from there.
|
| +* Implicitly, inlined in your main file. `vpython` will scan the main entry point
|
| + for sentinel text and, if present, load the specification from that.
|
| +* Implicitly, through the `VPYTHON_VENV_SPEC_PATH` environment variable. This is
|
| + set by a `vpython` invocation so that chained invocations default to the same
|
| + environment.
|
| +
|
| +### Optimization and Caching
|
| +
|
| +`vpython` has several levels of caching that it employs to optimize setup and
|
| +invocation overhead.
|
| +
|
| +#### VirtualEnv
|
| +
|
| +Once a VirtualEnv specification has been resolved, its resulting pinned
|
| +specification is hashed and used as a key to that VirtualEnv. Other `vpython`
|
| +invocations expressing the same environment will naturally re-use that
|
| +VirtualEnv instead of creating their own.
|
| +
|
| +#### Download Caching
|
| +
|
| +Download mechanisms (e.g., CIPD) can optionally include a package cache to avoid
|
| +the overhead of downloading and/or resolving a package multiple times.
|
| +
|
| +### Migration
|
| +
|
| +#### Command-line
|
| +
|
| +`vpython` is a natural replacement for `python` in the command line:
|
| +
|
| +```sh
|
| +python ./foo/bar/baz.py -d --flag value arg arg whatever
|
| +```
|
| +
|
| +Becomes:
|
| +```sh
|
| +vpython ./foo/bar/baz.py -d --flag value arg arg whatever
|
| +```
|
| +
|
| +The `vpython` tool accepts its own command-line arguments. In this case, use
|
| +a `--` separator to differentiate between `vpython` options and `python` options:
|
| +
|
| +```sh
|
| +vpython -spec /path/to/spec.vpython -- ./foo/bar/baz.py
|
| +```
|
| +
|
| +#### Shebang (POSIX)
|
| +
|
| +If your script uses implicit specification (file or inline), replacing `python`
|
| +with `vpython` in your shebang line will automatically work.
|
| +
|
| +```sh
|
| +#!/usr/bin/env vpython
|
| +```
|
| +
|
|
|