`BinaryPaths`: Find already installed binaries
For certain tools that are hard to automatically install—such as Docker or language interpreters—you may want to assume that the user already has the tool installed on their machine.
The simplest approach is to assume that the binary is installed at a fixed absolute path, such as `/bin/echo` or `/usr/bin/perl`. In the `argv` for your `Process`, use this absolute path as the first element.
If you instead want to allow the binary to be located anywhere on a user's machine, you can use `BinaryPaths` to search certain directories, such as those in a user's `$PATH`, to find the absolute path to the binary.
`BinaryPaths` has a field called `paths: tuple[BinaryPath, ...]`, which stores all the discovered absolute paths to the specified binary. Each `BinaryPath` object has the fields `path: str`, such as `/usr/bin/docker`, and `fingerprint: str`, which is used to invalidate the cache if the binary changes. The results follow the order of `search_path`, meaning that earlier entries in `search_path` show up earlier in the result.
`BinaryPaths` also has a convenience property called `first_path: BinaryPath | None`, which returns the first matching path, if any.
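To make the ordering concrete, here is a minimal plain-Python sketch of searching directories in order. It does not use the Pants API: `FoundBinary`, `find_binary`, and `first_path` are hypothetical names, and hashing the file contents with SHA-256 is only an assumed way to produce a fingerprint.

```python
import hashlib
import os
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class FoundBinary:
    """Hypothetical stand-in for a discovered binary (not the Pants `BinaryPath` class)."""

    path: str
    fingerprint: str


def find_binary(binary_name: str, search_path: Sequence[str]) -> Tuple[FoundBinary, ...]:
    """Return every executable named `binary_name`, in the same order as `search_path`."""
    found = []
    for directory in search_path:
        candidate = os.path.join(directory, binary_name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            # Assumption: fingerprint the file contents so a changed binary is noticed.
            with open(candidate, "rb") as f:
                fingerprint = hashlib.sha256(f.read()).hexdigest()
            found.append(FoundBinary(os.path.abspath(candidate), fingerprint))
    return tuple(found)


def first_path(paths: Sequence[FoundBinary]) -> Optional[FoundBinary]:
    """Return the first match, or None when nothing was found."""
    return paths[0] if paths else None
```

Because the loop walks `search_path` front to back, a binary found in an earlier directory always appears before one found in a later directory.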
Rather than hardcoding the `search_path`, you may want to create a [subsystem](🔗) to allow users to override the search path through a dedicated option. See [pex_environment.py](🔗) for an example that allows the user to use the special string `<PATH>` to read the user's `$PATH` environment variable.
Checking for valid binaries (recommended)
When setting up a `BinaryPathsRequest`, you can optionally pass the argument `test: BinaryPathTest`. When discovering a binary, Pants will run your test and only use the binary if the return code is 0. Pants will also fingerprint the output and invalidate the cache if the output changes, for example because the user upgraded the tool.
Why do this? This is helpful to ensure that all discovered binaries are valid and safe. This is also important for Pants to be able to detect when the user has changed the binary, such as upgrading its version.
`BinaryPathTest` takes the argument `args: Iterable[str]`, which are the arguments that Pants should pass to your binary to check that it is a valid program.
You can optionally pass `fingerprint_stdout=False` to the `BinaryPathTest` constructor, but usually you should keep the default.
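As a rough illustration of what such a test does, the following plain-Python sketch runs a candidate binary with some test arguments, rejects it on a non-zero return code, and fingerprints its stdout. The helper `validate_binary` is hypothetical and not part of Pants, and the `--version` argument in the usage line is only an assumption that suits the Python interpreter used here as the stand-in binary.

```python
import hashlib
import subprocess
import sys
from typing import Optional, Sequence


def validate_binary(path: str, test_args: Sequence[str]) -> Optional[str]:
    """Run `path` with `test_args`; return a fingerprint of stdout, or None if invalid."""
    try:
        result = subprocess.run([path, *test_args], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # The file is missing, not executable, or did not finish in time.
        return None
    if result.returncode != 0:
        return None
    # A different fingerprint on a later run suggests that the binary changed.
    return hashlib.sha256(result.stdout).hexdigest()


# Usage: the current Python interpreter stands in for a discovered binary.
fingerprint = validate_binary(sys.executable, ["--version"])
```

Comparing the fingerprint from one run with the fingerprint from a later run is what allows a cache to be invalidated after an upgrade.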
`ExternalTool`: Install pre-compiled binaries
If your tool has a pre-compiled binary available online, Pants can download and use that binary automatically for users. This is often a better user experience than requiring the users to pre-install the tool. This will also make your build more deterministic because everyone will be using the same binary.
First, manually download the file. Typically, the downloaded file will be an archive like a `.zip` or `.tar.xz` file, but it may also be the actual binary. Then, run `shasum -a 256` on the downloaded file to get its digest ID, and `wc -c` to get its number of bytes.
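If you prefer to compute these two values in Python rather than with `shasum` and `wc`, a minimal sketch could look like the following. The function name `digest_and_length` is made up for this example.

```python
import hashlib
import os
from typing import Tuple


def digest_and_length(file_path: str) -> Tuple[str, int]:
    """Return the SHA-256 hex digest and the size in bytes of `file_path`."""
    sha256 = hashlib.sha256()
    with open(file_path, "rb") as f:
        # Read in chunks so that large archives do not need to fit in memory.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            sha256.update(chunk)
    return sha256.hexdigest(), os.path.getsize(file_path)
```

The first value corresponds to the output of `shasum -a 256` and the second to the output of `wc -c`.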
If the downloaded file is an archive, you will also need to find the relative path within the archive to the binary, such as `bin/shellcheck`. You may need to use a tool like `unzip` to inspect the archive.
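For `.zip` archives, you could also list the members from Python instead of using `unzip`. This is a small sketch using the standard `zipfile` module; `find_member_paths` is a hypothetical helper, and other archive types such as `.tar.xz` would need the `tarfile` module instead.

```python
import zipfile
from typing import List


def find_member_paths(archive_path: str, binary_name: str) -> List[str]:
    """Return archive-relative paths whose final component equals `binary_name`."""
    with zipfile.ZipFile(archive_path) as archive:
        return [
            name
            for name in archive.namelist()
            # Skip directory entries, which end with a slash.
            if not name.endswith("/") and name.rsplit("/", 1)[-1] == binary_name
        ]
```

Each returned string is a path relative to the root of the archive, which is the kind of value you need for the binary's location.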
With this information, you can define a new `ExternalTool` subclass.
You must define the class properties `default_version` and `default_known_versions`. `default_known_versions` is a list of pipe-separated strings in the form `version|platform|sha256|length`. Use the values you found earlier by running `shasum` and `wc` for `sha256` and `length`, respectively. `platform` should be one of the platform identifiers that Pants supports, such as `macos_arm64`.
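To show how the four pipe-separated fields fit together, here is a small parser sketch. `KnownVersion` and `parse_known_version` are hypothetical names for illustration and are not how Pants itself parses these strings; the version, hash, and length in the usage lines are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownVersion:
    version: str
    platform: str
    sha256: str
    length: int


def parse_known_version(entry: str) -> KnownVersion:
    """Split a `version|platform|sha256|length` string into its four fields."""
    parts = [part.strip() for part in entry.split("|")]
    if len(parts) != 4:
        raise ValueError(f"Expected version|platform|sha256|length, got: {entry!r}")
    version, platform, sha256, length = parts
    return KnownVersion(version, platform, sha256, int(length))


# Usage with placeholder values (not a real release).
entry = "1.2.3|macos_arm64|" + "ab" * 32 + "|12345"
known = parse_known_version(entry)
```

Keeping the length as an integer and the digest as a string matches the outputs of `wc -c` and `shasum -a 256`.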
You must also define the methods `generate_url`, which returns the URL to make a GET request to download the file, and `generate_exe`, which returns the relative path to the binary in the downloaded digest. Both methods take `plat: Platform` as a parameter.
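The shape of these two methods can be sketched without Pants as follows. Everything here is assumed for illustration: `ExampleTool` is a plain class rather than a real `ExternalTool` subclass, `plat` is a plain string rather than a `Platform`, and the URL and archive layout are invented.

```python
class ExampleTool:
    """Plain-Python stand-in; a real tool would subclass the Pants `ExternalTool` class."""

    version = "1.2.3"  # hypothetical version

    def generate_url(self, plat: str) -> str:
        # Hypothetical download location; each project has its own URL scheme.
        return f"https://example.com/releases/v{self.version}/exampletool-{plat}.tar.xz"

    def generate_exe(self, plat: str) -> str:
        # Path to the binary, relative to the root of the extracted archive.
        return f"./exampletool-{plat}/bin/exampletool"


tool = ExampleTool()
url = tool.generate_url("macos_arm64")
exe = tool.generate_exe("macos_arm64")
```

Both methods depend on the platform because release archives are usually published once per platform.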
Because an `ExternalTool` is a subclass of [`Subsystem`](🔗), you must also define an `options_scope`. You may optionally register options by overriding the option-registration classmethod.
In your rules, include the `ExternalTool` as a parameter of the rule, then use `Get(DownloadedExternalTool, ExternalToolRequest)` to download and extract the tool.
A `DownloadedExternalTool` object has two fields: `digest: Digest` and `exe: str`. Use the `.exe` field as the first value of the `argv` for your `Process`, and include the `.digest` in its `input_digest`. If you want to use multiple digests for the input, call `Get(Digest, MergeDigests)` with the `.digest` and your other digests.
`Pex`: Install binaries through pip
If a tool can be installed via `pip` (e.g., Pytest or Black), you can install and run it using Pex.
When defining a `PexRequest` for a tool, you must give several arguments, including `main`. Set `internal_only` if the PEX is only used as an internal tool, rather than distributed to users (e.g. through the `package` goal). This speeds up building the PEX.
The `main` argument can be one of:
- `ConsoleScript("scriptname")`, where `scriptname` is a [console_script](🔗) that the tool installs
- `EntryPoint.parse("module")`, which executes the given module
- `EntryPoint.parse("module:func")`, which executes the given nullary function in the given module
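The two `EntryPoint.parse` forms differ only in whether a `:func` suffix is present. The sketch below illustrates that split in plain Python; `parse_entry_point` and `load_function` are hypothetical helpers, not the Pants implementation.

```python
import importlib
from typing import Callable, Optional, Tuple


def parse_entry_point(spec: str) -> Tuple[str, Optional[str]]:
    """Split "module" or "module:func" into (module, function-or-None)."""
    module, separator, function = spec.partition(":")
    if not module or (separator and not function):
        raise ValueError(f"Invalid entry point: {spec!r}")
    return module, (function or None)


def load_function(spec: str) -> Callable[[], object]:
    """Import the module from a "module:func" spec and return the named function."""
    module_name, function_name = parse_entry_point(spec)
    if function_name is None:
        raise ValueError(f"{spec!r} names a module, not a function")
    return getattr(importlib.import_module(module_name), function_name)


# Usage: `os.getcwd` takes no arguments, so it can serve as a nullary entry point.
get_cwd = load_function("os:getcwd")
```

The function form must be nullary because nothing is passed to it when it is invoked as the entry point.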
There are several other optional parameters that may be helpful.
The resulting `Pex` object has a `digest: Digest` field containing the built `.pex` file. This digest should be included in the `input_digest` to the `Process` you run.
Instead of the normal `Get(ProcessResult, Process)`, you should use `Get(ProcessResult, PexProcess)`, which will set up the environment properly for your Pex to execute. There is a predefined rule to go from `PexProcess -> Process`, so `Get(ProcessResult, PexProcess)` will cause the engine to run `PexProcess -> Process -> ProcessResult`.
`PexProcess` requires arguments for `pex: Pex`, `argv: Iterable[str]`, and `description: str`. It has several optional parameters that mirror the arguments to `Process`. If you specify `input_digest`, be careful to first use `Get(Digest, MergeDigests)` on the `pex.digest` and any of the other input digests.
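Conceptually, merging combines several file trees into the single tree that the process sees. The following sketch models a digest as a mapping from relative path to content hash, which is a simplification; `merge_trees` is hypothetical, and raising on conflicting contents for the same path is an assumption made for this illustration.

```python
from typing import Dict, Iterable

# Simplified model of a digest: relative path -> content hash.
FileTree = Dict[str, str]


def merge_trees(trees: Iterable[FileTree]) -> FileTree:
    """Union several file trees, refusing paths that map to different contents."""
    merged: FileTree = {}
    for tree in trees:
        for path, content_hash in tree.items():
            # `setdefault` keeps the first hash seen for a path.
            if merged.setdefault(path, content_hash) != content_hash:
                raise ValueError(f"Conflicting contents for {path!r}")
    return merged


# Usage: combine the built PEX with some source files (placeholder hashes).
combined = merge_trees([{"tool.pex": "hash-a"}, {"src/app.py": "hash-b"}])
```

If the tree containing the `.pex` file is left out of the merge, the process would have nothing to execute, which is why the merge needs to happen first.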
Use `PythonToolBase` when you need a Subsystem
Often, you will want to create a [`Subsystem`](🔗) for your Python tool to allow users to set options to configure the tool. You can subclass `PythonToolBase`, which subclasses `Subsystem`, to do this.
You must define class properties including `default_version` and `default_main`, and can optionally define others such as `default_extra_requirements`.
Then, you can use the subsystem when setting up your `Pex`.