Third-party dependencies

How to use third-party Python libraries in your project.

Pants handles dependencies with more precision than traditional Python workflows. Traditionally, you have a single heavyweight virtual environment that includes a large set of dependencies, whether or not you actually need them for your current task.

Instead, Pants understands exactly which dependencies every file in your project needs, and efficiently uses just that subset of dependencies needed for the task.

❯ ./pants dependencies src/py/util.py
3rdparty/py#requests

❯ ./pants dependencies --transitive src/py/app.py
3rdparty/py#flask
3rdparty/py#requests

Among other benefits, this precise and automatic understanding of your dependencies gives you fine-grained caching. This means, for example, that if none of the dependencies for a particular test file have changed, the cached result can be safely used.

Teaching Pants your "universe"(s) of dependencies

For Pants to know which dependencies each file uses, it must first know which specific dependencies are in your "universe", i.e. all the third-party dependencies your project directly uses.

By default, Pants uses a single universe for your whole project, but it's possible to set up multiple. See the header "Multiple resolves" in the "Lockfiles" section.

Each third-party dependency you directly use is modeled by a python_requirement target:

python_requirement(
    name="django",
    requirements=["Django==3.2.1"],
)

You do not need a python_requirement target for transitive dependencies, i.e. requirements that you do not directly import.

To minimize boilerplate, Pants has target generators to generate python_requirement targets for you:

  • python_requirements for requirements.txt.
  • poetry_requirements for Poetry projects.

requirements.txt

The python_requirements() target generator parses a requirements.txt-style file to produce a python_requirement target for each entry.

For example:

flask>=1.1.2,<1.3
requests[security]==2.23.0
dataclasses ; python_version<'3.7'
# This will generate three targets:
#
#  - //:reqs#flask
#  - //:reqs#requests
#  - //:reqs#dataclasses
python_requirements(name="reqs")

# The above target generator is spiritually equivalent to this:
python_requirement(
    name="flask",
    requirements=["flask>=1.1.2,<1.3"],
)
python_requirement(
    name="requests",
    requirements=["requests[security]==2.23.0"],
)
python_requirement(
    name="dataclasses",
    requirements=["dataclasses ; python_version<'3.7'"],
)

If the file uses a different name than requirements.txt, set source like this:

python_requirements(source="reqs.txt")

📘

Where should I put the requirements.txt?

You can name the file whatever you want, and put it wherever makes the most sense for your project.

In smaller repositories that only use Python, it's often convenient to put the file at the "build root" (top-level), as used on this page.

For larger repositories or multilingual repositories, it's often useful to have a 3rdparty or 3rdparty/python directory. Rather than the target's address being //:reqs#my_requirement, its address would be 3rdparty/python:reqs#my_requirement, for example; or 3rdparty/python#my_requirement if you leave off the name field for python_requirements. See Target Generation.

Poetry

The poetry_requirements() target generator parses the Poetry section in pyproject.toml to produce a python_requirement target for each entry.

[tool.poetry.dependencies]
python = "^3.8"
requests = {extras = ["security"], version = "~1"}
flask = "~1.12"

[tool.poetry.dev-dependencies]
isort = "~5.5"
# This will generate three targets:
#
#  - //:poetry#flask
#  - //:poetry#requests
#  - //:poetry#dataclasses
poetry_requirements(name="poetry")

# The above target generator is spiritually equivalent to this:
python_requirement(
    name="requests",
    requirements=["requests[security]>=1,<2.0"],
)
python_requirement(
    name="flask",
    requirements=["flask>=1.12,<1.13"],
)
python_requirement(
    name="isort",
    requirements=["isort>=5.5,<5.6"],
)

Note that Pants does not consume your poetry.lock file. Instead, see the section on lockfiles below.

How dependencies are chosen

Once Pants knows about your "universe"(s) of dependencies, it determines which subset should be used through dependency inference. Pants will read your import statements, like import django, and map it back to the relevant python_requirement target. Run ./pants dependencies path/to/file.py or ./pants dependencies path/to:target to confirm this works.

If dependency inference does not work—such as because it's a runtime dependency you do not import—you can explicitly add the python_requirement target to the dependencies field, like this:

python_sources(
    name="lib",
    dependencies=[
        # We don't have an import statement for this dep, so inference
        # won't add it automatically. We add it explicitly instead.
        "3rdparty/python#psyscopg2-binary",
    ],
)

Use modules and module_mapping when the module name is not standard

Some dependencies expose a module different than their project name, such as beautifulsoup4 exposing bs4. Pants assumes that a dependency's module is its normalized name—i.e. My-distribution exposes the module my_distribution. If that default does not apply to a dependency, it will not be inferred.

Pants already defines a default module mapping for some common Python requirements, but you may need to augment this by teaching Pants additional mappings:

# `modules` and `module_mapping` is only needed for requirements where 
# the defaults do not work.

python_requirement(
    name="my_distribution",
    requirements=["my_distribution==4.1"],
    modules=["custom_module"],
)

python_requirements(
    name="reqs",
    module_mapping={"my_distribution": ["custom_module"]},
)

poetry_requirements(
    name="poetry",
    module_mapping={"my_distribution": ["custom_module"]},
)

If the dependency is a type stub, and the default does not work, set type_stub_modules on the python_requirement target, and type_stubs_module_mapping on the python_requirements and poetry_requirements target generators. (The default for type stubs is to strip off types-, -types, -stubs, and stubs-. So, types-requests gives type stubs for the module requests.)

Warning: multiple versions of the same dependency

It's invalid in Python to have conflicting versions of the same requirement, e.g. Django==2 and Django==3. Instead, Pants supports "multiple resolves" (i.e. multiple lockfiles), as explained in the below section on lockfiles.

When you have multiple targets for the same dependency and they belong to the same resolve ("lockfile"), dependency inference will not work due to ambiguity. If you're using lockfiles—which we strongly recommend—the solution is to set the resolve field for problematic python_requirement targets so that each resolve has only one requirement and there is no ambiguity.

This ambiguity is often a problem when you have 2+ requirements.txt or pyproject.toml files in your project, such as project1/requirements.txt and project2/requirements.txt both specifying django. You may want to set up each poetry_requirements/python_requirements target generator to use a distinct resolve so that there is no overlap. Alternatively, if the versions are the same, you may want to consolidate the requirements into a common file.

Lockfiles

We strongly recommend using lockfiles because they make your builds more stable so that new releases of dependencies will not break your project. They also reduce the risk of supply chain attacks.

Pants has two types of lockfiles:

  • User lockfiles, for your own code such as packaging binaries and running tests.
  • Tool lockfiles, to install tools that Pants runs like Pytest and Flake8.

With both types of lockfiles, Pants can generate the lockfile for you with the generate-lockfiles goal.

User lockfiles

First, set [python].enable_resolves in pants.toml:

[python]
enable_resolves = true

By default, Pants will write the lockfile to 3rdparty/python/default.lock. If you want a different location, change [python].resolves like this:

[python]
enable_resolves = true

[python.resolves]
python-default = "lockfile_path.lock" 

Then, use ./pants generate-lockfiles to generate the lockfile.

❯ ./pants generate-lockfiles
19:00:39.26 [INFO] Completed: Generate lockfile for python-default
19:00:39.29 [INFO] Wrote lockfile for the resolve `python-default` to 3rdparty/python/default.lock

📘

FYI: user lockfiles improve performance

As explained at the top of these docs, Pants only uses the subset of the "universe" of your
dependencies that is actually needed for a build, such as running tests and packaging a wheel
file. This gives fine-grained caching and has other benefits like built packages
(e.g. PEX binaries) only including their true dependencies.

Without lockfiles, Pants must "resolve" the unique dependencies for each task, which involves
often-slow steps like choosing which versions of transitive dependencies to install.

Instead, with lockfiles, Pants already did the resolve beforehand, so only installs the specific
subset of the lockfile relevant to the task.

Multiple lockfiles

While it's often desirable to have a single lockfile for the whole repository for simplicity and consistency, sometimes you may need multiple. This is necessary, for example, when you have conflicting versions of requirements, such as one project using Django 2 and other projects using Django 3.

Start by defining multiple "resolves", which are logical names for lockfile paths. For example:

[python]
enable_resolves = true
default_resolve = "web-app"

[python.resolves]
data-science = "3rdparty/python/data_science.lock"
web-app = "3rdparty/python/web_app.lock"

Then, teach Pants which resolves every python_requirement target belongs to through the resolve field. It will default to [python].default_resolve.

python_requirement(
    name="ansicolors",
    requirements=["ansicolors==1.18"],
    resolve="web-app",
)

# Often, you will want to set `resolve` on the 
# `poetry_requirements` and `python_requirements`
# target generators.
poetry_requirements(
    name="poetry",
    resolve="data-science",
    # You can use `overrides` if you only want to change
    # some targets.
    overrides={"requests": {"resolve": "web-app"}},
)

If you want the same requirement to show up in multiple resolves, use the parametrize mechanism.

# The same requirement in multiple resolves:
python_requirement(
    name="ansicolors_web-app",
    requirements=["ansicolors==1.18"],
    resolve=parametrize("web-app", "data-science")
)

# You can parametrize target generators, including 
# via the `overrides` field:
poetry_requirements(
    name="poetry",
    resolve="data-science",
    overrides={
        "requests": {
            "resolve": parametrize("web-app", "data-science")
        }
    },
)

Then, run ./pants generate-lockfiles to generate the lockfiles. If the results aren't what you'd expect, adjust the prior step.

Finally, update your first-party targets like python_source / python_sources, python_test / python_tests, and pex_binary to set their resolve field. As before, the resolve field defaults to [python].default_resolve.

python_sources(
    resolve="web-app",
)

python_tests(
    name="tests",
    resolve="web-app",
    # You can use `overrides` to change certain generated targets
    overrides={"test_utils.py": {"resolve": "data-science"}},
)

pex_binary(
    name="main",
    entry_point="main.py",
    resolve="web-app",
)

If a first-party target is compatible with multiple resolves—such as some utility code—you can either use the parametrize mechanism with the resolve field or create distinct targets for the same entity.

All transitive dependencies of a target must use the same resolve. Pants's dependency inference already handles this for you by only inferring dependencies on targets that share the same resolve. If you incorrectly add a target from a different resolve to the dependencies field, Pants will error with a helpful message when building your code with goals like test, package, and run.

Tool lockfiles

Pants distributes a lockfile with each tool by default. However, if you change the tool's version and extra_requirements—or you change its interpreter constraints to not be compatible with our default lockfile—you will need to use a custom lockfile. Set the lockfile option in pants.toml for that tool, and then run ./pants generate-lockfiles.

[flake8]
version = "flake8==3.8.0"
lockfile = "3rdparty/flake8_lockfile.txt"  # This can be any path you'd like.

[pytest]
extra_requirements.add = ["pytest-icdiff"]
lockfile = "3rdparty/pytest_lockfile.txt"
❯  ./pants generate-lockfiles
19:00:39.26 [INFO] Completed: Generate lockfile for flake8
19:00:39.27 [INFO] Completed: Generate lockfile for pytest
19:00:39.29 [INFO] Wrote lockfile for the resolve `flake8` to 3rdparty/flake8_lockfile.txt
19:00:39.30 [INFO] Wrote lockfile for the resolve `pytest` to 3rdparty/pytest_lockfile.txt

You can also run ./pants generate-lockfiles --resolve=tool, e.g. --resolve=flake8, to only generate that tool's lockfile rather than generating all lockfiles.

To disable lockfiles entirely for a tool, set [tool].lockfile = "<none>" for that tool. Although we do not recommend this!

Manually generating lockfiles

Rather than using generate-lockfiles to generate PEX-style lockfiles, you can manually generate lockfiles. This can be helpful, for example, when adopting Pants in a repository already using Poetry by running poetry export --dev.

Manually generated lockfiles must either use Pex's JSON format or use pip's requirements.txt-style format (ideally with --hash entries for better supply chain security).
For example:

freezegun==1.2.0 \
  --hash=sha256:93e90676da3... \
  --hash=sha256:e19563d0b05...

For manually-generated user lockfiles, set [python].resolves to the path of your lockfile(s). Also set [python].resolves_generate_lockfiles to False so that Pants does not expect its metadata header. Warning: it will likely be slower to install manually generated user lockfiles than Pex ones because Pants cannot as efficiently extract the subset of requirements used for a particular task; see the option [python].run_against_entire_lockfile.

For manually-generated tool lockfiles, set [tool].lockfile to the path of your lockfile, e.g. [black].lockfile. Also set [python].invalid_lockfile_behavior = "error" so that Pants does not expect metadata headers. Note that this option will disable the check for all lockfiles, including user lockfiles, which may not be desirable. Feel free to open a GitHub issue if you want more precise control.

Advanced usage

Requirements with undeclared dependencies

Sometimes a requirement does not properly declare in its packaging metadata the other dependencies it depends on, so those will not be installed. It's especially common to leave off dependencies on setuptools, which results in import errors like this:

import pkg_resources
ModuleNotFoundError: No module named 'pkg_resources'

To work around this, you can use the dependencies field of python_requirement, so that anytime you depend on your requirement, you also bring in the undeclared dependency.

# First, make sure you have a `python_requirement` target for 
# the undeclared dependency.
python_requirement(
    name="setuptools",
    requirements=["setuptools"],
)

python_requirement(
    name="mongomock",
    requirements=["mongomock"],
    dependencies=[":setuptools"],
)

If you are using the python_requirements and poetry_requirements target generators, you can use the overrides field to do the same thing:

python_requirements(
    name="reqs",
    overrides={
        "mongomock": {"dependencies": [":reqs#setuptools"]},
    },
)
setuptools
mongomock

Version control requirements

You can install requirements from version control using two styles:

📘

Version control via SSH

When using version controlled direct references hosted on private repositories with SSH access:

[email protected] git+ssh://[email protected]:/myorg/[email protected]

...you may see errors like:

 Complete output (5 lines):
  [email protected]: Permission denied (publickey).
  fatal: Could not read from remote repository.
  Please make sure you have the correct access rights
  and the repository exists.
  ----------------------------------------

To fix this, Pants needs to be configured to pass relevant SSH specific environment variables to processes by adding the following to pants.toml:

[subprocess-environment]
env_vars.add = [
  "SSH_AUTH_SOCK",
]

Custom repositories

There are two mechanisms for setting up custom Python distribution repositories:

PEP-503 compatible indexes

Use [python-repos].indexes to add PEP 503-compatible indexes, like PyPI.

[python-repos]
indexes.add = ["https://custom-cheeseshop.net/simple"]

To exclusively use your custom index, i.e. to not use the default of PyPI, use indexes = [..] instead of indexes.add = [..].

pip --find-links

Use the option [python-repos].find_links for flat lists of packages. Same as pip's --find-links option, you can either use:

  • a URL to an HTML file with links to wheel and/or sdist files, or
  • a file:// absolute path to an HTML file with links, or to a local directory with wheel and/or
    sdist files. See the section on local requirements below.
[python-repos]
find_links = [
  "https://your/repo/here",
  "file:///Users/pantsbuild/prebuilt_wheels",
]

Authenticating to custom repos

To authenticate to custom repos, you may need to provide credentials (such as a username and password) in the URL.

You can use config file %(env.ENV_VAR)s interpolation to load the values via environment variables. This avoids checking in sensitive information to version control.

[python-repos]
indexes.add = ["http://%(env.INDEX_USERNAME)s:%(INDEX_PASSWORD)[email protected]/index"]

Alternatively, you can hardcode the value in a private (not checked-in) .pants.rc file in each user's Pants repo, that sets this config for the user:

[python-repos]
indexes.add = ["http://$USERNAME:[email protected]/index"]

Local requirements

There are two ways to specify local requirements from the filesystem:

python_requirement(
    name="django",
    # Use an absolute path to a .whl or sdist file.
    requirements=["Django @ file:///Users/pantsbuild/prebuilt_wheels/django-3.1.1-py3-none-any.whl"],
)

# Reminder: we could also put this requirement string in requirements.txt and use the
# `python_requirements` target generator.
  • The option [python-repos].find_links
[python-repos]
# Use an absolute path to a directory containing `.whl` and/or sdist files.
find_links = ["file:///Users/pantsbuild/prebuilt_wheels"]
❯ ls /Users/pantsbuild/prebuilt_wheels
ansicolors-1.1.8-py2.py3-none-any.whl
django-3.1.1-py3-none-any.whl
# Use normal requirement strings, i.e. without file paths.
python_requirement(name="ansicolors", requirements=["ansicolors==1.1.8"])
python_requirement(name="django", requirements=["django>=3.1,<3.2"])

# Reminder: we could also put these requirement strings in requirements.txt and use the
# `python_requirements` target generator

Unlike PEP 440 direct references, [python-repos].find_links allows you to use multiple artifacts for the same project name. For example, you can include multiple .whl and sdist files for the same project in the directory; if [python-repos].indexes is still set, then Pex/pip may use artifacts both from indexes like PyPI and from your local --find-links.

Both approaches require using absolute paths, and the files must exist on your machine. This is usually fine when locally iterating and debugging. This approach also works well if your entire team can use the same fixed location. Otherwise, see the below section.

Working around absolute paths

If you need to share the lockfile on different machines, and you cannot use the same absolute path, then you can use the option [python-repos].path_mappings along with [python-repos].find_links. (path_mappings is not intended for PEP 440 direct requirements.)

The path_mappings option allows you to substitute a portion of the absolute path with a logical name, which can be set to a different value than your teammates. For example, the path
file:///Users/pantsbuild/prebuilt_wheels/django-3.1.1-py3-none-any.whl could become file://${WHEELS_DIR}/django-3.1.1-py3-none-any.whl, where each Pants user defines what WHEELS_DIR should be on their machine.

This feature only works when using Pex lockfiles via [python].resolves and for tool lockfiles like Pytest and Black.

[python-repos].path_mappings expects values in the form NAME|PATH, e.g. WHEELS_DIR|/Users/pantsbuild/prebuilt_wheels. Also, still use an absolute path for [python-repos].find_links.

If possible, we recommend using a common file location for your whole team, and leveraging Pants's interpolation, so that you avoid each user needing to manually configure [python-repos].path_mappings and [python-repos].find_links. For example, in pants.toml, you could set [python-repos].path_mappings to WHEELS_DIR|%(buildroot)s/python_wheels and [python-repos].find_links to %(buildroot)s/python_wheels. Then, as long as every user has the folder python_wheels in the root of the repository, things will work without additional configuration. Or, you could use a value like %(env.HOME)s/pants_wheels for the path ~/pants_wheels.

[python-repos]
# No one needs to change these values, as long as they can use the same shared location.
find_links = ["file://%(buildroot)s/prebuilt_wheels"]
path_mappings = ["WHEELS_DIR|%(buildroot)s/prebuilt_wheels"]

If you cannot use a common file location via interpolation, then we recommend setting these options in a .pants.rc file. Every teammate will need to set this up for their machine.

[python-repos]
# Each user must set both of these to the absolute paths on their machines.
find_links = ["file:///Users/pantsbuild/prebuilt_wheels"]
path_mappings = ["WHEELS_DIR|/Users/pantsbuild/prebuilt_wheels"]

After initially setting up [python-repos].path_mappings and [python-repos].find_links, run ./pants generate-lockfiles or ./pants generate-lockfiles --resolve=<resolve-name>. You should see the path_mappings key set in the lockfile's JSON.

Constraints files

Sometimes, transitive dependencies of one of your third-party requirements can cause trouble. For example, sometimes requirements do not pin their dependencies well enough, and a newer version of its transitive dependency is released that breaks the requirement.
Constraints files allow you to pin transitive dependencies to certain versions, overriding the version that pip/Pex would normally choose.

Constraints files are configured per-resolve, meaning that the resolves for your user code from [python].resolves and each Python tool, such as Black and Pytest, can have different configuration. Use the option [python].resolves_to_constraints_file to map resolve names to paths to pip-compatible constraints files. For example:

[python.resolves_to_constraints_file]
data-science = "3rdparty/python/data_science_constraints.txt"
pytest = "3rdparty/python/pytest_constraints.txt"
requests==22.1.0
urrllib3==4.2

You can also set the key __default__ to apply the same constraints file to every resolve by default, although this is not always useful because resolves often need different constraints.

only_binary and no_binary

You can use [python].resolves_to_only_binary to avoid using sdists (source distributions) for certain requirements, and [python].resolve_to_no_binary to avoid using bdists (wheel files) for certain requirements.

only_binary and no_binary are configured per-resolve, meaning that the resolves for your user code from [python].resolves and each Python tool, such as Black and Pytest, can have different configuration. Use the options [python].resolves_to_only_binary and [python].resolves_to_no_binary to map resolve names to list of Python requirement names.

For example:

[python.resolves_to_only_binary]
data-science = ["numpy"]

[python.resolves_to_no_binary]
pytest = ["pytest-xdist"]
mypy_extra_type_stubs = ["django-stubs"]

You can also set the key __default__ to apply the same value to every resolve by default.

Tip: use ./pants export to create a virtual environment for IDEs

See Setting up an IDE for more information on ./pants export. This will create a virtual environment for your user code for compatibility with the rest of the Python ecosystem, e.g. IDEs like Pycharm.