See the example-python repository for an [example GitHub Actions workflow](🔗).
## Directories to cache
In your CI's config file, we recommend caching these directories:
- `$HOME/.cache/pants/setup`: the initial bootstrapping of Pants.
- `$HOME/.cache/pants/named_caches`: caches of tools like pip and PEX.
- `$HOME/.cache/pants/lmdb_store`: cached content for prior Pants runs, e.g. prior test results.
See [Troubleshooting](🔗) for how to change these cache locations.
### Nuking the cache when too big
In CI, the cache must be uploaded and downloaded on every run. This takes time, so there is a tradeoff: a cache that is too large will slow down your CI.
You can use this script to nuke the cache when it gets too big:
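A minimal sketch of such a script, assuming the default cache locations listed above. The function name `nuke_if_too_big` and the size limits (in MB) are illustrative assumptions, not Pants defaults; tune them to your CI provider's cache limits.

```shell
#!/usr/bin/env bash
# Delete a cache directory once it grows past a size limit (in megabytes).
# The function name and the limits below are illustrative, not Pants defaults.
nuke_if_too_big() {
  local path="$1"
  local limit_mb="$2"
  # Nothing to do if the directory has not been created yet.
  [ -d "$path" ] || return 0
  local size_mb
  # du -sm prints "<size in MB><TAB><path>"; keep only the size.
  size_mb="$(du -sm "$path" | cut -f1)"
  if [ "$size_mb" -gt "$limit_mb" ]; then
    echo "${path} is too large (${size_mb} MB), nuking it."
    rm -rf "$path"
  fi
}

nuke_if_too_big "$HOME/.cache/pants/lmdb_store" 2048
nuke_if_too_big "$HOME/.cache/pants/setup" 256
nuke_if_too_big "$HOME/.cache/pants/named_caches" 1024
```

One option is to run this as a step before your CI provider saves the cache, so that an oversized directory is dropped rather than uploaded.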
### Tip: check cache performance with `[stats].log`
Set the option `[stats].log = true` in `pants.ci.toml` for Pants to print metrics of your cache's performance at the end of the run, including the number of cache hits and the total time saved thanks to caching.

You can also add `plugins = ["hdrhistogram"]` to the `[GLOBAL]` section of `pants.ci.toml` for Pants to print histograms of cache performance, e.g. the size of blobs cached.
### Remote caching
Rather than storing your cache with your CI provider, remote caching stores the cache in the cloud, using gRPC and the open-source Remote Execution API for low-latency and fine-grained caching.
This brings several benefits over local caching:
- All machines and CI jobs share the same cache.
- Remote caching downloads precisely what is needed by your run, when it's needed, rather than pessimistically downloading the entire cache at the start of the run.
- No download and upload stage for your cache.
- No need to "nuke" your cache when it gets too big.
See [Remote Caching](🔗) for more information.
## Recommended commands
### Approach #1: only run over changed files
Because Pants understands the dependencies of your code, you can use Pants to speed up your CI by only running tests and linters over files that actually changed.
We recommend running these commands in CI:
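A minimal sketch of such a CI step, assuming the `./pants` launcher script sits at the repository root and `origin/main` is the branch to compare against. The `ci_changed_only` wrapper and the `PANTS` variable are hypothetical helpers introduced for illustration; `--changed-since` and `--changed-dependees=transitive` are Pants options, though the exact spelling can differ between Pants versions, so check the options reference for the version you use.

```shell
#!/usr/bin/env bash
# Sketch of an "only changed files" CI step.
# Assumptions: ./pants is the launcher script at the repo root (override with
# PANTS=...), and origin/main is the branch to diff against.
PANTS="${PANTS:-./pants}"

ci_changed_only() {
  local base="${1:-origin/main}"
  # Bootstrap Pants.
  "$PANTS" --version || return 1
  # Lint only the targets that changed.
  "$PANTS" "--changed-since=${base}" lint || return 1
  # Type-check and test changed targets plus everything that depends on them.
  "$PANTS" "--changed-since=${base}" --changed-dependees=transitive check test
}
```

Call it from your CI script as `ci_changed_only origin/main`. Splitting the work into two Pants invocations lets linting skip dependees while `check` and `test` still cover them.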
Because most linters do not care about a target's dependencies, we lint all changed targets, but not any dependees of those changed targets.
Meanwhile, tests should be rerun when any changes are made to the tests _or_ to dependencies of those tests, so we use the option `--changed-dependees=transitive`. `check` should also run on any transitive changes.
See [Advanced target selection](🔗) for more information on `--changed-since` and alternative techniques to select targets to run in CI.
This approach will not handle all cases, such as hooking up a new linter. For example, if you add a new plugin to Flake8, Pants will still only run over changed files, meaning you may miss some new lint issues.

For absolute correctness, you may want to use Approach #2. Alternatively, add conditional logic to your CI, e.g. so that any change to `pants.toml` triggers Approach #2.
#### GitHub Actions: use the Checkout action
To use `--changed-since`, you may want to use the [Checkout action](🔗). By default, Checkout will only fetch the latest commit; you likely want to set `fetch-depth` to fetch prior commits.
#### GitLab CI: disable shallow clones or fetch main branch
GitLab's merge pipelines make a shallow clone by default, which only contains recent commits for the feature branch being merged. That severely limits `--changed-since`. There are two possible workarounds:
- Clone the entire repository: go to the "CI / CD" settings and erase the number from the "Git shallow clone" field of the "General pipelines" section. Don't forget to "Save changes". This has the advantage of cloning everything, which is also its biggest long-term disadvantage.
- A more targeted, and hence more lightweight, intervention leaves the shallow clone setting at its default value and instead fetches the `main` branch as well:
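A minimal sketch of the second workaround, assuming the remote is named `origin` and the target branch is `main`. The `fetch_main_branch` wrapper is a hypothetical name used only to group the commands; in a CI job you could equally run the four `git` commands directly.

```shell
#!/usr/bin/env bash
# Sketch: make origin/main available inside a shallow clone.
# The function name is illustrative; "main" is assumed to be the target branch.
fetch_main_branch() {
  git branch -a                          # Branches visible before fetching.
  git remote set-branches origin main    # Point origin's fetch refspec at main.
  git fetch --depth 1 origin main        # Fetch just the tip of main.
  git branch -a                          # origin/main should now be listed.
}
```

Fetching with `--depth 1` keeps the clone small while still giving `--changed-since=origin/main` a ref to compare against.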
The `git branch` commands are only included to print out all available branches before and after fetching `main`.
### Approach #2: run over everything
Alternatively, you can simply run over all your code. Pants's caching means that you will not need to rerun on unchanged files.
However, when the cache gets too big, it should be nuked (see "Directories to cache"), so your CI may end up doing more work than with Approach #1.
This approach works particularly well if you are using remote caching.
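A minimal sketch of this approach, again assuming the `./pants` launcher script at the repository root. The `ci_run_everything` wrapper and the `PANTS` variable are hypothetical helpers; `::` is the Pants address spec that selects every target in the repository.

```shell
#!/usr/bin/env bash
# Sketch of a "run over everything" CI step.
# Assumption: ./pants is the launcher script at the repo root (override with
# PANTS=...).
PANTS="${PANTS:-./pants}"

ci_run_everything() {
  # Bootstrap Pants.
  "$PANTS" --version || return 1
  # Lint, type-check, and test every target; cached results are reused.
  "$PANTS" lint check test ::
}
```

Call it from your CI script as `ci_run_everything`.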
## Configuring Pants for CI: `pants.ci.toml`
Sometimes, you may want config specific to your CI, such as turning on test coverage reports. If you want CI-specific config, create a dedicated `pants.ci.toml` [config file](🔗).
Then, in your CI script or config, set the environment variable `PANTS_CONFIG_FILES=pants.ci.toml` to use this new config file, in addition to `pants.toml`.
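A minimal sketch of the whole setup, written as a shell snippet to keep it self-contained. In a real repository `pants.ci.toml` would normally be committed next to `pants.toml` rather than generated; the two options shown (`[stats].log` and `[test].use_coverage`) are only examples of CI-specific settings.

```shell
#!/usr/bin/env bash
# Sketch: create a small CI-only config and tell Pants to load it.
# The file is written inline here only to keep the example self-contained.
cat > pants.ci.toml <<'EOF'
[stats]
# Print cache metrics at the end of each run.
log = true

[test]
# Turn on test coverage reports.
use_coverage = true
EOF

# Pants reads this file in addition to pants.toml.
export PANTS_CONFIG_FILES=pants.ci.toml
```

Most CI providers also let you set `PANTS_CONFIG_FILES` in the job's environment settings instead of exporting it in a script.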
### Tuning resource consumption (advanced)
Pants allows you to control its resource consumption. These options all have sensible defaults. In most cases, there is no need to change them. However, you may benefit from tuning these options.
Concurrency options:
- [`process_execution_local_parallelism`](🔗): number of concurrent processes that may be executed locally.
- [`rule_threads_core`](🔗): number of threads to keep active to execute `@rule` logic.
- [`rule_threads_max`](🔗): maximum number of threads to use to execute `@rule` logic.

Memory usage options:
- [`pantsd`](🔗): enable or disable the Pants daemon, which uses an in-memory cache to speed up subsequent runs after the first run in CI.
- [`pantsd_max_memory_usage`](🔗): reduce or increase the size of Pantsd's in-memory cache.
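Like other Pants options, these can be set through `PANTS_`-prefixed environment variables instead of a config file, which can be convenient in CI. A minimal sketch, assuming a 2-core runner; the value `2` is purely illustrative and not a recommendation, since the defaults are usually fine.

```shell
#!/usr/bin/env bash
# Sketch: pin Pants's concurrency for a small CI runner via environment
# variables. The value 2 is an assumption matching a 2-core runner.
export PANTS_PROCESS_EXECUTION_LOCAL_PARALLELISM=2
export PANTS_RULE_THREADS_CORE=2
```

The same values could instead go in the `[GLOBAL]` section of `pants.ci.toml`.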
The default test runners for these CI providers have the following resources. If you are using a custom runner, e.g. enterprise, check with your CI provider.
| CI Provider | # CPU cores | RAM | Docs |
| --- | --- | --- | --- |
| GitHub Actions, Linux | 2 | 7 GB | https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources |
| Travis, Linux | 2 | 7.5 GB | https://docs.travis-ci.com/user/reference/overview/#virtualisation-environment-vs-operating-system |
| Circle CI, Linux, free plan | 2 | 4 GB | https://circleci.com/docs/2.0/credits/#free-plan |
| GitLab, Linux shared runners | 1 | 3.75 GB | https://docs.gitlab.com/ee/user/gitlab_com/#linux-shared-runners |
## Tip: store Pants logs as artifacts
We recommend that you configure your CI system to store the Pants log (`.pants.d/pants.log`) as a build artifact, so that it is available in case you need to troubleshoot CI issues.
Different CI providers and systems have different ways to configure build artifacts:
- Circle CI - [Storing artifacts](🔗)
- GitHub Actions - [Storing Artifacts](🔗) - [example in the pants repo](🔗)
- Bitbucket pipelines - [Using artifacts](🔗)
- Jenkins - [Recording artifacts](🔗)
It's particularly useful to configure your CI to always upload the log, even if prior steps in your pipeline failed.