Debugging and benchmarking

Some techniques to figure out why Pants is behaving the way it is.

Benchmarking with hyperfine

We use hyperfine (https://github.com/sharkdp/hyperfine) for benchmarking, especially to compare timings before and after a change.

When benchmarking, decide whether you care about cold-cache performance, warm-cache performance, or both. For cold-cache runs, pass the Pants options --no-pantsd --no-local-cache. For warm-cache runs, use hyperfine's option --warmup=1.

For example:

❯ hyperfine --warmup=1 --runs=5 './pants list ::'
❯ hyperfine --runs=5 './pants --no-pantsd --no-local-cache lint ::'
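
The cold and warm setups above can be wrapped in a small helper so that the flags stay identical between a "before" and an "after" run. This is a minimal sketch, not part of Pants or hyperfine: bench_pants and the DRY_RUN variable are hypothetical names, and the run count of 5 is simply copied from the examples above.

```shell
# bench_pants: hypothetical helper around the two hyperfine invocations above.
# Usage: bench_pants cold|warm <pants args>
# Set DRY_RUN=1 to print the command (without shell quoting) instead of running it.
bench_pants() {
  if [ "$#" -lt 2 ]; then
    echo "usage: bench_pants cold|warm <pants args>" >&2
    return 2
  fi
  local mode="$1"
  shift
  local goal="$*"
  case "$mode" in
    cold)
      # Cold: bypass pantsd and the local cache on every timed run.
      set -- hyperfine --runs=5 "./pants --no-pantsd --no-local-cache $goal"
      ;;
    warm)
      # Warm: one untimed warmup run happens before the timed runs.
      set -- hyperfine --warmup=1 --runs=5 "./pants $goal"
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 2
      ;;
  esac
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

With this loaded, bench_pants warm list :: runs the warm-cache benchmark and bench_pants cold lint :: runs the cold-cache one. Running the same call on both sides of a change keeps the two measurements comparable.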

Profiling with py-spy

py-spy is a sampling profiler that can also be used to compare performance before and after a change: https://github.com/benfred/py-spy.

To profile with py-spy:

  1. Activate Pants' development venv
    • source ~/.cache/pants/pants_dev_deps/<your platform dir>/bin/activate
  2. Add Pants' code to Python's path
    • export PYTHONPATH=src/python:$PYTHONPATH
  3. Run Pants with py-spy (be sure to disable pantsd)
    • py-spy record --subprocesses -- python -m pants.bin.pants_loader --no-pantsd <pants args>

The default output is a flamegraph. py-spy can also output speedscope (https://github.com/jlfwong/speedscope) JSON with the --format speedscope flag. The resulting file can be uploaded to https://www.speedscope.app/ which provides a per-process, interactive, detailed UI.

To also profile the Rust code, pass the --native flag to py-spy. The resulting output will then contain frames from Pants' Rust code.
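
The options above can be combined into one invocation. The following is a sketch under stated assumptions: profile_pants, PROFILE_OUT, PROFILE_NATIVE and DRY_RUN are hypothetical names, the default output file name is arbitrary, and the function assumes that the development venv is already active and PYTHONPATH is already set as in steps 1 and 2.

```shell
# profile_pants: hypothetical helper that records a speedscope profile of Pants.
# Assumes the Pants development venv is active and PYTHONPATH is already set.
# PROFILE_OUT    output file (default name below is an arbitrary choice)
# PROFILE_NATIVE set to any non-empty value to add py-spy's --native flag
# DRY_RUN        set to 1 to print the command instead of running it
profile_pants() {
  local out="${PROFILE_OUT:-pants-profile.speedscope.json}"
  # ${PROFILE_NATIVE:+--native} expands to --native only when the variable is non-empty.
  set -- py-spy record --subprocesses --format speedscope --output "$out" \
    ${PROFILE_NATIVE:+--native} -- \
    python -m pants.bin.pants_loader --no-pantsd "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

For example, PROFILE_NATIVE=1 profile_pants list :: would record Python and Rust frames for a list run and write JSON that can be uploaded to https://www.speedscope.app/.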

Identifying the impact of Python's GIL (on macOS)

Obtaining full thread backtraces

Pants runs as a Python program that calls into a native Rust library. When debugging locking and deadlock issues, it is useful to capture dumps of the thread stacks to figure out where a deadlock may be occurring.

One-time setup:

  1. Ensure that gdb is installed.
    • Ubuntu: sudo apt install gdb
  2. Ensure that the kernel is configured to allow debuggers to attach to processes that are not in the same parent/child process hierarchy.
    • echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
    • To make the change permanent, add a file to /etc/sysctl.d named 99-ptrace.conf with contents kernel.yama.ptrace_scope = 0. Note: This is a security exposure if you are not normally debugging processes across the process hierarchy.
  3. Ensure that the debug info for your system Python binary is installed.
    • Ubuntu: sudo apt install python3-dbg
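
Before attaching gdb, it can help to confirm that step 2 took effect. The sketch below reads the same sysctl file that step 2 writes. ptrace_scope_status is a hypothetical helper name, and its optional path argument exists only so the function can be exercised against a test file.

```shell
# ptrace_scope_status: hypothetical helper that reports the current Yama
# ptrace_scope setting. An optional argument overrides the sysctl path.
ptrace_scope_status() {
  local path="${1:-/proc/sys/kernel/yama/ptrace_scope}"
  if [ ! -r "$path" ]; then
    # The file is absent on systems where the Yama module is not in use.
    echo "unknown: $path is not readable"
    return 1
  fi
  local value
  value="$(cat "$path")"
  case "$value" in
    0)
      echo "ok: ptrace_scope=0, debuggers may attach to unrelated processes"
      ;;
    *)
      echo "restricted: ptrace_scope=$value, see step 2 above to set it to 0"
      return 1
      ;;
  esac
}
```

Calling ptrace_scope_status with no arguments checks the live kernel setting. A non-zero exit status means gdb may be refused when attaching to a process it did not start.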

Dumping thread stacks:

  1. Find the process ID of the running Pants process (the listing may include pantsd if pantsd is enabled).
    • Run: ps -ef | grep pants
  2. Invoke gdb with the python binary and the process ID:
    • Run: gdb /path/to/python/binary PROCESS_ID
  3. Enable logging to write the thread dump to gdb.txt: set logging on
  4. Dump all thread backtraces: thread apply all bt
  5. If you use pyenv to manage your Python install, a gdb script will exist in the same directory as the Python binary. Source it into gdb:
    • source ~/.pyenv/versions/3.8.5/bin/python3.8-gdb.py (if using version 3.8.5)
  6. Dump all Python stacks: thread apply all py-bt
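
The interactive steps above can also be run non-interactively with gdb's -batch and -ex options, which is convenient when a hang has to be captured quickly. This is a sketch rather than an established Pants workflow: dump_pants_stacks and DRY_RUN are hypothetical names, and the third argument is only needed when the py-bt helpers are not loaded automatically.

```shell
# dump_pants_stacks: hypothetical helper that dumps native (and optionally
# Python) stacks for a running process, then lets gdb exit and detach.
#   $1 path to the Python binary
#   $2 process ID found with ps
#   $3 optional path to the python gdb script that provides py-bt
# Set DRY_RUN=1 to print the command (without shell quoting) for inspection only.
dump_pants_stacks() {
  if [ "$#" -lt 2 ]; then
    echo "usage: dump_pants_stacks PYTHON_BINARY PROCESS_ID [PY_GDB_SCRIPT]" >&2
    return 2
  fi
  local python="$1" pid="$2" script="${3:-}"
  set -- gdb -batch -ex "thread apply all bt"
  if [ -n "$script" ]; then
    # Load the Python helpers first, then dump the Python-level stacks.
    set -- "$@" -ex "source $script" -ex "thread apply all py-bt"
  fi
  set -- "$@" "$python" "$pid"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Because batch mode writes to standard output, the dump can be redirected to a file of your choice, for example dump_pants_stacks /path/to/python/binary PROCESS_ID > stacks.txt, instead of relying on set logging on and gdb.txt.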
