## Reminder: running the autoformatters and linters
Most of Pants' style is enforced via Black, isort, Docformatter, Flake8, and MyPy. Run these tools frequently when developing.
Tip: improving Black's formatting by wrapping in `()`
Sometimes, Black will split code over multiple lines awkwardly. Often, you can improve Black's formatting by wrapping the expression in parentheses, then rerunning the formatter.
This is not mandatory, only encouraged.
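As a hedged illustration, the snippet below contrasts a split inside a call's brackets with the same expression wrapped in parentheses. All names are hypothetical, and the lines are kept short for readability; the exact layout Black produces depends on its version and the configured line length.

```python
from typing import Sequence


def summarize(values: Sequence[int], *, precision: int) -> str:
    # Hypothetical helper that renders the mean with a fixed precision.
    return f"{sum(values) / len(values):.{precision}f}"


measurements = [3, 4, 5]

# One way a long assignment may end up split: inside the call's brackets.
awkward = "average=" + summarize(
    measurements, precision=2
)

# After manually wrapping the whole right-hand side in parentheses,
# the expression can stay together on its own line.
wrapped = (
    "average=" + summarize(measurements, precision=2)
)
```

Both assignments compute the same value; only the layout differs.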
Comments must have a space after the starting `#`. All comments should be complete sentences and should end with a period.
Comment lines should not exceed 100 characters. Black will not auto-format this for you; you must manually format comments.
### When to comment
We strive for self-documenting code. Often, a comment can be better expressed by giving a variable a more descriptive name, adding type information, or writing a helper function.
Further, there is no need to document how typical Python constructs behave, including how type hints work.
Instead, comments are helpful to give context that cannot be inferred from reading the code. For example, comments may discuss performance, refer to external documentation / bug links, explain how to use the library, or explain why something was done a particular way.
When creating a TODO, first [create an issue](🔗) in GitHub. Then, link to the issue number in parentheses and add a brief description.
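One plausible shape for such a comment is sketched below. The issue number `#1234`, the function, and the description are all hypothetical placeholders.

```python
def parse_port(raw: str) -> int:
    # TODO(#1234): Reject ports outside the valid range once the new option parser lands.
    return int(raw)
```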
### Use f-strings
Use f-strings instead of `.format()` and `%` formatting.
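A small comparison, using hypothetical variable names, of an f-string against the older formatting styles it replaces:

```python
name = "pants"
version = (2, 0)

# Preferred: an f-string keeps each value next to where it is rendered.
preferred = f"{name} v{version[0]}.{version[1]}"

# Discouraged equivalents, shown only for comparison.
with_format = "{} v{}.{}".format(name, version[0], version[1])
with_percent = "%s v%d.%d" % (name, version[0], version[1])
```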
### Prefer conditional expressions (ternary expressions)
Similar to most languages' ternary expressions using `?`, Python has [conditional expressions](🔗). Prefer these to explicit `if else` statements because we generally prefer expressions to statements and they often better express the intent of assigning one of two values based on some predicate.
Conditional expressions do not work in more complex situations, such as assigning multiple variables based on the same predicate or wanting to store intermediate values in the branch. In these cases, you can use `if else` statements.
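The sketch below, with hypothetical function names, shows a conditional expression for the simple case and an `if else` statement for a case where two variables depend on the same predicate.

```python
from typing import Tuple


def describe(num_failures: int) -> str:
    # Preferred: a single expression picks one of two values.
    status = "failed" if num_failures else "passed"
    return f"Tests {status}."


def status_and_exit_code(num_failures: int) -> Tuple[str, int]:
    # Two variables depend on the same predicate, so a statement reads better.
    if num_failures:
        status = "failed"
        exit_code = 1
    else:
        status = "passed"
        exit_code = 0
    return status, exit_code
```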
### Prefer early returns in functions
Often, functions will have branching based on a condition. When you `return` from a branch, you will exit the function, so you no longer need `elif` or `else` in the subsequent branches.
Why prefer this? It reduces nesting and reduces the cognitive load of readers. See [here](🔗) for more explanation.
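A minimal before-and-after sketch, using hypothetical functions that behave identically:

```python
def describe_discouraged(exit_code: int) -> str:
    # Every branch returns, so the `elif` and `else` are redundant.
    if exit_code == 0:
        return "success"
    elif exit_code == 1:
        return "failure"
    else:
        return "error"


def describe(exit_code: int) -> str:
    # Preferred: return early and keep the remaining code unnested.
    if exit_code == 0:
        return "success"
    if exit_code == 1:
        return "failure"
    return "error"
```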
### Use collection literals
Collection literals are easier to read and have better performance.
We allow the `dict` constructor because using the constructor will enforce that all the keys are `str`s. However, usually prefer a literal.
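For illustration, here are literals next to the constructor calls they replace. The variable names are hypothetical.

```python
# Preferred: literals.
names = ["a", "b"]
point = (1, 2)
options = {"level": "info", "color": "auto"}
empty_list = []
empty_dict = {}

# Discouraged: constructor calls that build the same values.
names_via_constructor = list(("a", "b"))
point_via_constructor = tuple([1, 2])
empty_list_via_constructor = list()

# Allowed: keyword arguments to `dict` can only produce string keys.
options_via_constructor = dict(level="info", color="auto")
```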
### Prefer merging collections through unpacking
Python has several ways to merge iterables (e.g. sets, tuples, and lists): using `+` or `|`, using mutation like `extend()`, and using unpacking with the `*` character. Prefer unpacking because it makes it easier to merge collections with individual elements, it is formatted better by Black, and it allows merging different iterable types together, such as a list and a tuple.
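A short sketch of `*` unpacking with hypothetical values, mixing a list, a tuple, and an individual element:

```python
base = ["a", "b"]
extra = ("c", "d")

# A list and a tuple merge directly, along with a single extra element.
merged_list = [*base, *extra, "e"]
merged_tuple = (*base, "e")
merged_set = {*base, *extra, "a"}

# The `+` operator cannot mix a list with a tuple without a conversion.
merged_with_plus = base + list(extra) + ["e"]
```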
For dictionaries, the only two ways to merge are using mutation like `.update()` or using `**` unpacking (we cannot use PEP 584's `|` operator yet because we need to support Python versions earlier than 3.9). Prefer merging with `**` for the same reasons as iterables, and because we prefer expressions to mutation.
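A minimal sketch of merging dictionaries with `**`, using hypothetical option names. When a key appears more than once, the value from the later entry is kept.

```python
defaults = {"color": True, "level": "info"}
overrides = {"level": "debug"}

# Preferred: build a new dict in one expression; later entries override earlier ones.
merged = {**defaults, **overrides, "jobs": 4}

# Discouraged: the same result through mutation.
merged_by_mutation = dict(defaults)
merged_by_mutation.update(overrides)
merged_by_mutation["jobs"] = 4
```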
### Prefer comprehensions
[Comprehensions](🔗) should generally be preferred to explicit loops and `filter` when creating a new collection. (See https://www.youtube.com/watch?v=ei71YpmfRX4 for a deep dive on comprehensions.)
Why avoid `filter`? Normally, functional constructs like this are fantastic and you'll find them abundantly in the [Rust codebase](🔗). They are awkward in Python, however, due to poor support for lambdas and because you would typically need to wrap the expression in a call to `list()` or `tuple()` to convert the resulting lazy iterator to a concrete collection.
There are some exceptions, including, but not limited to:
- If mutations are involved, use a `for` loop.
- If constructing multiple collections by iterating over the same original collection, use a `for` loop for performance.
- If the comprehension gets too complex, a `for` loop may be appropriate. However, first consider refactoring with a helper function.
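The sketch below uses hypothetical data to contrast a comprehension with `filter`, and shows the exception where one pass builds two collections.

```python
words = ["apple", "Banana", "cherry", "Date"]

# Preferred: comprehensions produce concrete collections directly.
capitalized = [word for word in words if word[0].isupper()]
lengths = {word: len(word) for word in words}

# Discouraged: `filter` needs a lambda and a wrapping `list()` call.
capitalized_via_filter = list(filter(lambda word: word[0].isupper(), words))

# Exception: one `for` loop builds two collections in a single pass.
upper = []
lower = []
for word in words:
    if word[0].isupper():
        upper.append(word)
    else:
        lower.append(word)
```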
### Prefer dataclasses
We prefer [dataclasses](🔗) because they are declarative, integrate nicely with MyPy, and generate sensible defaults, such as a sensible `__repr__`.
Dataclasses should be marked with `frozen=True`.
If you want to validate the input, use `__post_init__`.
If you need a custom constructor, such as to transform the parameters, use `@frozen_after_init` and `unsafe_hash=True` instead of `frozen=True`.
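A minimal sketch of an immutable dataclass that validates its input, using only the standard library's `dataclasses` module. The `Version` class is hypothetical, and the Pants-specific `@frozen_after_init` decorator is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    major: int
    minor: int

    def __post_init__(self) -> None:
        # Runs after the generated `__init__`, so it can validate the fields.
        if self.major < 0 or self.minor < 0:
            raise ValueError(f"Version components must be non-negative: {self}")


release = Version(2, 0)
```

Instances compare by value, are hashable, and raise an error on attribute assignment.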
## Type hints
Refer to [MyPy documentation](🔗) for an explanation of type hints, including some advanced features you may encounter in our codebase, such as `Protocol`.
### Annotate all new code
All new code should have type hints. Even simple functions like unit tests should have annotations. Why? MyPy will only check the body of functions if they have annotations.
Precisely, all function definitions should have annotations for their parameters and their return type. MyPy will then tell you which other lines need annotations.
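As a small sketch with hypothetical names, note the `-> None` on the test function. By default, MyPy skips the body of a function that has no annotations at all.

```python
def normalize(name: str) -> str:
    return name.strip().lower()


def test_normalize() -> None:
    # The return annotation is what makes MyPy check this body.
    assert normalize("  Pants ") == "pants"
```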
Interacting with legacy code? Consider adding type hints.
Pants did not widely use type hints until the end of 2019. So, a substantial portion of the codebase is still untyped.
If you are working with legacy code, it is often valuable to start by adding type hints. This will both help you to understand that code and to improve the quality of the codebase. Land those type hints as a precursor to your main PR.
### Prefer `cast()` to override annotations
MyPy will complain when it cannot infer the types of certain lines. You must then either fix the underlying API that MyPy does not understand or explicitly provide an annotation at the call site.
Prefer fixing the underlying API if easy to do, but otherwise, prefer using `cast()` instead of a variable annotation.
Why? MyPy will warn if the `cast` ever becomes redundant (when the `warn_redundant_casts` option is enabled), either because MyPy became more powerful or the untyped code became typed.
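A hedged sketch of the two options at a call site. The function `load_untyped` is a hypothetical stand-in for an API that MyPy cannot see through.

```python
from typing import Any, Dict, cast


def load_untyped() -> Any:
    # Stand-in for a legacy or third-party call without useful type hints.
    return {"name": "pants"}


# Preferred: MyPy can flag this cast if `load_untyped` later gains precise types.
config = cast(Dict[str, str], load_untyped())

# Discouraged: a variable annotation would stay silently in place forever.
config_annotated: Dict[str, str] = load_untyped()
```

At runtime, `cast()` returns its argument unchanged; it only affects type checking.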
### Use error codes in `
# type: ignore` comments
MyPy will output the code at the end of the error message in square brackets.
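For instance, a MyPy message ending in `[assignment]` corresponds to the error code `assignment`. The variable below is a hypothetical example of silencing only that one code.

```python
# MyPy would report something like:
#   error: Incompatible types in assignment
#   (expression has type "str", variable has type "int")  [assignment]
count: int = "not a number"  # type: ignore[assignment]
```

A bare `# type: ignore` would hide every error on the line, including unrelated ones introduced later.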
### Prefer Protocols ("duck types") for parameters
Python type hints use [Protocols](🔗) as a way to express ["duck typing"](🔗). Rather than saying you need a particular class, like a list, you describe which functionality you need and don't care what class is used.
For example, all of these annotations are correct:
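As a minimal sketch with hypothetical function names, each signature below type-checks when called with a list, but the looser ones also accept tuples, sets, and generators:

```python
from typing import Iterable, List, Sequence


def total_from_list(values: List[int]) -> int:
    return sum(values)


def total_from_sequence(values: Sequence[int]) -> int:
    return sum(values)


def total_from_iterable(values: Iterable[int]) -> int:
    return sum(values)


def sorted_unique(values: Iterable[int]) -> List[int]:
    # Loose parameter type, precise return type.
    return sorted(set(values))


numbers = [3, 1, 3]
from_tuple = total_from_sequence((3, 1, 3))
from_generator = total_from_iterable(n for n in numbers)
```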
Generally, prefer using a protocol like `Sequence` or `Mapping` when annotating function parameters, rather than using concrete types like `List` and `Dict`. Why? This often makes call sites much more ergonomic.
The return type, however, should usually be as precise as possible so that call sites have better type inference.