These docs are for Pants version 2.10, which is no longer officially supported. The latest version is 2.18.

Pants supports code generators that convert a protocol language like Protobuf into other languages, such as Python or Java. The same protocol source may be used to generate multiple distinct languages.

Pants will not actually write the generated files to disk, except when running `./pants export-codegen`. Instead, any targets that depend on the protocol targets will cause their code to be generated, and those generated files will be copied over into the "chroot" (temporary directory) where Pants executes.

Example: Protobuf -> Python

This guide walks through each step of adding Protobuf to generate Python sources. See [here](🔗) for the final result.

This guide assumes that you are running a code generator that already exists outside of Pants as a stand-alone binary, such as Protoc or Thrift.

If you are instead writing your own code generation logic inline, you can skip Step 2. In Step 4, rather than running a `Process`, use [`CreateDigest`](🔗).

## 1. Create a target type for the protocol

You will need to define a new target type to allow users to provide metadata for their protocol files, e.g. their `.proto` files. See [Creating new targets](🔗) for a guide on how to do this.

You should define a subclass of `SourcesField`, like `ProtobufSourceField` or `ThriftSourceField`. This is important for Step 3.

Typically, you will want to register the `Dependencies` field.

### Target type already exists?

If Pants already has a target type for your protocol, such as the existing `ProtobufSourceTarget`, you should not create a new target type.

Instead, you can optionally add any additional fields that you would like through plugin fields. See [Extending pre-existing targets](🔗).

### Add dependency injection (Optional)

Often, generated files will depend on a runtime library to work. For example, Python files generated from Protobuf depend on the `protobuf` library.

Instead of users having to explicitly add this dependency every time, you can dynamically inject this dependency for them.

To inject dependencies:

  1. Subclass the `Dependencies` field. Register this subclass on your protocol target type.

  2. Define a subclass of `InjectDependenciesRequest` and set the class property `inject_for` to the `Dependencies` subclass defined in the previous step. Register this new class with a [`UnionRule`](🔗) for `InjectDependenciesRequest`.

  3. Create a new rule that takes your new `InjectDependenciesRequest` subclass as a parameter and returns `InjectedDependencies`.

For example, in Pants's Protobuf implementation, Pants looks for a `python_requirement` target with `protobuf`. See [protobuf/python/python_protobuf_subsystem.py](🔗).
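The lookup behind this kind of injection can be sketched in plain Python. This is a simplified model, not Pants code: `RequirementTarget`, `project_name`, and `find_runtime_library` are hypothetical names, and a real rule would query the engine for targets and return `InjectedDependencies` rather than a list of strings.

```python
import re
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class RequirementTarget:
    """Hypothetical stand-in for a `python_requirement` target."""

    address: str
    requirements: Tuple[str, ...]


def project_name(requirement: str) -> str:
    """Extract the project name from a requirement string and normalize it."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    # Normalize the way PEP 503 does: collapse runs of -, _ and . then lowercase.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def find_runtime_library(
    targets: Sequence[RequirementTarget], library: str
) -> List[str]:
    """Return the addresses of all targets that provide `library`."""
    return [
        target.address
        for target in targets
        if any(project_name(req) == library for req in target.requirements)
    ]


all_targets = [
    RequirementTarget("3rdparty/python:protobuf", ("Protobuf>=3.19",)),
    RequirementTarget("3rdparty/python:types", ("types-protobuf==3.19.0",)),
    RequirementTarget("3rdparty/python:requests", ("requests==2.27.1",)),
]
injected = find_runtime_library(all_targets, "protobuf")
print(injected)
```

Matching on the normalized project name, rather than on a substring, keeps unrelated requirements such as `types-protobuf` from being injected by accident.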

## 2. Install your code generator

There are several ways for Pants to install your tool. See [Installing tools](🔗). This example will use `ExternalTool` because there is already a pre-compiled binary for Protoc.

## 3. Create a `GenerateSourcesRequest`

`GenerateSourcesRequest` tells Pants the `input` and the `output` of your code generator, such as going from `ProtobufSourceField -> PythonSourceField`. Pants will use this to determine when to use your code generation implementation.

Subclass `GenerateSourcesRequest` and set its `input` and `output` class properties.

The `input` should be the `SourcesField` class for your protocol target from Step 1.

The `output` should typically be the `SourcesField` class corresponding to the "language" you're generating for, such as `JavaSourceField` or `PythonSourceField`. The `output` type will understand subclasses of what you specify, so, generally, you should specify `PythonSourceField` instead of something more specific like `PythonTestSourceField`.

Note that your rule will not actually return an instance of the `output` type, e.g. `PythonSourceField`. Codegen rules only return a `Snapshot`, rather than a whole `SourcesField`. The `output` field is only used as a signal of intent.

Finally, register your new `GenerateSourcesRequest` with a [`UnionRule`](🔗).

## 4. Create a rule for your codegen logic

Your rule should take as a parameter the `GenerateSourcesRequest` from Step 3 and the `Subsystem` (or `ExternalTool`) from Step 2. It should return `GeneratedSources`.

The `GenerateSourcesRequest` parameter will have two fields: `protocol_sources: Snapshot` and `protocol_target: Target`. Often, you will want to include `protocol_sources` in the `input_digest` to the `Process` you use to run the generator. You can use `protocol_target` to look up more information about the input target, such as finding its dependencies.
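As a rough illustration of what such a `Process` would run, the sketch below builds a Protoc command line from the paths in `protocol_sources`. `build_protoc_argv` is a hypothetical helper and the paths are made up for the example. `--python_out` is Protoc's flag for the Python output directory.

```python
from typing import Sequence, Tuple


def build_protoc_argv(
    protoc_exe: str, proto_files: Sequence[str], output_dir: str
) -> Tuple[str, ...]:
    """Return the command line for generating Python code with Protoc.

    All paths are relative to the sandbox that the input digest is
    materialized into, so `proto_files` must be present in that digest.
    """
    if not proto_files:
        raise ValueError("expected at least one .proto file")
    return (protoc_exe, f"--python_out={output_dir}", *proto_files)


# Hypothetical paths: the downloaded binary plus two protocol sources.
argv = build_protoc_argv("./bin/protoc", ["f.proto", "dir/g.proto"], "_generated")
print(argv)
```

Every path in the command line must exist in the sandbox, which is why the tool's digest and the source digests have to be merged before the process runs. Protoc also expects the output directory to exist already.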

The rule should return `GeneratedSources`, which takes a [`Snapshot`](🔗) as its only argument. This should be a Snapshot of the generated files for the input target.

If you used `ExternalTool` in Step 2, you will use `Get(DownloadedExternalTool, ExternalToolRequest)` to install the tool. Be sure to merge the resulting digest with the `protocol_sources` and any other relevant input digests via `Get(Digest, MergeDigests)`.

For many code generators, you will need to get the input target's direct or transitive dependencies and include their sources in the `input_digest`. See [Rules and the Target API](🔗).

You will likely need to add logic for handling [source roots](🔗). For example, the code generator may not understand source roots, so you may need to [strip source roots](🔗) before putting the sources in the `input_digest`. You will usually also want to restore a source root after generation, because most Pants code assumes that one is present. For example, when the original source root is restored, `src/protobuf/f.proto` generates `src/protobuf/f_pb2.py`.

See [`protobuf/python/rules.py`](🔗) for a more complex example that allows the user to specify what source root to use through a field on the `protobuf_library`.
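The path arithmetic behind stripping and restoring a source root can be sketched with `pathlib` alone. This is only a model of the transformation: the three helper functions are hypothetical, and a real rule would apply the same prefix changes to digests rather than to strings.

```python
from pathlib import PurePosixPath


def strip_source_root(path: str, source_root: str) -> str:
    """Return `path` relative to `source_root` (raises ValueError if outside it)."""
    return str(PurePosixPath(path).relative_to(source_root))


def generated_name(stripped_proto: str) -> str:
    """Map `f.proto` to `f_pb2.py`, the name Protoc gives its Python output."""
    proto = PurePosixPath(stripped_proto)
    return str(proto.with_name(proto.stem + "_pb2.py"))


def restore_source_root(stripped_path: str, source_root: str) -> str:
    """Put a stripped path back under `source_root`."""
    return str(PurePosixPath(source_root) / stripped_path)


source_root = "src/protobuf"
stripped = strip_source_root("src/protobuf/f.proto", source_root)
generated = generated_name(stripped)
restored = restore_source_root(generated, source_root)
print(stripped, generated, restored)
```

The generator only ever sees the stripped path, and the source root is added back afterwards so that downstream Pants code finds the file where it expects it.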

Finally, update your plugin's `register.py` to activate this file's rules.

Tip: use `export-codegen` to test it works

Run `./pants export-codegen path/to/file.ext` to ensure Pants is correctly generating the file. This will write the generated file(s) under the `dist/` directory, using the same path that will be used during Pants runs.

## 5. Audit call sites to ensure they've enabled codegen

Call sites must opt into using codegen, and they must also specify what types of sources they're expecting. See [Rules and the Target API](🔗) about `SourcesField`.

For example, if you added a code generator that goes from `ProtobufSourceField -> JavaSourceField`, then Pants's Python backend would not use your new implementation because it ignores `JavaSourceField`.
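The opt-in behavior can be modeled with a few stand-in classes. Everything below is a simplified sketch rather than Pants code: the field classes are empty placeholders that only borrow the real names, `Generator` and `pick_generator` are hypothetical, and the exact matching check used by the engine may differ in detail.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple, Type


# Stand-ins for Pants field classes; these are NOT the real Pants types.
class SourcesField:
    pass


class ProtobufSourceField(SourcesField):
    pass


class PythonSourceField(SourcesField):
    pass


class JavaSourceField(SourcesField):
    pass


@dataclass(frozen=True)
class Generator:
    """Stand-in for a registered `GenerateSourcesRequest` subclass."""

    name: str
    input: Type[SourcesField]
    output: Type[SourcesField]


def pick_generator(
    field: SourcesField,
    generators: Sequence[Generator],
    for_sources_types: Tuple[Type[SourcesField], ...],
    enable_codegen: bool,
) -> Optional[Generator]:
    """Return the first generator the call site is willing to use, if any."""
    if not enable_codegen:
        return None  # The call site never opted into codegen.
    for generator in generators:
        accepts_input = isinstance(field, generator.input)
        wanted_output = issubclass(generator.output, for_sources_types)
        if accepts_input and wanted_output:
            return generator
    return None


proto_to_java = Generator("proto->java", ProtobufSourceField, JavaSourceField)
proto_to_python = Generator("proto->python", ProtobufSourceField, PythonSourceField)
proto_field = ProtobufSourceField()

# A Python call site asks only for Python sources, so the Java generator is skipped.
java_only = pick_generator(proto_field, [proto_to_java], (PythonSourceField,), True)
both = pick_generator(
    proto_field, [proto_to_java, proto_to_python], (PythonSourceField,), True
)
opted_out = pick_generator(proto_field, [proto_to_python], (PythonSourceField,), False)
print(java_only, both.name if both else None, opted_out)
```

The last case is the one to look for when auditing: a matching generator exists, but the call site never enabled codegen, so nothing is generated.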

You should manually verify that every call site you expect to use your new codegen implementation actually does. Create a new protocol target, add it to the `dependencies` field of another target, and then run goals like `./pants package` and `./pants test` to make sure that the generated file works correctly.

## 6. Add tests (optional)

Refer to [Testing rules](🔗).