Environments

When installing and using Python and its many packages, it is recommended to use an environment manager to avoid conflicting package versions when working on different projects. CAF recommends using conda (specifically conda-forge) for managing Python environments.

Note

Environments (within the context of Python and this document) are separate installations of Python which do not interact with one another.

Conda allows you to create separate environments, each containing its own files, packages and package dependencies. The contents of each environment do not interact with each other.[1] CAF recommends creating different environments for different projects and, in some cases, for different tasks, depending on the tools and packages being used.

Important

When working with Python you should always define the packages, and their minimum versions, required for your script or process in one or more requirements files. This is especially important when sharing the process with others.

Create an Environment

Important

This section assumes that conda has been installed using miniforge; see Installation & Set-Up.

Our recommended method for creating environments is using the conda create command with one, or more, “requirements.txt” files.

conda create -n env_name --file requirements.txt --file requirements_dev.txt

The above command will create an environment called “env_name” and install all packages found in “requirements.txt” and “requirements_dev.txt”. See Requirements File for more information on those files.

Note

Conda environments need to be activated with conda activate env_name before they can be used. Many code editors (such as VS Code or PyCharm) will do this automatically.

In addition to the method above, the create command can be used to create an environment without installing any packages:

conda create -n env_name

Packages can also be defined directly in the command itself:

conda create -n env_name python numpy pandas

This will create a new environment called “env_name” and install the most up-to-date versions of python, numpy and pandas. Version restrictions can be defined using the same syntax as in the Requirements File, e.g. “pandas>=2.1”; quote such a specification on the command line so that the shell does not treat > as output redirection. More information on the conda create command is available in the create command documentation.

Working with Conda Environments

This section outlines a few useful commands available within conda. More information is available in the Conda Cheatsheet (conda documentation).

  • conda create: create an environment, see Create an Environment.

  • conda activate env_name: activate the conda environment with the name “env_name”.

  • conda info --envs: show a list of all conda environments on this machine, see conda info documentation.

  • conda install package: install “package” in the currently activated environment, see conda install.

  • conda uninstall package: uninstall “package” from the currently activated environment.

  • conda list: list all packages in the activated environment, with their versions and channels, see conda list.

  • conda remove -n env_name --all: uninstall all packages from “env_name” and completely remove the environment, see conda remove.

Warning

Any commands which act on the currently activated environment will act on the base environment if no other is activated.
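A script that installs or removes packages can guard against this by checking which environment is active first. The sketch below is a minimal illustration, assuming that conda's activation has set the CONDA_DEFAULT_ENV environment variable; the function names are hypothetical and are not part of conda.

```python
import os
from typing import Mapping, Optional


def active_conda_env(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the active conda environment name, or None if none is active.

    Assumption: `conda activate` has set the CONDA_DEFAULT_ENV variable.
    """
    return environ.get("CONDA_DEFAULT_ENV") or None


def targets_base(environ: Mapping[str, str] = os.environ) -> bool:
    """True when commands acting on the active environment would hit base."""
    return active_conda_env(environ) in (None, "base")


if targets_base():
    print("No project environment is active; run 'conda activate env_name' first.")
else:
    print(f"Active environment: {active_conda_env()}")
```

Passing the environment mapping as an argument keeps the check easy to test without changing the real shell environment.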

Requirements File

The recommended approach to defining a repository’s requirements is within one or more requirements.txt files.[2] These are simply a list of all the packages required by the project and their relevant version restrictions, with each requirement on a separate line. An example of a requirements.txt file is shown below.

See also

Frequently Asked Questions for an explanation of why CAF recommends requirements.txt files.

In CAF we recommend splitting dependencies into multiple requirements files:

  • requirements.txt: main requirements file in the root of the repository which contains all the dependencies required to use and run the package.

  • requirements_dev.txt: list of all the dependencies required for developing the package e.g. testing, linting and building.

  • docs/requirements.txt: dependencies required for building the documentation e.g. sphinx.

  • examples/requirements.txt: additional dependencies required for running any example scripts which are provided in the repository (and documentation).

  • requirements_{option}.txt: optional dependencies which are required for certain specific features of the package; “{option}” should be replaced with a sensible name for the optional feature. These should be used sparingly, as in most cases features of the package shouldn’t be hidden behind optional dependencies.

Note

Many Python packages (including all CAF packages) use the Semantic Versioning 2.0.0 (SemVer) scheme for writing version numbers. In SemVer the version is split into three main parts MAJOR.MINOR.PATCH, which are incremented to reflect the type of changes in the new version.

  • Increment MAJOR version when you make incompatible API changes, i.e. code written to use version 1.7 of a package might not work with 2.0.

  • Increment MINOR version when you add functionality in a backward compatible manner, i.e. any code written to work with 1.6 should still work on 1.7.

  • Increment PATCH version when you make backward compatible bug fixes.

SemVer provides more functionality for pre-release and build metadata which is described in detail in the Semantic Versioning Specification.
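The compatibility rules above can be sketched in a few lines of Python. This is a minimal illustration, assuming plain MAJOR.MINOR.PATCH strings with no pre-release or build metadata; the names Version, parse_version and is_backward_compatible are hypothetical, and the sketch ignores the SemVer caveat that 0.x versions may change at any time.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse 'MAJOR', 'MAJOR.MINOR' or 'MAJOR.MINOR.PATCH'; missing parts become 0."""
    parts = [int(part) for part in text.split(".")]
    if len(parts) > 3:
        raise ValueError(f"expected at most three parts, got {text!r}")
    parts += [0] * (3 - len(parts))
    return Version(*parts)


def is_backward_compatible(written_for: str, installed: str) -> bool:
    """True if code written for `written_for` should still work on `installed`.

    Under SemVer this needs the same MAJOR version and an installed
    version that is not older than the one the code was written for.
    """
    old, new = parse_version(written_for), parse_version(installed)
    return new.major == old.major and new >= old


print(is_backward_compatible("1.6", "1.7"))  # True: MINOR bump only
print(is_backward_compatible("1.7", "2.0"))  # False: MAJOR bump
```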

Example of a requirements.txt file

pandas>=2.0
numpy>=1.4.2,<2
matplotlib==3.4
caf.space==1
caf.toolkit[hdf]

The example above shows some of the possible version constraints; the meaning of each line is outlined below:

  • pandas>=2.0: pandas should be greater than, or equal to, 2.0. 2.0, 2.0.1 and 2.1 are all acceptable versions.

  • numpy>=1.4.2,<2: numpy should be greater than, or equal to, 1.4.2 and less than 2. 1.4.2, 1.5, 1.6 are acceptable but 1.2, 1.4.1 and 2.1 are not.

  • matplotlib==3.4: matplotlib should be exactly version 3.4 (3.4 and 3.4.0 are the same version), because == is an exact match in both pip and conda. 3.4.3, 3.3 and 3.5 are not acceptable; to accept any 3.4 patch release use a range such as matplotlib>=3.4,<3.5.

  • caf.space==1: caf.space should be exactly version 1 (the same version as 1.0.0). 1.5, 0.5 and 2.1 are not acceptable; to accept any 1.x release use a range such as caf.space>=1,<2.

  • caf.toolkit[hdf]: any version of caf.toolkit is acceptable but the optional dependencies labelled as ‘hdf’ should also be included.
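To make the structure of a requirement line concrete, the sketch below splits each line of the example file into a package name, optional extras and a list of version constraints. It is a simplified illustration only, assuming lines of the form shown in the example; the Requirement tuple and parse_requirement function are hypothetical, and neither pip nor conda parses requirements this way. For real use, the Requirement class in the packaging library implements the full pip syntax.

```python
import re
from typing import NamedTuple

REQUIREMENT = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)"   # package name, e.g. caf.toolkit
    r"(?:\[(?P<extras>[^\]]*)\])?"  # optional extras, e.g. [hdf]
    r"(?P<constraints>.*)$"         # version constraints, e.g. >=1.4.2,<2
)
CONSTRAINT = re.compile(r"^(==|>=|<=|!=|~=|>|<)\s*(\S+)$")


class Requirement(NamedTuple):
    name: str
    extras: tuple
    constraints: tuple  # pairs of (operator, version)


def parse_requirement(line: str) -> Requirement:
    """Split one requirements.txt line into name, extras and constraints."""
    match = REQUIREMENT.match(line.strip())
    if match is None:
        raise ValueError(f"cannot parse requirement: {line!r}")
    extras = tuple(
        extra.strip() for extra in (match["extras"] or "").split(",") if extra.strip()
    )
    constraints = []
    for part in match["constraints"].split(","):
        part = part.strip()
        if not part:
            continue  # no constraint given: any version is acceptable
        found = CONSTRAINT.match(part)
        if found is None:
            raise ValueError(f"cannot parse constraint: {part!r}")
        constraints.append((found[1], found[2]))
    return Requirement(match["name"], extras, tuple(constraints))


for line in ["pandas>=2.0", "numpy>=1.4.2,<2", "caf.toolkit[hdf]"]:
    print(parse_requirement(line))
```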

Attention

The example requirements file doesn’t necessarily contain a compatible set of requirements.

Conda Limitations

There are two cases where conda cannot handle the installation of a package; in these cases the installation will need to be done with pip install.

Tip

pip install can be run inside a conda environment to install packages there, but it should only be used when conda install won’t work.

  1. If a package, or version, isn’t available on conda-forge, but is available on PyPI then you will need to use pip install.

  2. If a package, or version, isn’t available on conda-forge or PyPI then you may need to install it directly from GitHub using pip install git+github_url, see Pip Install for details.

When packages cannot be installed with conda it is often useful to define a separate requirements file (requirements_pip.txt), which lists all these packages and versions. This should be used sparingly as installing packages from multiple sources is not encouraged.
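When environment creation is scripted, the two sources can be kept apart by running conda first and pip last. The sketch below only builds the two command lines; it assumes the file names used in this document, and install_commands is a hypothetical helper rather than part of conda or pip. It uses conda run -n to invoke pip inside the new environment without activating it.

```python
from typing import List, Optional, Sequence


def install_commands(
    env_name: str,
    conda_files: Sequence[str],
    pip_file: Optional[str] = None,
) -> List[List[str]]:
    """Build the commands to create an environment, conda packages first."""
    create = ["conda", "create", "-n", env_name]
    for path in conda_files:
        create += ["--file", path]
    commands = [create]
    if pip_file is not None:
        # Run pip with the new environment's own Python interpreter.
        commands.append(
            ["conda", "run", "-n", env_name, "python", "-m", "pip", "install", "-r", pip_file]
        )
    return commands


for command in install_commands(
    "env_name", ["requirements.txt", "requirements_dev.txt"], "requirements_pip.txt"
):
    print(" ".join(command))
```

Each command is a list of arguments, so it could be passed to subprocess.run(command, check=True) without any shell quoting.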

Pip Install

Tip

Pip can install from a requirements file using pip install -r requirements.txt.

Pip is the package installer for Python and, although it can’t manage environments, it does provide some functionality not available in conda. The four main pieces of extra functionality are installing from a git repository (e.g. GitHub), installing from a zipfile, editable installs and selecting optional dependencies. The pip install documentation provides an examples section which shows the different methods for installing packages with pip.

Attention

Using pip install with any of the methods in this section only works for packages that have their build configuration set up correctly. Within CAF the build configuration is defined in pyproject.toml.

Zipfile / Local Folder

Pip can install from a local folder or from a zipfile (either local or online) by providing the path or URL to the file or folder.

pip install https://github.com/Transport-for-the-North/caf.toolkit/archive/refs/heads/main.zip

The command above will install caf.toolkit from a zipped version of the main branch on the GitHub repository; see Git Repository for a better method for installing from GitHub.

pip install Documents/my_package will install whatever package is found in the “my_package” folder.

Attention

Installing from a URL only works if the URL is public; see Git Repository for a way to install from private GitHub repositories.

Git Repository

When installing packages from a git (or other VCS) repository, the recommended method is to use pip install with git directly; this is done by adding the git+ prefix before the repository URL.

pip install git+https://github.com/Transport-for-the-North/caf.toolkit.git

The command above will install caf.toolkit from the default branch on GitHub. A different branch can be selected by appending @ and the branch name:

pip install git+https://github.com/Transport-for-the-North/caf.toolkit.git@custom_branch

Note

Installing packages directly from the git repository requires git to be installed on your machine.

Editable Installs

A package can be installed with pip in a way that allows it to be edited without reinstallation. This is primarily used for editing a dependency of a package alongside the main package itself.

Editable installs are done using the -e flag for pip install and can be used when installing from a local folder or from a git repository.

pip install -e Documents/my_package

Above will install “my_package” so that any changes to it are picked up immediately.

Attention

Only use editable installs for packages you plan to edit. For other packages that aren’t available on conda-forge or PyPI, install from git instead.

Optional Dependencies

Some packages define some dependencies as optional when they are only needed for a specific feature that many users may not need. This keeps the default set of dependencies smaller for most users.

Pip can install optional dependencies with any of its installation methods by including them after the package name within “[]”. The code below shows examples of installing optional dependencies.

# Quote each requirement so the shell passes it to pip as a single argument.
pip install "caf.toolkit[option_a]"
pip install "caf.toolkit[option_a] @ git+https://github.com/Transport-for-the-North/caf.toolkit.git"
pip install "./Documents/my_package[option]"
pip install "caf.toolkit[option_a,option_b]"
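When such requirements are generated by a script, for example to write a requirements_pip.txt file, the “name[extras] @ URL” form can be composed programmatically. The sketch below is a minimal illustration; git_requirement is a hypothetical helper, and the branch and option names are the placeholders already used in this document.

```python
from typing import Optional, Sequence


def git_requirement(
    name: str,
    repo_url: str,
    ref: Optional[str] = None,
    extras: Sequence[str] = (),
) -> str:
    """Compose a 'name[extras] @ git+URL@ref' requirement string for pip."""
    extras_part = f"[{','.join(extras)}]" if extras else ""
    ref_part = f"@{ref}" if ref else ""  # branch, tag or commit to install
    return f"{name}{extras_part} @ git+{repo_url}{ref_part}"


requirement = git_requirement(
    "caf.toolkit",
    "https://github.com/Transport-for-the-North/caf.toolkit.git",
    ref="custom_branch",
    extras=["option_a"],
)
print(requirement)

# Passing the requirement as one list element avoids any shell quoting issues.
command = ["python", "-m", "pip", "install", requirement]
```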

Pyproject.toml

Pyproject.toml is a configuration file used by many Python tools and contains different sections for each tool. Within CAF packages there are two main uses of the pyproject.toml:

  • The settings for building and releasing the package; and

  • Parameters for any linters, formatters, tests or other development tools which we use.

The build parameters are required for pip install to work on the package and for releasing the package on PyPI or conda-forge. They are defined in the following two tables:

  • [build-system]: what requirements are needed to build the package.

  • [project]: this table contains the package metadata e.g. description, authors and dependencies.

Tip

The project dependencies aren’t defined explicitly within pyproject.toml; the file just includes links to any requirements.txt files.

In addition to the build parameters, the configuration file also contains parameters for any development tools we use. These are all defined within the [tool] table in their own sub-tables, e.g. [tool.pylint] and [tool.pytest]; see the relevant tool’s documentation for the specific parameters available.