An Improved Python Pip Workflow
Pip is quite a useful tool. Bundled with Python since 3.4, it’s the default way of adding and removing third-party dependencies in projects. Simply activate your virtualenv and run pip install. Some time later, run pip freeze > requirements.txt to create a requirements file that pins the versions of the packages you installed, so when somebody else installs the dependencies, they get exactly what you used and not some potentially incompatible version. They can do so by running pip install -r requirements.txt.
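As a minimal sketch of that basic workflow (assuming a Unix shell and a python3 on the PATH; the commented-out install is just an example, since it needs network access):

```shell
# Create and activate a virtualenv for the project.
python3 -m venv .venv
. .venv/bin/activate

# pip install Flask             # add a third-party dependency (needs network)
pip freeze > requirements.txt   # pin whatever is currently installed

# A teammate later reproduces the environment with:
# pip install -r requirements.txt
```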
The problem
The process previously described, while simple, assumes that we have a single set of requirements. This may be acceptable if you don’t have a problem with installing your testing dependencies in your production environment, if you don’t have dependencies specific to an environment, or if you’re just writing a simple program or script.
If you DO care, however, this becomes a real PITA really fast.
The solution: pip mastery
Pip is an easy-to-use tool. So easy that you don’t even need to read the manual.
In the pip install documentation page, in the Requirements File Format section, there’s information about two little things we can add to our requirements.txt files:
- -r: imports all the requirements from a different requirements file, similar to inheritance.
- -c: refers to another file which contains constraints, like package version constraints.
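The key difference: -r pulls packages in, while -c only pins versions and never installs anything by itself. To illustrate (file names here are just an example, not the ones used later in this post):

```text
# dev.txt — an illustrative requirements file
-r base.txt         # also install everything listed in base.txt
-c constraints.txt  # install nothing from constraints.txt, but if a package
                    # named there does get installed, honor its version pin
pytest
```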
After integrating these two things, our workflow looks like this:
Installing production dependencies
1. Create a lock.txt file if it doesn’t exist already.
2. Create a file production.txt and list the packages that you want to install, one package per line. Refer to the lock.txt file as the constraint file:

-c lock.txt
Flask
gunicorn
SQLAlchemy

3. Run pip install -r production.txt. Wait until the packages are installed.
4. Run pip freeze > lock.txt.
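The file-setup portion of those steps can be scripted. A sketch (the install and freeze commands are commented out because they need network access):

```shell
touch lock.txt   # step 1: make sure the constraint file exists

# step 2: production.txt with the constraint line plus the packages
cat > production.txt <<'EOF'
-c lock.txt
Flask
gunicorn
SQLAlchemy
EOF

# steps 3 and 4:
# pip install -r production.txt
# pip freeze > lock.txt
```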
That’s it. If you need to install further production dependencies, then simply add them to the production.txt file and repeat the process.
Installing development dependencies
1. Create the lock.txt file if for some reason it doesn’t exist already.
2. Create a file development.txt and list the development packages you want to install, one package per line. Refer to the production.txt file as the required file, and to lock.txt as the constraints file:

-c lock.txt
-r production.txt
flake8
pytest

3. Run pip install -r development.txt. Wait until the packages are installed.
4. Run pip freeze > lock.txt.
As you see, the process is almost identical to installing production dependencies. To install further development dependencies, add them to development.txt and repeat the process.
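Since the install-then-freeze pair is identical for both files, it can be wrapped in a small shell function. A sketch — pip_sync_lock is a made-up name for this post, not a real pip command:

```shell
# Install from the given requirements file, then refresh lock.txt.
pip_sync_lock() {
    req="${1:?usage: pip_sync_lock <requirements-file>}"
    if [ ! -f "$req" ]; then
        echo "no such file: $req" >&2
        return 1
    fi
    pip install -r "$req" && pip freeze > lock.txt
}

# Usage:
# pip_sync_lock production.txt
# pip_sync_lock development.txt
```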
Uninstalling a package
Well, ‘ere comes the ugly!
- Remove the packages from production.txt, development.txt, or whatever requirements file you’re using.
- Run pip freeze | xargs pip uninstall -y to uninstall all installed packages.
- Run pip install -r development.txt (or production.txt).
- Run pip freeze > lock.txt.
By doing things this way, you ensure that no dangling dependencies are left installed, and that further clean installs will work correctly after the package removal.
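The first step can of course be done in an editor, but here’s a sketch of removing a package (gunicorn, as an example) from production.txt; grep -v is used instead of sed -i because the latter’s flag isn’t portable across GNU and BSD sed:

```shell
# Example production.txt to edit.
cat > production.txt <<'EOF'
-c lock.txt
Flask
gunicorn
SQLAlchemy
EOF

# Drop the gunicorn line, keeping everything else.
grep -v '^gunicorn$' production.txt > production.txt.tmp
mv production.txt.tmp production.txt

# Then the remaining steps (network required):
# pip freeze | xargs pip uninstall -y
# pip install -r production.txt
# pip freeze > lock.txt
```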
The other solution: Pipenv
There’s another solution that has been gaining mindshare for quite a while now, even being featured by the PyPA itself. That is, of course, Pipenv.
So why don’t I recommend using Pipenv?
- Pipenv is VERY slow: I don’t know why this is the case, but every time you install a new Python package, the default behavior is to update the Pipfile.lock file. And this sometimes takes tens of seconds. Unacceptable.
- Pipenv updates already installed packages when installing a new package: Maybe there’s yet another command or flag to prevent this, but the fact is that I see no reason why that should be the default behavior.
- Pipenv requires a separate package installation: This is more of a nitpick. But pip comes preinstalled, so it works on any computer with Python, and you don’t have to add an additional pip install pipenv layer in Dockerfiles.
- Pipenv install without arguments does something unexpected: When you run npm install, it installs from the lockfile if available. If you run yarn install, it uses the lockfile. If you run pipenv install, it silently ignores the lockfile and installs from the Pipfile. You apparently have to run pipenv sync to install from the lockfile… but honestly, I don’t know. It’s confusing.
- Pipenv assumes that you always want a virtualenv: Which is not what you want inside a Docker container. You have to add the --system flag for this, which is only mentioned briefly in the advanced documentation.
- Pipenv assumes that you want your virtualenv in the home directory: If you want it inside the project directory, then you have to set some obscure environment variable that’s even more hidden than the --system flag. Earlier I wrote a blog post and made sure to include that flag just so I could check it later if I forgot about it.
In summary, Pipenv brings a ton of opinions into a project just to solve two issues: creating virtualenvs, and installing Python packages while also updating a lockfile. The former is solved by running python -m venv .venv, and the latter by following the instructions in this post.
Now, to be fair, Pipenv also does hash integrity checks, which our proposed alternative doesn’t. Pip has supported hash checking since version 8.0, though, so maybe we’re just a few tweaks away from feature parity…
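For reference, a hash-checked entry looks like this (the version and hash below are placeholders, not real values). With every entry pinned and hashed, pip install -r lock.txt --require-hashes verifies each download:

```text
# Hypothetical lock.txt excerpt. pip freeze alone does not emit hashes,
# so a tool such as pip-tools' pip-compile --generate-hashes would be
# needed to produce a file like this.
Flask==2.0.1 \
    --hash=sha256:<placeholder>
```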
Closing notes
This literal paragraph and section is here so as not to finish the post with a rant. Please ignore, but not completely.