I usually create a requirements.txt when I start a new Python project. I will do something like this:
    # Create a new project
    mkdir new-project
    cd new-project

    # Create requirements.txt
    tee -a requirements.txt <<EOF
    pandas
    pyarrow
    EOF

    # Create a virtual environment and install dependencies
    python -m venv .venv
    source .venv/bin/activate
    python -m pip install --upgrade pip wheel setuptools
    python -m pip install -r requirements.txt

This lets me get a new project up and running quickly. I have documented which packages I am using, which is good, but I have not documented which version of the package I am using. Since it is a new project, I want to use the latest version of every package. I could search PyPI for each package and find the latest version, but that takes a lot of time. Instead, I only document the package name in my initial requirements.txt. When I am ready to pin the versions, I use this bash script to check the version I have installed for each dependency listed in requirements.txt:
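    python -m pip freeze | grep -E "^($(sed ':a;N;$!ba;s/\n/==|/g' requirements.txt)==)"

The sed expression (GNU sed) joins the package names into a single pattern such as pandas==|pyarrow. The ^ anchor and the trailing == keep grep from matching packages whose names merely contain one of mine, such as geopandas. The result will be something like this:

    pandas==2.1.4
    pyarrow==14.0.2

I can then paste this output into my requirements.txt file.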
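The same lookup can also be sketched in Python with the standard library's importlib.metadata, which matches package names exactly instead of relying on a regular expression. This is a minimal sketch, not a replacement for the one-liner: the function names pin_requirements and pin_file are my own, and it assumes requirements.txt contains only bare package names (no version specifiers, extras, or options).

```python
from importlib.metadata import PackageNotFoundError, version


def pin_requirements(names, get_version=version):
    """Return 'name==version' lines for the given top-level package names.

    `get_version` defaults to importlib.metadata.version; it is a parameter
    only so the function can be exercised without installing anything.
    """
    pinned = []
    for raw in names:
        name = raw.strip()
        # Skip blank lines and comment lines.
        if not name or name.startswith("#"):
            continue
        try:
            pinned.append(f"{name}=={get_version(name)}")
        except PackageNotFoundError:
            # Assumption: a package that is not installed is left unpinned.
            pinned.append(name)
    return pinned


def pin_file(path="requirements.txt"):
    """Read package names from `path` and return the pinned lines as text."""
    with open(path) as f:
        return "\n".join(pin_requirements(f))
```

Running print(pin_file()) inside the activated virtual environment should print one name==version line per top-level dependency, ready to paste back into requirements.txt.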
I could have also used pip freeze on its own to get the versions of the packages I have installed. The downside is that it lists the installed version of every package in the environment, sub-dependencies included. Usually, I only want to pin the "top-level" dependencies and not the sub-dependencies. For example, if I have pandas installed, I do not want to pin the version of numpy that pandas depends on. I only want to pin the version of pandas itself. I find that this makes it easier to update packages for my project in the future.
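To see which sub-dependencies a plain pip freeze would pull in for a given top-level package, the declared requirements can be read from the installed package's metadata. This is a rough sketch: requirement_name and direct_dependencies are hypothetical helper names, the name parsing is a simple regular expression rather than a full requirement parser, and the "extra ==" check is only an approximate way to skip optional dependencies.

```python
import re
from importlib.metadata import requires


def requirement_name(spec):
    """Extract the bare package name from a requirement string.

    For example, "numpy>=1.22.4; python_version < '3.11'" gives "numpy".
    """
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", spec.strip())
    return match.group(0) if match else ""


def direct_dependencies(package):
    """Return the sorted names of the non-optional requirements of `package`.

    Raises importlib.metadata.PackageNotFoundError if it is not installed.
    """
    specs = requires(package) or []  # requires() returns None when there are none
    # Requirements that apply only to an optional extra carry an "extra ==" marker.
    return sorted({requirement_name(s) for s in specs if "extra ==" not in s})
```

With pandas installed, direct_dependencies("pandas") should include numpy, which is exactly the kind of package I leave out of requirements.txt.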