I have known about pre-commit for some time, but until recently, I never tried it. This week, I started to try out pre-commit, and I am finding it very useful. Here are a few pre-commits I have found useful for data science workflows.
Installation
# Install pre-commit with `uv`uv tool install pre-commit
# Set up pre-commitpre-commit install
Helpful pre-commit hooks
Copy the following to .pre-commit-config.yaml
in the root of your project. To run everything, run:
pre-commit run --all-files
# See https://pre-commit.com for more information# See https://pre-commit.com/hooks.html for more hooksrepos: # -------------------------------------------------------------------------- # General hooks # -------------------------------------------------------------------------- - repo: https://github.com/pre-commit/pre-commit-hooks rev: v3.2.0 hooks: - id: trailing-whitespace - id: end-of-file-fixer - id: check-yaml - id: check-added-large-files
# -------------------------------------------------------------------------- # Python # -------------------------------------------------------------------------- - repo: https://github.com/astral-sh/ruff-pre-commit rev: v0.11.1 hooks: # Run the ruff linter and fix autofixable issues. `I` will sort imports. # `F401` will remove unused imports. - id: ruff args: - --fix - --select=I,F401 # Run the ruff linter. - id: ruff # Run the ruff formatter. - id: ruff-format
- repo: https://github.com/astral-sh/uv-pre-commit rev: 0.6.8 hooks: # Keep your requirements.txt file up to date with your uv.lock file on # every commit. - id: uv-export
# -------------------------------------------------------------------------- # Jupyter # -------------------------------------------------------------------------- - repo: local hooks: # Always remove output from Jupyter Notebooks before making a commit. - id: clean-notebooks name: Remove output from Jupyter Notebooks entry: uvx --from nbconvert jupyter-nbconvert --clear-output language: system files: \.ipynb$ pass_filenames: true
# -------------------------------------------------------------------------- # R # -------------------------------------------------------------------------- - repo: local hooks: # https://github.com/posit-dev/air # Air is a formatter for R code. There is no official pre-commit hook # for air, so for this to work you must have air installed. In the # future air may have a pre-commit hook: # https://github.com/posit-dev/air/issues/269 - id: air name: Format R code with air entry: air format . language: system files: \.R$