There is a range of approaches to versioning Jupyter notebooks with git (e.g. here, here, and here) that remove any output before the notebooks are added to the repository. They typically rely on adding a script to your executable path that a git filter can invoke to strip the output. Fortunately, Jupyter’s own nbconvert can achieve the same task, which

  • avoids adding scripts to your executable path,
  • ensures that removing the output is always compatible with the Jupyter and Python versions you are using.

Here’s how to set it up: First, open your ~/.gitconfig and add the following lines.

[filter "jupyter_clear_output"]
    clean = "jupyter nbconvert --stdin --stdout --log-level=ERROR \
             --to notebook --ClearOutputPreprocessor.enabled=True"
    smudge = cat
    required = true

These lines define a git filter called jupyter_clear_output. Git runs its clean command when changes are staged and its smudge command when files are checked out. The smudge command is trivial: it just reproduces the input. The clean command invokes nbconvert, reading from stdin, writing to stdout, converting to the notebook file format, and clearing all output. Setting required = true makes git treat a failing filter as an error instead of silently staging the unfiltered notebook.
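To make the effect of the clean step concrete, here is a minimal Python sketch of what clearing output means at the level of the notebook JSON. It is an illustration only, not nbconvert's implementation: it assumes the nbformat 4 layout (a top-level "cells" list), and the names clear_outputs and run_filter are made up for this example.

```python
import json


def clear_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook (nbformat 4 JSON) without code-cell output."""
    # Round-trip through JSON as a cheap deep copy, so the input stays untouched.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop printed text, plots, tracebacks
            cell["execution_count"] = None  # drop the In[n] counter
    return cleaned


def run_filter(src, dst) -> None:
    """Mimic the clean filter's contract: notebook in on src, result on dst."""
    json.dump(clear_outputs(json.load(src)), dst, indent=1)
    dst.write("\n")
```

Calling run_filter(sys.stdin, sys.stdout) would follow the same stdin-to-stdout contract that git expects from a clean command. In practice the nbconvert command above is preferable, because it tracks the notebook format of your installed Jupyter version.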

The final step is to register the .ipynb extension with the jupyter_clear_output filter. If you would like to enable the filter on a per-repository basis, add a .gitattributes file with the following content to your repository.

*.ipynb    filter=jupyter_clear_output

If you want to enable the filter globally, add the line above to ~/.gitattributes and let git know about the attributes file by adding the following line to your ~/.gitconfig.

[core]
    attributesfile = ~/.gitattributes
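
Whichever route you choose, it can be worth checking that git actually associates the filter with your notebooks. The sketch below wraps git check-attr, which reports attributes as "path: attribute: value" lines. The helper names and the notebook name analysis.ipynb are hypothetical, and the code assumes git is on your path and that you run it inside a repository.

```python
import subprocess


def parse_check_attr(line: str) -> tuple:
    """Split one line of `git check-attr` output into (path, attribute, value)."""
    # Split from the right so that a path containing ": " stays in one piece.
    path, attribute, value = line.rstrip("\n").rsplit(": ", 2)
    return path, attribute, value


def filter_for(path: str, repo: str = ".") -> str:
    """Ask git which filter it would apply to `path` in the repository at `repo`."""
    result = subprocess.run(
        ["git", "check-attr", "filter", "--", path],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return parse_check_attr(result.stdout)[2]
```

With the attributes in place, filter_for("analysis.ipynb") should return "jupyter_clear_output"; a result of "unspecified" suggests that git did not find a matching line in any attributes file.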