Kiefer Co:
@kieferandco
KieferAndCo
ChairmanCo

My Favourite Humans

Kiefer's Deduplication Notebook

Notebook open in the browser

This is a Python Jupyter Notebook that I use for deduplicating folders.

Personally, I do manual backups of my old computers, phones, and other data-storing devices and that leads to a lot of files being saved multiple times. This notebook allows you to copy a directory to a new location, keeping only the oldest files while recording all file paths for archival purposes.

See and download the git repo here!

Launching the Notebook

To install Jupyter Notebook, run:

pip install notebook

To launch Jupyter Notebook, run in this repo's directory:

jupyter notebook

Using the Notebook

Once Jupyter notebook is open in your browser, navigate to the notebook and follow the relevant steps in the first cell.

In most cases, you'll want to follow the steps for Deduplicating a Directory.

Try it out with the test directory in this repo and take a look at the contents of the hash table file!

Sample Notebook Output

–Kiefer