A Guide to Retroactive Git-Crypt
Git-crypt is an incredible tool for convenient set-up-and-forget secrets in your repository. But what if you didn't consider it at your repo's conception? Or conversely, what if you have reached a point where you need to clean your repository of git-crypt?
Despite the many vocal git purists who insist on "never rewriting published git history", there are some good reasons to do so. Moreover, there is quite a bit of demand for a good approach to this conundrum.
A notable mechanism offered by a predecessor in this search is using patch files, but in my case that approach ran into encoding issues.
If you are in the same boat, and have a good enough reason to seek a way to either a) convert a cleantext repository into a state equivalent to having started it with git-crypt the whole time or b) completely erase the effects of git-crypt on repository history, read on!
If you're in a hurry, here's a TL;DR!
- Encrypt a Cleantext Repo
- Decrypt an Encrypted Repo
# Encrypt a cleantext repo
export KEY=/tmp/git-crypt.key
git-crypt init
git-crypt export-key $KEY
git-crypt lock
# Insert git-crypt filter into .gitattributes where desired in past commit(s)
# eg, using git rebase -i
# Then, back at the tip of the branch
GIT_LFS_SKIP_SMUDGE=1 \
FILTER_BRANCH_SQUELCH_WARNING=1 \
git filter-branch --tree-filter \
'git ls-files | xargs git check-attr filter | \
grep "filter: git-crypt" | cut -f 1 -d ":" | \
xargs -I {} sh -c "git-crypt clean --key-file $KEY < {} | sponge {}" && \
git clean -fqX' \
--prune-empty -- --all
# Decrypt an encrypted repo
export KEY=/tmp/git-crypt.key
git-crypt export-key $KEY
git-crypt lock
GIT_LFS_SKIP_SMUDGE=1 \
FILTER_BRANCH_SQUELCH_WARNING=1 \
git filter-branch --tree-filter \
'git ls-files | xargs git check-attr filter | \
grep "filter: git-crypt" | cut -f 1 -d ":" | \
xargs -I {} sh -c "git-crypt smudge --key-file $KEY < {} | sponge {}" && \
git clean -fqX' \
--prune-empty -- --all
GIT_LFS_SKIP_SMUDGE=1 \
FILTER_BRANCH_SQUELCH_WARNING=1 \
git filter-branch --tree-filter \
'([ -e .gitattributes ] && sed -i "/filter=git-crypt/d" .gitattributes || true) && \
([ -e .gitattributes ] && [ ! -s .gitattributes ] && rm .gitattributes || true) && \
git clean -fqX' \
--prune-empty -- --all
This process rewrites every commit in your repository, changes every commit hash, and carries all the usual dangers of manually rewriting git history.
Make backups! Septuple check before the force push!
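Since the rewrite is destructive, it can help to capture the full repository state first. The following is a minimal sketch, assuming a throwaway repository created just for the demonstration; the paths and names (repo, repo-backup.git, repo-backup.bundle) are hypothetical, and either a mirror clone or a bundle alone would likely be enough in practice.

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for the real one (hypothetical paths).
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.email "demo@example.com"
git config user.name "Demo"
echo data > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Option 1: a mirror clone keeps every ref exactly as it was.
git clone -q --mirror "$work/repo" "$work/repo-backup.git"

# Option 2: a single-file bundle containing all refs.
git bundle create "$work/repo-backup.bundle" --all 2>/dev/null
git bundle verify "$work/repo-backup.bundle" >/dev/null 2>&1

# Both backups should point at the same commit as the original.
original=$(git rev-parse HEAD)
backup=$(git --git-dir="$work/repo-backup.git" rev-parse HEAD)
echo "original: $original"
echo "backup:   $backup"
```

If the rewrite goes wrong, the mirror can be cloned again or the bundle fetched from, without relying on the reflog of the rewritten repository.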
Some Git Background: Filters
Git offers a mechanism to split the staging and working states of our files, and filters provide deterministic conversion between the two. A smudge filter converts files from staging to working (read: useful) state, while a clean filter should “clean” files back into pristine repo storage state.
Source: https://git-scm.com/book/ms/v2/Customizing-Git-Git-Attributes
In technical terms, filters can be any command into which file content can be piped. Filters are defined in git config and invoked for matching files in .gitattributes.
As a simple example from the git manual, we could set a global filter called indent and invoke it on *.c files:
git config --global filter.indent.smudge cat
git config --global filter.indent.clean indent
*.c filter=indent
Now, all our C files are piped through cat on checkout (unchanged) and auto-indented on commit.
Another common example is LFS, which uses smudge and clean to convert LFS pointer files to and from the full file content.
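To see the split between stored and working content in isolation, here is a minimal sketch using a toy filter in a throwaway repository. The filter name rot13 and the file demo.txt are made up for illustration, and tr is only a stand-in for a real transformation such as encryption.

```shell
#!/bin/sh
set -eu

# Throwaway repository; rot13 and demo.txt are illustrative names only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# clean runs when content is staged, smudge runs when it is checked out.
# ROT13 is its own inverse, so the same command serves for both.
git config filter.rot13.clean  "tr 'A-Za-z' 'N-ZA-Mn-za-m'"
git config filter.rot13.smudge "tr 'A-Za-z' 'N-ZA-Mn-za-m'"
echo '*.txt filter=rot13' > .gitattributes

echo 'hello' > demo.txt
git add .gitattributes demo.txt
git commit -q -m "Add filtered file"

# The working copy is untouched, while the stored blob went through clean.
working=$(cat demo.txt)
stored=$(git cat-file -p HEAD:demo.txt)
echo "working: $working"
echo "stored:  $stored"
```

The working file still reads hello, while the blob in the object store holds the transformed text, which is the same arrangement git-crypt relies on.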
Diving into Git-Crypt
Git-crypt uses filters for its transparent encryption magic: files are encrypted when staged and decrypted in the working directory. Let's take a look at the config of a simple repo with git-crypt set up as per instructions:
...
[filter "git-crypt"]
smudge = \"git-crypt\" smudge
clean = \"git-crypt\" clean
required = true
[diff "git-crypt"]
textconv = \"git-crypt\" diff
When unlocked, git-crypt adds the filters git-crypt smudge and git-crypt clean, and stores the encryption key under .git/git-crypt/keys/.
Let's replicate this behavior with git-crypt locked, since we will need it for rewriting history manually:
export KEY=/tmp/git-crypt.key
git-crypt export-key $KEY
git-crypt lock
Now, the git-crypt filter definition is gone, and files are no longer decrypted:
git config -l | grep git-crypt
# Empty - no filters
bat text/test-text-file_small.txt
[bat warning]: Binary content from file 'text/test-text-file_small.txt' will not be printed to the terminal
But we can still decrypt it by manually invoking the smudge filter:
git-crypt smudge --key-file $KEY < text/test-text-file_small.txt
Lorem ipsum dolor sit amet, consectetur adipiscing elit...
Note that we needed to supply our key manually: by default the filter looks for keys under .git/git-crypt/keys/, but git-crypt lock cleaned them up.
Nice! Now, we can decrypt the git-crypt files in-place all at once by matching them against the git attributes:
git ls-files | xargs git check-attr filter | \
grep "filter: git-crypt" | cut -f 1 -d ":" | \
xargs -I {} sh -c "git-crypt smudge --key-file $KEY < {} | sponge {}"
Encryption works exactly the same way; we just use git-crypt clean instead of git-crypt smudge. Also, we technically didn’t need to isolate the encrypted files like we did, since git-crypt is smart enough to pass through any file it doesn’t manage. But this way is cleaner, more complete, and should be a minor optimization for big repositories.
Now we have the tool to manually decrypt/encrypt all files in a given working directory state!
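The file-selection half of that pipeline can be tried without git-crypt installed, since attribute matching does not require the filter to be configured. This is a small sketch in a throwaway repository; the file names secret.txt and README.md are hypothetical.

```shell
#!/bin/sh
set -eu

# Throwaway repository with one file marked for git-crypt and one not.
repo=$(mktemp -d)
cd "$repo"
git init -q .
printf '%s\n' '*.txt filter=git-crypt diff=git-crypt' > .gitattributes
echo 'top secret' > secret.txt
echo 'public' > README.md
git add .

# Same selection as in the pipeline above: list tracked files, ask git which
# filter applies to each, and keep only the paths handled by git-crypt.
matched=$(git ls-files | xargs git check-attr filter | \
    grep "filter: git-crypt" | cut -f 1 -d ":")
echo "$matched"
```

Note that this selection splits on whitespace and colons, so paths containing those characters would need extra care.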
Retroactive Encryption
Let's take an unencrypted repository and rewrite history as if it had always been encrypted with git-crypt, starting from an arbitrary point in history. We do so in two steps: manually insert changes to .gitattributes into our history, and then run our manual encryption command on ALL commits using git filter-branch.
We will use a simple test repo as an example:
git clone https://github.com/SpaghettiPunch/git-test-files
Init git-crypt as usual, and save the key:
export KEY=/tmp/git-crypt.key
git-crypt init
git-crypt export-key $KEY
git-crypt status
not encrypted: .gitignore
not encrypted: README.md
not encrypted: csv/.keep
not encrypted: images/.keep
not encrypted: images/test-image-png_128x128.png
not encrypted: images/test-image-png_12x12.png
not encrypted: images/test-image-png_1x1.png
not encrypted: images/test-image-png_4032x3024.png
not encrypted: pdf/.keep
not encrypted: pdf/test-pdf-file_100_paragraphs.pdf
not encrypted: pdf/test-pdf-file_empty.pdf
not encrypted: text/.keep
not encrypted: text/test-text-file_100_paragraphs.txt
not encrypted: text/test-text-file_empty.txt
not encrypted: text/test-text-file_new.txt
not encrypted: text/test-text-file_small.txt
Now we edit our history to reflect our desired git-crypt attribute setup (in this simple case, only in the root commit):
git rebase -i --root
---
edit 67ff5a7 Initial commit
pick efcf199 Ignore swap files and DS_Store
pick ee6c681 Fix naming
pick 3e942b9 Update README
pick 9d0e84d Remove DS_Store
pick 4b5d933 Add a new text file
pick 2d9b4cd Edit the new text file
# Rebase 2d9b4cd onto f3032bb (7 commands)
---
Stopped at 67ff5a7... Initial commit
Add git-crypt filters for our desired files at this point in history:
*.txt filter=git-crypt diff=git-crypt
Note that git-crypt filters automatically pick up and modify the *.txt files now:
git-crypt status text/
not encrypted: text/.DS_Store
not encrypted: text/.keep
encrypted: text/test-text-file_100paragraphs.txt *** WARNING: staged/committed version is NOT ENCRYPTED! ***
encrypted: text/test-text-file_empty.txt *** WARNING: staged/committed version is NOT ENCRYPTED! ***
encrypted: text/test-text-file_small.txt *** WARNING: staged/committed version is NOT ENCRYPTED! ***
Warning: one or more files is marked for encryption via .gitattributes but
was staged and/or committed before the .gitattributes file was in effect.
Run 'git-crypt status' with the '-f' option to stage an encrypted version.
git status
interactive rebase in progress; onto 17c8c58
Changes not staged for commit:
modified: text/test-text-file_100paragraphs.txt
modified: text/test-text-file_small.txt
Untracked files:
.gitattributes
If we commit these files, they would be encrypted in the current commit, but *.txt files in future commits would be left untouched, since the clean filter never runs. Instead, we want to manually encrypt them ourselves in every commit.
Lock git-crypt to stop this behavior:
git-crypt lock -f
git status
interactive rebase in progress; onto 17c8c58
Untracked files:
.gitattributes
# Confirm our files are still unencrypted
cat text/test-text-file_small.txt
Lorem ipsum dolor sit amet, consectetur adipiscing elit...
git add . && git commit --amend --no-edit
git rebase --continue
Feel free to continue editing the historic .gitattributes state as desired.
When we’re content with our history and are back at our branch tip, we can manually encrypt the matching files in each commit using our external key as discussed in the earlier section:
GIT_LFS_SKIP_SMUDGE=1 \
FILTER_BRANCH_SQUELCH_WARNING=1 \
git filter-branch --tree-filter \
'git ls-files | xargs git check-attr filter | \
grep "filter: git-crypt" | cut -f 1 -d ":" | \
xargs -I {} sh -c "git-crypt clean --key-file $KEY < {} | sponge {}" && \
git clean -fqX' \
--prune-empty -- --all
We are forced to use the slower --tree-filter instead of --index-filter, since we need to operate on the checked out working directory. Note that we also need to add a git clean for the rewrite to respect our .gitignore files.
GIT_LFS_SKIP_SMUDGE=1 is an optimization to further speed up checkout (we can’t use LFS and git-crypt together anyway at the moment).
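To get a feel for what --tree-filter does before pointing it at real data, here is a minimal sketch on a throwaway repository. Uppercasing with tr is a stand-in for git-crypt clean, and a temp file plus mv replaces sponge (which comes from moreutils and may not be installed); the file name a.txt is hypothetical.

```shell
#!/bin/sh
set -eu

# Throwaway repository with two commits touching the same file.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
echo one > a.txt
git add a.txt
git commit -q -m "first"
echo two >> a.txt
git commit -q -am "second"

# The filter runs once per commit inside a checkout of that commit's tree.
# Each *.txt file is rewritten in place via a temp file.
FILTER_BRANCH_SQUELCH_WARNING=1 \
git filter-branch --tree-filter \
    'for f in *.txt; do
         [ -e "$f" ] || continue
         tr a-z A-Z < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
     done' \
    -- --all >/dev/null 2>&1

# Both the old and the new commit now store the transformed content.
first=$(git show HEAD~1:a.txt)
tip=$(git show HEAD:a.txt)
echo "first commit: $first"
echo "tip commit:   $tip"
```

The transformation is applied to the file as it existed in each commit, which is why the same approach can encrypt files that only appear later in history.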
That’s it! We can use git-crypt as if it was enabled all along:
git-crypt status text/
not encrypted: text/.keep
encrypted: text/test-text-file_100_paragraphs.txt
encrypted: text/test-text-file_empty.txt
encrypted: text/test-text-file_new.txt
encrypted: text/test-text-file_small.txt
bat text/test-text-file_small.txt
[bat warning]: Binary content from file 'text/test-text-file_small.txt' will not be printed to the terminal
git-crypt unlock $KEY
cat text/test-text-file_small.txt
Lorem ipsum dolor sit amet, consectetur adipiscing elit...
git show
diff --git a/text/test-text-file_new.txt b/text/test-text-file_new.txt
index c611cb4..172d1bd 100644
--- a/text/test-text-file_new.txt
+++ b/text/test-text-file_new.txt
@@ -1 +1,3 @@
A new file added in a new commit
+
+This is an edit in a later commit
Note that with this method git-crypt also picked up the new file (text/test-text-file_new.txt), which was added in a commit after the root, as we desired.
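For an extra check that older commits really store ciphertext, one option is to inspect the stored blobs directly. The sketch below assumes that git-crypt output begins with the 10-byte marker \0GITCRYPT\0; treat that as an assumption to confirm against your git-crypt version. The helper names is_git_crypt_blob and list_encrypted_at are hypothetical.

```shell
#!/bin/sh
set -eu

# Succeeds if stdin starts with the bytes \0GITCRYPT\0 (compared as hex).
# Assumption: this marker matches what your git-crypt version writes.
is_git_crypt_blob() {
    header=$(head -c 10 | od -An -tx1 | tr -d ' \n')
    [ "$header" = "00474954435259505400" ]
}

# Print every path at the given commit whose stored blob looks encrypted.
list_encrypted_at() {
    commit=$1
    git ls-tree -r --name-only "$commit" | while IFS= read -r path; do
        if git cat-file -p "$commit:$path" | is_git_crypt_blob; then
            printf '%s\n' "$path"
        fi
    done
}
```

Inside the rewritten repository, something like list_encrypted_at HEAD~3 would then list the files stored encrypted at that commit, independent of whether git-crypt is currently locked.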
Retroactive Decryption
Now, let's undo all that work. We start with a repository that has been historically encrypted and want to remove all trace of git-crypt. To do this, we reverse the order of operations from encryption: we first decrypt all the files in all commits using git attribute matching, and then clean up the attributes themselves.
We begin with the git-crypt enabled and unlocked repo from the previous section.
Once again, we need to lock git-crypt to stop it from affecting our files and make sure our key is available outside the repo:
git-crypt export-key $KEY
git-crypt lock
We can manually decrypt all the matching files in each commit now. We don’t clean up .gitattributes just yet, since we use it to identify the encrypted files.
GIT_LFS_SKIP_SMUDGE=1 \
FILTER_BRANCH_SQUELCH_WARNING=1 \
git filter-branch --tree-filter \
'git ls-files | xargs git check-attr filter | \
grep "filter: git-crypt" | cut -f 1 -d ":" | \
xargs -I {} sh -c "git-crypt smudge --key-file $KEY < {} | sponge {}" && \
git clean -fqX' \
--prune-empty -- --all
This is the same command as in the encryption case, but with smudge instead of clean.
Now that all the files are decrypted, we can clean up .gitattributes in a new rewrite. In each commit, we remove any git-crypt filter lines from the file if it exists, and delete the file if that leaves it empty:
GIT_LFS_SKIP_SMUDGE=1 \
FILTER_BRANCH_SQUELCH_WARNING=1 \
git filter-branch --tree-filter \
'([ -e .gitattributes ] && sed -i "/filter=git-crypt/d" .gitattributes || true) && \
([ -e .gitattributes ] && [ ! -s .gitattributes ] && rm .gitattributes || true) && \
git clean -fqX' \
--prune-empty -- --all
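The per-commit cleanup logic can be tested on its own before running it across history. This is a sketch under a couple of assumptions: the helper name strip_git_crypt_attrs and the two sample directories are hypothetical, and a temp file is used instead of sed -i because the -i syntax differs between GNU and BSD sed.

```shell
#!/bin/sh
set -eu

# Remove git-crypt lines from .gitattributes in the current directory,
# and delete the file entirely if nothing else is left in it.
strip_git_crypt_attrs() {
    [ -e .gitattributes ] || return 0
    sed '/filter=git-crypt/d' .gitattributes > .gitattributes.tmp
    mv .gitattributes.tmp .gitattributes
    if [ ! -s .gitattributes ]; then
        rm .gitattributes
    fi
}

dir=$(mktemp -d)
mkdir "$dir/mixed" "$dir/only-crypt"

# Case 1: other attributes are present and must survive.
cd "$dir/mixed"
printf '%s\n' '*.txt filter=git-crypt diff=git-crypt' '*.png binary' > .gitattributes
strip_git_crypt_attrs
remaining=$(cat .gitattributes)
echo "remaining: $remaining"

# Case 2: the git-crypt line was the only content, so the file goes away.
cd "$dir/only-crypt"
printf '%s\n' '*.txt filter=git-crypt diff=git-crypt' > .gitattributes
strip_git_crypt_attrs
```

Like the sed expression in the command above, this only looks at the top-level .gitattributes; nested attribute files in subdirectories would need their own handling.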
Don’t forget to clean up:
rm -r .git/git-crypt
rm $KEY
Our repo history is free of encrypted files and git-crypt settings again!
git show
diff --git a/text/test-text-file_new.txt b/text/test-text-file_new.txt
index c611cb4..172d1bd 100644
--- a/text/test-text-file_new.txt
+++ b/text/test-text-file_new.txt
@@ -1 +1,3 @@
A new file added in a new commit
+
+This is an edit in a later commit
git-crypt status text/
not encrypted: text/.keep
not encrypted: text/test-text-file_100_paragraphs.txt
not encrypted: text/test-text-file_empty.txt
not encrypted: text/test-text-file_new.txt
not encrypted: text/test-text-file_small.txt