I’ve needed to move files or directories (along with their histories) from one Git repository into a new repository often enough now that I’m annoyed with myself each time I can’t remember how to do it. Hence, here are my notes on how to accomplish this.
I don’t take any credit for the actual commands mentioned here; everything has been gleaned from the amazing knowledge resource that is Stack Overflow. In particular, these answers were used when working out the solution presented below:
- How to split a git repository while preserving subdirectories?
- Splitting a set of files within a git repo into their own repository, preserving relevant history
- How to move a file from one git repository to another while preserving history
When one project is really two
Imagine you have a repository which has been growing and growing, and at some point you realise that a part of the repository is really a project on its own. How do you take this part (be it a file, a set of files, or an entire subdirectory) and create a new repository containing only these files and their respective histories (and no others)?
The trick is to think of the new repository as being the old repository, but with the files (and their histories) that you don’t want to keep removed from it.
This is the process to use:
- clone the original repository locally
- enter the clone and remove all files from git that aren’t wanted
Moving files and directories
First, clone the original repo:
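A minimal sketch of this step. So that it can be run as-is, a tiny local repository named original stands in for the real one (which you would normally clone by URL); the names original and clone are placeholders, not anything from a real project.

```shell
# Sandbox setup (illustrative only): a throwaway repo plays the original.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q original
git -C original -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"

# The actual step: clone the original into a working copy called "clone".
git clone -q original clone
```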
Now remove the origin remote reference (we want to detach the new repository from the history of the original one):
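Sketched below with a freshly initialised repository standing in for the clone; the URL is hypothetical and only there so that there is an origin to remove.

```shell
# Sandbox setup (illustrative only): a repo with an "origin" remote.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q clone
cd clone
git remote add origin https://example.com/original.git   # hypothetical URL

# The actual step: drop the link back to the original repository.
git remote rm origin
git remote -v      # prints nothing once the remote is gone
```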
Then it’s a simple matter of uttering the following incantation:
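One form this incantation can take, modelled on the Stack Overflow answers listed above, is sketched here. The sandbox repository, the keepdir/ path and the file names are made up for the demonstration; replace the grep pattern with whatever you want to keep. The -r flag to xargs (GNU and busybox) skips the git rm call for commits in which nothing needs removing, and file names containing whitespace would need extra care.

```shell
# Sandbox setup (illustrative only): three commits, one wanted directory.
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -c init.defaultBranch=master init -q clone && cd clone
mkdir keepdir
echo one > keepdir/a.txt; echo x > other.txt
git add . && git commit -q -m "Add both files"
echo y >> other.txt
git commit -q -am "Touch only the unwanted file"
echo two >> keepdir/a.txt
git commit -q -am "Touch only the wanted file"

# Newer Git versions print a warning and pause before filter-branch runs;
# this variable suppresses that.
export FILTER_BRANCH_SQUELCH_WARNING=1

# For every commit: list all files, keep the ones NOT under keepdir/,
# and remove those from the index. Commits left empty are pruned.
git filter-branch --prune-empty --index-filter '
    git ls-tree -r --name-only --full-tree "$GIT_COMMIT" |
    grep -v "^keepdir/" |
    xargs -r git rm --cached -r -q --
' -- --all

git log --oneline    # two commits remain
git ls-files         # keepdir/a.txt
```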
This goes through all commits in the clone of the original repository, looking for files which don’t match the files you want to keep, and removes their entries from the index. Afterwards you’re left with only the commits that touch the files you’re interested in.
To make sure that everything is cleaned up, you can also run Git’s garbage collector explicitly so that everything that isn’t required really has been purged:
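A sketch of a clean-up sequence that is commonly used after git filter-branch (not necessarily the exact command originally used here). The backup refs under refs/original/ and the reflog still point at the old commits, so they have to go before the garbage collector can drop anything. The sandbox repository and the file names keep.txt and drop.txt are invented for the demonstration.

```shell
# Sandbox setup (illustrative only): filter one file out of a tiny repo.
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1
git -c init.defaultBranch=master init -q clone && cd clone
echo keep > keep.txt; echo drop > drop.txt
git add . && git commit -q -m "Add both files"
dropped_blob=$(git rev-parse HEAD:drop.txt)
git filter-branch --index-filter \
    'git rm --cached -q --ignore-unmatch drop.txt' -- --all >/dev/null

# The actual clean-up: remove filter-branch's backup refs, expire the
# reflog, then garbage-collect and prune unreachable objects right away.
git for-each-ref --format='%(refname)' refs/original/ |
    xargs -r -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc --quiet --aggressive --prune=now

# The removed file's content is no longer stored in the repository.
git cat-file -e "$dropped_blob" 2>/dev/null || echo "blob for drop.txt is gone"
```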
Now rename the directory to something more appropriate for the subproject that has been created and reassign the origin remote pointer (assuming, of course, that the remote bare repository has already been created):
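Roughly like this, where subproject and the remote URL are hypothetical names; an empty repository stands in for the filtered clone so the snippet runs on its own.

```shell
# Sandbox setup (illustrative only): stand-in for the filtered clone.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q clone

# The actual step: rename the directory and point origin at the new home.
mv clone subproject
cd subproject
git remote add origin git@example.com:subproject.git   # hypothetical URL
git remote -v

# Once the bare repository really exists, publish the history:
# git push -u origin master
```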
Of course, the moved files still need to be removed from the original repository, and a commit message indicating where they ended up would be very helpful for possible repository archaeology in the future.
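For example (a sketch only; the directory name keepdir and the URL in the commit message are placeholders, and the sandbox repository plays the role of the original):

```shell
# Sandbox setup (illustrative only): the original repo with both parts.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q original && cd original
git config user.name demo; git config user.email demo@example.com
mkdir keepdir
echo one > keepdir/a.txt; echo x > other.txt
git add . && git commit -q -m "Add both files"

# The actual step: remove the moved files and say where they went.
git rm -r -q keepdir
git commit -q -m "Move keepdir into its own repository

The files and their history now live in
git@example.com:subproject.git"
```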
Moving just a directory
The definitive guide to moving a subdirectory is in the answer to this question on Stack Overflow: Detach (move) subdirectory into separate Git repository
To paraphrase that answer, here is how to extract just the given directory, pulling in all branches and tags.
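A sketch along the lines of that answer, using a throwaway repository so it runs as-is. The names subdir, v1 and feature are made up for the demonstration. The --tag-name-filter cat part keeps tag names unchanged while pointing them at the rewritten commits.

```shell
# Sandbox setup (illustrative only): one subdirectory, a tag and a branch.
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1
git -c init.defaultBranch=master init -q clone && cd clone
mkdir subdir
echo one > subdir/a.txt; echo x > other.txt
git add . && git commit -q -m "Add both files"
git tag v1
git branch feature

# The actual step: make subdir/ the new repository root, for all refs.
git filter-branch --tag-name-filter cat --prune-empty \
    --subdirectory-filter subdir -- --all

git ls-files    # a.txt (the subdirectory is now the top level)
```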
If you don’t want all tags and branches you can just rewrite the current HEAD by using this version of the command:
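Sketched with the same kind of throwaway repository (subdir and feature are again invented names). Only the checked-out branch is rewritten; the feature branch is left pointing at the old, unfiltered history.

```shell
# Sandbox setup (illustrative only).
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1
git -c init.defaultBranch=master init -q clone && cd clone
mkdir subdir
echo one > subdir/a.txt; echo x > other.txt
git add . && git commit -q -m "Add both files"
git branch feature

# The actual step: rewrite only the history reachable from HEAD.
git filter-branch --prune-empty --subdirectory-filter subdir HEAD

git ls-files                            # a.txt
git ls-tree -r --name-only feature      # still lists other.txt and subdir/a.txt
```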
Making the complex simple: git subtree
It turns out that splitting a subdirectory of a project out into a new project is sufficiently common that there is also a Git command especially for it: git subtree.
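Again, clone the original repo. This is effectively a backup of your repository, which is a good idea before doing anything that manipulates history. (Strictly speaking, git subtree split only creates new commits and a new branch and leaves the existing branches alone, but a fresh clone costs nothing.) As I saw on a T-shirt recently: “No backup? No pity!”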
Now we split a subdirectory of the repository (called the “prefix” in subtree terminology) into its own “project” and create a new branch with just this subdirectory and its history.
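A sketch of the split, run from the top level of the working tree. The names subdir and split-branch are placeholders, and the repository is a throwaway stand-in. Note that git subtree lives in Git's contrib area, so on some systems it has to be installed separately.

```shell
# Sandbox setup (illustrative only).
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git -c init.defaultBranch=master init -q clone && cd clone
mkdir subdir
echo one > subdir/a.txt; echo x > other.txt
git add . && git commit -q -m "Add both files"

# The actual step: extract subdir/ and its history onto a new branch.
git subtree split --prefix=subdir -b split-branch

git ls-tree -r --name-only split-branch    # a.txt
```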
If you check out the new branch you’ll find only the files from the subdirectory that you just split from the original project. Assuming that you’ve already made a bare repository for the new project, you can now add the bare repository as an upstream reference and push this branch to the new project’s master branch:
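As a sketch: here a local bare repository named newproject.git stands in for the real remote, and an ordinary branch called split-branch stands in for the branch produced by the split (both names are placeholders).

```shell
# Sandbox setup (illustrative only): a bare repo and a branch to publish.
cd "$(mktemp -d)"
git init -q --bare newproject.git
git -c init.defaultBranch=master init -q clone && cd clone
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Stand-in for the split history"
git branch split-branch

# The actual step: register the bare repo and push the branch as master.
git remote add upstream ../newproject.git
git push -q upstream split-branch:master
```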
The nice, clean, shiny project repository can now be cloned from upstream:
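For instance (with a local bare repository standing in for upstream, so the snippet is runnable; in practice you would clone by URL, and newproject is just a placeholder name):

```shell
# Sandbox setup (illustrative only): a bare repo that already has a master.
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q --bare newproject.git
git -c init.defaultBranch=master init -q scratch
git -C scratch -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Stand-in for the split history"
git -C scratch push -q ../newproject.git HEAD:master

# The actual step: clone the new project.
git clone -q newproject.git newproject
git -C newproject log --oneline
```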
And that’s it! I hope that helped someone and that it helps my forgetful future self :-)