Moving Files and Directories to a New Repository in Git

I’ve needed to move files or directories (along with their histories) from one Git repository into a new repository often enough now that I’m annoyed with myself each time I can’t remember how to do it. Hence, here are my notes on how to accomplish this.

I don’t take any credit for the actual commands mentioned here; everything has been gleaned from the amazing knowledge resource that is Stack Overflow. Several answers there were used when working out the solution presented below.

When one project is really two

Imagine you have a repository which has been growing and growing, and at some point you realise that a part of it is really a project of its own. How do you take this part (be it a file, a set of files, or an entire subdirectory) and create a new repository containing only these files and their respective histories (and nothing else)?

The trick is to think of the new repository as being the old repository, but with the files you don’t want to keep (and their histories) removed from it.

This is the process to use:

  • clone the original repository locally
  • enter the clone and remove all files from git that aren’t wanted

Moving files and directories

First, clone the original repo:

$ git clone file:///path/to/original/repository repo_clone

Now enter the clone and remove the origin remote reference (we want to detach the new repository from the original one, so that nothing can be pushed back to it by accident):

$ cd repo_clone
$ git remote rm origin

Then it’s a simple matter of uttering the following incantation:

$ git filter-branch --prune-empty --index-filter \
      'git ls-tree -z -r --name-only --full-tree $GIT_COMMIT | \
       grep -z -v "file1" | \
       grep -z -v "file2" | \
       xargs -0 -r git rm --cached -r' \
      -- --all

This goes through every commit in the clone of the original repository, lists the files in that commit, filters out the ones you want to keep (here file1 and file2) and removes the index entries of everything else. Commits that end up empty are dropped by --prune-empty, so afterwards you’re left with just the commits for just the files you’re interested in.
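To see the filter in action without risking a real repository, here is a small sketch on a throwaway repository. The file names (file1, junk.txt), the demo identity and the temporary directory are all made up for illustration; grep -z and xargs -0 -r assume GNU tools, and FILTER_BRANCH_SQUELCH_WARNING only silences the notice that newer Git versions print before running filter-branch.

```shell
#!/bin/sh
# Toy demonstration: every name below is hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation notice and its delay

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Three commits: one adds file1, one adds an unwanted file, one edits file1.
echo keep > file1    && git add file1    && git commit -q -m "add file1"
echo drop > junk.txt && git add junk.txt && git commit -q -m "add junk"
echo more >> file1   && git commit -q -am "edit file1"

# Keep only file1: list every path in the commit, drop file1 from the list,
# and remove whatever is left from the index.
git filter-branch --prune-empty --index-filter \
    'git ls-tree -z -r --name-only --full-tree $GIT_COMMIT | \
     grep -z -v "^file1$" | \
     xargs -0 -r git rm -q --cached -r' \
    -- --all >/dev/null

remaining_files=$(git ls-tree -r --name-only HEAD)
commit_count=$(git rev-list --count HEAD)
echo "files: $remaining_files, commits: $commit_count"
```

The commit that only added junk.txt becomes empty and is pruned, so two commits remain. Note the anchored pattern "^file1$": an unanchored "file1" would also keep paths such as old/file10.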

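To make sure that everything is cleaned up, you can also run Git’s garbage collector explicitly. Bear in mind that filter-branch keeps the old commits reachable through backup refs under refs/original/ and through the reflog, so gc can only purge the unwanted objects once those are gone (for instance after deleting the backup refs with git update-ref -d and running git reflog expire --expire=now --all):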

$ git gc --aggressive
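A sketch of that clean-up on a throwaway repository, following the checklist in the git filter-branch documentation as far as I understand it: delete the refs/original/ backup refs, expire the reflog, then collect garbage with --prune=now. The repository contents and the simplified index filter are made up for the demonstration.

```shell
#!/bin/sh
# Toy demonstration: every name below is hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo keep > file1    && git add file1    && git commit -q -m "add file1"
echo drop > junk.txt && git add junk.txt && git commit -q -m "add junk"
junk_blob=$(git rev-parse HEAD:junk.txt)   # remember the object we want gone

# Simplified filter for the demo: remove one known file from every commit.
git filter-branch --prune-empty \
    --index-filter 'git rm -q --cached --ignore-unmatch junk.txt' \
    -- --all >/dev/null

# 1. Drop the backup refs that filter-branch left behind.
git for-each-ref --format='%(refname)' refs/original/ |
    xargs -r -n 1 git update-ref -d
# 2. Forget the reflog entries that still point at the old commits.
git reflog expire --expire=now --all
# 3. Now the garbage collector is free to delete the old objects.
git gc --quiet --aggressive --prune=now

backup_refs=$(git for-each-ref refs/original/)
if git cat-file -e "$junk_blob" 2>/dev/null; then
    junk_present=yes
else
    junk_present=no
fi
echo "junk blob still present: $junk_present"
```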

Now rename the directory to something more appropriate for the subproject that has been created and reassign the origin remote pointer (assuming, of course, that the remote bare repository has already been created):

$ git remote add origin git@git.server.example.com:new_repo_name.git
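For completeness, a sketch of the rename, remote and push steps together. A local bare repository stands in for git@git.server.example.com:new_repo_name.git so that the example runs anywhere; the paths and file names are assumptions made for the demo.

```shell
#!/bin/sh
# Toy demonstration: every name below is hypothetical.
set -e
demo=$(mktemp -d)
cd "$demo"

# Stand-in for the already filtered clone.
git init -q repo_clone
cd repo_clone
git config user.email "demo@example.com"
git config user.name "Demo"
echo keep > file1 && git add file1 && git commit -q -m "add file1"
cd ..

# Stand-in for the bare repository on the server.
git init -q --bare new_repo_name.git

# Rename the working directory, point origin at the new home, push everything.
mv repo_clone new_repo_name
cd new_repo_name
git remote add origin "$demo/new_repo_name.git"
git push -q -u origin --all    # all branches, with upstream tracking
git push -q origin --tags      # tags are not included in --all

branch=$(git symbolic-ref HEAD)
local_head=$(git rev-parse HEAD)
pushed_head=$(git --git-dir="$demo/new_repo_name.git" rev-parse "$branch")
echo "local: $local_head, remote: $pushed_head"
```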

Of course, the moved files need to be removed from the original repository and a commit message indicating where they ended up would be very helpful for possible repository archaeology in the future.
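A minimal sketch of that last step, again on a made-up repository: the directory name subproject and the wording of the commit message are only examples.

```shell
#!/bin/sh
# Toy demonstration: every name below is hypothetical.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q original
cd original
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir subproject
echo a > subproject/file1
echo b > README
git add . && git commit -q -m "initial commit"

# Remove the files that now live elsewhere and say where they went.
git rm -r -q subproject
git commit -q -m "Move subproject/ into its own repository

The files and their history now live in
git@git.server.example.com:new_repo_name.git"

tracked_files=$(git ls-tree -r --name-only HEAD)
last_subject=$(git log -1 --format=%s)
echo "still tracked: $tracked_files"
```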

Moving just a directory

The definitive guide to moving a subdirectory is in the answer to this question on Stack Overflow: Detach (move) subdirectory into separate Git repository

To paraphrase that answer, here is how to extract just the given directory, pulling in all branches and tags.

$ git filter-branch --tag-name-filter cat --prune-empty --subdirectory-filter <dirname> -- --all

If you don’t want all tags and branches you can just rewrite the current HEAD by using this version of the command:

$ git filter-branch --tag-name-filter cat --prune-empty --subdirectory-filter <dirname> HEAD
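Here is what the subdirectory filter does to a toy repository. The directory lib, the tag v1 and the file names are invented for the example; the point is that the contents of the chosen directory end up at the top level of the rewritten history and that tags are carried along.

```shell
#!/bin/sh
# Toy demonstration: every name below is hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo top > README && git add README && git commit -q -m "add README"
mkdir lib
echo code > lib/a.txt && git add lib && git commit -q -m "add lib/a.txt"
git tag v1

# Make lib/ the new project root; "--tag-name-filter cat" keeps tag names as-is.
git filter-branch --tag-name-filter cat --prune-empty \
    --subdirectory-filter lib -- --all >/dev/null

head_files=$(git ls-tree -r --name-only HEAD)
tag_files=$(git ls-tree -r --name-only v1)
commit_count=$(git rev-list --count HEAD)
echo "HEAD: $head_files, v1: $tag_files, commits: $commit_count"
```

The README commit never touched lib/, so it does not appear in the rewritten history at all.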

Making the complex simple: git subtree

It turns out that splitting a subdirectory of a project out into a new project is sufficiently common that there is also a Git command especially for it:

$ git subtree split

Again, clone the original repo. This is effectively a backup of your repository, which is always a good idea, even though git subtree split itself is not destructive: it leaves the existing branches alone and writes the extracted history to a new branch. As I saw on a T-shirt recently: “No backup? No pity!”.

$ git clone file:///path/to/original/repository repo_clone

Now we split a subdirectory of the repository (called the “prefix” in git subtree terminology) into its own “project” and create a new branch with just this subdirectory and its history.

$ git subtree split --prefix <dirname> --branch <new-project-name>

If you check out the new branch

$ git checkout <new-project-name>

you’ll find only the files from the subdirectory that you just split from the original project. Assuming that you’ve already made a bare repository for the new project, you can now add the bare repository as an upstream reference and push this branch to the new project’s master branch:

$ git remote add <new-project-origin> git@git.server.example.com:new_repo_name.git
$ git push -u <new-project-origin> <new-project-name>:master
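The whole subtree workflow can be sketched on a throwaway repository as follows. The names lib, lib-only and new-origin are placeholders for the prefix, the branch name and the remote name used above, a local bare repository stands in for the server, and the script assumes that your Git installation ships the git subtree command (some distributions package it separately).

```shell
#!/bin/sh
# Toy demonstration: every name below is hypothetical.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q repo_clone
cd repo_clone
git config user.email "demo@example.com"
git config user.name "Demo"

echo top > README && git add README && git commit -q -m "add README"
mkdir lib
echo code > lib/a.txt && git add lib && git commit -q -m "add lib/a.txt"

# Split lib/ into a branch of its own; the current branch is left alone.
head_before=$(git rev-parse HEAD)
git subtree split --prefix=lib --branch lib-only >/dev/null 2>&1
head_after=$(git rev-parse HEAD)

# Push the new branch to the (stand-in) bare repository as its master branch.
git init -q --bare "$demo/new_repo_name.git"
git remote add new-origin "$demo/new_repo_name.git"
git push -q -u new-origin lib-only:master

split_files=$(git ls-tree -r --name-only lib-only)
pushed_files=$(git --git-dir="$demo/new_repo_name.git" ls-tree -r --name-only master)
echo "split branch: $split_files, pushed: $pushed_files"
```

Comparing head_before and head_after shows that the branch you started from still points at the same commit after the split.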

The nice, clean, shiny project repository can now be cloned from upstream:

$ git clone git@git.server.example.com:new_repo_name.git

And that’s it! I hope that helped someone and that it helps my forgetful future self :-)
