I have a simpler version here 🙂 http://www.yeahshecodes.com/git/a-simple-git-subtree-tutorial
Defining the problem
We have to deploy code from within a VPN environment, which means we rely on a proxy to access external links. This has proven messy: we often spend over a day trying to deploy a project from within the VPN while attempting to connect to external resources (npm, bower, github etc).
Additionally, our VPN users have a capped data allowance.
Some problems we had to solve for:
- We need to be able to deploy code without relying on a proxy and without getting restricted by a data allowance.
- We need to be able to deploy to more than one environment, eg a staging environment and a production environment.
- We need to deploy a “dist” folder that is in the .gitignore file, pushing it only to the server we want to deploy from and not to our repo on github.
- We did not want to just copy files over (for example with scp); we wanted something more structured than that.
- We need an ansible script that automates the process (only touched on briefly in this post).
Deciding on a solution
Git subtrees sounded like a good solution. I glanced over basic tutorials and skimmed through some resources.
I went through a few attempts to use subtrees and realised no one pointed out how NOT to use subtrees, which left me spending a considerable amount of time doing the wrong thing. So here is a quick rundown of what not to do, and then what you can do instead.
Wrong approach
I have to emphasise that this proved to be the wrong way to use subtrees: moving a specific folder into a subtree within the same repo AND to an external repo.
I added two destinations to my origin remote, so that when I pushed, the changes from the “dist” folder went into a subtree on my current repo and to a repo on a different server.
After a lot of struggling I realised this was the wrong approach.
Better approach
From the origin repo, push a specific folder into a subtree of an external repo only.
Have two remotes: one will be “origin” and one will be “server”. When I push the subtree, I want it to go to “server”; when I push my working/source code, I want it to go to origin (aka github).
Both remotes can be hosted on github. In our case we have one github hosted repo and one bare git repo on a private server we manage.
I set up a bare repo on a private server that could only be accessed while logged in on the vpn.
The approach in motion
Setting up a bare git repo
On the server, go to the folder where you want to keep your bare repos.
Hint: the command “pwd” will tell you the file path where you currently are.
Then run this command
    git init --bare REPONAME.git
Replace REPONAME with the name you want to give the repo.
This creates a folder named REPONAME.git inside the folder you are currently in.
Then go into that folder:
    cd REPONAME.git
And run
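    git symbolic-ref HEAD refs/heads/[branch-name]
Replace [branch-name] with the branch you will push to (for example staging-dist). With HEAD as the only argument the command just prints the current value; the second argument is what sets it.
This sets the head of the repo (otherwise you will keep getting an error when you try to use this bare repo).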
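To make the bare repo steps concrete, here is a minimal sketch that runs them in a throwaway temp folder instead of a real server. The temp folder and the branch name staging-dist are assumptions for illustration only; use your own location and branch.

```shell
#!/bin/sh
set -eu

# Hypothetical location; on a real server this would be your bare-repo folder.
repos_dir=$(mktemp -d)
cd "$repos_dir"

# Create the bare repo and step into it.
git init -q --bare REPONAME.git
cd REPONAME.git

# Point HEAD at the branch the subtree will be pushed to (assumed: staging-dist).
git symbolic-ref HEAD refs/heads/staging-dist

# With no second argument, symbolic-ref only reads the current target back.
head_ref=$(git symbolic-ref HEAD)
is_bare=$(git rev-parse --is-bare-repository)
echo "$head_ref"
```

Reading the value back is a quick way to confirm HEAD points where you expect before you push anything to the repo.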
Preparing to use git subtree
Now you have a location for the subtree. As mentioned, you can also use a github hosted repo as your location; in that case, copy that repo's url.
In our case, because we are using a private server for our subtree we will use ssh to access the location.
    [ssh_username]@[ip_address_or_domain]:/file/path/to/repo/from/root/REPONAME.git
You will need to replace the following values with your own:
- ssh_username
- ip_address_or_domain
- /file/path/to/repo/from/root/
- REPONAME.git
Add a new remote
    git remote add [remote-name] [location]
eg
    git remote add server [ssh_username]@[ip_address_or_domain]:/file/path/to/repo/from/root/REPONAME.git
Updating an existing remote
If you have already added the remote and want to change its url (not origin, but the separate remote used for the subtree), use this command:
    git remote set-url [remote-name] [location]
e.g.
    git remote set-url server [ssh_username]@[ip_address_or_domain]:/file/path/to/repo/from/root/REPONAME.git
I have chosen to call the remote “server”. You can call it whatever you choose, just not origin, as origin is the original github project you are pushing from.
To see your current remotes, type:
    git remote -v
You will see your current origin values pointing to your current repo.
If you have not added the server remote yet, add it now (set-url only works for a remote that already exists):
    git remote add server [ssh_username]@[ip_address_or_domain]:/file/path/to/repo/from/root/REPONAME.git
Replacing the values as mentioned above.
Now run this again:
    git remote -v
And you will see your origin and your server locations.
Do not point the origin remote at two locations (this is the mistake I made at first). Origin should only point to your current github repo, never to the subtree repo.
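The remote setup can be tried out safely in a scratch repo. This is only a sketch: the two urls below are made-up placeholders (assumptions, not our real locations), and nothing is fetched or pushed.

```shell
#!/bin/sh
set -eu

# Scratch project so nothing real is touched.
project=$(mktemp -d)
cd "$project"
git init -q

# Placeholder urls (assumptions); substitute your own github repo and server path.
git remote add origin "git@github.com:example-org/example-project.git"
git remote add server "deploy@vpn.example.com:/srv/git/REPONAME.git"

# Changing the server url later touches only that remote, never origin.
git remote set-url server "deploy@vpn.example.com:/srv/git/RENAMED.git"

origin_url=$(git remote get-url origin)
server_url=$(git remote get-url server)

# Lists both remotes, each with its own single location.
git remote -v
```

Keeping one url per remote is what makes it possible to send source code to origin and the built folder to server without the two ever mixing.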
Some notes on subtrees
The environments are now prepared for subtrees.
In your project decide what folder you want to move into the subtree. In our case we wanted to use “/dist” in the root of our repo.
If the folder you want to subtree sits deeper in your folder structure, you need to reference it by its path from the root of the repo. Eg “files/app/dist”
I’ve seen a youtube video of a wordpress project referencing a plugin using a subtree. This meant they pointed to the wordpress plugins folder and then the specific plugin they wanted to subtree into the project. “wp-content/plugins/plugin-folder”
Using git subtree
If you type in this command, you will see the following help information:
    git subtree -h

    usage: git subtree add   --prefix=<prefix> <commit>
       or: git subtree add   --prefix=<prefix> <repository> <ref>
       or: git subtree merge --prefix=<prefix> <commit>
       or: git subtree pull  --prefix=<prefix> <repository> <ref>
       or: git subtree push  --prefix=<prefix> <repository> <ref>
       or: git subtree split --prefix=<prefix> <commit...>

        -h, --help            show the help
        -q                    quiet
        -d                    show debug messages
        -P, --prefix ...      the name of the subdir to split out
        -m, --message ...     use the given message as the commit message for the merge commit

    options for 'split'
        --annotate ...        add a prefix to commit message of new commits
        -b, --branch ...      create a new branch from the split subtree
        --ignore-joins        ignore prior --rejoin commits
        --onto ...            try connecting new tree to an existing one
        --rejoin              merge the new branch back into HEAD

    options for 'add', 'merge', and 'pull'
        --squash              merge subtree changes as a single commit
The main commands to keep in mind:
    git subtree add --prefix [path_to_folder] [remote] [remote-branch]
    git subtree pull --prefix [path_to_folder] [remote] [remote-branch]
    git subtree push --prefix [path_to_folder] [remote] [remote-branch]
Eg (remember I called the remote server above, you can call it anything you prefer):
    git subtree add --prefix dist server staging-dist
    git subtree pull --prefix dist server staging-dist
    git subtree push --prefix dist server staging-dist
I have chosen to push to branches using the naming convention “origin-branch”-“folder”. “staging-dist” is the name of the subtree, but it is also the branch that the subtree will exist in.
So the branches I push the subtree from are staging (for staging deploys) and master (for production deploys).
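If you script your deploys, the naming convention can be derived rather than typed by hand. The helper below is a hypothetical sketch (subtree_branch is my own name, not a git command), and using only the last segment of a nested prefix is an assumption you may want to change.

```shell
#!/bin/sh
set -eu

# Build the subtree branch name as "<source-branch>-<folder>".
subtree_branch() {
    # $1 = source branch, $2 = prefix path; only the last path segment is used.
    printf '%s-%s\n' "$1" "$(basename "$2")"
}

staging_target=$(subtree_branch staging dist)
production_target=$(subtree_branch master dist)
nested_target=$(subtree_branch staging files/app/dist)

echo "$staging_target"
echo "$production_target"
```

The result can then be passed as the last argument to git subtree push.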
One of the problems we had to solve using this approach was not deploying “dist” to our main repo. With subtrees, you can push that code to the server without pushing it to your main repo.
From within the project, “dist” is added to .gitignore
Then when your dist folder is ready for deploy (usually after running “grunt build”) you can force-add dist using -f and commit it.
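If the server branch already holds files and you do not yet have a local dist folder, add the subtree first. Note that git subtree add needs a repository and ref (or a commit), and it will refuse to run if the prefix folder already exists:
    git subtree add --prefix dist server staging-dist
If the server repo is new and empty, you can skip this step.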
Then add your dist files and commit them (do not push to your current branch, only push to the subtree):
    git add dist -f
    git commit -am "Added the dist files"
Then push to the subtree
    git subtree push --prefix dist server staging-dist
If you have an existing subtree with files, you will need to pull first by running:
    git subtree pull --prefix dist server staging-dist
Again, just replace the values as you need to (mentioned above).
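To see the whole flow end to end without touching a real server, the sketch below uses a local bare repo as a stand-in for “server”. Everything in it (the temp folders, file names and the fake build step) is an assumption for illustration; it also assumes git subtree is installed alongside git.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: a local bare repo stands in for the private "server".
work=$(mktemp -d)
git init -q --bare "$work/server.git"

# Source project on a "staging" branch, with dist ignored as in the post.
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/staging
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "dist" > .gitignore
echo "source" > app.js
git add .gitignore app.js
git commit -q -m "Source files"

# Register the stand-in server as a second remote (never as origin).
git remote add server "$work/server.git"

# Simulate a build, then force-add the ignored folder and commit it locally.
mkdir dist
echo "<h1>built</h1>" > dist/index.html
git add -f dist
git commit -q -m "Added the dist files"

# Push only the contents of dist to the staging-dist branch on the server.
git subtree push --prefix dist server staging-dist

# List what the server branch holds: the contents of dist, at the root.
server_files=$(git --git-dir="$work/server.git" ls-tree -r --name-only staging-dist)
echo "$server_files"
```

Note that the source files (app.js, .gitignore) never reach the server repo; only what lives under the prefix is pushed.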
Deploying from the subtree
So now we have code in a subtree, and we just need to get those files to the location where they are needed. I won’t go into how to set up nginx to point to this location, that will be for another post, but if you do want to learn how to do that just google using nginx for file management.
If you want your files to land in the right location automatically whenever you push to the subtree, you can read through this digital ocean post
I would not recommend that approach for a production project, but perhaps for a staging one. I will always enforce a manual deploy step for production using a system like jenkins.
For this project I created an ansible script that would be run by jenkins in order to deploy.
    - name: Checkout git repo into destination
      git: repo=/file/path/to/repo/from/root/{{repo_name}}
           dest=/file/path/to/destination/from/root/{{repo_dest}}
           version={{github_branch}}
           force=yes
This task makes sure the files in the repo, on the subtree branch (e.g. staging-dist), get checked out to the required location.
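For anyone who wants to see what that checkout amounts to without ansible, here is a rough plain-git equivalent. It is a sketch under assumptions: the paths are temp folders, the bare repo is seeded locally just so the example runs, and the hard reset is my stand-in for force=yes, not a description of what the ansible module does internally.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Stand-in for the bare repo on the server, seeded with a staging-dist branch.
git init -q --bare "$work/REPONAME.git"
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/staging-dist
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "<h1>built</h1>" > index.html
git add index.html
git commit -q -m "Built files"
git push -q "$work/REPONAME.git" staging-dist

# The deploy step itself: check out the subtree branch into the destination.
dest="$work/www"
git clone -q --branch staging-dist "$work/REPONAME.git" "$dest"

# On later deploys, fetch and hard-reset so local edits in dest are discarded.
git -C "$dest" fetch -q origin staging-dist
git -C "$dest" reset -q --hard origin/staging-dist

deployed=$(cat "$dest/index.html")
echo "$deployed"
```

Because the destination only ever tracks the subtree branch, it contains the built files and nothing from the source tree.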
Questions?
This process took me a little while to wrap my head around, especially after going down the wrong path at first and then having to go back and revisit the solution.
The solution should always feel like it fits, and this second approach fitted our project.
This post focused on subtrees, but the overall solution required ansible, nginx, a private server, git and github.
If you have any questions (or suggested improvements) feel free to pop them in the comments section.
Onwards, to more solutions.