Git Life: Simplified with Rebase
How rebasing removes the clutter, mess, and pain of merging.
This article is written as an individual review assignment for PPL CS UI 2021.
They say that if you have never used rebase while working in a team with Git, you are doing it wrong.
If you are unfamiliar with the Git commands push, pull, and merge, this article is sadly not for you yet.
Here, we will cover the essence of git rebase, the difference between merge and rebase, and the confidence to know when to use rebase.
The Essence
What is Rebase?
In the simplest terms, rebase is the process of combining commits that exist in two separate branches into one straight line.
The following is the general process of how rebase takes place:
The above diagram is achieved by executing the git rebase staging (or git pull --rebase origin staging) command while on the PBI-1-feature branch.
What rebase does is change the base commit of the PBI-1-feature branch from commit A to commit C of the staging branch. This creates the “illusion” that either
- we have moved the content (commits D and E) of the PBI-1-feature branch to accommodate the latest changes (commits B and C) from the staging branch, or
- we have created our PBI-1-feature branch from commit C instead of commit A from the beginning.
Behind the scenes, Git accomplishes this “re-writing of commit history” by replacing the commits of the PBI-1-feature branch with brand new commits (D’’ and E’’) applied on top of the specified base. In a rebase without conflicts, the new commits D’’ and E’’ contain exactly the same changes as commits D and E respectively, but they carry new commit hashes, and together they form one linear flow of A -> B -> C -> D'' -> E''.
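To see this for yourself, the history described above can be reproduced in a throwaway repository. The following is only a sketch: the temporary directory, the commit helper function, and the one-file-per-commit layout are assumptions made for illustration, not part of any real project.

```shell
#!/bin/sh
# Sketch: rebuild the A..E history in a scratch repository, then rebase it.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git checkout -q -b staging

# Hypothetical helper: one commit that adds a file named after the commit.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit A
git checkout -q -b PBI-1-feature   # branch off at commit A
commit D
commit E
git checkout -q staging
commit B
commit C

git checkout -q PBI-1-feature
old_tip=$(git rev-parse HEAD)      # hash of the original commit E
git rebase -q staging              # replay D and E on top of C
new_tip=$(git rev-parse HEAD)      # hash of the rewritten commit E

# Commit subjects from oldest to newest, joined on one line
history=$(git log --reverse --format=%s | xargs)
echo "history: $history"           # prints: history: A B C D E
echo "tip before: $old_tip"
echo "tip after:  $new_tip"
```

The two printed hashes differ, which is the "brand new commits" part in action: the changes are the same, but the commits themselves have been re-created.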
The Difference
Now that we know what rebase does, does it remind you of other commands we know? In case you have not noticed, the git rebase or git pull --rebase command shares the same fundamental purpose as the git merge or git pull command. Both solve the problem of integrating changes from one branch into another, just in totally different ways.
Merge vs Rebase
Suppose we have begun working on our PBI-1-feature branch (see the diagram above), and we encounter one of these three scenarios:
- We want to merge our changes into the staging branch through a merge request (or pull request), but there are merge conflicts.
- Our work depends on our teammate’s work contained in commits B and C, so we want to get the latest changes from the staging branch.
- [Advanced] Our work is part of stacked merge requests, e.g. staging -> PBI-1-feature -> PBI-2-feature2. After PBI-1-feature has been merged, we want to merge PBI-2-feature2 next. However, PBI-2-feature2 still carries the commits from PBI-1-feature when compared with staging, which typically results in merge conflicts.
First, we want to update our local staging branch with the changes on the public staging branch through this set of commands:
git checkout staging          # go to staging
git pull origin staging       # fetch and merge the latest changes
git checkout PBI-1-feature    # go back to our branch
Now we are faced with two options: to merge or to rebase.
(Messy and Painful) Merge
The conventional and safer alternative is to merge by
git merge staging -or- git pull origin staging
which will result in a commit branch structure roughly like the following:
Commit BC*: Merge branch 'staging' into 'PBI-1-feature' -or- <a custom commit, such as "Resolve conflicts from staging">
Commit F
Commit E
Commit D
Indeed, the strongest advantage of merge is that it preserves the commit history.
However, this also means that we have
- Added clutter, as commits that were already merged show up again in our branch even though they no longer represent new changes,
- Added mess, as there is an extra merge commit that carries no work of its own, and
- Added pain, as we have to resolve all the conflicting changes across the whole repository at once.
Not to mention, imagine having to repeatedly pull the latest changes from staging. Our commit history will be polluted with merge commits, and our precious time will be wasted on the tedious task of resolving conflicts.
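The extra merge commit is easy to observe in a scratch repository. This sketch makes the same assumptions as any throwaway demo: the temporary directory, the commit helper, and the file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: merge staging into the feature branch and inspect the merge commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git checkout -q -b staging

# Hypothetical helper: one commit that adds a file named after the commit.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit A
git checkout -q -b PBI-1-feature
commit D
commit E
git checkout -q staging
commit B
commit C

git checkout -q PBI-1-feature
git merge -q --no-edit staging     # creates a merge commit on our branch

# A merge commit has two parents; count them and count merges in our history
parent_count=$(git log -1 --format=%P | wc -w)
merge_count=$(git rev-list --merges --count HEAD)
echo "parents of tip: $parent_count"
echo "merge commits:  $merge_count"
git log --graph --oneline          # shows the two lines of history joining
```

Every further git merge staging that has something new to bring in adds one more such commit, which is where the pollution described above comes from.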
(Simpler and More Effective) Rebase
This is where rebase comes to the rescue! We can rebase with
git rebase staging -or- git pull --rebase origin staging
which results in the “same” commit history portrayed in the diagram below:
During the rebase process, Git replays our own commits (D and E) one at a time on top of the latest commit from the staging branch (commit C). If a commit being replayed conflicts with the changes from staging, Git stops and asks for our help, listing in the terminal which commit it stopped at and which files need to be resolved.
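At this point, you are given two options:
- Resolve the conflicts manually, then run git add/rm <changed_files> and git rebase --continue. We usually go with this option when our commit and a teammate’s commit on staging change the same files and we want to keep our work.
- Skip the commit with the git rebase --skip command. This drops the commit currently being replayed, so what is already on staging is kept for the conflicting changes.
If the rebase goes wrong, git rebase --abort returns the branch to the state it was in before the rebase started.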
Once all the commits have been resolved either automatically by Git or manually by us, the rebase process is finished.
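The manual route can be walked through in a scratch repository. This is a sketch under stated assumptions: the file notes.txt, its contents, and the "keep both lines" resolution are invented for the demo, and GIT_EDITOR=true is set only so that the script does not open an editor for the commit message.

```shell
#!/bin/sh
# Sketch: provoke a rebase conflict, resolve it by hand, and continue.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git checkout -q -b staging

echo "base" > notes.txt
git add notes.txt
git commit -q -m "A"

git checkout -q -b PBI-1-feature
echo "feature" > notes.txt
git commit -q -am "D"              # our change to notes.txt

git checkout -q staging
echo "staging" > notes.txt
git commit -q -am "B"              # a teammate's change to the same line

git checkout -q PBI-1-feature
# git rebase exits with a non-zero status when it stops on a conflict
if git rebase staging >/dev/null 2>&1; then
  echo "rebased without conflicts"
else
  conflicted=$(git diff --name-only --diff-filter=U)
  echo "conflict in: $conflicted"
  # Resolve by hand; in this demo we simply keep both lines
  printf 'staging\nfeature\n' > notes.txt
  git add notes.txt
  GIT_EDITOR=true git rebase --continue >/dev/null
fi

history=$(git log --reverse --format=%s | xargs)
echo "history: $history"           # prints: history: A B D
```

Replacing the resolution steps with git rebase --skip would drop commit D instead, leaving the branch identical to staging.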
The Confidence
With great power comes great responsibility.
Imagine a situation where you and person “A” live two different lives but share one history, meaning both of you are writing that history together. At some point, the two of you disagree, and each wants to write the history with their own solution. As far as you are concerned, you can re-write the history your way, and person A thinks the same instead of resolving the conflict at hand. History inevitably diverges into two branches, one belonging to you and one to person “A”.
This also applies to Git.
The golden rule of rebase is to not use it on public branches unless given consent to do so.
Every person has their own local Git history on their own device and shares the same public branches, e.g. staging and master. Needless to say, if you re-write the history of a public branch, your version of that branch will differ from the version other users are working with. Believe me, this results in a very confusing situation.
So, with the powerful rebase tool, the responsibility is this: re-write the commit history only on branches that you, and only you, are contributing to, and you will live life just fine.
Golden rule of git rebase: Do NOT use it on public branches.
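The divergence behind the golden rule can be made visible with a local bare repository standing in for the shared remote. This is a sketch only: the branch name shared-feature, the directory layout, and the commit helper are hypothetical.

```shell
#!/bin/sh
# Sketch: rebase a branch that has already been pushed and watch it diverge.
set -e
work=$(mktemp -d)
git init -q --bare "$work/origin.git"   # stands in for the team's remote
git init -q "$work/local"
cd "$work/local"
git config user.name "Demo"
git config user.email "demo@example.com"
git remote add origin "$work/origin.git"
git checkout -q -b staging

# Hypothetical helper: one commit that adds a file named after the commit.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit A
git push -q origin staging
git checkout -q -b shared-feature
commit D
git push -q origin shared-feature       # teammates can now build on D

git checkout -q staging
commit B
git checkout -q shared-feature
git rebase -q staging                   # rewrites D into a new commit

# Left number: commits only on the remote copy; right: only on our copy
divergence=$(git rev-list --left-right --count origin/shared-feature...shared-feature)
echo "remote-only / local-only commits: $divergence"

# A plain push is refused because our branch no longer contains the remote's D
if git push -q origin shared-feature 2>/dev/null; then
  pushed=yes
else
  pushed=no
fi
echo "plain push accepted: $pushed"
```

Anyone who based work on the original D is now holding a commit that no longer exists in our version of the branch, which is exactly the confusing situation the rule is meant to prevent.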
Key Selling Points
Usage of Merge
One sample case where merge comes in handy is applying the squash and merge option to a public branch. A single squashed commit that represents the entire branch is arguably the best option if our main goal is to keep a public branch clean without re-writing its existing commit history.
Usage of Rebase
With rebase, we get rid of the added mess and pain of merging while maintaining a clean commit history and dealing with conflicts more effectively. Seems pretty easy, right?
Other than that, as long as you follow the golden rule of rebasing, you are free to re-write history as much as you like.
Happy rebasing!