How to manage your repository size and prevent it from getting too large
Sometimes a repository can grow too large. This was the case with the Phimp.me and PSLab Android repositories, which kept APK files used for testing in their Git history.
Have you ever noticed that your repository keeps growing, so that the application itself is small but the repository is far larger?
This can happen because of a large commit history. Every change made to your repo is stored as history, including every version of files that were later replaced or deleted. Keeping this history isn’t bad in itself, but it can inflate the repo unnecessarily. This creates a problem for contributors, who have to download the bloated repo even though the application itself might be quite small. Because of the enormous number of contributions to the Phimp.me and PSLab repositories, their size grew too much. The Phimp.me repo grew to a staggering 600+ MB, while the application is only around 20 MB. New contributors therefore had to spend more than half a GB of data to work on an application of just 20 MB. Existing contributors are affected too, because they sometimes have to clone the repo again, which wastes a lot of time and data. A repo should therefore be maintained so that its size is not much bigger than the application being worked on.
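To get the repo size back under control, we used a tool called BFG Repo-Cleaner. The advantage of using this tool is that the commit history does not get erased, and all contributors keep credit for the changes they made. BFG rewrites the affected commits, so their hashes change, but the commits, their authors, and their messages stay in place.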
The first step is to find the files in the Git history that are inflating the repo size. This can be done with the handy command below:
git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | sed -n 's/^blob //p' | sort --numeric-sort --key=2 | cut -c 1-12,41- | numfmt --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest
This command lists all the files in the Git history, sorted by size so that the largest files appear at the end. Here, you can see which large files are bloating the repo. Files that are no longer used, or that are regenerated every time a change is made in the repo, can be safely deleted.
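As a sketch of how the listing above could be narrowed to only the few largest files, the helper below keeps the blob entries and prints the N largest, biggest last. The function names `top_blobs` and `list_largest_blobs` are hypothetical and not part of Git, and the sizes are printed as raw bytes rather than passed through numfmt.

```shell
# Hypothetical helper: read `git cat-file --batch-check` output on stdin,
# keep only blobs, and print the N largest as "<size in bytes> <path>".
top_blobs() {
  local count="${1:-10}"       # default of 10 is an arbitrary choice
  sed -n 's/^blob //p' |       # drop commits, trees, and tags
    sort -n -k 2 |             # sort by size, smallest first
    tail -n "$count" |         # keep the N largest entries
    cut -d ' ' -f 2-           # remove the object hash
}

# Hypothetical wrapper: run the full pipeline in the current repository.
list_largest_blobs() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    top_blobs "$@"
}
```

Running `list_largest_blobs 20` inside a clone would then show the twenty largest files that ever existed in its history.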
For example, let’s take Phimp.me.
The files causing the increase in size were redundant APK files, Gradle files, and old screenshots of the app that used to be in the README but were no longer in use. Using the above command, we found all these files and deleted them using BFG.
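To see at a glance which kinds of files dominate, as the APK files did here, the blob sizes can be summed per file extension. This is only a sketch: `size_by_extension` is a hypothetical name, and the totals are uncompressed object sizes in bytes, not the space the files take up in the packed repository.

```shell
# Hypothetical helper: read "blob <hash> <size> <path>" lines on stdin and
# print the total size per file extension, largest total first.
size_by_extension() {
  awk '$1 == "blob" {
         name = $NF
         sub(/.*\//, "", name)                 # strip leading directories
         n = split(name, parts, ".")
         ext = (n > 1) ? parts[n] : "(none)"   # text after the last dot
         total[ext] += $3
       }
       END { for (ext in total) print total[ext], ext }' |
    sort -n -r
}
```

It is meant to be fed from the same source as the command above, for example `git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | size_by_extension`.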
You can download the bfg.jar file from the BFG Repo-Cleaner website (the first link under Resources). Once this is done, the commands needed to delete files and folders using BFG are:
java -jar bfg.jar --delete-files <file name>
java -jar bfg.jar --delete-folders <folder name>
To make your work easier, you can define bfg as an alias for java -jar bfg.jar.
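One way to set up such a shortcut is a small shell function, which also works inside scripts, where bash does not expand aliases by default. This is a sketch under assumptions: the `BFG_JAR` variable and the default location `$HOME/bfg.jar` are hypothetical and should point to wherever the jar was downloaded.

```shell
# Hypothetical wrapper so that `bfg ...` runs the downloaded jar.
# BFG_JAR is an assumed variable; adjust the default path as needed.
bfg() {
  java -jar "${BFG_JAR:-$HOME/bfg.jar}" "$@"
}
```

With this in place, a call such as `bfg --delete-files '*.apk' my-repo.git` would remove every APK file from the history of a clone named my-repo.git (a placeholder name). By default BFG leaves the contents of the latest commit untouched, so a file that still exists in the current tree should first be removed with a normal commit.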
Running these commands removes the selected files from the history. Once that is done, force push the changes to your repo.
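The BFG documentation suggests expiring the reflog and running garbage collection before pushing, so that the removed files are actually dropped from the local repository. The steps can be sketched as follows, assuming the function is run inside the repository that BFG has just cleaned; `finish_cleanup` is a hypothetical name.

```shell
# Hypothetical helper: finish the cleanup after BFG has rewritten history.
finish_cleanup() {
  # Drop reflog entries that still reference the old, pre-rewrite commits.
  git reflog expire --expire=now --all || return 1
  # Repack and delete the blobs that are no longer referenced.
  git gc --prune=now --aggressive || return 1
  # Replace the remote history with the rewritten one. With default settings
  # this pushes only the current branch; other rewritten branches and tags
  # would need to be pushed as well.
  git push --force
}
```

Because the force push replaces history on the remote, it is worth coordinating with other contributors first, since their existing clones will no longer match.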
That’s it. This way you can keep the size of your repo under control by deleting the extra files that cause a repo to bloat, without erasing the commit history.
The entire discussion of this issue on the Phimp.me project (the second link under Resources) offers several other insights into the process, especially for repos with Android apps.
After the entire process, the size of Phimp.me was brought down from more than 600 MB to about 27 MB:
git clone https://github.com/fossasia/phimpme-android
Cloning into 'phimpme-android'...
remote: Enumerating objects: 6304, done.
remote: Counting objects: 100% (6304/6304), done.
remote: Compressing objects: 100% (4674/4674), done.
remote: Total 23695 (delta 4285), reused 3539 (delta 1622), pack-reused 17391
Receiving objects: 100% (23695/23695), 27.10 MiB | 6.19 MiB/s, done.
Resolving deltas: 100% (14248/14248), done.
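To keep an eye on the size after the cleanup, the packed size of a clone can be read from `git count-objects -v`, which reports it in KiB in the `size-pack` field. The sketch below turns that into a simple budget check; the function names and the 50 MiB default are hypothetical values chosen for illustration.

```shell
# Print the packed size of the current repository in KiB.
pack_size_kib() {
  git count-objects -v | awk '/^size-pack:/ { print $2 }'
}

# Hypothetical check: succeed only if the packed size is within a budget
# given in MiB (the 50 MiB default is an arbitrary example).
within_size_budget() {
  local budget_mib="${1:-50}"
  local size_kib
  size_kib="$(pack_size_kib)"
  [ "$size_kib" -le $((budget_mib * 1024)) ]
}
```

A call such as `within_size_budget 50 || echo "repository is over budget"` could then be run from time to time, or as part of a CI job.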
Resources
- BFG Repo-Cleaner: https://rtyley.github.io/bfg-repo-cleaner/
- Phimp.me issue #2820: https://github.com/fossasia/phimpme-android/issues/2820