Using GIT on FreeBSD; why and how

Hi gang!

Editorial

As I've mentioned a few (too many?) times in the past already, I'm a pretty avid Minecraft player, and last weekend that got me into contact with some GitHub repositories. One project provided an easy way to compile it, but unfortunately the process didn't work smoothly on FreeBSD. However, it was also easily fixed, so I figured I'd try to give something back to both said project and FreeBSD in general. That led to me doing a crash course in Git (and GitHub) to make this work (I had this on my todo list for quite a while already).

The whole process was quite impressive and although you won't see me mention anything about one product being better than the other I do have to admit that I see many good uses for Git. Even for system administrators such as myself! I'm well aware that there's already some solid documentation available (the gittutorial(7) for example) and it's all provided out of the box. But I was still looking for something a bit more focused on FreeBSD and perhaps also something a little more descriptive. I've become quite enthusiastic about Git (I just finished porting my local Subversion repositories to Git this evening) so I figured I'd share some of the fun. However... As always I'll do my best not to let bias slip into this piece, but some slight favoritism might be unavoidable ;)

What is Git? (general introduction)
Git is a VCS program, in other words a Version Control System. Version control means that you can work on a project (consisting of a file or a whole collection of them) and while you progress in updating said project you make sure to save and describe your changes along the way. This doesn't merely allow you to keep track of what you did in the past, it also allows you to undo any changes.

For example... Let's say we're working on a shell script and we decided to use a VCS. There are four points in time at which we saved our work:
  1. Added our script to the repository.
  2. Found a bug and fixed it.
  3. Added some much required documentation.
  4. Added a new feature: the script can now actually be stopped! o_O
So far, so good. Unfortunately we also discover a week after the last step that the bug we found in step 2 wasn't actually a bug, it was a feature! "oops". In the meantime we continued developing and even added a major new feature in step 4 which we don't want to lose. Now what?

You could consider reverting all your changes up to point 2 but then you'd lose an awful lot of work. But a good VCS can help deal with such disasters. Because it keeps track of the changes between every step it can also save and re-apply those changes elsewhere. For example: if you save the differences between steps 2 and 3 you'd effectively end up with all your documentation. Between steps 3 and 4 you'd get the new feature.

So all you'd basically need to do is go back to undo the bug, then re-apply the rest of the changes. Not the easiest of tasks (no matter the kind of VCS you're using) but still doable. But better yet: when done right you could even have prepared yourself for such a disaster by creating (and maintaining) multiple versions of your work, often referred to as branches.
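To make this a bit more concrete, here is a minimal sketch of that four-step story in Git. Everything in it is made up for illustration: the throwaway directory, the file names, the commit messages and the placeholder identity. It also assumes each step touches a different file, which keeps the undo free of conflicts; in real life you may have to resolve some by hand.

```shell
set -e
# Placeholder identity (an assumption) so commits work without any Git configuration.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

demo=$(mktemp -d)              # throwaway project directory
cd "$demo"
git init -q

echo 'echo "running"' > script.sh
git add script.sh && git commit -q -m "1. Added our script"

echo 'echo "running (fixed)"' > script.sh
git commit -q -a -m "2. Fixed a bug"

echo "How to use the script" > README
git add README && git commit -q -m "3. Added documentation"

echo 'echo "stopping"' > stop.sh
git add stop.sh && git commit -q -m "4. Added a stop feature"

# Undo only step 2. HEAD is step 4, so HEAD~2 is step 2.
git revert --no-edit HEAD~2 > /dev/null

git log --oneline              # the four steps plus one revert commit
cat script.sh                  # the script is back to its step 1 contents
```

Note that git revert doesn't erase step 2 from the history; it adds a new commit which applies the opposite change, so steps 3 and 4 stay untouched.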

So instead of merely trying to fix a bug "just like that" you could start by creating a new branch and applying your fixes in there. Then you'd switch back to the main branch and add your documentation and the new feature for the project. Then when all things work out in the end you could merge both branches together to form the end result we have above. Or... in our example you come to the conclusion that it wasn't a bug at all, so instead of merging you simply get rid of the "test branch" and continue with the main. Or if you did perform the merge already you could undo it after which only the bug fix gets removed while all your other work would stay intact.
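The branch approach could look roughly like the sketch below. The branch name bugfix, the file names and the placeholder identity are all assumptions for the sake of the example, and I pin the main branch name to master so the commands work regardless of your Git defaults.

```shell
set -e
# Placeholder identity (an assumption) so commits work without any Git configuration.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the main branch is called master

echo 'echo "running"' > script.sh
git add script.sh && git commit -q -m "Added our script"

git checkout -q -b bugfix                 # the suspected bug gets a branch of its own
echo 'echo "running (fixed)"' > script.sh
git commit -q -a -m "Fixed a bug"

git checkout -q master                    # other work continues on the main branch
echo "How to use the script" > README
git add README && git commit -q -m "Added documentation"

# If the fix turns out to be right:       git merge bugfix
# If it was a feature after all, drop the branch without merging it:
git branch -D bugfix > /dev/null          # -D, because bugfix was never merged
git branch --list
cat script.sh                             # master never saw the "fix"
```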

The one thing which makes Git so interesting for all this is that it is extremely easy to set up such a working repository of your own.

But let's start with the beginning...

Installing Git and setting up our first repository

Git is available through the Ports collection as devel/git and has quite a few configurable options. I strongly suggest enabling HTMLDOCS, especially if you run a webserver. Not only is the documentation (such as the user guide) quite good, this will also allow you to provide it on your local network. If you're currently using Subversion then you may also benefit from enabling SVN support.

Then some options which I'd personally turn off: Gitweb and CVS. There's nothing wrong with these options, but in my example we're going to use Git on the commandline, so there's no need for a web interface (see gitweb(1)). And even though I really enjoyed working with CVS back in the day it's all Subversion now. And I'd rather not add features which I know I'll never use. But as always: it's all up to personal preference.

Awesome documentation

One thing which Git definitely does better than Subversion is making its documentation a lot more accessible. It's my main gripe with Subversion: after not having used it for a while I start with svn(1) only to be told:
Run `svn help' to access the built-in tool documentation.

But.. but... I like having manual pages available! This allows me to set up the command line on my first virtual console and keep the manual page open in the second. Not to mention that man svn is a lot easier to type than: svn help checkout | less.

Fortunately for us Git does this "a little bit" differently.

First is the main manual page, git(1), which will refer you to all the other available information. From the tutorial to the previously mentioned user guide.

The second advantage is that Git provides manual pages for all its commands. For example, if you want to check the status of your current repository you'd use: git status. Therefore its manual page is git-status(1), easy huh? This is fully comparable to pkg(8); if you want to learn more about the available options when using pkg info then you'd check pkg-info(8), couldn't be easier! And easy is good because this also helps make this a lot more accessible.

Our first repository

So here comes the big change between Subversion and Git: unlike Subversion where everyone accesses the same repository Git actually works "decentralized". Or in plain English: Git sets up local repositories for you to use. If you need to commit a change to a remote repository (in Git terms this is called doing a 'push') then you would basically push the changes from your own local repository. So first you'd commit any changes to your local repository, then you'd push all of that onto the remote repository.

Another big difference, one I actually quite like, is that you don't need any fancy commands to set this up. Just find a directory which you'd like to put under version control. It doesn't even matter if the directory isn't empty. Then run: git init, and you're done. Want to know more about what this command does? Remember: git-init(1) will tell you all you need to know. But so will my tutorial ;)
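A quick way to convince yourself is the sketch below, which uses a throwaway directory and a made-up server.properties file (both are just assumptions for the example) so it won't touch anything you care about:

```shell
set -e
demo=$(mktemp -d)                        # throwaway directory
cd "$demo"
echo "motd=hello" > server.properties    # the directory doesn't need to be empty

git init -q                              # all this does is create the .git directory
ls -d .git
git status --short                       # prints: ?? server.properties
```

The existing file is left alone; Git merely reports it as untracked until you tell it otherwise.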

The init command creates an empty repository which is then immediately available to use. This is what sets this apart from Subversion where you'd have to create the repository, and then check out ('initialize') your local environment. With Git you simply start working right away. If you run git status you'll notice a few things.. It will mention that you're on branch master, it tells you that you haven't got any commits yet and finally if there are already some files in the directory then those will be listed in red as 'Untracked files':
Code:
unicron:/home/peter/snapshot $ git status
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        banned-ips.json
        banned-players.json
        crash-reports/
        eula.txt
        logs/
        minecraft_server.18w07c.jar
        minecraft_server.18w08b.jar
        ops.json
        runserver
        server.properties
        usercache.json
        whitelist.json
        world/

nothing added to commit but untracked files present (use "git add" to track)
So here we are, our own version controlled Minecraft server :cool:

The fun part is that we can easily tell Git to track the config files (such as server.properties, the *.json files and finally runserver which is a custom script I made), and then also totally ignore the jar files. We'll just leave the other files for what they are. At least for now.

First the easy part: adding new files to our repository. Couldn't be easier: git add server.properties *.json runserver. If we run git status again we'll see something completely different (I used the --short option to keep things easier to look at):
Code:
unicron:/home/peter/snapshot $ git status -s
A  banned-ips.json
A  banned-players.json
A  ops.json
A  runserver
A  server.properties
A  usercache.json
A  whitelist.json
?? crash-reports/
?? eula.txt
?? logs/
?? minecraft_server.18w07c.jar
?? minecraft_server.18w08b.jar
?? world/
As you can see it has marked all the files we added with an A in front, yet it doesn't know what to do with the others. But we're not done yet. Right now all we did is tell Git that we want it to track these files (in Git terms they are now 'staged'). But they haven't actually been recorded in the repository yet; for that to happen we'll need to commit our changes. So: git commit. You'll be taken into an editor (vi by default) and asked to add a commit message. The file you're editing will show you an overview of all the currently available files; both tracked and untracked. Type a description, save your edit and then we're done. Now we added our first files to our repository.
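If you'd rather not be dropped into an editor, git commit also accepts the message on the command line with -m. A minimal sketch of the add and commit cycle, using a throwaway directory, made-up files and a placeholder identity (all assumptions for this example):

```shell
set -e
# Placeholder identity (an assumption) so the commit works without any Git configuration.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

demo=$(mktemp -d)
cd "$demo"
git init -q
echo "motd=hello" > server.properties
echo "[]" > ops.json
echo "not really a jar" > server.jar

git add server.properties ops.json       # stage two files; server.jar stays untracked
git status --short                       # two lines starting with A, one with ??
git commit -q -m "First init of the snapshot server."   # -m skips the editor
git status --short                       # prints: ?? server.jar
git log --oneline
```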

If we run git status again you'll notice that we're still on branch master, and the only thing Git talks about are all the other untracked files. There are several ways to handle those. First we can simply ignore this and also tell Git not to show this to us. The option for untracked files is -u and it accepts three values: no (don't show untracked files), normal (default, show untracked files in the current directory) and all (show all untracked files throughout your entire project, which can be extremely slow).

This feature is obviously ideal for developers who work with a project where every file is a part of the whole thing. For us it's only a bit of a hindrance, but one which is very easily remedied. We could use: git status -uno after which no untracked files will be shown.

Another option is to tell Git to permanently ignore these files. We have several options for this, and the easiest is .gitignore (see also gitignore(5)). If we check its manual page we'll learn 3 things: there are patterns being used (shell-style glob patterns rather than regular expressions), we can also use $PROJECT_DIR/.git/info/exclude to ignore entries throughout the entire project without sharing those rules with others, and finally there's also a configuration option available called core.excludesFile.

Now, for starters I only want to ignore the server jarfiles, so I'm going to add this to .gitignore:
Code:
/minecraft_server.18w???.jar
So now we get to see something different:
Code:
unicron:/home/peter/snapshot $ git status -s
?? .gitignore
?? crash-reports/
?? eula.txt
?? logs/
?? world/
Both jarfiles are gone, but now we have the .gitignore file itself to deal with ;) When working with a repository which is actually a "usable project" then I often simply add the ignore file to the repository too so that this behavior will also be used on other clients. If you don't want this then you could consider editing the previously mentioned exclude file.
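If you're ever unsure which rule is hiding a file, git check-ignore will tell you. Below is a small sketch showing both the shared .gitignore and the private exclude file at work; the throwaway directory and the empty stand-in files are assumptions made for the example.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
touch minecraft_server.18w07c.jar minecraft_server.18w08b.jar eula.txt

echo '/minecraft_server.18w???.jar' > .gitignore   # can be committed and shared
mkdir -p .git/info
echo 'eula.txt' >> .git/info/exclude               # stays private to this repository

# -v shows the file, line number and pattern responsible for each match
git check-ignore -v minecraft_server.18w07c.jar eula.txt
git status --short                                 # prints: ?? .gitignore
```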

Git configuration

We could also re-configure Git by changing its current configuration. This is explained in more detail in git-config(1).

Git knows 3 different configuration levels, and a more specific level supersedes a more general one:
  • Local; this is basically the configuration as defined by the repository itself. The used filename is: $PROJECT_DIR/.git/config.
  • Global; this configuration file determines how Git should behave for the current user. Keep in mind though that a local configuration will supersede a global one. The used filename is: $HOME/.gitconfig.
  • System; this configuration file determines the behavior for the entire system. So all users (and processes) which run Git. As before it will be overruled by another level mentioned above. The used filename on FreeBSD is: /usr/local/etc/gitconfig.
The best part is that you can use git config -e to edit the configuration file. It will use the local configuration by default, but you can also specify --global or --system. Now, I personally prefer this option because I'm used to messing around in config files, but if you don't like the idea you could also simply set (or remove) the individual options yourself.

For example, if we don't want Git to show any untracked files then we could consider setting the right option for it. Easily done. If you check the git-config(1) manual page you'll soon discover status.showUntrackedFiles. If we want to set this option to no we'd use: git config --add status.showUntrackedFiles no, and done. The change will be applied immediately:
Code:
$ git status
On branch master
nothing to commit (use -u to show untracked files)
And if you look at the actual configuration file (.git/config) then you'll quickly see why it's also easy to edit this manually:
Code:
$ less .git/config
[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
[status]
        showUntrackedFiles = no
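To see the levels interact, here is a sketch which sets the same option at two levels and then asks Git which value it will actually use. It points HOME at a throwaway directory (an assumption made purely so the example doesn't touch your real ~/.gitconfig). Also note that it uses the plain git config <name> <value> form, which replaces an existing value, whereas --add appends another line.

```shell
set -e
HOME=$(mktemp -d); export HOME       # throwaway home, so your real ~/.gitconfig is untouched
unset XDG_CONFIG_HOME                # make sure --global really means $HOME/.gitconfig
demo=$(mktemp -d)
cd "$demo"
git init -q

git config --global status.showUntrackedFiles all   # ends up in $HOME/.gitconfig
git config --local  status.showUntrackedFiles no    # ends up in .git/config

git config --get status.showUntrackedFiles          # prints: no (the local level wins)
git config --global --get status.showUntrackedFiles # prints: all
```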
So far, so good. We have a local repository and if other people on the same server have access to our home directory then they can easily make a copy for themselves. For example, consider this command: git clone /home/peter/snapshot mcsnap.

If you use Git then you don't check out a repository as with Subversion, instead you actually create a new one of your own. You're literally forking the whole project:
Code:
% git clone /home/peter/snapshot mcsnap
Cloning into 'mcsnap'...
done.
% cd mcsnap/
% git log
commit b88a24c97937d84d5eb09df6b672627a7900a005 (HEAD -> master, origin/master, origin/HEAD)
Author: ShelLuser <pl@intranet.lan>
Date:   Wed Feb 28 21:03:25 2018 +0100

    Setting it up for the server jarfiles to be ignored.

commit 4d0d108dfdb1639cc4d6f343b5f9888c9120fef6
Author: ShelLuser <pl@intranet.lan>
Date:   Wed Feb 28 20:43:26 2018 +0100

    First init of the snapshot server.
% ls -F
banned-ips.json         runserver*              whitelist.json
banned-players.json     server.properties
ops.json                usercache.json
(for the record: this was obviously not done as root, just using the C shell)

So not only did I copy all the files, I also copied the entire backlog. This can both be a pro as well as a con, for obvious reasons.

Making changes

So how do we deal with changes? It's one thing to share something with others so that they can grab it for themselves. But let's say for the sake of argument that I don't have write access to /home/peter/snapshot, but I do have an important change which I think the repository owner should consider.

Well, easy!

First I make the change. I decided to edit ops.json and remove an entry. If we look at git status it will now tell us that we have a modified file. Now, there are 2 good ways to add the changes to your repository. If, in the situation above, the whole directory is a part of the repository then you can simply add all files with: git add . (note the trailing dot). This will check the current directory (and subdirs.) and then add all new and changed files.

But what about my first example? Where the repository is actually part of an environment which is actively in use (including several untracked files)? Also easy! Just tell Git that it should only add updated files: git add -u.
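The difference is easy to see in a small experiment. The sketch below uses a throwaway directory, made-up file contents and a placeholder identity (all assumptions): one tracked file gets modified, one new file appears, and only the tracked one gets staged.

```shell
set -e
# Placeholder identity (an assumption) so the commit works without any Git configuration.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

demo=$(mktemp -d)
cd "$demo"
git init -q
echo "[]" > ops.json
git add ops.json && git commit -q -m "Track ops.json"

echo '[{"name":"someone"}]' > ops.json   # change a tracked file
echo "log line" > server.log             # create a new, untracked file

git add -u                               # stages ops.json only
git status --short                       # "M  ops.json" and "?? server.log"
```

Had I used git add . here instead, server.log would have been staged as well.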

Now we'll commit our changes and then we'll see something like this:
Code:
% git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)
So what is happening here? As I showed you above I cloned this repository from an original: /home/peter/snapshot (called origin above). I then made several changes but those were only committed to my own, local, repository. This is also clearly visible when checking the log using git log:
Code:
% git log
commit 2eb95bf7ddcc87d6f4c32fc7f15c58626ede149f (HEAD -> master)
Author: TheOtherShell <peter@localhost>
Date:   Wed Feb 28 21:57:56 2018 +0100

    Removed AyanamiKun as operator.

commit b88a24c97937d84d5eb09df6b672627a7900a005 (origin/master, origin/HEAD)
Author: ShelLuser <pl@intranet.lan>
Date:   Wed Feb 28 21:03:25 2018 +0100

    Setting it up for the server jarfiles to be ignored.

commit 4d0d108dfdb1639cc4d6f343b5f9888c9120fef6
Author: ShelLuser <pl@intranet.lan>
Date:   Wed Feb 28 20:43:26 2018 +0100

    First init of the snapshot server.
So my repository is now newer than the original. If I had access then I could push my changes onto the remote. But I don't. So we need to do something else: instead of pushing our changes we're going to do this the other way around: we'll save our changes and then request the remote operator to pull these instead.

How? By using the git request-pull command! Which I'll explain in part II (message was too long ;) ).

I don't know about you but I really like it when an environment uses to-the-point yet still related terms like this. Pulling and pushing, easy! It always reminds me a bit of Java: we have Java as an environment, it uses jar files which basically store its programs and an often used programming methodology (coding standard) is to use so called 'beans' (JavaBeans). So the beans go in the jar which then make up for the Java program (to put this in an extremely simplified example).
 
Part II

Requesting a pull

So back to our updates... In my example above we're both working on the same server, so we both have read-only access to both our repositories. Therefore I'm using this to generate the request: git request-pull master /home/peter/temp/mcsnap | mail -s "Got an update for you!" peter (mailing myself, that's a new one :D ).

I noticed plenty of confusion about pull requests, and I suspect GitHub to be a part of that. A pull request is nothing more than what its name implies: a request saying as much as "Ey! Please check out my repository to see if you like it!". Nothing more, nothing less.

So basically I get a message which tells me that my alter-ego has made some changes for my repository and that these are available in their repository, which can be accessed through /home/peter/temp/mcsnap. So let's give this a try...

I'm ready to check out what disaster my alter-ego managed to cook up. I know the repository URL: /home/peter/temp/mcsnap, I also know their commits are newer than mine, so.. First I'm going to make it easier on myself. I'll add a remote entry called shell which will point to the updated repository of my alter-ego. I use: git remote add shell /home/peter/temp/mcsnap. Note that this is optional, but it will make it easier to access. A mere git remote will show that something has been added.

So now I'm ready to try and fetch the updates:
Code:
$ git fetch --dry-run shell
From /home/peter/temp/mcsnap
* [new branch]      master     -> shell/master
This looks safe enough, so let's do this. I issue the same command again (this time without the dry run) and what do you know? Nothing happened! :D This is true Unix behavior we're working with: no news is good news.

So now that I fetched the updates I need to issue another command to activate them:
Code:
$ git checkout shell/master
Note: checking out 'shell/master'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 2eb95bf Removed AyanamiKun as operator.
git-checkout(1) is used to switch between branches or to restore files in your current working directory (undoing changes, so it's nothing like Subversion's checkout command).

So here we are. I now have the same status as the remote repository. But, and this is important, my own work is still fully safe:
Code:
$ git branch --list
* (HEAD detached at shell/master)
  master
Remember: master is my own setup, whereas shell/master is what I fetched from the remote repository.

From here on I can do anything I want. First I'm going to keep these changes as a test (experimental) branch, and all I have to do is follow the instructions which Git already gave me: git checkout -b test. Now I have 2 branches: test, which is the work of my alter-ego. And master which is obviously the fully sane setting ;)
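For reference, the whole sequence (add a remote, fetch, keep the result as a test branch) can be replayed with the sketch below. It rebuilds both repositories in a throwaway directory with made-up contents and a placeholder identity, all of which are assumptions, and it takes a small shortcut: git checkout -b test shell/master creates the branch straight from the fetched work, without the detached HEAD step in between.

```shell
set -e
# Placeholder identity (an assumption) so commits work without any Git configuration.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

work=$(mktemp -d)
cd "$work"
git init -q snapshot
cd snapshot
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
echo "[]" > ops.json
git add ops.json && git commit -q -m "First init"

cd "$work"
git clone -q snapshot mcsnap               # the alter-ego's copy, with one extra commit
( cd mcsnap && echo "[ ]" > ops.json && git commit -q -a -m "Removed an operator" )

cd "$work/snapshot"
git remote add shell "$work/mcsnap"
git fetch -q shell                         # downloads the commits, changes no files
git checkout -q -b test shell/master       # local branch "test" holding the fetched work
git branch --list
git log --oneline master..test             # what test has and master doesn't
```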

And because we both work with our own local repository there's no risk of work suddenly ending up in /dev/null. Even if I don't like the changes and decide to delete them then TheOtherShell will still have access to all his hard work, which I think is only fair and really makes sense.

Setting up a shared repository
So now we've seen how you can easily pull stuff from other repositories, even into an existing one. All it takes is read (or readonly) access. But what about setting up a full repository which can actually be used by remote clients (so also used to push updates)?

First it's important to note that Git itself does not bother itself with authentication. So if you want more fine grained control then you'll need to set that up yourself. Keep this well in mind!

For example: on my LAN environment I have set up a few repositories and made these available using Git daemon. I then instructed my firewall to only give my (local) workstation access to the Git daemon service. So only my workstation as well as some local users (members of the git group) can actually push updates.

For everything else I've set up Apache to host the repositories as well, but obviously through HTTP and readonly.

I've also added a bit extra with specific permission bits and stuff like that, but that's beyond the scope of this tutorial.

Just be very careful when you set this up because, once again: Git doesn't know anything about authentication. Therefore I would only suggest using Git daemon for a local (trusted!) LAN. Otherwise use a web server and provide read/write access through use of SSH for everyone else. This will also give you much better control over who can access what repository.

Setting up Git daemon
This is actually quite easy but to get the most out of this you'll need to configure it. The only problem is that the port's rc script doesn't really account for this: if you set your own commandline options then you also need to include the ones from the original rc script (/usr/local/etc/rc.d/git_daemon) yourself.

In my situation I don't want to give away specific directory locations so all repositories will be hosted from one specific root location. Here is how you can set this up (in /etc/rc.conf):
Code:
# Git
git_daemon_enable="YES"
git_daemon_user="git"
git_daemon_group="git"
git_daemon_directory="/usr/local/var/db/git"
git_daemon_flags="--syslog --reuseaddr --detach --base-path=/usr/local/var/db/git"
What this does: this will ensure that the Git daemon will only be working with /usr/local/var/db/git and nothing else. So if I have the following repository: /usr/local/var/db/git/scripts, then this can be accessed for fetching/pushing using: git://git.intranet.lan/scripts. No need for messy path names, just the root directory will suffice.

So how to get a repository into this shared setup? Also simple... Always keep in mind that pretty much everything around Git involves cloning and creating local repositories, and this is no different.

First I'm going to switch my previous repository back to reflect my own branch: git checkout master. So I'm switching my working copy back to the master branch, which was also the main branch (with my own additions).

Then I'll be making a clone of my repository, but a so called bare one. In other words: I only want the repository directory (.git) and not so much the rest of the files. For that I use this command: git clone --bare . ../snapshot.git. So what this does is that it will actually copy the .git directory to ../snapshot.git, which will become the actual repository.
Code:
unicron:/home/peter $ ls -d snap*
snapshot/       snapshot.git/
unicron:/home/peter $ ls snapshot.git/
HEAD            config          hooks/          objects/        refs/
branches/       description     info/           packed-refs
See? So now there's something you need to know about Git daemon: it will only provide access to a repository if that has been explicitly unlocked for sharing. git-daemon(1) will tell you more about this as well, but for now all we need to do is create a so called semaphore: touch snapshot.git/git-daemon-export-ok. This assures the daemon that this repository may be shared.
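Both steps together, as a sketch you can run in a throwaway directory (the repository contents and the placeholder identity are assumptions made for the example):

```shell
set -e
# Placeholder identity (an assumption) so the commit works without any Git configuration.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

work=$(mktemp -d)
cd "$work"
git init -q snapshot
( cd snapshot && echo "[]" > ops.json && git add ops.json && git commit -q -m "First init" )

git clone -q --bare snapshot snapshot.git            # repository data only, no working files
touch snapshot.git/git-daemon-export-ok              # marker file which git daemon looks for
git -C snapshot.git rev-parse --is-bare-repository   # prints: true
ls snapshot.git
```

Notice that ops.json itself is nowhere to be found in snapshot.git; its contents live inside the objects directory.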

So now all we have to do is actually move this into the right directory. When moving then you might also want to make sure that the UID which executes the daemon has the required permissions. In my case I also need to do: # chown -R git snapshot.git.

Now, let's try this out:
Code:
$ git clone git://git.intranet.lan/snapshot.git snap12
Cloning into 'snap12'...
remote: Counting objects: 13, done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 13 (delta 3), reused 0 (delta 0)
Receiving objects: 100% (13/13), done.
Resolving deltas: 100% (3/3), done.
So far, so good. I checked out the repository again (in a new directory) and it seems to be working. If I ask more information about the remote host then you'll see that it knows all it should:
Code:
$ git remote show origin
* remote origin
  Fetch URL: git://git.intranet.lan/snapshot.git
  Push  URL: git://git.intranet.lan/snapshot.git
  HEAD branch: master
  Remote branches:
    master tracked
    test   tracked
  Local branch configured for 'git pull':
    master merges with remote master
  Local ref configured for 'git push':
    master pushes to master (up to date)
Pretty cool right? So now I'm going to try and push a few changes, and what do you know:
Code:
$ git push
fatal: remote error: access denied or repository not exported: /snapshot.git
This is another reason why I've become so fond of Git in the past days. It doesn't "just" allow people to mess with your stuff unless you explicitly tell it to. In this case I need to tell Git that it should accept updates for this repository. How to do that? Using the same configuration options I mentioned earlier; those which are used throughout the environment.

When we take another closer look at git-daemon(1) you'll notice it mentions the receive-pack service; enabling it tells the Git daemon to allow incoming updates. Once again I need to urge you to be careful because Git doesn't discriminate: once this option has been enabled, everyone can perform updates. In my case that's exactly what I want, and thus I go into the repository: cd /usr/local/var/db/git/snapshot.git, and then issue this command: git config --local --add daemon.receivepack true.
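If you want to check what this setting looks like before touching a real repository, here's a tiny sketch against a throwaway bare repository (the path is an assumption; no daemon is involved, so this only shows the configuration side):

```shell
set -e
repo="$(mktemp -d)/snapshot.git"                  # throwaway stand-in for the shared repository
git init -q --bare "$repo"

git -C "$repo" config daemon.receivepack true     # stored in the repository's own config file
git -C "$repo" config --get daemon.receivepack    # prints: true
cat "$repo/config"                                # note the new [daemon] section
```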

And once I got that out of the way I can fully use the repository, even from a remote location and even to push updates.

But one final trick remains... Remember how my alter-ego had cloned my repository and applied changes (which I placed into a separate branch of its own)? So now would be a good time for him to 'connect' his repository to the main one and then push his updates through.

Also very easily done:
Code:
% git remote add public git://git.intranet.lan/snapshot.git
% git push public
To git://git.intranet.lan/snapshot.git
! [rejected]        master -> master (fetch first)
error: failed to push some refs to 'git://git.intranet.lan/snapshot.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
So first I added a new remote entry and called it public, I then tried to push my changes onto the remote repository but that didn't work out for reasons mentioned above. When I perform a new pull and then check the logs I can quickly see what is going on here:
Code:
commit f313bdf9286e6519d09edeba8e4f48b14a398121 (HEAD -> master)
Author: TheOtherShell <peter@localhost>
Date:   Thu Mar 1 01:26:51 2018 +0100

    Removed AyanamiKun as operator

commit c4c93766f7440e9b54d45c1a4896d5246235d47d (public/master)
Author: ShelLuser <pl@intranet.lan>
Date:   Thu Mar 1 01:08:19 2018 +0100

    Made a few changes in the properties file.

commit b88a24c97937d84d5eb09df6b672627a7900a005 (origin/master, origin/HEAD)
Author: ShelLuser <pl@intranet.lan>
Date:   Wed Feb 28 21:03:25 2018 +0100

    Setting it up for the server jarfiles to be ignored.
Now, this may seem extremely complicated at first, and in a way it is. But what I'm trying to show you guys is how extremely well Git manages to keep your data safe. All of it. So here HEAD -> master is now the main (local) branch I'm working with, it's here where I applied those unwanted changes.

But I'm also tracking public/master which is a branch from the public repository, the one I should be able to push onto, but because several commits have already been made that has now become an issue.

And finally we still have origin/master, this is actually the original repository (/home/peter/snapshot) which we initially cloned.

So basically, my alter-ego made a mess ;) But neither of us have lost any precious data or work.
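For completeness: getting out of such a mess usually boils down to doing what the hint says, which means integrating the remote work first and pushing afterwards. The sketch below replays the whole thing in a throwaway directory. The repository names, file contents and placeholder identity are assumptions, it uses a local bare repository instead of the Git daemon, and it assumes both sides changed different files so that the merge doesn't conflict.

```shell
set -e
# Placeholder identity (an assumption) so commits work without any Git configuration.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

work=$(mktemp -d)
cd "$work"
git init -q seed
( cd seed && git symbolic-ref HEAD refs/heads/master \
    && echo "[]" > ops.json && git add ops.json && git commit -q -m "First init" )
git clone -q --bare seed public.git        # stands in for the shared repository
git clone -q public.git owner
git clone -q public.git other

# The owner pushes first...
( cd owner && echo "pvp=false" > server.properties && git add server.properties \
    && git commit -q -m "Made a few changes in the properties file." \
    && git push -q origin master )

# ...so the other clone is now behind, and its push gets rejected.
cd other
echo "[ ]" > ops.json
git commit -q -a -m "Removed an operator"
if git push -q origin master 2>/dev/null; then first_push=accepted; else first_push=rejected; fi
echo "first push: $first_push"

git fetch -q origin
git merge -q --no-edit origin/master       # integrate the owner's commit
git push -q origin master                  # now the remote can simply fast-forward
git log --oneline --graph
```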

If I try the same thing with my own repository (so /home/peter/snapshot, which is also the one I based the remote repository on) then I get a completely different result. Now I can easily push my changes onto the remote because the copies didn't really differ all that much. I add the remote repository git://git.intranet.lan/snapshot.git using the alias inet and then push my changes, but also set the upstream for my current branch to this remote repository:
Code:
$ git push --set-upstream inet master
Branch 'master' set up to track remote branch 'master' from 'inet'.
Everything up-to-date
In other words: I tell my local repository that it should use this remote location as a master (the upstream). And from this point on I can work with both my local repository as well as push updates onto the remote (inet) and all it took were a few specific Git commands.

Summing up
If you're working with local projects (for example a collection of shell scripts, or maybe your configuration files in /usr/local/etc) and you want to keep more control over those files then Git might be just the thing you need. Simply initialize your repository using git init, then add one or more files (or the whole directory) using git add and finally commit your changes with git commit. After that you can keep track of everything you change. Or maybe do what I do: use Git to 'push' these files onto other servers.

The sky really is the limit here. For example, I currently keep all custom kernel configuration files stored in one Git repository, where I use different branches to keep track of 9, 10 and 11; this way I can always check older configuration files without having them clutter up my home directory.

Just remember that everything is a repository within Git. So once you have read-only access to a repository then you can clone it and do with it as you please. And because everyone can pull updates from everyone else (provided you have read access of course) this approach can also be used to send updates to a repository. The original owners remain in full control, you don't get to touch their code directly, but you can still participate and help them out!

Most of all I think it's very important to keep in mind that Git will do its utmost best to keep your data safe. Even if you make a complete mess of things (see my last example above) and have to deal with "griefers" then Git will still manage to protect the integrity of your work.

But most of all, the option which really won me over, is the documentation. The Git suite provides manual pages for just about everything, so just like with FreeBSD itself you can basically teach yourself Git while not ever having to leave your command line.

Of course it's not perfect. There are some caveats here and there. For example: everything being a repository can have its drawbacks. When using Git within NetBeans for example the plugin seems to insist on only tracking changes for my local (private) repository. But if I want to know more about the status of the publicly shared one then I'll have to perform some extra manual steps. That's not always desirable (but this may be configurable).

And of course there is the size of the repositories. If you're only interested in keeping track of the source code then Git might not be the best of ideas for you, because you're not just getting the source code: you're downloading the entire repository backlog as well.

I can well imagine that this can be very useful for developers because they can look back at a project's history and maybe learn from that. But for someone who is only interested in getting the latest source code to build against... it could become an extra hindrance: you basically have to download more "bloat".

So from that perspective I also believe that there's definitely still something to be said for using Subversion as well, especially with bigger projects such as the FreeBSD base system (but that's also on my todo list: trying to grab CURRENT using both Git and Subversion and then comparing those against each other).

Still, having said all that.

At the time of writing I'm preparing to move away from devel/subversion for my local needs and instead will rely on SVNLite from here on (the WITH_SVN option in /etc/src.conf) for keeping track of the FreeBSD source code and documentation. For everything else I've pretty much switched to Git ;)

Very extensive, but at the same time also very easy as well.

Hope this can be useful for some of you!
 
I have only skimmed your extensive guide and bookmarked it for later. I will just add that for the past few months I have been doing ports work with Git and committing with git svn dcommit. This setup with Subversion managing the central repository and Git as the client is a decent workflow.

The good:
  • local branches (my main motivation for using Git)
  • committing with git svn dcommit is nice because it basically just does what you want (sends all Git commits on the current Git branch that haven't been pushed to the Subversion repository; git svn dcommit and be done)
  • Subversion as the source of truth (repository history does not change)
  • developers can choose whichever client they prefer (git or svn)
The bad:
  • Subversion history can be lost when committing with git svn dcommit and directories are moved (this is the biggest drawback for me)
  • dealing with Subversion branches with Git is apparently difficult (I haven't had to deal with this)
  • your local history can change because you have to always rebase and never merge (I don't mind this so much)
https://wiki.freebsd.org/GitWorkflow
 
git is fine when you have a lot of developers checking code in/out and you want everyone to have that ability. If you want a central repository and better control over who gets to commit, review of the commit, and when, subversion is the best bet, especially for a lone developer or small outfit.
 
drhowarddrfine You can have complete control over who can commit to a certain branch if you use GitLab or GitHub. Typically if you want access control, you split your repository accordingly into multiple repositories (this can be a good thing in many cases) and give access to them selectively. You can even give read-only access (you can achieve this with Unix permissions if you allow Git usage over SSH). There are also the git subtree commands which can make it easy to do this and get the history of files into a new repository.

It also makes it easy to work across multiple machines (I use it for my Emacs configuration; it also makes it trivial to test your code in different environments) and makes it trivial to back up your code. I would also recommend git because it's supported everywhere (with official support even in Visual Studio).
 
Preetpal What I'm trying to say is, if you want a central repository and the fewest number of people mucking around with it, subversion is the way to go. It's why FreeBSD and many other large projects use it. When you have small groups and lone developers sharing code in a more intimate way, git might be better to use.
 
Part III (also last part)

Git for system/network administrators
Disclaimer: The steps taken in this tutorial are potentially dangerous because if your permission settings are improperly set up then you risk exposing whatever you put into your repository with others. And considering the fact that we're talking about configuration files... I'll briefly address this issue further below, but please keep in mind that the main focus of this tutorial is Git, and not so much system security.

Editorial
I'm a command line nut. That is also what my nick is based on. As a result I spend hours on fine tuning the servers in my networks. Yet sometimes it can happen that a specific configuration file gets so many changes over time that when looking back you wouldn't mind having another peek at that specific setup you had when (for example) "IRC server X was still part of the network". Of course that situation was 2 years ago and we don't keep backups for that long.

But a VCS wouldn't have much of an issue with keeping a retention so long. Even 5 years shouldn't be too much of a problem, especially not for something fully dedicated to configuration files.

This has been on my Subversion todo list for years, but I never got around to actually doing it. Until this week...

The Git system configuration repository
The idea is very simple: one main repository branch (master) will be used for the main repository files; documentation files which roughly explain how to use all this. Trust me: you may not need this now, but in 6 months time (or when your PFY gets involved) you'll be glad you did.

Next we'll set up several branches where each branch will be a configuration setting of its own. This will eventually allow us to easily switch between environments with merely one single command. Don't worry if this doesn't make sense to you right now, we'll get there!

Step 1 => Creating the main repository
Start by initializing a secured repository. In other words:
  • mkdir etc && chmod 700 etc
  • cd etc && git init
Now we have an empty repository which is only accessible by the current user. Start by editing .git/description and roughly explain what this repository will be about. Or don't, there's also something to be said for security through obscurity, just don't rely on it too much.

First we make a README file which briefly explains the basic idea behind this repository. Example:
Code:
Server configuration repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This repository consists of several branches which each contain
a specific configuration aspect of the server. What follows is
the main overview and a brief description per branch.

## How to access a configuration branch
Use: git checkout -b <local name> origin/<branch>

## How to get an overview of all available (remote) branches
Use: git branch -r

## Summary of currently available branches

HEAD     - The main branch; contains the README files (aka origin/master).
pdns     - PowerDNS configuration.
unreal   - Unreal IRCd configuration.
As you can see I'm getting a little ahead of myself but that's ok. The main idea is to keep this file updated whenever we add a new configuration branch. Just add this for now: git add . && git commit -m "Initial setup".
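Put together, step 1 could be sketched like this. It assumes a temporary base directory instead of your home directory, and the README is shortened to a few lines for the example.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

base=$(mktemp -d)            # stand-in for your home directory
cd "$base"

# A project directory that only the current user can enter.
mkdir etc && chmod 700 etc
cd etc && git -c init.defaultBranch=master init -q

# Optional: describe what the repository is about.
echo "Server configuration repository" > .git/description

# Shortened README; the real one lists every configuration branch.
cat > README <<'EOF'
Server configuration repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
HEAD     - The main branch; contains the README files.
EOF

git add . && git commit -q -m "Initial setup"

ls -ld "$base/etc"           # should show drwx------
```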

Step 2 => Making the repository available for local administrators (and securing it)
I have a main repository location in /usr/local/var/db/git so that's what we'll be using. The repositories can only be accessed by members of the git group, including the Git daemon (restricted to the local network) and the Apache webserver.

So: git clone --bare . ../config.git, then secure this directory and move it into the main Git base directory. With securing I'm mainly referring to: # chown root:wheel config.git && chmod -R o-rwx config.git. So right now this repository can only be accessed on the local machine by the admins (read-only). Git (git:git) has no access nor has Apache (www:www).

Customize as needed.

In my situation I opened this up for Apache, applied HTTP authentication and relied on curl to do the rest for me on the client (Curl is used by Git). Also did something with extended permission bits.
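A rough sketch of the clone, secure and move sequence from this step. The directory $base/git is a hypothetical stand-in for /usr/local/var/db/git, and the chown is left commented out because it needs root.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

base=$(mktemp -d)
gitbase="$base/git"          # stand-in for /usr/local/var/db/git
mkdir "$gitbase"

# The project repository from step 1, reduced to a single file.
mkdir "$base/etc" && chmod 700 "$base/etc"
cd "$base/etc"
git -c init.defaultBranch=master init -q
echo "Server configuration repository" > README
git add . && git commit -q -m "Initial setup"

# Create the bare copy that will serve as the main repository.
git clone -q --bare . ../config.git
cd ..

# chown root:wheel config.git    # as in the text above; requires root
chmod -R o-rwx config.git        # nobody outside owner and group gets in
mv config.git "$gitbase/"

ls -ld "$gitbase/config.git"
git --git-dir="$gitbase/config.git" log --oneline
```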

Step 3 (optional, but suggested) => Making repository access slightly easier
The only problem we have now is the requirement to remember /usr/local/var/db/git/config.git which is quite a mouthful. Maybe easy for now, harder in 6 months time. So I'm going to configure a Git remote entry called repo which can be used to gain access to all this: git config --global --add remote.repo.url /usr/local/var/db/git/config.git. By using --global I make sure that this setting is only for the local user. Of course it might make sense to use --system instead if you're going to share this setup with others.

Only problem is that git clone expects a path or URL and won't look up a configured remote name such as repo. So you can't simply do: git clone repo etc out of the blue. But you can do this:
  • git init etc && chmod 700 etc && cd etc
  • git pull repo
Code:
% git pull repo
remote: Counting objects: 16, done.
remote: Compressing objects: 100% (15/15), done.
remote: Total 16 (delta 4), reused 0 (delta 0)
Unpacking objects: 100% (16/16), done.
From /usr/local/var/db/git/config
* branch            HEAD       -> FETCH_HEAD
Just use ls and you'll see your README file reappear.

If you want to go easy then simply use this new repository from here on. But to do this the right way we'll 'link' our main project (~/etc) to repo (if you followed this step) or /usr/local/var/db/git/config.git (if you didn't).

If you added the repo entry above: git push --set-upstream repo master.
If you didn't add repo: git push --set-upstream /usr/local/var/db/git/config.git master.
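The whole of step 3 can be tried safely with a sketch like the one below. It points HOME (and XDG_CONFIG_HOME) at a temporary directory so that --global does not touch your real configuration; the paths under $demo are stand-ins for /usr/local/var/db/git/config.git and ~/etc.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

demo=$(mktemp -d)
# Assumption: a scratch HOME keeps the real ~/.gitconfig untouched.
export HOME="$demo/home" XDG_CONFIG_HOME="$demo/home/.config"
mkdir "$HOME"

# Stand-in for the central config.git, seeded with a README.
git -c init.defaultBranch=master init -q "$demo/seed"
cd "$demo/seed"
echo "Server configuration repository" > README
git add README && git commit -q -m "Initial setup"
git clone -q --bare "$demo/seed" "$demo/config.git"

# The shortcut: a remote called repo, stored in the user's global config.
git config --global --add remote.repo.url "$demo/config.git"

# Fresh, secured project directory that pulls from the shortcut.
git -c init.defaultBranch=master init -q "$demo/etc"
chmod 700 "$demo/etc" && cd "$demo/etc"
git pull -q repo
ls                                   # README shows up again

# Link this project to the central repository.
git push -q --set-upstream repo master
git config branch.master.remote      # prints the name of the upstream remote
```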

Step 4 => Adding a configuration branch
Now we're making progress!

We're going to add /usr/local/etc/Unreal. This will be made available as repo/unreal (or origin/unreal).

First the usual steps:
Code:
% git init
Initialized empty Git repository in /usr/local/etc/Unreal/.git/
% git add .
% git commit -sm "Added initial configuration."
18 files changed, 2787 insertions(+)
create mode 100644 aliases/aliases.conf
create mode 100644 aliases/anope.conf
create mode 100644 aliases/atheme.conf
      ...<SNIP>...
create mode 100644 ircd.rules
create mode 100644 spamfilter.conf
create mode 100644 unrealircd.conf
So here I quickly added all the local configuration files. I also signed off on this (-s), this adds an extra mention in the logs which will help me separate an official addition ("milestone") from a regular maintenance add.

If you still need to configure this, or if you're in the process of doing so, then I strongly suggest tagging this first commit for easier access at a later time: git tag start.

Now comes the cool part: pushing this right into a new branch on repo: git push -u repo +HEAD:refs/heads/unreal. It's unreal indeed :cool: If you didn't add repo as I suggested then you'll have to swap that out for that boring long directory name which I'm no longer going to repeat.
Code:
% git push -u repo +HEAD:refs/heads/unreal
Counting objects: 20, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (19/19), done.
Writing objects: 100% (20/20), 26.70 KiB | 701.00 KiB/s, done.
Total 20 (delta 7), reused 0 (delta 0)
To /usr/local/var/db/git/config.git
* [new branch]      HEAD -> unreal
Branch 'master' set up to track remote branch 'unreal' from 'repo'.
So what happened here? Easy: Git being as flexible as Subversion can be stubborn sometimes.

When performing push or pull commands Git gives you full control over what to do through the use of refspecs. See git-push(1) for a good explanation, but roughly it boils down to telling Git what local branch or commit needs to be used. In this case that's HEAD, mentioned before the colon. Then we specify where it needs to be sent off to on the remote.

Fun fact: the official name within Git of your main ("master") branch is actually: refs/heads/master. Check the .git directory for yourself if you need to. So a new branch simply has to be placed there as well, that's how I came up with the refspec used above.

Because I also used -u Git automatically set this remote branch (repo/unreal) as the main upstream for our local repository. So right now we can continue to configure stuff, commit and then optionally push it to the main repository.
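Here is a small sketch of the refspec push from this step. It makes two assumptions: the central repository is an empty bare repository in a temporary directory, and repo is registered with git remote add inside the configuration directory rather than through the global setting from step 3. The file unrealircd.conf is only a placeholder.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

demo=$(mktemp -d)
# Stand-in for /usr/local/var/db/git/config.git.
git -c init.defaultBranch=master init -q --bare "$demo/config.git"

# Stand-in for /usr/local/etc/Unreal.
mkdir "$demo/Unreal" && cd "$demo/Unreal"
echo '# placeholder' > unrealircd.conf
git -c init.defaultBranch=master init -q
git remote add repo "$demo/config.git"
git add .
git commit -q -sm "Added initial configuration."
git tag start

# <source>:<destination> - send the local HEAD to a new branch called
# unreal on the remote; the leading + allows overwriting that branch.
git push -q -u repo +HEAD:refs/heads/unreal

git ls-remote --heads repo           # lists refs/heads/unreal
git config branch.master.merge       # the upstream that -u recorded
```

The local branch keeps its own name (master); only the destination on the remote differs, which is what allows several unrelated directories to share one central repository.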

Step 5 => Cleaning up (optional)

Now, obviously we don't want to end up with .git directories popping up everywhere. So I suggest that you remove these from your configuration directories once you're fully done with configuring (don't forget about committing and pushing!). Or keep them... it's totally up to you of course.

Just remember that tags aren't pushed onto remote repositories by default (you'd need git push --tags for that), but that shouldn't be a problem.
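To see the tag behaviour for yourself before removing a .git directory, something like the following sketch can help. The repositories live in a temporary directory and example.conf is a made-up file.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

demo=$(mktemp -d)
git -c init.defaultBranch=master init -q --bare "$demo/config.git"
git -c init.defaultBranch=master init -q "$demo/work"
cd "$demo/work"
git remote add repo "$demo/config.git"
echo '# placeholder' > example.conf
git add . && git commit -q -m "Added initial configuration."
git tag start

git push -q repo master
git ls-remote --tags repo    # prints nothing: the tag stayed local

git push -q repo start       # push this one tag (git push --tags sends all)
git ls-remote --tags repo    # now lists refs/tags/start
```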

Step 6 => Accessing our new toys!

So let's give this setup a try. Go back to your main project (~/etc) and start by performing an update (we added something new after all): git fetch:
Code:
% git fetch
remote: Counting objects: 20, done.
remote: Compressing objects: 100% (19/19), done.
remote: Total 20 (delta 7), reused 0 (delta 0)
Unpacking objects: 100% (20/20), done.
From /usr/local/var/db/git/config
* [new branch]      unreal     -> origin/unreal
Now, if you use ls you'll notice that your README files are still in place. So first use: git branch -r which will show you exactly what is available on the main repository.

Let's check out Unreal for now:
Code:
% ls
.git/           README          README.config
% git checkout -b irc origin/unreal
Branch 'irc' set up to track remote branch 'unreal' from 'origin'.
Switched to a new branch 'irc'
% ls
.git/                   badwords.quit.conf      spamfilter.conf
aliases/                help.conf               unrealircd.conf
badwords.channel.conf   ircd.motd
badwords.message.conf   ircd.rules
This is also the reason why it's important to keep this project directory secured. After all: you're basically adding a new copy of those configuration files. Right now they're located in their original location (/usr/local/etc/Unreal), the Git repository in that directory (.git), our main repository (repo) and finally the one we just updated.

Let's check Tripwire (I set this one up earlier):
Code:
% git checkout tripwire
Switched to branch 'tripwire'
Your branch is up to date with 'origin/tripwire'.
% ls
.git/                   site.key                twpol.txt
local_unicron.key       twconf.txt
One project location (no need to go to dozens of arcane filesystem locations) yet full access to everything I need by merely typing one single command. I mean... git checkout tripwire is easier than cd /usr/local/etc/tripwire.
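The fetch and checkout round trip from this step can be reproduced with a sketch like this one. Everything lives in a temporary directory: config.git stands in for the main repository, etc for ~/etc and Unreal for /usr/local/etc/Unreal, each with a single placeholder file.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

demo=$(mktemp -d)
git -c init.defaultBranch=master init -q --bare "$demo/config.git"

# The main branch with the README.
git -c init.defaultBranch=master init -q "$demo/seed"
cd "$demo/seed"
echo "Server configuration repository" > README
git add README && git commit -q -m "Initial setup"
git push -q "$demo/config.git" master

# The main project, cloned before the configuration branch exists.
git clone -q "$demo/config.git" "$demo/etc"

# A configuration directory pushed into its own branch, as in step 4.
git -c init.defaultBranch=master init -q "$demo/Unreal"
cd "$demo/Unreal"
echo '# placeholder' > unrealircd.conf
git add . && git commit -q -m "Added initial configuration."
git push -q "$demo/config.git" +HEAD:refs/heads/unreal

# Back in the main project: update, look around, switch.
cd "$demo/etc"
git fetch -q
git branch -r                        # now includes origin/unreal
git checkout -q -b irc origin/unreal
ls                                   # only unrealircd.conf
git checkout -q master
ls                                   # only README again
```

Because the two branches share no history, switching between them swaps the entire contents of the working directory.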

So in the coming months things are going to be very different for me. I can reconfigure stuff remotely by using a local repository, and when I get back to my normal location I can pull those changes and inspect and/or apply them at a later time.

Git isn't just a nice toy for developers :)
 
i don't understand. was all that to set up a new git server or just a repository on an existing server?

i've never used git but i know it has a web interface and or a desktop gui interface. i'm unsure why so much configuration was necessary. i'm pretty sure if you see git kernel developer repository and ask around they are all using web browsers and graphical front ends and do as little hacking as possible (they did/do this for cvs too).

(i still use rcs at home btw, have not tried 'git' yet, but will if i'm involved in a project)
 
i don't understand. was all that to set up a new git server or just a repository on an existing server?
Well, read the articles ;)

It is a bit of everything. The first two-part article mostly explains the basics, such as setting up a new Git repository, and it demonstrates the specific feature set which you get from Git being decentralized. In my example: how you can share a repository with another (local) user and check out some of their (suggested) changes while fully protecting the integrity of both your own project as well as their changes.

This is a bit special because with the traditional models you'd normally branch on the master repository itself, set up and test your changes and if the leaders don't like it you'd risk losing all your work. Not necessarily your code but most definitely its history (if you kept any). You can't easily use one code base and maintain it with two SVN servers, yet that model is possible with Git.

The second part ("Git for system/network administrators") is very special because it demonstrates something completely impossible with the traditional methods.

Basically I have several configuration settings such as /usr/local/etc/apache24, /usr/local/etc/squid, /usr/local/etc/openvpn and even /usr/local/etc and /etc themselves and I keep all of those under Git control, yet without cluttering the directories too much (only one .git file is required).

What makes this so special (note: in comparison with the traditional models) is the hierarchy. A local repository (for example /usr/local/etc/squid) is just that: local. With the explicit difference that despite being local it's actually also a part of /root/etc (which is the main repository). But it's still an individual setup. It keeps its changes local and can push them onto a remote repository, but only in a separate branch of the remote.

So basically all my local repositories (such as /etc, /usr/local/etc, etc.) are separate branches of the main repository which resides in /root/etc. They all share the same configuration where ignored files and tags are concerned. On top of that each individual repository pushes its changes onto a remote (central) repository where it's also only monitoring a separate branch.

That is something which provides a lot of advantages and which is totally impossible to set up with Subversion (you can't simply create a new repository which uses another repository as upstream for example).

The best part is that I can now simply clone this shared repository on another server and use that as basis for my set up there. Or... in my example: to have access to the configuration of my main server while I am on a remote location which doesn't always provide options for me to log on using SSH.

i've never used git but i know it has a web interface and or a desktop gui interface. i'm unsure why so much configuration was necessary.
Because the GUI doesn't provide all the features which the command line provides. And also because my setup above is anything but standard. Well, apart from the casual examples in the first part(s) of my guide.

The GUI is a fine tool for casual use. Clone a repository, make changes, push changes, etc. But it can quickly become a hassle when you want more. This is also one of the reasons why Git Bash exists on Windows.

i'm pretty sure if you see git kernel developer repository and ask around they are all using web browsers and graphical front ends and do as little hacking as possible (they did/do this for cvs too).
Sure, to use a hierarchy you want as little hacking as possible. Same applies to my examples above. At best all I currently use is git commit after I made a change, git status to check if I accidentally missed anything and of course git push.

But using a hierarchy is not the same as setting it up, and that's what's explained above.
 
i still use rcs at home btw, have not tried 'git' yet, but will if i'm involved in a project
Then you will love devel/git-lite. The stuff offered on the web is just a subset, but in my opinion not the most useful subset. In contrast to devel/rcs you can try things in a branch and then either merge it into what is usually called the master, or not merge it and delete it. ShelLuser's how-to goes beyond those simple use cases, but I think it gives a perfect motivation to have a look at devel/git-lite.
 