Other Two questions about GIT

PMc

Daemon

Reaction score: 685
Messages: 1,381

Now as the doom draws nearer, and after I did a rough analysis on ways how I might manage my things in the future (local patches to base+ports, make+kernel configs, ports options etc.), I decided to just hop into the Git matter entirely.

The first target was my collection of old+local software: maintenance/backup/rc.d scripts, but also a couple of small coding objects, many of them rather abandoned. All of that was still half-heartedly revisioned with CVS (or not at all). I didn't see a good way to go from CVS to Git - and I don't know Git well enough yet to roll my own. So I converted to SVN first (which needs python27, so this was more or less the last exit), then from SVN to Git (with the KDE thing).

So, question one: how do you cope with such an assortment of little projects (some only two files, some with ten directories, all practically unrelated to each other) in Git? In CVS and SVN one does just throw them into the repo and all is fine. In Git the only suitable way (that I found) is to make a separate repo for each of them - which means that on the backend there are a dozen subdirectories and the hooks are to be edited in each of them individually. (But anything else seems to be less clean, orphaned branches for no reason and such).

Which brings me to the second question: how does one find things in Git? Is there
A) an inventory for ALL commit objects (and where they belong), to find accidentally lost branches and orphaned things?
B) most important: a way to show what each commit actually did?

Concerning B), In SVN I have these reports with log -v:
Code:
r1046 | svn | 2021-03-16 21:24:11 +0100 (Tue, 16 Mar 2021) | 2 lines
Changed paths:
   D /p2/rails-fin/trunk/.browserslistrc
   M /p2/rails-fin/trunk/Gemfile
   M /p2/rails-fin/trunk/Gemfile.lock
   D /p2/rails-fin/trunk/app/javascript
   A /p2/rails-fin/trunk/app/packs (from /p2/rails-fin/trunk/app/javascript:1045)
   A /p2/rails-fin/trunk/app/packs/entrypoints (from /p2/rails-fin/trunk/app/javascript/packs:1039)
   D /p2/rails-fin/trunk/app/packs/packs
...
This is the most valuable thing I get there, and I am in desperate need of something similar from Git.
 

Zirias

Son of Beastie

Reaction score: 1,698
Messages: 2,868

So, question one: how do you cope with such an assortment of little projects (some only two files, some with ten directories, all practically unrelated to each other) in Git? In CVS and SVN one does just throw them into the repo and all is fine. In Git the only suitable way (that I found) is to make a separate repo for each of them - which means that on the backend there are a dozen subdirectories and the hooks are to be edited in each of them individually. (But anything else seems to be less clean, orphaned branches for no reason and such).
Uhm I don't see how that decision relates to which SCM you use? But "the backend" isn't really a good term for GIT, there are just repositories, and if one is kind of "central", it is by your definition only. Which, of course gives you yet another choice: Isn't a local repository maybe enough for such tiny things? Do you really need to push them somewhere else?

I do have a few such repositories without any "remotes". Could always later decide to push them somewhere…

A) an inventory for ALL commit objects (and where they belong), to find accidentally lost branches and orphaned things?
git is completely decentralized. I think that answers the question. You won't "lose" any branches etc, but if you forget about one in some repository and never have a look, not sure how git should be able to help here…
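Within a single repository there is some inventory, though. A minimal sketch, assuming a throwaway repository (the file name, the branch name `experiment` and the identity are made up for illustration): `git for-each-ref` lists everything that still has a name, `git reflog` shows where HEAD has been, and `git fsck --unreachable` reports commits that no ref points to any more.

```shell
#!/bin/sh
# Sketch: delete a branch "by accident", then find its commit again.
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"

echo one > notes.txt
git add notes.txt
git commit -q -m "first commit"
main=$(git symbolic-ref --short HEAD)      # whatever the default branch is called

# One extra commit on a side branch, then the branch gets deleted.
git checkout -q -b experiment
echo two >> notes.txt
git commit -q -am "work on experiment"
lost=$(git rev-parse HEAD)
git checkout -q "$main"
git branch -q -D experiment

# Everything that still has a name (branches, tags, remote branches).
git for-each-ref --format='%(refname) %(objectname:short)'

# Where HEAD has been recently; the deleted work still shows up here.
git reflog

# Commits reachable from no ref; --no-reflogs ignores the reflog as well.
unreachable=$(LC_ALL=C git fsck --no-reflogs --unreachable 2>/dev/null |
    awk '$2 == "commit" { print $3 }')
echo "commit without a name: $unreachable"

# Rescue: give the commit a name again.
git branch rescued "$unreachable"
```

Such unreachable commits only stay around until garbage collection prunes them, so this is a safety net and not an archive.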

B) most important: a way to show what each commit actually did?
There's a shitload of helpful stuff you can get out of git-log(1). I guess what you might be looking for here is git log --stat.
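Closer to the `svn log -v` report is probably `git log --name-status`, which prints one status letter per path (A, M, D, and R for renames when `-M` is given). A minimal sketch, assuming a throwaway repository whose file names only loosely mimic the SVN example above. Note that Git tracks files, not directories, and does not record renames; it detects them when the log is printed, so a moved directory shows up as renamed files.

```shell
#!/bin/sh
# Sketch: an "svn log -v"-like report from git log.
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"

mkdir -p app/javascript
echo 'console.log("hi")' > app/javascript/main.js
echo 'gem "rails"' > Gemfile
echo 'defaults' > .browserslistrc
git add .
git commit -q -m "initial import"

# Second commit: move a directory, delete a file, modify a file.
git mv app/javascript app/packs
git rm -q .browserslistrc
echo 'gem "puma"' >> Gemfile
git commit -q -am "move javascript to packs"

# Full log with one status line per changed path; -M enables rename detection.
git log --name-status -M

# Only the path list of the latest commit (empty --format= drops the header).
changes=$(git log -1 --format= --name-status -M)
printf '%s\n' "$changes"
```

Adding `-C` on top of `-M` also looks for copies, which is the closest thing to the "(from ...)" hints in the SVN output.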

You might also be interested in creating git aliases for regularly used commands with lots of flags. e.g. I have git config --global alias.tree 'log --graph --oneline --all' which then makes git tree display a log that's helpful as an overview with some branching and merging happening ;)
 

PMc

Uhm I don't see how that decision relates to which SCM you use? But "the backend" isn't really a good term for GIT, there are just repositories, and if one is kind of "central", it is by your definition only. Which, of course gives you yet another choice: Isn't a local repository maybe enough for such tiny things? Do you really need to push them somewhere else?
*grin* Most people use a backend with Git. Often it is Github. Because that's precisely what a backend is for: you know where to find it, it is network accessible, and it probably has a backup/DR scheme in place.
Sure, this is not the concern of developers - it is systems management. And sure, Git makes it possible to do without these - but where is the point in that? That one can quickly start doing ad-hoc things?

There's a shitload of helpful stuff you can get out of git-log(1). I guess what you might be looking for here is git log --stat.
Thank You, that is a beginning. Frankly, the manpage is terrible, some other pieces of info seem to be in the -M and -C options, and altogether, well, I'm not really happy.
 

Zirias

*grin* Most people use a backend with Git. Often it is Github. Because that's precisely what a backend is for: you know where to find it, it is accessible, and it probably has a backup/DR scheme in place.
Even this isn't a "backend", but just another repository. You can have as many "remotes" as you like and pull however you wish. You might often want some authoritative/central place, sure, but with git, that's always only "by definition". Like, the authoritative source for FreeBSD code is the repos on git.freebsd.org. Well, I definitely know where my repositories are. I have some "only local" ones twice, on two different machines ... push/pull also works through ssh, for example.
Only push has a little restriction: by default it is refused for the branch that is currently checked out in the target, so in practice you push to "bare" repositories (those don't have a working copy attached). Still a simple command to create them wherever you like as well…
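A minimal sketch of that, assuming everything lives under one temporary directory so it can be tried without a second machine (the names `central.git`, `tools` and `backup.sh` are made up). Over ssh the remote would simply be a `user@host:/path/central.git` style URL instead of a local path.

```shell
#!/bin/sh
# Sketch: create a bare repository, push to it, clone from it.
set -e
base=$(mktemp -d)

# The "central" place: a bare repository, no working copy attached.
git init -q --bare "$base/central.git"

# A local repository with one commit.
git init -q "$base/tools"
cd "$base/tools"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"
echo '#!/bin/sh' > backup.sh
git add backup.sh
git commit -q -m "add backup script"
branch=$(git symbolic-ref --short HEAD)

# Declare the bare repository as remote "origin" and push the branch there.
git remote add origin "$base/central.git"
git push -q origin "$branch"

# Any other machine (here: another directory) can now clone it.
git clone -q "$base/central.git" "$base/copy"
ls "$base/copy"
```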
And sure, Git makes it possible to do without these - but where is the point in that? That one can quickly start doing ad-hoc things?
You just have a full featured repository at your fingertips – that's the point.
Frankly, the manpage is terrible
The manpage is huge, that doesn't make it "terrible". After all, this command just has tons of features. I guess your problem right now might be the sheer amount of features git has to offer. Well, it will just take time. Having worked with git for a while, you will know THESE commands that are really helpful for YOUR usecases by heart ;)
 

PMc

Even this isn't a "backend", but just another repository. You can have as many "remotes" as you like and pull however you wish. You might often want some authoritative/central place, sure, but with git, that's always only "by definition".
From the Git view, sure, it's all just a mesh of possible relations. But in order to maintain an intelligible site layout (or in order to find your things again when you come back from an ashram), some logical structure is needed - no matter whether the tool itself would need it or not. (Not to mention firewall rules and the like.)

Like, the authoritative source for FreeBSD code is the repos on git.freebsd.org. Well, I definitely know where my repositories are. I have some "only local" ones twice, on two different machines ... push/pull also works through ssh, for example.
How else would it work? This is a kerberos site, and git_daemon is not kerberized. Kerberized WebDAV might be done, but that's a more elaborate exercise and certainly overkill here.

The manpage is huge, that doesn't make it "terrible". After all, this command just has tons of features.
That's it, exactly. It's different from the unix style where we have small tools with limited features, and group them together as needed. I really don't want to read about three dozen different ways to print a diff when I just want to see an affirming report about what I have currently got.

And it is not only that the features appear to rank-grow, it is also that things are done without much reporting. For instance: I edit something, I do a pull (svn habit: update before commit), it autocreates a merge commit -without questioning- and wants the message edited. I look at the diff of the merge commit - it is empty. I enter git-rebase (without parameters) in hope for some message or guidance - it just removes the merge commit -without questioning-, and reports "up to date". In the end the outcome is okay.
This is a piece for people who work fulltime in development, get used to it, and can put the builtin "dont-ask-questions-just-do-the-right-thing" AI to good use.

I guess your problem right now might be the sheer amount of features git has to offer.
At some occasion I might look into the low-level "plumbing" stuff - then it should become more obvious what it does and why it does the things the way it does.
 

Zirias

From the Git view, sure, it's all just a mesh of possible relations. But in order to maintain an intelligible site layout (or in order to find your things again when you come back from an ashram), some logical structure is needed
This is your job, and if you feel you need one central place for everything you develop with git, it can be done easily (while some other things can't be done with any traditional/centralized SCM).

firewall rules
???

How else would it work? This is a kerberos site, and git_daemon is not kerberized. Kerberized WebDAV might be done, but that's a more elaborate exercise and certainly overkill here.
That's definitely nothing that should ever be integrated with git. The only job of the daemon feature is to provide access to repositories directly via TCP if ever needed, authentication is out of scope. The only common use is indeed tunneled via SSH. OTOH, you can access a remote git repo via SSH without this daemon.
For anything more complex, git specifies a http-based protocol (and no, no convoluted WebDAV stuff like svn, just plain "REST" http verbs) and includes a simple CGI implementing it. In your webserver, you can configure any kind of authentication you like. It's IMHO a really good idea not to reinvent the wheel here, git's job is source code management, not authentication/hosting/etc.

If you want a "full-featured" host for git repositories, there are a couple of projects, e.g. www/gitea.

That's it, exactly. Its different than the unix style where we have small tools with limited features, and group them together as needed.
Uhm what? Querying a log is all the same job, and no, all these things git-log can do would never be possible by piping something together. Having a lot of useful features for a job isn't violating "unix style" at all. Maybe have a look at ifconfig(8) or ppp(8)

And it is not only that the features appear to rank-grow, it is also that things are done without much reporting. For instance: I edit something, I do a pull (svn habit: update before commit), it autocreates a merge commit -without questioning- and wants the message edited. I look at the diff of the merge commit - it is empty. I enter git-rebase (without parameters) in hope for some message or guidance - it just removes the merge commit -without questioning-, and reports "up to date". In the end the outcome is okay.
All of this is configurable in many ways. E.g. I use git config --global pull.rebase merges and git config --global rebase.autoStash true (but don't just copy that, it's important to understand the consequences). Just take your time to explore and understand the things, it's worthwhile in many ways.
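To see what these two settings change, here is a minimal sketch, assuming a bare repository and two clones in a temporary directory (all names made up), with the settings applied per repository instead of with `--global`. The pull then replays the local commit on top of the fetched one instead of creating a merge commit, and the uncommitted edit is stashed and re-applied around it.

```shell
#!/bin/sh
# Sketch: "git pull" with pull.rebase=merges and rebase.autoStash=true.
set -e
base=$(mktemp -d)
git init -q --bare "$base/central.git"

# Repository "a" publishes the first commit.
git init -q "$base/a"
cd "$base/a"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"
echo base > shared.txt
git add shared.txt
git commit -q -m "initial commit"
branch=$(git symbolic-ref --short HEAD)
git remote add origin "$base/central.git"
git push -q origin "$branch"

# Repository "b" is a clone and gets the two settings (local to this clone).
git clone -q "$base/central.git" "$base/b"
cd "$base/b"
git config user.name "Example User"
git config user.email "user@example.org"
git config pull.rebase merges
git config rebase.autoStash true

# Upstream moves on ...
cd "$base/a"
echo upstream > upstream.txt
git add upstream.txt
git commit -q -m "upstream change"
git push -q origin "$branch"

# ... while "b" has a local commit and an uncommitted edit.
cd "$base/b"
echo local > local.txt
git add local.txt
git commit -q -m "local change"
echo "work in progress" >> shared.txt

git pull -q                  # rebases instead of merging
git log --oneline --graph    # linear history, local commit on top
git status --short           # the uncommitted edit is still there
```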

One very tiny example: I can finally have a branch on top of FreeBSD source where I have e.g. my kernel configs (and can track my changes). Updating the underlying source is now as simple as
Code:
git pull origin releng/13.0
git rebase releng/13.0
 

ShelLuser

Son of Beastie

Reaction score: 2,111
Messages: 3,792

So, question one: how do you cope with such an assortment of little projects (some only two files, some with ten directories, all practically unrelated to each other) in Git? In CVS and SVN one does just throw them into the repo and all is fine. In Git the only suitable way (that I found) is to make a separate repo for each of them - which means that on the backend there are a dozen subdirectories and the hooks are to be edited in each of them individually.
Depends. Yeah, normally you create a separate repo for every small section that you want to keep under version control, but Git has many more tricks up its sleeve ;)
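For the hooks part of the question, one possible trick is `core.hooksPath` (Git 2.9 or newer), which lets several repositories share one hooks directory so nothing has to be edited per repository. A minimal sketch, assuming two throwaway repositories and a made-up `post-commit` hook that just writes a log line; on a server the same setting would point the bare repositories at shared server-side hooks.

```shell
#!/bin/sh
# Sketch: one hooks directory shared by several repositories.
set -e
base=$(mktemp -d)
mkdir "$base/hooks"

# The shared hook; it only records where a commit happened (illustration only).
cat > "$base/hooks/post-commit" <<EOF
#!/bin/sh
echo "commit in \$(pwd)" >> "$base/hooks.log"
EOF
chmod +x "$base/hooks/post-commit"

for name in proj1 proj2; do
    git init -q "$base/$name"
    cd "$base/$name"
    git config user.name "Example User"    # placeholder identity
    git config user.email "user@example.org"
    # Use the shared directory instead of this repository's own .git/hooks.
    git config core.hooksPath "$base/hooks"
    echo "$name" > README
    git add README
    git commit -q -m "import $name"
done

cat "$base/hooks.log"
```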

What you can also do is use worktrees; instead of maintaining separate repositories you basically set up those as "external branches". This basically means that you can keep all databases (so the .git directories) in one place, and on the actual locations (so where your files are kept) you only have one .git file and that's it.
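A minimal sketch of the idea with `git worktree`, assuming a throwaway layout (the names `store` and `rcscripts` are made up): the object database stays in one repository, and the extra location only gets a small `.git` file that points back to it.

```shell
#!/bin/sh
# Sketch: a second working directory backed by the same repository.
set -e
base=$(mktemp -d)

# The one place where the database (.git directory) lives.
git init -q "$base/store"
cd "$base/store"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"
echo "collection of small tools" > README
git add README
git commit -q -m "initial commit"

# New branch "rcscripts", checked out in a directory of its own.
git worktree add -b rcscripts "$base/rcscripts"

# Work there as usual; commits land in the shared database.
cd "$base/rcscripts"
echo '#!/bin/sh' > myservice
git add myservice
git commit -q -m "add rc.d script"

cat "$base/rcscripts/.git"   # a one-line file, not a directory
git worktree list
```

The new branch starts from the current commit here, so the histories are related; keeping completely unrelated projects this way needs orphan branches, which is the part the original question considered less clean.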

I even wrote a guide about this a few years ago though it's a bit chaotic and could use some work and rewrites.

Which brings me to the second question: how does one find things in Git?
Depends on what you plan to find. You can apply filters to your searches to find specific things, you can tag specific commits which can make it easier to find stuff again and of course... if you follow my method above then you can search multiple repositories at the same time, which also makes things easier.
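A few such filters as a minimal sketch, assuming a throwaway repository with made-up file names and messages: `--grep` searches commit messages, `-S` finds commits that added or removed a given string in the content, a trailing path limits the log to that path, and a tag gives a commit a name to come back to.

```shell
#!/bin/sh
# Sketch: finding commits by message, by content and by path.
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"

echo 'cookie: default' > session.conf
git add session.conf
git commit -q -m "initial session config"

echo 'cookie: SameSite=Lax' > session.conf
git commit -q -am "set cookie policy as the browser asks"
git tag cookie-fix                         # name this commit for later

echo 'notes' > README
git add README
git commit -q -m "add readme"

# By commit message.
by_message=$(git log --format=%s --grep='cookie')
# By content: commits that changed the number of occurrences of the string.
by_content=$(git log --format=%s -S'SameSite')
# By path.
by_path=$(git log --format=%s -- README)

echo "message: $by_message"
echo "content: $by_content"
echo "path:    $by_path"
git log -1 --format=%s cookie-fix
```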

You can look into details of a commit, using git-diff(1) can help there, or maybe git-show(1). However, offhand I can't recall a command which would list all added, changed and optionally removed files as you showed. But then again it's been a while since I looked into that.

It does bring me to a very important issue here: Git is not CVS nor SVN, so you should definitely avoid the temptation to treat it as such. Because if you do you're probably in for a horrible time, there are several things which Git does differently, and that also requires a different mindset or workflow. Take for example the de-centralized database(s) vs. having everything together in one location.

And I think you also need to ask yourself if Git is really what you want to use, SVN still has its uses.
 

PMc

This is your job, and if you feel you need one central place for everything you develop with git, it can be done easily (while some other things can't be done with any traditional/centralized SCM).

[firewalls]
???
Yes, firewalls. See, this action that I did the last two days concerns the local system tools, which are called from rc and cron. They go onto every system - and I don't have a central management for them, I just have /opt/sbin, /opt/etc/rc.d, ... and /opt/src where the binaries are compiled (they need to be compiled for the respective arch).
Now some of my machines are local, some are publicly accessible, and some are rented elsewhere.

The aim of this action now was to stop copying /opt/src from machine to machine (then edit it locally and lose track of the changes) and to move it into a common repo instead, which does now collect the documentation/history. This common repo must be accessible from everywhere: intranet, perimeter and cloud. And that's where the firewalls come into play.

That's definitely nothing that should ever be integrated with git. The only job of the daemon feature is to provide access to repositories directly via TCP if ever needed, authentication is out of scope. The only common use is indeed tunneled via SSH. OTOH, you can access a remote git repo via SSH without this daemon.
Indeed ssh is what now works.

If you want a "full-featured" host for git repositories, there are a couple of projects, e.g. www/gitea.
I am thinking about that - I would like to have the history browsable. And, since the thing now is on a public IP already, I might want to make parts of it publicly accessible (like sharing my rc.d scripts here without cut+paste).

Uhm what? Querying a log is all the same job, and no, all these things git-log can do would never be possible by piping something together. Having a lot of useful features for a job isn't violating "unix style" at all. Maybe have a look at ifconfig(8) or ppp(8)
These two have indeed a similar monolithism problem. (There is a reason why my bugfix for ppp was never processed: people don't want to look into monoliths if they can avoid it.)

All of this is configurable in many ways. E.g. I use a config git config --global pull.rebase merges and git config --global rebase.autoStash true

Ah, yes. I see, that's cool. So one needs to first understand how it works, then decide what one wants, and then twiddle it to comfort. :)

One very tiny example: I can finally have a branch on top of FreeBSD source where I have e.g. my kernel configs (and can track my changes). Updating the underlying source is now as simple as
Code:
git pull origin releng/13.0
git rebase releng/13.0
Yeah, I got that also. :) Now I am thinking if tags could be employed to mark the previous installation or the installation of a certain machine... So e.g. while one is testing 13.0, a security fix for 12.2 could be "pushbutton-deployed" by rebuilding with the exact configs/patches as were used before.
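A minimal sketch of that tag idea, assuming a throwaway repository, a made-up config file `MYKERNEL` and a hypothetical naming scheme `installed/<host>`: the tag pins the exact state that went onto a machine, and a detached worktree brings that state back for a rebuild without disturbing the current work.

```shell
#!/bin/sh
# Sketch: tag the installed state, then check it out again later.
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"

echo 'options INET' > MYKERNEL
git add MYKERNEL
git commit -q -m "kernel config as built for hostA"
installed=$(git rev-parse HEAD)

# Annotated tag marking what was installed on the (hypothetical) hostA.
git tag -a installed/hostA -m "state installed on hostA"

# Work continues on the branch.
echo 'options INET6' >> MYKERNEL
git commit -q -am "experiments for the next release"

# Later: get exactly the tagged state back in a separate directory.
rebuild=$(mktemp -d)/hostA
git worktree add --detach "$rebuild" installed/hostA
cat "$rebuild/MYKERNEL"
```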
 

PMc

And I think you also need to ask yourself if Git is really what you want to use, SVN still has its uses.
Good point. Now since FreeBSD changed, we have to learn Git anyway, so I thought it is not wrong to move the system tools into Git - they weren't managed properly before, they are not big, not delicate, and may occasionally relate to the base OS in some way.
But for my pure-application ruby stuff I don't see an advantage from Git - with SVN I often do cross-project copies and commits, and that would not become easier.
 

Zirias

I wonder what these uses actually are (except for the obvious "I know svn, learning git costs too much time", which isn't necessarily wrong). But only regarding functionality: What can you do with svn that you can't with git? Also, in my experience, git is most of the time faster and therefore more pleasant to use.
But for my pure-application ruby stuff I don't see an advantage from Git - with SVN I often do cross-project copies and commits, and that would not become easier.
Without knowing the exact scenario: some copying can for example be avoided using git submodules. But they do have their pitfalls as well. I regularly use git submodules to pull in my ton of GNU make framework into all my C projects ;)
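A minimal sketch of such a setup, assuming two throwaway repositories with made-up names (`mkframework` for the shared make rules, `project` for a consumer). The `-c protocol.file.allow=always` is only there because newer Git versions refuse to clone submodules from a local path by default; with an ssh or https URL it would not be needed.

```shell
#!/bin/sh
# Sketch: pull a shared framework into a project as a submodule.
set -e
base=$(mktemp -d)

# The shared framework, a repository of its own.
git init -q "$base/mkframework"
cd "$base/mkframework"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.org"
echo 'all:' > rules.mk
git add rules.mk
git commit -q -m "make rules"

# A project that uses it under the directory mk/.
git init -q "$base/project"
cd "$base/project"
git config user.name "Example User"
git config user.email "user@example.org"
git -c protocol.file.allow=always submodule --quiet add "$base/mkframework" mk
git commit -q -m "pull in the make framework as a submodule"

cat .gitmodules          # records path and URL of the submodule
git submodule status     # shows the pinned commit
```

The project records one exact commit of the framework, so an update there only arrives when the submodule is moved forward and that change is committed, which is one of the pitfalls mentioned.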
 

PMc

I wonder what these uses actually are (except for the obvious "I know svn, learning git costs too much time", which isn't necessarily wrong). But only regarding functionality: What can you do with svn that you can't with git? Also, in my experience, git is most of the time faster and therefore more pleasant to use.
You probably can do anything. The question might be: is it worth the effort?

Without knowing the exact scenario: some copying can for example be avoided using git submodules. But they do have their pitfalls as well. I regularly use git submodules to pull in my ton of GNU make framework into all my C projects ;)
Well, these are my ruby-on-rails applications. A database frontend to keep track of my finance, a tool to create ipfw config rules (which can be accessed publicly), my private diary and a blog application.
Last year Firefox reported that the cookie must be configured with "SameSite: Lax".
So I change this, obviously all four at once. They are separate checkouts, completely independent of each other - but when committing and just giving all four pathnames, SVN sees that they live in the same repo and creates a common commit. Zero effort.
Recently I got that warning from Firefox again. I vaguely remember that there was something - I search the commitlog, and I see immediately that it has been fixed, and fixed everywhere. Again, zero effort. (Then I start really searching, finding some other cookie with nondefault opts deep in the code.)

Certainly yes, one can use worktrees or submodules or whatever to achieve something similar. But then there is a deploy to the server, and a more selective deploy to the public server, and to the database server, etc. And all these will likely expect that they have a single repo to clone. So one probably has to take care of that, and/or configure it (and then verify that it does not break when changing something). Not zero effort.

But, stay relaxed - everybody else does use Git for RoR. ;) (But they also do all deploy to heroku, and as it seems they use it to create nifty-thrifty web applications - not as a database GUI.)
 

Zirias

Oh, now I finally get you're talking about the ability to check out a subtree as a working copy. Well, ok, I personally consider that a misfeature as I find it very confusing and chaotic, but then, that's an opinion ;)

Indeed, that's one thing git can't do (and never could cause there's no way such a thing maps to the principle of decentralized repositories).
 

PMc

Oh, now I finally get you're talking about the ability to check out a subtree as a working copy. Well, ok, I personally consider that a misfeature as I find it very confusing and chaotic, but then, that's an opinion ;)
Okay, You have a point in that. It can be confusing if it is done improperly or doesn't reflect the usecase. Usually an SVN repo has
/---trunk
|---branches
\---tags
Whereas mine has
/---project1---trunk
|           |---branches
|           \---tags
|---project2---trunk
|           |---branches
...
So indeed technically it is a subtree checkout.(*) Practically it is a joint repository.

If Git had appeared without SVN existing, I would not have done this. But given the specific usecase (a couple of quite similar codebases with identical internal structure) I find it very practical. Also there is only one backup to care for, and one repo to replicate into the perimeter - no need to look into that when adding an application.

(*) Actually SVN always does a subtree checkout. Consequently, handling many branches is quite annoying. But in this usecase I don't do that.
 