HEADS UP: FreeBSD changing from Subversion to Git this weekend

I think you might be interpreting a bit too much in a tool, sorry to say that ;)
No problem with that - but I do not interpret this from the tool. I rather see the tool as one typical symptom (of many) of a longer-term development that moves away from the traditional Berkeley culture and more and more towards the Linux culture.

Maybe this is also a generational problem: I still know how to build a logic gate from two transistors - and then from there all the way up to the computers of today. And in the same fashion I want to connect all the things back to their root, recognizing the development as a hierarchical structure of building blocks put upon one another.
People of today just cannot do that anymore, because it all became overwhelmingly broad. So it is more interesting to just select the piece that you want to work on.

it's used in enterprises quite successfully

Yes, I understand that. It is probably the same reason why I got fired for my 20th job anniversary. Statement was, skill is no longer welcome at the company, as skill can now be easily obtained from the cloud. So if they need more skill, they can just buy more cloud.

And IMHO, GIT is just the better tool nowadays.
Question is, what is actually better there? What are the gains one would get, in practical terms?

Because, in practical terms, my experience is just that nothing works.
Imagine some arbitrary port from the ports system, where the source is maintained at github. Now imagine I have a problem with that port and want to look into the revision history to understand why it was written that way. On github there is a chooser for "branches" and "tags". All of these have arbitrary names, and none of them has any resemblance to the distributed versions that are downloaded by the ports system. So it is just not possible to read the revision history for the code that is actually distributed - as if there were no revision system at all.
 
Maybe this is also a generational problem: I still know how to build a logic gate from two transistors - and then from there all the way up to the computers of today.
Just for completeness: I know that as well (at least in theory, hehe) and I agree this is a very GOOD thing to know.
Question is, what is actually better there? What are the gains one would get, in practical terms?
Ask 10 developers and get at least 4 different answers. But it DOES have more (and useful) features. I personally like how quickly it operates on your local "clone" and how you can work with local branches efficiently.
On github there is a chooser for "branches" and "tags". All of these have arbitrary names, and none of them has any resemblance to the distributed versions that are downloaded by the ports system.
Or you can just look at, well, the history ;)

I personally don't like projects with thousands of published branches. At my company, we have a "single branch" policy, and I like it very much. Other branches are only published for creating and reviewing pull requests, and HAVE to be deleted afterwards. There's no need to publish your local, possibly long-running, branch, unless you need to work together with someone else on it.

All I say here: GIT is very flexible, you define your workflow and policies, and you just use GIT as a tool accordingly. We will see what FreeBSD makes of it. I personally have a good feeling here ;)
 
(Talking about the Linux kernel)
Somewhere there was a change - nobody knows where, nobody knows what, nobody knows why - and version numbers are a chaotic heap, so you never know what you're actually running.
At least up to and including Linux kernel 2.4, version numbers were exceedingly clear and numerically increasing, and could be found in the names of the tar files at kernel.org. That goes up to about 2005 or so. After that, I stopped compiling my own Linux kernels.

(about distributed version control enabling distributed development)
But, as You explained, we do not do this. We don't talk to each other anymore.
What I said is: We can now do development and re-merge development without central coordination. That doesn't mean that we have to stop using central coordination; git works perfectly well in a centralized model (with a single master repository) also. It also doesn't mean that one developer can't talk to another developer. It only means that they don't have to do these things.

Finally, you may find a commit log that actually identifies the author of something. But that doesn't help you in any way. Because all you get is a cipher under which the author writes. The actual contact data is protected, and is only uncloaked to customers of the github corporation.
But even then, if you manage to find some customer of github, and manage to have some message dispatched to the author, you will most likely not get a reaction.
Do not confuse github (a commercial corporation, now owned by Microsoft) with git (a free/open piece of software). I have used git quite a bit (professionally for about 10 years, personally for about 5), and never created a github account, and never worked on any source that came from github. Every piece of code I've used in git has the clear-text e-mail address of the committer for each commit. To my knowledge, the FreeBSD source will not be controlled by the github copy (which is a secondary copy, not the master), so this issue doesn't arise.
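For example, the author and committer information is recorded in every commit and can be shown directly from any clone. A quick sketch (the format string is just one possibility):
Code:
# Show abbreviated hash, author and committer (name plus e-mail) of the last 5 commits:
git log -5 --format='%h  %an <%ae>  (committed by %cn <%ce>)'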

Now the question whether developers respond to e-mails: That's up to them. In volunteer-run free/open software development, there is no way to force them. With commercial software, a customer has means (contract law) to force software vendors to fix bugs, and vendors have means to force developers to do so (as developers are employees, the paycheck is a pretty good carrot and stick). But the question of what version control system is being used doesn't change the relationship between user, bug and developer.

(About software quality)
It is all about the mindset - and the mindset is that we have tests in place as a rope to catch us when we fall. And you behave entirely different when you know that you are protected: you no longer strive to behave error-free; you create more crap, because you think nothing too bad can happen from it. High quality is already abandoned at that point.
Please read what I said. The thing that determines code quality is the mindset of the developer. That encompasses many things. For example having clear requirements (knowing what the code is supposed to accomplish). For example creating well-crafted and easily understandable and maintainable artifacts. And running tests to validate that these goals are being met. Tests are part of the whole picture. You can't "test quality into software", but without any tests, you don't even know whether you have met your quality goals.

And to be clear: When I say tests, I don't only mean automated tests (which are part of the source code). I think just as important is a dedicated team of testers, in a corporate setting disconnected from engineering (they report to a different VP, so there is no temptation to fake test results), good test plans, room in the schedule and budget for testing. You talk about counting the LOC of tests; I don't like that metric at all. In my mind the correct metric is: for every software engineer on the payroll, you should have about 2 testers on the payroll.

But also, the problem is: tests don't catch you when you fall! Tests can only protect from re-introducing problems that are already known (and fixed). Because a test needs to be written first, and it can only be written if somebody has thought about it and knows that there can be a possible malfunction that should be tested against.
Nonsense. If you write tests this way, your software process is broken. Tests are there to validate that requirements are being met. Example requirement: "This piece of software shall count the number of elephants in the zoo, and print a non-negative integer when run from the command line as ./count_elephant. If the computer is not installed in a zoo, it shall crash with a clear English error message. If the number of elephants is below 10, the count shall match exactly. If it is between 10 and 99, it shall match within 10%, and at 100 and above within 5%." How do you test this? Your test team sets up a fake zoo, catches a few elephants, and tries various scenarios (like 0, 9, 11 and 42 elephants), runs the program 100 times each, and performs statistics on the results. They could run regression tests, making sure the accuracy doesn't get worse with new versions, and check that the running time of the program is within reason. They could run the program at an aquarium, and check the spelling of the error message. This is testing. It has to be driven by requirements, not by last week's bug.
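For illustration only, a minimal sketch of what one such requirement-driven check for the hypothetical ./count_elephant program above might look like as a script (the program and the expected count are of course made up):
Code:
#!/bin/sh
# Acceptance check for the hypothetical ./count_elephant example.
# Assumption: the fake zoo has been set up with exactly 9 elephants.
expected=9

out=$(./count_elephant) || { echo "FAIL: program did not exit cleanly"; exit 1; }

# Requirement: output must be a non-negative integer.
case "$out" in
    ''|*[!0-9]*) echo "FAIL: output '$out' is not a non-negative integer"; exit 1 ;;
esac

# Requirement: below 10 elephants the count must match exactly.
[ "$out" -eq "$expected" ] || { echo "FAIL: expected $expected, got $out"; exit 1; }
echo "PASS"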

Old joke: A software company builds a bar. On the day before beta release, the tester walks into the bar, orders one beer, gets it and drinks it, all good. He orders zero beers, he orders 5 beers for his colleagues, he orders sqrt(-1) beers, he orders qwertyuiop beers, he orders 6.02e23 beers, and in all cases he gets the expected results. Signs off on public release. The next day, the first real customer comes in, asks what time it is, the bar explodes killing everyone in sight. Oops.

But: None of this discussion has anything to do with SVN versus git.
 
On github there is a chooser for "branches" and "tags". All of these have arbitrary names, and none of them has any resemblance to the distributed versions that are downloaded by the ports system. So it is just not possible to read the revision history for the code that is actually distributed - as if there were no revision system at all.
This is the maintainer's fault. People can use SVN like this (I do, for internal stuff).
 
Could someone explain to me why sometimes, when I run 'git pull' on some repository with absolutely no local changes (it happened today on src, for example), I receive a total mess of conflicts and git asks me to commit my local changes (or something like that)?
By default, git-pull(1) does a fetch followed by a merge. This is probably not what you want, and yes --rebase should probably be the default. You may also find the --ff-only option interesting.

FWIW, newer versions of Git now warn you about pull modes. Personally, I set
Code:
[pull]
        rebase = true
in my ~/.gitconfig.
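If you prefer the command line over editing the file by hand, the following should be equivalent (a sketch; see git-config(1) and git-pull(1) for the details of your Git version):
Code:
# Same effect as the [pull] rebase = true snippet above:
git config --global pull.rebase true

# Or make plain 'git pull' refuse to do anything but a fast-forward:
git config --global pull.ff only

# One-off variants on the command line:
git pull --rebase
git pull --ff-only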

Rewriting history in a shared remote branch is bad form, and will get you yelled at in most places I've worked.

There's no need to publish your local, possibly long-running, branch, unless you need to work together with someone else on it.
I do this to back up local changes if my desktop is not getting backed up, or if I don't trust the backup.

BTW, the Github workflow with forks and multiple remotes is unnecessarily complicated for most people, and not needed to use Git for revision control.

SVN, on the contrary, does not need tags, because the revision number is enough to uniquely specify the entire distribution, and is also suitable to compare which one is newer. (Tags cannot be compared numerically.)
Git commit objects contain the entire state of the tree at a particular point as well.

I'm not saying that these features will allow you to adapt your workflow to Git, or that even if they do, it will be easy and quick, but you might find git-log(1) and git-bisect(1) interesting.
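As a starting point, something along these lines (the port path is made up for illustration):
Code:
# History of a single file, following renames:
git log --follow -- sysutils/someport/Makefile

# Full patch of every commit that touched a directory:
git log -p -- sysutils/someport/

# Binary search for the commit that broke something:
git bisect start
git bisect bad                      # the currently checked-out tree is broken
git bisect good <known-good-commit>
# ... build and test, mark each step with 'git bisect good' or 'git bisect bad' ...
git bisect reset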
 
While I'm on my Git soapbox, let me advise you to never, ever use git-revert(1). It does not do what you would expect it to do.* A former colleague described what git-revert does in this memorable way: It creates an evil antimatter commit that will hunt down and destroy the changes you tried to revert, even if they are re-introduced much, much later. This can lead to some serious head-scratching when freshly-committed code disappears mysteriously.

* To me the most serious and trenchant criticism of Git is that many commands don't do what you would expect them to do based on your experience with other revision control systems. Git-revert is probably the most dangerous one, but git-checkout is likely the most annoying one. It does about 5 completely different things, none of which maps to what most of us think when we talk about "checking out the source tree".
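For reference, a minimal sketch of what git-revert(1) actually does: it does not reset anything, it adds a new commit containing the inverse of the named commit (the hash below is just a placeholder):
Code:
# abc1234 is the commit whose changes you want undone:
git revert abc1234          # creates a NEW commit that is the inverse of abc1234
git log --oneline -2        # history now shows abc1234 AND its "antimatter" twin

# The surprise usually comes from reverted merges: merging the same branch
# again later will NOT bring the reverted changes back.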
 
Never even heard of git revert, although I have been using GIT for years. Maybe for the better ;)

Talking about do's and don'ts with GIT, IMHO the most important one is: Never ever rewrite history on a public branch! Of course, GIT can be configured to reject any attempts to do so ;)
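On a plain central Git server that is usually done with the receive.* options (a sketch; hosting services like github have their own "protected branch" switches instead):
Code:
# Run inside the central (bare) repository:
git config receive.denyNonFastForwards true   # reject forced pushes (history rewrites)
git config receive.denyDeletes true           # reject deletion of published branches/tags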
 
(Talking about the Linux kernel)

At least up to and including Linux kernel 2.4, version numbers were exceedingly clear and numerically increasing, and could be found in the names of the tar files at kernel.org. That goes up to about 2005 or so. After that, I stopped compiling my own Linux kernels.
The event I was talking about happened in 1996. It concerned the return codes of the ping command. With ping, there is a deliberate difference between return codes 1 and 2. And people were relying on that for monitoring DSL lines.
Now the fun part was: that Linux was delivered in source and binary. And the 1 and 2 return codes for ping were exactly swapped between source and binary. I doubt that any of the "exceedingly clear and numerically increasing" version numbers would allow one to detect this (and instead of calling it 'bazaar', I prefer to call it a 'Saustall' - German for pigsty - anyway).

So yes, if central coordination ensures nothing more than that source and binary do not drift apart, it is already valuable.

Do not confuse github (a commercial corporation, now owned by Microsoft) with git (a free/open piece of software). I have used git quite a bit (professionally for about 10 years, personally for about 5), and never created a github account, and never worked on any source that came from github.
Well, to my knowledge, projects managed in git are hosted on github. (Or why else would there have been such a big hassle about these master keybase keys of web applications being automatically stored within the repo on github - ready for deploy, and ready for everybody to read? :) )
Maybe this is not mandatory to do, but it seems to be what is expected and what everybody does.
In all these software projects, it is also expected for everybody who wants to report bugs or send in patches, to "simply sign up" to github first. But then, when you look at the fine print of that, it is actually a customer contract, including payment and pricing.
That's why I call this the ivory-tower league.

Nonsense. If you write tests this way, your software process is broken. Tests are there to validate that requirements are being met. Example requirement: "This piece of software shall count the number of elephants in the zoo, and print a non-negative integer when run from the command line as ./count_elephant. If the computer is not installed in a zoo, it shall crash with a clear English error message. If the number of elephants is below 10, the count shall match exactly. If it is between 10 and 99, it shall match within 10%, and at 100 and above within 5%." How do you test this? Your test team sets up a fake zoo, catches a few elephants, and tries various scenarios (like 0, 9, 11 and 42 elephants), runs the program 100 times each, and performs statistics on the results
Such may happen in contractual software development. But I'm quite certain it doesn't happen in free software. Simply because nobody pays for it. What happens is that the developers have some automated test suite to run after they implement a change. And that's it. Those 200% testers are then the user base. Which is also fine.
What certainly does happen is writing tests to not re-introduce bugs that have already been fixed - and that is a good thing, and much better than not doing it.
 
This is the maintainer's fault. People can use SVN like this (I do, for internal stuff).
You think it is a fault? Well, thank you, that's actually good news. (On my own, I just don't know if something would be considered a fault, or just culture...)
 
Does anyone know the command to git up a subtree, like /usr/src/sys/i386/conf, so beginners could practice while the destination is elsewhere?
...
That's easy in svn...
 
to my knowledge, projects managed in git are hosted on github. [...] Maybe this is not mandatory to do, but it seems to be what is expected and what everybody does.
It depends on the project. For example, OpenBSD has a read-only public git mirror of its CVS repo on GitHub for the general public. But it still uses CVS for diffs, and I bet it will stay that way.
 
Well, to my knowledge, projects managed in git are hosted on github.
Not at all necessary. You can use git and never get anywhere near github. I worked on a very large project (millions of lines, hundreds of person-years) that used git (we transitioned from CVS, and then later transitioned to an in-house custom tool), not open source, the source was heavily restricted (inside the company only, and not visible to all employees), and the development used a centralized model (with one central server as the source of truth). I've also used it on medium-size projects within a work group, again without github being involved at all.

Git is a free and open tool for source control. Anyone can use it. Github is one of the users, there are many others.
 
Does anyone know the command to git up a subtree, like /usr/src/sys/i386/conf, so beginners could practice while the destination is elsewhere?
...
That's easy in svn...
That's technically impossible, because with GIT there is never a standalone "working copy", but a clone of the whole repository, including all history etc.
This enables working quickly locally, even having local branches nobody else will see, working offline, shaping a set of commits before pushing them to the "main" repository, and so on.
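A typical workflow along those lines might look like this (URL and branch names are only placeholders):
Code:
git clone https://git.example.org/some/project.git
cd project
git checkout -b my-local-fix        # a private branch nobody else sees
# ... edit and commit as often as you like, even offline ...
git commit -a -m "work in progress"
git rebase -i origin/main           # squash/reorder the commits before publishing
git push origin my-local-fix        # only now does anything leave your machine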
So yes, if central coordination ensures nothing more than that source and binary do not drift apart, it is already valuable.
That's an organizational problem not solved by ANY VCS. A good way to make sure such things don't happen are for example CI builds from your central repository.
[github] when you look at the fine print of that, it is actually a customer contract, including payment and pricing.
I don't know other countries, but at least in Germany, it wouldn't even be possible to sign a contract requiring payment that way. So I see no reason NOT to use github. It *does* have a nice web UI, of course including an issue tracker as well. You need a bit more tooling than just a VCS repository for a software project, so if you're not a large project like FreeBSD (or even Linux) operating your own infrastructure and tools, github has a nice and free offer (and others do as well, like e.g. bitbucket or gitlab – github is just the most widespread).

So, what if github would suddenly decide to only offer paid services? They probably wouldn't as their user base would drop immediately. But IF they would, well, you just move somewhere else? With GIT, you have copies of your whole repository. There's nothing simpler than moving your "central" GIT repository somewhere else.

Of course, if your project is NOT opensource, using github for it would require payment, and probably most companies use their own infrastructure instead. I personally use github for anything opensource, my own server (where I want to try adding www/gitea to have a nice web UI with additional tools as well) for closedsource stuff, and at work, we currently have a MS Azure DevOps server hosting the central GIT repos.
 
I have been using the 2020Q4 branch of the ports with Poudrière for quite some time. Before now, it was 2020Q3.

And now that we are on git, I would not want to be repeatedly changing the branch.

My understanding is that the main/master branch is for the FreeBSD 13-current while the quarterly is for other versions.

I would want to use only one branch at all times. Kindly advise me on how to go about it.
 
Well, this shouldn't happen…
It's your choice whether you want to use quarterly snapshots, no matter which FreeBSD version you run :eek:
 
I don't know other countries, but at least in Germany, it wouldn't even be possible to sign a contract requiring payment that way.
It certainly is possible in Germany, because I do it all the time. (For things I want to buy, things that offer value.)

So I see no reason NOT to use github. It *does* have a nice web UI, of course including an issue tracker as well. You need a bit more tooling than just a VCS repository for a software project, so if you're not a large project like FreeBSD (or even Linux) operating your own infrastructure and tools, github has a nice and free offer (and others do as well, like e.g. bitbucket or gitlab – github is just the most widespread).
Yes, it is exactly the same as with facebook. It provides nothing that you couldn't do with some effort on your own, preserving your independence; but everybody decides to use it, and then everybody complains that they are so powerful.
(But when I say people are not thinking one centimeter beyond their nosetip, then I am considered evil. :/ )

So, what if github would suddenly decide to only offer paid services?
Then it would immediately become obvious that nobody actually needs them.
So, that is not the point - the point is simply that they have the right and the ability to do so. I for my part am wondering, all the time, why somebody would advertise git as a "distributed service", while at the same time collecting practically all software projects of open source onto one centralized server - and that one then run by a private company with only a single ambition: greed for money.
(That again seems to count into the column with the centimeter and the nosetip, aka the golden rule: never reflect upon what you are doing.)

They probably wouldn't as their user base would drop immediately.
Sure, they wouldn't. But what they easily might do, in cooperation with the government, is to exclude individual projects that have an unwelcome agenda (where the definition of "unwelcome" might change with current political moods).

But IF they would, well, you just move somewhere else? With GIT, you have copies of your whole repository. There's nothing simpler than moving your "central" GIT repository somewhere else.
I doubt that. Even with facebook, people seem unable to "move somewhere else".
This whole game is now about market domination, about "there can be only one".
 
It certainly is possible in Germany, because I do it all the time. (For things I want to buy, things that offer value.)
Definitely no. For a contract requiring payment online to be effective, it's not enough to have some "fine print". This would never hold up in court, so it's moot.

And then, comparing github with facebook and (again) insisting everyone using GIT would use github suggests you don't have much experience with GIT and how it works yet…

There's no way to move data away from facebook. GIT is opensource, standardized, and every clone is a full repository; you just push it somewhere else with 2 simple commands and you're done.
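For completeness, the two commands would be something like this (the target URL is just a placeholder):
Code:
git remote add newhome ssh://git@git.example.org/project.git
git push --mirror newhome           # pushes all branches and tags to the new server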
 
insisting everyone using GIT would use github
It's like suggesting that everybody using CVS should move to SourceForge. Even if FreeBSD wanted to move to GitHub, their issue tracker is incapable of managing a 250000+ bug database. A self-hosted GitLab instance is probably not an option for storing that many issues either.
 
Definitely no. For a contract requiring payment online to be effective, it's not enough to have some "fine print". This would never hold up in court, so it's moot.
Maybe. I don't care, because it is entirely pointless to bring some American company before court (unless you have the money to pay for your lawyer's flights to the US).

Therefore, the much more important point is to choose carefully with whom I want to do business. And this is what I am talking about: "simply signing up" to github means no more and no less than declaring that I do want to do business with them. (And for that it doesn't matter if the fine-print would hold before court or not.)
And github (just like facebook, and amazon, etc.) is a company I certainly do NOT want to do business with - because I am rather choosy about that.
 
Still pointless, cause THEY would have to bring YOU before court if you just don't pay.

So, if they ever want payment from you, they will make sure to communicate that in a way that's safe for them, IOW clearly inform and require action from your side to accept that. So, don't be paranoid here …

Anyways, whether you want to "trust" github or not has nothing to do with FreeBSD using GIT, cause they will use github only for a read-only mirror, which has already been there for a long time.
 
Well, to my knowledge, projects managed in git are hosted on github.
No, there are quite many Git hosting services on the internet. GitHub might be the best known one, but there are many others. And of course you can run your own Git server locally if you want to.

The FreeBSD git repository is hosted by the FreeBSD project itself. It is mirrored to GitHub and GitLab, but this is purely for convenience. So, if GitHub goes away or changes its access policy in unacceptable ways, it will have no impact on FreeBSD itself whatsoever (except that those who access the repository via GitHub will have to switch over to somewhere else).

Personally I welcome the change from Subversion to Git. It improves collaboration and makes things easier for the developers, for example when several developers work on overlapping parts of the source tree. I expect code quality to improve, and of course reviews will still happen on Phabricator, which will even work better in combination with Git than it did with Subversion. This is all from the developers’ point of view, of course.

As far as users are concerned (i.e. non-developers), for 99 % of them there will be no change. freebsd-update(1) will continue to work. Those few who check out source code from the repository in order to “build world” will have to adapt their workflow, though, but it’s really not a big deal. Also, the author of net/svnup mentioned that he is working on a light-weight replacement that can be used with Git, so people who use svnup can switch over easily.

By the way, if everything else fails and you just want to get a source tree of -current or some other branch, you can download a .tar.gz of an arbitrary branch from cgit (these are generated on the fly). For this you don’t have to know how Git works at all.
 
No, there are quite many Git hosting services on the internet. GitHub might be the best known one, but there are many others. And of course you can run your own Git server locally if you want to.
Absolutely true. However, much like those obsessed with Docker, who are generally only interested in consuming from DockerHub, I don't think many would even know how to be self-sufficient and host their own server.

It is a little different with Git, but I guess whichever server FreeBSD decides to host the repo on, they should just be prepared for an influx of "Why are you not using GitHub!? It is sooooo cool! You can have an avatar and everything!".

But yes, just because the direction of the most popular Git product is now governed by a less than trustworthy company, that shouldn't bias the decision based on the technical merits of Git... I suppose.

My biggest worry is that it now opens up FreeBSD to be abused by a bad element of the FOSS community who normally are kept away by the "complexity" of cvs, svn and mailing lists, which require some element of learning to interact with - something they tend to avoid. Though hopefully it will also attract many decent people too.
 
Still pointless, cause THEY would have to bring YOU before court if you just don't pay.

So, if they ever want payment from you, they will make sure to communicate that in a way that's safe for them, IOW clearly inform and require action from your side to accept that. So, don't be paranoid here …

What makes You think I would have a mental defect just because I decide with whom I do business and with whom not?
Wasn't it always the fundamental right of the customer to decide if they want to do business with some counterparty?
Has this gone away? Are we now in China where everybody has to behave as BigBrother requires?
 