Solved Deprecating scp

Deprecating scp

  • bad
    Votes: 23 (76.7%)
  • good
    Votes: 7 (23.3%)
  • Total voters: 30

olli@

Daemon
Developer

Reaction score: 1,252
Messages: 1,140

Tried that, but didn't work: "/-x" jumps to the line that has "--xattrs, -X".
Oh, right, I didn’t notice that because the line about the -x option is just a few lines below that. I guess it depends on the window size and less options (I have LESS="-RMeiqa -#8 -j.5 -z-4").
Searching for "/filesystem" interestingly doesn't find anything (I wonder why not).
I wonder, too. It works for me. Maybe you had a typo?

For me, who knows how shells work, that's reasonably easy. Typically I get it on the second try. Again, for a simple tool like scp, it should work on the first try: if the command works for cp, it should work for scp, just with "host:" or "user@host:" prefixed.
Well … Arguably, someone who uses shell commands (and especially when writing scripts) should know how shells work. If he doesn’t, he should either learn, or use a graphical front-end that hides the technical details. Yeah, I know, in reality it’s not that easy, unfortunately.

Saying that scp should be just as easy to use as cp is a little bit too simple, I think. For example, you have to take care if a file name contains a colon, because the colon character is special to the scp command. You also need to know certain details about the remote system, for example how file names are interpreted on the remote side, especially when it’s not a UFS/FFS compatible filesystem.
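
A minimal sketch of the colon issue: scp treats an operand as remote when a colon appears before any slash, so prefixing a local name with ./ keeps it local. The helper name local_operand is hypothetical, made up for this illustration.

```shell
# Hypothetical helper: make a local path unambiguous for scp by ensuring
# that a slash comes before any colon in the name.
local_operand() {
    case $1 in
        /*|./*|../*) printf '%s\n' "$1" ;;    # already starts with a path component
        *)           printf './%s\n' "$1" ;;  # "a:b" would otherwise be read as host "a"
    esac
}

# Usage (somehost is a placeholder):
#   scp "$(local_operand 'notes:2020.txt')" somehost:
```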

Absolutely. This just has to be disallowed in scp.
It’s not that easy, unfortunately.

The problem is that scp lets the remote shell parse the whole argument. This has several implications: it parses quoting, performs white-space splitting, parameter expansion (some people might have legitimate use for this!), command substitution, filename expansion (a.k.a. globbing), and so on. So the question is, which of these things do we want to disallow. And the next question is how to disallow them. Should we pre-parse the arguments, so we remove (or quote?) unwanted things, before passing the result to the shell? Or should we bypass the shell completely and implement our own parsing with quoting, white-space splitting and other things that may be used legitimately and that we need for compatibility? This is very non-trivial.
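
The effect can be reproduced locally, without any network. In this sketch the made-up function remote_parse stands in for the remote side: like rcp-style scp, it pastes the path into a command line that a second shell then parses. It is an illustration only, not what scp literally executes.

```shell
# Stand-in for the remote side: the path becomes part of a command line
# that another shell parses again.
remote_parse() {
    sh -c "printf '[%s]\n' $1"
}

remote_parse 'my file.txt'      # white-space splitting: prints [my] and [file.txt] on separate lines
remote_parse '$(echo pwned)'    # command substitution runs in the second shell: prints [pwned]
remote_parse "'my file.txt'"    # a second layer of quoting survives: prints [my file.txt]
```

Each argument reaches the function as a single word; it is only the second parse that splits it or runs the substitution.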

Another problem is that the remote shell might have a different feature set from the local shell. For example, the character sequence [[ is not special for FreeBSD’s /bin/sh (the command echo [[ just prints “[[”). However, it has a special meaning for other shells like zsh and bash (zsh will print “bad pattern: [[”). Solaris’ /bin/sh doesn’t know the $(...) syntax for command substitution. And bash performs white-space splitting on the result of parameter expansion by default, while zsh does not. These are just a few examples, there are many more. So, it depends on the remote shell which things need to be quoted. And even the quoting syntax itself can have subtle differences between shells. It’s really a can of worms.

That is why the OpenSSH developers decided not to try to “fix” scp, but declare it obsolete. It’s better to create a new tool that behaves similarly to scp, but doesn’t try to be compatible with it. In particular, it should not parse arguments on the remote side in any way.

The classic joke is a person who creates a file named "-R", and then wonders why "rm -f *" recursively deletes subdirectories too, even though the user clearly didn't intend that.
Well, it’s clear for a human that the user didn’t intend that, but it’s not clear to the shell or to the rm command. The rm command doesn’t see what the user typed, it only sees what the shell passes to it. And most shells don’t have any knowledge about the rm command and what the -f option of that command does, because it’s usually an external command, not a shell-builtin.
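
A small sketch of the "-R" trap and the usual defence, run in a throwaway directory. The file names are made up for the demo.

```shell
# Throwaway directory, so nothing real is at risk.
demo=$(mktemp -d) && cd "$demo" || exit 1
mkdir sub && touch sub/keep plain ./-R

# What rm would receive after the shell has expanded the pattern:
printf 'arg: %s\n' *

# "--" ends option parsing, so "-R" is treated as a file name.
# rm still refuses to remove the directory "sub", hence the "|| true".
rm -f -- * 2>/dev/null || true
ls -A    # only "sub" is left, with sub/keep intact
```

Writing the pattern as ./* has the same effect, because no expanded name can then start with a dash.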

By the way, the zsh fixes this particular problem: It recognizes the rm command (even when it’s used via an alias) together with a “dangerous” pattern like *. In this case it prints “sure you want to delete all <n> files in <directory>? (waiting ten seconds)”, then waits 10 seconds (you can abort with Ctrl-C, of course), flushes all input received within those ten seconds, and then waits for the user to type “y”. Of course, this behavior can be disabled if you don’t want it. Personally I keep it enabled, just in case. When I’m sure that I know what I’m doing, I press <Tab> on the pattern to expand it before pressing <Enter>, avoiding the waiting time and confirmation.

Agree. Something like scp is needed, just deleting the current one creates a hardship. But the new one has to be better, and that will make it different, so it also needs a different name.
Exactly.

By the way, a simple replacement for scp could be implemented as a shell script. It would just have to collect the source files and/or directories into a tar archive (or cpio, or whatever), transfer the archive through stdout/stdin of an ssh connection and extract the archive in the destination directory on the target machine. In fact I have already done things like that in shell scripts, when scp was unsuitable for various reasons. This is an excerpt from one of my scripts that copies a certain directory tree (slightly modified and simplified):
Code:
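cd "$SOURCEDIR" || exit 1
# Archive everything except *~ and *.bak files (NUL-delimited names) and unpack it in $DESTDIR on the target.
find . -depth -not -name '*~' -and -not -iname '*.bak' -print0 | cpio -o -0 | ssh "$TARGET" "cd \"$DESTDIR\" && cpio -idum"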
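
The same idea with tar instead of cpio, as a hedged sketch rather than a drop-in replacement. copy_tree and the RSH variable are hypothetical names introduced here; RSH exists only so that the remote command can be swapped out for testing. The destination is single-quoted for the remote shell, which assumes a POSIX-like shell on the other side.

```shell
# Hypothetical helper: copy the tree below $1 into directory $3 on host $2.
copy_tree() {
    src=$1 host=$2 dest=$3
    # Wrap $dest in single quotes (escaping embedded ones) so the remote
    # shell passes it to tar unchanged.
    qdest=$(printf '%s' "$dest" | sed "s/'/'\\\\''/g")
    # Pack locally, stream the archive over the connection, unpack remotely.
    tar -C "$src" -cf - . | ${RSH:-ssh} "$host" "tar -C '$qdest' -xf -"
}

# Usage (host and paths are placeholders):
#   copy_tree ./project somehost /tmp/incoming
```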
 

wolffnx

Aspiring Daemon

Reaction score: 231
Messages: 677

OK... so all the "this could be fixed, this has to be changed, etc." talk, all this bullshit, is because scp has a bug that can create or replace files or folders on the remote machine??
First, I am the only one who has access to my FBSD servers.
Second, I have a word for you: use your brain and check:

a) does the file or folder exist on the remote machine?
b) same as a)
c) same as b)

That is common sense for a system administrator. If a kid shoots himself in the foot, that is his problem.
 

Jose

Daemon

Reaction score: 965
Messages: 1,170

Do you have any numbers to back the claim “tiny minority of scp users”?
I don't. I get the impression that most people on this thread use scp(1) just like I do, for a quick one-off copy between machines. I also get the distinct impression that most people have moved on to other tools for more complex remote copy tasks, including you.
Personally, when I use scp inside a script, I always try to write it in a way that it’ll work when used with file names that contain spaces or other special characters (double quoting). All of these scripts would break if you replaced the scp command with another command (but with the same name) that doesn’t use the rcp protocol by default. It would be confusing and a POLA-violation. And even when you adapt the scripts, they would now be unportable between systems that have the “real” scp and systems that have the “new” scp.
I saw nothing about having to change the quoting rules. All I saw was that copying "files between two remote hosts by way of the local machine" and backtick expansion don't work. The latter is an abomination anyway. Good riddance, I say
 

Jose

Daemon

Reaction score: 965
Messages: 1,170

Well … Arguably, someone who uses shell commands (and especially when writing scripts) should know how shells work. If he doesn’t, he should either learn, or use a graphical front-end that hides the technical details. Yeah, I know, in reality it’s not that easy, unfortunately.
You've reduced the size of the set of people that should use shells to about 5. I am not a member of that set.

View: https://www.youtube.com/watch?v=PQ8uUFjzyH0&t=275
 

olli@

Daemon
Developer

Reaction score: 1,252
Messages: 1,140

I don't. I get the impression that most people on this thread use scp(1) just like I do, for a quick one-off copy between machines. I also get the distinct impression that most people have moved on to other tools for more complex remote copy tasks, including you.
I still use scp, of course, and not just for simple cases. I only resort to alternatives when I need some functionality that scp does not support at all.

I saw nothing about having to change the quoting rules. All I saw was that copying "files between two remote hosts by way of the local machine" and backtick expansion don't work. The latter is an abomination anyway. Good riddance, I say
The point is that you cannot easily fix one thing (backticks) without breaking the other (quoting). This is the fault of the rcp heritage of scp.

You've reduced the size of the set of people that should use shells to about 5. I am not a member of that set.
When you use the scp command inside a shell session, it’s not far-fetched to assume that you are familiar with it. When you drive a car on the highway, you should know what happens when you turn the steering wheel or apply pressure to the brake pedal, and that you should avoid doing both at the same time at high speed.

However, I agree with ralphbsz that the behavior of scp is non-intuitive, and applying double-quoting can be tedious, and this may be true even for people who are otherwise familiar with “normal” shell commands. So to say, scp is a car where the steering wheel, pedals, switches and knobs may interact in strange ways that are not clearly documented in the manual.
 

Jose

Daemon

Reaction score: 965
Messages: 1,170

The point is that you cannot easily fix one thing (backticks) without breaking the other (quoting). This is the fault of the rcp heritage of scp.
I'm going to wait and see just how incompatible Jakub Jelen's hack is, and how much of an uproar it would cause to drop those features from default scp(1). Right now I'm not convinced that a new command is or is not the best approach. Ultimately it's not up to me; it's the OpenSSH developers who'll trade off heat from users against killing the monstrosity.


When you use the scp command inside a shell session, it’s not far-fetched to assume that you are familiar with it. When you drive a car on the highway, you should know what happens when you turn the steering wheel or apply pressure to the brake pedal, and that you should avoid doing both at the same time at high speed.
I'm not a fan of car analogies, but I'll play along. I don't expect the pedals to be reversed just because I'm wearing blue shoes. The shell's many gotchas feel that way sometimes.
 

olli@

Daemon
Developer

Reaction score: 1,252
Messages: 1,140

I'm not a fan of car analogies, but I'll play along. I don't expect the pedals to be reversed just because I'm wearing blue shoes. The shell's many gotchas feel that way sometimes.
I agree with you on that point.

It’s true that the Bourne shell is quite complex, and some aspects are rather counter-intuitive. Some of this complexity has historic reasons, some could be called design mistakes. But it cannot be easily fixed because the shell is used in a billion places, literally. And the problem is even worse because there are so many Bourne shell variants that are incompatible with each other, sometimes just in subtle ways that bite you when you don’t expect it. Yes, there is the POSIX standard, but most shells are not 100 % compatible (including FreeBSD’s /bin/sh which is derived from the “ash” shell), and many features of modern shells are not covered by the standard at all.

However, at least FreeBSD’s /bin/sh is very well documented in the sh(1) manual page. It should cover all details of the syntax, including quoting rules. If there is a detail missing from the manual page, a bug report should be submitted.

[Edit] PS: Of course there are shells and scripting languages that are even worse, like csh/tcsh or perl. For example, the syntax rules of perl are so weird that they cannot be completely expressed with BNF, meaning that they can only be implemented with an ad-hoc parser. In other words: It’s a real mess. (In contrast, the POSIX shell syntax can be expressed with BNF and can be implemented with an automaton based on a context-free formal grammar, like the ones you can generate with lex(1) and yacc(1).)
 

ralphbsz

Son of Beastie

Reaction score: 2,340
Messages: 3,236

What you guys are saying: Shells are a mess. Completely true. There are no two shells that are compatible with each other (that includes versions of nominally the same shell), and there are none that actually implement the POSIX standard. I know that some software products that are partially implemented using shell scripts end up shipping their own shell (typically one that is exceedingly stable and doesn't change much, and has a very easy to deal with license, such as 30-year old Korn shells) in source form, and compile it on the target machine, so once installed, the software has a completely predictable shell (where at least all the broken-ness is predictable and stable).

But the reality is that for CLI users, the shell (as broken as it is) always stands between the human user and the computer. Unless we start toggling binaries into front panel switches (as we all used to do), we have to deal with a shell. So we grudgingly accept that, and learn the idiosyncrasies of the shell.

The problem I see with scp and friends is this: When I type a command such as "cp <extremely ridiculous source file name> <ever more ridiculous target file name>", I know that my local shell will do whatever horrible violence it usually does to the two file names, which means that things like dollar signs, spaces, and backticks will turn into a small bloodbath. Because I know that ahead of time, I can deal with it, for example by escaping and quoting whatever needs to be protected against the cruelty of the shell. But with scp, the remote file name will first be mangled by my local shell, and then violated some more by the remote shell. It is the second round of torture that I object to. Why? Because in my mind, scp should be a logical descendant of cp, with the same syntax, except that one of the files is on a remote host. The file name on the remote host should be transported as an opaque string to the remote host, and not messed with there.

Most of the time, the second interpretation is a booby trap, which people regularly fall into, and has no productive use. Yes, I know there are some use cases that today work by explicitly relying on the remote shell being able to add functionality. I'm willing to admit that those won't be supported in the future. We could keep a version of the old protocol around for people who rely on those, or force them to switch to explicitly using ssh.
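
Until the remote name is treated as an opaque string, the usual workaround is to quote it a second time for the remote shell. A minimal sketch, assuming the remote login shell follows POSIX single-quote rules; remote_quote is a hypothetical helper name.

```shell
# Hypothetical helper: wrap $1 in single quotes for a POSIX-like remote shell.
# Inside single quotes only the quote itself is special; it becomes '\''.
remote_quote() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

remote_quote 'Cost estimate (Version 3).xls'   # prints 'Cost estimate (Version 3).xls' including the quotes

# Usage with rcp-style scp (somehost is a placeholder):
#   scp "somehost:$(remote_quote "$name")" .
```

The local double quotes protect the name from the local shell, and the generated single quotes protect it from the remote one.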

By the way, it is bad enough that rcp and scp picked "host:file" and "user@host:file" as the syntax for specifying the remote host name and user name. That was simply laziness on their part, and now we're stuck with it. People who write file-based interfaces have to recognize that any byte (including the characters ":" and "@") can be valid in file names. But with the Unix tradition of parsing command line arguments, this bad decision was hard to avoid. This is what happens when you don't design an OS, but rely on the quick hacks that Dennis and Ken made when writing a prototype. Oh well.

And this morning, I finally got tired of having files with "invalid" file names on my server, so I wrote a small Python script that finds any files whose names contain control characters (other than space) or non-Unicode characters (I'm still allowing valid UTF-8 characters). It turns out there were fewer than a dozen of them. In my spare time, I should get rid of all file names that cause problems with the shell, but today I'm busy.
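
The script itself was not posted. As a rough shell approximation of its first half only (control characters; the UTF-8 validity check is left out), something like the following could work. count_control_names is a made-up name, and -print0 is an extension that both GNU and FreeBSD find provide.

```shell
# Hypothetical helper: count entries below $1 whose names contain a control
# character (bytes 0x01-0x1f or 0x7f). LC_ALL=C makes find match byte-wise.
count_control_names() {
    LC_ALL=C find "${1:-.}" -name '*[[:cntrl:]]*' -print0 |
        tr -cd '\000' |   # keep only the NUL terminators, one per match
        wc -c
}

# Usage (the path is a placeholder):
#   count_control_names /data
```

Counting the NUL terminators instead of printing the names avoids sending raw control characters to the terminal.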
 

kpedersen

Son of Beastie

Reaction score: 2,079
Messages: 2,941

And this morning, I finally got tired of having files with "invalid" file names on my server, so I wrote a small Python script that finds any files whose names contain control characters (other than space) or non-Unicode characters (I'm still allowing valid UTF-8 characters). It turns out there were fewer than a dozen of them. In my spare time, I should get rid of all file names that cause problems with the shell, but today I'm busy.
It might be terrible for backwards compatibility but if the next release of UFS or ZFS disables the ability to use non-alphanumeric characters... I wouldn't be upset XD

I think 'space' as an allowable filename character is dumb. Yes I am certain someone has a use-case for it but I am also certain they are wrong XD
 

richardtoohey2

Aspiring Daemon

Reaction score: 307
Messages: 627

I think 'space' as an allowable filename character is dumb. Yes I am certain someone has a use-case for it but I am also certain they are wrong XD
The problem is you might be dealing with people who work on platforms with spaces galore in the file names - so I constantly get Mac or Windows files with spaces in and I either have to remove their spaces or learn the escaping required and keep their filenames.
 

ralphbsz

Son of Beastie

Reaction score: 2,340
Messages: 3,236

I'm very schizophrenic about that.

On one hand, where I have a choice, I use very simple file names: all lower case, no hyphens (because in programming languages hyphens aren't allowed in identifiers), no spaces, no control characters, no non-ASCII characters.

On the other hand, I understand that sometimes one wants to encode real and correct information. So I have no problem with an mp3 file that might be 'Antonín Dvořák Symphony #9 "From the New World".mp3' (with Unicode characters, accents, spaces, number sign, and quotes in it). In a directory of mp3 files, this probably makes more sense than 'dvorak_symph_9_new_world.mp3'. The same happens with files that are documents with human-readable titles, like "Cost estimate for new roof (Version 3).xls" or "Contract with plumber for $123.45.docx".
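
Such names are harmless as long as they reach the shell in single quotes. A small sketch of what happens to one of the names above inside double quotes in a POSIX-style sh, where $1 is expanded and the rest is kept (the set -- line is only there to make the demo deterministic):

```shell
set --   # clear positional parameters so that $1 is empty in this demo

dq="Contract with plumber for $123.45.docx"   # $1 is expanded, "23.45.docx" remains
sq='Contract with plumber for $123.45.docx'   # single quotes: taken literally

printf '%s\n' "$dq" "$sq"
# Contract with plumber for 23.45.docx
# Contract with plumber for $123.45.docx
```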

Ach, zwei Seelen wohnen in meiner Brust.
 

dd_ff_bb

Member

Reaction score: 69
Messages: 75

I personally like the OpenBSD developers' response:

Yes, we recognize the situation sucks. But we don't want to break the easy patterns people use scp for, until there is a commonplace replacement. People should use rsync or something else instead if they are concerned.

As of now and for the foreseeable future, the OpenSSH suite will be actively developed and maintained.

In my opinion the real question is: "Is maintaining scp in FreeBSD too much work?" If the answer of the FreeBSD developers and contributors is

"Yes, we don't have the resources to maintain scp, and if we drop scp from base, we can focus on developing many other things", then there is no point in discussing it.

If the answer is

"No, scp doesn't take much of our time; we can easily keep it in base and also develop and maintain other solutions (sftp, rsync, etc.)", then again there is no point in discussing it, because many people, including me, are happy with scp; we know its shortcomings and choose to use it.

People always have the option not to use scp, or any other tool for that matter.
 

olli@

Daemon
Developer

Reaction score: 1,252
Messages: 1,140

Greenberg says that "the POSIX spec is highly non-deterministic" and I'm inclined to believe him.
Be careful not to confuse syntax with semantics.
That paper is about the fact that the POSIX shell semantics are ambiguous, and there are cases that the specification does not cover. This is indeed a problem (and I think I mentioned it above, too).
But still, the syntax of the POSIX shell can be represented by a formal grammar that is context-free, and thus can be implemented by a pushdown automaton (it’s not a regular grammar, though, so you can’t use a finite state automaton).
 

olli@

Daemon
Developer

Reaction score: 1,252
Messages: 1,140

[…] But with scp, the remote file name will first be mangled by my local shell, and then violated some more by the remote shell. It is the second round of torture that I object to. Why? Because in my mind, scp should be a logical descendant of cp, with the same syntax, except that one of the files is on a remote host. The file name on the remote host should be transported as an opaque string to the remote host, and not messed with there.
I’m afraid I have to correct myself.

At first I was of the same opinion as you. But during the past two days I paid attention to my usage of scp. Normally I don’t think much about it, so I was somewhat surprised when I noticed that I use scp in ways that require parsing and expansion on the remote side – even quite often, actually. Here are typical examples:

Code:
scp *.pdf somehost:do\*/ed\*/uni\*
In this case, there is a directory documents/education/university on the remote host, and I’m too lazy to type the full name. I have to admit that I do things like that often because I’m lazy. (Actually that example is not quite typical: If I really had a host named “somehost”, I would have created an alias “sh” or similar in my ~/.ssh/config. ;) )

Code:
scp somehost:bin/\*.py .
This command copies all files from my bin directory of a remote host ending with .py to the local directory. Again, this is not a rare exception; I do things like that frequently, without even thinking about it.

So, if scp was replaced with a command that didn’t perform any expansion on the remote side, the above usages would not work anymore. So, personally I think there should at least be an option that enables such behavior. Otherwise I would have to adapt my workflows, which would be annoying, but I would probably end up writing my own replacement for scp that handles my use cases.
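
One way such a replacement could keep remote globbing is to make the remote shell explicit, so that the expansion is visible rather than a side effect. A sketch under assumptions: fetch_glob and RSH are hypothetical names, tar is available on both ends, and both the directory and the pattern are deliberately parsed by the remote shell.

```shell
# Hypothetical helper: copy files matching pattern $3 in remote directory $2
# on host $1 into the current local directory.
# $2 and $3 are deliberately left for the REMOTE shell to parse and expand.
fetch_glob() {
    ${RSH:-ssh} "$1" "cd $2 && tar -cf - $3" | tar -xf -
}

# Usage (somehost is a placeholder), roughly "scp somehost:bin/\*.py .":
#   fetch_glob somehost bin '*.py'
```

The single quotes around the pattern keep the local shell from expanding it, which plays the same role as the backslash in the scp examples.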

PS: I’ll include two snippets from my signature collection here because they fit so well …
“UNIX was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.” -- Doug Gwyn
“If you aim the gun at your foot and pull the trigger, it's UNIX's job to ensure reliable delivery of the bullet to where you aimed the gun (in this case, Mr. Foot).” -- Terry Lambert, FreeBSD-hackers mailing list.
 

Jose

Daemon

Reaction score: 965
Messages: 1,170

But still, the syntax of the POSIX shell can be represented by a formal grammar that is context-free, and thus can be implemented by a pushdown automaton (it’s not a regular grammar, though, so you can’t use a finite state automaton).
Sure, so I can write a parser that processes the language easily. But if the semantics are ambiguous, what my parsed results do could be radically different from what your parsed results do, even though both parsers process the language correctly according to the spec. We're back to blue shoes territory, or who's on first, if you know the classic sketch.
 

ralphbsz

Son of Beastie

Reaction score: 2,340
Messages: 3,236

But during the past two days I paid attention to my usage of scp. ...
Actually, that makes sense. Globbing has to happen on the remote node, but other expansion (backtick, $) probably does not. But once you open the door to globbing, other things will sneak through.

The ultimately correct answer (which msplsh hinted at) is to stop using tools like rsync or scp, and just use full-fledged cluster or distributed file systems: Just mount the file system from the other host, and operate locally. Instead of "scp foo.bar somehost:do\*/ed\*/uni\*", you could just do "cp foo.bar /mnt/somehost/do<tab>ed<tab>uni<tab>", which is even more convenient for lazy people.


And in practice, that is very rarely done, because it is operationally so difficult, and comes with so many caveats and problems. It's particularly sad for me to say that, because I've spent the last 20 years working on large distributed storage systems, all of which end up giving a file system abstraction: What I built for customers to use, and which I use in the office, is absolutely not what I have at home. I just added to my to-do list: Set up a really good NFS and Samba setup on my home server. I think this weekend I'll have some spare time (we had to cancel a big yard work project), so maybe I'll work on that.
 

olli@

Daemon
Developer

Reaction score: 1,252
Messages: 1,140

The ultimately correct answer (which msplsh hinted at) is to stop using tools like rsync or scp, and just use full-fledged cluster or distributed file systems: Just mount the file system from the other host, and operate locally.
Yes, that’s certainly a viable solution for certain use cases. But it requires admin privileges, or at least support from someone who has admin privileges. It’s not a solution for cases where a random user just wants to copy some files offhand from A to B.
 