[Solved] How to 'cp -a' from a website

I want to copy a subdirectory tree from a website, effectively cp -a remote-host/dir ..

How would I do that?

I can download files individually via my browser but would like to duplicate the remote directory.
 
You can also use rsync if you have SSH access; it is usually the closest equivalent to cp -a over the network:

rsync -avz user@remote-host:/path/to/dir .


For HTTP-only access, wget --mirror works, but directory listing must be enabled on the server; otherwise there is no generic way to enumerate files over plain HTTP.
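
Something along these lines should do it, assuming the server actually generates directory listings (the host and path are just placeholders):

wget --mirror --no-parent --no-host-directories http://remote-host/path/to/dir/

--no-parent keeps wget from wandering up out of the directory you asked for.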
 
You know scp(1) exists? Other than that, wget(1) can download directories, but the server has to have directory index listings enabled; there is no other way to discover the contents of a web directory through the regular HTTP(S) protocol.
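
(For the record, the scp equivalent of the rsync line above would be roughly

scp -rp user@remote-host:/path/to/dir .

though that of course needs SSH access to the host.)
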
It's a public website.

I had forgotten that wget does recursive retrieval, so I tried that, but I got a ton of HTML files which I don't want.

Not sure if there is a straightforward way of deleting them all.
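
For what it's worth, one possible approach (just a sketch, not necessarily what was used here): tell wget up front to skip the listing pages with --reject "index.html*", or clean up afterwards with something like

find . -name "index.html*" -delete

after checking that the pattern only matches the unwanted index pages.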
 
I'm not bothered about copying a website.

What I wanted was to just get the files from

To clarify, it should recursively go through every link and fetch the data. The fact that it is a "website" is not so important.

It should also be available from an HTTP mirror: http://ftp.us.debian.org/debian/dists/trixie/main/installer-amd64/current/images/netboot/
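
For that specific URL, something along these lines (untested) should fetch just the files, without recreating the whole directory prefix locally or keeping the index pages:

wget -r -np -nH --cut-dirs=8 -R "index.html*" http://ftp.us.debian.org/debian/dists/trixie/main/installer-amd64/current/images/netboot/

--cut-dirs=8 strips the debian/dists/.../netboot/ part of the path so the files land in the current directory.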

Though if this is Debian specifically, I think apt-mirror tends to be a good archiving solution.
 
Lftp is a nice program for mirroring websites. From the manpage:

"lftp has built-in mirror which can download or update a whole directory
tree. There is also reverse mirror (mirror -R) which uploads or updates
a directory tree on server. Mirror can also synchronize directories be‐
tween two remote servers, using FXP if available."

Lftp runs as an interactive session. See the description of the 'mirror' command in the manpage; basically you mirror a remote directory to a local one.
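
A minimal non-interactive example, assuming the netboot URL mentioned above, would be something like

lftp -e 'mirror --verbose . netboot; bye' http://ftp.us.debian.org/debian/dists/trixie/main/installer-amd64/current/images/netboot/

which mirrors the remote directory into a local netboot/ directory.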
 
I have now removed all the gunk from the download and copied all the files onto my PXE server from which I was able to install Debian with little effort.

Having the same facility for FreeBSD would be nice, and I wouldn't be surprised if someone has already put together such a package, although I have not come across one yet.
 
I always use fetch to retrieve files and forget about wget. It would be nice if fetch could do recursive retrieval.
fetch(1) doesn't do recursive retrieval on its own; read the manpage. You can, however, write a .sh script that implements recursive retrieval on top of fetch(1). Otherwise, for recursive retrieval, use wget.
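
A rough, untested sketch of such a script (the fetchdir name and the local netboot directory are made up here; it assumes the server emits plain href="..." index listings, as the Debian mirrors do):

#!/bin/sh
# Naive recursive download with fetch(1): read an auto-generated index page,
# download the files it links to, and recurse into subdirectories.
fetchdir() {
    url=$1; dir=$2
    mkdir -p "$dir"
    fetch -q -o - "$url" |
        grep -o 'href="[^"]*"' |
        sed 's/^href="//; s/"$//' |
        while read -r link; do
            case $link in
                /*|[?]*|..*) ;;                             # skip absolute, sort-order and parent links
                */) fetchdir "$url$link" "$dir/$link" ;;    # subdirectory: recurse into it
                *)  fetch -q -o "$dir/$link" "$url$link" ;; # ordinary file: download it
            esac
        done
}
fetchdir http://ftp.us.debian.org/debian/dists/trixie/main/installer-amd64/current/images/netboot/ netboot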
 