[Solved] Moving file types to matching folders

Hi all, I was wondering if anyone could help me figure out a way to efficiently solve a problem.

I've just performed a large data migration from my old system to a new dedicated NAS. In doing this, I created one large (nearly 1 TB) compressed archive of the root share.

As you can imagine, after collecting nearly 1TB of data, the structure has gone wacky from multiple users. I'd like to clean this up a bit.

tldr:
What is the most efficient way via bash to move all file types to their own folders?

E.g.:

- All data exists under /myextracteddata
- Run the script recursively on /myextracteddata, finding all file types
- End up with a final structure similar to this:

/newDataStructure/png
/newDataStructure/jpg
/newDataStructure/pdf

I've found a few solutions out there that I could easily modify, but for the sake of doing it right I'd like to use this as an exercise to learn the best way in bash.

Also, I know I said multiple users; in this situation it's not necessary to retain any hierarchy showing ownership of the files in the new location.

tyia!
 
I've landed on this; I'm sure it needs some tweaking, so any improvements are welcome. I will test it once my last archive is done extracting.

WORKING SCRIPT
Code:
#!/bin/sh
DEST="/share/homes/user/movedData"

list_files()
{
    # Not a directory: just print the name and stop.
    if [ ! -d "$1" ]; then
        echo "$1"
        return
    fi

    cd "$1" || return
    for i in *
    do
        if [ -d "$i" ]          # if directory
        then
            list_files "$i"     # recurse into it
            cd ..
        else
            extension="${i##*.}"
            filename="${i%.*}"
            echo "Filename: $filename"
            echo "Extension: $extension"

            # Append the current epoch time in milliseconds to avoid name collisions.
            newName="$DEST/$extension/$filename$(($(date +%s%N)/1000000)).$extension"

            if [ ! -d "$DEST/$extension" ]; then
                echo "$DEST/$extension doesn't exist, creating!"
                mkdir -p "$DEST/$extension"
            fi

            echo "Moving $filename.$extension to $DEST/$extension"
            mv "$(pwd)/$i" "$newName"
            echo "$newName"
        fi
    done
}

if [ $# -eq 0 ]
then
    list_files .
    exit 0
fi

for dir in "$@"
do
    # Run in a subshell so the cd calls don't affect the next argument.
    ( list_files "$dir" )
done
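
For reference, assuming this is saved as, say, sortbytype.sh (the name is just an example) and marked executable, it would be run like so:

Code:
chmod +x sortbytype.sh
# With no arguments it starts from the current directory; otherwise each argument is walked in turn.
./sortbytype.sh /myextracteddata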
 
Bash is not required, and this might be easier to do in a real programming language like Perl or Python, something that does not make it so difficult to deal with spaces in filenames, which, with multiple users, are almost certain.

Anyway, think of it as a "foreach" loop:

Code:
foreach file {
  use file(1) to determine file type
  mv file to destination directory for that type
}

It might be possible to do this with a sneaky invocation of find(1), but that would be tricky to get right and easy to get wrong. With a program, you can easily have it just print the file move commands until they look right. Then change it to actually execute those commands.
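
As a rough sketch only (assuming file(1) is installed on the box, and using /myextracteddata and /newDataStructure from the first post), the loop could look like this; the mv is only echoed so you can inspect the commands before letting them run:

Code:
#!/bin/sh
SRC="/myextracteddata"       # source root from the first post
DEST="/newDataStructure"     # destination root from the first post

find "$SRC" -type f | while IFS= read -r f; do
    # Classify by content, not extension; "image/png" becomes "png", "application/pdf" becomes "pdf"
    type=$(file -b --mime-type "$f" | cut -d/ -f2)
    mkdir -p "$DEST/$type"
    # Dry run: print the command; drop the leading echo once the output looks right
    echo mv "$f" "$DEST/$type/"
done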

Probably there are programs in ports that will do this, but I've never used them.
 
You can do it many ways.
First idea: find all files and print only their extensions, then sort -u, maybe find /myextracteddata -type f -name '*.*' | sed "s@.*\.\([^\.]*\)@\1@" | sort -u
Then loop over every extension, make its directory, and copy (or move) the files.

This code is NOT TESTED!!! Maybe you want to echo the commands first, as wblock wrote.
Code:
for ext in $(find /myextracteddata -type f -name '*.*' | sed 's@.*\.\([^.]*\)@\1@' | sort -u); do
  mkdir -p "/new/${ext}"
  # -exec copies one file at a time, so names with spaces survive
  find /myextracteddata -type f -name "*.${ext}" -exec cp {} "/new/${ext}/" \;
done
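
One caveat (just a thought, not something from the reply above): -name is case-sensitive, so JPG and jpg would end up in separate folders. Folding case first would merge them, e.g.:

Code:
for ext in $(find /myextracteddata -type f -name '*.*' | sed 's@.*\.\([^.]*\)@\1@' | tr 'A-Z' 'a-z' | sort -u); do
  mkdir -p "/new/${ext}"
  # -iname matches the extension regardless of case
  find /myextracteddata -type f -iname "*.${ext}" -exec cp {} "/new/${ext}/" \;
done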
 
Thanks guys, I got this working. My new NAS box uses BusyBox, so I had to make a few slight modifications; I'll post them in a bit.
 
See post #2 for my final script. I'm sure there are more elegant ways to handle the duplicate file names, but my hands were tied by BusyBox's limited argument support. To get around this I just appended the current epoch timestamp to every file name.
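
In case it's useful, here is one possible alternative (untested, just a sketch using the same variables as the script above): rename only when the target name is already taken, numbering the duplicates, so most files keep their original names.

Code:
target="$DEST/$extension/$i"
n=1
while [ -e "$target" ]; do
    # Rename only on a real collision, numbering duplicates _1, _2, ...
    target="$DEST/$extension/${filename}_${n}.$extension"
    n=$((n + 1))
done
mv "$(pwd)/$i" "$target"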

Always looking for improvements, thanks!
 