Shell pdfgrep and move

Hi, I am using pdfgrep to search a directory of PDF files for a predetermined character string, and I then want to either copy or rsync the files that contain that string to another directory.

So far I have come up somewhat short. I am able to find and print the matching files, but have had little success in actually moving them to another directory. I would greatly appreciate any help in pointing out what I am doing wrong, and/or alternative suggestions.

This finds said strings in the PDFs but doesn't really work to copy the files which contain the strings elsewhere.

find . -type f -name '*.pdf' -exec pdfgrep -nHm 2 poliomyelitis {} \; -exec rsync -vr /tmp/test/ \;

Also tried

find . -type f -name '*.pdf' -exec pdfgrep -nHm 2 poliomyelitis {} + rsync -vr /tmp/test/

Printing the output to a list works fine otherwise

find . -type f -name '*.pdf' -exec pdfgrep -nHm 2 poliomyelitis {} + >> output.txt
 
Questions to be asked on your commands above:

1. Is there a need for using find? pdfgrep has a recursive option (-r) and, when searching recursively, only looks at files matching "*.pdf" by default. See pdfgrep(1).

2. Any utility used for copying or moving files has to be fed filenames only. The output of pdfgrep mixes filenames with matched content, so you need to extract the filenames before piping them any further.
 
Hello!

Alrighty. The following seems to give me only the file names, how can I pipe them further?

pdfgrep -r "poliomyelitis" *.pdf | cut -d: -f1 | sort -u
 
Just using backquotes (command substitution) may be enough, as in cp `pdfgrep -r "poliomyelitis" *.pdf | cut -d: -f1 | sort -u` /tmp. The standard output of the pipeline is inserted as arguments to the outer command. For more complicated requirements I'd look into xargs(1).
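One caveat with the backquote form is that the shell splits the inserted output on whitespace, so a filename containing a space ends up as two arguments. If that matters, here is a rough sketch that stays with find and lets pdfgrep's exit status decide whether the copy runs. It assumes your pdfgrep supports -q (quiet, see pdfgrep(1)); copy_matching_pdfs is just a name made up for this example, and I have only tried the idea on a small test directory. Note also that the rsync in the first attempt has no source argument at all, which is likely why nothing was copied.

```shell
# Hypothetical helper (not part of pdfgrep): copy every PDF under src_dir
# whose text matches pattern into dest_dir.
copy_matching_pdfs() {
    pattern=$1
    src_dir=$2
    dest_dir=$3

    mkdir -p -- "$dest_dir"

    # -q makes pdfgrep print nothing and signal a match via its exit status.
    # find evaluates the second -exec only if the first one succeeded, and it
    # passes each path as a single argument, so spaces in names are harmless.
    find "$src_dir" -type f -name '*.pdf' \
        -exec pdfgrep -q "$pattern" {} \; \
        -exec cp -- {} "$dest_dir"/ \;
}
```

It would be called as copy_matching_pdfs poliomyelitis . /tmp/test. Keep the destination outside the directory being searched, otherwise find may pick up the copies as well. Replacing cp with mv should move the files instead of copying them.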
 