Shell Script for splitting PDF pages vertically, removing blank pages and white borders

MasterOne


I'm trying to process PDF documents containing shipping labels in the following way:
  1. Each A4 page in landscape orientation contains two A5 shipping labels side by side, so each A4 page has to be split vertically into two A5 pages.
  2. Remove any blank pages.
  3. Remove the white borders from each page.
I have managed this with Ghostscript (gs), graphics/mupdf (mutool) and print/texlive-base (pdfcrop) as follows:

Code:
#!/bin/sh
LANG="en_US"
echo
for IN in *.pdf; do
  LABEL="${IN%.pdf}_label.pdf"
  mutool poster -x 2 "$IN" "$LABEL" && echo "Splitting $IN ... done" || { echo "Splitting $IN ... error"; exit 1; }
  PAGES=$(mutool info "$LABEL" | grep ^Pages: | tr -dc '0-9')
  # Keep only pages whose summed CMYK ink coverage (Ghostscript inkcov device) exceeds 0.001
  mutool clean "$LABEL" "$LABEL" $(for i in $(seq 1 $PAGES); do [ $(echo "$(gs -o - -dFirstPage=${i} -dLastPage=${i} -sDEVICE=inkcov "$LABEL" | grep CMYK | awk 'BEGIN { sum=0; } {sum += $1 + $2 + $3 + $4;} END { printf "%.5f\n", sum } ') > 0.001" | bc) -eq 1 ] && echo $i; done) && echo "Cleaning  $IN ... done" || { echo "Cleaning  $IN ... error"; exit 1; }
  # "&>" is a bashism; under /bin/sh it would run pdfcrop in the background, so redirect the POSIX way
  pdfcrop "$LABEL" "$LABEL" > /dev/null 2>&1 && echo "Cropping  $IN ... done" || { echo "Cropping  $IN ... error"; exit 1; }
  echo
  #rm "$IN"
done
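
The blank-page test in the long one-liner could probably be pulled out into a few small functions to make it easier to read and to drop the bc dependency. This is only a minimal sketch: the function names ink_sum, has_ink and nonblank_pages are made up for illustration, and it assumes the inkcov device prints one line per page of the form " 0.00000  0.00000  0.00000  0.02230 CMYK OK".

```shell
#!/bin/sh
# Sum the C, M, Y and K coverage columns of inkcov output read from stdin.
# Assumed line format: " 0.00000  0.00000  0.00000  0.02230 CMYK OK"
ink_sum() {
  LC_ALL=C awk '/CMYK/ { sum += $1 + $2 + $3 + $4 } END { printf "%.5f\n", sum }'
}

# Succeed (exit status 0) if the coverage $1 is greater than the threshold $2.
has_ink() {
  LC_ALL=C awk -v s="$1" -v t="$2" 'BEGIN { exit !(s + 0 > t + 0) }'
}

# Print the page numbers of all non-blank pages of the PDF given as $1
# (hypothetical helper; needs mutool and gs in PATH).
nonblank_pages() {
  pages=$(mutool info "$1" | grep '^Pages:' | tr -dc '0-9')
  i=1
  while [ "$i" -le "$pages" ]; do
    cov=$(gs -o - -dFirstPage="$i" -dLastPage="$i" -sDEVICE=inkcov "$1" | ink_sum)
    has_ink "$cov" 0.001 && echo "$i"
    i=$((i + 1))
  done
}
```

With these in place, the cleaning step might shrink to something like mutool clean "$LABEL" "$LABEL" $(nonblank_pages "$LABEL"), keeping the same 0.001 threshold as above.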

The only catch is that print/texlive-base is quite a large package to pull in for just one tool (pdfcrop), which seems like overkill as I'm not using anything else from it.
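
One possible way around pdfcrop might be to let Ghostscript do the cropping itself: its bbox device reports a bounding box per page, and a /CropBox pdfmark passed to the pdfwrite device can set a crop box for all pages. The following is an untested sketch under several assumptions: the names union_bbox and crop_pdf are hypothetical, all labels are cropped to one shared box (pdfcrop crops each page individually), and pages that already carry their own CropBox may not pick up the new one.

```shell
#!/bin/sh
# Read Ghostscript bbox-device output on stdin and print the union of all
# %%HiResBoundingBox lines as "llx lly urx ury".
union_bbox() {
  LC_ALL=C awk '
    /^%%HiResBoundingBox:/ {
      # skip pages with an empty box (nothing drawn on them)
      if ($4 - $2 <= 0 || $5 - $3 <= 0) next
      if (n++ == 0) { llx = $2 + 0; lly = $3 + 0; urx = $4 + 0; ury = $5 + 0; next }
      if ($2 + 0 < llx) llx = $2 + 0
      if ($3 + 0 < lly) lly = $3 + 0
      if ($4 + 0 > urx) urx = $4 + 0
      if ($5 + 0 > ury) ury = $5 + 0
    }
    END { if (n) printf "%.2f %.2f %.2f %.2f\n", llx, lly, urx, ury }'
}

# Crop every page of PDF $1 to the shared bounding box and write PDF $2
# (hypothetical helper; input and output must be different files).
crop_pdf() {
  # the bbox device writes its report to stderr, hence 2>&1
  box=$(gs -q -dBATCH -dNOPAUSE -sDEVICE=bbox "$1" 2>&1 | union_bbox)
  [ -n "$box" ] || return 1
  gs -q -o "$2" -sDEVICE=pdfwrite -c "[/CropBox [$box] /PAGES pdfmark" -f "$1"
}
```

In the script this would stand in for the pdfcrop line, for example crop_pdf "$LABEL" "${LABEL%.pdf}_cropped.pdf", at the cost of running every file through pdfwrite once more.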

I know that something like PDF Arranger (there is no print/pdfarranger port yet, though OpenBSD has one) can do the automatic cropping as well, but a GUI application doesn't help with batch processing on the command line.

If anyone can think of any improvements to this script, please let me know.
 