Command line tools to edit PDFs

Revision as of 12:55, 6 January 2017 (PDF compression with Ghostscript)


Programs such as ImageMagick (the "convert" command), PDFjam, Ghostscript and PDFsam provide options to edit PDFs, such as merging, splitting and rotating them, or converting other file formats to PDF.

The pdfjam command

Join PDFs into one file

Install PDFjam (or log on to Sverdrup). To join several PDFs into one file, make sure you are in the folder containing the PDFs you'd like to merge.

cd pdffolder       # navigate to the appropriate folder

Then simply type

pdfjam file1.pdf file2.pdf file3.pdf -o file123.pdf

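You should get a message like this (make sure the last line reads "Finished. Output was to ..."). The sample below was captured from a different pdfjam run (a crop), so the "Effective call" lines will differ from yours.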

 pdfjam: This is pdfjam version 2.08.
 pdfjam: Reading any site-wide or user-specific defaults...
         (none found)
 pdfjam: Effective call for this run of pdfjam:
         /opt/app-sync/texlive/uio-texmf/bin/x86_64-linux/pdfjam --trim '23.6cm 5cm 14.8cm 0.5cm' --clip 'true' --outfile fig_Jul.pdf -- facet_Apr-Aug-Nov_sign_circ_C_.pdf - 
 pdfjam: Calling pdflatex...
 pdfjam: Finished.  Output was to 'fig_Jul.pdf'.
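
The join-and-check steps above can be sketched as a pair of shell helpers. This is only a sketch under stated assumptions: the function names merge_pdfs and pdfjam_finished and the PDFJAM override variable are hypothetical, and pdfjam is assumed to print its progress messages to stderr.

```shell
# merge_pdfs OUT: join every *.pdf in the current folder, in alphabetical
# order, into OUT. PDFJAM is a hypothetical override (e.g. PDFJAM=echo for a dry run).
merge_pdfs() {
  out=$1
  set --                              # reuse "$@" as the list of input files
  for f in *.pdf; do
    [ -f "$f" ] || continue           # nothing matched: skip the literal "*.pdf"
    [ "$f" = "$out" ] && continue     # never feed the output back in as input
    set -- "$@" "$f"
  done
  [ "$#" -gt 0 ] || { echo "no PDFs found" >&2; return 1; }
  "${PDFJAM:-pdfjam}" "$@" -o "$out"
}

# pdfjam_finished: read a pdfjam log on stdin and succeed only if its
# last line is the "Finished." message.
pdfjam_finished() {
  tail -n 1 | grep -q 'pdfjam: Finished\.'
}

# Usage (assumes pdfjam writes its messages to stderr, hence 2>&1):
#   merge_pdfs file123.pdf 2>&1 | pdfjam_finished && echo "merge OK"
```

Excluding the output name from the input list makes it safe to re-run the merge in the same folder.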

Alternative way to do this in Windows

In Windows, files can be merged using Adobe Acrobat Pro (on a remote desktop): choose "Merge PDFs into one file". You may also use Adobe Acrobat Pro to save a PDF in Word format (Save As / Save As Other / Word).

Extract pages from PDF

To extract pages from a PDF into another PDF, type

pdfjam -o outfile.pdf infile.pdf wantedpage1,wantedpage2

See also stackexchange.
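
As an illustration of the page list, the sketch below wraps the extract command in a function that rejects anything other than digits, commas and hyphens before calling pdfjam. The name extract_pages and the PDFJAM override are hypothetical, and the check is deliberately stricter than pdfjam itself, which is assumed to also accept forms such as the keyword "last".

```shell
# extract_pages IN PAGES OUT: copy the listed pages of IN into OUT.
# PAGES is a comma-separated list of page numbers and ranges, e.g. 2,4-6
extract_pages() {
  in=$1 pages=$2 out=$3
  case $pages in
    ''|*[!0-9,-]*) echo "bad page list: $pages" >&2; return 2 ;;
  esac
  "${PDFJAM:-pdfjam}" -o "$out" "$in" "$pages"
}

# Usage: pages 2, 4, 5 and 6 of infile.pdf
#   extract_pages infile.pdf 2,4-6 outfile.pdf
```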

Join PDFs into one file - several sheets per page

After convert has joined several PDFs, PDFjam can be used to put several sheets on one page. Here, I turn a 12-page PDF into a one-page PDF (3 columns, 4 rows) before printing:

pdfjam --nup 3x4 --landscape my_12_slides.pdf --outfile my_1_page_handout.pdf
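
For a different number of slides, the grid passed to --nup can be worked out from the page count and the number of columns. The helper below is a small sketch; the name nup_grid is hypothetical, and it assumes the grid is written as columns x rows, as in the 3x4 example above.

```shell
# nup_grid PAGES COLS: print a COLSxROWS grid with just enough rows for PAGES sheets.
nup_grid() {
  pages=$1 cols=$2
  rows=$(( (pages + cols - 1) / cols ))   # integer ceiling of pages/cols
  echo "${cols}x${rows}"
}

# Usage:
#   pdfjam --nup "$(nup_grid 12 3)" --landscape my_12_slides.pdf --outfile my_1_page_handout.pdf
```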

Rotate pages

PDFjam includes a number of useful commands, such as pdf90, which rotates the pages of your file by 90 degrees:

pdf90 operates on one or more PDF files, and (either with the '--batch' option or with '--outfile DIR' where 'DIR' is a directory) the resulting files have the suffix 'rotated90' applied to their names by default. To change the suffix, use the '--suffix' option, for example

pdf90 --suffix 'turned' --batch myfile1.pdf myfile2.pdf

will result in files named 'myfile1-turned.pdf' and 'myfile2-turned.pdf'.
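
For other angles, or to choose the output name yourself, the rotation can also be written as a plain pdfjam call. The sketch below assumes that pdf90 is roughly equivalent to pdfjam with --angle 90 --fitpaper true --rotateoversize true; check this against your installed version. The name rotate_pdf and the PDFJAM override are hypothetical.

```shell
# rotate_pdf IN ANGLE OUT: rotate every page of IN by ANGLE degrees into OUT.
rotate_pdf() {
  in=$1 angle=$2 out=$3
  case $angle in
    90|180|270) ;;
    *) echo "angle must be 90, 180 or 270" >&2; return 2 ;;
  esac
  # Option set assumed to match what pdf90 passes to pdfjam.
  "${PDFJAM:-pdfjam}" --angle "$angle" --fitpaper true --rotateoversize true "$in" -o "$out"
}

# Usage:
#   rotate_pdf myfile1.pdf 180 myfile1-upside-down.pdf
```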

Crop pdf

To crop your PDF to a smaller PDF, use the options --trim 'left bottom right top' --clip true. "left" should be given in the form "xcm" and is the number of centimeters you'd like to crop from the left side; the other three values work the same way for the bottom, right and top.

pdfjam --trim '1cm 5cm 36.5cm 0.5cm' --clip true large_figure.pdf --outfile smaller_figure.pdf
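
Because the order of the four values is easy to mix up, the trim string can be built by a small helper. This is a sketch only; the name trim_spec is hypothetical, and all four amounts are assumed to be in centimeters.

```shell
# trim_spec LEFT BOTTOM RIGHT TOP: print the --trim argument, with every amount in cm.
trim_spec() {
  [ "$#" -eq 4 ] || { echo "usage: trim_spec LEFT BOTTOM RIGHT TOP" >&2; return 2; }
  printf '%scm %scm %scm %scm\n' "$1" "$2" "$3" "$4"
}

# Usage:
#   pdfjam --trim "$(trim_spec 1 5 36.5 0.5)" --clip true large_figure.pdf --outfile smaller_figure.pdf
```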

The convert command (program Imagemagick)

Install ImageMagick (or log on to Sverdrup), and use the "convert" command:

convert infile1.pdf infile2.pdf outfile.pdf

If the quality is reduced, use the -density flag, placed before the input files. It sets the resolution (in dots per inch) at which the PDFs are read; the higher the number, the better, and 600 is good. If you are converting from a JPEG/MIFF/PNG file to PDF, use -quality instead.

convert -density 600 infile1.pdf infile2.pdf outfile.pdf
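
Since -density only affects files that are read after it, a wrapper can keep the option in the right position. The sketch below is an illustration; the name pdfs_to_pdf and the CONVERT override are hypothetical (on ImageMagick 7 installations the command may be "magick" rather than "convert").

```shell
# pdfs_to_pdf DENSITY OUT IN...: join the IN files into OUT, reading them at DENSITY dpi.
pdfs_to_pdf() {
  density=$1 out=$2
  shift 2                             # "$@" now holds only the input files
  [ "$#" -gt 0 ] || { echo "no input files given" >&2; return 2; }
  # -density comes first so that it applies to every input that follows.
  "${CONVERT:-convert}" -density "$density" "$@" "$out"
}

# Usage:
#   pdfs_to_pdf 600 outfile.pdf infile1.pdf infile2.pdf
```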

Join picture files into one file

convert -quality 100 infile1.jpg infile2.jpg outfile.pdf       # JPEG quality runs from 1 to 100

PDF compression with Ghostscript

Ghostscript is a powerful tool for PDF compression. The following command compressed a PhD thesis of 88 MB into a 22 MB PDF file. Install the Ghostscript binaries from the official gs website for Windows and Linux, or from Homebrew or MacPorts for Mac. Just change output.pdf and input.pdf to the names of your output and input files.

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/printer -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
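
The same command can be wrapped so that the quality preset is easy to change and the saving is easy to report. This is a sketch under stated assumptions: the names compress_pdf and percent_saved and the GS override are hypothetical, and the preset list is the set of -dPDFSETTINGS values Ghostscript is assumed to provide (screen, ebook, printer, prepress, default).

```shell
# compress_pdf IN OUT [PRESET]: rewrite IN as OUT with the given preset (default: printer).
compress_pdf() {
  in=$1 out=$2 preset=${3:-printer}
  case $preset in
    screen|ebook|printer|prepress|default) ;;
    *) echo "unknown preset: $preset" >&2; return 2 ;;
  esac
  "${GS:-gs}" -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 "-dPDFSETTINGS=/$preset" \
    -dNOPAUSE -dQUIET -dBATCH "-sOutputFile=$out" "$in"
}

# percent_saved BEFORE AFTER: whole-number percentage by which the size shrank.
percent_saved() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

# Usage:
#   compress_pdf input.pdf output.pdf ebook
#   percent_saved "$(wc -c < input.pdf)" "$(wc -c < output.pdf)"
```

For the thesis mentioned above, going from 88 MB to 22 MB corresponds to a saving of 75 percent.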