In my work I recently had a situation where I needed to concatenate several PDF files into one big PDF file. That turns out to be a much harder problem than it sounds, and the Web isn't as helpful in solving it as should be the case. Here's a summary of what I've discovered on the subject, so that someone else who wants to append PDF files on a real computer won't have to go through what I went through. This page covers only software that will run under Linux and similar FLOSS environments.
It's hard because PDF is as much a programming language as a data file format. PDF is basically Postscript with compression and some document structuring conventions that supposedly make it easier to do operations like concatenation. But Adobe doesn't really want you to write PDF software; they want you to buy their PDF software; and that breeds the attitude among PDF software authors that since Adobe gets away with it, they, too, ought to be allowed to charge inflated prices for their work.
There is a lot of commercial software that claims to be able to concatenate PDFs. I've seen prices as high as 995 euro just for that one function. Since this software is only available for Windows anyway, it's not much help to me even if I had the money to buy it, which I don't.
The vendors of commercial PDF manipulation software pollute the Google results and make it practically impossible to find anything on the subject that isn't either overpriced commercial, or Windows-only, or both.
There are a great many free software projects aimed specifically at this problem, all of which seem to have been abandoned before they get to a working level. So you're pretty much down to hoping for PDF appending to be within the feature set of a more general-purpose package.
There is a utility called "psmerge" which purports to be able to merge Postscript (not PDF) files. You might think you could convert the PDFs to Postscript, psmerge them, and then convert back. I've never seen that work because psmerge only works for Postscript files that are unusually well behaved. Most Postscript found in the wild, including that output from utilities such as pdf2ps, does not work with psmerge, typically causing it to claim success but generate an empty document as output. I've no idea why someone doesn't just write a psmerge program that would actually work instead of putting the current insultingly bad one into all the Linux distributions; yes, I know it's a hard problem, but it's useful and important and not actually impossible. After all, printers somehow manage to print more than one Postscript document. At the very least, psmerge could produce an error message when it fails.
Anyway, even if that worked you probably wouldn't want it, because there's important information lost when you go from PDF to Postscript that can't easily be recovered, such as the stuff that makes PDFs searchable. There are also font issues, so that Postscript converted from PDF originals may possibly end up being a bunch of images of the text instead of the text itself, and thus, elephantine in size.
Most Linux users have Ghostscript, and Ghostscript is supposed to be able to append PDFs. The way it works is that it "prints" the files in sequence, but directs them to a loopback type of printer driver that actually generates a new PDF. Here's the command:
gs -q -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=out.pdf in1.pdf in2.pdf in3.pdf ...
Make appropriate substitutions for different paper sizes (apparently nobody in North America has ever wanted to do this, because other examples on the Web invariably specify A4 paper), filenames, etc. I've found this technique to work quite well. Unfortunately, it caused Ghostscript to die with a segfault on some of the input I wanted to run it with, and nothing I could do would get it to process some of those files.
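If you want to drive that command from a script rather than typing it, here is a minimal Python sketch. It uses only the Ghostscript flags from the command above; the function names gs_merge_command and merge are made up for illustration, and it assumes a gs binary is on your PATH.

```python
import subprocess

def gs_merge_command(output, inputs, papersize="letter"):
    """Build the Ghostscript argument list that merges `inputs` into `output`."""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return [
        "gs", "-q",
        "-sPAPERSIZE=" + papersize,   # substitute e.g. "a4" as needed
        "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=pdfwrite",
        "-sOutputFile=" + output,
    ] + list(inputs)

def merge(output, inputs, papersize="letter"):
    """Run Ghostscript; raises CalledProcessError if gs exits with an error."""
    subprocess.run(gs_merge_command(output, inputs, papersize), check=True)
```

Calling merge("out.pdf", ["in1.pdf", "in2.pdf", "in3.pdf"]) would run the same command as above. Passing the arguments as a list avoids trouble with spaces in filenames.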
A lot of people swear by Multivalent, but it's "write once, run nowhere" Java. Despite going through multiple iterations of installing different JVMs, as well as that ridiculous CLASSPATH bullshit, I couldn't get it to run on my system at all.
If you already have a (La)TeX installation, you probably have texexec, and it can be used in a similar way to Ghostscript.
texexec --pdfarrange --result out.pdf in1.pdf in2.pdf in3.pdf ...
That at least didn't segfault no matter what input I threw at it. However, my input files were letter-sized landscape, and it insisted on putting them on letter-sized portrait pages so that the right-hand third of each page was cut off, no matter what options I gave for paper size and orientation. If you can live with the page size and orientation it produces, this may be an acceptable method, but it didn't work for my particular case.
Note that texexec has some other options for doing other things with PDFs - such as printing them n-up. Those features look cool and might be worth investigating, though texexec's performance on simple appending leaves a bad taste in my mouth. At least it doesn't segfault.
UPDATE (13 March 2006): This is a different problem, but I recently got very nice results making a 4-up "notes" printout from a presentation that was in a landscape PDF file, with this command:
texexec --paper=landscape --pdfcombine --combination=2*2 --result out.pdf slides.pdf
Note that that produces a footer on each page with the filename and date, which you might or might not want depending on your application. Without the "--paper=landscape", it puts the slides on a portrait page, leaving substantial margins below them, which might be what you'd want if you wanted space for handwritten notes (I didn't).
If you have LaTeX installed, you probably have pdftex and a package called pdfpages, and you can use those together with a hand-written driver file to append PDFs. This approach is more work than I'd really prefer, but it's the only one I found that actually did work in my situation, after spending a lot more time on the problem than I ought to have (and not being paid by the hour, so it hurt). If I were doing this task many times I could of course write a script to streamline it.
Create a driver file like out.tex with contents like this:
\documentclass[landscape]{article}
\usepackage{pdfpages}
\begin{document}
\includepdf[pages=-]{in1}
\includepdf[pages=-]{in2}
\includepdf[pages=-]{in3}
\end{document}
Omit the [landscape] option if you want your output to be portrait, of course. Each page of the input will be scaled to fit on a page of the output - so for instance if your input is landscape and the output is portrait, you get each input page shrunk down and printed unrotated on an output page with a wide top and bottom margin. There are presumably other options you could use to select non-default paper sizes, only subranges of input pages, rotation, and so on. I have not read the documentation for the pdfpages package very carefully at all. Anyway, once the driver is created, run pdflatex (that is, pdftex with the LaTeX format loaded) on the driver file. For me, it just works.
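Writing the driver by hand gets tedious with many inputs, so here is a minimal Python sketch that generates the same text. The function name driver_source is hypothetical; the LaTeX it emits is exactly the driver shown above, with one \includepdf line per input.

```python
def driver_source(inputs, landscape=True):
    """Return the text of a pdfpages driver file that appends all `inputs`."""
    options = "[landscape]" if landscape else ""
    lines = [
        "\\documentclass" + options + "{article}",
        "\\usepackage{pdfpages}",
        "\\begin{document}",
    ]
    for name in inputs:
        # pages=- means "all pages of this file"
        lines.append("\\includepdf[pages=-]{" + name + "}")
    lines.append("\\end{document}")
    return "\n".join(lines) + "\n"
```

Write the result of driver_source(["in1", "in2", "in3"]) to out.tex and process it the same way as the hand-written driver. Filenames containing characters that are special to TeX (such as % or #) are not handled by this sketch.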
James from 70.51.173.202 at Sun, 18 Dec 2005 17:33:38 +0000:
Multivalent worked fine for me using Java 1.4:
java -classpath /path/to/Multivalent.jar tool.pdf.Merge *
If you don't want to add that option every time, just do setenv CLASSPATH $CLASSPATH:/path/to/Multivalent.jar or export CLASSPATH=$CLASSPATH:/path/to/Multivalent.jar
Angelo Mandato from 12.27.54.83 at Mon, 13 Feb 2006 15:48:03 +0000:
I made a graphical interface called PDF Blender that lets you easily merge multiple PDF documents using Ghostscript. http://www.spaceblue.com/pdfblender/ If you leave the -sPAPERSIZE=letter parameter out of the command line, Ghostscript will keep each PDF as is. I recommend a particular distribution of Ghostscript on the PDF Blender site.
Also, take a look at iText (http://www.lowagie.com/iText/). There is a graphical interface available for iText called iTextFront (http://www.ujihara.jp/iTextFront/en/). Happy merging.
Stephen Gilbert from 153.2.246.30 at Thu, 16 Feb 2006 22:50:31 +0000:
If you find yourself with vendor-polluted Google results in the future, try adding "-site:.com" to your search. For example, this search:
concatenate pdfs -site:.com
in Google pulls up this page as the number one result, but also various other good possibilities.
Patrick from 64.103.37.71 at Thu, 16 Mar 2006 23:13:58 +0000:
psmerge is even worse than you imply---it can't possibly work for *any* postscript files, however well-behaved. Thanks for the page---saved me hours of work!
-P
Paul Duff from 137.222.102.145 at Wed, 12 Apr 2006 14:18:06 +0000:
Hi, thanks for the very useful tips! I decided to write a simple script for the pdfpages technique which worked well for me, so thought I'd share it. Call the script below as: scriptname <outputfile.pdf> <inputfile1.pdf> <inputfile2.pdf> ... <inputfile100.pdf> The script is for landscape but could easily be changed for portrait.
- Paul
#!/bin/bash
cat <<ENDTXT >__tmp__.tex
\documentclass[landscape]{article}
\usepackage{pdfpages}
\begin{document}
ENDTXT
outnm=$1
count=`echo "$# - 1" |bc`
for f in `seq $count`; do
shift
echo "\includepdf[pages=-]{$1}" >>__tmp__.tex
done
echo "\end{document}" >>__tmp__.tex
pdflatex __tmp__.tex
rm __tmp__.{tex,aux,log}
mv __tmp__.pdf $outnm
Daniel from 67.190.3.149 at Sat, 05 Aug 2006 04:35:49 +0000:
I used the following command with texexec and it worked perfect except it changed to legal size paper instead of letter. But I was able to use Preview to crop it back to normal anyways.
texexec --pdfarrange --paper=landscape --result trans.pdf Transcript003.pdf Transcript004.pdf
Dano from 128.206.99.202 at Wed, 06 Sep 2006 18:39:02 +0000:
I use linux.
pdf2ps x.pdf
pdf2ps y.pdf
pdf2ps z.pdf
cat x.ps y.ps z.ps >big.ps
ps2pdf big.ps
now you have a big.pdf with the contents of x,y,and z concatenated.
Matthew Skala from 67.158.78.28 at Wed, 06 Sep 2006 22:47:24 +0000:
See the comments on psmerge, above - if you convert the PDFs to Postscript and back, you may end up with a PDF that is less than optimal.
brad marshall from 12.150.181.20 at Tue, 17 Oct 2006 20:06:46 +0000:
PDFTK worked for me right away. Downloaded from www.accesspdf.com. The contents of this website are priceless. It saved me hours (if not a couple weeks) of research and testing. Thank you! >> Brad
Ulrich Diez from 217.248.202.63 at Wed, 18 Oct 2006 20:34:38 +0000:
I have two pdf-files (a.pdf/b.pdf)with hyperlinks.
(pdfLaTeX/hyperref).
When I try to concatenate via PDFTK,
- internal links in the file that results from
combining the two, will lead to wrong targets
unless I do some very obscure tricks already
within the TeX-source.
- bookmarks do not show up at all within the file
that results from combining the two.
- I'd like external-file-links from a.pdf to b.pdf
and vice versa to be converted into internal links
within the file that results from combining the
two.
I'm thankful for any comments/suggestions on solving
these problems.
Ulrich
Frank Higgins from 71.251.140.29 at Mon, 13 Nov 2006 14:25:19 +0000:
Thanks for posting this useful information. If you're using Debian Linux there is a package called pdfjam which is a collection of pdf handling routines. Included in it is a script pdfjoin, which is similar to the pdflatex scripts above. It worked fine for me.
You can obviously make your own command file to do some non-standard operations, such as using the page argument to select pages to cut - and then paste them back by rearranging the order in the file. For example, to reverse a 4 page document, you could:
...
\begin{document}
\includepdf[pages=4]{file.pdf}
\includepdf[pages=3]{file.pdf}
\includepdf[pages=2]{file.pdf}
\includepdf[pages=1]{file.pdf}
\end{document}
Thanks again for the page!
Fer from 171.64.38.72 at Tue, 14 Nov 2006 22:04:18 +0000:
Excellent tips! I suggest using the --noduplex option of texexec. The default behavior of
texexec --pdfarrange --result file.pdf file1.pdf file2.pdf ...
is to pad the document with an empty page at the end if the number of pages is odd. Use the following to get rid of the extra page:
texexec --pdfarrange --noduplex --result file.pdf file1.pdf file2.pdf ...
Jan Marx from 87.193.6.121 at Wed, 15 Nov 2006 18:34:12 +0000:
If the Ghostscript solution gives you seemingly randomly rotated pages, add "-dAutoRotatePages=/None" to the command line.
Andreas from 217.194.34.103 at Tue, 23 Jan 2007 14:56:26 +0000:
Hi all,
there is also a free and open source java tool that does exactly
the thing of joining (and splitting) pdf files:
http://www.iis.ee.ic.ac.uk/~g.briscoe/joinPDF/
All you need is a java runtime environment (jre).
Just for your interest! :-)
Jason from 24.99.54.26 at Thu, 01 Mar 2007 05:19:17 +0000:
You can use ImageMagick's "convert" command:
convert *.pdf all.pdf
It will even create a lowres pdf from encrypted pdfs.
Tyler Mitchell from 70.77.143.64 at Sun, 08 Apr 2007 06:47:08 +0000:
Thanks for the summary - wish I had it three days ago :-) One tip to share along similar lines... Imagemagick's "convert" works well but will create a PDF that is essentially rasterised (i.e. made into an image) instead of maintaining the distinct text elements. The command "ps2pdf", uses ghostscript and does the same.
However! there are a set of tools called xpdf and they include a "better" implementation, with the command "pstopdf" - maintaining the text, image and vector graphic elements as in the original. This is useful if you want to keep files smaller and/or remix them with other PDFs later.
Hope that helps.
Tyler
King from 24.193.23.232 at Thu, 03 May 2007 08:45:50 +0000:
Does anyone know why I am missing all the odd pages?
Each of my original PDF files has page 1 and page 2 in a single PDF; I have 18 pages and am trying to combine them into one PDF.
I tried the following, none of which worked:
1) texexec --pdfarrange --result=all.pdf --page=a5a4 --print=up *.pdf
2) texexec --pdfarrange --result=all.pdf --page=landscape *.pdf
3) texexec --pdfarrange --result=all.pdf --noduplex *.pdf
none of them get what I want.
I got what I want by using different command.
convert *.pdf all.pdf
I wonder if anyone knows how I can get the same result with texexec?
Thanks
Jason from 24.126.71.143 at Wed, 23 May 2007 22:00:21 +0000:
Multivalent worked well for me on Java 1.4.2_05.
>set classpath=%CLASSPATH%;.\Multivalent20060102.jar
>java tool.pdf.Merge file1 file2 file3
Chris Thorman Larrimore from 71.130.225.116 at Fri, 25 May 2007 17:54:50 +0000:
I tried most of these solutions and generally they worked (but gs had some problems with fonts). However, there was only one solution that *also* optimized the concatenated files to remove shared objects like fonts, and it was the freeware Combine PDFs tool. Unfortunately for me, it's GUI-based and Mac-only and while great for one-off stuff, it won't be useful in a batch situation.
Currently I'm exploring pdfoptimize tool from pdf-tools.com (not a free solution but I have a budget so I can consider it). Sadly, their tool had a bug where it was stripping images. I reported the bug and their team got right on it with a reply to me -- I'll try to remember to report back if it worked. They also mentioned they have a pdfcat tool which I have not tried yet but will try next; I'll try to remember to report back here what I learned from that, as well.
John Marks from 217.64.114.243 at Fri, 29 Jun 2007 11:31:30 +0000:
Your webpage has been most helpful. Thank you for taking the trouble to write it. I found it by a blind websearch using the terms "concatenate" + "PDF".
Copying your command:
gs -q -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=out.pdf in1.pdf in2.pdf in3.pdf ...
into a batch file concat.bat located in the same directory as the input files, and using a text editor to edit the paths to suit a Windows platform:
"f:\User Program Files\gs\gs8.54\bin\gswin32" -q -sPAPERSIZE=a4 -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=JFM-app.pdf PARN-p0.pdf PARN-p1.pdf PARN-p2.pdf PARN-p3.pdf PARN-p4.pdf PARN-p5.pdf PARN-p6.pdf PARN-p7.pdf PARN-p8.pdf PARN-p9.pdf
did the business for me (using AFPL Ghostscript 8.54 on Windows XP). Now all I need to do is to update my gs installation on Kubuntu 7.04 to match...
My only suggestion is to add the words "combine PDF files" somewhere so that it can be found by those with a lesser command of technical English.
Daniel Weis from 66.234.242.30 at Tue, 03 Jul 2007 14:19:22 +0000:
Thanks a bunch!
If you use
texexec --pdfarrange --noduplex --paper=letter --result=result.pdf in1.pdf in2.pdf ...
You will merge perfectly (assuming the in pdfs are letter format...).
To change orientation, you can give the --paper=s6 command (I think) to get it landscape like you wanted...
Anyways, thanks for the help!
Dan
koensen from 88.76.53.255 at Thu, 22 Nov 2007 11:57:50 +0000:
Well, it's easy to code though...
german description, java-codesnippet
http://www.bb242.de/2007/10/18/pdfs-konkateniert-als-httpresponse-zuruckgeben/
Kevin from 72.235.175.12 at Sun, 27 Jan 2008 00:51:38 +0000:
Three different solutions that have worked well for me have already been mentioned: pdftk (my favorite), ImageMagick's "convert", and Ghostscript (either using "gs" or "ps2pdf"). For people who have to use Windows, PDFCreator (http://www.pdfcreator.de.vu/) has a "Wait - Collect" button that will allow you to gather a bunch of print jobs together (even from different programs), then later print them all as one PDF.
Important to note, though: the Ghostscript method, I've found, is best at optimizing image-laden PDFs into the smallest possible filesize, especially if they were created by other programs that produce less efficient PDFs. For example, I use XSane (on Linux) to scan directly to PDF format, then the command-line tools to concatenate the PDFs. XSane has no problem making a 300 dpi scan into a 5+ MB PDF! When combining several of these PDFs using Ghostscript, however, the final filesize ends up in the hundreds of Kbytes--easily one-tenth the input filesize. The "before" and "after" PDFs are indistinguishable, even zoomed way in.
I don't claim to know what Ghostscript is doing to squeeze out all those extra megabytes, but do consider experimenting with different toolchains, compression, and resolution options if the size of your output PDF is a concern. Ghostscript has many, many options for compression and output resolution (http://pages.cs.wisc.edu/~ghost/doc/cvs/Ps2pdf.htm), e.g., -dPDFSETTINGS=/ebook or /screen; there may even be a comparison of the filesize savings offered by the various options somewhere on the web already...
Keep in mind, though, that most of these tools mentioned will turn editable, searchable text into a bunch of dumb polygons, so you won't be able to use the "Search" function in your PDF viewer on the document anymore, and you'll probably lose the document structure, hyperlinks, bookmarks as well. As mentioned by one of the other posters, try the Xpdf package's "pstopdf" (or--*gasp*--buy Acrobat) if this is a concern.
Markus from 89.58.51.151 at Tue, 08 Apr 2008 18:38:08 +0000:
for the pdfpages-approach: on many unix systems, you can automatically determine the page orientation via pdfinfo:
(C-Shell style)
set width = `pdfinfo document.pdf | grep "Page size" | sed -e "s/\(.*\)\(\ [0-9]*\ \)\(x\)\(\ [0-9]*\)\(.*\)/\2/g"`
set height = `pdfinfo document.pdf | grep "Page size" | sed -e "s/\(.*\)\(\ [0-9]*\ \)\(x\)\(\ [0-9]*\)\(.*\)/\4/g"`
set orientation = "portrait"
if($width > $height) then
set orientation = "landscape"
endif
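The same orientation check can be sketched in Python. This is only an illustration: the function name orientation_from_pdfinfo is made up, and it assumes pdfinfo prints a line of the form "Page size: 612 x 792 pts (letter)". The pattern also accepts fractional sizes such as 595.276 x 841.89.

```python
import re

def orientation_from_pdfinfo(info_text):
    """Return "landscape" or "portrait" from the text printed by pdfinfo."""
    match = re.search(r"^Page size:\s*([\d.]+)\s*x\s*([\d.]+)",
                      info_text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Page size' line found")
    width = float(match.group(1))
    height = float(match.group(2))
    # Wider than tall means landscape; square pages count as portrait here.
    return "landscape" if width > height else "portrait"
```

To use it on a real file, capture the output first, for example with subprocess.run(["pdfinfo", "document.pdf"], capture_output=True, text=True).stdout, and pass that string in.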
mark from 132.228.195.207 at Tue, 13 May 2008 23:50:21 +0000:
I've used your web page a couple of dozen times. I speak for myself, and probably hundreds of others, when I say thanks.
Robert Spendl from 84.255.202.248 at Wed, 14 May 2008 07:53:05 +0000:
I have just run into a problem when I had to print over 200 two or three paged short documents. I wanted to put 2 of them on one page. Printing each one would be annoying, so I decided to concatenate all files - luckily named by sequential numbers - with
"pdftk *.pdf cat output complete.pdf"
Since some documents have odd number of pages, new documents did not start on a new page. So I have written a short script 'pdf_oddcat' that concatenates PDF files where each file starts on an odd number (i.e. it adds one empty page to any file with odd numbers). The current directory has to include a file called "empty.pdf", containing only a single empty page. Anyway, I hope someone finds the script useful!
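#!/bin/bash
# pdf_oddcat: concatenate PDF files so that each one starts on an odd page.
# Needs pdftk and a one-page blank "empty.pdf" in the current directory.
if [ $# -lt 2 ]
then
    echo "pdf_oddcat output.pdf file1.pdf file2.pdf ... - concatenates PDF files starting each one on odd page" >&2
    exit 1
fi
# First argument is the output file; the rest are inputs.
fileout=$1
shift
filelist=()
for ff in "$@"
do
    filelist+=("$ff")
    # Page count of this input, taken from pdftk's metadata dump.
    numpg=$(pdftk "$ff" dump_data output - | grep NumberOfPages | cut -d' ' -f 2-)
    # Pad inputs that have an odd number of pages with one blank page.
    if [ $(( numpg % 2 )) -eq 1 ]
    then
        filelist+=("empty.pdf")
    fi
done
pdftk "${filelist[@]}" cat output "$fileout"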
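The padding rule in that script is easy to get wrong, so here it is in isolation as a small Python sketch. The function name pad_to_even is hypothetical, and the page counts are passed in directly instead of being read from pdftk, so it can be checked without any PDFs at hand.

```python
def pad_to_even(files_with_counts, blank="empty.pdf"):
    """Given (filename, page_count) pairs, return the list of files to
    concatenate so that every input starts on an odd-numbered page."""
    result = []
    for name, pages in files_with_counts:
        result.append(name)
        if pages % 2 == 1:
            # Odd page count: add one blank page so the next file
            # starts on a fresh sheet when printed double-sided or 2-up.
            result.append(blank)
    return result
```

For example, pad_to_even([("a.pdf", 3), ("b.pdf", 2)]) gives ["a.pdf", "empty.pdf", "b.pdf"]. Like the script, this also pads after the final file if it has an odd page count.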
Deepak Kataria from 122.163.175.166 at Sat, 24 May 2008 10:27:42 +0000:
Hi
I am new to ghostscript.
I want to create a .ps file from pdf files which are saved in directory (folder and subfolder).
From where should i start?
Thanks in advance.
Hoping to hear from your side soon.
Thanks and regards,
Deepak Kataria
Cloter M from 189.58.175.111 at Wed, 11 Jun 2008 14:16:25 +0000:
"pdftk *.pdf cat output all.pdf"
works like a charm for me! simple and great solution!
thanks!
Niik from 65.103.63.198 at Mon, 23 Jun 2008 21:57:50 +0000:
Very nice, thank you!
I did the following using iText under Java 1.4 to concatenate multiple pdfs and generate one large pdf on disc:
FileOutputStream outFile = new FileOutputStream("/usr/tmp/prt/" + outfilename+ ".pdf");
PdfCopyFields copy = new PdfCopyFields(outFile);
copy.open();
for (int i = 0; i < discPdfs.size(); i++) {
PdfReader reader = new PdfReader("/usr/tmp/prt/" + discPdfs.get(i) + ".pdf");
copy.addDocument(reader);
}
copy.close();
Tom Soja from 85.182.128.194 at Mon, 07 Jul 2008 10:11:45 +0000:
Merging pdf files seems to be a difficult task unless you are a proud owner of a copy of the "full" Adobe Acrobat. This page is
an excellent reference for those who refuse to spend large amounts of money on software.
One criticism though. Your comment on Multivalent encouraged me to
try it, and it performed well in concatenating two pdf files from
different sources. Maybe you can revise the section on Multivalent some time. Using jars and classes inside a jar is an issue which is unrelated to processing pdf files.
MarseHole from 193.133.140.104 at Fri, 22 Aug 2008 16:00:44 +0000:
Good job matey.
The pdfpages approach worked first time for me.
Thanks very much!
Javier Pérez from 80.32.168.211 at Wed, 27 Aug 2008 01:24:20 +0000:
Great and interesting information!
Good Job!
Helge Skrivervik from 81.191.224.114 at Tue, 02 Sep 2008 08:49:40 +0000:
We're heavily dependent on pdftk on our web-site, which serves PDF-pages generated on demand. Unfortunately, pdftk is no longer maintained. Furthermore, pdftk is dependent upon GCJ3.4 which is also obsolete and unavailable on most platforms (Fedora Core 4 and lower is OK, Mac PPC w/Panther and Macports is OK). Still, pdftk can be compiled and linked on newer platforms using gcc/gcj 4.2/4.3, but the result is unpredictable and not stable. In particular, concatenating pdfs frequently causes pdftk to hang if run from a script (!), and otherwise fail (coredump) if run directly from the command line.
We're still looking for a replacement, and I believe iText is the right way to go. However, there seems to be no general command line front end to iText at this time, so we'll have to do some programming in order to get what we need. Meanwhile, the joinPDF program mentioned above works fine. Thanks for a useful page.
Tom from 216.243.132.94 at Tue, 09 Sep 2008 00:30:11 +0000:
Thanks for the site; it's a huge time saver. I have to say, however, that Multivalent worked the best for me. I think your invective against Java is misdirected.
$java -cp ./Multivalent*.jar tool.pdf.Merge
It works great.
Dan from 138.37.33.66 at Fri, 12 Sep 2008 11:17:36 +0000:
Thanks for the tip - I use TeX so the "texexec" tip worked perfectly, first time
TurnerDMX from 208.72.99.2 at Tue, 23 Sep 2008 18:48:26 +0000:
Downloaded Multivalent20060102.jar to a WindowsXP box and it works like a charm.
wera from 81.195.28.143 at Thu, 02 Oct 2008 09:49:12 +0000:
Thank you very much!
after you read, you can delete this uninformative comment )
LevE from 67.170.12.170 at Sat, 11 Oct 2008 21:55:54 +0000:
Do not know if this is a stable solution, but it works for me. I use python 2.4 or later and pyPDF 2.0 or later package. Here is the script.
#######################################
#pdfMerge.py
#There are tons of ways to improve this script. Any modification is allowed :)
import sys
from pyPdf import PdfFileWriter, PdfFileReader

def CP(input, output):
    # Append every page of the input reader to the output writer.
    for pageNum in range(input.numPages):
        page = input.getPage(pageNum)
        output.addPage(page)

args = sys.argv[1:]
if len(args) < 3:
    print "usage: pdfmerge in1.pdf in2.pdf out.pdf"
else:
    output = PdfFileWriter()
    # Every argument except the last is an input; the last is the output file.
    for name in args[:-1]:
        CP(PdfFileReader(file(name, "rb")), output)
    output.write(file(args[-1], "wb"))
#######################################
you run: pdfmerge in1.pdf in2.pdf out.pdf
That's all folks. Enjoy the ride :)
Thang from 203.162.94.101 at Wed, 15 Oct 2008 02:48:23 +0000:
Thanks for writing.
This solved my problem. I use GPL Ghostscript 8.15 on Windows Vista.
It works well for me.
The command should look like this:
gswin32 -dNOPAUSE -dBATCH -q -sDEVICE=pdfwrite -sOutputFile=out.pdf in1.pdf in2.pdf ....
I will write it in my notes. But the problem is how I can keep track of all the good tips :-)
The lines append to help Google find this page easily: append PDF, merge PDF, cat PDF
coin from 124.43.99.214 at Sun, 16 Nov 2008 02:10:10 +0000:
pdftk is just great they have a windows version as well:
http://www.pdfhacks.com/pdftk/
laowai from 128.131.79.139 at Fri, 12 Dec 2008 15:53:51 +0000:
Hi and thanks for putting this info online!
I did it the pdfpages-way in a moment when I needed a solution quickly.
Works like charm.
Google was OK with "concatenating pdf", one of the first hits.
Sean from 91.176.139.164 at Mon, 15 Dec 2008 21:34:23 +0000:
Very helpful page, many thanks.
I tested the commands below on an OSX v10.4 PowerPC box with CLI tools installed via fink.
First I individually scanned two text pages directly to PDF with an HP scanner 8-bit greyscale at 200dpi on another OSX computer, then rotated the second page manually in Preview and saved.
The pdftk hint from racin above worked immediately, concatenated the 2 files and kept the orientations and resolution.
After fussing with ghostscript I built this commandline which worked too, and created a PDF just as sharp but 65% smaller:
gs -q -sDEVICE=pdfwrite -sPAPERSIZE=a4 -dBATCH -dNOPAUSE -dFirstPage=1 -dLastPage=2 -r200 -sOutputFile=out.pdf *.pdf
the solutioner from 90.52.19.168 at Fri, 19 Dec 2008 20:57:23 +0000:
easy: just upload your pdf (plus any interspersed jpg) files to lulu.com to produce a book and let lulu do the conversion into a single pdf. after the conversion finished you can download the single pdf. as a bonus, you can even order a hardcopy of your book. ;)
FC3 from 148.177.1.210 at Wed, 07 Jan 2009 15:31:33 +0000:
PDFBox works fine for me. See www.pdfbox.org
Merging Adobe Acrobat documents requires no more than instantiating a merger object and three method calls. Merged everything I've thrown at it so far and since it's Java it runs everywhere I've tried it without a lot of grief.
Sten from 83.233.5.75 at Tue, 13 Jan 2009 19:54:26 +0000:
PrimoPDF converts anything to pdf and also allows appending one pdf after the other. Try eg word+jpg+pdf+excel and they will come out as one nice pdf document!
Gunnsteinn from 144.92.48.225 at Wed, 14 Jan 2009 00:50:50 +0000:
I would like to point out a free online tool for merging multiple PDF files, without having to pay anything or install any shareware.
The URL is: http://www.MergePDF.net
S.R. from 24.6.237.134 at Wed, 14 Jan 2009 11:03:03 +0000:
Thank you so freaking much.
Trying to put an NIH grant together and their new application now requires the letters of support from your collaborators to be in one file. (Stupid! They are each PDFs on letterhead!)
You just saved me hours of time.
fwiw, I used PDF pages. :)
Mike Heins from 12.176.97.130 at Fri, 16 Jan 2009 17:14:50 +0000:
Easiest I found is PDF::Reuse, the Perl package. Just install it:
perl -MCPAN -e 'install("PDF::Reuse")'
Then just use this perl script:
# file: concat-pdf.pl
use strict;
use PDF::Reuse;
prFile("out.pdf");
for(@ARGV) {
prDoc($_);
}
prEnd();
This will put all files on the command line into the file out.pdf.
It is that simple -- worked first time for me.
Sanjeev from 125.22.2.42 at Wed, 04 Feb 2009 05:20:18 +0000:
Thanks for sharing such info.
We are using the ps2pdf utility to generate a PDF document and then pdcat to concatenate the two PDF files. But sometimes we face a problem while concatenating two PDF files. (This happens in only about 1 in 500 cases, but users are complaining about it and want a 0% error rate.)
The steps we perform are as follows:
1. We generate a PDF file in our application using XING and ps2pdf.
2. We get a second PDF file from the user (it may be generated by any tool).
3. We use pdcat to concatenate the two files.
Everything works fine except in a very few cases where the resulting PDF file is corrupted. Does anybody have an idea what the reason or a solution could be?
Many thanks for your help.
Pieter from 196.35.158.183 at Wed, 25 Feb 2009 11:22:01 +0000:
Thanks, that was VERY useful. The GhostScript method worked perfectly for me.
Frank Daley from 202.129.83.42 at Thu, 05 Mar 2009 02:39:44 +0000:
I have successfully been using the Ghostscript solution for some time.
However I recently had a need to add a consecutive series of page numbers to a concatenated file.
Has anyone been able to achieve this(other than using Acrobat!)?
Frank
Jack from 209.206.245.126 at Thu, 12 Mar 2009 23:04:35 +0000:
Hey, just thought I'd put this in as a quick one-off to help ppl from bash command line:
gs -q -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=../out.pdf *.pdf
steve from 12.11.224.4 at Tue, 17 Mar 2009 15:08:05 +0000:
This guy's Perl script "pscat" has always worked for me, in conjunction with "pdftops", from xpdf (mentioned above):
http://www.kfki.hu/~cspeter/util/index.html#pscat
Pscat says it's a "simple hack" but it's always worked perfectly for me, without installing lots of other software.
Patrick Pfleiderer from 84.58.115.193 at Wed, 18 Mar 2009 08:48:24 +0000:
Thank you for you efforts, Matthew, it saved me great pains. I was fine with the pdfpages way.
kinenveu from 163.7.4.20 at Mon, 30 Mar 2009 03:21:16 +0000:
i advise you the same as him.
it's a wonderfull free tool.
-----------------------
Jason from 24.99.54.26 at Thu, 01 Mar 2007 05:19:17 +0000:
You can use ImageMagick's "convert" command:
convert *.pdf all.pdf
It will even create a lowres pdf from encrypted pdfs.
-----------------------------
Luis from 141.213.95.84 at Wed, 15 Apr 2009 15:40:14 +0000:
For OS X I found this thing: http://monkeybreadsoftware.de/Freeware/CombinePDFs.shtml
it was super easy and did a really good job
-L
Technophreak from 70.81.23.48 at Fri, 17 Apr 2009 23:19:45 +0000:
PDFTK worked best for me !! Really great tool !
Pseudonymous Coward from 216.165.95.70 at Tue, 28 Apr 2009 04:24:43 +0000:
pdfpages just worked for me. Thanks a bunch, man.
Vidness from 24.28.194.244 at Wed, 13 May 2009 18:31:35 +0000:
The concat-pdf.pl one was the first I tried and it worked like a champ. Thanks!
Matt from 18.85.23.127 at Thu, 28 May 2009 21:28:48 +0000:
There's http://www.pdfjoin.com/
It's in the fedora repos as well, (yum install pdfjam)
Paul Lorento from 216.226.43.194 at Tue, 16 Jun 2009 19:39:54 +0000:
I really don't understand why its so complicated. If Adobe is ripping you OFF with a high price. Well just get their latest ADOBE PRO and get a Keygen to have FULL Free access to the software. Usually I'm buying them, but not sure who is f*** who in this situation. When I pay a 1000$+ for a simple software, I really feel that they f*** me !
Jan from 80.199.1.54 at Thu, 18 Jun 2009 19:44:36 +0000:
I've used pdfconcat from amyuni for some years without problems, but some "new" PDFs got "scrambled", and this page saved me a lot of searching and testing. Thanks.
PDFTK solved my problems.
Phill Rogers from 212.9.31.115 at Thu, 30 Jul 2009 11:41:30 +0000:
I've struggled for ages with catting PDFs due to our needs:
Dealing with a few thousand moderately sized PDFs.
Maintain text search-ability & further PDF manipulation.
Not all FOSS tools install/compile cleanly on our AIX box.
PDF::API2 pm is great for most things but runs out of memory.
Thanks to your page, I tried the PDF::Reuse pm with great success.
E.g. 2012 PDFs totaling 11136 pages & 859MB merged in 4.5 minutes.
Anne from 24.206.100.230 at Fri, 31 Jul 2009 23:02:13 +0000:
Tried pdftk and it worked great for me. Thanks!
Justin from 83.244.142.68 at Mon, 03 Aug 2009 12:58:25 +0000:
I have cygwin on my Windows XP box which allowed me to use ghostscript, and it worked like a charm. (I guess you could also use the standard windows ghostscript/ghostgum too?)
patrick from 63.224.45.178 at Mon, 03 Aug 2009 19:27:28 +0000:
Had trouble with ghostscript on AIX throwing a segmentation fault when given a list of PDFs to concatenate. But it always works when concatenating just 2 files. To combine more, just copy the previous output file to a temporary file, then combine it with the next one in the series. Works great and it's free!
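The pairwise workaround patrick describes could be scripted roughly as follows. This is only a sketch under a few assumptions: the function name merge_pairwise is made up for illustration, the gs switches are the ones already used elsewhere in this thread, and mktemp is assumed to be available on the system.

```shell
# merge_pairwise OUT IN1 IN2 [IN3 ...]
# Hypothetical helper: never hands gs more than two input files at a time.
merge_pairwise() {
    out=$1
    shift
    if [ "$#" -lt 2 ]; then
        echo "usage: merge_pairwise OUT IN1 IN2 [IN3 ...]" >&2
        return 2
    fi

    acc=$(mktemp) || return 1    # everything merged so far
    step=$(mktemp) || return 1   # output of the current two-file merge

    # Start with a copy of the first input.
    cp -- "$1" "$acc" || return 1
    shift

    # Append the remaining inputs one at a time.
    for next in "$@"; do
        if ! gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
                -sOutputFile="$step" "$acc" "$next"; then
            rm -f "$acc" "$step"
            return 1
        fi
        mv -- "$step" "$acc"
    done

    rm -f "$step"
    mv -- "$acc" "$out"
}
```

It would be called as `merge_pairwise all.pdf a.pdf b.pdf c.pdf`. Each pass rewrites everything merged so far, so expect this to get slow with a large number of inputs.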
Volker from 78.50.244.205 at Sun, 16 Aug 2009 18:02:29 +0000:
The reason why Ghostscript generates small PDFs (as Kevin reported) may be that it re-encodes images, reducing their quality in the process. If you want to retain image quality and embed fonts for compatibility, you have to use the following options:
gs -dNOPAUSE -dSAFER -sDEVICE=pdfwrite -sOUTPUTFILE=result.pdf \
-sPAPERSIZE=a4 -dCompatibilityLevel=1.3 \
-dEmbedAllFonts=true -dSubsetFonts=true -dMaxSubsetPct=100 \
-dAutoFilterColorImages=false -dAutoFilterGrayImages=false \
-dColorImageFilter=/FlateEncode -dGrayImageFilter=/FlateEncode \
-dAutoFilterMonoImages=false -dMonoImageFilter=/CCITTFaxEncode \
src1.pdf src2.pdf ...
Miaoyin from 98.212.61.5 at Sun, 30 Aug 2009 03:53:29 +0000:
Namo Amitabha! Thanks a lot for the helpful info! The LaTeX pdfpages package approach also worked for me the first time :-)
Using this approach, I also ran into the problem you mentioned: big top and bottom margins on unrotated landscape pages when combining PDF files in both portrait and landscape format. I then found the solutions shown below.
For example, I was combining 3 PDF files: file1 and file2 are portrait, file3 is landscape. There are 2 possible solutions:
1) still leave file3 as landscape in final pdf:
\includepdf[pages=-]{file1.pdf}
\includepdf[pages=-]{file2.pdf}
\includepdf[pages=-,landscape=true]{file3.pdf}
In the resulting file, file1 and file2's pages are portrait, while file3's pages are landscape--the same as in the original.
2) make all pages portrait in the final pdf:
\includepdf[pages=-]{file1.pdf}
\includepdf[pages=-]{file2.pdf}
\includepdf[pages=-,angle=90]{file3.pdf}
In the resulting file, file3's pages are rotated 90 degrees counter-clockwise, so all pages are portrait.
Enjoy and thanks again! :-) Namo Amitabha!
David from 83.233.152.131 at Mon, 31 Aug 2009 09:34:38 +0000:
Concatenating pdf files in Mac OS X is actually really simple.
Open both documents in Preview.app. Choose View > Sidebar in the menu bar. At the bottom of the sidebar, choose thumbnail view. If you select a page in one of the documents, you can drag it to the other document, wherever you want it. You can then save the edited PDFs.
Juaque from 198.54.202.250 at Tue, 01 Sep 2009 09:10:24 +0000:
I use PrimoPDF (free); it installs a PDF printer which has an appending function.
It is very simple; almost anyone can figure it out.
You can even add pieces of other documents you want to include in your new PDF.
P.S. pdftk also works for PDFs.
Hope this will help some people.
Piraja from 91.154.11.129 at Wed, 16 Sep 2009 11:20:12 +0000:
Thank you for this discussion, and especially to racin – pdftk worked beautifully! Recapitulation:
sudo apt-get install pdftk
cd <folder containing your pdf's>
pdftk *.pdf cat output all.pdf
Mike Stewart from 148.167.2.10 at Fri, 30 Oct 2009 19:45:01 +0000:
Thanks for putting this page together. Saved me countless hours of work too. PDFTK worked for me too.
musper from 78.2.111.22 at Mon, 30 Nov 2009 07:57:54 +0000:
Bump!
I've installed PDF Editor from the Ubuntu repo via Synaptic. I guess other distros have it too.
PDF Editor is a no-brainer GUI tool: open your .pdf, import more pages from other .pdf files, order pages on import... Great tool that worked for me in no time.
Tom from 12.162.242.138 at Mon, 07 Dec 2009 16:39:17 +0000:
I need to concatenate 3-7000 PDFs together. Each PDF must be printed in a specific order. The challenge is how to create a single (really big) PDF to send to a printer when most libraries want to append the PDFs in-memory.
I'm looking for a utility or library that will allow me to append pdfs under program control and spool the result to disk as it's created rather than keep it in memory.
Al_ from 196.3.50.254 at Wed, 23 Dec 2009 13:49:57 +0000:
pdftk worked for me. So far I've tried it on a flat directory structure (i.e., all input PDFs in the same directory), but it should be easy to write a bash script to combine files from all subdirectories. However, I also need bookmarks (a.k.a. an outline) in the final big PDF that reflect the original directory structure the input files came from, with directory and subdirectory names as well as file names.
Any suggestions?
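For the first half of that (combining from all subdirectories), a minimal sketch might look like the following. It does not attempt the bookmarks. The function name merge_tree is hypothetical, the pdftk invocation is the same "cat output" form quoted elsewhere in these comments, and file names are assumed not to contain newlines.

```shell
# merge_tree DIR OUT
# Hypothetical helper: concatenates every *.pdf found under DIR, in sorted
# path order, into OUT. Keep OUT outside DIR so that a second run does not
# pick up the previous result as an input.
merge_tree() {
    dir=$1
    out=$2

    # Collect the paths, sorted bytewise so the order is predictable.
    list=$(find "$dir" -type f -name '*.pdf' | LC_ALL=C sort)
    if [ -z "$list" ]; then
        echo "no PDF files found under $dir" >&2
        return 1
    fi

    # Split the list on newlines only, so paths with spaces survive,
    # and turn off globbing while doing it.
    old_ifs=$IFS
    IFS='
'
    set -f
    set -- $list
    set +f
    IFS=$old_ifs

    pdftk "$@" cat output "$out"
}
```

A call such as `merge_tree ./reports ../all.pdf` would then merge everything below ./reports. Because the paths are sorted as whole strings, files are grouped by directory, which is at least the page order a directory-based outline would need.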
Jeff Dickens from 67.158.116.42 at Wed, 23 Dec 2009 21:03:59 +0000:
Regarding the pyPDF solution: Where do you see pyPDF 2.0 ? The latest release I can find is 1.12.
Richard J Caul from 71.146.205.150 at Fri, 29 Jan 2010 07:15:00 +0000:
I am using pdfsam (for split and merge) for this purpose. Great tool, works fine for me. www.pdfsam.org
mario from 192.150.195.19 at Mon, 01 Feb 2010 14:57:36 +0000:
Use Bullzip PDF Printer. Very easy to use, and it works in every situation. I've been using it for 3 years...
racin from 160.228.152.151 at Tue, 6 Dec 2005 23:27:27 +0000:
You can also use pdftk.
To concatenate files, use: pdftk fichier*.pdf cat output all.pdf