Merging PDFs with PDFBox
Merging Portable Document Format (PDF) documents using PDFBox couldn’t be simpler. The developers of PDFBox have taken care of all of the hard work and encapsulated it in one class of their Application Programming Interface. All you need to do is use it.
The class I am referring to is the PDFMergerUtility class. This class provides everything you need to take multiple single-page or multi-page PDF documents and merge them into one PDF document. Below I will go over the simple steps of using this class to merge all PDFs located in a directory without having to pass each file as an argument.
The first step is to initialize the class as follows:
PDFMergerUtility mergePdf = new PDFMergerUtility();
With the class initialized we can start to use it to merge our PDFs. The next step in our process is to read and store the two arguments that get passed into our application for later use. When invoking our utility from the command line we expect two arguments to be passed in: first, the folder that contains the documents, and second, the file name of the final merged PDF. We store these arguments as two String variables:
String folder = args[0];
String destinationFileName = args[1];
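The utility as described assumes that exactly two arguments are always supplied. As a minimal sketch of how that assumption could be checked before the values are used, the hypothetical MergeArgs helper below (not part of the original utility) rejects any other argument count with a usage message:

```java
// Hypothetical helper, not part of the original utility: it validates the
// two command-line arguments before they are stored.
class MergeArgs {
    final String folder;
    final String destinationFileName;

    private MergeArgs(String folder, String destinationFileName) {
        this.folder = folder;
        this.destinationFileName = destinationFileName;
    }

    // Returns the parsed arguments, or throws if the count is not exactly two.
    static MergeArgs parse(String[] args) {
        if (args == null || args.length != 2) {
            throw new IllegalArgumentException(
                    "Usage: <folder> <destinationFileName>");
        }
        return new MergeArgs(args[0], args[1]);
    }
}
```

Calling MergeArgs.parse(args) at the top of main would then replace the two direct array reads, at the cost of one extra small class.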
The next step is to get hold of all of the files in the directory that was passed to our utility and stored in the String variable called folder. For this I wrote a small method that uses the java.io.File class:
private static String[] getFiles(String folder) throws IOException {
    File _folder = new File(folder);
    if (!_folder.isDirectory()) {
        throw new IOException("Path is not a directory");
    }
    String[] filesInFolder = _folder.list();
    return filesInFolder;
}
The first thing we check is that the path passed to us is in fact a directory. If not, we throw an IOException with the message Path is not a directory. After we have verified that this is a directory, we use the list() function from the java.io.File class to get the files from the directory. The list() method returns an array containing the names of all of the files in the directory. We store this in a String array and return this array to the caller.
filesInFolder = _folder.list();
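Note that list() returns every entry in the directory, in no guaranteed order, and returns null if an I/O error occurs. The following is a sketch of a stricter variant, under the assumption that only files ending in .pdf should be merged and that alphabetical order is wanted. The PdfLister class and listPdfs method are hypothetical names, not part of the original utility:

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical variant of getFiles(): keeps only *.pdf names and sorts them.
class PdfLister {
    static String[] listPdfs(String folder) throws IOException {
        File dir = new File(folder);
        if (!dir.isDirectory()) {
            throw new IOException("Path is not a directory");
        }
        // Keep only entries whose names end in .pdf, ignoring case.
        String[] names = dir.list(
                (parent, name) -> name.toLowerCase(Locale.ROOT).endsWith(".pdf"));
        if (names == null) {
            // list() returns null when the directory cannot be read.
            throw new IOException("Could not list directory: " + folder);
        }
        // list() makes no ordering promise, so sort for a predictable merge order.
        Arrays.sort(names);
        return names;
    }
}
```

Because the merged document is written into the same folder, a second run with this filter would still pick up the previous output file, so excluding the destination file name may also be worth considering.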
Because the final steps of our utility can cause one of two exceptions to be thrown, we will enclose them within a try/catch block. The first thing we do inside our try block is to store the size of the array as an int variable called numberOfFiles; we will be using this inside our for loop a little later. Next we store our files in a String array called, you guessed it, files.
Armed with this information we can go ahead and loop through our array of files. We need to loop through our files because each one has to be added to the sources of the PDFMergerUtility using its addSource() method. The for loop is also where we make use of the first of our two variables, numberOfFiles.
for(int i = 0; i < numberOfFiles; i++)
Inside the loop we add each file to the PDFMergerUtility’s sources using the following line of code:
mergePdf.addSource(folder + File.separator + files[i]);
The only steps left for us are to set the file name and location of the merged document and then call the PDFMergerUtility’s mergeDocuments() method:
mergePdf.setDestinationFileName(folder + File.separator + destinationFileName);
mergePdf.mergeDocuments();
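To show how the loop, the destination and the merge call fit together inside the try block, here is a rough sketch of the flow. So that it runs without PDFBox on the classpath, a hypothetical Merger interface stands in for PDFMergerUtility and mirrors only the three calls used in this article. The MergeFlow class and mergeAll method are likewise illustrative names. The stand-in declares only IOException, since COSVisitorException is a PDFBox class.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical stand-in that mirrors the three PDFMergerUtility calls used here.
interface Merger {
    void addSource(String source);
    void setDestinationFileName(String fileName);
    void mergeDocuments() throws IOException;
}

class MergeFlow {
    // Adds every file as a source, sets the destination and merges.
    // Returns false if the merge step reports an I/O problem.
    static boolean mergeAll(Merger mergePdf, String folder,
                            String[] files, String destinationFileName) {
        try {
            int numberOfFiles = files.length;
            for (int i = 0; i < numberOfFiles; i++) {
                mergePdf.addSource(folder + File.separator + files[i]);
            }
            mergePdf.setDestinationFileName(
                    folder + File.separator + destinationFileName);
            mergePdf.mergeDocuments();
            return true;
        } catch (IOException e) {
            System.err.println("Merge failed: " + e.getMessage());
            return false;
        }
    }
}
```

With the real library, the PDFMergerUtility instance would take the place of the Merger argument, and the catch section would also need to handle whatever other exceptions that version of mergeDocuments() declares.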
To close off our try block we catch the two possible exceptions that could be thrown by the methods used inside the try block. These are the COSVisitorException and the IOException. With this done our utility is complete! I hope you enjoyed this tutorial and find the utility useful. You can download the complete source here and use it as you see fit. Please feel free to post your comments as to how this utility can be improved and expanded upon.