How to get the page of a field in PDFBox 2? - java

I'm trying to get the page of each form field in my project, but I don't know how to get the page number for each field.
I have this code:
String formTemplate = "Template.pdf";
String filledForm = "filledForm.pdf";
PDDocument pdfDocument = PDDocument.load(new File(formTemplate));
PDAcroForm acroForm = pdfDocument.getDocumentCatalog().getAcroForm();
if (acroForm != null)
{
PDField field = acroForm.getField( "name" );
field.getAcroForm().setNeedAppearances(true);
field.setValue("my name");
field = acroForm.getField( "date" ); // reassign, otherwise the "name" field is overwritten
field.setValue("my date");
}
pdfDocument.save(filledForm);
pdfDocument.close();
}
How do I get the page numbers of the fields?
thanks
ron

This will show you on what page(s) (0-based) the field appears:
PDField field = acroForm.getField( "date" );
for (PDAnnotationWidget widget : field.getWidgets())
{
PDPage page = widget.getPage();
if (page == null)
{
// incorrect PDF. Plan B: try all pages to check the annotations.
for (int p = 0; p < pdfDocument.getNumberOfPages(); ++p)
{
List<PDAnnotation> annotations = pdfDocument.getPage(p).getAnnotations();
for (PDAnnotation ann : annotations)
{
if (ann.getCOSObject() == widget.getCOSObject())
{
System.out.println("found at page: " + p);
break;
}
}
}
continue;
}
int pageNum = pdfDocument.getPages().indexOf(page);
System.out.println("found at page: " + pageNum);
}

Related

PDFBOX 2.0+ java flatten annotations freetext created by foxit

I ran into a very tough issue. We have forms that were supposed to be filled out, but some people used annotation freeform text comments in foxit instead of filling the form fields, so the annotations never flatten. When our render software generates the final document annotations are not included.
The solution I tried is to go through the document, get the text content of each annotation, write it to the PDF so that it appears in the final document, and then remove the annotation itself. The problem is that I don't know the font, line spacing, etc. that the annotation uses, and I cannot find out how to get them from PDFBox in order to recreate the text exactly as the annotation looks on the unflattened form.
Basically, I want to flatten free-text annotations created in Foxit (the typewriter comment feature).
Here is the code. It is working, but again I am struggling with figuring out how to get the annotations to write to my final pdf document. Again flatten on the acroform is not working because these are not acroform fields! The live code filters out anything that is not a freetext type annotation, but below code should show my issue.
public static void main(String [] args)
{
String startDoc = "C:/test2/test.pdf";
String finalFlat = "C:/test2/test_FLAT.pdf";
try {
// for testing
try {
//BasicConfigurator.configure();
File myFile = new File(startDoc);
PDDocument pdDoc = PDDocument.load( myFile );
PDDocumentCatalog pdCatalog = pdDoc.getDocumentCatalog();
PDAcroForm pdAcroForm = pdCatalog.getAcroForm();
// set the NeedApperances flag
pdAcroForm.setNeedAppearances(false);
// correct the missing page link for the annotations
for (PDPage page : pdDoc.getPages()) {
for (PDAnnotation annot : page.getAnnotations()) {
System.out.println(annot.getContents());
System.out.println(annot.isPrinted());
System.out.println(annot.isLocked());
System.out.println(annot.getAppearance().toString());
PDPageContentStream contentStream = new PDPageContentStream(pdDoc, page, PDPageContentStream.AppendMode.APPEND,true,true);
int fontHeight = 14;
contentStream.setFont(PDType1Font.TIMES_ROMAN, fontHeight);
float height = annot.getRectangle().getLowerLeftY();
String s = annot.getContents().replaceAll("\t", " ");
String ss[] = s.split("\\r");
for(String sss : ss)
{
contentStream.beginText();
contentStream.newLineAtOffset(annot.getRectangle().getLowerLeftX(),height );
contentStream.showText(sss);
height = height + fontHeight * 2 ;
contentStream.endText();
}
contentStream.close();
page.getAnnotations().remove(annot);
}
}
pdAcroForm.flatten();
pdDoc.save(finalFlat);
pdDoc.close();
}
catch (Exception e) {
e.printStackTrace();
}
}
catch (Exception e) {
System.err.println("Exception: " + e.getLocalizedMessage());
}
}
This was not a fun one. After a million different tests I STILL do not understand all the nuances, but this is the version that appears to flatten all PDF files and annotations, provided they are visible on the PDF. I tested PDFs from about half a dozen creators, and if an annotation is visible on a page this hopefully flattens it. I suspect there is a better way that pulls the matrix and transforms it, but this is the only way I got it to work everywhere.
public static void flattenv3(String startDoc, String endDoc) {
org.apache.log4j.Logger.getRootLogger().setLevel(org.apache.log4j.Level.INFO);
String finalFlat = endDoc;
try {
try {
//BasicConfigurator.configure();
File myFile = new File(startDoc);
PDDocument pdDoc = PDDocument.load(myFile);
PDDocumentCatalog pdCatalog = pdDoc.getDocumentCatalog();
PDAcroForm pdAcroForm = pdCatalog.getAcroForm();
if (pdAcroForm != null) {
pdAcroForm.setNeedAppearances(false);
pdAcroForm.flatten();
}
// set the NeedApperances flag
boolean isContentStreamWrapped;
int ii = 0;
for (PDPage page: pdDoc.getPages()) {
PDPageContentStream contentStream;
isContentStreamWrapped = false;
List < PDAnnotation > annotations = new ArrayList < > ();
for (PDAnnotation annotation: page.getAnnotations()) {
if (!annotation.isInvisible() && !annotation.isHidden() && annotation.getNormalAppearanceStream() != null)
{
ii++;
if (ii > 1) {
// contentStream.close();
// continue;
}
if (!isContentStreamWrapped) {
contentStream = new PDPageContentStream(pdDoc, page, AppendMode.APPEND, true, true);
isContentStreamWrapped = true;
} else {
contentStream = new PDPageContentStream(pdDoc, page, AppendMode.APPEND, true);
}
PDAppearanceStream appearanceStream = annotation.getNormalAppearanceStream();
PDFormXObject fieldObject = new PDFormXObject(appearanceStream.getCOSObject());
contentStream.saveGraphicsState();
Matrix transformationMatrix = new Matrix();
float lowerLeftX = annotation.getNormalAppearanceStream().getBBox().getLowerLeftX();
float lowerLeftY = annotation.getNormalAppearanceStream().getBBox().getLowerLeftY();
PDRectangle bbox = appearanceStream.getBBox();
PDRectangle fieldRect = annotation.getRectangle();
// width difference between the annotation rectangle and the appearance bounding box
float xScale = fieldRect.getWidth() - bbox.getWidth();
lowerLeftX = fieldRect.getLowerLeftX();
lowerLeftY = fieldRect.getLowerLeftY();
if (bbox.getLowerLeftX() <= 0 && bbox.getLowerLeftY() < 0 && Math.abs(xScale) < 1) //BASICALLY EQUAL TO 0 WITH ROUNDING
{
lowerLeftY = fieldRect.getLowerLeftY() - bbox.getLowerLeftY();
if (bbox.getLowerLeftX() < 0 && bbox.getLowerLeftY() < 0) //THis is for the o
{
lowerLeftX = lowerLeftX - bbox.getLowerLeftX();
}
} else if (bbox.getLowerLeftX() == 0 && bbox.getLowerLeftY() < 0 && xScale >= 0) {
lowerLeftX = fieldRect.getUpperRightX();
} else if (bbox.getLowerLeftY() <= 0 && xScale >= 0) {
lowerLeftY = fieldRect.getLowerLeftY() - bbox.getLowerLeftY() - xScale;
} else if (bbox.getUpperRightY() <= 0) {
if (annotation.getNormalAppearanceStream().getMatrix().getShearY() < 0) {
lowerLeftY = fieldRect.getUpperRightY();
lowerLeftX = fieldRect.getUpperRightX();
}
} else {
}
transformationMatrix.translate(lowerLeftX,
lowerLeftY);
contentStream.transform(transformationMatrix);
contentStream.drawForm(fieldObject);
contentStream.restoreGraphicsState();
contentStream.close();
}
}
page.setAnnotations(annotations);
}
pdDoc.save(finalFlat);
pdDoc.close();
File file = new File(finalFlat);
// Desktop.getDesktop().browse(file.toURI());
} catch (Exception e) {
e.printStackTrace();
}
} catch (Exception e) {
System.err.println("Exception: " + e.getLocalizedMessage());
}
}
}

PDFBOX extract image with Color space Indexed

I'm trying to extract all the images from a PDF using the code below. It works fine for all images except those with an Indexed color space.
try (final PDDocument document = PDDocument.load(new File("./pdfs/22.pdf"))){
PDPageTree list = document.getPages();
for (PDPage page : list) {
PDResources pdResources = page.getResources();
int i = 1;
for (COSName name : pdResources.getXObjectNames()) {
PDXObject o = pdResources.getXObject(name);
if (o instanceof PDImageXObject) {
PDImageXObject image = (PDImageXObject)o;
String filename = OUTPUT_DIR + "extracted-image-" + i + ".png";
ImageIO.write(image.getImage(), "png", new File(filename));
i++;
}
}
}
} catch (IOException e){
System.err.println("Exception while trying to create pdf document - " + e);
}
Am I missing something? How can I extract images of this type?

Not able to get the specific location of the bookmark in a page using PDFBOX

I am using the PDFBox 2.0.2 jar to add more than one PDF to an existing PDF file that has heading bookmarks. To do that, I split it and merge in the other PDFs.
Splitter splitter = new Splitter();
splitter.setStartPage(1);
splitter.setEndPage(noOfPagesInHeadingBkmrkedPDF);
Before the split and merge, I keep all the bookmarks in a HashMap with the page number as key and the bookmark name as value. After the merge I set the bookmarks back using that map. My question is: how do I get the specific coordinates (location) of a bookmark on the page, so that after the merge I can set it back to that particular location on the page?
Code snippet for creating the HashMap before Split :
public void getAllBookmarks(PDOutlineNode bookmarksInOriginalFile, String emptyString, Map<Integer, String> bookmarkMap) throws IOException {
PDOutlineItem current = null;
if (null != bookmarksInOriginalFile)
current = bookmarksInOriginalFile.getFirstChild();
while (current != null) {
Integer pageNumber = 0;
PDPageDestination pd = null;
if (current.getDestination() instanceof PDPageDestination) {
pd = (PDPageDestination) current.getDestination();
pageNumber = (pd.retrievePageNumber() + 1); // Do we have any method available to get the location on the specific page ??
}
if (current.getAction() instanceof PDActionGoTo) {
PDActionGoTo gta = (PDActionGoTo) current.getAction();
if (gta.getDestination() instanceof PDPageDestination) {
pd = (PDPageDestination) gta.getDestination();
pageNumber = (pd.retrievePageNumber() + 1);
}
}
String bookmarkName = emptyString + current.getTitle();
if(null!=bookmarkName && !EMPTY_STRING.equalsIgnoreCase(bookmarkName)){
bookmarkMap.put(pageNumber-1,bookmarkName);
}
getAllBookmarks(current, emptyString,bookmarkMap);
current = current.getNextSibling();
}
}
Any help would be much appreciated.
Thank you...
I was able to solve my problem using @TilmanHausherr's suggestion, so I am answering my own question. I changed the code as follows:
public void getAllBookmarks(PDOutlineNode bookmarksInOriginalFile, String emptyString, Map<Integer,BookmarkMetaDataBO> bookmarkMap) throws IOException {
PDOutlineItem current = null;
if (null != bookmarksInOriginalFile)
current = bookmarksInOriginalFile.getFirstChild();
while (current != null) {
Integer pageNumber = 0;
PDPageDestination pd = null;
PDPageXYZDestination pdx = null;
// These values give the specific location on the page
int left = 0;
int top = 0;
if (current.getDestination() instanceof PDPageXYZDestination) {
pdx = (PDPageXYZDestination) current.getDestination();
pageNumber = (pdx.retrievePageNumber() + 1);
left = pdx.getLeft();
top = pdx.getTop();
}
if (current.getAction() instanceof PDActionGoTo) {
PDActionGoTo gta = (PDActionGoTo) current.getAction();
if (gta.getDestination() instanceof PDPageDestination) {
pd = (PDPageDestination) gta.getDestination();
pageNumber = (pd.retrievePageNumber() + 1);
}
}
String bookmarkName = emptyString + current.getTitle();
if(null!=bookmarkName && !EMPTY_STRING.equalsIgnoreCase(bookmarkName)){
BookmarkMetaDataBO bkmrkBo = new BookmarkMetaDataBO();
bkmrkBo.setLeft(left);
bkmrkBo.setTop(top);
bookmarkMap.put(pageNumber-1,bkmrkBo);
}
getAllBookmarks(current, emptyString,bookmarkMap);
current = current.getNextSibling();
}
}
Thank you...
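One caveat with a Map keyed by page number is that two bookmarks pointing to the same page overwrite each other. A possible variation, sketched here with plain java.util types only, keeps a list of bookmarks per page. The BookmarksByPage class and its Bookmark class are hypothetical stand-ins (BookmarkMetaDataBO's real fields are not shown in the post), so treat this as an illustration rather than a drop-in replacement.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups bookmark positions by 0-based page index.
final class BookmarksByPage {

    // Stand-in for BookmarkMetaDataBO (its real fields are not shown in the post).
    static final class Bookmark {
        final String title;
        final int left;
        final int top;

        Bookmark(String title, int left, int top) {
            this.title = title;
            this.left = left;
            this.top = top;
        }
    }

    private final Map<Integer, List<Bookmark>> byPage = new TreeMap<>();

    // Appends instead of overwriting, so several bookmarks can share one page.
    void add(int pageIndex, String title, int left, int top) {
        byPage.computeIfAbsent(pageIndex, k -> new ArrayList<>())
              .add(new Bookmark(title, left, top));
    }

    // Bookmarks recorded for the page, in insertion order; empty if none.
    List<Bookmark> on(int pageIndex) {
        return byPage.getOrDefault(pageIndex, Collections.emptyList());
    }

    public static void main(String[] args) {
        BookmarksByPage bookmarks = new BookmarksByPage();
        bookmarks.add(0, "Chapter 1", 72, 720);
        bookmarks.add(0, "Section 1.1", 72, 500);
        bookmarks.add(2, "Chapter 2", 72, 720);
        for (Bookmark b : bookmarks.on(0)) {
            System.out.println(b.title + " at left=" + b.left + ", top=" + b.top);
        }
    }
}
```

In getAllBookmarks, the bookmarkMap.put(...) call would then become an add(...) call with the title and the left/top values read from the PDPageXYZDestination.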

Image is not inserting in db from excel in java spring boot?

Hi everyone, I'm a beginner in Java Spring Boot. I want to read data from an Excel file that contains both text and images. I can read the text and insert it into the database successfully, but I cannot read the image, which I also have to insert into the database. I have attached the error below. Can anyone give me a solution? Any help would be appreciated. Thanks in advance.
@RequestMapping("uploadQuestion")
public String uploadQuestion(@RequestParam("file") MultipartFile file, @RequestParam Long examId, Model model,
HttpSession session) throws IOException, InvalidFormatException {
int flag = 0;
String id = (String) session.getAttribute("userId");
model.addAttribute("examList", onlineExamMasterRepository.findAll());
DataFormatter formatter = new DataFormatter();
List<OnlineExamQuestionMaster> quetions = new ArrayList<OnlineExamQuestionMaster>();
List<OnlineExamOptionMaster> options = new ArrayList<OnlineExamOptionMaster>();
if (file.isEmpty()) {
model.addAttribute("info", "Please select a file to upload");
return "onlinexam/questionUpload :: section";
}
InputStream in = file.getInputStream();
XSSFWorkbook workbook = new XSSFWorkbook(in);
XSSFSheet sheet = workbook.getSheetAt(0);
Row row;
System.out.println(sheet.getLastRowNum());
for (int i = 1; i <= sheet.getLastRowNum(); i++) {
OnlineExamQuestionMaster qm = new OnlineExamQuestionMaster();
OnlineExamQuestionMasterPK qmp = new OnlineExamQuestionMasterPK();
OnlineExamOptionMasterPK omp[] = new OnlineExamOptionMasterPK[4];
OnlineExamOptionMaster om[] = new OnlineExamOptionMaster[4];
qmp.setExamId(examId);
qm.setLogTimestamp(new Date());
qm.setLogUserid(id);
flag++;
row = (Row) sheet.getRow(i);
System.out.println(row.getCell(0).toString());
if (row.getCell(0).toString().equals(null)) {
model.addAttribute("info", "Some columns are null please check and try");
return "onlinexam/questionUpload :: section";
} else {
qmp.setQuestionId(Long.parseLong(formatter.formatCellValue(row.getCell(0))));
if (onlineExamQuestionMasterRepository.exists(qmp)) {
model.addAttribute("message", "Already QuestionId with " + formatter.formatCellValue(row.getCell(0))
+ " Exist for ExamId " + examId);
return "onlinexam/questionUpload :: section";
}
}
if (row.getCell(1).toString().equals("")) {
model.addAttribute("info", "Some columns are null please check and try");
return "onlinexam/questionUpload :: section";
} else
{
row = (Row) sheet.getRow(i);
Iterator<Cell> iterator2 = row.cellIterator();
/*XSSFWorkbook workbook2 = sheet.getWorkbook();
List<XSSFPictureData> pictures = workbook2.getAllPictures();
Iterator<XSSFPictureData> iterator = pictures.iterator();*/
while(iterator2.hasNext())
{
PictureData pictureData = (PictureData)iterator2.next();
String fileextension = pictureData.suggestFileExtension();
byte[] data = pictureData.getData();
if(fileextension.equals("jpeg"))
{
qm.setImage(data);;
}
else
qm.setQidDescription(row.getCell(1).toString().trim());
}
}
if (row.getCell(2).toString().equals("")) {
model.addAttribute("info", "Some columns are null please check and try");
return "onlinexam/questionUpload :: section";
} else {
omp[0] = new OnlineExamOptionMasterPK();
om[0] = new OnlineExamOptionMaster();
omp[0].setQid(Long.parseLong(formatter.formatCellValue(row.getCell(0))));
omp[0].setOptionId("A");
omp[0].setExamId(examId);
om[0].setLogTimestamp(new Date());
om[0].setLogUserid(id);
om[0].setOptionDesc(row.getCell(2).toString().trim());
om[0].setId(omp[0]);
}
if (row.getCell(3).toString().equals("")) {
model.addAttribute("info", "Some columns are null please check and try");
return "onlinexam/questionUpload :: section";
} else {
omp[1] = new OnlineExamOptionMasterPK();
om[1] = new OnlineExamOptionMaster();
omp[1].setExamId(examId);
omp[1].setQid(Long.parseLong(formatter.formatCellValue(row.getCell(0))));
omp[1].setOptionId("B");
om[1].setLogTimestamp(new Date());
om[1].setLogUserid(id);
om[1].setOptionDesc(row.getCell(3).toString().trim());
om[1].setId(omp[1]);
}
if (row.getCell(4).toString().equals("")) {
model.addAttribute("info", "Some columns are null please check and try");
return "onlinexam/questionUpload :: section";
} else {
omp[2] = new OnlineExamOptionMasterPK();
om[2] = new OnlineExamOptionMaster();
omp[2].setExamId(examId);
omp[2].setQid(Long.parseLong(formatter.formatCellValue(row.getCell(0))));
omp[2].setOptionId("C");
om[2].setLogTimestamp(new Date());
om[2].setLogUserid(id);
om[2].setOptionDesc(row.getCell(4).toString().trim());
om[2].setId(omp[2]);
}
if (row.getCell(5).toString().equals("")) {
model.addAttribute("info", "Some columns are null please check and try");
return "onlinexam/questionUpload :: section";
} else {
omp[3] = new OnlineExamOptionMasterPK();
om[3] = new OnlineExamOptionMaster();
omp[3].setExamId(examId);
omp[3].setQid(Long.parseLong(formatter.formatCellValue(row.getCell(0))));
omp[3].setOptionId("D");
om[3].setLogTimestamp(new Date());
om[3].setLogUserid(id);
om[3].setOptionDesc(row.getCell(5).toString().trim());
om[3].setId(omp[3]);
}
if (row.getCell(6).toString().equals("")) {
model.addAttribute("info", "Some columns are null please check and try");
return "onlinexam/questionUpload :: section";
} else {
qm.setAnswer(row.getCell(6).toString().toUpperCase().trim());
}
if (row.getCell(7).toString().equals("")) {
model.addAttribute("info", "Some columns are null please check and try");
return "onlinexam/questionUpload :: section";
} else {
qm.setMarks(Long.parseLong(formatter.formatCellValue(row.getCell(7))));
}
qm.setId(qmp);
quetions.add(qm);
options.addAll(Arrays.asList(om));
}
for (OnlineExamQuestionMaster h : quetions) {
onlineExamQuestionMasterRepository.save(h);
}
System.out.println(options.size());
for (OnlineExamOptionMaster h : options) {
System.out.println(h.toString());
onlineExamOptionMasterRepository.save(h);
}
model.addAttribute("info", flag + "Questions Uploaded Sucessfully");
return "onlinexam/questionUpload :: section";
}
Pictures are not cell content in Excel. They hover in a separate drawing layer (XSSFDrawing in case of XSSF) over the sheet and are anchored to cells.
So if the need is to get pictures according to the position they are anchored to, then we need to:
1. get the drawing layer
2. loop over all shapes in that layer and, if the shape is a picture,
   - get the picture
   - get the anchor position of that picture
The following example does that and produces a Map from anchor positions to XSSFPicture objects. I use XSSFPicture because it provides much more information than XSSFPictureData alone, and the XSSFPictureData can easily be obtained from the XSSFPicture.
import java.io.FileInputStream;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.*;
import java.util.Map;
import java.util.HashMap;
public class ExcelGetPicturesWithPosition {
static Map<String, XSSFPicture> getPicturesWithPosition(XSSFSheet sheet) {
Map<String, XSSFPicture> pictures = new HashMap<String, XSSFPicture>();
XSSFDrawing drawing = sheet.getDrawingPatriarch();
if (drawing == null) return pictures; // the sheet has no drawing layer, so no pictures
for (XSSFShape shape : drawing.getShapes()) {
if (shape instanceof XSSFPicture) {
XSSFPicture picture = (XSSFPicture)shape;
XSSFClientAnchor anchor = picture.getClientAnchor();
String cellAddr = "R" + anchor.getRow1() + "C" + anchor.getCol1();
pictures.put(cellAddr, picture);
}
}
return pictures;
}
public static void main(String[] args) throws Exception {
XSSFWorkbook workbook = (XSSFWorkbook)WorkbookFactory.create(new FileInputStream("ExcelWithPictures.xlsx"));
XSSFSheet sheet = workbook.getSheetAt(0);
Map<String, XSSFPicture> pictures = getPicturesWithPosition(sheet);
System.out.println(pictures);
workbook.close();
}
}
Once we have that Map, it is easy to get the pictures while iterating the sheet. For example, if we are at int row = 2; int column = 1; (both 0-based, like the anchor's row and column):
XSSFPicture picture = pictures.get("R"+row+"C"+column);
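One thing to watch is that the key built while collecting the pictures and the key built while iterating the rows must have exactly the same format. A small sketch using plain java.util types shows one way to keep them in sync by routing both through a single helper. The PictureLookup class, the cellKey helper and the byte[] placeholder standing in for the picture are illustrative names, not part of POI.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: one place that defines the "R<row>C<col>" key format.
final class PictureLookup {

    // Row and column are 0-based, matching the anchor's getRow1()/getCol1().
    static String cellKey(int row, int col) {
        return "R" + row + "C" + col;
    }

    public static void main(String[] args) {
        // Placeholder bytes stand in for the picture data anchored at row 2, column 1.
        Map<String, byte[]> pictures = new HashMap<>();
        pictures.put(cellKey(2, 1), new byte[] {1, 2, 3});

        // While iterating rows, look up column 1 with the same helper.
        for (int row = 0; row < 4; row++) {
            byte[] image = pictures.get(cellKey(row, 1));
            System.out.println("row " + row + ": " + (image == null ? "no picture" : image.length + " bytes"));
        }
    }
}
```

In the upload loop from the question, the lookup for the current row would replace the cast of a Cell to PictureData, which cannot work because pictures are not cells.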

How to read bookmarks in PDF using itext at multi level?

I am using iText-Java to split PDFs at bookmark level.
Does anybody know or have any examples for splitting a PDF at bookmarks that exist at a level 2 or 3?
For ex: I have the bookmarks in the following levels:
Father
|-Son
|-Son
|-Daughter
|-|-Grand son
|-|-Grand daughter
Right now I have the code below, which reads only the top-level bookmark (Father). Basically, the SimpleBookmark.getBookmark(reader) line does all the work.
But I want to read the level 2 and level 3 bookmarks to split the content present between those inner level bookmarks.
public static void splitPDFByBookmarks(String pdf, String outputFolder){
try
{
PdfReader reader = new PdfReader(pdf);
//List of bookmarks: each bookmark is a map with values for title, page, etc
List<HashMap> bookmarks = SimpleBookmark.getBookmark(reader);
for(int i=0; i<bookmarks.size(); i++){
HashMap bm = bookmarks.get(i);
HashMap nextBM = i==bookmarks.size()-1 ? null : bookmarks.get(i+1);
//In my case I needed to split the title string
String title = ((String)bm.get("Title")).split(" ")[2];
log.debug("Titel: " + title);
String startPage = ((String)bm.get("Page")).split(" ")[0];
String startPageNextBM = nextBM==null ? "" + (reader.getNumberOfPages() + 1) : ((String)nextBM.get("Page")).split(" ")[0];
log.debug("Page: " + startPage);
log.debug("------------------");
extractBookmarkToPDF(reader, Integer.valueOf(startPage), Integer.valueOf(startPageNextBM), title + ".pdf",outputFolder);
}
}
catch (IOException e)
{
log.error(e.getMessage());
}
}
private static void extractBookmarkToPDF(PdfReader reader, int pageFrom, int pageTo, String outputName, String outputFolder){
Document document = new Document();
OutputStream os = null;
try{
os = new FileOutputStream(outputFolder + outputName);
// Create a writer for the outputstream
PdfWriter writer = PdfWriter.getInstance(document, os);
document.open();
PdfContentByte cb = writer.getDirectContent(); // Holds the PDF data
PdfImportedPage page;
while(pageFrom < pageTo) {
document.newPage();
page = writer.getImportedPage(reader, pageFrom);
cb.addTemplate(page, 0, 0);
pageFrom++;
}
os.flush();
document.close();
os.close();
}catch(Exception ex){
log.error(ex.getMessage());
}finally {
if (document.isOpen())
document.close();
try {
if (os != null)
os.close();
} catch (IOException ioe) {
log.error(ioe.getMessage());
}
}
}
Your help is much appreciated.
Thanks in advance! :)
You get an ArrayList<HashMap> when you call SimpleBookmark.getBookmark(reader); (do the cast if you need it). Try iterating through that ArrayList to see its structure. If a bookmark has sons (as you call them), it will contain another list with the same structure.
A recursive method could be the solution.
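A minimal sketch of such a recursive method, written against plain java.util collections so that it runs without iText. It assumes that each bookmark map stores its children as a nested list under the key "Kids" (to my knowledge the key iText 5's SimpleBookmark uses, but verify it by printing one map) and that the "Page" value starts with the page number, as in the question's code. The BookmarkWalker class and its methods are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BookmarkWalker {

    // Flattens the bookmark tree depth-first into "level:title:page" entries.
    static List<String> flatten(List<? extends Map<String, Object>> bookmarks) {
        List<String> out = new ArrayList<>();
        walk(bookmarks, 1, out);
        return out;
    }

    @SuppressWarnings("unchecked")
    private static void walk(List<? extends Map<String, Object>> bookmarks, int level, List<String> out) {
        if (bookmarks == null) {
            return;
        }
        for (Map<String, Object> bm : bookmarks) {
            Object page = bm.get("Page");
            String pageNumber = page == null ? "?" : page.toString().split(" ")[0];
            out.add(level + ":" + bm.get("Title") + ":" + pageNumber);
            // Assumption: children live under "Kids" as a list of the same kind of map.
            Object kids = bm.get("Kids");
            if (kids instanceof List) {
                walk((List<? extends Map<String, Object>>) kids, level + 1, out);
            }
        }
    }

    // Builds one bookmark map; kids may be null for a leaf.
    private static Map<String, Object> node(String title, String page, List<Map<String, Object>> kids) {
        Map<String, Object> bm = new HashMap<>();
        bm.put("Title", title);
        bm.put("Page", page);
        if (kids != null) {
            bm.put("Kids", kids);
        }
        return bm;
    }

    // Hand-built tree shaped like the example in the question.
    static List<Map<String, Object>> sampleTree() {
        Map<String, Object> grandSon = node("Grand son", "4 Fit", null);
        Map<String, Object> son = node("Son", "2 Fit", null);
        Map<String, Object> daughter = node("Daughter", "3 Fit", Arrays.asList(grandSon));
        Map<String, Object> father = node("Father", "1 Fit", Arrays.asList(son, daughter));
        return Arrays.asList(father);
    }

    public static void main(String[] args) {
        for (String entry : flatten(sampleTree())) {
            System.out.println(entry);
        }
    }
}
```

With the flattened list in hand, the split can use consecutive entries at the desired level as start and end pages instead of only the top-level ones.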
For reference, for those who are looking at this using iText 7:
public void walkOutlines(PdfOutline outline, Map<String, PdfObject> names, PdfDocument pdfDocument,List<String>titles,List<Integer>pageNum) { //----------loop traversing all paths
for (PdfOutline child : outline.getAllChildren()){
if(child.getDestination() != null) {
prepareIndexFile(child,names,pdfDocument,titles,pageNum);
}
}
}
//-----Getting pageNumbers from outlines
public void prepareIndexFile(PdfOutline outline, Map<String, PdfObject> names, PdfDocument pdfDocument,List<String>titles,List<Integer>pageNum) {
String title = outline.getTitle();
PdfDestination pdfDestination = outline.getDestination();
String pdfStr = ((PdfString)pdfDestination.getPdfObject()).toUnicodeString();
PdfArray array = (PdfArray) names.get(pdfStr);
PdfObject pdfObj = array != null ? array.get(0) : null;
Integer pageNumber = pdfDocument.getPageNumber((PdfDictionary)pdfObj);
titles.add(title);
pageNum.add(pageNumber);
if(outline.getAllChildren().size() > 0) {
for (PdfOutline child : outline.getAllChildren()){
prepareIndexFile(child,names,pdfDocument,titles,pageNum);
}
}
}
public boolean splitPdf(String inputFile, final String outputFolder) {
boolean splitSuccess = true;
PdfDocument pdfDoc = null;
try {
PdfReader pdfReaderNew = new PdfReader(inputFile);
pdfDoc = new PdfDocument(pdfReaderNew);
final List<String> titles = new ArrayList<String>();
List<Integer> pageNum = new ArrayList<Integer>();
PdfNameTree destsTree = pdfDoc.getCatalog().getNameTree(PdfName.Dests);
Map<String, PdfObject> names = destsTree.getNames();//--------------------------------------Core logic for getting names
PdfOutline root = pdfDoc.getOutlines(false);//--------------------------------------Core logic for getting outlines
walkOutlines(root,names, pdfDoc, titles, pageNum); //------Logic to get bookmarks and pageNumbers
if (titles == null || titles.size()==0) {
splitSuccess = false;
}else { //------Proceed if it has bookmarks
for(int i=0;i<titles.size();i++) {
String title = titles.get(i);
String startPageNmStr =""+pageNum.get(i);
int startPage = Integer.parseInt(startPageNmStr);
int endPage = startPage;
if(i == titles.size() - 1) {
endPage = pdfDoc.getNumberOfPages();
}else {
int nextPage = pageNum.get(i+1);
if(nextPage > startPage) {
endPage = nextPage - 1;
}else {
endPage = nextPage;
}
}
String outFileName = outputFolder + File.separator + getFileName(title) + ".pdf";
PdfWriter pdfWriter = new PdfWriter(outFileName);
PdfDocument newDocument = new PdfDocument(pdfWriter, new DocumentProperties().setEventCountingMetaInfo(null));
pdfDoc.copyPagesTo(startPage, endPage, newDocument);
newDocument.close();
pdfWriter.close();
}
}
}catch(Exception e){
//---log
splitSuccess = false;
}finally {
if (pdfDoc != null) {
pdfDoc.close();
}
}
return splitSuccess;
}
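The end-page logic in splitPdf can be isolated and checked without iText. The sketch below mirrors that logic using plain java.util types: each bookmark's range ends one page before the next bookmark starts, or on the same page when the next bookmark starts on the same page, and the last bookmark runs to the end of the document. The BookmarkPageRanges class and the ranges method are illustrative names, and the sketch assumes the start pages are in ascending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BookmarkPageRanges {

    // Returns {startPage, endPage} (1-based, inclusive) for each bookmark.
    static List<int[]> ranges(List<Integer> startPages, int totalPages) {
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < startPages.size(); i++) {
            int start = startPages.get(i);
            int end;
            if (i == startPages.size() - 1) {
                end = totalPages;            // last bookmark runs to the end
            } else {
                int next = startPages.get(i + 1);
                // Stop before the next bookmark, unless it starts on the same page.
                end = next > start ? next - 1 : next;
            }
            result.add(new int[] {start, end});
        }
        return result;
    }

    public static void main(String[] args) {
        // Four bookmarks in a 10-page document; two of them start on page 3.
        for (int[] r : ranges(Arrays.asList(1, 3, 3, 7), 10)) {
            System.out.println(r[0] + "-" + r[1]);
        }
    }
}
```

For the sample input this yields the ranges 1-2, 3-3, 3-6 and 7-10, so a page shared by two bookmarks is copied into both output files.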
