Given an ordinary HTML file that contains a table, how can I identify the table and convert it to an image?
I know there is an open-source tool, http://code.google.com/p/java-html2image/, but it converts the whole HTML file into an image file.
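One possible approach, sketched here with only the JDK, is to cut the table markup out of the page first and render just that fragment. The class name `TableToImage`, the helpers `extractFirstTable` and `render`, the 400-pixel width, and the output file `table.png` are all assumptions made for illustration; none of them come from java-html2image. The tag scan is a simple heuristic that tracks nesting depth, so it will be confused by `<table>` text inside comments or scripts, and Swing's `JEditorPane` only supports old HTML 3.2-era markup with limited CSS. A real HTML parser would be more robust for messy pages.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.imageio.ImageIO;
import javax.swing.JEditorPane;

class TableToImage {

    // Matches an opening <table ...> tag or a closing </table> tag, ignoring case.
    private static final Pattern TABLE_TAG =
            Pattern.compile("<(/?)table\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    /** Returns the first complete (outermost) table fragment, or null if there is none. */
    static String extractFirstTable(String html) {
        Matcher m = TABLE_TAG.matcher(html);
        int start = -1;
        int depth = 0; // nesting depth, so nested tables stay inside the outer one
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            if (!closing) {
                if (depth == 0) {
                    start = m.start();
                }
                depth++;
            } else if (depth > 0) {
                depth--;
                if (depth == 0) {
                    return html.substring(start, m.end());
                }
            }
        }
        return null;
    }

    /** Renders an HTML fragment into an image of the given width (hypothetical helper). */
    static BufferedImage render(String fragment, int width) {
        JEditorPane pane = new JEditorPane();
        pane.setEditable(false);
        pane.setContentType("text/html");
        pane.setText("<html><body>" + fragment + "</body></html>");
        // Fix the width first so the preferred height reflects the wrapped layout.
        pane.setSize(width, Short.MAX_VALUE);
        int height = Math.max(1, pane.getPreferredSize().height);
        pane.setSize(width, height);

        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);
        pane.print(g); // paints the component into the off-screen image
        g.dispose();
        return image;
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><p>Intro</p>"
                + "<table border=\"1\"><tr><td>A</td><td>B</td></tr></table>"
                + "</body></html>";
        String table = extractFirstTable(html);
        if (table != null) {
            ImageIO.write(render(table, 400), "png", new File("table.png"));
        }
    }
}
```

The same extracted fragment could instead be handed to java-html2image or any other renderer; the point of the sketch is only that isolating the table before rendering avoids converting the whole page.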
Related
I have a PDF form which contains some fields. I want to make an application/utility in Java that takes a PDF file and converts its fields into editable fields, so that I can fill them in using any PDF application (Adobe Reader/Foxit Reader) on Windows. Is there any Java library that can do this?
The PDF file has text as well as tabular data. If not, then is there
any way by which I can tell whether the current page of the PDF
contains tables or not?
I am able to extract data from the PDF page but can't confirm whether it is
tabular data or plain paragraph text.
I want to blur sensitive information in a PDF file. I read about pyPdf in Python and PDFBox in Java, but I could not figure out how to search and replace text in a PDF file. By replacing I mean blurring the text or even substituting asterisk characters.
I also thought of a step in which I take an image of every page of the PDF and then show them in HTML one by one. But then the same problem remains: how do I replace text in those images?
I have some document templates (.dotx files) with placeholders. I need to read a template and replace the placeholders with actual text coming from database columns. I am able to do this using docx4j's WordprocessingMLPackage, but the problem is that some of the database columns contain HTML code. This is text coming from rich text editor fields. When I replace a placeholder with this text in the Word document template, the raw HTML code is copied into the document. I want to convert that HTML code into the formatted text it represents and write that into the document. How can I achieve this?
You can use https://github.com/plutext/docx4j-ImportXHTML either directly, or via content control databinding OpenDoPE extensions.
I have to extract the whole content of a PDF using iText, but I couldn't read the contents of a table because the table is not a selectable part of the PDF. How can I make the table content selectable in order to extract it?