How to generate pdf of html page in java

I have to generate a PDF from an HTML page. I have written a method for this, but it produces an error. Please tell me where I went wrong. Thank you!
public void htmlToPdf(
String htmlPath,
File pdfFile
) throws IOException, DocumentException {
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(
document,
new FileOutputStream(pdfFile)
);
document.open();
XMLWorkerHelper.getInstance().parseXHtml(
writer,
document,
new FileInputStream(htmlPath),
Charset.forName("UTF-8")
);
document.close();
}
Error :
Cannot resolve method 'parseXHtml(com.lowagie.text.pdf.PdfWriter, com.lowagie.text.Document, java.io.FileInputStream, java.nio.charset.Charset)'

So you want to generate PDFs from HTML with Java? (check EDIT 2020 at the bottom)
Here is the procedure I am using with flying-saucer.
Format your HTML with CSS 2.1
Write the process to generate a PDF
Create the PDF generator interface
Use a custom object to wrap images with attributes for further formatting
Implement your interface with your PDF parameters and images
1. Format your HTML with CSS 2.1
The source can be a JSP with EL expressions, any other template (you can fetch
the generated HTML, parameters included, with an internal POST request), or
just static HTML.
You cannot use proportional values like em, rem, vh, vw or complex
CSS like animations.
You can use a <style> </style> tag or the inline style= attribute.
Here is an example of a JSP in my webapp.
<!DOCTYPE html>
<%@ page session="false"
language="java"
contentType="text/html; charset=UTF-8"
pageEncoding="UTF-8"
isELIgnored="false" %>
<%@ taglib uri="http://java.sun.com/jsp/jstl/core" prefix="c" %>
<html>
<head>
<META CHARSET="UTF-8" />
<title>My PDF</title>
<style>
/* you can add reset css too */
/* stylesheet */
body { font-family: sans-serif; }
.someCSSClass {}
.anotherCSSClass {}
</style>
</head>
<body>
<div class="someCSSClass">
<p class="anotherCSSClass" style="line-height:16px;">
${ param.aParameter }
</p>
</div>
</body>
</html>
2. Write the process to generate a PDF with an interface
Why use an interface? Because if you need to generate additional PDFs from
different models, you will not have to rewrite the same logic for each
PDF.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.xhtmlrenderer.pdf.ITextRenderer;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Image;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfImage;
import com.itextpdf.text.pdf.PdfIndirectObject;
import com.itextpdf.text.pdf.PdfName;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;
import main.java.bean.ImagePDF;
import main.java.interface.PDFInterface;
import main.java.bean.Constants;
/**
* PDFGenerator
* Servlet that generates the PDF (the same logic also works outside a servlet).
*/
public class PDFGenerator extends HttpServlet {
private static final String TMP_DIR = System.getProperty("java.io.tmpdir");
/*
* May not be a GET, can be simple method call for local application or
* whatever you need
*/
@Override
protected void doGet(
HttpServletRequest request,
HttpServletResponse response
) throws IOException {
PDFInterface pdfImplementation = null;
/*
* instance your PDF Model implementation according to this
* parameter (for example)
*/
int pdfModel = Integer.parseInt(
request.getParameter("requestedPDFModel")
);
switch (pdfModel) {
case Constants.PDF_MODEL_1:
pdfImplementation = new PDFImplementationOne();
/*
* You could get the image reference from GET request too,
* or from database or from constants
*/
pdfImplementation.addImage(
"image1.png",
120,
50,
"image_name1",
request,
100f // scale in percent, as declared in PDFInterface
);
break;
case Constants.PDF_MODEL_2:
pdfImplementation = new PDFImplementationTwo();
pdfImplementation.addImage(
"image2.png",
350,
70,
"image_name2",
request,
100f // scale in percent, as declared in PDFInterface
);
break;
default :
System.out.println("Cannot find an implementation for the requested PDF.");
return;
}
String html = null;
/*
Get the HTML from an URL : if your implementation returns null
then you can for example decide to get the HTML from a file in your implementation
*/
if (pdfImplementation.getUrl(request) != null) {
// Send POST request to generate the HTML from a template (JSP, JSF, Thymeleaf, ...)
URLConnection connection = new URL(
pdfImplementation.getUrl(request)
+pdfImplementation.getEncodedQueryString()
).openConnection();
connection.setDoOutput(true); // POST : remove this to do a GET
connection.setRequestProperty("Accept-Charset", "UTF-8");
connection.setRequestProperty(
"Content-Type",
"application/x-www-form-urlencoded;charset=UTF-8"
);
try (OutputStream output = connection.getOutputStream()) {
output.write(
pdfImplementation
.getEncodedQueryString()
.getBytes(StandardCharsets.UTF_8)
);
}
// Open an input stream on the response
BufferedReader in = new BufferedReader(
new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8)
);
StringBuilder sb = new StringBuilder();
// A line in our generated HTML
String inputLine;
// Read all HTML lines and concatenate
while ((inputLine = in.readLine()) != null) {
sb.append(inputLine);
}
html = sb.toString();
in.close();
}
// Get the HTML from a File
else {
html = String.join(
"",
Files.readAllLines(pdfImplementation.getHTMLFile().toPath())
);
}
// Create a temp file to make the PDF. new File(parent, child) inserts the
// path separator that java.io.tmpdir may lack.
File tempPDFFile = new File(
TMP_DIR,
"tmp_" + pdfImplementation.getPDFFilename()
);
// Create your final PDF file
File pdf = new File(pdfImplementation.getPDFFilename());
try {
// Output the HTML to the temp PDF file
try (FileOutputStream fos = new FileOutputStream(tempPDFFile)) {
ITextRenderer renderer = new ITextRenderer();
renderer.setDocumentFromString(html);
renderer.layout();
renderer.createPDF(fos);
}
// Add images if needed
addImageToPDF(pdfImplementation, tempPDFFile, pdf);
}
catch (DocumentException e) {
throw new IOException("Cannot generate the PDF", e);
}
// Write in response if you need servlet implementation
writePDFContentToResponse(pdf, response);
}
/**
* writePDFContentToResponse
* @param pdf : the final PDF file
* @param response : a HttpServletResponse to write PDF file bytes
* @throws IOException
*/
void writePDFContentToResponse(
File pdf,
HttpServletResponse response
) throws IOException {
InputStream fis = new FileInputStream(pdf);
String mimeType = getServletContext()
.getMimeType(pdf.getAbsolutePath());
response.setContentType(
mimeType != null ? mimeType : "application/octet-stream"
);
response.setContentLength((int) pdf.length());
response.setHeader(
"Content-Disposition",
// pdf.getName() already ends with ".pdf"
"attachment; filename=\"" + pdf.getName() + "\""
);
ServletOutputStream os = response.getOutputStream();
byte[] bufferData = new byte[1024];
int read = 0;
while((read = fis.read(bufferData)) != -1) {
os.write(bufferData, 0, read);
}
os.flush();
os.close();
fis.close();
response.flushBuffer();
Files.delete(pdf.toPath());
}
/**
* addImageToPDF
*
* @param pdfImplementation : the pdfImplementation to get the array of
* custom image objects ImagePDF.
* @param tempPDFFile : the temp PDF file with the HTML content already
* converted.
* @param pdf : the final PDF file which will have images stamped.
* @throws DocumentException
* @throws IOException
*/
void addImageToPDF(
PDFInterface pdfImplementation,
File tempPDFFile,
File pdf
) throws DocumentException, IOException {
PdfReader reader = new PdfReader(new FileInputStream(tempPDFFile));
PdfStamper stamper = new PdfStamper(
reader,
new FileOutputStream(pdf)
);
for (ImagePDF img: pdfImplementation.getImages()) {
Image image = img.getImage();
image.scalePercent(img.getScale());
PdfImage stream = new PdfImage(image, "", null);
stream.put(
new PdfName("ITXT_SpecialId"),
new PdfName("123456789")
);
PdfIndirectObject ref = stamper.getWriter().addToBody(stream);
image.setDirectReference(ref.getIndirectReference());
image.setAbsolutePosition(
img.getWidthPosition(),
img.getHeightPosition()
);
PdfContentByte over = stamper.getOverContent(1);
over.addImage(image);
}
stamper.close();
reader.close();
}
}
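The manual read/write loop in writePDFContentToResponse can be shortened with Files.copy from the JDK. Here is a minimal sketch, assuming a hypothetical helper class FileStreamer (not part of the original code) and any OutputStream, for example the one returned by response.getOutputStream():

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original answer.
final class FileStreamer {

    private FileStreamer() {}

    /**
     * Copies the whole file to the given stream, then deletes the file.
     * The caller stays responsible for closing the stream.
     *
     * @return the number of bytes written
     */
    static long copyAndDelete(Path file, OutputStream out) throws IOException {
        try {
            long written = Files.copy(file, out);
            out.flush();
            return written;
        } finally {
            // Delete the temp file even if the copy failed
            Files.deleteIfExists(file);
        }
    }
}
```

In the servlet it could be called as FileStreamer.copyAndDelete(pdf.toPath(), response.getOutputStream()) after the response headers have been set.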
3. Create the PDF generator interface
import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.servlet.http.HttpServletRequest;
import com.itextpdf.text.BadElementException;
/**
* PDFInterface
* Interface to define the behavior a PDF model has to implement.
*/
public interface PDFInterface {
/**
* getUrl
* @param request the HttpServletRequest to fetch parameters for the PDF
* @return the target URL of the HTTP POST request that returns the generated
* HTML (for example if you are making an HTTP POST on a JSP to generate
* HTML dynamically).
*/
String getUrl(HttpServletRequest request);
/**
* getHTMLFile
* @return the HTML file to read from the local storage to get
* the HTML.
*/
File getHTMLFile();
/**
* setParametres
* @param customObject : an object or a list of objects to be encoded to the
* query String to generate the PDF.
*/
void setParametres(CustomObject customObject);
String getEncodedQueryString();
/**
* getImages
* @return the custom ImagePDF objects with the attributes needed to add
* images after the PDF has been generated, as the HTML cannot be read to get
* images during the generation of the PDF.
*/
List<ImagePDF> getImages();
/**
* addImage
* @param url : the URL to get the image
* @param x : the X position
* @param y : the Y position
* @param name : the name of the image
* @param request : the HttpServletRequest to generate the relative link
* to fetch the image.
* @param scale : the scale of the image
* @throws BadElementException
* @throws IOException
*/
void addImage(
String url,
float x,
float y,
String name,
HttpServletRequest request,
float scale
) throws BadElementException, IOException;
/**
* getPDFFilename
* @return : the name of the PDF file to be generated
*/
String getPDFFilename();
}
4. The ImagePDF object (in case you need to add image to your PDF)
import java.io.IOException;
import com.itextpdf.text.BadElementException;
import com.itextpdf.text.Image;
/**
* ImagePDF
* Class for a custom ImagePDF object to fit needs to stamp an image on a
* generated PDF (URI to get the image, scale, positions x y ...).
*/
public class ImagePDF implements java.io.Serializable {
private static final long serialVersionUID = 1L;
private Image image;
private float widthPosition;
private float heightPosition;
private String name;
private Float scale;
/**
* ImagePDF
* @param urlImage : the URL to fetch the image
* @param widthPosition : the x position on the PDF canvas
* @param heightPosition : the y position on the PDF canvas
* @param name : the name of the image
* @param scale : the scale of the image on the PDF canvas
* @throws BadElementException
* @throws IOException
*/
public ImagePDF(
String urlImage,
float widthPosition,
float heightPosition,
String name,
Float scale
) throws BadElementException, IOException {
this.image = Image.getInstance(urlImage);
this.heightPosition = heightPosition;
this.widthPosition = widthPosition;
this.name = name;
this.scale = scale;
}
// Getters and setters ...
}
5. Implement your interface for your PDF parameters
(used in example above)
/**
* PDFImplementationOne
* The PDFImplementation to generate a specific PDF.
*/
public class PDFImplementationOne implements PDFInterface {
private static final String PARAM_1 = "param1";
private static final String PARAM_2 = "param2";
private Map<String, String> parameters;
private List<ImagePDF> images;
/**
* PDFImplementationOne
* You can pass a service to retrieve information from the DB, or objects to
* turn into parameters, in the constructor if needed.
*/
public PDFImplementationOne (CustomObject aParameter) {
this.parameters = new HashMap<>();
this.images = new ArrayList<>();
// in case you need parameters, passed in constructor
setParametres(aParameter);
}
/* (non-Javadoc)
* @see main.java.interface.PDFInterface#getUrl()
*/
@Override
public String getUrl(HttpServletRequest request) {
/*
* This is an example in case you generate your HTML from a JSP with
* parameters. If it comes from a static file, return null.
*/
StringBuilder sb = new StringBuilder("http://");
sb.append(request.getServerName());
sb.append((request.getServerName().startsWith("127.0.0")?":8080":""));
sb.append("/MyApp/urlToJSP");
return sb.toString();
}
/*
* (non-Javadoc)
* @see main.java.interface.PDFInterface#addImage(
* java.lang.String,
* float,
* float,
* java.lang.String,
* javax.servlet.http.HttpServletRequest,
* float
* )
*/
@Override
public void addImage(
String fileName,
float x,
float y,
String name,
HttpServletRequest request,
float scale
) {
/*
* Here I get the image from a resource server but you can read the
* image from local storage as well
*/
StringBuilder url = new StringBuilder("http://");
url.append(request.getServerName());
url.append(request.getServerName().startsWith("127.0.0")?":8080":"");
url.append("/MyApp/img/");
url.append(fileName);
try {
ImagePDF image = new ImagePDF(url.toString(), x, y, name, scale);
images.add(image);
}
catch (BadElementException | IOException e) {
System.out.println("Cannot set image for PDF " + url.toString());
}
}
/* (non-Javadoc)
* @see main.java.interface.PDFInterface#getImages()
*/
@Override
public List<ImagePDF> getImages() {
return this.images;
}
/* (non-Javadoc)
* @see main.java.interface.PDFInterface#setParametres(
* CustomObject customObject
* )
*/
@Override
public void setParametres(CustomObject customObject) {
parameters.put(PARAM_1, customObject.getAttribute().toString());
// may have other parameters ...
}
/* (non-Javadoc)
* @see main.java.interface.PDFInterface#getEncodedQueryString()
*/
@Override
public String getEncodedQueryString() {
/*
* Create the queryString to do further HTTP POST or GET to fetch the
* generated HTML with parameters
*/
StringBuilder queryStringBuilder = new StringBuilder("?");
parameters.entrySet().stream().forEach(e -> {
queryStringBuilder.append(e.getKey());
queryStringBuilder.append("=");
try {
queryStringBuilder.append(
URLEncoder.encode(
e.getValue() == null
? ""
: e.getValue(),
StandardCharsets.UTF_8.name()
)
);
}
catch (UnsupportedEncodingException e1) {
queryStringBuilder.append("");
}
queryStringBuilder.append("&");
});
// Remove the last &
return queryStringBuilder.toString().substring(
0,
queryStringBuilder.toString().length()-1
);
}
/* (non-Javadoc)
* @see main.java.interface.PDFInterface#getHTMLFile()
*/
@Override
public File getHTMLFile() {
return new File("/path/to/myHTMLFile.html");
}
/* (non-Javadoc)
* @see main.java.interface.PDFInterface#getPDFFilename()
*/
@Override
public String getPDFFilename() {
return "myPDF.pdf";
}
}
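The query string built in getEncodedQueryString can also be assembled with a stream and Collectors.joining, which avoids trimming the trailing "&" by hand. This is only a sketch under assumptions: the class name QueryStrings is hypothetical, the result has no leading "?" (add it yourself when appending to a URL, and leave it out of a POST body), and a null value is encoded as an empty string as in the code above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original answer.
final class QueryStrings {

    private QueryStrings() {}

    /** Builds "k1=v1&k2=v2" with URL-encoded keys and values. */
    static String encode(Map<String, String> parameters) {
        return parameters.entrySet().stream()
            .map(e -> encodePart(e.getKey()) + "=" + encodePart(e.getValue()))
            .collect(Collectors.joining("&"));
    }

    private static String encodePart(String value) {
        try {
            // A null value becomes an empty string
            return URLEncoder.encode(
                value == null ? "" : value,
                StandardCharsets.UTF_8.name()
            );
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM
            throw new IllegalStateException(e);
        }
    }
}
```

The order of the parameters follows the iteration order of the map, so use a LinkedHashMap if the order matters.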
Tell me if it needs some clarification.
EDIT 2020
Things are much simpler now thanks to improvements in the libraries. Calling the HTTP server itself to generate the dynamic HTML content is not simple enough, and in some cases it requires additional network configuration.
Here is the new process :
Make an HTML template with CSS 2.1 (with a <style> tag or inline style="" attributes) and include template expressions (EL-style or whatever)
Fetch HTML template as String
Replace template expressions "${ }" in HTML
Replace images in HTML like <img src="image.png" /> by encoded base64 images
Make the PDF file
Write it to response or whatever
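Steps 3 and 4 of this process (replacing the template expressions and inlining the images) can be tried with the JDK alone. The sketch below is not the PdfConverter shown further down: the class TemplateSketch and its method names are hypothetical, and it uses a regular expression where PdfConverter uses StringSubstitutor.

```java
import java.util.Base64;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class TemplateSketch {

    // Matches "${ name }", with optional spaces around the name
    private static final Pattern EXPRESSION =
        Pattern.compile("\\$\\{\\s*(\\w+)\\s*\\}");

    private TemplateSketch() {}

    /** Replaces every known "${ key }" by its value; unknown keys stay as-is. */
    static String resolve(String html, Map<String, String> values) {
        Matcher m = EXPRESSION.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                value = m.group(); // keep the original expression
            }
            // quoteReplacement protects "$" and "\" in the value
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Builds the value of a src attribute that embeds the image itself. */
    static String toDataUri(String mimeType, byte[] content) {
        return "data:" + mimeType + ";base64,"
            + Base64.getEncoder().encodeToString(content);
    }
}
```

The values are inserted verbatim, so escape them first if they can contain HTML special characters.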
Here is the project structure I am using (for example) :
main
|--java
|  |--bean
|     |--PdfConverter.java
|--resources
   |--pdf
      |--template.html
      |--img
         |--image.png
Dependencies :
<dependency>
<groupId>com.github.librepdf</groupId>
<artifactId>openpdf</artifactId>
<version>1.3.20</version>
</dependency>
<dependency>
<groupId>org.xhtmlrenderer</groupId>
<artifactId>flying-saucer-core</artifactId>
<version>9.1.20</version>
</dependency>
<dependency>
<groupId>org.xhtmlrenderer</groupId>
<artifactId>flying-saucer-pdf-openpdf</artifactId>
<version>9.1.20</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-text</artifactId>
<version>1.9</version>
</dependency>
HTML template (with images) :
<html>
<head>
<style>
body {
font-family:sans-serif;
font-size:14px;
margin: 0 auto;
padding: 0;
}
h1 {
text-align:center;
font-size:21px;
text-transform:capitalize;
}
</style>
</head>
<body>
<h1>some title</h1>
<p>Some paragraph : ${ foo }</p>
<!-- you can style images with CSS! -->
<img src="image.png" style="width:50px;height:50px" />
</body>
</html>
PdfConverter :
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;
import java.util.Map;
import java.util.Scanner;
import javax.servlet.ServletContext;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServletResponse;
import org.apache.commons.lang.StringUtils;
import org.apache.commons.text.StringSubstitutor;
import org.apache.poi.util.IOUtils;
import org.springframework.http.MediaType;
import org.xhtmlrenderer.pdf.ITextRenderer;
/**
* PdfConverter
* Extend this class to add the logic that builds the map used to replace
* template expressions.
* @author user
* @since 28 Jul. 2020
*/
public class PdfConverter {
/**
* Temp directory.
*/
private static final String TMP_DIR =
System.getProperty("java.io.tmpdir") + "/";
/**
* Directory to HTML templates (dedicated to PDF generation).
*/
private static final String PDF_DIR =
"pdf/";
/**
* Directory to the image folders (dedicated to PDF generation).
*/
private static final String PDF_IMG_DIR =
"pdf/img/";
/**
* Prefixes for templates expressions.
*/
private static final String PREFIX_TEMPLATE = "${ ";
/**
* Suffixes for template expressions.
*/
private static final String SUFFIX_TEMPLATE = " }";
/**
* Generated PDF file.
*/
private File generatedPDF;
/**
* PDF file name.
*/
private String pdfName;
/**
* PdfConverter
* @param m map of keys and values used to replace expressions in the HTML
* template.
* @param s ServletContext to get resources from context path.
* @param fileName desired name of the generated PDF.
* @param template name of the HTML template to make the PDF.
* @throws IOException
*/
public PdfConverter(
Map<String, String> m,
ServletContext s,
String fileName,
String template
) throws IOException {
// Set PDF filename
setPdfName(fileName);
// Fetch HTML template
@SuppressWarnings("resource")
String html = new Scanner(
s.getResourceAsStream(PDF_DIR+ template),
StandardCharsets.UTF_8.toString()
).useDelimiter("\\A").next();
/*
* Replace template expressions "${ }" in HTML
*/
StringSubstitutor sub = new StringSubstitutor(
m,
PREFIX_TEMPLATE,
SUFFIX_TEMPLATE
);
String resolvedString = sub.replace(html);
/*
* Replace images like <img src="image.png" /> by
* <img src=\"data:image/png;base64," + base64Image
*/
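String[] imgs = StringUtils.substringsBetween(
resolvedString,
"<img src=\"", "\""
);
// substringsBetween returns null when the template has no image
if (imgs != null) {
for (String s1 : imgs) {
String mime = Files.probeContentType(Paths.get(PDF_IMG_DIR + s1));
// Replace the whole src attribute, not every occurrence of the file name
resolvedString = resolvedString.replace(
"<img src=\"" + s1 + "\"",
"<img src=\"data:" + mime + ";base64,"
+ Base64.getEncoder().encodeToString(
IOUtils.toByteArray(
s.getResourceAsStream(PDF_IMG_DIR + s1)
)
)
+ "\""
);
}
}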
// Make the PDF file
FileOutputStream fos = new FileOutputStream(TMP_DIR+getPdfName());
ITextRenderer it = new ITextRenderer();
it.setDocumentFromString(resolvedString);
it.layout();
it.createPDF(fos);
fos.close();
// Set the PDF generated file to this PdfConverter instance
setGeneratedPDF(new File(TMP_DIR+getPdfName()));
}
/**
* getGeneratedPDF
*
* @return the generatedPDF
*/
public File getGeneratedPDF() {
return generatedPDF;
}
/**
* setGeneratedPDF
*
* @param generatedPDF the generatedPDF to set
*/
public void setGeneratedPDF(File generatedPDF) {
this.generatedPDF = generatedPDF;
}
/**
* getPdfName
*
* @return the pdfName
*/
public String getPdfName() {
return pdfName;
}
/**
* setPdfName
*
* @param pdfName the pdfName to set
*/
public void setPdfName(String pdfName) {
this.pdfName = pdfName;
}
/**
* writePdfToResponse
* Write the PDF file into the response and delete it from temp directory
* afterwards.
* @param response the response to write the PDF to
* @throws IOException
*/
public void writePdfToResponse(
HttpServletResponse response
) throws IOException {
try (
FileInputStream fis =
new FileInputStream(getGeneratedPDF())
) {
response.setContentType(MediaType.APPLICATION_PDF_VALUE);
response.setHeader(
"Content-Disposition",
"inline; filename=" + getPdfName()
);
response.addHeader(
"Content-Length",
Long.toString(getGeneratedPDF().length())
);
ServletOutputStream servletOutputStream =
response.getOutputStream();
int read = 0;
byte[] bytes = new byte[1024];
while ((read = fis.read(bytes)) != -1) {
servletOutputStream.write(bytes, 0, read);
}
response.flushBuffer();
}
catch (IOException ioe) {
response.setContentType(MediaType.TEXT_PLAIN_VALUE);
response.getWriter().print("Cannot render PDF file.");
response.flushBuffer();
}
finally {
// Delete generated PDF after writing it to the response
getGeneratedPDF().delete();
}
}
}
And how to use it in servlet (Spring MVC example) :
/**
* downloadPDF
*
* @param response
* @param foo
* @throws IOException
*/
@PostMapping("/downloadPDF")
public void downloadPDF(
HttpServletRequest request,
HttpServletResponse response,
String foo
) throws IOException {
Map<String, String> m = new HashMap<>();
m.put("foo", "my_foo_value");
PdfConverter pdfConverter = new PdfConverter(
m,
request.getServletContext(),
"my_pdf.pdf",
"template.html"
);
pdfConverter.writePdfToResponse(response);
}

Related

How to use PDFBox to extract all text on a page that is NOT behind an image?

I need to extract all text on a page that is not behind an image, OCR style.
So far, I use PrintImageLocations to get image locations. I do a translation from image coordinates to character coordinates. Then I use a modified version of PDFTextStripperByArea to get the text not behind any image location.
It works but... is there a simpler, one pass, way to get the text that is not behind an image?
Here is my modified version of PDFTextStripperByArea for retrieving text excluded from the areas entered:
package tester;
import java.awt.geom.Rectangle2D;
import java.io.IOException;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.TextPosition;
/**
* This will extract text from a specified region in the PDF.
*
* @author Ben Litchfield
*/
public class PDFTextStripperByAreaAndExcluded_original extends PDFTextStripper
{
private final ArrayList<List<TextPosition>> excludedCharacterList = new ArrayList<List<TextPosition>>();
private StringWriter excludedText = new StringWriter();
private final List<String> regions = new ArrayList<String>();
private final Map<String, Rectangle2D> regionArea = new HashMap<String, Rectangle2D>();
private final Map<String, ArrayList<List<TextPosition>>> regionCharacterList
= new HashMap<String, ArrayList<List<TextPosition>>>();
private final Map<String, StringWriter> regionText = new HashMap<String, StringWriter>();
/**
* Constructor.
* @throws IOException If there is an error loading properties.
*/
public PDFTextStripperByAreaAndExcluded_original() throws IOException
{
super.setShouldSeparateByBeads(false);
}
/**
* This method does nothing in this derived class, because beads and regions are incompatible. Beads are
* ignored when stripping by area.
*
* @param aShouldSeparateByBeads The new grouping of beads.
*/
@Override
public final void setShouldSeparateByBeads(boolean aShouldSeparateByBeads)
{
}
/**
* Add a new region to group text by.
*
* @param regionName The name of the region.
* @param rect The rectangle area to retrieve the text from. The y-coordinates are java
* coordinates (y == 0 is top), not PDF coordinates (y == 0 is bottom).
*/
public void addRegion( String regionName, Rectangle2D rect )
{
regions.add( regionName );
regionArea.put( regionName, rect );
}
/**
* Delete a region to group text by. If the region does not exist, this method does nothing.
*
* @param regionName The name of the region to delete.
*/
public void removeRegion(String regionName)
{
regions.remove(regionName);
regionArea.remove(regionName);
}
/**
* Get the list of regions that have been setup.
*
* @return A list of java.lang.String objects to identify the region names.
*/
public List<String> getRegions()
{
return regions;
}
/**
* Get the text for the region, this should be called after extractRegions().
*
* @param regionName The name of the region to get the text from.
* @return The text that was identified in that region.
*/
public String getTextForRegion( String regionName )
{
StringWriter text = regionText.get( regionName );
return text.toString();
}
/**
* Get the text excluded from all regions, this should be called after extractRegions().
*
* @return The text that was identified as not in any region.
*/
public String getTextExcluded( )
{
return excludedText.toString();
}
/**
* Process the page to extract the region text.
*
* @param page The page to extract the regions from.
* @throws IOException If there is an error while extracting text.
*/
public void extractRegions( PDPage page ) throws IOException
{
setStartPage(getCurrentPageNo());
setEndPage(getCurrentPageNo());
excludedCharacterList.add( new ArrayList<TextPosition>() );
excludedText = new StringWriter();
for (String region : regions)
{
setStartPage(getCurrentPageNo());
setEndPage(getCurrentPageNo());
//reset the stored text for the region so this class
//can be reused.
String regionName = region;
ArrayList<List<TextPosition>> regionCharactersByArticle = new ArrayList<List<TextPosition>>();
regionCharactersByArticle.add( new ArrayList<TextPosition>() );
regionCharacterList.put( regionName, regionCharactersByArticle );
regionText.put( regionName, new StringWriter() );
}
if( page.hasContents() )
{
processPage( page );
}
}
/**
* {@inheritDoc}
*/
@Override
protected void processTextPosition(TextPosition text)
{
boolean included = false;
for (Map.Entry<String, Rectangle2D> regionAreaEntry : regionArea.entrySet())
{
Rectangle2D rect = regionAreaEntry.getValue();
if (rect.contains(text.getX(), text.getY()))
{
included = true;
charactersByArticle = regionCharacterList.get(regionAreaEntry.getKey());
super.processTextPosition(text);
}
}
if(!included) {
charactersByArticle = excludedCharacterList;
super.processTextPosition(text);
}
}
/**
* This will print the processed page text to the output stream.
*
* @throws IOException If there is an error writing the text.
*/
@Override
protected void writePage() throws IOException
{
for (String region : regionArea.keySet())
{
charactersByArticle = regionCharacterList.get( region );
output = regionText.get( region );
super.writePage();
}
charactersByArticle = excludedCharacterList;
output = excludedText;
super.writePage();
}
}
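The translation from image coordinates to character coordinates mentioned above mostly comes down to flipping the y-axis, since the regions use Java coordinates (y == 0 at the top) while positions in PDF space have y == 0 at the bottom. Below is a minimal sketch of that step and of the exclusion test, using only java.awt.geom. The class RegionMath and its methods are hypothetical, and the sketch assumes an unrotated page whose height is known and whose origin is not shifted.

```java
import java.awt.geom.Rectangle2D;
import java.util.List;

// Hypothetical helper, for illustration only.
final class RegionMath {

    private RegionMath() {}

    /**
     * Converts a rectangle expressed with a bottom-left origin (PDF space)
     * into the same rectangle expressed with a top-left origin (Java space).
     */
    static Rectangle2D toTopLeftOrigin(Rectangle2D pdfRect, double pageHeight) {
        double top = pageHeight - (pdfRect.getY() + pdfRect.getHeight());
        return new Rectangle2D.Double(
            pdfRect.getX(), top, pdfRect.getWidth(), pdfRect.getHeight()
        );
    }

    /** Returns true when the point lies in none of the given regions. */
    static boolean isOutsideAll(double x, double y, List<Rectangle2D> regions) {
        for (Rectangle2D region : regions) {
            if (region.contains(x, y)) {
                return false;
            }
        }
        return true;
    }
}
```

A text position would then be kept only when isOutsideAll(text.getX(), text.getY(), imageRegions) is true, which is the same test processTextPosition performs with its included flag.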

Adding Content-Type Headers to Java HTTP echo server

I'm playing with HTTP servers and running into an issue. I need to add Content-Type headers to the response, but when I do, the client gets ERR_EMPTY_RESPONSE.
If I remove:
headers.add("Content-Type", "text/html");
Then the server works fine, but I need to pass the Content-Type headers for my app. What gives? How do I include Content-Type headers?
/*
* EchoServer.java
*
* Accept an HTTP request and echo it back as the HTTP response.
*
* Copyright (c) 2005 Sun Microsystems, Inc
* Copyright (c) 2008 Operational Dynamics Consulting, Pty Ltd
*
* The code in this file is made available to you by its authors under the
* terms of the "GNU General Public Licence, version 2" See the LICENCE file
* for the terms governing usage and redistribution.
*/
/*
* This code is a simple derivation of the example in the package
* documentation for com.sun.net.httpserver, as found in file
* jdk/src/share/classes/com/sun/net/httpserver/package-info.java as shipped
* with the openjdk 1.6 b08 code drop. Used under the terms of the GPLv2.
*/
import static java.net.HttpURLConnection.HTTP_OK;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URLDecoder;
import java.util.List;
import com.sun.net.httpserver.Headers;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpExchange.*;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
/**
* Echo the body of an HTTP request back as the HTTP response. This is merely
* a simple exercise of the Secret Sun Web Server. As configured, the URL to
* access it is http://localhost:8000/echo.
*
* @author Andrew Cowie
*/
public final class Test {
public static void main(String[] args) throws IOException {
final InetSocketAddress addr;
final HttpServer server;
addr = new InetSocketAddress(8000);
server = HttpServer.create(addr, 10);
server.createContext("/echo", new EchoHandler());
server.start();
}
}
class EchoHandler implements HttpHandler {
public void handle(HttpExchange t) throws IOException {
final InputStream is;
final OutputStream os;
StringBuilder buf;
int b;
final String request, response;
buf = new StringBuilder();
/*
* Get the request body and decode it. Regardless of what you are
* actually doing, it is apparently considered correct form to consume
* all the bytes from the InputStream. If you don't, closing the
* OutputStream will cause that to occur
*/
is = t.getRequestBody();
while ((b = is.read()) != -1) {
buf.append((char) b);
}
is.close();
if (buf.length() > 0) {
request = URLDecoder.decode(buf.toString(), "UTF-8");
} else {
request = null;
}
/*
* Construct our response:
*/
buf = new StringBuilder();
buf.append("<html><head><title>HTTP echo server</title></head><body>");
buf.append("<p><pre>");
buf.append(t.getRequestMethod() + " " + t.getRequestURI() + " " + t.getProtocol() + "\n");
/*
* Process the request headers. This is a bit involved due to the
* complexity arising from the fact that headers can be repeated.
*/
Headers headers = t.getRequestHeaders();
for (String name : headers.keySet()) {
List<String> values = headers.get(name);
for (String value : values) {
buf.append(name + ": " + value + "\n");
}
}
/*
* If there was an actual body to the request, add it:
*/
if (request != null) {
buf.append("\n");
buf.append(request);
}
buf.append("</pre></p>");
buf.append("</body></html>\n");
response = buf.toString();
System.out.println(response);
/*
* And now send the response. We could have instead done this
* dynamically, using 0 as the response size (forcing chunked
* encoding) and writing the bytes of the response directly to the
* OutputStream, but building the String first allows us to know the
* exact length so we can send a response with a known size. Better :)
*/
headers.add("Content-Type", "text/html");
t.sendResponseHeaders(HTTP_OK, response.length());
os = t.getResponseBody();
os.write(response.getBytes());
/*
* And we're done!
*/
os.close();
t.close();
}
}

You are retrieving the headers from the request, and changing request headers doesn't make any sense. You need to modify the response headers instead. You can do so by adding the following:
t.getResponseHeaders().add("Content-Type", "text/html");
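For reference, here is a small self-contained handler that sets the header on the response. It is a sketch, not a rewrite of the EchoServer above: the class name HtmlHandler, the fixed HTML body and the start() helper are assumptions made for the example. It also passes the byte length to sendResponseHeaders rather than String.length(), since the two differ as soon as the body contains non-ASCII characters.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical handler, for illustration only.
final class HtmlHandler implements HttpHandler {

    @Override
    public void handle(HttpExchange exchange) throws IOException {
        byte[] body = "<html><body><p>hello</p></body></html>"
            .getBytes(StandardCharsets.UTF_8);
        // Response headers must be set before sendResponseHeaders is called
        exchange.getResponseHeaders()
            .add("Content-Type", "text/html; charset=UTF-8");
        // Announce the length in bytes, not in characters
        exchange.sendResponseHeaders(200, body.length);
        try (OutputStream os = exchange.getResponseBody()) {
            os.write(body);
        }
    }

    /** Starts the server on a free port chosen by the OS (port 0). */
    static HttpServer start() throws IOException {
        HttpServer server =
            HttpServer.create(new InetSocketAddress("localhost", 0), 0);
        server.createContext("/echo", new HtmlHandler());
        server.start();
        return server;
    }
}
```

The port that was picked is available through server.getAddress().getPort(), and server.stop(0) shuts the server down.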

Integrating Pentaho Reporting with Java web application

I am trying to integrate my Pentaho generated reports with a Java application. My reports are based on OLAP data and written using MDX queries. I found an example on one of the blogs and used it as foundation. My code is:
package org.pentaho.reporting.engine.classic.samples;
import in.nic.spaconsole.ServletContextProvider;
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.util.Map;
import java.util.HashMap;
import org.pentaho.reporting.engine.classic.core.DataFactory;
import org.pentaho.reporting.engine.classic.core.MasterReport;
import org.pentaho.reporting.engine.classic.core.ReportProcessingException;
import org.pentaho.reporting.libraries.resourceloader.Resource;
import org.pentaho.reporting.libraries.resourceloader.ResourceException;
import org.pentaho.reporting.libraries.resourceloader.ResourceManager;
/**
* Generates a report in the following scenario:
* <ol>
* <li>The report definition file is a .prpt file which will be loaded and parsed
* <li>The data factory is a simple JDBC data factory using HSQLDB
* <li>There are no runtime report parameters used
* </ol>
*/
public class Sample1 extends AbstractReportGenerator
{
/**
* Default constructor for this sample report generator
*/
public Sample1()
{
}
/**
* Returns the report definition which will be used to generate the report. In this case, the report will be
* loaded and parsed from a file contained in this package.
*
* @return the loaded and parsed report definition to be used in report generation.
*/
public MasterReport getReportDefinition(String reportPath)
{
ResourceManager manager = new ResourceManager();
manager.registerDefaults();
try {
Resource res = manager.createDirectly(new URL(reportPath),
MasterReport.class);
MasterReport report = (MasterReport) res.getResource();
return report;
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
/**
* Returns the data factory which will be used to generate the data used during report generation. In this example,
* we will return null since the data factory has been defined in the report definition.
*
* @return the data factory used with the report generator
*/
public DataFactory getDataFactory()
{
return null;
}
/**
* Returns the set of runtime report parameters. This sample report uses the following three parameters:
* <ul>
* <li><b>Report Title</b> - The title text on the top of the report</li>
* <li><b>Customer Names</b> - an array of customer names to show in the report</li>
* <li><b>Col Headers BG Color</b> - the background color for the column headers</li>
* </ul>
*
* @return the runtime parameters passed to the report
*/
public Map<String, Object> getReportParameters()
{
final Map<String, Object> parameters = new HashMap<String, Object>();
parameters.put("stday", 28);
parameters.put("styear", 2012);
parameters.put("stmonth", 10);
parameters.put("eday", 28);
parameters.put("eyear", 2012);
parameters.put("emonth", 10);
parameters.put("Sitesearch","india.gov.in");
parameters.put("firstResult",1);
parameters.put("lastResult", 100);
parameters.put("Pagenumber",1);
return parameters;
}
/**
* Simple command line application that will generate a PDF version of the report. In this report,
* the report definition has already been created with the Pentaho Report Designer application and
* it located in the same package as this class. The data query is located in that report definition
* as well, and there are a few report-modifying parameters that will be passed to the engine at runtime.
* <p/>
* The output of this report will be a PDF file located in the current directory and will be named
* <code>SimpleReportGeneratorExample.pdf</code>.
*
* @param args none
* @throws IOException indicates an error writing to the filesystem
* @throws ReportProcessingException indicates an error generating the report
*/
public static void main(String[] args) throws IOException, ReportProcessingException
{
final File outputFilenamehtml = new File(Sample1.class.getSimpleName() + ".html");
final File outputFilenamepdf = new File(Sample1.class.getSimpleName() + ".pdf");
// Generate the report
// new Sample1().generateReport(AbstractReportGenerator.OutputType.PDF, outputFilenamepdf);
// System.err.println("Generated the report [" + outputFilenamepdf.getAbsolutePath() + "]");
// new Sample1().generateReport(AbstractReportGenerator.OutputType.HTML, outputFilenamehtml);
// Output the location of the file
//System.err.println("Generated the report111 [" + outputFilenamehtml.getAbsolutePath() + "]");
}
public void report(String Path) throws IllegalArgumentException, ReportProcessingException, IOException {
// TODO Auto-generated method stub
// final File outputFilenamehtml = new File(Sample1.class.getSimpleName() + ".html");
final File outputFilenamepdf = new File(Sample1.class.getSimpleName() + ".pdf");
// Generate the report
new Sample1().generateReport(AbstractReportGenerator.OutputType.PDF, outputFilenamepdf,Path);
System.err.println("Generated the report [" + outputFilenamepdf.getAbsolutePath() + "]");
// new Sample1().generateReport(AbstractReportGenerator.OutputType.HTML, outputFilenamehtml,Path);
// Output the location of the file
// System.err.println("Generated the report111 [" + outputFilenamehtml.getAbsolutePath() + "]");
}
}
AbstractReportGenerator.java
/*
* This program is free software; you can redistribute it and/or modify it under the
* terms of the GNU Lesser General Public License, version 2.1 as published by the Free Software
* Foundation.
*
* You should have received a copy of the GNU Lesser General Public License along with this
* program; if not, you can obtain a copy at http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html
* or from the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
*
* This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
* without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
* See the GNU Lesser General Public License for more details.
*
* Copyright (c) 2009 Pentaho Corporation.. All rights reserved.
*/
package org.pentaho.reporting.engine.classic.samples;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Map;
import org.pentaho.reporting.engine.classic.core.ClassicEngineBoot;
import org.pentaho.reporting.engine.classic.core.DataFactory;
import org.pentaho.reporting.engine.classic.core.MasterReport;
import org.pentaho.reporting.engine.classic.core.ReportProcessingException;
import org.pentaho.reporting.engine.classic.core.layout.output.AbstractReportProcessor;
import org.pentaho.reporting.engine.classic.core.modules.output.pageable.base.PageableReportProcessor;
import org.pentaho.reporting.engine.classic.core.modules.output.pageable.pdf.PdfOutputProcessor;
import org.pentaho.reporting.engine.classic.core.modules.output.table.base.FlowReportProcessor;
import org.pentaho.reporting.engine.classic.core.modules.output.table.base.StreamReportProcessor;
import org.pentaho.reporting.engine.classic.core.modules.output.table.html.AllItemsHtmlPrinter;
import org.pentaho.reporting.engine.classic.core.modules.output.table.html.FileSystemURLRewriter;
import org.pentaho.reporting.engine.classic.core.modules.output.table.html.HtmlOutputProcessor;
import org.pentaho.reporting.engine.classic.core.modules.output.table.html.HtmlPrinter;
import org.pentaho.reporting.engine.classic.core.modules.output.table.html.StreamHtmlOutputProcessor;
import org.pentaho.reporting.engine.classic.core.modules.output.table.xls.FlowExcelOutputProcessor;
import org.pentaho.reporting.libraries.repository.ContentLocation;
import org.pentaho.reporting.libraries.repository.DefaultNameGenerator;
import org.pentaho.reporting.libraries.repository.stream.StreamRepository;
/**
* This is the base class used with the report generation examples. It contains the actual <code>embedding</code>
* of the reporting engine and report generation. All example embedded implementations will need to extend this class
* and perform the following:
* <ol>
* <li>Implement the <code>getReportDefinition()</code> method and return the report definition (how the report
* definition is generated is up to the implementing class).
* <li>Implement the <code>getTableDataFactory()</code> method and return the data factory to be used (how
* this is created is up to the implementing class).
* <li>Implement the <code>getReportParameters()</code> method and return the set of report parameters to be used.
* If no report parameters are required, then this method can simply return <code>null</code>
* </ol>
*/
public abstract class AbstractReportGenerator
{
/**
* The supported output types for this sample
*/
public static enum OutputType
{
PDF, EXCEL, HTML
}
/**
* Performs the basic initialization required to generate a report
*/
public AbstractReportGenerator()
{
// Initialize the reporting engine
ClassicEngineBoot.getInstance().start();
}
/**
* Returns the report definition used by this report generator. If this method returns <code>null</code>,
* the report generation process will throw a <code>NullPointerException</code>.
*
* @return the report definition used by this report generator
*/
public abstract MasterReport getReportDefinition(String Path);
/**
* Returns the data factory used by this report generator. If this method returns <code>null</code>,
* the report generation process will use the data factory used in the report definition.
*
* @return the data factory used by this report generator
*/
public abstract DataFactory getDataFactory();
/**
* Returns the set of parameters that will be passed to the report generation process. If there are no parameters
* required for report generation, this method may return either an empty or a <code>null</code> <code>Map</code>
*
* @return the set of report parameters to be used by the report generation process, or <code>null</code> if no
* parameters are required.
*/
public abstract Map<String, Object> getReportParameters();
/**
* Generates the report in the specified <code>outputType</code> and writes it into the specified
* <code>outputFile</code>.
*
* @param outputType the output type of the report (PDF, EXCEL, HTML)
* @param outputFile the file into which the report will be written
* @throws IllegalArgumentException indicates the required parameters were not provided
* @throws IOException indicates an error opening the file for writing
* @throws ReportProcessingException indicates an error generating the report
*/
public void generateReport(final OutputType outputType,File outputFile,String Path)
throws IllegalArgumentException, IOException, ReportProcessingException
{
if (outputFile == null)
{
throw new IllegalArgumentException("The output file was not specified");
}
OutputStream outputStream = null;
try
{
// Open the output stream
outputStream = new BufferedOutputStream(new FileOutputStream(outputFile));
// Generate the report to this output stream
generateReport(outputType, outputStream,Path);
}
finally
{
if (outputStream != null)
{
outputStream.close();
}
}
}
/**
* Generates the report in the specified <code>outputType</code> and writes it into the specified
* <code>outputStream</code>.
* <p/>
* It is the responsibility of the caller to close the <code>outputStream</code>
* after this method is executed.
*
* @param outputType the output type of the report (PDF, EXCEL, HTML)
* @param outputStream the stream into which the report will be written
* @throws IllegalArgumentException indicates the required parameters were not provided
* @throws ReportProcessingException indicates an error generating the report
*/
public void generateReport(final OutputType outputType, OutputStream outputStream,String Path)
throws IllegalArgumentException, ReportProcessingException
{
if (outputStream == null)
{
throw new IllegalArgumentException("The output stream was not specified");
}
// Get the report and data factory
final MasterReport report = getReportDefinition(Path);
final DataFactory dataFactory = getDataFactory();
// Set the data factory for the report
if (dataFactory != null)
{
report.setDataFactory(dataFactory);
}
// Add any parameters to the report
final Map<String, Object> reportParameters = getReportParameters();
if (null != reportParameters)
{
for (String key : reportParameters.keySet())
{
report.getParameterValues().put(key, reportParameters.get(key));
}
}
// Prepare to generate the report
AbstractReportProcessor reportProcessor = null;
try
{
// Create the report processor for the specified output type
switch (outputType)
{
case PDF:
{
final PdfOutputProcessor outputProcessor =
new PdfOutputProcessor(report.getConfiguration(), outputStream, report.getResourceManager());
reportProcessor = new PageableReportProcessor(report, outputProcessor);
break;
}
case EXCEL:
{
final FlowExcelOutputProcessor target =
new FlowExcelOutputProcessor(report.getConfiguration(), outputStream, report.getResourceManager());
reportProcessor = new FlowReportProcessor(report, target);
break;
}
case HTML:
{
final StreamRepository targetRepository = new StreamRepository(outputStream);
final ContentLocation targetRoot = targetRepository.getRoot();
final HtmlOutputProcessor outputProcessor = new StreamHtmlOutputProcessor(report.getConfiguration());
final HtmlPrinter printer = new AllItemsHtmlPrinter(report.getResourceManager());
printer.setContentWriter(targetRoot, new DefaultNameGenerator(targetRoot, "index", "html"));
printer.setDataWriter(null, null);
printer.setUrlRewriter(new FileSystemURLRewriter());
outputProcessor.setPrinter(printer);
reportProcessor = new StreamReportProcessor(report, outputProcessor);
break;
}
}
// Generate the report
reportProcessor.processReport();
}
finally
{
if (reportProcessor != null)
{
reportProcessor.close();
}
}
}
}
and I have a controller that invokes Sample1.java:
@POST
@Path("/get/reportDisplay")
@Consumes(MediaType.APPLICATION_JSON)
@Produces(MediaType.TEXT_HTML)
public Response exportReport(final ReportBean reportBean,
@Context final HttpServletRequest request,
@Context final HttpServletResponse response,
@Context final ServletContext context) throws IOException,
ParseException, InstantiationException, IllegalAccessException,
ClassNotFoundException, SQLException, JRException, IllegalArgumentException,ReportProcessingException {
String reportPath ="file://" +context.getRealPath("anor_admin.prpt");
Sample1 sample=new Sample1();
sample.report(reportPath);
return Response.status(200).build();
}
//End of Methods
} //End of class
But I am getting errors and not able to preview my reports. Please help me with this.

I've successfully embedded the reporting engine in a web application, but I had to fix some classpath errors to get it to work.
Make sure your WAR includes report-engine-classic.core.jar and the lib*.jar files from pentaho-library.
You'll need extra dependencies if you use charts.
What kind of errors are you getting?
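One quick way to tell whether the errors are classpath errors is to try loading a few of the engine's classes from inside the deployed application. The sketch below is only a diagnostic idea, not part of Pentaho: the class ClasspathCheckSketch and its methods are hypothetical, and the class names it checks are taken from the imports in the question's code.

```java
import java.util.ArrayList;
import java.util.List;

public class ClasspathCheckSketch {

    /** Returns true if the named class is visible to this class's loader (without initializing it). */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class is absent or one of its own dependencies is missing.
            return false;
        }
    }

    /** Returns the subset of the given class names that could not be loaded. */
    public static List<String> missing(String... classNames) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            if (!isOnClasspath(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Class names copied from the imports used in the question.
        System.out.println("Missing: " + missing(
                "org.pentaho.reporting.engine.classic.core.ClassicEngineBoot",
                "org.pentaho.reporting.libraries.resourceloader.ResourceManager",
                "org.pentaho.reporting.libraries.repository.ContentLocation"));
    }
}
```

Calling missing(...) from the controller and logging the result shows which libraries the web application's class loader cannot see, which is usually more telling than running the same check from a command-line main.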

Programmatically downloading a file using Selenium in Java

I was looking for a Chrome extension that can intercept the download whenever we click a PDF link, or a link that generates a PDF on the server. One way to do this is Selenium browser profiling, and I found the code below. I want Selenium to download the PDF file and rename it according to strings I pass from my Java program.
How can I use this code to download the file and hook it up to my program? It should be triggered whenever I execute a command like this:
driver.findElement(By.xpath("//a[contains(@href,\"/bbtobs/bbtolbext/statements/savepdf?type=current&AccountIndex=0\")]")).click();
Code:
package com.lazerycode.selenium.filedownloader;
import org.apache.commons.io.FileUtils;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.params.ClientPNames;
import org.apache.http.client.protocol.ClientContext;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.cookie.BasicClientCookie;
import org.apache.http.params.HttpParams;
import org.apache.http.protocol.BasicHttpContext;
import org.apache.log4j.Logger;
import org.openqa.selenium.Cookie;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.Set;
public class FileDownloader {
private static final Logger LOG = Logger.getLogger(FileDownloader.class);
private WebDriver driver;
private String localDownloadPath = System.getProperty("java.io.tmpdir");
private boolean followRedirects = true;
private boolean mimicWebDriverCookieState = true;
private int httpStatusOfLastDownloadAttempt = 0;
public FileDownloader(WebDriver driverObject) {
this.driver = driverObject;
}
/**
* Specify if the FileDownloader class should follow redirects when trying to download a file
*
* @param value
*/
public void followRedirectsWhenDownloading(boolean value) {
this.followRedirects = value;
}
/**
* Get the current location that files will be downloaded to.
*
* @return The filepath that the file will be downloaded to.
*/
public String localDownloadPath() {
return this.localDownloadPath;
}
/**
* Set the path that files will be downloaded to.
*
* @param filePath The filepath that the file will be downloaded to.
*/
public void localDownloadPath(String filePath) {
this.localDownloadPath = filePath;
}
/**
* Download the file specified in the href attribute of a WebElement
*
* @param element
* @return
* @throws Exception
*/
public String downloadFile(WebElement element) throws Exception {
return downloader(element, "href");
}
/**
* Download the image specified in the src attribute of a WebElement
*
* @param element
* @return
* @throws Exception
*/
public String downloadImage(WebElement element) throws Exception {
return downloader(element, "src");
}
/**
* Gets the HTTP status code of the last download file attempt
*
* @return
*/
public int getHTTPStatusOfLastDownloadAttempt() {
return this.httpStatusOfLastDownloadAttempt;
}
/**
* Mimic the cookie state of WebDriver (Defaults to true)
* This will enable you to access files that are only available when logged in.
* If set to false the connection will be made as an anonymous user
*
* @param value
*/
public void mimicWebDriverCookieState(boolean value) {
this.mimicWebDriverCookieState = value;
}
/**
* Load in all the cookies WebDriver currently knows about so that we can mimic the browser cookie state
*
* @param seleniumCookieSet
* @return
*/
private BasicCookieStore mimicCookieState(Set<Cookie> seleniumCookieSet) {
BasicCookieStore mimicWebDriverCookieStore = new BasicCookieStore();
for (Cookie seleniumCookie : seleniumCookieSet) {
BasicClientCookie duplicateCookie = new BasicClientCookie(seleniumCookie.getName(), seleniumCookie.getValue());
duplicateCookie.setDomain(seleniumCookie.getDomain());
duplicateCookie.setSecure(seleniumCookie.isSecure());
duplicateCookie.setExpiryDate(seleniumCookie.getExpiry());
duplicateCookie.setPath(seleniumCookie.getPath());
mimicWebDriverCookieStore.addCookie(duplicateCookie);
}
return mimicWebDriverCookieStore;
}
/**
* Perform the file/image download.
*
* @param element
* @param attribute
* @return
* @throws IOException
* @throws NullPointerException
*/
private String downloader(WebElement element, String attribute) throws IOException, NullPointerException, URISyntaxException {
String fileToDownloadLocation = element.getAttribute(attribute);
if (fileToDownloadLocation.trim().equals("")) throw new NullPointerException("The element you have specified does not link to anything!");
URL fileToDownload = new URL(fileToDownloadLocation);
File downloadedFile = new File(this.localDownloadPath + fileToDownload.getFile().replaceFirst("/|\\\\", ""));
if (downloadedFile.canWrite() == false) downloadedFile.setWritable(true);
HttpClient client = new DefaultHttpClient();
BasicHttpContext localContext = new BasicHttpContext();
LOG.info("Mimic WebDriver cookie state: " + this.mimicWebDriverCookieState);
if (this.mimicWebDriverCookieState) {
localContext.setAttribute(ClientContext.COOKIE_STORE, mimicCookieState(this.driver.manage().getCookies()));
}
HttpGet httpget = new HttpGet(fileToDownload.toURI());
HttpParams httpRequestParameters = httpget.getParams();
httpRequestParameters.setParameter(ClientPNames.HANDLE_REDIRECTS, this.followRedirects);
httpget.setParams(httpRequestParameters);
LOG.info("Sending GET request for: " + httpget.getURI());
HttpResponse response = client.execute(httpget, localContext);
this.httpStatusOfLastDownloadAttempt = response.getStatusLine().getStatusCode();
LOG.info("HTTP GET request status: " + this.httpStatusOfLastDownloadAttempt);
LOG.info("Downloading file: " + downloadedFile.getName());
FileUtils.copyInputStreamToFile(response.getEntity().getContent(), downloadedFile);
response.getEntity().getContent().close();
String downloadedFileAbsolutePath = downloadedFile.getAbsolutePath();
LOG.info("File downloaded to '" + downloadedFileAbsolutePath + "'");
return downloadedFileAbsolutePath;
}
}

String s = driver.findElement(By.cssSelector("#navbtm img")).getAttribute("src");
URL url = new URL(s);
System.out.println(url);
BufferedImage bufImgOne = ImageIO.read(url);
ImageIO.write(bufImgOne, "png", new File("test.png"));

I found an excellent article about this. I tried it, and it works perfectly:
http://ardesco.lazerycode.com/index.php/2012/07/how-to-download-files-with-selenium-and-why-you-shouldnt/
UPDATE:
A better solution is to get the image from the browser itself:
https://groups.google.com/forum/#!msg/selenium-users/8atiPIh39OY/Gp9_KEXnpRUJ
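
The idea behind the FileDownloader class in the question (replay the browser's cookies, fetch the link directly, and save the result under a name you choose) can be sketched with only the JDK. This is a simplified illustration under stated assumptions, not the article's code: NamedDownloadSketch, cookieHeader and download are hypothetical names, and cookies are passed in as a plain name/value map rather than Selenium Cookie objects.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Collections;
import java.util.Map;
import java.util.StringJoiner;

public class NamedDownloadSketch {

    /** Joins cookie name/value pairs into a single Cookie request header value. */
    public static String cookieHeader(Map<String, String> cookies) {
        StringJoiner joiner = new StringJoiner("; ");
        for (Map.Entry<String, String> cookie : cookies.entrySet()) {
            joiner.add(cookie.getKey() + "=" + cookie.getValue());
        }
        return joiner.toString();
    }

    /** Fetches the URL, sending the given cookies, and saves the body as directory/fileName. */
    public static Path download(URL source, Map<String, String> cookies,
                                Path directory, String fileName) throws IOException {
        URLConnection conn = source.openConnection();
        if (!cookies.isEmpty()) {
            conn.setRequestProperty("Cookie", cookieHeader(cookies));
        }
        Files.createDirectories(directory);
        Path target = directory.resolve(fileName);
        try (InputStream in = conn.getInputStream()) {
            // The caller picks the file name, so no rename step is needed afterwards.
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: NamedDownloadSketch <url> <fileName>");
            return;
        }
        Map<String, String> noCookies = Collections.emptyMap();
        Path saved = download(URI.create(args[0]).toURL(), noCookies, Paths.get("."), args[1]);
        System.out.println("Saved to " + saved.toAbsolutePath());
    }
}
```

With Selenium, the map would be filled from driver.manage().getCookies() using each cookie's getName() and getValue(), and the URL would come from the link's href attribute instead of clicking the element.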

Can't read a google static map via HtmlUnit

I am trying to read a static map from Google Maps on the following URL.
This works fine from my web browser, but when I try this from HtmlUnit I get an UnexpectedPage result. Does anyone know what this means?

According to the Javadoc of UnexpectedPage, you receive an UnexpectedPage when the server returns an unexpected content type. If you check the response headers in HtmlUnit, you can see that they contain: Content-Type=image/png
Here's a small application that retrieves an image from a URL:
import java.awt.Image;
import java.io.IOException;
import java.io.InputStream;
import javax.imageio.ImageIO;
import javax.swing.ImageIcon;
import javax.swing.JFrame;
import javax.swing.JLabel;
import com.gargoylesoftware.htmlunit.UnexpectedPage;
import com.gargoylesoftware.htmlunit.WebClient;
/** Small test application used to fetch a map. */
public class FetchMapSwingApp extends JFrame {
/** Serial Id. */
private static final long serialVersionUID = 1920071939468904323L;
/**
* Default constructor.
*/
public FetchMapSwingApp() {
// Make sure the application closes correctly
setDefaultCloseOperation(javax.swing.WindowConstants.EXIT_ON_CLOSE);
// The map we're trying to read
String url = "http://maps.googleapis.com/maps/api/staticmap?center=55.690815,12.560678&zoom=15&size=400x500&sensor=false";
// Fetch the image
Image image = fetchMap(url);
// Add the image to the JFrame and resize the frame.
add(new JLabel(new ImageIcon(image)));
pack();
}
/**
* Fetch the image on the given URL.
*
* @param url
* the image location
* @return the fetched image
*/
private Image fetchMap(String url) {
Image image = null;
WebClient webClient = new WebClient();
webClient.setThrowExceptionOnScriptError(false);
try {
// The URL returns a png file!
UnexpectedPage page = webClient.getPage(url);
InputStream inputStream = page.getInputStream();
// Read the stream to an image
image = ImageIO.read(inputStream);
} catch (IOException e) {
e.printStackTrace();
}
return image;
}
/**
* Start of the application.
*
* @param args
* the arguments to the main method
*/
public static void main(String args[]) {
java.awt.EventQueue.invokeLater(new Runnable() {
@Override
public void run() {
new FetchMapSwingApp().setVisible(true);
}
});
}
}
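
The decoding step can be separated from HtmlUnit and checked on its own. The following sketch, using only the JDK, tests a Content-Type value and decodes a stream with ImageIO; the class ImageResponseSketch and its method names are invented for this illustration, and the in-memory PNG stands in for the bytes the server would return.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;
import javax.imageio.ImageIO;

public class ImageResponseSketch {

    /** True when a Content-Type header value names an image type such as "image/png". */
    public static boolean isImage(String contentType) {
        return contentType != null
                && contentType.trim().toLowerCase(Locale.ROOT).startsWith("image/");
    }

    /** Decodes the stream as an image and fails loudly if no ImageIO reader understands it. */
    public static BufferedImage decode(InputStream in) throws IOException {
        BufferedImage image = ImageIO.read(in);
        if (image == null) {
            // ImageIO.read returns null, rather than throwing, for unrecognized data.
            throw new IOException("Stream does not contain a supported image format");
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a server response: a tiny PNG built in memory.
        ByteArrayOutputStream png = new ByteArrayOutputStream();
        ImageIO.write(new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB), "png", png);
        if (isImage("image/png")) {
            BufferedImage image = decode(new ByteArrayInputStream(png.toByteArray()));
            System.out.println(image.getWidth() + "x" + image.getHeight());
        }
    }
}
```

Checking for a null result matters because ImageIO.read does not throw when the data is not an image, and passing null on to ImageIcon would only fail later.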
