Connection timed out while generating pdf from html string - java

I am using <dependency>
<groupId>org.xhtmlrenderer</groupId>
<artifactId>flying-saucer-pdf-itext5</artifactId>
<version>9.0.4</version>
</dependency>
to convert my HTML string to PDF.
try {
String table = getHtmlAsString();//returns html string which contains reference to external CSS
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = builder.parse(new ByteArrayInputStream(table.getBytes("UTF-8")));
ITextRenderer renderer = new ITextRenderer();
renderer.setDocument(doc, null);
ByteArrayOutputStream byteArray = new ByteArrayOutputStream();
renderer.layout();
renderer.createPDF(byteArray);
byte[] pdf = byteArray.toByteArray();
byteArray.close();
writeByteArrayToFile(pdf);
} catch (Exception e) {
e.printStackTrace();
}
The code is working fine locally. But on production server Its throwing connection time out exception.
Here is complete stacks trace
java.net.ConnectException: Connection timed out: connect at
java.net.DualStackPlainSocketImpl.connect0(Native Method) at
java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source) at
java.net.AbstractPlainSocketImpl.doConnect(Unknown Source) at
java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source) at
java.net.AbstractPlainSocketImpl.connect(Unknown Source) at
java.net.PlainSocketImpl.connect(Unknown Source) at
java.net.SocksSocketImpl.connect(Unknown Source) at
java.net.Socket.connect(Unknown Source) at
java.net.Socket.connect(Unknown Source) at
sun.net.NetworkClient.doConnect(Unknown Source) at
sun.net.www.http.HttpClient.openServer(Unknown Source) at
sun.net.www.http.HttpClient.openServer(Unknown Source) at
sun.net.www.http.HttpClient.<init>(Unknown Source) at
sun.net.www.http.HttpClient.New(Unknown Source) at
sun.net.www.http.HttpClient.New(Unknown Source) at
sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source) at
sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source) at
sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source) at
sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source) at
org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source) at
org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source) at
org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source) at
org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source) at
org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source) at
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) at
org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) at
org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) at
org.apache.xerces.parsers.XMLParser.parse(Unknown Source) at
org.apache.xerces.parsers.DOMParser.parse(Unknown Source) at
org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source) at
javax.xml.parsers.DocumentBuilder.parse(Unknown Source) at
com.mypackage.PDFProcessor.getPdf(PDFProcessor.java:84)
Line No.84 in code is Document doc = builder.parse(new ByteArrayInputStream(table.getBytes("UTF-8")));
At first I thought may be server is not able to fetch External CSS. So I saved html string as html file and I am able to see the page on browser. That means server is able to access the CSS.

You may still be right about your CSS, since the fact that the main html has been saved does not mean that the CSS is well saved (actually, we don't even know if it tries to save it, have you tried to do your work through a proxy like Paros in order to know every communication you attempt?)
http://sourceforge.net/projects/paros/
By the way, your problem may be different. Are you behind a corporate proxy in your production environment? Does your "flying saucer pdf library" need any external resource (any .xsd or something like that) that could be stuck in the firewall and cause that time-out? Make sure that you have under controll all the communication your application generates and you will find your problem.

Related

Not able to create any portlet on Liferay 6.2 CE GA4 Offline without internet

I have created a fresh Liferay 6.2 CE GA4 using pre-downloaded SDK/Tomcat/LR Source. I have created server in eclipse and when I try to create portlet for the first time it try to download "gradle-2.2.1-bin.zip" from the internet.
I have downloaded same from gradle from "https://downloads.gradle.org/distributions/gradle-2.2.1-bin.zip" and placed it inside "liferay-plugins-sdk-6.2\tools\gradle\gradle\wrapper".
Still it shows the same error as below:
`Downloading https://services.gradle.org/distributions/gradle-2.2.1-bin.zip
Exception in thread "main" java.net.ConnectException: Connection timed out: connect
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(Unknown Source)
at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.connect(Unknown Source)
at com.sun.net.ssl.internal.ssl.BaseSSLSocketImpl.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.protocol.https.HttpsClient.(Unknown Source)
at sun.net.www.protocol.https.HttpsClient.New(Unknown Source)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source)
at org.gradle.wrapper.Download.downloadInternal(Download.java:56)
at org.gradle.wrapper.Download.download(Download.java:42)
at org.gradle.wrapper.Install$1.call(Install.java:57)
at org.gradle.wrapper.Install$1.call(Install.java:44)
at org.gradle.wrapper.ExclusiveFileAccessManager.access(ExclusiveFileAccessManager.java:65)
at org.gradle.wrapper.Install.createDist(Install.java:44)
at org.gradle.wrapper.WrapperExecutor.execute(WrapperExecutor.java:126)
at org.gradle.wrapper.GradleWrapperMain.main(GradleWrapperMain.java:56)
`
Am I placing that zip on wrong directory or missing something else ?
From the error message, I'd say: Yes, it seems you put the file into the wrong directory. Which one is the correct one? I don't know. It'd be easiest to do your first build while connected to the internet. The download will just happen once (also, there might be more dependencies that need to be downloaded) and you can do your builds offline after that.
To resolve your problem, you need to:
1) Delete content on:
D:\Profiles\XXX.gradle\wrapper\dists\gradle-2.2.1-bin\88n1whbyjvxg3s40jzz5ur27
2) Download manually:
https://services.gradle.org/distributions/gradle-2.2.1-bin.zip
3) Put gradle-2.2.1-bin.zip on:
D:\Profiles\XXX.gradle\wrapper\dists\gradle-2.2.1-bin\88n1whbyjvxg3s40jzz5ur27

Connection Time out error for Crawler while using Jsoup for crawler

I am trying to execute a crawler program from my office. A very basic one which is available in internet and which works fine in my home PC. However while I am trying to run the same program in my office PC i am getting connect timed out error. I thought it was proxy problem and tried accessing some site from eclipse internal browser and it worked fine also.
Document doc = Jsoup.connect("http://flipkart.com/").timeout(0).get();
Please find below my stack trace
Exception in thread "main" java.net.ConnectException: Connection timed out: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.<init>(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:449)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:434)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:181)
at org.jsoup.helper.HttpConnection.get(HttpConnection.java:170)
at org.syntel.crawler.Crawler.processPage(Crawler.java:44)
at org.syntel.crawler.Crawler.main(Crawler.java:20)
How can I fix this problem?
#alkis made the suggestion:
Try setting a user agent. ff you are using a proxy check this other question:
How to add proxy support to Jsoup (HTML parser)?
Try using:
System.out.println("Testing JSOUP\n--------------");
Proxy proxy = new Proxy( //
Proxy.Type.HTTP, //
InetSocketAddress.createUnresolved("www.yourPROXY.com", 80) //
);
Document doc = Jsoup.connect("http://en.wikipedia.org/").proxy(proxy).get();
Elements newsHeadlines = doc.select("#mp-itn b a");
System.out.println(newsHeadlines.html());

Using Axis1 over https

how are you?
Well, the problem is that we are using Axis1 to consume a wsdl based webservice which works fine when the URL where the WSDL is located uses plain old HTTP connection, but when it uses a SSL secured conection it brings a ConnectionException when Axis1 tries to download the WSDL document content.
Even reading comments on XMLUtils.class the Axis developers aren't even sure if it will work with HTTPS as it reads on line 810.
Is there any way to solve this? Whe tried to install the certificates on the computer, on ...jre7/lib/security/cacerts and tried to trust all certificates but the problem persists...
Thanks in advance.
Edit:
You can reproduce the Exception with this code:
InputSource source = new InputSource(urlWSDL);
DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
Document doc = (Document) db.parse(source);
The Exception is:
java.net.ConnectException: Connection timed out: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.security.ssl.SSLSocketImpl.connect(Unknown Source)
at sun.security.ssl.BaseSSLSocketImpl.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.protocol.https.HttpsClient.<init>(Unknown Source)
at sun.net.www.protocol.https.HttpsClient.New(Unknown Source)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source)
at java.net.URL.openStream(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
at org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
The problem was the Company Proxy, so I've appended:
System.setProperty("https.proxySet", "true");
System.setProperty("https.proxyHost", this.getConfiguracionProxy().getUrlProxy());
System.setProperty("https.proxyPort", this.getConfiguracionProxy().getPuertoProxy());
To:
System.setProperty("http.proxySet", "true");
System.setProperty("http.proxyHost", this.getConfiguracionProxy().getUrlProxy());
System.setProperty("http.proxyPort", this.getConfiguracionProxy().getPuertoProxy());
I've found the solution looking at Wireshark. When I was getting the file on SoapUI or on a web browser, the IP was other than the IP used by our application (the true IP). Then I realized that I was behind a proxy.
I've never used Wireshark... I've learned a lot, which is a good thing.
This sets as a System property the stored proxy configuration.
Thanks everybody.

getting unhandled exception type malformed url exception

I'm pretty new to java , actually I have just started learning it
I tried to do an exercise and the exercise was to read the first five lines of a webpage
for start I wrote this code :
import java.io.* ;
import java.net.URL ;
class testcode {
public static void main(String[] args) throws Exception {
URL address = new URL("http://www.yahoo.com/") ;
InputStream is = address.openStream() ;
InputStreamReader isr = new InputStreamReader(is) ;
BufferedReader reader = new BufferedReader(isr) ;
String line = reader.readLine() ;
}
}
but when I run this piece of code through Eclipse , I get this :
Exception in thread "main" java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.<init>(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at java.net.URL.openStream(Unknown Source)
at test.testcode.main(testcode.java:10)
Why is this happening !?
and ofcourse when I dont put the throws Exception part at the begining I get the malformed url exception !
PS : my internet connection works just fine !
Can Somebody please help me and explain why is this happening while doing that ?
I have a pretty good c++ background so feel free to explain as deep as you can :D
It seems like your program can't connect to the URL. Are u using internet behind proxy? If so, then make sure ur program is configured accordingly. One way is to use this code:
System.setProperty("http.proxyHost", "proxy.mydomain.com");
System.setPropery("http.proxyPort", "8080");

Parsing XML files with Java and spaces in file path

I have files on my file system, on Windows XP. I want to parse them using Java (JRE 1.6).
Problem is, I don't understand how Java and Xerces work together when the file path has spaces in it.
If the file has no spaces in its path, all works fine.
If there are spaces, I may have this kind of trouble, even if I call the parser with a FileInputStream instance :
java.net.UnknownHostException: .
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.NetworkClient.openServer(Unknown Source)
at sun.net.ftp.FtpClient.openServer(Unknown Source)
at sun.net.ftp.FtpClient.openServer(Unknown Source)
at sun.net.www.protocol.ftp.FtpURLConnection.connect(Unknown Source)
at sun.net.www.protocol.ftp.FtpURLConnection.getInputStream(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
(sun.net.ftp.FtpClient.openServer ??? Wtf ?)
or else this kind of trouble :
java.net.MalformedURLException: unknown protocol: d
at java.net.URL.<init>(Unknown Source)
at java.net.URL.<init>(Unknown Source)
at java.net.URL.<init>(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
(It says unknown protocol: d because, I guess, the file is on the D drive.)
Has anyone any clue of why that happens, and how to circumvent the problem ? I tried to supply my own EntityResolver but my log tells me it is not even called before the crash.
EDIT:
Here is the code calling the parser.
public Document fileToDom(File file) throws ProcessException {
Document doc = null;
try {
DocumentBuilderFactory db = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = db.newDocumentBuilder();
if (this.errorHandler!=null){
builder.setErrorHandler(this.errorHandler);}
else {
builder.setErrorHandler(new DefaultHandler());
}
FileInputStream test= new FileInputStream(file);
doc = builder.parse(test);
...
} catch (Exception e) {...}
...
}
For the moment I find myself forced to remove the DOCTYPE before the parse, which removes all the problems, and the DTD validation... Not so great a solution.
Are you just using DocumentBuilder.parse(filename)?
If so, that's failing because it expects a URI. Open a FileInputStream to the file, and then pass that to DocumentBuilder.parse(InputStream).
Try this URI style:
file:///d:/folder/folder%20with%20space/file.xml
It looks like it's trying to connect to a URL in the doctype header so it can download it in order to validate the document against the downloaded DTD.
Try this.
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(test));
doc = builder.parse(is);
instead of just parsing the 'test'

Categories