Java JTextPane HTML image cookies - java

I am using a JTextPane to display data from a webpage that isn't mine, so I have no control over its contents. It requires a user to be logged in, so I use URLConnections to connect to that webpage and use cookies in the URLConnection to retrieve data. That works fine. However, when I put this data in a JTextPane with the content type set to text/html, the images do not display as they require those cookies with the session id and stuff to be sent in order to retrieve the uploaded images.
Is there any way I can make the JTextPane (though I am able to use anything else in the jdk that displays html) use my cookies?
Thanks.
I store the cookies in a linked list:
loadText = "Logging in...";
url = new URL("http://www.example.com/login.php");
connection = url.openConnection();
connection.setDoOutput(true);
OutputStreamWriter out = new OutputStreamWriter(
        connection.getOutputStream());
out.write("username=" + URLEncoder.encode(username, "UTF-8")
        + "&password=" + URLEncoder.encode(password, "UTF-8")
        + "&testcookies=1");
out.flush();
out.close();
// Collect the name=value part of every Set-Cookie response header.
List<String> cookies = new LinkedList<String>();
for (int i = 1; (headerName = connection.getHeaderFieldKey(i)) != null; i++) {
    if (headerName.equalsIgnoreCase("Set-Cookie")) {
        String cookie = connection.getHeaderField(i);
        int end = cookie.indexOf(';');
        // A cookie without attributes has no ';', so only cut when one is present.
        cookies.add(end >= 0 ? cookie.substring(0, end) : cookie);
    }
}
And I also need to strip away unnecessary HTML, which gives me a string I plug into the text pane:
String p1 = rawPage.split("<div id=\"contentstart\">")[1]
        .split("</div><!--id='contentstart'-->")[0];
p1 = p1.replaceAll("<p><strong></strong></p>", "");
p1 = p1.replaceAll("<p></p>", "");
parsed = true;
JTextPane tp = new JTextPane();
tp.setEditable(false);
JScrollPane js = new JScrollPane();
js.getViewport().add(tp);
js.setHorizontalScrollBarPolicy(ScrollPaneConstants.HORIZONTAL_SCROLLBAR_NEVER);
getContentPane().add(js);
js.setSize(640, 480);
tp.setContentType("text/html");
tp.setText(p1);

Are you not reading the content from URLConnection? Something like this may help.
Post your code so that we can get more insight.
JTextPane pane;
..
HTMLDocument htmlDocument = (HTMLDocument) pane.getDocument();
htmlDocument.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
htmlDocument.putProperty(Document.StreamDescriptionProperty, pageUrl);
pane.read(connection.getInputStream(), htmlDocument);
-- or --
You may try the browser swing component instead of JTextPane.
http://djproject.sourceforge.net/ns/index.html

Cookies are stored in relation to your browser. For example, if you have some cookies in Firefox, Microsoft IE can't see those cookies. Similarly, the cookies you have obtained from the webpage you're looking for are not available to your Java application.
But also, JTextPane is not a full-featured HTML browser. You can use it to render basic HTML (roughly HTML 3.2, a much older version of HTML), but it has no cookie handling of its own, only limited CSS support, and lacks other now-standard web features.
You may want to look at a more capable HTML renderer, such as Flying Saucer - see http://weblogs.java.net/blog/2007/07/14/flying-saucer-r7-out
But even if you do this, Flying Saucer won't see the cookies that you've obtained through other browsers.
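One approach that may be worth trying is sketched below. It rests on the assumption that Swing's HTML image views fetch their images through the ordinary java.net.URL machinery, so that a JVM-wide java.net.CookieManager installed before the login request is consulted for the image requests as well. The names SharedCookies, install and cookieHeaderFor are made up for this illustration.

```java
import java.io.IOException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original question or answers.
final class SharedCookies {
    private SharedCookies() {}

    /** Installs a JVM-wide cookie manager; call once, before the login request. */
    static CookieManager install() {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        return manager;
    }

    /** Returns the cookie strings the default handler would attach to a request for the URI. */
    static List<String> cookieHeaderFor(URI uri) throws IOException {
        Map<String, List<String>> headers = CookieHandler.getDefault()
                .get(uri, Collections.<String, List<String>>emptyMap());
        List<String> cookies = headers.get("Cookie");
        return cookies == null ? Collections.<String>emptyList() : cookies;
    }
}
```

If the assumption holds, calling SharedCookies.install() at startup should make the manual Set-Cookie loop unnecessary, and cookieHeaderFor(URI.create(imageUrl)) can be used to check what would be sent for a given image. Whether the images then load still depends on the site, for example on relative image URLs being resolvable.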

Related

C# Download web page after java loading

How can I download a web page that uses a Java-based loading mechanism?
The code below returns a nearly empty document because of the site's loading mechanism.
When viewed in a browser, you see "loading..." and after a while the content is presented.
Also, I want to avoid using the WebBrowser control.
HtmlDocument doc = new HtmlDocument();
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
req.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
if (!string.IsNullOrWhiteSpace(userAgent))
    req.UserAgent = userAgent;
if (cookies != null)
{
    req.CookieContainer = new CookieContainer();
    foreach (Cookie c in cookies)
        req.CookieContainer.Add(c);
}
var resp = req.GetResponse();
var resp_str = resp.GetResponseStream();
using (StreamReader sr = new StreamReader(resp_str, Encoding.GetEncoding("windows-1251")))
{
    string r = sr.ReadToEnd();
    doc.LoadHtml(r);
}
return doc;
Well, you basically need a web browser to run the JavaScript. Your web request only fetches the data, as is, from the server.
You could use System.Windows.Forms.WebBrowser, but it's not pretty. This answer, https://stackoverflow.com/a/11394830/2940949, might give you some idea of the basic issue.

unable save image in jsp

I'm unable to save a Data URI in JSP. I am trying like this, is there any mistake in the following code?
<%@ page import="java.awt.image.*,java.io.*,javax.imageio.*,sun.misc.*" %>
function save_photo()
{
    Webcam.snap(function(data_uri)
    {
        document.getElementById('results').innerHTML =
            '<h2>Here is your image:</h2>' + '<img src="'+data_uri+'"/>';
        var dat = data_uri;
        <%
        String st = "document.writeln(dat)";
        BufferedImage image = null;
        byte[] imageByte;
        BASE64Decoder decoder = new BASE64Decoder();
        imageByte = decoder.decodeBuffer(st);
        ByteArrayInputStream bis = new ByteArrayInputStream(imageByte);
        image = ImageIO.read(bis);
        bis.close();
        if (image != null)
            ImageIO.write(image, "jpg", new File("d://1.jpg"));
        out.println("value=" + st); // here it displays the base64 chars
        System.out.println("value=" + st); // but here it displays document.writeln(dat)
        %>
    });
}
Finally, the image is not saved.
I think you have missed the difference between JSP and JavaScript. JSP is executed on the server at the time your browser requests the web page, whereas JavaScript is executed on the client side, in your browser, when an interaction causes the script to run.
Your server (e.g. Apache Tomcat) will first execute your JSP code:
String st = "document.writeln(dat)";
BufferedImage image = null;
byte[] imageByte;
BASE64Decoder decoder = new BASE64Decoder();
imageByte = decoder.decodeBuffer(st);
ByteArrayInputStream bis = new ByteArrayInputStream(imageByte);
image = ImageIO.read(bis);
bis.close();
if (image != null)
    ImageIO.write(image, "jpg", new File("d://1.jpg"));
out.println("value=" + st);
System.out.println("value=" + st);
As you can see, the value of st is never changed. Your browser will receive the following snippet from your server:
value=document.writeln(dat)
Since your browser is the one that executes JavaScript, it will execute it and show the Base64-encoded image, but your server won't.
For the exact difference, read this article.
To make the code work, the easiest way is to redirect the page:
function(data_uri)
{
    // redirect; the data URI is URL-encoded so it survives as a query parameter
    document.location.href = 'saveImage.jsp?img=' + encodeURIComponent(data_uri);
}
Now you can have a JSP page called saveImage.jsp that saves the image, returns the web page you already had, and writes the data_uri into the results element.
Another, more difficult, way is to use AJAX. Here is an introduction to it.
You are trying to use JavaScript variables in Java code. Java code runs on your server, while JavaScript code runs in the user's browser. By the time the JavaScript code executes, your Java code has already run. Whatever you're trying to do, you have to do it in pure JavaScript, or send an AJAX call to your server once your JavaScript code has done its thing.
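Once the data URI actually reaches the server (for example as a request parameter, as suggested above), decoding it could look roughly like this sketch. It uses java.util.Base64 (Java 8 and later) instead of the internal sun.misc.BASE64Decoder, and it strips the "data:...;base64," prefix, which is not part of the base64 payload. DataUriDecoder and decode are hypothetical names.

```java
import java.util.Base64;

// Hypothetical helper for illustration; not taken from the question or answers.
final class DataUriDecoder {
    private DataUriDecoder() {}

    /** Decodes the payload of a base64 data URI such as "data:image/jpeg;base64,...". */
    static byte[] decode(String dataUri) {
        int comma = dataUri.indexOf(',');
        if (!dataUri.startsWith("data:") || comma < 0) {
            throw new IllegalArgumentException("Not a data URI");
        }
        String header = dataUri.substring(0, comma);
        if (!header.endsWith(";base64")) {
            throw new IllegalArgumentException("Only base64 data URIs are handled");
        }
        // Everything after the first comma is the base64-encoded content.
        return Base64.getDecoder().decode(dataUri.substring(comma + 1));
    }
}
```

The resulting bytes can then be wrapped in a ByteArrayInputStream and passed to ImageIO.read, as in the question's code.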

Java program to download images from a website and display the file sizes

I'm creating a Java program that will read an HTML document from a URL and display the sizes of the images it references. I'm not sure how to go about achieving this, though.
I don't need to actually download and save the images; I just need their sizes and the order in which they appear on the web page.
For example:
A web page has 3 images:
<img src="dog.jpg" /> //which is 54kb
<img src="cat.jpg" /> //which is 75kb
<img src="horse.jpg"/> //which is 80kb
I would need the output of my Java program to display:
54kb
75kb
80kb
Any ideas where I should start?
P.S. I'm a bit of a Java newbie.
If you're new to Java you may want to leverage an existing library to make things a bit easier. Jsoup allows you to fetch an HTML page and extract elements using CSS-style selectors.
This is just a quick and very dirty example but I think it will show how easy Jsoup can make such a task. Please note that error handling and response-code handling was omitted, I merely wanted to pass on the general idea:
Document doc = Jsoup.connect("http://stackoverflow.com/questions/14541740/java-program-to-download-images-from-a-website-and-display-the-file-sizes").get();
Elements imgElements = doc.select("img[src]");
// LinkedHashMap keeps the images in the order they appear on the page
Map<String, String> fileSizeMap = new LinkedHashMap<String, String>();
for (Element imgElement : imgElements) {
    String imgUrlString = imgElement.attr("abs:src");
    URL imgURL = new URL(imgUrlString);
    HttpURLConnection httpConnection = (HttpURLConnection) imgURL.openConnection();
    String contentLengthString = httpConnection.getHeaderField("Content-Length");
    if (contentLengthString == null)
        contentLengthString = "Unknown";
    fileSizeMap.put(imgUrlString, contentLengthString);
}
for (Map.Entry<String, String> mapEntry : fileSizeMap.entrySet()) {
    String imgFileName = mapEntry.getKey();
    System.out.println(imgFileName + " ---> " + mapEntry.getValue() + " bytes");
}
You might also consider looking at Apache HttpClient. I find it generally preferable over the raw URLConnection/HttpURLConnection approach.
You should break your problem into 3 sub-problems:
Download the HTML document
Parse the HTML document and find the images
Download the images and determine their sizes
You can use regular expressions to find the <img> tags and get the image URLs. After that you'll need the HttpURLConnection class to get the image data and measure its size.
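The regular-expression step could be sketched as below. This is only an illustration under simplifying assumptions: quoted src attributes and reasonably tidy markup. A real HTML parser is more robust, and attributes such as data-src may also match. ImgSrcFinder and find are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class ImgSrcFinder {
    // Captures the quoted value of a src attribute inside an <img ...> tag.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']*)[\"']",
            Pattern.CASE_INSENSITIVE);

    private ImgSrcFinder() {}

    /** Returns the src values of all img tags, in the order they appear in the HTML. */
    static List<String> find(String html) {
        List<String> sources = new ArrayList<>();
        Matcher matcher = IMG_SRC.matcher(html);
        while (matcher.find()) {
            sources.add(matcher.group(1));
        }
        return sources;
    }
}
```

Because matches are collected into a list as they are found, the page order asked for in the question is preserved. Relative values such as dog.jpg still have to be resolved against the page URL before they can be fetched.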
You can do this:
try {
    URL urlConn = new URL("http://yoururl.com/cat.jpg");
    URLConnection urlC = urlConn.openConnection();
    System.out.println(urlC.getContentLength());
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
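A possible refinement, sketched here as an assumption rather than a tested recipe for any particular site: for HTTP URLs a HEAD request asks the server for the headers only, so the size can be read without transferring the image itself. Servers that do not report a Content-Length yield -1. RemoteSize, contentLength and toKilobytes are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper for illustration only.
final class RemoteSize {
    private RemoteSize() {}

    /** Returns the size in bytes reported for the resource, or -1 if none is reported. */
    static long contentLength(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        if (connection instanceof HttpURLConnection) {
            HttpURLConnection http = (HttpURLConnection) connection;
            http.setRequestMethod("HEAD"); // headers only, no image bytes
            try {
                return http.getContentLengthLong();
            } finally {
                http.disconnect();
            }
        }
        // Other protocols (e.g. file:) have no HEAD; open and close the stream.
        try (InputStream in = connection.getInputStream()) {
            return connection.getContentLengthLong();
        }
    }

    /** Converts a byte count to whole kilobytes, passing an unknown size (-1) through. */
    static long toKilobytes(long bytes) {
        return bytes < 0 ? -1 : Math.round(bytes / 1024.0);
    }
}
```

Printing toKilobytes(contentLength(url)) + "kb" for each image URL, in page order, would give output in the shape the question asks for.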

Android: Extracting the text between two HTML tags

I need to extract the text between two HTML tags and store it in a string. An example of the HTML I want to parse is as follows:
<div id=\"swiki.2.1\"> THE TEXT I NEED </div>
I have done this in Java using the pattern (swiki\.2\.1\\\")(.*)(\/div) and getting the string I want from the group $2. However, this does not work on Android: when I print the contents of $2, nothing appears because the match fails.
Has anyone had a similar problem with using regex on Android, or is there a better (non-regex) way to parse the HTML page in the first place? Again, this works fine in a standard Java test program. Any help would be greatly appreciated!
For HTML-parsing-stuff I always use HtmlCleaner: http://htmlcleaner.sourceforge.net/
Awesome lib that works great with Xpath and of course Android. :-)
This shows how you can download HTML from a URL and parse it to get the value of a certain attribute (also shown in the docs):
public static String snapFromHtmlWithCookies(Context context, String xPath, String attrToSnap, String urlString,
        String cookies) throws IOException, XPatherException {
    String snap = "";
    // create an instance of HtmlCleaner
    HtmlCleaner cleaner = new HtmlCleaner();
    // take default cleaner properties
    CleanerProperties props = cleaner.getProperties();
    props.setAllowHtmlInsideAttributes(true);
    props.setAllowMultiWordAttributes(true);
    props.setRecognizeUnicodeChars(true);
    props.setOmitComments(true);
    URL url = new URL(urlString);
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    connection.setDoOutput(true);
    // optional cookies
    connection.setRequestProperty(context.getString(R.string.cookie_prefix), cookies);
    connection.connect();
    // use the cleaner to "clean" the HTML and return it as a TagNode object
    TagNode root = cleaner.clean(new InputStreamReader(connection.getInputStream()));
    Object[] foundNodes = root.evaluateXPath(xPath);
    if (foundNodes.length > 0) {
        TagNode foundNode = (TagNode) foundNodes[0];
        snap = foundNode.getAttributeByName(attrToSnap);
    }
    return snap;
}
Just edit it for your needs. :-)
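For a single, un-nested element like the one in the question, a plain string search may already be enough and avoids regex differences between platforms. The sketch below assumes the opening marker appears in the page exactly as passed in (if the source really contains backslashes before the quotes, the marker has to include them). TextBetween and extract are hypothetical names.

```java
// Hypothetical helper for illustration only.
final class TextBetween {
    private TextBetween() {}

    /**
     * Returns the trimmed text between the first occurrence of open and the
     * next occurrence of close, or null if either marker is missing.
     */
    static String extract(String html, String open, String close) {
        int start = html.indexOf(open);
        if (start < 0) {
            return null;
        }
        start += open.length();
        int end = html.indexOf(close, start);
        if (end < 0) {
            return null;
        }
        return html.substring(start, end).trim();
    }
}
```

For example, extract(page, "<div id=\"swiki.2.1\">", "</div>") would return the text of that div as long as it contains no nested div; anything more structured is better handled by a parser such as the one shown above.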

JTextPane HTML Image integration

I have a JTextPane and I set it up like this:
HTMLEditorKit kit = new HTMLEditorKit();
HTMLDocument doc = new HTMLDocument();
this.setEditorKit(kit);
this.setDocument(doc);
Then I do:
profilePictureSrc = "http://ola/profilePicture1.jpg";
chatContent ="<img src=\"" + profilePictureSrc + "\">";
Where profilePictureSrc is a URL object.
It works, but I must use a String instead of the URL (the Java Hashtable put method slows down my application).
How can I do that? Do I have to put the picture files somewhere and use a relative path to reach them? Thank you very much for your ideas.
Best regards
You can convert url objects to strings.
String urlstring = myurl.toString();
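Putting the conversion together with the question's markup, a minimal sketch might look like this. It assumes the HTML is loaded into a fresh document through the editor kit; ImageHtml, imgTag and documentShowing are hypothetical names, and the image itself is only fetched later, when a text component renders the document.

```java
import java.io.IOException;
import java.io.StringReader;
import java.net.URL;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper for illustration only.
final class ImageHtml {
    private ImageHtml() {}

    /** Builds the img tag from a URL object by converting it to its String form. */
    static String imgTag(URL imageUrl) {
        return "<img src=\"" + imageUrl.toString() + "\">";
    }

    /** Creates an HTML document whose body contains just the image tag. */
    static HTMLDocument documentShowing(URL imageUrl) throws IOException, BadLocationException {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        String html = "<html><body>" + imgTag(imageUrl) + "</body></html>";
        kit.read(new StringReader(html), doc, 0);
        return doc;
    }
}
```

The returned document can be handed to the pane with setDocument, after setEditorKit has been called with an HTMLEditorKit as in the question.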
