Vaadin how to print dynamic raw html using PrintUI class? - java

I'm following the example code, which puts the HTML string into a Label. The HTML renders perfectly in the browser and spans multiple pages. However, when I do a Print Preview (or Print), the printout is limited to a single page and shows a vertical scrollbar.
How do I print multiple pages and remove the scrollbar?
My code in the PrintUI class is only:
setContent(new Label(template, ContentMode.HTML));

The answer can be found at: https://vaadin.com/forum/#!/thread/3869543/3869542
You basically need to resort to pure html. The following code does this and fixes the issue:
private void setSizeUndefined2Print()
{
com.vaadin.ui.JavaScript.getCurrent().execute("document.body.style.overflow = \"auto\";" +
"document.body.style.height = \"auto\"");
UI.getCurrent().setSizeUndefined();
this.setSizeUndefined();
}
You can find more details in the above link.

Related

Why is my Jsoup Code not Returning the Correct Elements?

I am working on an app in Android Studio and am having some trouble web-scraping with JSoup. I have successfully connected to the webpage and returned some basic elements to test the library, but now I cannot actually get the elements I need for my app.
I am trying to get a number of elements with the "data-at" attribute. The weird thing is, a few elements with the "data-at" attribute are returned, but not the ones I am looking for. For whatever reason my code is not extracting all of the elements that share the "data-at" attribute on the web page.
This is the URL of the webpage I am scraping:
https://express.liatoyotaofcolonie.com/inventory?f=dealer.name%3ALia%20Toyota%20of%20Colonie&f=submodel%3ACamry&f=trim%3ALE&f=year%3A2020
The method containing the web-scraping code:
@Override
protected String doInBackground(Void... params) {
String title = "";
Document doc;
Log.d(TAG, queryString.toString());
try {
doc = Jsoup.connect(queryString.toString()).get();
Elements content = doc.select("[data-at]");
for (Element e: content) {
Log.d(TAG, e.text());
}
} catch (IOException e) {
Log.e(TAG, e.toString());
}
return title;
}
The results in Logcat
The element I want to retrieve
One of the elements that is actually being retrieved
This is because some of the content, including the element you are looking for, is created asynchronously by JavaScript and is not present in the initial DOM.
When you view the source of the page you will notice that there are only 17 data-at occurrences, while running document.querySelectorAll("[data-at]") in the browser returns 29 nodes.
What you get in Jsoup is the static content of the page (the initial DOM). You won't be able to fetch dynamically created content because Jsoup does not run the required JS scripts.
To overcome this, you will have to either fetch and parse the required resources manually (e.g. trace which AJAX calls the browser makes) or use a headless browser setup. Selenium + headless Chrome should be enough.
The latter option will allow you to scrape ANY possible web application, including SPAs, which is not possible using plain Jsoup.
I don't quite know what to do about this, but I'm going to try one more time... The "Problematic Lines" in your code are these:
doc = Jsoup.connect(queryString.toString()).get();
Elements content = doc.select("[data-at]");
It is the queryString that you have requested - the URL points to a page that contains quite a bit of script code. When you load up a browser and click the button (or menu-option) that reads: "View Source", the HTML you see is not the same exact HTML that is broadcast to and received by JSoup.
If the HTML that is broadcast contains any <SCRIPT TYPE="text/javascript"> ... </SCRIPT> in it (and the named URL in your question does), AND those <SCRIPT> tags are involved in the initial loading of the page, then JSoup will not know anything about it... It only parses what it receives, it cannot process any dynamic content.
There are four ways that I know of to get the "Post Script Loaded" version of the HTML from a dynamic web-page, and I will type them here, now. The first is likely the most popular method (in Java) that I have heard about on Stack Overflow:
Selenium: This Answer will show how the tool can run JavaScript. These are some Selenium Docs. And then this page right here has a great "first class" for using the tool to retrieve post-script processed HTML. Again, there is no way JSoup can retrieve HTML that is generated in the browser by script (JS/AJAX/Angular/React), since it is just a parser.
Puppeteer: This requires Node.js, which is a JavaScript runtime rather than a separate language. Perhaps calling a simple Node.js program from Java could work, but it would be a "Two Language" solution. I've never used it. Here is an answer that shows getting, sort of, what you are trying to get... The HTML after the script.
WebView: Android Java programmers have a popular class called "WebView" (documented here) that I was told about only recently (it has been out for years). It will execute script in a browser and return the HTML. Here is an answer that shows "JavaScript Injection" to retrieve DOM Tree elements from a "WebView" instance (which is how I was told it was done).
Splash: My favorite tool, which I don't think many people have heard of, but which has been the simplest for me. There is an API called the "Splash API". Here is their explanation for a "Java-Script Rendering Service." Since this is the one I have been using, I'll post a code snippet below that shows how Splash can retrieve post-script processed HTML.
To run the Splash API (only if you have access to the docker loading program) ... You start a Splash Server as below. These two lines are typed into a GCP (Google Cloud Platform) Shell instance, and the server starts right up without any configurations:
Pull the image:
$ sudo docker pull scrapinghub/splash
Start the container:
$ sudo docker run -it -p 8050:8050 --rm scrapinghub/splash
In your code, just prepend this String to your URLs:
"http://localhost:8050/render.html?url="
So in your code, you would use the following command (instead), and the script would (more likely) load all the HTML Elements that you are not finding:
String SPLASH_URL = "http://localhost:8050/render.html?url=";
doc = Jsoup.connect(SPLASH_URL + queryString.toString()).get();
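One caveat with plain concatenation: the inventory URL in the question carries its own query string (several &f=... parameters), and those may be read as parameters of the Splash request rather than as part of the target URL. A minimal sketch of percent-encoding the target first, assuming the same local Splash endpoint as above (the class and method names are made up for illustration):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class SplashUrls {
    // Local Splash endpoint taken from the answer above; adjust host and port as needed.
    static final String SPLASH_RENDER = "http://localhost:8050/render.html?url=";

    /** Wraps a target page URL so that it travels as a single query value. */
    static String forTarget(String targetUrl) {
        // Percent-encode the target so that its own '?', '&' and '%' characters
        // are not interpreted as part of the outer request's query string.
        return SPLASH_RENDER + URLEncoder.encode(targetUrl, StandardCharsets.UTF_8);
    }
}
```

The call would then read doc = Jsoup.connect(SplashUrls.forTarget(queryString.toString())).get(); with everything else unchanged.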

JasperReports export to Excel uses only last set background color

I'm pretty new to Dynamic-Jasper, but for work I had to add a new feature to our already implemented solution.
My Problem
The goal is to add a column to a report that consists only of a background color based on some information. I managed to do that, but while testing I stumbled upon a problem. While all my columns in the HTML and PDF views had the right color, the Excel export colored all the fields in the last color that was set.
While debugging I noticed that fields with the same color had the same template ID. However, although all views run through mostly the same code, the Excel export behaved differently and had the same ID in all fields.
My Code where I manipulate the template
for(JRPrintElement elemt : jasperPrint.getPages().get(0).getElements()) {
if(elemt instanceof JRTemplatePrintText) {
JRTemplatePrintText text = (JRTemplatePrintText) elemt;
(...)
if (text.getFullText().startsWith("COLOR_IDENTIFIER")) {
String marker = text.getFullText().substring(text.getFullText().indexOf('#') + 1);
text.setText("ID = " + ((JRTemplatePrintText) elemt).getTemplate().getId());
int rgb = TypeConverter.string2int(Integer.parseInt(marker, 16) + "", 0);
((JRTemplatePrintText) elemt).getTemplate().setBackcolor(new Color(rgb));
}
}
}
The html view
The Excel view
Temporary Conclusion
Fields with the same style share the same objects in the background, and the JR Excel export seems to mess something up by assigning the same object to all the fields that I manipulated there. If anyone knows of a mistake on my part, or of a different approach that achieves the same result, please let me know.
Something different I tried earlier was setting the field in an evaluate method that is called by Jasper. In that method we assign the text value of each field. It received a map with JRFillFields, but unfortunately the map implementation denied access to them and just returned their values. The map was provided by DJ and couldn't be swapped for a different one.
Edit
We are using JasperReports 6.7.1
I found a solution: I replaced each template with a new one that is supposed to look exactly alike. That way every field is guaranteed to have its own ID, and it's not left to chance how JasperReports handles its data internally.
JRTemplateElement custom =
new JRTemplateText(((JRTemplatePrintText) elemt).getTemplate().getOrigin(),
((JRTemplatePrintText) elemt).getTemplate().getDefaultStyleProvider());
custom.setBackcolor(new Color(rgb));
custom.setStyle(((JRTemplatePrintText) elemt).getTemplate().getStyle());
((JRTemplatePrintText) elemt).setTemplate(custom);
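The underlying effect can be reproduced without JasperReports. The sketch below uses two hypothetical stand-in classes (Template and Element are not JasperReports types) to show why changing a color on a shared template overwrites earlier colors, and why giving each element its own copy avoids that. Whether the Excel exporter shares templates in exactly this way is an assumption based on the observation in the question.

```java
import java.awt.Color;
import java.util.List;

final class SharedTemplateDemo {
    /** Hypothetical stand-in for a template object that holds style for many elements. */
    static final class Template {
        Color backcolor = Color.WHITE;

        Template copy() {
            Template t = new Template();
            t.backcolor = backcolor;
            return t;
        }
    }

    /** Hypothetical stand-in for a print element that points at a template. */
    static final class Element {
        Template template;

        Element(Template template) {
            this.template = template;
        }
    }

    /** Mutates whichever template each element points at; a shared template keeps only the last color. */
    static void colorInPlace(List<Element> elements, List<Color> colors) {
        for (int i = 0; i < elements.size(); i++) {
            elements.get(i).template.backcolor = colors.get(i);
        }
    }

    /** Gives each element a private copy first, so a color cannot leak to other elements. */
    static void colorWithOwnTemplate(List<Element> elements, List<Color> colors) {
        for (int i = 0; i < elements.size(); i++) {
            Template own = elements.get(i).template.copy();
            own.backcolor = colors.get(i);
            elements.get(i).template = own;
        }
    }
}
```

With two elements that start on the same template, colorInPlace leaves both with the second color, while colorWithOwnTemplate keeps one color per element, which mirrors the fix above.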

How can I use the Wikipedia API to extract/parse the link I am looking for?

On Wikipedia, 95% of the links lead to the Philosophy page. I am trying to write a program in Java that takes any Wikipedia link and follows the first link on that page (one that is not a citation, sound, or other extraneous link, and that is not inside parentheses).
For e.g if you start with this url http://en.wikipedia.org/wiki/Dutch_people, it should click Ethnic Group http://en.wikipedia.org/wiki/Ethnic_group and so on until it reaches Philosophy
You should see this Getting_to_Philosophy
Check http://xefer.com/wikipedia (type any word) to see how it works .
I already wrote the back end that stores the data in database in 3 columns
Unique_URL_Id URL_Link Next_URL_Id so later on printing the whole path will be easier.
The backend works fine(if I give it just a list of links to follow). However extracting and finding the first link is something not working as it should work.
Here is sample code I wrote just for extracting from a URL using the jsoup API:
public static void extractWikiPage(String title) throws IOException{
Document doc = Jsoup.connect("http://en.wikipedia.org/wiki/Europe").get();
//int titles = doc.toString().indexOf("(");
//Get the first paragraph where the main body contents starts
String body = doc.getElementsByTag("p").first().toString();
System.out.println(body);
Document doc2= Jsoup.parse(body);
Elements href=doc2.getElementsByTag("a");
int x="".indexOf("");
for(Element h: href){
System.out.println(h.toString());
}
//System.out.println(linkText);
System.exit(1);
}
I am just finding the first occurrence of '<p>' since that's where 95% of the links to the next page start. In that paragraph I am trying to get all the links, but I need the first one that satisfies the conditions I described above.
How can I use the Wikipedia API to extract the data I am looking for? I appreciate your help.
/w/api.php?action=query&prop=revisions&format=json&rvprop=content&rvlimit=1&rawcontinue=&titles=Dutch_people is the query that returns the wikitext for that page.
You'll have to parse that result to get the data you want back. You'll be looking for the first thing inside [[double square brackets]] (probably after /\{\{Infobox(.*?)\}\}/i or something like that, to exclude links in the infobox and any maintenance tags that might be on the page) that doesn't start with "something:", which eliminates all interwiki links, categories, and file/media pages.
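A rough sketch of that parsing step in plain Java, assuming the wikitext has already been pulled out of the JSON response. Instead of matching only the infobox, it strips every {{...}} template, which is a broader simplification. It also skips any target containing a colon, so a few legitimate article titles are skipped too, and it does not handle the parenthesized-link rule from the question. The class and method names are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FirstWikiLink {
    // An innermost {{...}} template, i.e. one with no braces inside it.
    private static final Pattern TEMPLATE = Pattern.compile("\\{\\{[^{}]*\\}\\}");
    // [[Target]] or [[Target|label]]; group 1 is the target title.
    private static final Pattern LINK =
            Pattern.compile("\\[\\[([^\\]|]+)(?:\\|[^\\]]*)?\\]\\]");

    /** Returns the first plain article link in the wikitext, or null if there is none. */
    static String find(String wikitext) {
        // Remove templates from the inside out until nothing changes,
        // so nested infoboxes and maintenance tags disappear as well.
        String text = wikitext;
        String previous;
        do {
            previous = text;
            text = TEMPLATE.matcher(text).replaceAll("");
        } while (!text.equals(previous));

        Matcher m = LINK.matcher(text);
        while (m.find()) {
            String target = m.group(1).trim();
            // A colon usually marks a namespace or interwiki prefix
            // (File:, Category:, fr:, ...), so skip those targets.
            if (!target.contains(":")) {
                return target;
            }
        }
        return null;
    }
}
```

The returned title can then be fed back into the same API query as the next titles= value until the walk reaches Philosophy.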

Bridge between the Java applet and the text input controls on the web page

I have been working with a Java applet that helps the user write using only a mouse. I am trying to incorporate it into my website project as follows:
When the user clicks on any input element (textbox/textarea) on the page, this Java applet loads on the web page itself. In the screenshot of the Java applet seen below, the user points to a letter and the corresponding text gets written in the text box of the applet.
Now what I am trying to do is to get this text from the text box of the applet into the input element on the web page. I know that this needs interaction between Java and JavaScript, but not being a pro, I really do not know how to do it. Here's the Java applet and the code I have written.
Java applet and jQuery code (298kB): http://bit.ly/jItN9m
Could somebody please help me extend this code?
Thanks a lot!
Update
I searched somewhere and found this -> To get the text inside of Java text box, a getter method in the Applet to retrieve the text:
public class MyApplet extends JApplet {
// ...
public String getTextBoxText() { return myTextBox.getText(); }
}
In the JQuery code, the following lines are to be added I think:
var textBoxText = $("#applet-id")[0].getTextBoxText();
//Now do something with the text
For the code of the applet, I saw a GNOME git page here. The getText call already exists -- look at the bottom of this file: http://git.gnome.org/browse/dasher/tree/java/dasher/applet/JDasherApplet.java
I'd need to call 'getCurrentEditBoxText' but when should this method 'getCurrentEditBoxText' be called?
In my case, I would probably have to do it when the user clicks in a new input control etc.
You can have full communication between your Applet and any javascript method on the page. Kyle has a good post demonstrating how the Javascript can call the applet and request the text value. However, I presume you want the HTML Textfield to update with each mouse click, meaning the applet needs to communicate with the page. I would modify your javascript to something like this:
var activeTextArea = null;
$('textarea, input').click(function() {
$(this).dasher();
activeTextArea = this;
});
function updateText(text) {
// Careful: I think textarea and input have different
// methods for setting the value. Check the
// jQuery documentation
$(activeTextArea).val(text);
}
Assuming you have the source for the applet, you can have it communicate with the above javascript function. Add this import:
import netscape.javascript.JSObject;
And then, in whatever onClick handler you have for the mouse clicks, add:
// After the Applet Text has been updated
JSObject win = null;
try {
win = (JSObject) JSObject.getWindow(Applet.this);
win.call("updateText", new Object[] { textBox.getText() });
} catch (Exception ex) {
// oops
}
That will update the text each time that chunk of code is called. If you do NOT have access to the applet source, things get trickier. You'd need to set some manner of javascript timeout that constantly reads the value from the applet, but this assumes the applet has such a method that returns the value of the textbox.
See Also: http://java.sun.com/products/plugin/1.3/docs/jsobject.html
Update Modifying the applet is your best shot since that is where any event would be triggered. For example, if you want the HTML TextField to change on every click, the click happens in the applet which would need to be modified to trigger the update, as described above. Without modifying the applet, I see two options. Option #1 uses a timer:
var timer;
var activeTextArea;
$('textarea, input').click(function() {
$(this).dasher();
activeTextArea = this;
updateText();
});
function updateText() {
// Same warnings about textarea vs. input
$(activeTextArea).val($('#appletId')[0].getCurrentEditBoxText());
timer = setTimeout("updateText()", 50);
}
function stopUpdating() {
clearTimeout(timer);
}
This is similar to the code above except clicking on a text area triggers the looping function updateText() which will set the value of the HTML text field to the value of the Applet text field every 50ms. This will potentially introduce a minor delay between click and update, but it'll be small. You can increase the timer frequency, but that will add a performance drain. I don't see where you've 'hidden' the applet, but that same function should call stopUpdating so that we are no longer trying to contact a hidden applet.
Option #2 (not coded)
This would be to try to capture the click on the applet as it bubbles through the HTML DOM. Then you could skip the timer and put a click() behavior on the applet container to do the same update. I'm not sure whether such events bubble, though, so I'm not sure this would work. Even if it did, I'm not sure how compatible it would be across browsers.
Option #3
Third option is to not update the HTML text field on every click. This would simply be a combination of Kyle's and my posts above to set the value of the text field whenever you 'finish' with the applet.
Here's a possible solution. To get the text inside of your Java text box, write a getter method in the Applet to retrieve the text:
public class MyApplet extends JApplet {
// ...
public String getTextBoxText() { return myTextBox.getText(); }
}
In your JQuery code, add the following lines:
var textBoxText = $("#applet-id")[0].getTextBoxText();
//Now do something with the text
I found most of what I posted above here. Hope this helps.
This page explains how to manipulate the DOM from a Java applet. To find the input element, simply call the document.getElementById(id) function, where id is the value of the id attribute of the text input box.

Calling Java method from HTML link

I'm currently building a Twitter client in Java using the Twitter4J API. To create a Twitter "timeline", I am currently pulling data from Twitter such as profile images, tweets and usernames, then displaying them in a JTextPane, formatted using HTML. Code example below:
StringBuilder out = new StringBuilder();
try {
List<Status> statuses = HandleEvents.instance().twitter.getHomeTimeline();
out.append("<html>");
for (Status status : statuses)
{
out.append("<img src=\"").append(status.getUser().getProfileImageURL())
.append("\" width=30 height=30><b>").append(status.getUser().getName())
.append(":</b> ").append(status.getText())
.append("<br><br>");
}
out.append("</html>");
tweetsTextPane.setText(out.toString());
This displays a timeline of 20 tweets, separated by two line breaks. Under each tweet, I would like to place a simple hyperlink, called "Retweet", which calls one of my Java methods - HandleEvents.instance().twitter.retweetStatus(status.getId())
How would I go about doing this? Can the call be made directly between the tags, or do I have to make the call using JavaScript?
Any help would be appreciated. Many thanks.
You don't really need a hyperlink, do you? Since it's a Swing app you could just add a JLabel that only looks like a hyperlink (and with a little effort it could behave like one as well). Add a listener for mouse clicks on that JLabel and you can hook your current handler there.
On the other hand, if you do want actual HTML links, what you can do is implement your own HyperlinkListener.
Here are a couple of examples:
http://www.java2s.com/Tutorial/Java/0240__Swing/HyperlinkListenerExample.htm
http://www.java2s.com/Code/Java/Event/DemonstratingtheHyperlinkListener.htm
http://www.devx.com/tips/Tip/12997
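For the HyperlinkListener route, a minimal sketch could look like the following. It assumes each tweet's HTML gets an anchor whose href uses a made-up "retweet:" pseudo-scheme carrying the status id. RetweetLinks, anchorFor and the scheme name are illustrative and not part of Swing or Twitter4J.

```java
import java.util.function.LongConsumer;
import javax.swing.event.HyperlinkEvent;
import javax.swing.event.HyperlinkListener;

final class RetweetLinks {
    // Hypothetical pseudo-scheme, used only inside the generated HTML.
    static final String PREFIX = "retweet:";

    /** Builds the anchor to append under a tweet. */
    static String anchorFor(long statusId) {
        return "<a href=\"" + PREFIX + statusId + "\">Retweet</a>";
    }

    /** Returns a listener that hands the status id of a clicked retweet link to the callback. */
    static HyperlinkListener listener(LongConsumer onRetweet) {
        return event -> {
            // Ignore ENTERED and EXITED events (mouse hover); react to clicks only.
            if (event.getEventType() != HyperlinkEvent.EventType.ACTIVATED) {
                return;
            }
            // For a non-standard scheme getURL() is typically null,
            // so read the raw href text instead.
            String href = event.getDescription();
            if (href != null && href.startsWith(PREFIX)) {
                onRetweet.accept(Long.parseLong(href.substring(PREFIX.length())));
            }
        };
    }
}
```

In the timeline loop you would append RetweetLinks.anchorFor(status.getId()) after each tweet, and register the listener once with tweetsTextPane.addHyperlinkListener(...), where the callback calls retweetStatus(id) and handles any exception it throws. The pane needs the text/html content type and must be non-editable, otherwise link clicks are not reported.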
