A weird problem when parsing an HTML page using HTMLParser (Java)

I was parsing a web page using HTMLParser in Java and ran into a weird problem with the class HasAttributeFilter.
The element I want to parse in the page is <span style="vertical-align: middle;"></span>, so the expression should be HasAttributeFilter filter = new HasAttributeFilter("style", "vertical-align: middle;");, right? I used this expression, but it did not work, even though I am sure the node is in the page.
After that, I applied another expression, HasAttributeFilter filter = new HasAttributeFilter("class", "singlecolumnminwidth");, to the same page, where that node is also present, and this expression worked.
Has anyone run into this problem before?
Thanks in advance!
The page's link.

What do you get if you fetch the value of this attribute and print it to the screen?
Maybe you have to escape some characters, like the space or the minus? I think the space in between could be causing the problem.
Does vertical-align:middle; work?
Or test whether the minus is causing the error.
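A small sketch of why printing the attribute value helps, assuming (as I believe is the case, but please verify against your HTMLParser version) that HasAttributeFilter compares the value as an exact string. The class StyleValueCheck and its methods are hypothetical names made up for this illustration, and the "fromPage" value is an assumed variant, not the real page content. It only uses the standard library:

```java
import java.util.Locale;

class StyleValueCheck {
    // Hypothetical helper: removes all whitespace, lowercases,
    // and drops one trailing semicolon so that cosmetic differences are ignored.
    static String normalize(String style) {
        String s = style.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
        return s.endsWith(";") ? s.substring(0, s.length() - 1) : s;
    }

    // True when both style values are the same after normalization.
    static boolean sameStyle(String expected, String actual) {
        return normalize(expected).equals(normalize(actual));
    }

    public static void main(String[] args) {
        String expected = "vertical-align: middle;";
        String fromPage = "vertical-align:middle"; // assumed variant, for illustration only
        System.out.println(expected.equals(fromPage));     // exact comparison
        System.out.println(sameStyle(expected, fromPage)); // normalized comparison
    }
}
```

If the exact comparison fails but the normalized one succeeds for the value you print from the page, the filter string simply has to match the page's spelling character for character.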


How to get the translated text from Google Translate using Selenium?

I have tried the following code to get the Google Translate text using Selenium:
result = driver.findElement(By.xpath("//body/div[@id='result_box']/span"));
I also tried these:
1. result = driver.findElement(By.xpath(".//body/div[@id='result_box']/span"));
2. result = driver.findElement(By.xpath("./*div[@id='result_box']/span"));
3. result = driver.findElement(By.xpath(".//div[@id='result_box']/span"));
4. result = driver.findElement(By.xpath("//div[@id='result_box']/span"));
5. result = driver.findElement(By.xpath(".//body/div[@id='result_box']/span"));
6. result = driver.findElement(By.xpath("./*[@id='result_box']/span"));
But none of the above works. I then tried to get the text with:
result=driver.findElement(By.id("result_box")).findElement(By.tagName("span"));
translatedtext=result.getText();
This returns a result, but when I try to show it in a JTextArea I get '????' instead of the actual translated text.
I have also tried result.getAttribute("innerHTML"), but that also shows question marks (?????) in the JTextArea instead of the translated text.
How can I solve this problem?
The result box is a <span>, not a <div>:
result = driver.findElement(By.xpath(".//span[@id='result_box']/span"));
Or
result = driver.findElement(By.xpath(".//*[@id='result_box']/span"));
with a double slash.
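To see the difference between the div and span variants without a browser, here is a rough sketch that runs the same kind of expressions with the JDK's built-in XPath engine instead of Selenium. The markup in PAGE is a simplified stand-in I made up (the real page is certainly more complex), and XPathAttributeDemo and textOf are hypothetical names. Note that attributes are addressed with @id in XPath:

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathAttributeDemo {
    // Simplified stand-in for the page; the real markup is an assumption.
    static final String PAGE =
        "<html><body><span id=\"result_box\"><span>hola</span></span></body></html>";

    // Returns the text of the first matching node, or "" when nothing matches.
    static String textOf(String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(PAGE)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOf("//span[@id='result_box']/span")); // hola
        System.out.println(textOf("//*[@id='result_box']/span"));    // hola
        // The container is a span, so the div variant matches nothing.
        System.out.println(textOf("//div[@id='result_box']/span").isEmpty()); // true
    }
}
```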
You can also use a CSS selector like this:
result = driver.findElement(By.cssSelector("#result_box>span"));
People say that this is faster than XPath.
This worked for me. I used Python, though; you can try the Java equivalent of the find_element_by_id function:
driver.find_element_by_id("gt-res-dir-ctr").text

How do I add an anchor to an HREF in JSTL?

I was wondering how to add an anchor to an HREF in JSTL. I've tried googling it and am no further forward; I tried adding it as a PARAM, but to no avail. My current code looks like this:
My URL
But my code fails to validate and my server returns an Error 500. Could someone explain how to successfully add an anchor to my HREF, please?
I figured it out, but if there is a better way of doing it, please let me know ;)
I have this code in place, the dynamic bit is inside of a Mustache template;
<c:url value="/myURL#" var="myUrl"/>
My URL

Using Jsoup to select classes and ids

I am using this as an example:
http://www.shopping.com/digital-camera/products?CLT=SCH&KW=digital+camera
In the page linked above there is an element with a class:
<span class="numTotalResults">
Results 1 - 40 of 1500+
</span>
I got it using
Document query_result = Jsoup.connect("http://www.shopping.com")
.data("CLT", "digital camera")
.post();
but when I run
System.out.println(query_result.select(".numTotalResults"));
System.out.println(query_result.select("#quickLookItem-1"));
System.out.println(query_result.select("[name=D0]"));
nothing is printed,
while
System.out.println(query_result);
System.out.println(query_result.select("span"));
clearly prints out the values.
The selector seems to work only with div, span and anchor tags, but I can't select by class or id.
Can someone help me?
Thanks
Edit:
It seems like the post did not go through. I don't quite understand why it didn't.
Instead of a POST request, try a GET:
Document query_result = Jsoup.connect("http://www.shopping.com/digital-camera/products?CLT=SCH&KW=digital+camera")
.get();
Take a look at how this search works. It doesn't use the POST method; it keeps all search parameters in the query string. After this small change your first select example will work.
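If the search keywords vary, the query string can be assembled before handing the URL to Jsoup.connect(...).get(). This is only a sketch using the standard library: SearchUrlBuilder and build are hypothetical names, and the base path and the CLT=SCH parameter are simply copied from the example URL in the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SearchUrlBuilder {
    // Base path and CLT=SCH are taken from the example URL in the question.
    static final String BASE = "http://www.shopping.com/digital-camera/products";

    // Builds the GET URL; URLEncoder turns the space into '+'.
    static String build(String keywords) {
        try {
            return BASE + "?CLT=SCH&KW=" + URLEncoder.encode(keywords, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("digital camera"));
    }
}
```

The resulting string is the same URL used in the GET example, so it can be passed to Jsoup.connect as-is.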

How to append Java code in JSP using jQuery

I am working with JSP pages and I need to .append() some Java code into a DIV element.
$("#myDiv").append("<% out.println("ali"); %>");
I assumed the previous code was wrong because of the quotes, so I escaped them.
$("#myDiv").append("<% out.println(\"ali\"); %>");
But I wasn't successful; nothing was appended to #myDiv.
This is the correct way to append. If nothing is appended, then the div might not exist in the page.
Try using Firebug and check the output of console.log($('#myDiv')).
Your code should work; it sounds like jQuery is not loaded correctly or there is another problem.
http://jsfiddle.net/5PTwN/1/
Try this:
$("#myDiv").append('<% out.print("ali"); %>');
The line below should also work, because the Java code executes on the server side and the scriptlet inside append is replaced with "ali".
$("#myDiv").append("<% out.print("ali"); %>");
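One detail worth keeping in mind: the scriptlet runs on the server, so the browser only ever sees the text it wrote. out.println also writes a line break, which then lands inside the JavaScript string literal and breaks it, whereas out.print does not. For values that may contain quotes or line breaks, a helper along these lines could escape them before they are printed into the script. This is a minimal sketch, not a complete escaper (it does not handle sequences such as a closing script tag), and JsStringLiteral is a hypothetical name, not part of JSP or jQuery:

```java
class JsStringLiteral {
    // Hypothetical helper: escapes the characters that would otherwise
    // terminate or break a quoted JavaScript string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\'': sb.append("\\'");  break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Roughly what println would write: the text plus a line break.
        String written = "ali\n";
        // The line break is emitted as the two characters \ and n, so the literal stays on one line.
        System.out.println("$(\"#myDiv\").append(\"" + escape(written) + "\");");
    }
}
```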

Getting Started With Android & JSOUP

I am currently trying to make an Android application and have come to the conclusion that I must use JSOUP to finish it. I am using JSOUP to extract data from the Internet and then display it in my app.
What I am trying to figure out is how to extract multiple bits of data from the URL and then show each of them in its own TextView defined in XML (if that is correct?).
Here is a snippet of the HTML I am trying to extract.
<a href="http://www.campusdish.com/en-US/CSMA/OldDominion/Locations/rda.aspx?RCN=m12296&MI=122&RN=BACON TURKEY SLICED" OnClick="javascript: NewWindow('http://www.campusdish.com/en-US/CSMA/OldDominion/Locations/rda.aspx?RCN=m12296&MI=122&RN=BACON TURKEY SLICED', 'RDA_window', 'width=450, height=600, scrollbars=no, toolbar=no, directories=no, status=no, menubar=no, copyhistory=no');return false" Class="recipeLink">BACON TURKEY SLICED</a>
I am trying to extract the words BACON TURKEY SLICED.
The problem is that I do not understand JSOUP at all. I have a rough idea of it, but I can't seem to use it in practice. I was wondering if someone could give me a push in the right direction.
Also, I have tried reading the cookbook, to no avail.
If anyone could help, thank you so much!
EDIT
Here are two more. I believe they are the exact same thing.
<a href="http://www.campusdish.com/en-US/CSMA/OldDominion/Locations/rda.aspx?RCN=m4903&MI=122&RN=STATION OMELET" OnClick="javascript: NewWindow('http://www.campusdish.com/en-US/CSMA/OldDominion/Locations/rda.aspx?RCN=m4903&MI=122&RN=STATION OMELET', 'RDA_window', 'width=450, height=600, scrollbars=no, toolbar=no, directories=no, status=no, menubar=no, copyhistory=no');return false" Class="recipeLink">STATION OMELET</a>
<a href="http://www.campusdish.com/en-US/CSMA/OldDominion/Locations/rda.aspx?RCN=m784&MI=122&RN=CEREAL HOT GRITS" OnClick="javascript: NewWindow('http://www.campusdish.com/en-US/CSMA/OldDominion/Locations/rda.aspx?RCN=m784&MI=122&RN=CEREAL HOT GRITS', 'RDA_window', 'width=450, height=600, scrollbars=no, toolbar=no, directories=no, status=no, menubar=no, copyhistory=no');return false" Class="recipeLink">CEREAL HOT GRITS</a>
So, this answer is going to assume that you are interested in:
<a href=".." >TEXT YOU WANT</a>
All these <a> tags have the class attribute "recipeLink".
Given your example, here as a String:
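String tastyTurkeySandwich = "<a href=\"http://www.campusdish.com/en-US/CSMA/OldDominion/Locations/rda.aspx?RCN=m12296&MI=122&RN=BACON TURKEY SLICED\" Class=\"recipeLink\">BACON TURKEY SLICED</a>";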
You can extract the (first) text with the following code:
Document doc = Jsoup.parse(tastyTurkeySandwich);
Elements links = doc.select("a[href].recipeLink");
// This will just print the text in the first one
System.out.println(links.first().text());
To iterate over an Elements (which implements the Iterable interface) instance:
for (Element link : links) {
    // Calling link.text() will return BACON TURKEY SLICED etc.
    System.out.println(link.text());
}
In short:
a[href] will match all the <a> tags that have an href attribute.
The .recipeLink part will filter that selection to only include links that have the recipeLink class.
