How to get text from specific href with jsoup? - java

I fetch text from http://m.wol.jw.org/en/wol/dt/r1/lp-e/2014/6/26 with jsoup in my Android app.
The code looks like this:
public static void refreshFromNetwork(Context context) {
    Document document;
    Elements dateElement;
    Elements textElement;
    Elements commentElement;
    try {
        Calendar calendar = Calendar.getInstance();
        int year = calendar.get(Calendar.YEAR);
        int month = calendar.get(Calendar.MONTH) + 1;
        int day = calendar.get(Calendar.DAY_OF_MONTH);
        sDayURL = sURL + "/" + year + "/" + month + "/" + day;
        document = Jsoup.connect(sDayURL).get();
        if (document.hasText()) {
            dateElement = document.select(".ss");
            textElement = document.select(".sa");
            commentElement = document.select(".sb");
            sDate = dateElement.text();
            sText = textElement.text();
            sComment = commentElement.html();
            sSavedForCheckingDate = sLocalDate;
            savePrefs(context);
            sDayURL = null;
        } else {
            Toast.makeText(mContext,
                    mContext.getString(R.string.warning_unstable_connection),
                    Toast.LENGTH_SHORT).show();
        }
    } catch (IOException e) {
        System.out.println("error");
        e.printStackTrace();
    }
}
But the text contains some links (hrefs). When the cursor hovers over one of them, a popup frame with text appears.
I can't post images, so see it here: http://habrastorage.org/files/45e/b09/17f/45eb0917f3644bbd9e5ea2b79d98363d.png
But when I try to get the text behind such a link (I take the link from sComment, which holds the HTML), I get the whole text of the target page (what is displayed when I click the link), not just the part shown in the popup. I'm not a web developer, so I don't understand how to get only the desired text. How can I do it?

To get only the text shown in the popup, follow the link itself.
When you click the link, a new page opens that contains the same text, and the part shown in the popup is highlighted there in a red font. That highlighted paragraph is the text you need, so fetch the linked page and read only that element:
String href = commentElement.select("a[href]").first().attr("abs:href");
Document doc = Jsoup.connect(href).get();
Element element = doc.getElementById("p101");
String dialogText = element.text();
Here commentElement is the Elements object from the question (sComment itself is a String and has no attr method), and p101 is the id of the highlighted paragraph in this example.
I hope this helps.

Use sComment = commentElement.text(); instead.

Related

Why does formatting a JList cell using HTML lead to the text being centered in the cell?

I have a JList whose cell text I format with HTML before it is added to the list. I'm doing this because I don't want to write a complex cell renderer. I only need two separate lines in each JList cell, so HTML seemed quicker and easier for such a simple requirement. The formatting is applied correctly, but the text does not start at the edge of the button. I assume this is because the HTML takes up space as whitespace, in which case I can't fix it?
public static void receiveDataEmailList(String data) {
    Scanner scLine = new Scanner(data).useDelimiter("&");
    int num = scLine.nextInt();
    String[] emails = new String[num];
    for (int i = 0; i < num; i++) {
        emails[i] = "" + "<html><ul style=\"list-style-type:none;\"><li style=\"font-size:10px\">" + scLine.next() + "</li><li style=\"font-size:8px\">" + "Subject: " + "Hello" + "</li></ul></html>";
    }
    EmailList.setListOfEmails(emails);
}

How can I set link inside a text in Android?

I am using Jsoup for web scraping. I can scrape the data from the web, but I am getting the links and the text separately. I want the links to be embedded in the text. I am using SpannableStringBuilder, and there are a lot of links and a lot of text, so I can't figure out how to handle this, as I am new to Android development.
private void getWebsite() {
    new Thread(new Runnable() {
        @Override
        public void run() {
            final SpannableStringBuilder builder = new SpannableStringBuilder();
            try {
                Document doc = Jsoup.connect("https://www.wikipedia.org/").get();
                String title = doc.title();
                Elements links = doc.select("a[href]");
                builder.append(title).append("\n");
                for (Element link : links) {
                    final String url = link.attr("href");
                    builder.append("\n")
                            .append("Link: ")
                            .append(url, new URLSpan(url),
                                    Spannable.SPAN_EXCLUSIVE_EXCLUSIVE)
                            .append("\n")
                            .append("Text: ")
                            .append(link.text());
                }
            } catch (IOException e) {
                builder.append("Error : ")
                        .append(e.getMessage()).append("\n");
            }
            runOnUiThread(new Runnable() {
                @Override
                public void run() {
                    textView.setText(builder.toString());
                    textView.setMovementMethod(LinkMovementMethod.getInstance());
                }
            });
        }
    }).start();
}
I am getting output in this format:
Link : //en.wikipedia.org/
Text : English 5 678 000+ articles
Link : //ja.wikipedia.org/
Text : 日本語 1 112 000+ 記事
Link : //es.wikipedia.org/
Text : Español 1 430 000+ artículos
......
......
I want output in this format instead:
** Text: English 5 678 000+ articles**
with this link
** Link: //en.wikipedia.org/**
attached to that line as a hyperlink, so that I can click the text and go directly to the web page, as in MS Word.
You are looking for a way to set text values using HTML (see the Android documentation for Html.fromHtml). Here is some sample code:
String str = "Do you want to search on " + "<a href=\"http://www.google.com\">" +
        "Google" + "</a>" + " or " + "<a href=\"http://www.yahoo.com\">" +
        "Yahoo" + "</a>" + "?";
if (Build.VERSION.SDK_INT >= 24) {
    viewToSet.setText(Html.fromHtml(str, Html.FROM_HTML_MODE_LEGACY));
} else {
    viewToSet.setText(Html.fromHtml(str));
}
This way you set the text using HTML. You can also change colors, bold, italics, and so on, as long as you use HTML markup for it.
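Applied to the scraped links from the question, the same idea could be sketched as follows: build one anchor tag per link from its text and its href, then hand the joined string to Html.fromHtml. The names LinkHtml, escape, toAnchor and toHtml are hypothetical, and prefixing protocol-relative URLs such as //en.wikipedia.org/ with https: is an assumption made for illustration.

```java
import java.util.Map;

final class LinkHtml {
    // Escape characters that would otherwise break the generated markup.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Wrap the visible text in an anchor that points at the URL.
    // Assumption: protocol-relative URLs (//host/...) should use https.
    static String toAnchor(String text, String url) {
        String absolute = url.startsWith("//") ? "https:" + url : url;
        return "<a href=\"" + escape(absolute) + "\">" + escape(text) + "</a>";
    }

    // Join all links into one HTML string, one link per line.
    static String toHtml(Map<String, String> textToUrl) {
        StringBuilder html = new StringBuilder();
        for (Map.Entry<String, String> e : textToUrl.entrySet()) {
            html.append(toAnchor(e.getKey(), e.getValue())).append("<br>");
        }
        return html.toString();
    }
}
```

On Android, the resulting string would then go through textView.setText(Html.fromHtml(html, Html.FROM_HTML_MODE_LEGACY)), followed by textView.setMovementMethod(LinkMovementMethod.getInstance()) so that the links respond to taps.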

xPath not finding selector when using for-each loop variable, but works otherwise

I'm using Selenium to loop through an ArrayList of Strings in order to use each string in an xPath expression in order to select its appropriate checkbox on a website.
The problem is, when I use the for loop, the variable containing the string doesn't seem to create a valid xPath, yet when I simply substitute the string in myself it works fine.
For example, here is my ArrayList declaration with some values added.
ArrayList<String> fieldList = new ArrayList<String>();
fieldList.add("Street");
fieldList.add("City");
fieldList.add("Country");
If I then use the following code, it goes into the catch block
WebDriverWait waitForElement = new WebDriverWait(driver, 1);
for (String cField : fieldList) {
    try {
        waitForElement.until(ExpectedConditions.elementToBeClickable(By.xpath("//td[following-sibling::td[2] = " + cField + "]/input")));
        WebElement checkBox = driver.findElement(By.xpath("//td[following-sibling::td[2] = " + cField + "]/input"));
        checkBox.click();
    } catch (Exception error) {
        System.out.println("Couldn't find " + cField);
    }
}
Telling me it couldn't find "Street" for example.
Yet when my try block contains the following, with the value explicitly stated, it works:
waitForElement.until(ExpectedConditions.elementToBeClickable(By.xpath("//td[following-sibling::td[2] = 'Street']/input")));
WebElement checkBox = driver.findElement(By.xpath("//td[following-sibling::td[2] = 'Street']/input"));
What am I doing wrong? Thanks a lot.
You are forgetting to quote your strings in the XPath expression. Without quotes, XPath reads Street as an element name rather than a string. Add single quotes around cField:
// single quotes added immediately before and after cField
waitForElement.until(ExpectedConditions.elementToBeClickable(
        By.xpath("//td[following-sibling::td[2] = '" + cField + "']/input")));
WebElement checkBox =
        driver.findElement(By.xpath("//td[following-sibling::td[2] = '" + cField + "']/input"));
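Plain single quotes break again as soon as a field name itself contains a quote. A minimal sketch of a quoting helper, assuming XPath 1.0 (which has no escape character inside string literals), might look like this. XPathQuote, literal and checkboxXPath are hypothetical names, not part of Selenium.

```java
final class XPathQuote {
    // Turn an arbitrary Java string into an XPath 1.0 string literal.
    static String literal(String value) {
        if (!value.contains("'")) {
            return "'" + value + "'";
        }
        if (!value.contains("\"")) {
            return "\"" + value + "\"";
        }
        // Both quote kinds occur: split on single quotes and glue the pieces with concat().
        StringBuilder sb = new StringBuilder("concat(");
        String[] parts = value.split("'", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(", \"'\", ");
            }
            sb.append("'").append(parts[i]).append("'");
        }
        return sb.append(")").toString();
    }

    // Build the checkbox locator from the question for one field name.
    static String checkboxXPath(String field) {
        return "//td[following-sibling::td[2] = " + literal(field) + "]/input";
    }
}
```

Inside the loop this would be used as driver.findElement(By.xpath(XPathQuote.checkboxXPath(cField))).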

jTextPane color with some exception in chat

I am using a JTextPane to show sender and receiver chat messages in different colors. Everything works, except that javax.swing.text.DefaultStyledDocument@123456 appears with every chat message.
Here Jhon is the receiver and Peter is the sender.
Here Peter is the receiver and Jhon is the sender.
Maybe I am making a mistake in the code.
Here is the code for Sender
DateFormat dateFormat = new SimpleDateFormat("HH:mm:ss\n\t dd/MM/yyyy ");
Date date = new Date();
StyledDocument doc = GUI.jTextPane1.getStyledDocument();
Style style = GUI.jTextPane1.addStyle("a style", null);
StyleConstants.setForeground(style, Color.red);
try {
    doc.insertString(doc.getLength(), "\t " + "You" + " : " + GUI.getSendMessage() + ("\n \t " + dateFormat.format(date)), style);
    GUI.appendReceivedMessages("" + doc);
} catch (BadLocationException e) {}
Here is the code for Receiver
DateFormat dateFormate = new SimpleDateFormat("HH:mm:ss\ndd/MM/yyyy ");
Date datee = new Date();
StyledDocument doc1 = GUI.jTextPane1.getStyledDocument();
Style styler = GUI.jTextPane1.addStyle("a style", null);
StyleConstants.setForeground(styler, Color.blue);
try {
    doc1.insertString(doc1.getLength(), "recevier", styler);
    GUI.appendReceivedMessages(fromHeader.getAddress().getDisplayName() + " : "
            + new String(request.getRawContent()) + ("\n" + dateFormate.format(datee)));
} catch (BadLocationException e) {}
Here is the main GUI method that receives these messages:
public void appendReceivedMessages(String s) {
    try {
        Document doce = jTextPane1.getDocument();
        doce.insertString(doce.getLength(), s + "\n", null);
    } catch (BadLocationException exc) {
    }
}
This seems obvious, so I'm not sure it qualifies as an answer. Anyway:
Why are you calling GUI.appendReceivedMessages(""+doc);? That call is what makes the doc object's default toString() output appear. Hope that helps.
EDIT, in reply to "so what can I do here":
I guess you can do it like this.
Note that StyledDocument's insertString updates the view, meaning it already gives you the output you need in the JTextPane. So:
doc.insertString(doc.getLength(), "\t " + "You" + " : " + GUI.getSendMessage() +("\n \t "+dateFormat.format(date)) ,style);
is sufficient to bring the output onto the text pane. Remove the call to GUI.appendReceivedMessages(""+doc);
I believe your aim is to display the message text in the text pane component, jTextPane1. For that you only need to update jTextPane1's document; you do not need to update anything else. If you need to pass the text around, get it from that object and hand it to the methods that expect the value, for example:
String text = jTextPane1.getDocument()
.getText(0, jTextPane1.getDocument()
.getLength());
aMethodThatExpectsAString(text);
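The difference between string-concatenating a document and reading its text can be sketched without any visible GUI. The class ChatDocDemo and its methods are hypothetical names; only the javax.swing.text calls are real API. The helper wraps BadLocationException because inserting at doc.getLength() is always a valid position.

```java
import java.awt.Color;
import javax.swing.text.BadLocationException;
import javax.swing.text.Style;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

final class ChatDocDemo {
    // Append one colored line; a JTextPane showing this document updates by itself.
    static void appendColored(StyledDocument doc, String styleName, String line, Color color) {
        Style style = doc.addStyle(styleName, null);
        StyleConstants.setForeground(style, color);
        try {
            doc.insertString(doc.getLength(), line + "\n", style);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Read the plain text back; "" + doc would give the object's toString() instead.
    static String plainText(StyledDocument doc) {
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With the question's pane, the sender side would reduce to a single call such as ChatDocDemo.appendColored(GUI.jTextPane1.getStyledDocument(), "sender", "You : " + message, Color.red), with no extra append of the document object.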

Search Function in HTML

How can I search for text in an HTMLDocument and get back the start index and end index of the matched word or sentence, ignoring tags while searching?
Searching: stackoverflow
html: <p class="red">stack<b>overflow</b></p>
This should return index 15 and 31.
It should work like searching within a web page in a browser.
If you want to do that in Java, here is a rough example using Jsoup. Of course, you should work out the details so that the code handles any given HTML properly.
String html = "<html><head><title>First parse</title></head>"
        + "<body><p class=\"red\">stack<b>overflow</b></p></body></html>";
String search = "stackoverflow";
Document doc = Jsoup.parse(html);
Element p = doc.body().getElementsByTag("p").first();
String pPlainText = p.text(); // "stackoverflow"
// equals() compares literally; matches() would treat the plain text as a regular expression
if (search.equals(pPlainText)) {
    System.out.println("text found in html");
    String pElementString = doc.body().html(); // <p class="red">stack<b>overflow</b></p>
    String firstWord = p.ownText(); // "stack"
    String secondWord = p.children().first().ownText(); // "overflow"
    // search for the text in pElementString
    int start = pElementString.indexOf(firstWord); // 15
    int end = pElementString.lastIndexOf(secondWord) + secondWord.length(); // 31
    System.out.println(start + " >> " + end);
} else {
    System.out.println("cannot find searched text");
}
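A more general variant, sketched here with the standard library only, is to strip tags while remembering where each visible character came from, and then map the match back to positions in the markup. This is a simplification under stated assumptions: it treats everything between < and > as a tag and does not decode entities, comments or scripts. HtmlTextSearch and find are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class HtmlTextSearch {
    // Returns {start, end} offsets into html (end exclusive) of the first match of
    // query in the visible text, or null when there is no match.
    static int[] find(String html, String query) {
        if (query.isEmpty()) {
            return null;
        }
        StringBuilder visible = new StringBuilder();
        List<Integer> offsets = new ArrayList<>(); // offsets.get(i) = html index of visible char i
        boolean inTag = false;
        for (int i = 0; i < html.length(); i++) {
            char c = html.charAt(i);
            if (inTag) {
                if (c == '>') {
                    inTag = false;
                }
            } else if (c == '<') {
                inTag = true;
            } else {
                visible.append(c);
                offsets.add(i);
            }
        }
        int at = visible.indexOf(query);
        if (at < 0) {
            return null;
        }
        int start = offsets.get(at);
        int end = offsets.get(at + query.length() - 1) + 1;
        return new int[] {start, end};
    }
}
```

For the question's input, HtmlTextSearch.find("<p class=\"red\">stack<b>overflow</b></p>", "stackoverflow") yields 15 and 31, the two indices the question asks for.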
