How to parse "text" from span class with Jsoup - java

I want to extract the text of a span that sits inside a div with a class, using Jsoup.
Here is the relevant portion of my HTML.
<html>
<head></head>
<body>
<div>
<div class = "abcd">
<span> This is text </span>
</div>
</div>
</body>
</html>
I wrote something like this:
Element element = doc.select("div.abcd > span");
System.out.println("Text = "+element.text());
This isn't working. Is there any other way to do this?

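Change "div.abcd > span"
to
"div.abcd span"
Also note that doc.select(...) returns Elements, not a single Element, so the assignment in the question will not compile as written. Use doc.select("div.abcd span").first() to get one Element.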

Related

Selenium - Find element using xpath or cssSelector

I need to find and click the element "Compute vmSwitch". I tried many approaches with xpath (class and contains) and with cssSelector, but was not able to locate the element:
driver.findElement(By.xpath("//span[contains(@class,'nopadding vm-create-text-style-3 block-with-text-4 ng-binding') and contains(text(), 'Compute vmSwitch')]")).click();
The HTML is given below:
<div class="w-full"><br>
<img class="img-responsive center-block m-t-47" src="/src/icon/background/create_vm_img5.png">
<div class="col-md-12 m-t-md wordwrap">
<p class="nopadding vm-create-text-style-3 block-with-text-4 ng-binding">
Compute vmSwitch</p>
</div>
Why are you searching for a span tag? The text is inside a p element.
If this is your HTML:
<html>
<head></head>
<body>
<div class="w-full">
<br>
<img class="img-responsive center-block m-t-47" src="/src/icon/background/create_vm_img5.png">
<div class="col-md-12 m-t-md wordwrap">
<p class="nopadding vm-create-text-style-3 block-with-text-4 ng-binding"> Compute vmSwitch</p>
</div>
</div>
</body>
</html>
you could try:
WebElement elem2 = driver.findElement(By.xpath("//div[@class='w-full']"));
elem2.findElement(By.xpath(".//p[text()=' Compute vmSwitch']")).click();
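An XPath expression like the one above can be checked without a browser. The sketch below is not Selenium code: it uses the JDK's built-in XPath engine on a well-formed copy of the markup (br and img are self-closed here so the XML parser accepts it). The class name XPathCheck and the helper findText are made up for this illustration. It uses normalize-space() so the match does not depend on the exact whitespace around the text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XPathCheck {
    // Well-formed copy of the markup from the answer (assumption: br and img self-closed).
    static final String MARKUP =
        "<html><head></head><body>"
        + "<div class=\"w-full\"><br/>"
        + "<img class=\"img-responsive center-block m-t-47\" src=\"/src/icon/background/create_vm_img5.png\"/>"
        + "<div class=\"col-md-12 m-t-md wordwrap\">"
        + "<p class=\"nopadding vm-create-text-style-3 block-with-text-4 ng-binding\"> Compute vmSwitch</p>"
        + "</div></div></body></html>";

    // Hypothetical helper: returns the text of the first node matching the expression, or null.
    static String findText(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(MARKUP)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node node = (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
            return node == null ? null : node.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Matches the p element regardless of leading or trailing whitespace in its text.
        System.out.println(findText(
            "//div[@class='w-full']//p[normalize-space(text())='Compute vmSwitch']"));
        // The span-based expression finds nothing, because the markup contains no span.
        System.out.println(findText("//span[contains(text(),'Compute vmSwitch')]"));
    }
}
```

If the first expression matches here, the same string can be passed to By.xpath(...) in Selenium, assuming the live page has the same structure.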

Replace HTML span + style tag to HTML 4 font tag

What is the fastest way to replace HTML5 span tags that carry a style attribute:
<html>
<body>
<p>
This is My text and <span style = "color:red; font-weight:bold">this is some text</span> even more
</p>
</body>
</html>
with an HTML4 font tag:
<html>
<body>
<p>
This is My text and <font color="red">this is some text</font> even more
</p>
</body>
</html>
ignoring all style properties except color?
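One possible approach, as a minimal sketch only: a regular expression that rewrites each span into a font tag and keeps only the color declaration. The class name SpanToFont and the method convert are made up for this illustration. It assumes double-quoted style attributes, no other attributes on the span, and no nested spans. Dropping a span that declares no color (while keeping its text) is also an assumption, since the question does not say what should happen in that case. For arbitrary real-world HTML, an HTML parser such as Jsoup is likely more robust than a regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpanToFont {
    // Matches <span style="...">...</span>; group 1 is the style value, group 2 the inner content.
    private static final Pattern SPAN = Pattern.compile(
        "<span\\s+style\\s*=\\s*\"([^\"]*)\"\\s*>(.*?)</span>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Matches a standalone "color" declaration (not "background-color"); group 1 is its value.
    private static final Pattern COLOR = Pattern.compile(
        "(?:^|;)\\s*color\\s*:\\s*([^;]+)", Pattern.CASE_INSENSITIVE);

    static String convert(String html) {
        Matcher span = SPAN.matcher(html);
        StringBuffer out = new StringBuffer();
        while (span.find()) {
            Matcher color = COLOR.matcher(span.group(1));
            String replacement = color.find()
                ? "<font color=\"" + color.group(1).trim() + "\">" + span.group(2) + "</font>"
                : span.group(2); // assumption: no color declared, so keep only the text
            span.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        span.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "This is My text and <span style = \"color:red; font-weight:bold\">"
            + "this is some text</span> even more";
        System.out.println(convert(input));
    }
}
```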

Display a string that contains HTML in Thymeleaf template

How can I display a string that contains HTML tags in Thymeleaf?
I have this piece of code:
<div th:each="content : ${cmsContent}">
<div class="panel-body" sec:authorize="hasRole('ROLE_ADMIN')">
<div th:switch="${content.reference}">
<div th:case="'home.admin'">
<p th:text="${content.text}"></p>
</div>
</div>
</div>
//More code....
At the line with ${content.text}, the browser literally shows this:
<p>test</p>
But I want the browser to show this instead:
test
You can use th:utext (unescaped text) for such scenarios.
Simply change
<p th:text="${content.text}"></p>
to
<p th:utext="${content.text}"></p>
I also suggest having a look at the Thymeleaf documentation to learn more about using it.
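To see why th:text shows the tags literally, it helps to look at what HTML escaping does to the string. The sketch below is plain Java and is not Thymeleaf's implementation: escapeHtml is a hypothetical, simplified stand-in for the escaping that th:text applies, while th:utext writes the string unchanged.

```java
class EscapeDemo {
    // Hypothetical helper: simplified HTML escaping, roughly what th:text does to the value.
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder();
        for (char ch : s.toCharArray()) {
            switch (ch) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "<p>test</p>";
        // Escaped: the browser displays the tags as literal characters.
        System.out.println("<p>" + escapeHtml(text) + "</p>");
        // Unescaped: the browser interprets the tags and displays only "test".
        System.out.println("<p>" + text + "</p>");
    }
}
```

Because th:utext writes the value as-is, it is best used only for content you trust, since markup in user-supplied strings would be rendered too.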

Jsoup parsing for nested html

I have an HTML page to parse with Jsoup, and I lose track because of its weird structure. The HTML can be summarized like this (every line is nested one level inside the line above):
<html>
<body class="page3078">
<div id="mainCapsule">
<div id="contentCapsule" class="capsule">
<div id="content">
<div id="subCapsule" class="clearFix" xmlns="">
<div id="contentLeft">
<iframe width="635" height="1000" frameborder="0" src="apps/Results.aspx">
#document
<html xmlns="http://www.w3.org/1999/xhtml">
<body style="background:none;">
<form id="form1" action="Results.aspx" method="post" name="form1">
<div class="pressContent">
<div class="tableCapsule details">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr class="even">
Basically, I want to get the text inside the tag with class "even". I tried selecting by class directly, like this:
doc.getElementsByClass("even")
It didn't work. I tried a parent > child chain with the select method. It didn't work either. I also tried this, starting from the second html tag:
doc.select("body.page3078 > html > body > #form1 > th");
Didn't work either. Where am I wrong?
One comment points toward the solution:
As mentioned here, you need to get the page from the iframe and parse it in a separate Jsoup parser. This page isn't weird at all; it's just that a separate page is shown in the iframe. – Boris the Spider

How to get content of iframe using GWT Query?

I am trying to do this:
$("iframe.cke_dialog_ui_input_file").contents()
but it returns:
< #document(gquery, error getting the element string representation: (TypeError) @com.google.gwt.dom.client.DOMImplMozilla::toString(Lcom/google/gwt/dom/client/Element;)([JavaScript object(8570)]): doc is null)/>
But the document is not null!
Please help me solve this problem :(
UPD. HTML CODE:
<iframe id="cke_107_fileInput" class="cke_dialog_ui_input_file" frameborder="0" src="javascript:void(0)" title="Upload Image" role="presentation" allowtransparency="0">
<html lang="en" dir="ltr">
<head>
<body style="margin: 0; overflow: hidden; background: transparent;">
<form lang="en" action="gui/ckeditor/FileUploadServlet?CKEditor=gwt-uid-7&CKEditorFuncNum=0&langCode=en" dir="ltr" method="POST" enctype="multipart/form-data">
<label id="cke_106_label" style="display:none" for="cke_107_fileInput_input">Upload Image</label>
<input id="cke_107_fileInput_input" type="file" size="38" name="upload" aria-labelledby="cke_106_label">
</form>
<script>
window.parent.CKEDITOR.tools.callFunction(90);window.onbeforeunload = function() {window.parent.CKEDITOR.tools.callFunction(91)}
</script>
</body>
</html>
</iframe>
First get the iframe element as in your existing code and cast it to GWT's IFrameElement:
IFrameElement iframe = (IFrameElement) element;
Now use the iframe to get its content:
iframe.getContentDocument().getBody().getInnerText();
Hope this helps you get the values.
The contents() method returns the HTMLDocument, so normally you have to find the <body> to manipulate it.
$("iframe.cke_dialog_ui_input_file").contents().find("body");
A common mistake is to query the iframe before it has fully loaded, so add a delay using a Timer, Scheduler or GQuery.delay(). For instance:
$("iframe.cke_dialog_ui_input_file")
.delay(100,
lazy()
.contents().find("body")
.css("font-family", "verdana")
.css("font-size", "x-small")
.done());
