Remove text from a node but not descendant nodes - java

I have an XML document containing HTML data, and I am trying to remove the free text lying directly inside the BODY tag without removing the contents of the child DIV tag. So far I have used removeChild(), which also removed everything else inside BODY.
I then tried filtering on NODE_TYPE==3 to remove only the text content, but I get NODE_TYPE==1 when I run it.
When I use setTextContent(), it replaces the entire content of the tag with my input string.
This is what my XML looks like:
<?xml version="1.0" encoding="UTF-8"?>
<HTML>
<HEAD>
<META content="text/html; charset=utf-8" http-equiv="Content-Type"/>
</HEAD>
<BODY>
<DIV class="WordSection1">
<P>Enter Text here</P> <P>COMPLETED</P>
</DIV>
TEXT I WANT TO REMOVE
</BODY>
</HTML>
After the change, I need output like this:
<?xml version="1.0" encoding="UTF-8"?>
<HTML>
<HEAD>
<META content="text/html; charset=utf-8" http-equiv="Content-Type"/>
</HEAD>
<BODY>
<DIV class="WordSection1">
<P>Enter Text here</P> <P>COMPLETED</P>
</DIV>
</BODY>
</HTML>
Any suggestions?

I understand you're using the 'old' org.w3c.dom library that comes with Java. Assuming you have read the document into a Document doc, you could do:
// The root's last child is a whitespace text node, so BODY is its previous sibling
Node body = doc.getDocumentElement().getLastChild().getPreviousSibling();
// The unwanted text is the last child of BODY
body.removeChild(body.getLastChild());
...although this isn't robust against changes to the input XML, since it relies on the exact position of the whitespace and text nodes.
You might want to try a different XML API (e.g. JDOM). The old one often doesn't make your life very easy.
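A less position-dependent variant stays within org.w3c.dom and walks the direct children of BODY, removing only those whose node type is Node.TEXT_NODE (the constant with value 3). The BODY element itself is an element node (type 1), which may explain the NODE_TYPE==1 result in the question. The following is a minimal sketch, not a definitive implementation: the class and method names are made up for illustration, and it assumes the document contains exactly one BODY element. Note that it also removes the whitespace-only text nodes directly under BODY.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here only for illustration.
class DirectTextRemover {

    /** Removes text and CDATA nodes that are direct children of parent; child elements stay untouched. */
    static void removeDirectText(Element parent) {
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // read the sibling before removing the node
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                parent.removeChild(child);
            }
            child = next;
        }
    }

    /** Parses the XML, strips the direct text of the first BODY element, and serializes the result. */
    static String clean(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Assumption: the document has a BODY element (XML tag names are case-sensitive).
            Element body = (Element) doc.getElementsByTagName("BODY").item(0);
            removeDirectText(body);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not process the XML input", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<HTML><BODY><DIV class=\"WordSection1\"><P>Enter Text here</P> <P>COMPLETED</P></DIV>\n"
                + "TEXT I WANT TO REMOVE\n</BODY></HTML>";
        System.out.println(clean(xml));
    }
}
```

Because only direct children of BODY are inspected, the text inside the P elements is preserved: it belongs to text nodes that are children of P, not of BODY.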

Related

Replace custom tags in html with Java

Is there a Java library that can help me replace custom tags in HTML?
For example, here is a simple template:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Title</title>
</head>
<body>
<div>
<p>$welcome_title</p>
<p>$email_body</p>
<p>$footer_text</p>
</div>
</body>
</html>
Can I replace these custom tags ($welcome_title, $email_body, $footer_text) with values from Java?
The idea is to have a template with tags that can be replaced at runtime with values from Java objects :)
Also, if there is such a library, it would be nice to generate a PDF document straight from the HTML.
Thanks :)

In the Java world you can use Thymeleaf (https://www.thymeleaf.org/) or FreeMarker (https://freemarker.apache.org/). Both are template engines that fill a template with values from Java objects at runtime.
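To make the idea of runtime substitution concrete, here is a minimal sketch that uses only java.util.regex rather than either of the libraries above. It is not how Thymeleaf or FreeMarker work internally. The class name, the render method, and the rule that a placeholder is "$" followed by an identifier are assumptions chosen to match the template in the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class PlaceholderTemplate {

    // Assumption: a placeholder is '$' followed by a Java-style identifier, e.g. $welcome_title.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

    /** Replaces every known $name with its value; unknown placeholders are left unchanged. */
    static String render(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            String replacement = (value != null) ? value : matcher.group();
            // quoteReplacement keeps '$' and '\' in the value from being read as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("welcome_title", "Welcome!");
        values.put("email_body", "Your account is ready.");
        values.put("footer_text", "See you soon.");

        String template = "<p>$welcome_title</p>\n<p>$email_body</p>\n<p>$footer_text</p>";
        System.out.println(render(template, values));
    }
}
```

This sketch does no HTML escaping of the inserted values and has no loops or conditionals, which is where a real template engine earns its place.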

Multiple body nodes present in HTML DOM, so Selenium fails to find the element in DOM

I'm trying to automate an application that has multiple HTML head and body tags. A sample is provided below. I tried every approach I could think of (XPath, id, class, etc.). None of them works for this application, because it has an HTML page embedded inside the DOM. I guess JavaScript loads a new HTML page inside the page.
Even though the XPath works in the Chrome browser, when I put it in a script and run it, it throws an exception:
Exception in thread "main" org.openqa.selenium.NoSuchElementException: Unable to locate element: //*[text()='Continue'].
How can I tackle this problem?
HTML DOM sample:
<html class="UShellFullHeight">
<head>
<style id="antiClickjackStyle" type="text/css">
body {
display : none !important;
}
</style>
</head>
<body class="UiBody UShellFullHeight" role="application">
<div id="canvas" class="UShellFullHeight"></div>
#document
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html id="home" lang="EN">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
.
.
.
<span id="WD8A-cnt" class="urNoUserSelect lsButton--contentlsControl--centeraligned urBtnCnt" style="pointer-events:none;">
<span class="lsButton__text " id="WD8A-caption" style="white-space:nowrap;">Continue</span>
</span>
</body>
</html>
</body>
</html>
Try with the tag name:
//span[text()='Continue']
Or, better for this example, use the id, since this element has one:
driver.findElement(By.id("WD8A-caption"));
Or this XPath, which is equivalent:
//span[@id='WD8A-caption']
Try writing the XPath manually, or use the class name, since the element has one in addition to its id:
driver.findElement(By.className("lsButton__text"));
Thanks a lot for your time, guys. I found the answer myself.
The element is inside a frame, so I need to switch to that frame before operating on its elements.
To switch to the frame (replace "id" with the frame's actual id or name):
driver.switchTo().frame("id");

Multiple body tags in Sitemesh 3

I have been using Sitemesh 3 for my project and so far it's been working great. Recently I came across a situation where I am stuck.
My final view has to be composed of 2 HTML files, both of which have their own <head> and <body> tags.
File1:
<html>
<head>Head1</head>
<body>body1</body>
</html>
File2:
<html>
<head>Head2</head>
<body>body2</body>
</html>
I am composing a view using freemarker include tag. So, the composed HTML looks like:
<html>
<head>Head1</head>
<body>body1</body>
</html>
<html>
<head>Head2</head>
<body>body2</body>
</html>
Following is my decorator:
<html>
<head>
<sitemesh:write property='head'/>
</head>
<body>
<div class="container">
<sitemesh:write property='body'/>
</div>
</body>
</html>
But once decorated, the final output I am getting is:
<html>
<head>
<head>Head1</head>
</head>
<body>
<div class="container">
<body>body1</body>
</div>
</body>
</html>
But the expected output is
<html>
<head>
<head>
Head1
Head2
</head>
</head>
<body>
<div class="container">
body1
body2
</div>
</body>
</html>
I came across a similar question, but that solution won't work for me because I don't want to create multiple decorators.
I just want to know if it's possible in Sitemesh 3. If yes, then how.
Thanks.
If you don't mind extending Sitemesh 3, this is fairly easy to do by adding support for server-side includes in your decorator template. I do exactly this in another library (UtterlyIdle).
I'm using StringTemplate as my decorator language, but this should work in FreeMarker or any other templating tool. I add a PageMap and then call the following in my decorator template:
$include("someUrl").body$
This performs an include and then parses the output with the Sitemesh 3 engine, which allows you to have as many includes as you like.
Hope that makes sense.

Play! framework. template "include"

I'm planning my website structure as follows:
header.scala.html
XXX
footer.scala.html
Instead of "XXX" there should be a specific page (e.g. "UsersView.scala.html").
What I need is to include the source of the header and the footer in the middle page's code, as in other well-known languages.
So my questions are:
How do you include a page in another with Scala templating?
Do you think it's a good paradigm for a Play! framework based website?
Just call another template like a method. If you want to include footer.scala.html:
@footer()
A common pattern is to create a template that contains the boilerplate and takes a parameter of type Html. Let's say:
main.scala.html
@(content: Html)
@header()
// boilerplate
@content
// more boilerplate
@footer()
In fact, you don't really need to separate out header and footer with this approach.
Your UsersView.scala.html then looks like this:
@main {
// all your users page html here.
}
You're wrapping the UsersView with main by passing it in as a parameter.
You can see examples of this in the samples
My usual main template is a little more involved and looks roughly like this:
@(title: String)(headInsert: Html = Html.empty)(content: Html)(implicit user: Option[User] = None)
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<title>@title</title>
// bootstrap stuff here
@headInsert
</head>
<body>
@menu(user)
<div id="mainContainer" class="container">
@content
</div>
</div>
</body>
</html>
This way a template can pass in a head insert and title, and make a user available, as well as content of course.
Play provides a very convenient way to implement this.
This is the layout part from the official docs.
First we have a base layout (what we would call base.html in Django -_-):
// views/main.scala.html
@(title: String)(content: Html)
<!DOCTYPE html>
<html>
<head>
<title>@title</title>
</head>
<body>
<section class="content">@content</section>
</body>
</html>
How do you use the base layout?
@main(title = "Home") {
<h1>Home page</h1>
}
More information here

JEditorPane saves HTML using entities instead of diacritics

I have a file containing the Czech text "Běžný soubor" (common file), split across two lines:
<html>
<head>
<meta http-equiv="contet-type" content="text/html; charset=UTF-8"/>
</head>
<body>
<p>Běžný</p>
<p>soubor</p>
</body>
</html>
When I load this file into a JEditorPane using HTMLEditorKit and then save it (as if it had been edited), the underlying model (HTML code) is changed to:
<html>
<head>
<meta http-equiv="contet-type" content="text/html; charset=UTF-8"/>
</head>
<body>
<p style="margin-top: 0">B&#283;&#382;n&#253;</p>
<p style="margin-top: 0">soubor</p>
</body>
</html>
Is there some way to get rid of the margins and entities? Do I have to override some methods of HTMLEditorKit?
PS: Is there another embeddable (and free) simple Java HTML (WYSIWYG-like) editor? I need to handle some special tags from my own XML namespace. (Ideally it would be HTML 4.0 compliant.)
Please use NetBeans IDE 7.0.
It is a free download:
http://netbeans.org/downloads/
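As for the entities themselves, one possible workaround is to post-process the saved HTML string and turn numeric character references back into the characters they stand for. The sketch below is not a feature of HTMLEditorKit and is not taken from the thread. It assumes the saved markup is available as a String, that the entities are decimal references of the form &#NNN;, and that the file is then written in an encoding such as UTF-8 that can represent the characters. The class and method names are made up for illustration, and the sketch does nothing about the added margin-top style.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class NumericEntityDecoder {

    // Matches decimal numeric character references such as &#283;
    private static final Pattern NUMERIC_ENTITY = Pattern.compile("&#(\\d{1,7});");

    /**
     * Replaces decimal references to non-ASCII characters with the characters themselves.
     * References to ASCII characters (for example &#60; for '<') are kept, because decoding
     * them could change the meaning of the markup.
     */
    static String decodeNonAscii(String html) {
        Matcher matcher = NUMERIC_ENTITY.matcher(html);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            int codePoint = Integer.parseInt(matcher.group(1));
            boolean decodable = codePoint > 127 && Character.isValidCodePoint(codePoint);
            String replacement = decodable
                    ? new String(Character.toChars(codePoint))
                    : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeNonAscii("<p style=\"margin-top: 0\">B&#283;&#382;n&#253;</p>"));
    }
}
```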
