How can I get content of HTML <body> - java

When I have this HTML:
<html>
<head>
</head>
<body>
text
<div>
text2
<div>
text3
</div>
</div>
</body>
</html>
how can I get the content of the body with a DOM parser in Java:
text
<div>
text2
<div>
text3
</div>
</div>
because the getTextContent method returns "text text2 text3", i.e. the text without the tags.
This is possible with SAX, but is it possible with DOM, too?

getTextContent is behaving as I would expect: it returns the textual content of the HTML fragment. Can you check the API docs for the DOM parser and see if there's a similar method with a name like getHtmlContent?

You would need to parse the document into a DOM and serialise only the portion of the DOM you wanted. Using the DOM Level 3 LS interfaces you can serialise the outer-XML of a single node with:
LSSerializer serializer= implementation.createLSSerializer();
String html= serializer.writeToString(node);
To get the inner-XML you would need to call writeToString on each child node in turn (e.g. appending the results to a StringBuffer).
Depending on which DOM implementation you are using, there may be alternative non-standard methods. There may also be risks with serialising HTML as XML, if that's what you're doing. For example, a standard XML serialiser may output a self-closing tag for an empty element, which can confuse browsers parsing the output as legacy HTML.
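The child-by-child approach described above can be sketched roughly as follows, using only the JDK's built-in DOM and LS classes. The class and method names (InnerXml, innerXml, bodyContent) are made up for illustration, and the input is assumed to be well-formed XML; real-world HTML would first need a lenient parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
class InnerXml {

    // Serialises every child of 'parent' in turn and concatenates the results.
    static String innerXml(Node parent) {
        Document doc = parent.getOwnerDocument();
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        // Without this, the serializer may prepend an XML declaration to element nodes.
        serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);

        StringBuilder sb = new StringBuilder();
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            sb.append(serializer.writeToString(child));
        }
        return sb.toString();
    }

    // Parses well-formed markup and returns the inner XML of the first <body> element.
    static String bodyContent(String xhtml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            Node body = doc.getElementsByTagName("body").item(0);
            return innerXml(body);
        } catch (Exception e) {
            throw new IllegalArgumentException("Input could not be parsed as XML", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><head></head><body>text<div>text2<div>text3</div></div></body></html>";
        System.out.println(bodyContent(page));
    }
}
```

Because the output comes from an XML serialiser, the caveat about self-closing tags for empty elements applies to this sketch as well.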

Related

Read an html attribute value from a Thymeleaf fragment

I'm using Thymeleaf as the template engine in a Java Spring web application, which is already a completed website. What I'm working on now is introducing some <meta> tags in the header to optimize the interaction with social network platforms.
My goal is to accomplish this while modifying as little of the template code as possible, and to have some flexibility over which values go into the tags' content.
Right now the templates are structured as follows:
Layout.html which contains a couple fragments for head and body.
<head th:fragment="common_head(title, links, scripts)" th:assert="${!#strings.isEmpty(title)}">
...
</head>
<body th:fragment="common_body(content, body_end)">
...
</body>
And the templates for the actual pages are something like this.
<head th:replace="common/layout :: common_head(~{:: head/title}, ~{}, ~{:: head/script})">
<title>My page title</title>
<script>
console.log('some page specific JS code');
</script>
...
</head>
<body th:replace="common/layout :: common_body(~{ :: body/content }, ~{ :: body/bottom })">
<div class="wrapper" th:fragment="content">
some content here
</div>
<th:block th:fragment="bottom">
more content here
</th:block>
</body>
I know I can just add another parameter to the common_head fragment and pass the <meta> tags to it the same way I'm doing with title, scripts, etc., but I was thinking of another approach that would lead to less repeated code for injecting the values into the header.
In the layout's common_head fragment I have this:
<meta property="og:title" th:with="og_value = ~{this :: %og_title/text()}" th:content="${og_value ne null ? og_value : 'Lorem ipsum'}">
<meta property="og:description" th:with="og_value = ~{this :: %og_description/text()}" th:content="${og_value ne null ? og_value : 'Lorem ipsum'}">
<meta property="og:image" th:with="og_value = ~{this :: %og_image}" th:content="${og_value ne null ? og_value : 'path-to-default-img'}">
The idea is to use fragment selectors to pick the right content to inject from the page template. This way, in a page I can just mark a tag (maybe a div, a paragraph, something relevant for that template) with th:ref="og_description" and its value will become the content of the <meta> tag in the head.
This works really well for the og_title and og_description tags, but the problem arises with the og:image meta tag, where I need to inject the value of the src attribute of an <img> tag marked with th:ref="og_image".
I couldn't find any way to read an attribute's value from the fragment. Is there a way to do this?
I can see that the selected fragment is actually an instance of org.thymeleaf.standard.expression.Fragment, but I don't see any method I can use to access its HTML attributes.
If this is not technically possible, is there a better approach for this use case?

How to display a Java String with an HTML tag appended, with the HTML behavior, in an AngularJS front end

I have a string in Java, and I need to append an HTML tag to it dynamically so that, when it is displayed in the front end, the HTML tag takes effect.
Eg:
String content="Hello World,this is a test <em>content</em> to demonstrate the requirement";
In the string above, the word "content" is wrapped in an <em> tag. But when I try to display it in the AngularJS front end, the tag is not rendered and the string is shown literally as "Hello World,this is a test <em>content</em> to demonstrate the requirement".
Use angular-sanitize.js for this.
Example:
<div ng-controller="testCtrl">
<div ng-bind-html="stringTest"></div>
</div>
You can use ng-bind-html:
<div ng-controller="testCtrl">
<div ng-bind-html="stringTest"></div>
</div>
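However, if you find this directive too restrictive and you absolutely trust the source of the content you are binding to, you can also use ng-bind-html-unsafe. Note that ng-bind-html-unsafe was removed in AngularJS 1.2; in later versions, mark the value as trusted with $sce.trustAsHtml and bind it with ng-bind-html.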
<div ng-controller="testCtrl">
<div ng-bind-html-unsafe="stringTest"></div>
</div>

Replace html content with another html content

I have a JSP page, new.jsp, with the following content:
<div class="parentClass">
<ul>#myLinks</ul>
</div>
More
I have to replace #myLinks with other HTML that I generate in my action class. I read the content of the JSP page using htmlReader, which works correctly, and I tried using the replaceAll function to replace the placeholder. myPage contains the content of my JSP page.
The content of subLinks is as follows:
<li>abc</li>
<li>abc</li>
<li>abc</li>
<li>abc</li>
But even after running the following, the output of replacedContent is the same as before:
replacedContent = myPage.replaceAll("#myLinks", subLinks);
I need the final output to be:
<div class="parentClass">
<ul>
<li>abc</li>
<li>abc</li>
<li>abc</li>
<li>abc</li>
</ul>
</div>
More
Could someone help me with this? Thanks in advance.
There are different ways to do it, but the simplest is to use a property or method in the action class that returns a string containing the generated HTML. Then you can write this HTML directly to the output.
<ul><s:property value="%{subLinks}" escapeHtml="false" /></ul>
Of course, you should provide a getter for the subLinks property.
You can do this using JavaScript.
First, give an id to the sublinks container tag: <ul id="test">
Then, assuming you get the sublinks as a string, like sublinks = "<li>abc</li><li>abc</li>..."
write the following JavaScript code at the end of the body.
<script type="text/javascript">
document.getElementById('test').innerHTML = sublinks;
</script>
That's it.
Hope this will be helpful to you.
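If the substitution is kept on the server side, as the question attempts, a literal replacement can be sketched as below. String.replace treats both arguments as plain text, whereas replaceAll interprets the first argument as a regular expression and gives $ and \ a special meaning in the replacement. The class and method names are hypothetical, and the page content is a stand-in for whatever htmlReader returns.

```java
// Hypothetical helper illustrating literal placeholder substitution.
class PlaceholderReplace {

    // Returns a new string; the original 'page' is left unchanged because Strings are immutable.
    static String fill(String page, String placeholder, String html) {
        return page.replace(placeholder, html);
    }

    public static void main(String[] args) {
        String myPage = "<div class=\"parentClass\">\n<ul>#myLinks</ul>\n</div>";
        String subLinks = "<li>abc</li>\n<li>abc</li>";
        String replacedContent = fill(myPage, "#myLinks", subLinks);
        System.out.println(replacedContent);
    }
}
```

If a call like this leaves the text unchanged, the placeholder is most likely not present verbatim in the string being searched, which is worth checking before anything else.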

Dividing a one-line HTML file into a well-formed HTML file

I have an HTML file in which all tags are in one line. I would like to separate each tag and put it on its own line. The end goal is to have a well-formed HTML file.
e.g.
<html><head><title>StackOverflow</title></head><body></body></html>
would be converted into:
<html>
<head>
<title>
StackOverflow
</title>
</head>
<body>
</body>
</html>
Is there an existing Java library that handles this already?
Your problem has nothing to do with well-formed HTML files. Having all the HTML tags on the same line doesn't mean that the HTML is not well formed.
What you actually need is just a formatter, which will make your HTML more human-readable.
You could take a look at JTidy, which can optionally also do syntax checking.
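As a rough illustration of what such a formatter does, here is a sketch that uses the JDK's built-in XML Transformer rather than JTidy. It only works when the input is well-formed XML (JTidy is more tolerant of real-world HTML), the class name MarkupIndenter is made up, and the exact line layout depends on the JDK's serializer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: re-serialises well-formed markup with line breaks and indentation.
class MarkupIndenter {

    static String indent(String markup) {
        try {
            // An identity transform: copies the input to the output, applying output properties.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            // Implementation-specific property understood by the JDK's default transformer.
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(markup)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Input could not be parsed as XML", e);
        }
    }

    public static void main(String[] args) {
        String oneLine = "<html><head><title>StackOverflow</title></head><body></body></html>";
        System.out.println(indent(oneLine));
    }
}
```

Since this is an XML serialiser, an empty element such as body may come out self-closed, so the result is better suited to reading than to serving as legacy HTML.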

Embedded custom-tag in dynamic content (nested tag) not rendering

I have a page that pulls dynamic content from a JavaBean and passes the list of objects to a custom tag for processing into HTML. Each object contains a chunk of HTML to be output, and that HTML contains a second custom tag that I would also like to be rendered. The problem is that the tag invocation is rendered as plain text.
An example might explain it better.
1 Pull information from a database and return it to the page via a JavaBean. Send this info to a custom tag for output.
<jsp:useBean id="ImportantNoticeBean" scope="page" class="com.mysite.beans.ImportantNoticeProcessBean"/> <%-- Declare the bean --%>
<c:forEach var="noticeBean" items="${ImportantNoticeBean.importantNotices}"> <%-- Get the info --%>
<mysite:notice importantNotice="${noticeBean}"/> <%-- give it to the tag for processing --%>
</c:forEach>
This tag should output a box div like so:
*SNIP* class for custom tag def and method setup etc
out.println("<div class=\"importantNotice\">");
out.println(" " + importantNotice.getMessage());
out.println(" <div class=\"importantnoticedates\">Posted: " + importantNotice.getDateFrom() + " End: " + importantNotice.getDateTo() + "</div>");
out.println(" <div class=\"noticeAuthor\">- " + importantNotice.getAuthor() + "</div>");
out.println("</div>");
*SNIP*
This renders fine and as expected:
<div class="importantNotice">
<p>This is a very important message. Everyone should pay attenton to it.</p>
<div class="importantnoticedates">Posted: 2008-09-08 End: 2008-09-08</div>
<div class="noticeAuthor">- The author</div>
</div>
2 Suppose, in the example above, that the String returned by importantNotice.getMessage() contains a custom tag:
*SNIP* "This is a very important message. Everyone should pay attenton to it. <mysite:quote author="Some Guy">Quote this</mysite:quote>" *SNIP*
The important notice renders fine, but the quote tag is not processed; it is simply left in the string and output as a plain text/HTML tag:
<div class="importantNotice">
<p>This is a very important message. Everyone should pay attenton to it. <mysite:quote author="Some Guy">Quote this</mysite:quote></p>
<div class="importantnoticedates">Posted: 2008-09-08 End: 2008-09-08</div>
<div class="noticeAuthor">- The author</div>
</div>
rather than:
<div class="importantNotice">
<p>This is a very important message. Everyone should pay attenton to it. <div class="quote">Quote this <span class="authorofquote">Some Guy</span></div></p> // or whatever I choose as the output
<div class="importantnoticedates">Posted: 2008-09-08 End: 2008-09-08</div>
<div class="noticeAuthor">- The author</div>
</div>
I know this has to do with processors and pre-processors, but I am not too sure how to make this work.
Just using
<bodycontent>JSP</bodycontent>
is not enough. You should do something like
// Inside the doTag() method of a SimpleTag implementation.
// 'out' is assumed to be the JspWriter obtained from getJspContext().getOut().
JspFragment body = getJspBody();
StringWriter stringWriter = new StringWriter();
StringBuffer buff = stringWriter.getBuffer();
buff.append("<h1>");
body.invoke(stringWriter); // evaluates the tag body, including any nested custom tags
buff.append("</h1>");
out.println(stringWriter);
to get inner tags rendered (the example is for the SimpleTag doTag method).
However, in the question's code I see that the inner tag comes from a string that is not rendered as part of the JSP; to the JSP translator it is just an arbitrary string. I do not think you can force the JSP translator to parse it.
You can use a regexp in your case, or try to redesign your code so that the JSP looks like this:
<jsp:useBean id="ImportantNoticeBean" scope="page" class="com.mysite.beans.ImportantNoticeProcessBean"/>
<c:forEach var="noticeBean" items="${ImportantNoticeBean.importantNotices}">
<mysite:notice importantNotice="${noticeBean}">
<mysite:quote author="Some Guy">Quote this</mysite:quote>
<mysite:messagebody author="Some Guy" />
</mysite:notice>
</c:forEach>
I would go with the regexp.
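A minimal sketch of the regexp route, assuming the quote tag always has exactly the form shown in the question (a single author attribute and no nested quote tags). The class name QuoteTagExpander and the generated markup are illustrative only; the markup mirrors the output the question says it wants.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that expands <mysite:quote> markers found inside a message string.
class QuoteTagExpander {

    // group(1) = author attribute value, group(2) = tag body
    private static final Pattern QUOTE = Pattern.compile(
            "<mysite:quote\\s+author=\"([^\"]*)\"\\s*>(.*?)</mysite:quote>",
            Pattern.DOTALL);

    static String expand(String message) {
        Matcher matcher = QUOTE.matcher(message);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String html = "<div class=\"quote\">" + matcher.group(2)
                    + " <span class=\"authorofquote\">" + matcher.group(1) + "</span></div>";
            // quoteReplacement keeps any '$' or '\' in the message from being treated specially.
            matcher.appendReplacement(out, Matcher.quoteReplacement(html));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String message = "Everyone should pay attention. "
                + "<mysite:quote author=\"Some Guy\">Quote this</mysite:quote>";
        System.out.println(expand(message));
    }
}
```

The notice tag could call something like expand(importantNotice.getMessage()) before printing the message. A regexp like this is brittle if the tag syntax varies, which is the trade-off against the redesigned JSP above.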
I would be inclined to change the "architecture of your tagging": the data you retrieve should not carry tags inside the class, because tags are markup designed for a page (although, obscurely, it is possible to get hold of the JSP servlet engine's evaluating thread).
What you would probably find better, and more in line with standard practice, is using "cooperating tags": extend the BodyTagSupport class and return EVAL_BODY_BUFFERED from the doStartTag() method to process the body repeatedly, and/or share objects, for example by storing the retrieved data in application scope or in the user's session.
See the Oracle J2EE custom tags tutorial for more information.
