JSoup - Formatting the <option> elements - java

Let's say I have this HTML:
<html>
<head>
</head>
<body>
<form method="post">
<select name="books">
<option value="111">111</option>
<option value="222">222</option>
</select>
</form>
</body>
</html>
I load it in Jsoup and get the result back:
Document doc = Jsoup.parse(html);
doc.outputSettings().indentAmount(4);
doc.outputSettings().charset("UTF-8");
doc.outputSettings().prettyPrint(true);
String result = doc.outerHtml();
The result is:
<html>
<head>
</head>
<body>
<form method="post">
<select name="books"> <option value="111">111</option> <option value="222">222</option> </select>
</form>
</body>
</html>
The <option> elements are all on the same line!
How can I get Jsoup to format the <option> elements so that the result is the same as the input in this example?

doc.outputSettings().charset("UTF-8");
When parsing HTML from a String, the default charset is UTF-8. It differs only if you parse from a File or InputStream and set the charset there.
Therefore the charset on OutputSettings defaults to the same as the input, which is UTF-8 in your case. You only need to set it if you want it to differ from the input.
Document.OutputSettings.charset()
Get the document's current output charset, which is used to control which characters are escaped when generating HTML (via the html() methods), and which are kept intact.
Where possible (when parsing from a URL or File), the document's output charset is automatically set to the input charset. Otherwise, it defaults to UTF-8.
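To make "which characters are escaped" concrete, here is a rough sketch of the idea using only the JDK. It is not Jsoup's code: the class and method names are made up, it always emits numeric entities (Jsoup may also use named ones), and it does not escape &, < or >. A character that the output charset cannot encode is written as an entity; everything else is kept intact.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not part of Jsoup).
final class CharsetEscapeSketch {

    // Replace every code point the output charset cannot encode with a numeric entity.
    static String escapeFor(String text, Charset outputCharset) {
        CharsetEncoder encoder = outputCharset.newEncoder();
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            if (encoder.canEncode(ch)) {
                sb.append(ch);                           // kept intact
            } else {
                sb.append("&#").append(cp).append(';');  // escaped
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9 \u20ac"; // "café €"
        // UTF-8 can encode everything, so nothing is escaped.
        System.out.println(escapeFor(text, StandardCharsets.UTF_8));
        // US-ASCII cannot encode é or €: prints caf&#233; &#8364;
        System.out.println(escapeFor(text, StandardCharsets.US_ASCII));
    }
}
```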
doc.outputSettings().prettyPrint(true);
You don't need to enable pretty print; it is on by default.
Document.OutputSettings.prettyPrint()
Get if pretty printing is enabled. Default is true. If disabled, the
HTML output methods will not re-format the output, and the output will
generally look like the input.
doc.outputSettings().outline(true);
This is the key setting. When it is not set, only block tags are displayed as blocks (option is not a block tag). When it is enabled, all tags are treated as block elements.
Document.OutputSettings.outline()
Get if outline mode is enabled. Default is false. If enabled, the HTML output methods will consider all tags as block.
So your final block of code should look something like this:
Document doc = Jsoup.parse(html);
doc.outputSettings().indentAmount(4).outline(true);
String result = doc.outerHtml();
Output
<html>
<head>
</head>
<body>
<form method="post">
<select name="books">
<option value="111">111</option>
<option value="222">222</option>
</select>
</form>
</body>
</html>

Related

How to add customised tag to add a pagebreak in html2pdf conversion using iText 7

<div>
<h1>
This is a custom tag worker sample.
</h1>
<br>
<h1>
This is a custom tag worker sample.</h1>
<break/>
<h1>
This is a custom tag worker sample after page break.</h1>
<break/>
<h3>This is last page</h3>
</div>
Given the HTML above, I want the generated PDF to contain a page break of type next_page wherever there is a <break/> tag.

How to parse "text" from span class with Jsoup

I want to extract the text inside a span element with Jsoup.
Here is the relevant portion of my HTML.
<html>
<head></head>
<body>
<div>
<div class = "abcd">
<span> This is text </span>
</div>
<div>
</body>
</html>
I wrote something like this:
Element element = doc.select("div.abcd > span");
System.out.println("Text = "+element.text());
This isn't working. Is there any other way to do this?
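Change "div.abcd > span"
to
"div.abcd span"
Also note that doc.select() returns Elements, not Element, so declare the variable as Elements or call doc.select("div.abcd span").first().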

Pass selected value from dropdown using JSP

I need to implement a basic dropdown using JSP and Java, but I can't find much information on how to do that. I have never written anything in JSP, and since I couldn't find anything that helped, my last option was to ask.
I want to get the selected value and, when the button is clicked, send it to another .jsp file ("selector.jsp" in my case).
Please help me with an easy solution.
P.S.: Sorry for my English (:
index.jsp
<FORM method="post" action="selector.jsp">
<select name="select" id="dropdown">
<%
Test t = new Test();
t.getList().add("a");
t.getList().add("b");
t.getList().add("c");
for(int i=0; i < t.getList().size(); i++){
%>
<Option value="<%t.getList().get(i);%>"><%=t.getList().get(i)%></Option>
<%}%>
</select>
<input type="submit" value="click">
selector.jsp
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Insert title here</title>
</head>
<body>
You selected:
<%
request.getParameter("select");
request.getParameterValues("select");
%>
</body>
</html>
I found a solution by removing
value="<%t.getList().get(i);%>"
from the <Option> tag and leaving just
<Option><%=t.getList().get(i)%></Option>
but I don't know why it works. It would be great if someone could explain.
Thx! (:
As you have indicated in your post, the problem is solved by replacing
value="<%t.getList().get(i);%>"
with
<Option><%=t.getList().get(i)%></Option>
The reason that works is as follows:
In your first form, <%t.getList().get(i);%>, you have a JSP scriptlet. This is Java code that is executed inline. In your case, it calls the get method. The method returns a value, but that value is not written to the response, so the rendered attribute is value="". The browser then submits an empty string for every option. Without a value attribute, the browser submits the option's text instead, which is why removing the attribute helped.
In your second form, you have a JSP expression, formed with "<%=". "<%=" is shorthand for "out.print", so it is equivalent to the following:
<Option><% out.print(t.getList().get(i)); %></Option>
This writes the return value of the method call to the output stream, so when the output reaches the browser there is an actual value within the Option tags. To keep the value attribute, use an expression there as well: value="<%=t.getList().get(i)%>".
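The difference can be mimicked in plain Java. This is only a sketch: the class name, the renderOption method and the StringWriter standing in for the JSP out object are assumptions made for illustration, not what a JSP container literally generates.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Java code a JSP page is turned into.
final class ScriptletVsExpression {

    static String renderOption(List<String> items, int i, boolean useExpression) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer); // plays the role of the JSP "out"
        out.print("<option value=\"");
        if (useExpression) {
            out.print(items.get(i)); // like <%= items.get(i) %>: the result is written
        } else {
            items.get(i);            // like <% items.get(i); %>: the result is discarded
        }
        out.print("\">");
        out.print(items.get(i));
        out.print("</option>");
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println(renderOption(items, 0, false)); // <option value="">a</option>
        System.out.println(renderOption(items, 0, true));  // <option value="a">a</option>
    }
}
```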

getting a check box value

I send a request to a servlet, which returns some data from the database. In the servlet itself I build a table with a checkbox per row using out.println. Now I need to select rows for further manipulation using the checkboxes, and I don't know how to get the values of the selected checkboxes.
Here is my servlet code:
ps=connection.prepareStatement("select t.tc_name,s.scenario_name,t.scenario_id from testcase t, scenario s where t.scenario_id=s.scenario_id;");
ResultSet rs=ps.executeQuery();
out.println("<table>");
/*out.println(executionValues.append("<tr><td>").append("Test Case Name :").append("</td><td>").append("Scenario Name :").append("</td></tr>"));*/
while(rs.next()){
out.println("<li class='panel' value='"+rs.getInt("scenario_id")+"'><b>Scenario Name:</b>"+rs.getString("scenario_name")+"</li><b>Test Case Name:</b>"+rs.getString("tc_name")+"<input type=\"checkbox\" name=\"checkbox\"></li>");
}
You should remove the trailing ; from your SQL query:
("select t.tc_name,s.scenario_name,t.scenario_id from testcase t, scenario s where t.scenario_id=s.scenario_id;");
Change it to:
("select t.tc_name,s.scenario_name,t.scenario_id from testcase t, scenario s where t.scenario_id=s.scenario_id");
You're printing a whole new <html> and <form> around every single checkbox. Your HTML ends up in the browser as:
<html>
<head></head>
<body>
<html><body><form><input type="checkbox"></form></body></html>
<form><input type="submit"></form>
</body>
</html>
This is syntactically invalid HTML. You need to rewrite your code so that all checkboxes and the submit button end up in the same form:
<html>
<head></head>
<body>
<form>
<input type="checkbox">
<input type="submit">
</form>
</body>
</html>
Then you also don't need those ugly JavaScript workarounds. You just give the checkboxes the same name, but a different value. This way you can just grab the checked values by HttpServletRequest#getParameterValues().
String[] users = request.getParameterValues("user");
For example:
<form name="input" action="html_form_action" method="get">
<input type="checkbox" name="vehicle" value="Bike">I have a bike<br>
<input type="checkbox" name="vehicle" value="Car">I have a car
<br><br>
<input type="submit" value="Submit">
</form>
If you check both checkboxes, your server will receive the parameters like so:
http://sitename.com/your_page.jsp?vehicle=Bike&vehicle=Car
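After that you can get the values like this:
String[] checkboxValues = request.getParameterValues("vehicle");
checkboxValues holds one entry per checked box, here Bike and Car. Note that request.getParameter("vehicle") would return only the first value.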
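To see how one parameter name can carry several values, here is a sketch that parses the query string from the URL above using only the JDK. The class and method names are made up for illustration; in a servlet container, request.getParameterValues("vehicle") does this work for you.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class QueryValuesSketch {

    // Group the values of a query string by parameter name, keeping repeated names.
    static Map<String, List<String>> parse(String query) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.computeIfAbsent(decode(name), k -> new ArrayList<>()).add(decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = parse("vehicle=Bike&vehicle=Car");
        System.out.println(params.get("vehicle")); // [Bike, Car]
    }
}
```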
Refer to this link:
http://theopentutorials.com/examples/java-ee/servlet/getting-checkbox-values-from-html-form-in-servlet/
Dear suganth, it is no longer considered appropriate to put HTML code in a servlet. Use JSP pages instead.
But if you have to do it this way, you can use something like:
if(conditionMatch){
// code for checked box
}
else{
// code for unchecked box
}
Hope this helps

how to Encode and decode text in Jsp

<%@page import="java.net.URLDecoder"%>
<%@ page language="java" contentType="text/html; charset=ISO-8859-1"
pageEncoding="ISO-8859-1"%>
<%@page import="java.net.URLDecoder"%>
<%@page import="java.net.URLEncoder"%>
<html>
<form action="index.jsp">
<body>
First INPUT:
<input name="firstinput" type="text" name="fname">
<br>
<input type="submit" value="Submit">
<%
String first = request.getParameter("firstinput");
String Searchtext=URLDecoder.decode(first,"UTF-8");
out.println(Searchtext);
out.println(URLEncoder.encode(Searchtext,"UTF-8"));
%>
</body>
</form>
</html>
This is my code. I want to encode and decode text in JSP. Actually, I want that when the input contains " ", ' ', / / or any other special character, only the text itself is printed. For example, if the input is "hello" or hello, it should print hello, and if the input is 'hello' it should also print hello. Special characters should not be displayed. Please help, I am unable to do this.
I think you need this:
String lWithoutSpecials = first.replaceAll("[^\\p{Alpha}]+","");
For me it works great:
String s = "\\Hello\\ \"Hello\" 'Hello'";
String lWithoutSpecials = s.replaceAll("[^\\p{Alpha}]+", "");
System.out.println(lWithoutSpecials);
Output:
HelloHelloHello
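One thing to watch in a JSP: request.getParameter returns null when the field was not submitted (for example on the first page load), and calling replaceAll on null throws a NullPointerException. A small null-safe wrapper around the same regex might look like this; the class and method names are hypothetical.

```java
// Hypothetical helper, for illustration only.
final class LettersOnly {

    // Keep letters only; treat a missing (null) parameter as empty text.
    static String lettersOnly(String input) {
        if (input == null) {
            return "";
        }
        return input.replaceAll("[^\\p{Alpha}]+", "");
    }

    public static void main(String[] args) {
        System.out.println(lettersOnly("\"hello\""));    // hello
        System.out.println(lettersOnly("'hello'"));      // hello
        System.out.println(lettersOnly(null).isEmpty()); // true
    }
}
```

Note that \p{Alpha} matches ASCII letters only by default, so digits, spaces and accented letters are removed as well.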
You are not using full Unicode but Latin-1 (ISO-8859-1). Browsers interpret Latin-1 as MS Windows Latin-1, also known as "Cp1252" or "Windows-1252". This charset adds some special characters, such as comma-like (curly) quotes and € (euro).
URL encoding/decoding is done automatically. With a restricted charset like Latin-1, text typed into the input may arrive at the server as numeric HTML entities, like &#1234;. To use UTF-8 and cover all Unicode characters, add accept-charset to the form, <form accept-charset="UTF-8">, to prevent substitution by numeric entities.
An HTML5 form:
<%@page language="java" contentType="text/html; charset=Windows-1252"
pageEncoding="Windows-1252"
import="java.net.URLDecoder"
import="java.net.URLEncoder"
%><!DOCTYPE html>
<html>
<head>
<title>First Input</title>
<meta charset="ISO-8859-1">
</head>
<body>
<form action="index.jsp">
First INPUT:
<input name="firstinput" type="text"
value="${param.firstinput}">
<br>
<input type="submit" value="Submit">
<%
String first = request.getParameter("firstinput");
String searchtext = first;
out.println(searchtext);
%>
</form>
</body>
</html>
The page claims the limited ISO-8859-1 charset, but Java actually delivers the larger Windows-1252 charset.
The <form> tag must be inside the <body>. If you placed it outside for form margins and such, use CSS styles instead.
