I need to create a method that reads an HTML file and then displays the number of occurrences of each word.
For example: String [] words = {"happy", "nice", "good"};
The word happy was used 7 times.
The word nice was used 1 times.
The word good was used 2 times.
This is what I did:
public static void ReadWriteDisplay() {
Path in = Paths.get("E:\\TextToHTML.html");
Path out = Paths.get("E:\\HTMLToText.txt");
String s = "";
String str = "";
try {
InputStream input = new BufferedInputStream(Files.newInputStream(in));
BufferedReader reader = new BufferedReader(new InputStreamReader(input));
OutputStream output = new BufferedOutputStream(Files.newOutputStream(out, CREATE, WRITE, TRUNCATE_EXISTING));
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(output));
s = reader.readLine();
while(s != null) {
str += s;
writer.write(s);
writer.newLine();
s = reader.readLine();
}
reader.close();
writer.close();
String a[] = str.split(" ");
System.out.println("str: "+str);
String [] positive = {"happy", "nice", "good", "joy", "love"};
int [] count = {0, 0, 0, 0, 0};
for (int i = 0; i < a.length; i++) {
if(positive[0].equalsIgnoreCase(a[i]))
count[0]++;
if(positive[1].equalsIgnoreCase(a[i]))
count[1]++;
if(positive[2].equalsIgnoreCase(a[i]))
count[2]++;
if(positive[3].equalsIgnoreCase(a[i]))
count[3]++;
if(positive[4].equalsIgnoreCase(a[i]))
count[4]++;
}
for (int x = 0; x < 5; x++) {
System.out.println("The word "+positive[x]+" was used "+count[x]+" times.");
}
} catch(Exception e) {
System.err.println("Message: "+ e);
}
}
My method runs, but it does not give an accurate number of occurrences. The reason is that some words in the HTML are enclosed in tags, so <>Hello<> is stored in my string array instead of the word Hello.
Here is the sample output:
str: <!DOCTYPE html><html lang="en"><head> <meta charset="utf-8"> <meta http-equiv="X-UA-Compatible" content="IE=edge"> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> <meta http-equiv="content-language" content="en" /> <meta name="viewport" content="width=device-width, initial-scale=1"> <meta name="google-site-verification" content="rUp8isOBygjhxPJ2qyy6QtBi9vWRFhIboMXucJsCtrE" /> <title>JustPaste.it - Share Text & Images the Easy Way</title> <link rel="preload" href="/static/img/jp_logo_1_en_v4.png" as="image" /> <meta name="robots" content="noindex, nofollow" /> <meta name="googlebot" content="noindex, nofollow" /> <link rel="preload" href="/build/global.395f53d0.css" as="style" /> <link rel="stylesheet" type="text/css" href="/build/global.395f53d0.css" /> <link rel="shortcut icon" href="/static/other/fav.ico" /> <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries --> <!-- WARNING: Respond.js doesn't work if you view the page via file:// --> <!--[if lt IE 9]> <script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script> <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script> <![endif]--> <script> window.article = 
{"id":42017684,"url":"https:\/\/justpaste.it\/6fn9m","shortUrl":"https:\/\/jpst.it\/2wiek","pdfUrl":"https:\/\/justpaste.it\/6fn9m\/pdf","qrCodeData":"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAFcAAABXCAIAAAD+qk47AAAACXBIWXMAAA7EAAAOxAGVKw4bAAACCklEQVR4nO2by27DMAwEx0X\/\/5fTAwFdaNB8SEmB7BzjSDEWy4ikpOv1evH1\/Hz6Bf4FUgGkgiEVQCoYv\/6j67omM65FJzOPX6HWKD9PaebSj8oLIBWMm4hYlBIq79Jg+Pqyd3vpR4dvuJAXQCoYUUQsAi9lPOlt74dnloZzbygvgFQwUhExpJft9EKjh7wAUsF4R0QE+Bh5g\/898gJIBSMVEUNzDjOiDMN55AWQCkYUEcOWTqlrtL18KCEvgFQwbiJie7qSMXkpELa\/obwAUsFI7UcEpXHw397bmMh0cXtJVzBKXgCpYFyB3xYlT\/Ye3bzZ7q264EflBZAKRmqHLmPyYJR\/5IeXEqrt8SgvgFQwojoiY9feEpN5VCLo4maQF0AqGLVzTcM\/50UpEdpVj+sUxwNSAao7dJk6erHrhN65umYhL4BUMGoRUTJ56TsBw\/UoM0peAKlg1CrrRamgLnEu6VLW9IBUgLj7Ouz\/DJePHr16RF4AqWA096yDc92lCXs3hjzDyJIXQCoYB+\/Q9Q4vDS9cBPOojnhAKsDRO3R+nl3dp94uhrKmB6QCHL1Dlznp1GsWbUdeAKlgvOPGUK8juqt5mymx5QWQCsbBiCglS5+9KCEvgFQwDt6hO3djdHtfV14AqWAcvEO36B1M6mVNvQpFXgCpYNzs0H0h8gJIBUMqgFQwpALAH\/JvmLtnlWjnAAAAAElFTkSuQmCC"}; window.statsUrl = 'https\u003A\/\/stats.justpaste.it'; window.viewKey = 'x6ER'; window.barOptions = 
{"isLoggedIn":false,"hasPublicProfile":false,"displayOwnership":false,"isArticleOwner":false,"isPasswordProtected":false,"isCaptchaRequired":null,"isCaptchaEntered":false,"captchaSettings":null,"premiumUserData":null,"isPrivate":false,"isExpired":false,"expireAfterRead":false,"isShared":false,"defaultAvatar":"\/static\/img\/avatar60.jpg","createdText":"6h","showLastEdit":false,"modifiedText":"6h","isInTrash":false,"viewsText":"2","favouritesCount":0,"onlineText":"1","getFavouriteArticleUrl":"https:\/\/justpaste.it\/api\/account\/v1\/favourite-article\/42017684","addFavouriteArticleUrl":"https:\/\/justpaste.it\/api\/account\/v1\/favourite-article","removeFavouriteArticleUrl":"https:\/\/justpaste.it\/api\/account\/v1\/favourite-article-delete\/42017684","apiShowArticleDynamicUrl":"\/api\/v1\/article-dynamic","voteUrl":"\/api\/account\/v1\/vote","contentLang":"en","positiveVotes":0,"negativeVotes":0,"currentVote":"empty","linkSharingUrl":null,"linkSharingSecret":null}; </script> <script src="/build/runtime.a1e5a72a.js" async></script> <script src="/build/1676.2c557867.js" async></script> <script src="/build/8452.a9a1e0c5.js" async></script> <script src="/build/5936.ad26e56d.js" async></script> <script src="/build/9412.4a605741.js" async></script> <script src="/build/showarticlewidget.3bbca334.js" async></script> </head><body marginwidth="0" dir="ltr" marginheight="0"><!-- Static navbar --><div class="navbar navbar-default navbar-static-top mainTableTopMiddle" role="navigation"> <div class="container"> <div class="navbar-header pull-left"> <img src="/static/img/jp_logo_1_en_v4.png" width="186px" height="54px" alt="JustPaste.it" /> </div> <div class="navbar-header pull-left"> <div class="nav navbar-nav mainTableTopMiddleRight hidden-xs hidden-sm"> <img src="/static/img/jp_logo_2_en_v5.png" width="390px" height="54px" /> </div> </div> <div class="navbar-header pull-right" style="padding-top:8px"> <div id="mainPanelButtons"></div> </div> </div><!--/.nav-collapse 
--></div><div id="headContainer" class="container" style="max-width: 960px"> <div class="row"> <div class="col-md-12"> <div id="mainTableContent"> <div style="max-width: 960px; vertical-align: top"> <div id="showArticleWidget"><div class="showArticleWidgetPlaceholder"></div></div> <div id="articleContent"> <p>happy</p> <p>nice nice</p> <p>good good good</p> <p>joy Joy joy Joy joy</p> <p>Love love Love love Love</p> </div> <div id="showArticleBottomWidget"><div class="articleBottomWidgetPlaceholder"></div></div> <span style="visibility:hidden" class="glyphicon glyphicon-link"></span></div> </div> </div> </div> <!-- /row --></div> <!-- /container --><div id="footer" style="min-height: 30px;"> <div class="container" style="vertical-align: middle"> <div class="col-md-3 col-xs-5 col-sm-4 text-muted" style="font-size: 95%;" align="left"> © 2021 <span class="hidden-xs">justpaste.it</span> </div> <div class="col-md-9 col-xs-7 col-sm-8 text-muted" align="right"> <ul class="list-inline basePageFooterList"> <li class="hidden-xs"> Account </li> <li class="hidden-xs"> Terms </li> <li class="hidden-xs"> Privacy </li> <li class="hidden-xs"> Cookies </li> <li> Blog </li> <li> About </li> </ul> </div> </div></div> <script> window.mainPanelOptions = { addArticleUrl: '/', loginUrl: '/login', logoutUrl: '/logout', favouriteArticlesUrl: '/account/favourite', subscribedArticlesUrl: '/account/subscribed', sharedArticlesUrl: '/account/shared', manageAccountUrl: '/account/manage', messagesUrl: '/account/messages', articlesStatsUrl: '/account/articles-stats', premiumUrl: '/premium/subscription', unreadMessagesUrl: 'https://msg.justpaste.it/api/v1/conversation/unread', profileSettings: '/account/settings', isLoggedIn: false, userEmail: null, userPermalink: null, userProfileIsPublic: false, userProfileLink: null }; </script> <script src="/build/mainpanelwidget.80530742.js" async></script> </body></html>
The word happy was used 0 times.
The word nice was used 0 times.
The word good was used 1 times.
The word joy was used 3 times.
The word love was used 3 times.
How do I properly split the text or count the number of occurrences? Thank you!
You can use jsoup, a Java HTML parser library, to extract all the text from the HTML structure.
Download the jar file from: https://jsoup.org/download
The code below counts the occurrences of each word:
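import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Prints how often each tracked word occurs in the visible text of the page.
static void countOccurance(String htmlStructure) {
    String[] positive = { "happy", "nice", "good", "joy", "love" };
    // jsoup parses the markup; body().text() returns only the text, without tags.
    Document document = Jsoup.parse(htmlStructure);
    String[] text = document.body().text().split("\\s+");
    for (String word : positive) {
        int wordCount = countWord(text, word);
        System.out.println("The word " + word + " was used " + wordCount + " times.");
    }
}

// Counts case-insensitive exact matches of wordToFind among the tokens.
static int countWord(String[] documentText, String wordToFind) {
    int count = 0;
    for (int i = 0; i < documentText.length; i++) {
        if (wordToFind.equalsIgnoreCase(documentText[i]))
            count++;
    }
    return count;
}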
This removes special characters and keeps only letters; for example, <>Hello<> becomes Hello. Note that it also removes spaces and digits, so apply it to each token rather than to the whole text:
String alphaOnly = input.replaceAll("[^a-zA-Z]+","");
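The counting step on its own can be sketched as follows, assuming the visible text has already been extracted (for example with jsoup's document.body().text() as shown above). The class name WordCounter and its count method are hypothetical, not part of jsoup. The sketch lowercases the text, splits on anything that is not an ASCII letter so punctuation and stray angle brackets never end up inside a token, and counts all tracked words in a single pass:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of jsoup: counts tracked words in plain text.
class WordCounter {

    // Returns a map from each tracked word (lowercased) to its number of
    // occurrences, in the same order as the tracked list.
    static Map<String, Integer> count(String text, List<String> tracked) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : tracked) {
            counts.put(word.toLowerCase(Locale.ROOT), 0);
        }
        // Split on runs of non-letters (ASCII only), so "<p>happy</p>"
        // yields the tokens "", "p", "happy", "p".
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (counts.containsKey(token)) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "happy nice nice good good good joy Joy joy Joy joy Love love Love love Love";
        Map<String, Integer> counts =
                count(text, List.of("happy", "nice", "good", "joy", "love"));
        counts.forEach((word, n) ->
                System.out.println("The word " + word + " was used " + n + " times."));
    }
}
```

Because the split pattern only knows ASCII letters, words with accents or other non-ASCII characters would be broken apart; a Unicode-aware pattern would be needed for those.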
I'm using the Java SWT Browser widget for a single-page Java application. In a regular browser the page displays the PDF correctly, but in the SWT Browser the PDFs are not shown properly.
I'm using the <object/> tag to display the PDFs.
$(document).on('click', 'input[type=button]', function(){
var datas = '<object data="files/Documents/'+$(this).attr('class')+'#page=1&zoom=130" type="application/pdf" width="100%" height="100%"></object>';
$('#custom-size-dialogBox').dialogBox({
width: screen.width-100,
height: screen.height-100,
hasMask: true,
title: 'View Details',
hasClose: true,
content: datas
});
});
If you can assume that a system PDF viewer is installed, you can use the Browser widget and set the following HTML:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
</head>
<body onResize="fit();">
<embed
type="application/pdf"
src="http://www.adobe.com/security/pdfs/riskcompliance_faq.pdf"
id="pdfDocument">
</embed>
<script type="text/javascript">
fit();
function fit() {
var myWidth = 0, myHeight = 0;
if( typeof( window.innerWidth ) == 'number' ) {
//Non-IE
myWidth = window.innerWidth;
myHeight = window.innerHeight;
} else if( document.documentElement && ( document.documentElement.clientWidth || document.documentElement.clientHeight ) ) {
//IE 6+ in 'standards compliant mode'
myWidth = document.documentElement.clientWidth;
myHeight = document.documentElement.clientHeight;
} else if( document.body && ( document.body.clientWidth || document.body.clientHeight ) ) {
//IE 4 compatible
myWidth = document.body.clientWidth;
myHeight = document.body.clientHeight;
}
document.getElementById('pdfDocument').width = myWidth;
document.getElementById('pdfDocument').height = myHeight;
}</script>
</body>
</html>
The src attribute of the embed tag must point to the desired PDF. For local files, use a file URL, for example: file://myPath/../test.pdf
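For local files, one way to avoid hand-writing the file URL is to derive it from a java.nio.file.Path. The following is only a sketch: the class PdfPageBuilder and its buildPage method are hypothetical names, and the page is reduced to the embed tag without the resize script shown above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: builds a minimal HTML page that embeds a local PDF.
class PdfPageBuilder {

    static String buildPage(Path pdf) {
        // Path.toUri() produces an absolute, properly escaped file: URI
        // (for example, spaces become %20).
        String src = pdf.toAbsolutePath().toUri().toString();
        return "<html><body style=\"margin:0\">"
                + "<embed type=\"application/pdf\" src=\"" + src + "\""
                + " id=\"pdfDocument\" width=\"100%\" height=\"100%\">"
                + "</body></html>";
    }

    public static void main(String[] args) {
        System.out.println(buildPage(Paths.get("test.pdf")));
    }
}
```

The returned string could then be handed to the SWT Browser widget with browser.setText(html). Whether the PDF actually renders still depends on the platform's browser engine and the installed PDF viewer.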
I am reading mails with JavaMail (javax.mail).
I want to save the content of a message.
For example, I read a mail whose only content is By: Test.
I read the content with the .getContent() method:
Object body = message.getContent();
String content = ((body instanceof String) ? (String) body : "NO STRING CONTENT");
But the problem is that instead of the simple content By: Test, I get the whole Outlook source code of the message:
<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.E-MailFormatvorlage17
{mso-style-type:personal-compose;
font-family:"Arial","sans-serif";
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";
mso-fareast-language:EN-US;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:70.85pt 70.85pt 2.0cm 70.85pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="DE-CH" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">By: Test<o:p></o:p></span></p>
</div>
</body>
</html>
So how can I read the mail content without getting the whole mail source code?
First, I would extract the content of the <body> section of the string. What you do next is up to you; you could, for example, remove every HTML tag. Beware, though, that all formatting (line breaks!) is then gone and you are left with one big chunk of text.
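A rough sketch of that approach using only regular expressions from the standard library is shown below. The class name HtmlBodyText is hypothetical. To keep some structure, it turns <br> and </p> into line breaks before stripping the remaining tags. It is a crude approximation rather than a real HTML parser: it does not decode entities such as &amp;amp; and can be confused by a > inside an attribute value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: crude regex-based HTML-to-text conversion.
class HtmlBodyText {

    // Captures everything between <body ...> and </body>, across lines.
    private static final Pattern BODY = Pattern.compile(
            "<body[^>]*>(.*?)</body>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static String extract(String html) {
        Matcher m = BODY.matcher(html);
        // Fall back to the whole input if there is no <body> element.
        String body = m.find() ? m.group(1) : html;
        return body
                .replaceAll("(?i)<br\\s*/?>|</p>", "\n") // keep line breaks
                .replaceAll("<[^>]+>", "")                 // drop remaining tags
                .replaceAll("[ \\t]+", " ")                // collapse runs of spaces
                .trim();
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p{margin:0}</style></head>"
                + "<body lang=\"DE-CH\"><div class=\"WordSection1\">"
                + "<p class=\"MsoNormal\"><span>By: Test<o:p></o:p></span></p>"
                + "</div></body></html>";
        System.out.println(extract(html));
    }
}
```

For mail that arrives in many different shapes, a proper HTML parser is likely to be more robust than these patterns.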
I just remembered a simpler and better way: take the text/plain part of the email.
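import java.io.IOException;
import javax.mail.MessagingException;
import javax.mail.Multipart;
import javax.mail.Part;

String content = getPlainText((Part)message);

// Returns the first text/plain part found, searching multiparts recursively,
// or null if the message has no plain-text part.
private String getPlainText(Part p) throws MessagingException, IOException {
    if (p.isMimeType("text/plain")) {
        return (String) p.getContent();
    } else if (p.isMimeType("multipart/*")) {
        Multipart mp = (Multipart) p.getContent();
        for (int i = 0; i < mp.getCount(); i++) {
            String s = getPlainText(mp.getBodyPart(i));
            if (s != null) return s;
        }
    }
    return null;
}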
I've created an autocomplete with the jQuery UI library and am trying to read the text box value in Java, but I get null instead of the value. Please help me get the value from the text box. This is the line that returns null: String query = (String)request.getParameter("country");
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<link rel="stylesheet" href="http://code.jquery.com/ui/1.10.3/themes/smoothness/jquery-ui.css" />
<script src="http://code.jquery.com/jquery-1.9.1.js"></script>
<script src="http://code.jquery.com/ui/1.10.3/jquery-ui.js"></script>
<style>
input {
font-size: 120%; }
</style>
</head>
<body>
<h3>Feature</h3>
<input type="text" id="country" name="country"/>
<script>
//$("#country").autocomplete("getdata.jsp");
$("#country").autocomplete({
source: "getdata.jsp",
minLength: 2,
select: function( event, ui ) {
log( ui.item ?
"Selected: " + ui.item.value + " aka " + ui.item.id :
"Nothing selected, input was " + this.value );
}
});
</script>
</body>
</html>
getdata.jsp
<%@page contentType="text/html" pageEncoding="UTF-8"%>
<%@page import="java.sql.*"%>
<%@page import="java.util.*"%>
<%
String query = (String)request.getParameter("country");
System.out.println("query"+query);
try{
String s[]=null;
Class.forName("oracle.jdbc.driver.OracleDriver");
Connection con =DriverManager.getConnection("XXXXX");
Statement st=con.createStatement();
ResultSet rs = st.executeQuery("select name from table1 where name like '"+query+"%'");
List li = new ArrayList();
while(rs.next())
{
li.add(rs.getString(1));
}
String[] str = new String[li.size()];
Iterator it = li.iterator();
int i = 0;
while(it.hasNext())
{
String p = (String)it.next();
str[i] = p;
i++;
}
//jQuery related start
int cnt=1;
for(int j=0;j<str.length;j++)
{
if(str[j].toUpperCase().startsWith(query.toUpperCase()))
{
out.print(str[j]+"\n");
if(cnt>=5)// 5=How many results have to show while we are typing(auto suggestions)
break;
cnt++;
}
}
//jQuery related end
rs.close();
st.close();
con.close();
}
catch(Exception e){
e.printStackTrace();
}
%>
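It's not a form, so the value of the input is not submitted under the name country, and getParameter("country") returns null. You can pass it in the source URL:
source: "getdata.jsp?country="+$("#country").val(),
Note that this expression is evaluated only once, when the widget is created. With a string source, jQuery UI also appends the typed text to the request as a term parameter, so request.getParameter("term") gives the current input.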
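Separately from the parameter name, a remote source in jQuery UI Autocomplete is expected to return JSON, whereas the JSP above prints newline-separated text. The filtering and capping logic of the JSP, with JSON output, might look like the following sketch. The class Suggestions and its toJson method are hypothetical, and the escaping covers only backslashes and double quotes, not control characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: filters names by prefix and renders a JSON array.
class Suggestions {

    // Returns at most `limit` names starting with `term` (case-insensitive),
    // formatted as a JSON array of strings.
    static String toJson(List<String> names, String term, int limit) {
        String prefix = term == null ? "" : term.toLowerCase(Locale.ROOT);
        List<String> quoted = new ArrayList<>();
        for (String name : names) {
            if (quoted.size() >= limit) {
                break;
            }
            if (name.toLowerCase(Locale.ROOT).startsWith(prefix)) {
                // Escape backslashes and quotes so the output stays valid JSON.
                quoted.add("\"" + name.replace("\\", "\\\\").replace("\"", "\\\"") + "\"");
            }
        }
        return "[" + String.join(",", quoted) + "]";
    }

    public static void main(String[] args) {
        List<String> names = List.of("India", "Indonesia", "Iran", "Italy");
        System.out.println(toJson(names, "in", 5));
    }
}
```

In the JSP, the result would be written with out.print(...) and the page content type set to application/json. Building the SQL with a PreparedStatement and a bound parameter instead of string concatenation would also avoid SQL injection through the typed text.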
I want to make an image viewer (for my website) like the one in Facebook (the old one). When the user clicks the next or back arrow, it should change the picture and the URL of the page.
This is an example of what I want (http://www.facebook.com/pages/Forest-Ville/307556775942281)
Most importantly, I want the page to reload on each click with a new URL, comment box, ads, etc. I do not want to use any cookies.
I am currently using this, but it's completely different from what I want.
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
</head>
<script language="JavaScript">
var NumberOfImages = 10
var img = new Array(NumberOfImages)
img[0] = "http://damnthisfunny.site40.net/1.jpg"
img[1] = "http://damnthisfunny.site40.net/2.jpg"
img[2] = "http://damnthisfunny.site40.net/3.jpg"
img[3] = "http://damnthisfunny.site40.net/4.jpg"
img[4] = "http://damnthisfunny.site40.net/5.jpg"
img[5] = "http://damnthisfunny.site40.net/6.jpg"
img[6] = "http://damnthisfunny.site40.net/7.jpg"
img[7] = "http://damnthisfunny.site40.net/8.jpg"
img[8] = "http://damnthisfunny.site40.net/9.jpg"
img[9] = "http://damnthisfunny.site40.net/10.jpg"
var imgNumber = 0
function NextImage()
{
imgNumber++
if (imgNumber == NumberOfImages)
imgNumber = 0
document.images["VCRImage"].src = img[imgNumber]
}
function PreviousImage()
{
imgNumber--
if (imgNumber < 0)
imgNumber = NumberOfImages - 1
document.images["VCRImage"].src = img[imgNumber]
}
</script>
<body>
<center>
<img name="VCRImage" src="http://damnthisfunny.site40.net/1.jpg" /></dr>
<br />
<a href="javascript:PreviousImage()">
<img border="0" src="left1.jpg" /></a>
<a href="javascript:NextImage()">
<img border="0" src="right1.jpg" /></a>
</center>
</body>
</html>
Any ideas?