I need to parse a page with jsoup. The page has elements with tags such as div, h3 and a. I want to go through the elements and select the a element (i.e. the title) of each one, to be displayed in a jList.
As an example, the page looks like:
<div class="start">
<div class="g">
<div class="abc">
<a class="picture" href="www.img.com"><img src="img" alt="image1"></a>
<div class="xyz">
<h3 class="_r">
<a class="title" href="www.example.com" onmousedown="return rwt(this,'','','','1','adf','','ahahh','','',event)">THIS IS <em>example</em>1</a>
</h3>
</div>
</div>
</div>
<div class="g">
<div class="abc">
<a class="picture" href="www.img.com"><img src="img" alt="image2"></a>
<div class="xyz">
<h3 class="_r">
<a class="title" href="www.example.com" onmousedown="return rwt(this,'','','','1','adf','','ahahh','','',event)">lead by this<em>example</em></a>
</h3>
</div>
</div>
</div>
<div class="g">
<div class="abc">
<a class="picture" href="www.img.com"><img src="img" alt="image3"></a>
<div class="xyz">
<h3 class="_r">
<a class="title" href="www.example.com" onmousedown="return rwt(this,'','','','1','adf','','ahahh','','',event)">showed<em>example</em>for the people</a>
</h3>
</div>
</div>
</div>
<div class="g">
<div class="abc">
<a class="picture" href="www.img.com"><img src="img" alt="image4"></a>
<div class="xyz">
<h3 class="_r">
<a class="title" href="www.example.com" onmousedown="return rwt(this,'','','','1','adf','','ahahh','','',event)">we set<em>example</em>for people</a>
</h3>
</div>
</div>
</div>
</div>
This is the code:
String url = "http://www.google.com/search?q=example&tbm=nws&source=lnms";
String title = "";
try {
Document doc = Jsoup.connect(url).userAgent("Chrome").timeout(5000).get();
Elements e = doc.select("div.g");
for (Element e1 : e) {
title = e1.getElementsByTag("a").text();
}
DefaultListModel<String> listModel = new DefaultListModel<>();
listModel.addElement(title);
jList.setModel(listModel);
} catch (IOException ex) {
Logger.getLogger(MainUI.class.getName()).log(Level.SEVERE, null, ex);
}
The output that I got was the title of the last element div.g:
we set example for people
I want to select the title from each div.g and display each title as a separate item in the jList, like this:
THIS IS example 1
lead by this example
showed example for the people
we set example for people
Currently you assign the scraped data to title inside the loop and only add title to the list model after the loop. So, by the time the loop has completed, title holds just the last value.
Replace this ...
for (Element e1 : e) {
title = e1.getElementsByTag("a").text();
}
DefaultListModel<String> listModel = new DefaultListModel<>();
listModel.addElement(title);
With this ...
DefaultListModel<String> listModel = new DefaultListModel<>();
for (Element e1 : e) {
listModel.addElement(e1.getElementsByTag("a").text());
}
You don't actually add title each time. On every iteration the loop replaces title with the newly found value, and only after the loop do you add it to the list. Something like this might work the way you want:
DefaultListModel<String> listModel = new DefaultListModel<>();
for (Element e1 : e) {
listModel.addElement(e1.getElementsByTag("a").text());
}
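To see the effect in isolation, here is a minimal sketch that leaves jsoup out and uses a hard-coded list of strings as a stand-in for the scraped titles. The class name TitleListDemo and the helper buildModel are made up for illustration. Because addElement is called inside the loop, the model ends up with one item per title:

```java
import java.util.List;
import javax.swing.DefaultListModel;

final class TitleListDemo {

    // Adds every title to the model inside the loop, so none is overwritten.
    static DefaultListModel<String> buildModel(List<String> titles) {
        DefaultListModel<String> listModel = new DefaultListModel<>();
        for (String title : titles) {
            listModel.addElement(title);
        }
        return listModel;
    }

    public static void main(String[] args) {
        // Stand-in for the text of the title link in each div.g (assumption).
        List<String> titles = List.of(
                "THIS IS example 1",
                "lead by this example",
                "showed example for the people",
                "we set example for people");
        DefaultListModel<String> model = buildModel(titles);
        System.out.println(model.getSize() + " items, first: " + model.get(0));
    }
}
```

In the real code the loop body would presumably stay listModel.addElement(e1.getElementsByTag("a").text()), followed by jList.setModel(listModel) after the loop.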
I have a button with a drop-down list.
When I click the button, it opens the list and I can choose an item.
Below is the HTML:
<button id="btn-append-to-body" class="btn btn-primary mobile-quick-button dropdown-toggle" type="button" uib-dropdown-toggle="" aria-haspopup="true" aria-expanded="false">
<div class="clearfix">
<span class="pull-left text-left ng-binding" tabindex="0"> Select one </span>
<span class="pull-right text-right ng-binding">
</div>
</button>
<ul class="uib-dropdown-menu dropdown-menu" role="menu" aria-labelledby="btn-append-to-body">
<!-- ngRepeat: option in dropOptions -->
<li id="quickOption" class="ng-scope" role="presentation" name="quickOption" ng-repeat="option in dropOptions" ng-click="selectOption(option)" required="" tabindex="0" style="">
<a href="">
<div class="clearfix">
<span class="pull-left ng-binding">frame number</span>
</div>
</a>
</li>
<!-- end ngRepeat: option in dropOptions -->
<li id="quickOption" class="ng-scope" role="presentation" name="quickOption" ng-repeat="option in dropOptions" ng-click="selectOption(option)" required="" tabindex="0">
<a href="">
<div class="clearfix">
<span class="pull-left ng-binding">serial number</span>
</div>
</a>
</li>
</ul>
I want to choose any one of the items from this list, passing the item as a parameter:
public void lookupSearch (String item){
driver.findElement(By.xpath("//*[@id='btn-append-to-body']")).click();
//then I choose/click the item given by the parameter (i.e. frame number or serial number)
}
Please guide me on how I should choose the item.
To click the drop-down button and choose any one of the items from this list, you can use the following code block:
public void lookupSearch (String item)
{
driver.findElement(By.xpath("//button[@id='btn-append-to-body']/div/span[contains(.,'Select one')]")).click();
WebDriverWait wait4elements = new WebDriverWait(driver, 10);
List<WebElement> myElements = wait4elements.until(ExpectedConditions.numberOfElementsToBe(By.xpath("//ul[@class='uib-dropdown-menu dropdown-menu']/li/a/div/span"), 2));
for(WebElement elem:myElements)
if(elem.getAttribute("innerHTML").contains(item))
{
elem.click();
break;
}
System.out.println("Element with text as "+ item +" is selected");
}
Below is an answer that finds the li elements in the ul, then gets the text of each li and compares it with the item string:
WebElement uList = driver.findElement(By.xpath("//*[@id='quick-search-dropdown']/ul"));
List<WebElement> listCount = uList.findElements(By.tagName("li"));
for (int i = 1; i <= listCount.size(); i++) {
WebElement lookupItem = driver.findElement(By.xpath("(//li[@id='quickOption']/a/div/span[1])[" + i + "]"));
String lookupItemValue = lookupItem.getText();
if (lookupItemValue.equalsIgnoreCase(item)) {
lookupItem.click();
break; // stop after the first match
}
}
The input fields I need to grab are within the element with @id="contractorsWrapper".
In this example, there are 2 input fields within that wrapper (but this number is dynamic depending on the case), identified by @class="contactEntry".
What I'm trying to do is find out how many className=contactEntry fields there are within id=contractorsWrapper, and then be able to input text into them independently.
<div id="contractorsWrapper" class="contactInputAndInfoDisplays_wrapper">
<div id="contractorsRow_5d1532ba-b37e-4aac-85c2-4a5e6c6c2796" class="contactInputAndInfoDisplay">
<div class="contactName">
<div class="contactFlag"/>
<a class="smallRemove removeAContact" href="#"/>
<span class="littleGreyTitles">
Name
<br/>
</span>
<input class="contactEntry " type="text" value=""/>
</div>
<div class="descriptionInput littleGreyTitles">
Description
<br/>
<input type="text"/>
</div>
<a class="contactLink" href="#" style="display: none;"/>
</div>
<div class="spacerDiv1"/>
<div id="contractorsRow_5fc58f1a-906f-4239-93ae-b0a2e4b8b70c" class="contactInputAndInfoDisplay">
<div class="contactName">
<div class="contactFlag"/>
<a class="smallRemove removeAContact" href="#"/>
<span class="littleGreyTitles">
Name
<br/>
</span>
<input class="contactEntry " type="text" value=""/>
</div>
<div class="descriptionInput littleGreyTitles">
Description
<br/>
<input type="text"/>
</div>
<a class="contactLink" href="#" style="display: none;"/>
</div>
<div class="spacerDiv1"/>
</div>
Find your wrapper:
WebElement wrapperElement = driver.findElement(By.id("contractorsWrapper"));
Number of input elements:
wrapperElement.findElements(By.className("contactEntry")).size();
I don't know what you mean by "input text into them independently", but here's how you could enter the same thing in all of them:
for (WebElement element : wrapperElement.findElements(By.className("contactEntry"))) {
element.sendKeys("keysToSend");
}
update
after more details from OP
If you want to insert "unique" strings into each element, you can use an ArrayList:
// create as many list entries as you need
List<String> namesList = new ArrayList<String>();
namesList.add("John Doe");
namesList.add("Jane Doe");
...
// then
int count = 0;
for (WebElement element : wrapperElement.findElements(By.className("contactEntry"))) {
element.sendKeys(namesList.get(count++));
}
Of course, you would then need to make sure that your list always has at least as many entries as there are input elements.
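A small sketch of that bounds check, kept free of Selenium so it runs on its own: StringBuilder objects stand in for the input elements, and the helper name fillEach is hypothetical. It stops at whichever list is shorter, so a missing name cannot cause an IndexOutOfBoundsException:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

final class FillInputsDemo {

    // Sends values.get(i) to targets.get(i); returns how many targets were filled.
    static <T> int fillEach(List<T> targets, List<String> values, BiConsumer<T, String> sender) {
        int count = Math.min(targets.size(), values.size());
        for (int i = 0; i < count; i++) {
            sender.accept(targets.get(i), values.get(i));
        }
        return count;
    }

    public static void main(String[] args) {
        // Three fake "inputs" but only two names (stand-ins, not WebElements).
        List<StringBuilder> inputs = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            inputs.add(new StringBuilder());
        }
        List<String> names = List.of("John Doe", "Jane Doe");
        int filled = fillEach(inputs, names, (input, text) -> input.append(text));
        System.out.println("filled " + filled + " of " + inputs.size());
    }
}
```

With Selenium, the same helper could presumably be called with the list from findElements and a lambda such as (element, text) -> element.sendKeys(text).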
I am trying to display a list of the members of each group using an accordion. I am using JSP, EL and servlets for this purpose.
Here are the two database tables, to help understand the problem:
Above are the two tables, groups and group_members respectively. I read the data from them, store it in two kinds of Java objects, and build a list of each.
My servlet for this is as follows:
//current user's user-name
String currentUser = request.getParameter("username");
String group_name=null;
String GetGroupInfo = "select * from groups where creator_username=?";
//get all the groups where the creator's user-name is currentUser
try {
List<Group> groups = new ArrayList<Group>();
List<GroupDetails> groupList = new ArrayList<GroupDetails>();
ps3 = currentCon.prepareStatement(GetGroupInfo);
ps3.setString(1,currentUser);
rs3 = ps3.executeQuery();
//set values for Group object
while(rs3.next())
{
Group groupObj = new Group();
groupObj.setGroup_id(rs3.getString("group_id"));
groupObj.setGroup_name(rs3.getString("group_name"));
groupObj.setGroup_description(rs3.getString("group_description"));
groupObj.setCreator_username(rs3.getString("creator_username"));
groupObj.setCreated_on(rs3.getString("created_on"));
groups.add(groupObj);
String query = "select * from group_members where creator_username=? and group_id=?";
ps = currentCon.prepareStatement(query);
ps.setString(1,currentUser);
ps.setString(2, groupObj.getGroup_id());
rs = ps.executeQuery();
while(rs.next())
{
GroupDetails groupInfo = new GroupDetails();
groupInfo.setIs_admin(rs.getString("is_admin"));
groupInfo.setAdded_on(rs.getString("added_on"));
groupInfo.setCreator_username(rs.getString("creator_username"));
groupInfo.setGroup_name(rs.getString("group_name"));
groupInfo.setGroup_id(rs.getString("group_id"));
groupInfo.setUser_name(rs.getString("user_name"));
groupInfo.setMember_id(rs.getString("member_id"));
String memberFullname = "select firstname,lastname,user_type from users where username = ?";
ps2 = currentCon.prepareStatement(memberFullname);
// bind the user name as a parameter instead of concatenating it into the SQL
ps2.setString(1, groupInfo.getUser_name());
rs2 = ps2.executeQuery();
if(rs2.next())
{
String member_fullname = rs2.getString("firstname") + " " + rs2.getString("lastname");
groupInfo.setMember_name(member_fullname);
groupInfo.setMember_usertype(rs2.getString("user_type"));
}
group_name = groupInfo.getGroup_name();
groupList.add(groupInfo);
}
}
//request group name
request.setAttribute("group_name",group_name);
request.setAttribute("individual_group", groups);
request.setAttribute("groupList", groupList);
request.setAttribute("currentUser", currentUser);
RequestDispatcher rd = request.getRequestDispatcher("/ViewAllExistingGroups");
rd.forward(request, response);
} catch (SQLException e) {
e.printStackTrace();
}
Jsp page to display the groups in accordion style - group name is displayed in the accordion title and members in the accordion content.
<body>
<% String username = (String)request.getAttribute("currentUser"); %>
<div class="container">
<div class="panel panel-default">
<div class="panel panel-heading">
</div>
<div class="panel panel-body">
<div id="accordion">
<c:forEach var="group1" items="${individual_group}">
<h5> ${group1.group_name}</h5>
<div>
<ul class="list-group">
<li class="list-group-item title">
<strong style="display:inline;"> About: <h5 style="display: inline;"> ${group1.group_description}</h5></strong>
</li>
<c:forEach var="group" items="${groupList}">
<li class="list-group-item title">
<img src="${pageContext.request.contextPath}/css/images/user.png" class="img img-circle" style="display: inline" />
<strong style="display:inline;"> ${group.member_name} <h5 style="display: inline;">(${group.member_usertype})</h5></strong>
</li>
</c:forEach>
</ul>
</div>
</c:forEach>
</div>
<br/>
<form action="ListConnectedUsers" method="get">
<input type="hidden" name="username" value="<%=username %>" />
<input type="submit" class="btn btn-default pull-left" value="Back"/>
</form>
</div>
<div class="panel panel-footer">
</div>
</div>
</div>
</body>
This is what I am getting in my JSP page, but all the members are displayed in every accordion panel.
Each accordion section shows all the members from the group_members table.
I want it to display three members in 'BEIT classroom', three members in '3 BUDDIES' and one member in 'SUDD GROUP'.
I would like help with the JSP EL, or with any condition that is needed, to present the data in the manner I explained.
Sorry for such a long question, but I tried to explain it properly. Please help.
I think I could create an object that bundles the group name with the full names of the group's members.
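A rough sketch of that bundling idea, using plain Java stand-ins rather than the real servlet classes: the Member class and the groupMembers helper are hypothetical, and the member names are placeholders. Members are collected per group in a map, so each accordion section would only iterate over its own list:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupBundleDemo {

    // Hypothetical stand-in for one group_members row joined with the user's name.
    static final class Member {
        final String groupName;
        final String fullName;

        Member(String groupName, String fullName) {
            this.groupName = groupName;
            this.fullName = fullName;
        }
    }

    // Buckets member names by group name, keeping first-seen group order.
    static Map<String, List<String>> groupMembers(List<Member> rows) {
        Map<String, List<String>> byGroup = new LinkedHashMap<>();
        for (Member row : rows) {
            byGroup.computeIfAbsent(row.groupName, k -> new ArrayList<>()).add(row.fullName);
        }
        return byGroup;
    }

    public static void main(String[] args) {
        List<Member> rows = List.of(
                new Member("BEIT classroom", "Member A"),
                new Member("3 BUDDIES", "Member B"),
                new Member("BEIT classroom", "Member C"));
        groupMembers(rows).forEach((group, members) ->
                System.out.println(group + " -> " + members));
    }
}
```

If such a map were set as a request attribute, the outer c:forEach could iterate over its entries (using the entry's key for the accordion title) and the inner c:forEach over the entry's value, instead of over the full groupList.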
I am new to Thymeleaf and am trying to iterate over values using the th:each attribute, but I get the wrong output. I am using <div> elements instead of a table. When Thymeleaf renders the page, the values of all objects overwrite the first row's values and the remaining rows are shown empty. Following is my code:
My Spring MVC controller code:
ProductCategory category = new ProductCategory();
category.setId(BigInteger.valueOf(558711));
category.setTitle("Category 1");
category.setStatus(FinEnum.STATUS.IN_ACTIVE.getStatus());
ProductCategory category2 = new ProductCategory();
category.setId(BigInteger.valueOf(558722));
category.setTitle("Category 2");
category.setStatus(FinEnum.STATUS.ACTIVE.getStatus());
List<ProductCategory> categories = new ArrayList<ProductCategory>();
categories.add(category);
categories.add(category2);
model.addAttribute("categories", categories);
return "admin/product/view-categories";
My thymeleaf code:
<div class="row-area" th:each="category: ${categories}">
<div class="column2 tariff-date" style="width: 15%;"><span th:text="${category.id}">Dummy Data</span></div>
<div class="column2 tariff-date" style="width: 15%;"><span th:text="${category.title}">Dummy Data</span></div>
<div class="column2 tariff-date" style="width: 13%;"><span th:text="${category.status}">Dummy Data</span></div>
<div class="column5 icons middle-area" style="margin-left: 7px; width: 40%;">
<a class="icon7" href="javascript:void(0)" style="width: 140px;">View Sub Category</a>
<a class="icon2" href="javascript:void(0)"><p>Edit</p></a>
<div th:switch="${category.status}" style="margin-left: 195px;">
<a class="icon8" href="javascript:void(0)" th:case="'Inactive'" style="width: 88px;">Deactivate</a>
<a class="icon9" href="javascript:void(0)" th:case="'Active'">Active</a>
</div>
<a class="icon14" href="javascript:void(0)" style="width: 60px;"><p>Delete</p></a>
</div>
</div>
My Output is:
The problem has nothing to do with Thymeleaf, it's just a simple typo. After the line:
ProductCategory category2 = new ProductCategory();
you are still modifying (overwriting) the category object instead of category2. Therefore, the properties of category2 never got set. Corrected code should be:
category2.setId(BigInteger.valueOf(558722));
category2.setTitle("Category 2");
category2.setStatus(FinEnum.STATUS.ACTIVE.getStatus());
I tested this locally and saw two "rows" of data after the fix.
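One way to make this kind of copy-paste slip harder is to build each object in a small factory method, so that no variable name has to be repeated for the second object. A sketch with a stripped-down stand-in class (this ProductCategory, its public fields, the newCategory helper and the status strings are assumptions, not the real entity):

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

final class CategoryFactoryDemo {

    // Simplified stand-in for the real ProductCategory entity.
    static final class ProductCategory {
        BigInteger id;
        String title;
        String status;
    }

    // Sets all fields on a fresh object, so nothing can land on another instance.
    static ProductCategory newCategory(long id, String title, String status) {
        ProductCategory category = new ProductCategory();
        category.id = BigInteger.valueOf(id);
        category.title = title;
        category.status = status;
        return category;
    }

    public static void main(String[] args) {
        List<ProductCategory> categories = new ArrayList<>();
        categories.add(newCategory(558711, "Category 1", "Inactive"));
        categories.add(newCategory(558722, "Category 2", "Active"));
        for (ProductCategory c : categories) {
            System.out.println(c.id + " " + c.title + " " + c.status);
        }
    }
}
```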
I want to parse the data (CompanyName, Location, jobDescription, ...) out of this HTML using jsoup (Java). I get stuck when trying to iterate over the job listings.
The HTML extract below is one of many "JOBLISTING" divs that I want to iterate over to extract the data. I just can't figure out how to iterate over the specific div objects. Sorry for this beginner question, but maybe someone who already knows which function to use can help me. select?
<div class="between_listings"><!-- local.spacer --></div>
<div id="joblisting-2944914" class="joblisting listing-even listing-even company-98028 " itemscope itemtype="http://schema.org/JobPosting">
<div class="company_logo" itemprop="hiringOrganization" itemscope itemtype="http://schema.org/Organization">
<a href="/stellenangebote-des-unternehmens--Delivery-Hero-Holding-GmbH--98028.html" title="Jobs Delivery Hero Holding GmbH" itemprop="url">
<img src="/upload_de/logo/D/logoDelivery-Hero-Holding-GmbH-98028DE.gif" alt="Logo Delivery Hero Holding GmbH" itemprop="image" width="160" height="80" />
</a>
</div>
<div class="job_info">
<div class="h3 job_title">
<a id="jobtitle-2944914" href="/stellenangebote--Junior-Business-Intelligence-Analyst-CRM-m-f-Berlin-Delivery-Hero-Holding-GmbH--2944914-inline.html?ssaPOP=204&ssaPOR=203" title="Arbeiten bei Delivery Hero Holding GmbH" itemprop="url">
<span itemprop="title">Junior Business Intelligence Analyst / CRM (m/f)</span>
</a>
</div>
<div class="h3 company_name" itemprop="hiringOrganization" itemscope itemtype="http://schema.org/Organization">
<span itemprop="name">Delivery Hero Holding GmbH</span>
</div>
</div>
<div class="job_location_date">
<div class="job_location target-location">
<div class="job_location_info" itemprop="jobLocation" itemscope itemtype="http://schema.org/Place">
<div class="h3 locality" itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
<span itemprop="addressLocality"> Berlin</span>
</div>
<span class="location_actions">
<a href="javaScript:PopUp('http://www.stepstone.de/5/standort.html?OfferId=2944914&ssaPOP=203&ssaPOR=203','resultList',800,520,1)" class="action_showlistingonmap showlabel" title="Google Maps" itemprop="maps">
<span class="location-icon"><!-- --></span>
<span class="location-label">Google Maps</span>
</a>
</span>
</div>
</div>
<div class="job_date_added" itemprop="datePosted"><time datetime="2014-07-04">04.07.14</time></div>
</div>
<div class="job_actions">
</div>
</div>
<div class="between_listings"><!-- local.spacer --></div>
Java code:
File input = new File("C:/Talend/workspace/WEBCRAWLER/output/keywords_SOA.txt");
// Load file into extraction1
Document ParseResult = Jsoup.parse(input, "UTF-8", "http://example.com/");
Elements jobListingElements = ParseResult.select(".joblisting");
for (Element jobListingElement: jobListingElements) {
jobListingElement.select(".companyName span[itemprop=\"name\"]");
// other element properties
System.out.println(jobListingElements);
}
Thank you!
So you already have your jsoup document, right? Then it seems pretty easy, provided the CSS class joblisting does not appear anywhere else.
Document document = Jsoup.parse(new File("d:/bla.html"), "utf-8");
Elements elements = document.select(".joblisting");
for (Element element : elements) {
Elements jobTitleElement = element.select(".job_title span");
Elements companyNameElement = element.select(".company_name span[itemprop=name]");
String companyName = companyNameElement.text();
String jobTitle = jobTitleElement.text();
System.out.println(companyName);
System.out.println(jobTitle);
}
At first I could not get the attribute selector [itemprop*=\"name\"] to find the span (further reading: http://jsoup.org/cookbook/extracting-data/selector-syntax ).
Got it: span[itemprop=name], without any quotes or escapes, works. Other attributes or values should work the same way to get a more specific selection.