Jsoup select command operation

I am new to Jsoup.
I am able to use the select command:
Elements media = doc.select("[src]");
Take this page: en.wikipedia.org/wiki/States_and_territories_of_India. From it I want only the names of the states of India, but the page contains other tables as well. When I call doc.select("area[title]"); I get the information from all the tables, so I am looking for a way to restrict select to one particular table.
If Jsoup cannot do this, can you please tell me how to achieve it another way?

Try something like this:
Element indiaTable = doc.select("table").get(2); // the India table is the third table on the page (index 2)
Elements myTds = indiaTable.select("td:eq(0)"); // the states are in the first column
// Alternatively, replace the two lines above with this single line:
// Elements myTds = doc.select("table.wikitable td:eq(0)");
for (Element td : myTds) {
    System.out.println(td.text());
}

Related

Finding exact match with equal sign with append function in SQL

sqlQueryString.append(" and upper(username) like upper(:searchString) ");
This code returns data like exampleusername, exampleusername1, exampleusername12...
I want it to return data with an exact match to the username that is being searched.
For example when I put in bobJacobs (an example username), I want it to return only bobJacobs records, not other records that may contain bobJacobs in them, for example samandbobJacobs24, bobJacobs23, etc.
I've tried:
sqlQueryString.append(" and upper(username) = upper(:searchString) ");
But it doesn't work. Any solutions?

How to read attributes out of multiple nested documents in MongoDB Java?

I need some help with a project I am planning. At this stage I am trying to learn how to use NoSQL databases in Java.
I've got a few nested documents that look like this:
[image: MongoDB nesting structure]
As you can see in the image, my innermost attributes are "model" and "construction".
Now I need to iterate through all the documents in my collection, whose key names are unknown because they are generated at runtime, when a user enters some information.
In the end I need to list them in a TreeView, keeping the structure they already have in the database.
What I've tried is getting the key sets from the documents, but I cannot get past the second layer of the structure. I am able to print the whole object in JSON format, but I cannot access specific attributes like "model" or "construction".
MongoCollection collection= mongoDatabase.getCollection("test");
MongoCursor<Document> cursor = collection.find().iterator();
for(String keys: document.keySet()) {
Document vehicles = (Document) document.getString(keys);
//System.out.println(keys);
//System.out.println(document.get(keys));
}
Document cars = (Document) vehicle.get("cars");
Document types = (Document) cars.get("coupes");
Document brands = (Document) types.get("Ford");
Document model = (Document) brands.get("Mustang GT");
Here I tried to get some properties by hardcoding the key names of the documents, but I can't seem to get any value that way either. It keeps telling me that it could not read from vehicle because it is null.
Most tutorials and forum posts somehow do not work for me. I don't know whether they use a different version of the MongoDB driver. Mine is MongoDB driver 3.12.7, if that helps.
I have been trying to get this working for days now and it is driving me crazy.
I hope someone out there is able to help me with this problem.

Here is one approach you can try, using the Document class's methods. The Document#getEmbedded method navigates a key path down to an embedded document (sub-document).
try (MongoCursor<Document> cursor = collection.find().iterator()) {
    while (cursor.hasNext()) {
        // Get a document
        Document doc = (Document) cursor.next();
        // Get the sub-document with the known key path "vehicles.cars.coupes"
        Document coupes = doc.getEmbedded(
                Arrays.asList("vehicles", "cars", "coupes"),
                Document.class);
        // For each of the sub-documents within "coupes", get the
        // dynamic keys and their values.
        for (Map.Entry<String, Object> coupe : coupes.entrySet()) {
            System.out.println(coupe.getKey()); // e.g., Mercedes
            // The dynamic sub-document for the dynamic key (e.g., Mercedes):
            // {"S-Class": {"model": "S-Class", "construction": "2011"}}
            Document coupeSubDoc = (Document) coupe.getValue();
            // Get the coupeSubDoc's keys and values
            coupeSubDoc.keySet().forEach(k -> {
                System.out.println("\t" + k); // e.g., S-Class
                System.out.println("\t\t" + "model" + " : " +
                        coupeSubDoc.getEmbedded(Arrays.asList(k, "model"), String.class));
                System.out.println("\t\t" + "construction" + " : " +
                        coupeSubDoc.getEmbedded(Arrays.asList(k, "construction"), String.class));
            });
        }
    }
}
The code above prints the following to the console:
Mercedes
    S-Class
        model : S-Class
        construction : 2011
Ford
    Mustang
        model : Mustang GT
        construction : 2015

I think this is not a complete answer to his question.
He says:
Now I need to iterate through all the documents in my collection, whose key names are unknown because they are generated at runtime, when a user enters some information.
@prasad_, your answer only covers his example with vehicles, cars and so on. I guess he needs a way to handle unknown key/value pairs. For example, in this case he only knows the keys vehicle, cars, coupe, Mercedes/Ford and their subkeys. If another user inserts new key/value pairs into the collection, he will have problems, because he can't navigate through the new document without looking into the database first.
I'm also interested in the solution, because I have never nested my key/value pairs and can't see the advantage of it. Am I wrong, or does it make the programming more difficult?
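
For keys that are not known in advance, one option is a recursive walk that treats every Map value as a sub-document and everything else as a leaf. The sketch below is only an illustration, not tested against the asker's database: it uses plain java.util.Map so that it runs without the driver, relying on the fact that org.bson.Document implements Map<String, Object>. The class name NestedKeyWalker, its methods and the sample data are made up for this example, and BSON arrays (List values) are simply printed as leaves.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the MongoDB driver.
class NestedKeyWalker {

    // Builds one tab character per nesting level.
    public static String indent(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append('\t');
        }
        return sb.toString();
    }

    // Visits every key without knowing its name in advance. A value that is
    // itself a Map is treated as a sub-document and walked recursively;
    // any other value is treated as a leaf and printed as "key : value".
    public static void walk(Map<String, Object> node, int depth, List<String> out) {
        for (Map.Entry<String, Object> entry : node.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof Map) {
                out.add(indent(depth) + entry.getKey());
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) value;
                walk(child, depth + 1, out);
            } else {
                out.add(indent(depth) + entry.getKey() + " : " + value);
            }
        }
    }

    // Convenience wrapper that returns the indented lines for a whole document.
    public static List<String> describe(Map<String, Object> root) {
        List<String> lines = new ArrayList<>();
        walk(root, 0, lines);
        return lines;
    }

    // Stand-in data shaped like the structure described in the question (assumed).
    public static Map<String, Object> sample() {
        Map<String, Object> mustang = new LinkedHashMap<>();
        mustang.put("model", "Mustang GT");
        mustang.put("construction", "2015");
        Map<String, Object> ford = new LinkedHashMap<>();
        ford.put("Mustang", mustang);
        Map<String, Object> coupes = new LinkedHashMap<>();
        coupes.put("Ford", ford);
        Map<String, Object> cars = new LinkedHashMap<>();
        cars.put("coupes", coupes);
        Map<String, Object> vehicles = new LinkedHashMap<>();
        vehicles.put("cars", cars);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("vehicles", vehicles);
        return root;
    }

    public static void main(String[] args) {
        describe(sample()).forEach(System.out::println);
    }
}
```

With the real driver, each Document returned by the cursor could probably be passed to describe directly. The same recursion could build TreeView items instead of strings, by creating one tree node where this sketch adds one line.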

Java how to get rid of redundant code

In Java I parse an XML document. This XML is a purchase order (PO), and from it I create a PO document in our ERP system.
I use a DOM parser to parse the XML.
So eventually I have code like this:
-- this is an excerpt --
//ShipTo
Element shipToElement = CXMLHandlerObj.getChildElement(elementOrderRequestHeader, "ShipTo");
//Address
Element shipToAddressElement = CXMLHandlerObj.getChildElement(shipToElement, "Address");
/*get attributes of Address*/
notesHandlerObj.docOrder.replaceItemValue("ShipToParty_addressID", shipToAddressElement.getAttribute("addressID"));
notesHandlerObj.docOrder.replaceItemValue("ShipToParty_addressIDDomain", shipToAddressElement.getAttribute("addressIDDomain"));
notesHandlerObj.docOrder.replaceItemValue("ShipToParty_isoCountryCode", shipToAddressElement.getAttribute("isoCountryCode"));
But at the top, the XML also contains an OrderRequestHeader element with a type attribute:
<OrderRequestHeader orderDate="2017-04-04T12:00:00+00:00" orderID="4550144777" orderType="regular" orderVersion="1" type="new">
All the details of the order are found below this element.
The "type" attribute can have values like "new" or "update".
The type is "new" if the PO XML is sent for the first time, and "update" if the same PO is sent again with changes in it.
Note that the XML structure is the same; only the type differs.
When the type is "new", I just parse the XML and create the PO document. But when the type is "update", I want to check every element, update the document, and mail the changes accordingly.
Now the problem is that, while parsing the XML, I need to either create a new PO or update an existing one. I can do this in the following ways:
1. Create two methods:
   1. create new PO
   2. update PO
   In the create method I can parse the XML and add the values from each element to the document.
   In the update method I can parse all the elements again, but also check which data has changed.
2. Put an if/else statement before every element.
Both approaches are a bit redundant. Is there a simpler way of doing this?
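
One possible way to avoid the duplication, sketched under assumptions: keep a single routine that walks the XML and maps fields, and let it hand every value to a small interface that decides whether to write it ("new") or to compare and record it ("update"). Everything named here (PurchaseOrderSync, FieldSink, creating, updating, mapShipToAddress) is hypothetical, and plain maps stand in for both the DOM element's attributes and the ERP document, so this is not the CXMLHandlerObj or notesHandlerObj API from the excerpt.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: one mapping routine, two interchangeable behaviours.
class PurchaseOrderSync {

    // Receives each parsed field exactly once, whatever the order type is.
    public interface FieldSink {
        void accept(String itemName, String value);
    }

    // type="new": simply write every value into the document.
    public static FieldSink creating(Map<String, String> doc) {
        return (itemName, value) -> doc.put(itemName, value);
    }

    // type="update": write only values that differ and remember what changed,
    // so the list of changes can be mailed afterwards.
    public static FieldSink updating(Map<String, String> doc, List<String> changes) {
        return (itemName, value) -> {
            String old = doc.get(itemName);
            if (!Objects.equals(old, value)) {
                doc.put(itemName, value);
                changes.add(itemName + ": " + old + " -> " + value);
            }
        };
    }

    // The single mapping routine. It stands in for the DOM traversal in the
    // excerpt and contains no create/update logic of its own.
    public static void mapShipToAddress(Map<String, String> addressAttributes, FieldSink sink) {
        for (String attr : new String[] {"addressID", "addressIDDomain", "isoCountryCode"}) {
            sink.accept("ShipToParty_" + attr, addressAttributes.get(attr));
        }
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("addressID", "A1");
        attrs.put("addressIDDomain", "buyer");
        attrs.put("isoCountryCode", "NL");

        // First delivery of the PO (type="new").
        mapShipToAddress(attrs, creating(doc));

        // Same PO delivered again with one changed attribute (type="update").
        attrs.put("isoCountryCode", "BE");
        List<String> changes = new ArrayList<>();
        mapShipToAddress(attrs, updating(doc, changes));
        changes.forEach(System.out::println);
    }
}
```

The caller would pick the sink once, based on the type attribute of OrderRequestHeader, so no if/else is needed per element and the element-to-item mapping exists in only one place.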

Android : Retrieve multiples Elements from Html using JSoup

I want to retrieve a title, a start hour and an end hour. Each of them is in a div inside a div called event, which is itself inside a big div called day.
I need to add these items to a list, but right now I am stuck because my code can't retrieve the 3 elements.
Document doc = Jsoup.connect("http://terry.gonguet.com/cal/?g=tp11").get();
Elements days = doc.select("div[class=day]");
Elements event = doc.select("div[class=event]");
for (Element day : days)
{
    System.out.println(" : " + day.text());
    for (Element ev : event)
    {
        Element title = ev.select("div[class=title]").first();
        Element starthour = ev.select("div[class=bub right top]").first();
        Element endhour = ev.select("div[class=bub right bottom]").first();
        System.out.println(title.text()+starthour.text()+endhour.text());
    }
}

There is no div in that document which has only day as its class attribute. They all combine the day class with another class, which prevents div[class=day] from matching any of them. The same problem applies to the div[class=event] selector.
To solve this, use the CSS query syntax in which the . operator describes the class attribute
(hint: to select an element that has several classes, you can use element.class1.class2).
So instead of
select("div[class=day]");
select("div[class=event]");
use
select("div.day");
select("div.event");
Also, instead of
ev.select("div[class=bub right top]");
ev.select("div[class=bub right bottom]");
you could try using
ev.select("div.bub.right.top");
ev.select("div.bub.right.bottom");
This will find divs that have all of these classes (even if they appear in a different order or the element has more classes than the selector mentions).
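
To make the difference concrete, here is a small plain-Java sketch of the two matching rules. It is not Jsoup's implementation and ignores details such as case handling; the class name ClassSelectorSemantics, its methods and the sample class attribute values (for example "day today") are invented for illustration, since the real page's markup is not shown here.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of the two selector styles, for illustration only.
class ClassSelectorSemantics {

    // Models div[class=value]: the whole class attribute string must equal value.
    public static boolean attributeEquals(String classAttr, String value) {
        return classAttr.equals(value);
    }

    // Models div.a.b.c: every required class must appear as one of the
    // whitespace-separated tokens, in any order; extra classes are allowed.
    public static boolean hasAllClasses(String classAttr, String... required) {
        List<String> tokens = Arrays.asList(classAttr.trim().split("\\s+"));
        return tokens.containsAll(Arrays.asList(required));
    }

    public static void main(String[] args) {
        String dayDiv = "day today";           // hypothetical class attribute
        String hourDiv = "top bub extra right"; // hypothetical class attribute

        System.out.println(attributeEquals(dayDiv, "day"));                // false
        System.out.println(hasAllClasses(dayDiv, "day"));                  // true
        System.out.println(hasAllClasses(hourDiv, "bub", "right", "top")); // true
    }
}
```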

Faceting using SolrJ and Solr4

I've gone through the related questions on this site but haven't found a relevant solution.
When querying my Solr4 index using an HTTP request of the form
&facet=true&facet.field=country
The response contains all the different countries along with counts per country.
How can I get this information using SolrJ?
I have tried the following but it only returns total counts across all countries, not per country:
solrQuery.setFacet(true);
solrQuery.addFacetField("country");
The following does seem to work, but I do not want to have to explicitly set all the groupings beforehand:
solrQuery.addFacetQuery("country:usa");
solrQuery.addFacetQuery("country:canada");
Secondly, I'm not sure how to extract the facet data from the QueryResponse object.
So two questions:
1) Using SolrJ how can I facet on a field and return the groupings without explicitly specifying the groups?
2) Using SolrJ how can I extract the facet data from the QueryResponse object?
Thanks.
Update:
I also tried something similar to Sergey's response (below).
List<FacetField> ffList = resp.getFacetFields();
log.info("size of ffList:" + ffList.size());
for(FacetField ff : ffList){
String ffname = ff.getName();
int ffcount = ff.getValueCount();
log.info("ffname:" + ffname + "|ffcount:" + ffcount);
}
The above code shows ffList with size=1 and the loop goes through 1 iteration. In the output ffname="country" and ffcount is the total number of rows that match the original query.
There is no per-country breakdown here.
I should mention that on the same solrQuery object I am also calling addField and addFilterQuery. Not sure if this impacts faceting:
solrQuery.addField("user-name");
solrQuery.addField("user-bio");
solrQuery.addField("country");
solrQuery.addFilterQuery("user-bio:" + "(Apple OR Google OR Facebook)");
Update 2:
I think I got it, again based on what Sergey said below. I extracted the List object using FacetField.getValues().
List<FacetField> fflist = resp.getFacetFields();
for(FacetField ff : fflist){
String ffname = ff.getName();
int ffcount = ff.getValueCount();
List<Count> counts = ff.getValues();
for(Count c : counts){
String facetLabel = c.getName();
long facetCount = c.getCount();
}
}
In the above code, the facetLabel variable holds the name of each facet group and facetCount is the corresponding count for that group.

Actually, you only need to set the facet field and faceting will be activated (check the SolrJ source code):
solrQuery.addFacetField("country");
Where did you look for the facet information? It should be in QueryResponse.getFacetFields (getValues.getCount).

In the Solr response, use QueryResponse.getFacetFields() to get the list of FacetFields, one of which is "country". So "country" is identified by QueryResponse.getFacetFields().get(0).
You then iterate over it to get the Count objects using
QueryResponse.getFacetFields().get(0).getValues().get(i)
and get the name of the facet value using QueryResponse.getFacetFields().get(0).getValues().get(i).getName()
and the corresponding count using
QueryResponse.getFacetFields().get(0).getValues().get(i).getCount()
