Load a URL only once - java

I have an array of URLs to load, and the same URL can appear in it more than once.
For example, an array with three values: url1, url2, url1.
After loading each URL, I extract one piece of its HTML (for example, a single element).
This is what I have:
HtmlPage page=null;
for (int i = 0; i < tabUrlSource.length; i++) {
try {
page = webClient.getPage(tabUrlSource[i]);
List<HtmlElement> nbElements = (List<HtmlElement>) page.getByXPath(tabXpathSource[i]);
if (null != nbElements && !nbElements.isEmpty()) {
htmlResult = nbElements.get(0).asText();
}
...
But this is not the most efficient approach, because it loads url1 twice and url2 once.
So it behaves as if there were three URLs to load, which makes the processing take longer.
How can I load each URL only once and keep the same final result?
I hope my English and my question are clear.
Regards.
Thank you.

What Keppil answered is correct, but the Set should hold the URLs from tabUrlSource rather than being a Set<HtmlElement>.
EDIT:
What is the type of tabUrlSource[i]? Is it URL or a custom type?
This is how it would look if it is URL:
Set<URL> uniqueURLs = new HashSet<URL>();
for (int i = 0; i < tableUrlSource.length; i++) {
    uniqueURLs.add(tableUrlSource[i]);
}
Then iterate over this Set instead of the tableUrlSource array, like this:
for (Iterator<URL> itr = uniqueURLs.iterator(); itr.hasNext(); ) {
page = webClient.getPage(itr.next());
.............
.............
Continue the rest of the code
Also, you said you are using the index 'i' to associate each URL with an XPath. Is the XPath always the same for the same URL? If so, you can use a HashMap with the URL as key and the XPath as value, so that a duplicate key simply overwrites the earlier entry. Then you can iterate over the map's keys to load each page and use the corresponding value to fetch the HtmlElement.
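If each URL always goes with the same XPath, the map idea could look roughly like the sketch below. It rests on a few assumptions: the URLs are kept as plain strings (java.net.URL's equals() and hashCode() may resolve host names, so String keys are the cheaper choice for a map), the class and method names UrlXpathIndex and uniquePairs are made up for illustration, and the actual page loading is only indicated in a comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class UrlXpathIndex {
    // Collapses the two parallel arrays into one map: URL -> XPath.
    // A repeated URL overwrites its earlier entry, so each URL appears once.
    static Map<String, String> uniquePairs(String[] urls, String[] xpaths) {
        Map<String, String> byUrl = new LinkedHashMap<>(); // keeps first-seen order
        for (int i = 0; i < urls.length; i++) {
            byUrl.put(urls[i], xpaths[i]);
        }
        return byUrl;
    }

    public static void main(String[] args) {
        String[] urls = {"url1", "url2", "url1"};
        String[] xpaths = {"//h1", "//p", "//h1"};
        for (Map.Entry<String, String> e : uniquePairs(urls, xpaths).entrySet()) {
            // In the real code: load e.getKey() once with webClient.getPage(...)
            // and evaluate e.getValue() against the loaded page.
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```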
If they are not the same, you can still use a HashSet, like this:
Set<URL> uniqueURLs = new HashSet<URL>();
HtmlPage page = null;
for (int i = 0; i < tableUrlSource.length; i++) {
try {
// add() returns false when the URL is already in the set
if (!uniqueURLs.add(tabUrlSource[i])) continue;
page = webClient.getPage(tabUrlSource[i]);
List<HtmlElement> nbElements = (List<HtmlElement>)
page.getByXPath(tabXpathSource[i]);
if (null != nbElements && !nbElements.isEmpty()) {
htmlResult = nbElements.get(0).asText();
}
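Note that skipping a duplicate URL with continue also skips the XPath stored at that index. If every (URL, XPath) pair still has to be evaluated, one option is to cache each loaded page by URL so that only the fetch happens once. The sketch below is self-contained and therefore does not use HtmlUnit: OncePerKeyLoader is a hypothetical helper with a pluggable loader function, and in the real code that function would be the webClient.getPage(...) call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class OncePerKeyLoader<V> {
    private final Map<String, V> cache = new HashMap<>();
    private final Function<String, V> loader;
    private int loadCount = 0;

    OncePerKeyLoader(Function<String, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value for the URL, calling the loader only on first request.
    V get(String url) {
        V value = cache.get(url);
        if (value == null) {
            value = loader.apply(url);
            loadCount++;
            cache.put(url, value);
        }
        return value;
    }

    // Number of times the loader was actually invoked.
    int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        // Stand-in loader: the "page" is just a string here.
        OncePerKeyLoader<String> pages = new OncePerKeyLoader<>(url -> "page of " + url);
        String[] urls = {"url1", "url2", "url1"};
        for (String url : urls) {
            // Every index is still processed, but url1 is loaded only once.
            System.out.println(pages.get(url));
        }
        System.out.println("loads: " + pages.loadCount());
    }
}
```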
Hope this helps :)

You could use a Set<HtmlElement> instead of a List. This will remove duplicates automatically.
This of course depends on HtmlElements being comparable. If they aren't, you could instead add all the URLs to a Set<String> and then iterate over that.
Update
To clarify the second part:
A Set is declared like this in the Javadocs:
A collection that contains no duplicate elements. More formally, sets
contain no pair of elements e1 and e2 such that e1.equals(e2), and at
most one null element. As implied by its name, this interface models
the mathematical set abstraction.
In other words, to ensure that there are no duplicates, it relies on the elements being comparable via the equals() method. If HtmlElement hasn't overridden this method, the Set will just use the Object.equals() method, which just compares object references instead of the actual data in the HtmlElements.
However, String has overridden the equals() method, and you can therefore be certain that duplicate Strings will be removed from a Set<String>.
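A minimal sketch of the Set<String> suggestion, assuming the URLs are available as strings. The class name UniqueUrls is made up for illustration, and LinkedHashSet is chosen here only so that the original order of first occurrence is kept.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class UniqueUrls {
    // Duplicates are dropped because String.equals() compares content.
    static Set<String> unique(String[] urls) {
        return new LinkedHashSet<>(Arrays.asList(urls));
    }

    public static void main(String[] args) {
        String[] tabUrlSource = {"url1", "url2", "url1"};
        for (String url : unique(tabUrlSource)) {
            System.out.println(url); // each distinct URL is visited once
        }
    }
}
```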

Related

I have a query in MongoDB and the reference keys are in a HashMap/list; I need to run this simple query from Java

The MongoDB query is db.test.find({"col1":{"$ne":""}}).count(). I have tried many sources to find a solution. The "col1" field name must be populated from a list. Please help.
I have pasted part of my code:
`
List<String> likey = new ArrayList<String>();
for (DBObject o : out.results())
{
likey.add(o.get("_id").toString());
}
Iterator<String>itkey = likey.iterator();
DBCursor cursor ;
//cursor = table.find();
HashMap<String, String> hashmap = new HashMap<String, String>();
while (itkey.hasNext())
{
System.out.println((String)itkey.next());
String keys = itkey.next().toString();
//System.out.println("keys --> "+keys);
String nullvalue = "";
Boolean listone = table.distinct(keys).contains(nullvalue);
hashmap.put(keys, listone.toString());
//System.out.println("distinct --> "+keys+" "+listone);
//System.out.println("proper str --- >"+ '"'+keys+'"');
}
Iterator<String> keyIterator = hashmap.keySet().iterator();
Iterator<String> valueIterator = hashmap.values().iterator();
while (keyIterator.hasNext()) {
//System.out.println("key: " + keyIterator.next());
while (valueIterator.hasNext()) {
//System.out.println("value: " + valueIterator.next());
//System.out.println("Key: " + keyIterator.next() +""+"value: "+valueIterator.next());
String hashkey = valueIterator.next();
}
}
`
When you post code, it helps if you indent it, so it is more readable. As I mentioned to you on another forum, you need to go back and review the Java collection classes, since you have multiple usage errors in the above code.
Here are a few things you need to do to clean up your code:
1) You don't need to use the itkey iterator. Instead, use:
for (String key : likey)
and get rid of all the itkey.next calls. Your current code only processes every second element of the List. The other ones are printed out.
2) Your HashMap will map a key to a Boolean. Is that what you want? You said you want to count the number of non-zero values for the key. So, the line:
Boolean listone = table.distinct(keys).contains(nullvalue);
is almost certainly in error.
3) When you iterate over the HashMap, you don't need the valueIterator. Instead, get the key (either from the keyIterator, or a variable you define using the simpler iterator syntax above), then use the key to get the matching value using hashmap.get(key).
This will not make your code work, but it will clean it up somewhat - at the moment it is difficult to understand what you are intending it to do.
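To illustrate points 1 and 3 without a database, here is a small self-contained sketch. The class and method names are made up, the keys are plain strings, and the hasNext() guard in the "double next" method is an addition so that the demo does not throw on odd-sized lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeyLoopDemo {
    // Pattern from the question: next() is called twice per pass,
    // so only every second element reaches the processing step.
    static List<String> processedWithDoubleNext(List<String> keys) {
        List<String> processed = new ArrayList<>();
        Iterator<String> it = keys.iterator();
        while (it.hasNext()) {
            it.next();                 // consumed by the debug print in the original
            if (it.hasNext()) {        // guard added for this demo
                processed.add(it.next());
            }
        }
        return processed;
    }

    // Enhanced for loop: every element is processed exactly once.
    static List<String> processedWithForEach(List<String> keys) {
        List<String> processed = new ArrayList<>();
        for (String key : keys) {
            processed.add(key);
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d");
        System.out.println(processedWithDoubleNext(keys));
        System.out.println(processedWithForEach(keys));

        // Point 3: one loop over the keys, looking up each value with get().
        Map<String, Boolean> hasEmptyValue = new LinkedHashMap<>();
        hasEmptyValue.put("col1", true);
        hasEmptyValue.put("col2", false);
        for (String key : hasEmptyValue.keySet()) {
            System.out.println(key + " -> " + hasEmptyValue.get(key));
        }
    }
}
```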

Iterating through a map many times efficiently

I have a cookie manager class that stores Lists of cookies by their domain in a Map. The size will stay below 100 most of the time.
Map<String, CookieList> cookieMap;
Every time I set up cookies for connections, it needs to iterate through all domains(String), check if it's acceptable, then insert the CookieList. I will be iterating through the map many times. I have a separate List holding the domains and search that, then get the CookieList by the Key.
List<String> domainList;
// host is from the connection being set up
for (String domain : domainList) {
if (host.contains(domain)) {
CookieList list = cookieMap.get(domain);
// set up cookies
}
}
Since I'm using contains, I can't directly get the Key from cookieMap. Is this a good way or should I just be iterating Map's EntrySet? If so, would LinkedHashMap be good in this example?
Instead of maintaining a Map and a List, you could use Map.keySet to get the domains.
for (String domain : cookieMap.keySet()) {
if (host.contains(domain)) {
CookieList list = cookieMap.get(domain);
}
}
There is nothing inefficient about this, since the for loop is O(n), and each call to cookieMap.get() is O(1) on average for a HashMap.
Map<String, CookieList> cookieMap = new HashMap<String, CookieList>();
for (Map.Entry<String, CookieList> entry : cookieMap.entrySet()) {
if (host.contains(entry.getKey())) {
CookieList list = entry.getValue();
}
}
Hope this helps you.
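A self-contained sketch of the entrySet() approach, with some assumptions: a List<String> stands in for CookieList (whose definition is not shown in the question), and CookieLookup and cookiesFor are hypothetical names. Iterating the entries gives key and value together, so no separate domain list or get() call is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CookieLookup {
    // Collects the cookies of every stored domain that occurs in the host name.
    static List<String> cookiesFor(String host, Map<String, List<String>> cookieMap) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : cookieMap.entrySet()) {
            if (host.contains(entry.getKey())) {
                result.addAll(entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> cookieMap = new LinkedHashMap<>();
        cookieMap.put("example.com", Arrays.asList("session=1", "lang=en"));
        cookieMap.put("other.org", Arrays.asList("id=42"));
        System.out.println(cookiesFor("www.example.com", cookieMap));
    }
}
```

On the LinkedHashMap question: it iterates in insertion order and its iteration time depends on the number of entries rather than on the table capacity, but with fewer than 100 domains the difference from a plain HashMap is unlikely to matter.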
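I think your code is already well optimized. If you have a collection of host names (called hosts here) and only need exact matches, you can use
domainList.retainAll(hosts)
before your for loop, so that you stop doing a check on every iteration. Note that retainAll compares with equals(), not with the substring test that host.contains(domain) performs. Effectively, your code would look as follows:
List<String> hostList = new ArrayList<String>(domainList); // copy, so the original domains are not modified
hostList.retainAll(hosts);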
for (String hostEntry : hostList) { // I'd rename "host" so I can use it here
CookieList list = cookieMap.get(hostEntry);
// set up cookies
}

Why does my XML parser only return one string, instead of multiple ones?

I have a problem parsing XML data. I have divided my program into 3 Java files, each containing one class. One of them is rssparser.java. This file holds a method called iterateRSSFeed(String URL), which returns a string containing the parsed description tag. In main.java, where my main method is, I call iterateRSSFeed this way:
rssparser r = new rssparser();
String description = r.iterateRSSFeed();
And then I am planning to add this String to a JLabel, this way:
JLabel news = new JLabel(description);
This works and my program runs. BUT there are more description tags in my XML file, and the JLabel only contains one parsed description tag. I should say that the return statement in iterateRSSFeed is inside a for loop, which in my head should return all of the description tags. But it doesn't.
Please ask if something is unclear, or if showing the source code would make it easier to find a solution to my question. Thanks in advance! :)
When Java executes a return statement, it will leave the method, and not continue running the loop.
If you want to return multiple values from a method, you have to put them in some object grouping them together. Normally one would use a List<String> as return type.
Then your loop will fill the list, and the return statement (after the loop) can return the whole list at once.
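As a rough sketch of that pattern: the method below collects inside the loop and returns once, after the loop. The real iterateRSSFeed is not shown in the question, so DescriptionCollector and collect are hypothetical stand-ins that take the already extracted description texts as an array.

```java
import java.util.ArrayList;
import java.util.List;

class DescriptionCollector {
    // Instead of returning inside the loop (which ends the method on the
    // first pass), add each description to a list and return it afterwards.
    static List<String> collect(String[] descriptionTexts) {
        List<String> descriptions = new ArrayList<>();
        for (String text : descriptionTexts) {
            descriptions.add(text.trim());
        }
        return descriptions;
    }

    public static void main(String[] args) {
        String[] parsed = {" first item ", "second item", "third item"};
        for (String description : collect(parsed)) {
            System.out.println(description);
        }
    }
}
```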
If you want to have one large string instead of multiple ones, you'll have to merge them into one.
The easiest would be to simply use the .toString() method on the list, this will give (if you are using the default list implementations) something like [element1, element2, element3].
If you don't like the [,], you could simply concatenate them:
List<String> list = r.iterateRSSFeed();
StringBuilder b = new StringBuilder();
for(String s : list) {
b.append(s);
}
String description = b.toString();
This will give element1element2element3.
As Java's JLabel has some rudimentary HTML support, you could also use this to format your list as a list:
List<String> list = r.iterateRSSFeed();
StringBuilder b = new StringBuilder();
b.append("<html><ul>");
for(String s : list) {
b.append("<li>");
b.append(s);
b.append("</li>");
}
b.append("</ul>");
String description = b.toString();
The result will be <html><ul><li>element1</li><li>element2</li><li>element3</li></ul>, which will be formatted by the JLabel as something like this:
element1
element2
element3

How to find duplicates in an ArrayList which is in the form of JSON object?

My website has a "Settings" option where the user can enter details in 3 text boxes: 1. E-mail, 2. SMS, 3. MMS. The user can also enter a second, optional e-mail address. If the optional address is the same as the required one, I have to tell the user that the given e-mail already exists.
I am sending this data as an ArrayList converted to a JSON object.
What is the best way to find the duplicate and notify the user?
Help me in this
Thanks in advance
Either parse it into Java collections with a JSON framework of your choice, then check for duplicates or use JavaScript to directly work on the JSON.
If you have the ArrayList anyway, why not iterate over that?
Please do the following
Set<String> hashSet = new HashSet<String>(arrayList1);
List<String> arrayList2 = new ArrayList<String>(hashSet);
if (arrayList2.size() < arrayList1.size()) {
// duplicates exist
}
You can do what Ammu posted, but this will not identify the duplicate entry. If you have the ArrayList as a Java object (if not, convert it into one), convert the ArrayList into a HashSet, compare the size to identify if there are duplicate entries. If so, you need to sort the ArrayList in order to find the duplicate entry.
Collections.sort(arrayList);
for (int i = 1; i < arrayList.size(); i++) {
if (arrayList.get(i).equals(arrayList.get(i - 1))){
// found duplicate
System.out.println("Duplicate!");
}
}
This works only if the entries of the ArrayList implement the Comparable interface. Since your ArrayList is filled with Strings, that is the case.
Based on what you described
"... in this user can enter another
mail id: its an optional thing but, if
he enter the both or same which is
neccesary e-mail and optional or same
then, I have to tell that "given
e-mail" alredy exist."
I would alert the user using Javascript and avoid the HTTP Request/Response round-trip to the server:
...
// before submitting the form
if (document.getElementById('requiredEmail').value == document.getElementById('optionalEmail').value) {
alert("The optional email must be different than the required email");
}
...
As another user suggested, you can just create a Set based on the ArrayList if you are validating the input in the backend:
String[] parsedInput = new String[] { "SMS-Value", "MMS-Value", "email#domain.com", "email#domain.com" };
List<String> receivedList = Arrays.asList(parsedInput);
Set<String> validatedList = new HashSet<String>(receivedList);
if (validatedList.size() < receivedList.size()) {
throw new IllegalArgumentException("The email addresses provided are incorrect.");
}
If you want to find the duplicates, you can iterate over the list and count how often each element occurs, like this:
Map<Object, Integer> map = new HashMap<Object, Integer>();
for(Object obj : list)
{
if(map.containsKey(obj))
{
map.put(obj, map.get(obj)+1);
}
else
{
map.put(obj, 1);
}
}
Keys in the map with a value greater than 1 are duplicates.
If you just want to get rid of duplicates (rather than knowing which elements are duplicated), for example:
Set<Object> set = new HashSet<Object>(list);
A Set cannot contain duplicate elements, so this removes all duplicates.
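A variation on the counting map above, for the case where you want to know which values are duplicated: Set.add() returns false when the element is already present, so a single pass is enough. This is only a sketch, DuplicateFinder and duplicates are made-up names, and it assumes the JSON has already been parsed into a List<String>.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DuplicateFinder {
    // Returns every value that occurs more than once, in order of first repetition.
    static Set<String> duplicates(List<String> values) {
        Set<String> seen = new HashSet<>();
        Set<String> repeated = new LinkedHashSet<>();
        for (String value : values) {
            if (!seen.add(value)) { // add() is false for an element already in the set
                repeated.add(value);
            }
        }
        return repeated;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a@example.com", "sms-value", "a@example.com");
        System.out.println(duplicates(input));
    }
}
```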

How do we get the List objects in reverse order?

Hi, I am getting a List that contains POJO objects for the rows of a table. In my case I have to show the table data in reverse order.
For example, when I add rows to a particular table in the database, the most recently added row is stored last. I have to show the whole content of the table in my JSP page in reverse order, meaning that what I inserted most recently must be displayed as the first row.
My code looks like this:
List lst = tabledate.getAllData();//return List<Table> Object
Iterator it = lst.iterator();
MyTable mt = new MyTable();//pojo class
while(it.hasNext())
{
mt=(MyTable)it.next();
//getting data from getters.
System.out.println(mt.getxxx());
System.out.println(mt.getxxx());
System.out.println(mt.getxxx());
System.out.println(mt.getxxx());
}
Use a ListIterator to iterate through the list using hasPrevious() and previous():
ListIterator it = lst.listIterator(lst.size());
while(it.hasPrevious()) {
System.out.println(it.previous());
}
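A typed, self-contained version of the same idea, as a sketch: ReverseOrder and reversedCopy are made-up names, and strings stand in for the MyTable objects. The source list itself is left unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class ReverseOrder {
    // Walks the list from the end with a ListIterator and copies the
    // elements into a new list.
    static <T> List<T> reversedCopy(List<T> source) {
        List<T> result = new ArrayList<>(source.size());
        ListIterator<T> it = source.listIterator(source.size());
        while (it.hasPrevious()) {
            result.add(it.previous());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("oldest", "middle", "newest");
        for (String row : reversedCopy(rows)) {
            System.out.println(row);
        }
    }
}
```

If modifying a copy is acceptable, calling Collections.reverse() on a new ArrayList built from the original list gives the same order.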
A plain Iterator only moves forward, so it cannot be used for this. One option is index-based access:
int size = lst.size();
for (int i=size - 1; i >= 0; i --)
{
MyTable mt = (MyTable)lst.get(i);
....
}
Btw: there is no need to create a new MyTable() before the loop. This is an instance that will be thrown away immediately and serves no purpose.
