Jsoup - How can I get all links and titles in one array? - java

This is my first question here. I need to get the links and titles from a certain HTML page into one 2D array. Here is my code:
public String[][] data;
descs = doc.select("a");
data= new String [spaceCount][2];
int count=0;
for (Element e : descs ) {
data[count][0]=descs.attr("href");
data[count][1]=descs.attr("title");
count++;
}
String svalues = data[0][0]+"\n"+data[0][1]+data[1][0]+"\n"+data[1][1];
output.setText(svalues);
But my problem is that it keeps putting the same data in every position: every cell holds the same single link and the same single title. I am a newbie in Java, but I think the values in the loop are not advancing (and they should). Can anyone explain how to make it work?

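You are not using Element e. Calling attr() on descs (the whole Elements collection) returns the value from the first matched element that has the attribute, so every iteration stores the same link and title. Change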
data[count][0]=descs.attr("href");
data[count][1]=descs.attr("title");
to
data[count][0]=e.attr("href");
data[count][1]=e.attr("title");
and add as the last line of the for loop, so that count can never run past the end of the array:
if ( count == spaceCount )
break;

Related

Java: Comparing strings

I have an array list named myArraylist that contains items of the class TStas. TStas has a String field named st_name. I want to search the array list for the TStas instances whose st_name equals the string lookfor and, when found, return their positions (foundplaces) in the array list.
public static List<Integer> findplace_byname(String lookfor){
List<Integer> foundplaces = new ArrayList<>(); //list to hold the positions of found items
for(int k=0; k<myArraylist.size(); k++) {
TStas a=myArraylist.get(k);
JOptionPane.showMessageDialog(null, "#"+a.st_name+"#"+lookfor+ "#"); //just to check if everything is read,
if ((a.st_name).equals(lookfor)){
foundplaces.add(k);
}
}
return foundplaces;
}
My problem is that the code fails to detect the equality when comparing to the a.st_name of the first item in myArraylist.
For example:
If myArraylist holds an item with a.st_name=S9, an item with a.st_name=K9 and another with a.st_name=G4, then when lookfor is K9 or G4 all is OK. When searching for the first item in the list, with a.st_name=S9, the code fails to "see" the equality.
I am using showMessageDialog to check that the variable is really read, and it is. I also tried deleting or changing the first item in the array list, but the same problem persists: the first item is not found.
What is happening here?
EDIT
I used trim() to remove any possible spaces, but nothing changed. I then called .length() on the trimmed strings to be compared and found that, for some reason, the first element, while displaying as "S9" without any spaces, has a length of 3! Is it possible that some kind of hidden character is there? (I have no idea which: a paragraph character or what?)
There is no issue in your current code; check this code yourself.
List<Integer> foundplaces = new ArrayList<>();
List<String> myArraylist=new ArrayList<>();
myArraylist.add("S9");
myArraylist.add("K9");
myArraylist.add("G4");
for(int k=0; k<myArraylist.size(); k++) {
String a=myArraylist.get(k);
JOptionPane.showMessageDialog(null, "#" + a + "#" + "S9" + "#");
if ((a).equals("S9")){
foundplaces.add(k);
System.out.println(k);
}
}
You can see it works fine, and it is the same as your current code.
I found where the problem is.
As I mentioned in a comment, I am using a txt file to populate myArraylist. Windows Notepad automatically adds a BOM (byte order mark) character to the beginning of text files
(http://en.wikipedia.org/wiki/Byte_Order_Mark). This character is the problem: what I read looks like "S9" (the first text in the txt file), but it is actually the character 65279 (U+FEFF) followed by "S9".
So the problem is solved by using the following when reading the text file that populates myArraylist:
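if (!readingstring.isEmpty() && readingstring.charAt(0) == 65279) { // 65279 is U+FEFF, the BOM; the isEmpty check avoids an exception on empty lines
readingstring = readingstring.substring(1);
}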
Thanks for your help.
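For anyone landing here later, this is a minimal self-contained sketch of the same fix. The class and method names (BomStripSketch, stripBom) are made up for illustration and are not from the original code. It also shows why trim() did not help: trim() only removes characters up to U+0020, and the BOM is U+FEFF.

```java
class BomStripSketch {
    // U+FEFF is the byte order mark; as an int its value is 65279.
    static final char BOM = '\uFEFF';

    // Returns the line without a leading BOM; any other line is returned unchanged.
    static String stripBom(String line) {
        if (!line.isEmpty() && line.charAt(0) == BOM) {
            return line.substring(1);
        }
        return line;
    }

    public static void main(String[] args) {
        // Simulates the first line of a text file that starts with a BOM.
        String first = "\uFEFFS9";
        System.out.println(first.length());               // 3, although it displays as "S9"
        System.out.println(first.trim().length());        // still 3: trim() does not remove U+FEFF
        System.out.println(first.equals("S9"));           // false
        System.out.println(stripBom(first).equals("S9")); // true
    }
}
```

Only the first line of a file can carry the BOM, so it is enough to call stripBom on that line, though calling it on every line is harmless.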

Replace strings with incrementing value

I have an ArrayList that is built from posts in a forum; each entry is a post.
When I am building an entry I replace all <img> elements with the string "Image[x]".
After the array list is filled I want to go back and replace every instance of "[x]" with an incrementing integer.
I am building my array list using this code:
Elements wholePosts = doc.select("div.post_body");
for (Element wholePost : wholePosts) {
Elements texts = wholePost.select("div[itemprop=commentText]");
for (Element text : texts) {
String nobr = text.html().replaceAll("(?i)<br[^>]*>", "newlineplaceholder");
String formatted1 = nobr.replaceAll("<img src=.*?>", "Image[x]");
Document finalPost = Jsoup.parse(formatted1);
String almostfinalText = finalPost.text();
String finalText = almostfinalText.replace("newlineplaceholder", "\n");
datumList.add(finalText);
}
}
I have tried to replace the string and increment the number in the above code, but it only increments once per post element. So if there are multiple images in a post, post 1 would contain "Image1, Image1" and post 2 would contain "Image2, Image2". What I am looking for is for post 1 to contain "Image1, Image2" and post 2 to contain "Image3, Image4".
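One possible approach, sketched under the assumption that the list is already filled with strings containing the literal placeholder "Image[x]": keep a single counter outside the loop over posts, and bump it once per match rather than once per post. The class and method names here (ImageNumbering, numberImages) are hypothetical; only java.util.regex from the standard library is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImageNumbering {
    // Matches the literal placeholder text "Image[x]".
    private static final Pattern PLACEHOLDER = Pattern.compile("Image\\[x\\]");

    // Replaces each "Image[x]" with "Image1", "Image2", ... counting across all posts.
    static List<String> numberImages(List<String> posts) {
        List<String> result = new ArrayList<>();
        int counter = 0; // declared outside the loop so it is not reset for each post
        for (String post : posts) {
            Matcher m = PLACEHOLDER.matcher(post);
            StringBuffer sb = new StringBuffer();
            while (m.find()) {
                counter++; // one increment per placeholder, not per post
                m.appendReplacement(sb, "Image" + counter);
            }
            m.appendTail(sb);
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> posts = new ArrayList<>();
        posts.add("Image[x], Image[x]");
        posts.add("Image[x], Image[x]");
        System.out.println(numberImages(posts)); // [Image1, Image2, Image3, Image4]
    }
}
```

The same while (m.find()) loop could run inside the inner for loop of the code above instead of as a second pass, as long as the counter is declared before the outer loop.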

How to get same index value from other array?

I have two array lists, and I want to use the values in one as indexes into the other.
The two array lists:
ArrayList<Integer> Position = new ArrayList<Integer>();
ArrayList<String> List_Data = new ArrayList<String>();
My Position list contains Integer values that are indexes into the data, e.g. 0, 3, 5 out of 10 records.
I want to get only the strings at those indexes, e.g. 0, 3, 5 out of 10.
Example:
String Array >> [A,B,C,D,E,F,G,H,J,K];
Index >> Now I am selecting the data at indexes 2 and 5.
Final output as string >> C,F
So at the end I get the actual strings from the array.
I found this and some other links, but I still don't get exactly how to do this.
Can anyone please help me?
Try this, if I understand what you want correctly (otherwise let me know):
String sr = List_Data.get(Position.get(i)); // i is whichever index into Position you need, e.g. 1, 5, 1000...
You can get an object from an ArrayList using the get function. Then you can use it as an index into another ArrayList.
String res = "";
for (Integer pos : Position) {
res += List_Data.get(pos); // pos is already an index into List_Data
}
The only thing you need is method indexOf(...) of List.
public String getStringByIndex(Integer index) {
return List_Data.get(Position.indexOf(index));
}
I am not saying that the above code is wrong, but it is not working according to my needs, or I can't use it because of limitations in my other code.
Finally, I got what I wanted like this:
for (int i = 0; i < Position.size(); i++)
{
System.out.println("Selected Data --->"+ List_Data.get(Position.get(i)));
}
}
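For completeness, here is a small runnable sketch of that lookup using the example data from the question. The class and method names (PickByPosition, pick) and the lowercase variable names are my own, not from the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PickByPosition {
    // Returns the elements of data found at each index listed in positions, joined by commas.
    static String pick(List<String> data, List<Integer> positions) {
        List<String> picked = new ArrayList<>();
        for (int pos : positions) {
            picked.add(data.get(pos)); // each value in positions is used directly as an index into data
        }
        return String.join(",", picked);
    }

    public static void main(String[] args) {
        List<String> listData = Arrays.asList("A", "B", "C", "D", "E", "F", "G", "H", "J", "K");
        List<Integer> position = Arrays.asList(2, 5);
        System.out.println(pick(listData, position)); // prints C,F
    }
}
```

Note that get throws an IndexOutOfBoundsException if a stored position is not a valid index into the data list, so it may be worth checking the range first if the positions come from user input.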

Array of linked lists of arrays for hash table

So I am creating a Hash Table that uses an Array of Linked Lists of Arrays. Let me take a second to explain why this is.
So I have previously implemented hash tables by creating an array in which each element is a linked list. This way I could quickly search a LL of 450,000 elements by first looking up the hash value in the array, and then searching only the elements of the LL found there. I should add that this is a project for school and I cannot just use the hash tables that come with Java.
Now I want to do something similar, but I have a massive LL of arrays that I need to search. Here each element of the LL is a line of a text file, represented by a 4-element array, where each of the 4 elements is a different string that was tab-delimited in the input file. I need to be able to quickly access the 2nd, 3rd, and 4th strings from each line, which are now elements of this array.
So what I want is to be able to create an Array of LL of Arrays. First I will find the sum of the ASCII values of the second element of an array. Then I will hash the entire array into my hash table using this value. When I later need to find this element, I will go to the corresponding element of the outer array, where I have a list of arrays. I will then search the 2nd value of each array in the list. If I find the one I want, I return that array and use its 3rd and 4th elements.
As I said, I have this working fine for an Array of LL, but adding the extra dimension of arrays inside has thrown me off completely. I think it is mostly just a matter of figuring out the syntax, since I have successfully initialized an Array of LL of Arrays (public static LinkedList[] RdHashLL), so it appears that Java is okay with this in principle. However, I have no idea how to put elements into the hash table, or how to read them out.
Below is my code for an ARRAY OF LINKED LISTS that works FINE. I just need help getting it to work for an ARRAY OF LL OF ARRAYS!
public class TableOfHash{
public static LinkedList<String>[] HashLL;
//HASH FUNCTION - Finds sum of ascii values for string
public static int charSum(String s){
int hashVal = 0;
int size = 1019; //Prime number around the size of 8 chars of 'z' (8 chars is among the largest consistently in the dictionary)
for(int i = 0; i < s.length(); i++){
hashVal += s.charAt(i);
}
return hashVal % size;
}
//CREATE EMPTY HASH TABLE - Creates an array of LL
public static void makeHash(){
HashLL = new LinkedList[1019];
for(int i=0; i<HashLL.length; i++){
HashLL[i] = new LinkedList<String>();
}
}
//HASH VALUES INTO TABLE!
public static void dictionary2Hash(LinkedList<String> Dict){
for(String s : Dict){
HashLL[charSum(s)].add(s);
//Finds the sum of the char values of the dictionary element s,
//and then adds the word to the HashLL at the index defined
//by the char sum.
}
//Print out part of Hash Table (for testing! for SCIENCE!)
//System.out.println("HASH TABLE::");
//printHashTab();
}
//SEARCH HashTable for input word, return true if found
public boolean isWord(String s){
if(HashLL[charSum(s)].contains(s)){
wordsfound++;
return true;
}
return false;
}
}
I have made some attempts to change this, but for things like if(HashLL[charSum(s)].contains(s)), which searches the LL at the element returned by charSum(s), I have no idea how to get it to work when it is a LL of arrays and not of Strings. I have tried HashLL[charSum(s)].[1].contains(s) and HashLL[charSum(s)][1].contains(s), and various other things.
The fact that a Google search for "Array of Linked Lists of Arrays" (with quotes) turns up empty has not helped.
Last bit. I realize there might be another data structure that would do what I want, but unless you believe that an Array of LL of Arrays is a totally hopeless cause, I'd like to get it to work as is.
If you have
LinkedList<String[]>[] hashLL;
you can read a specific String like this (one of many ways):
String str = hashLL[outerArrayIndex].get(listIndex)[innerArrayIndex];
To write into the fields, this is possible (assuming everything is initialized correctly).
String[] arr = hashLL[outerArrayIndex].get(listIndex);
arr[index] = "value";
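To tie this back to the question, here is a rough sketch of inserting and searching with that declaration. It assumes each row is a 4-element String[] whose second element (index 1) is the key, as described in the question; the class and method names (RecordTable, put, find) are invented for the example. Note that contains() is of no use on a list of arrays, because arrays compare by identity, so the bucket is walked by hand and the key field is compared with equals.

```java
import java.util.LinkedList;

class RecordTable {
    private static final int SIZE = 1019; // same prime table size as in the question

    // Array of linked lists of String arrays; generic array creation needs the raw type.
    @SuppressWarnings("unchecked")
    private final LinkedList<String[]>[] buckets = new LinkedList[SIZE];

    RecordTable() {
        for (int i = 0; i < buckets.length; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    // Same hash as the question: sum of the char values, modulo the table size.
    static int charSum(String s) {
        int hashVal = 0;
        for (int i = 0; i < s.length(); i++) {
            hashVal += s.charAt(i);
        }
        return hashVal % SIZE;
    }

    // Stores the whole row in the bucket chosen by its second element.
    void put(String[] row) {
        buckets[charSum(row[1])].add(row);
    }

    // Returns the row whose second element equals key, or null if there is none.
    String[] find(String key) {
        for (String[] row : buckets[charSum(key)]) {
            if (row[1].equals(key)) {
                return row;
            }
        }
        return null;
    }
}
```

Usage would look like table.put(line.split("\t")) for each line of the file, then String[] row = table.find(key), followed by reading row[2] and row[3] when row is not null.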

How to skip an else statement a certain number of times, then start again

I have a list of names in an array, and there is some redundancy in it. I was able to get only unique names to print, but I need a way to print the first line, skip printing for as many times as there were redundant entries, then continue printing with the next name (all redundant instances are always next to each other). Here is what I have for that part so far:
int x = 1;
int skipCount = 0;
while (x<i){
if (titles[x].length() == titles[x-1].length()){
//do nothing
skipCount++;
}
else{
System.out.printf("%s\n", titles[x]);
}
x++;
}
So basically, how would I go about skipping the else statement 'skipCount' times, then have it start again? I haven't found much about this and am relatively new to Java.
Why not just use a Set? ;-)
final Set<String> set = new HashSet<>(Arrays.asList(titles));
for (final String title : set) {
/* title is unique */
System.out.println(title);
}
Some of the changes include using println rather than printf("%s\n", ...), which is just clearer, and using an enhanced for loop, instead of manually tracking the position in the array in a loop.
To be honest, you might consider using a Set<String> in place of String[] for titles in the first place.
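One caveat worth adding: a HashSet does not promise any iteration order, so the titles may print in a different order than they appear in the array. A LinkedHashSet keeps insertion order. The sketch below shows that variant, plus an alternative without a Set that relies on the question's statement that duplicates are always adjacent; it compares each title with the previous one using equals rather than comparing lengths. The class and method names (UniqueTitles, uniqueInOrder, skipAdjacentRepeats) and the sample titles are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class UniqueTitles {
    // Keeps the first occurrence of each title, in the original order.
    static List<String> uniqueInOrder(String[] titles) {
        return new ArrayList<String>(new LinkedHashSet<String>(Arrays.asList(titles)));
    }

    // Alternative without a Set: drops a title only when it equals the one right before it.
    static List<String> skipAdjacentRepeats(String[] titles) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < titles.length; i++) {
            if (i == 0 || !titles[i].equals(titles[i - 1])) {
                out.add(titles[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] titles = {"Dune", "Dune", "Emma", "Ulysses", "Ulysses", "Ulysses"};
        System.out.println(uniqueInOrder(titles));       // [Dune, Emma, Ulysses]
        System.out.println(skipAdjacentRepeats(titles)); // [Dune, Emma, Ulysses]
    }
}
```

The two methods give the same result only when all repeats are adjacent; if a title can reappear later in the array, only the Set version removes it.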
