How to add to an arraylist of linkedlists? - java

I am sorry if this is a stupid question but I am new to Java linkedlists and arraylists.
What I wish to do is this:
I have a text file that I run through word by word. I want to create an ArrayList of LinkedLists, with each unique word in the text followed in its linked list by the words that follow it in the text.
Consider this piece of text: The cat walks to the red tree.
I want the Arraylist of LinkedLists to be like this:
The - cat - red
|
cat - walks
|
to - the
|
red - tree
What I have now is this:
while(dataFile.hasNext()){
secondWord = dataFile.next();
nWords++;
if(nWords % 1000 ==0) System.out.println(nWords+" words");
//and put words into list if not already there
//check if this word is already in the list
if(follows.contains(firstWord)){
//add the next word to it's linked list
((LinkedList)(firstWord)).add(secondWord);
}
else{
//create new linked list for this word and then add next word
follows.add(new LinkedList<E>().add(firstWord));
((LinkedList)(firstWord)).add(secondWord);
}
//go on to next word
firstWord = secondWord;
}
And it gives me plenty of errors.
What is the best way to do this? (With linked lists; I know hash tables and binary trees are better, but I need to use linked lists.)

An ArrayList is not the best data structure for your outer list, and at least part of your difficulty stems from incorrect use of a list of lists.
In your implementation, presumably follows is an ArrayList of LinkedLists declared like this:
ArrayList<LinkedList<String>> follows = new ArrayList<>();
The result of follows.contains(firstWord) will never be true, because follows contains elements of type LinkedList, not String. firstWord is a String, so it is never an element of follows itself; at best it is the first element of a LinkedList that is an element of follows.
The solution offered below uses a Map, or more specifically a HashMap, in place of the outer list follows. A Map is preferable because, when searching for the first word, the expected look-up time is O(1) with a HashMap versus O(n) for a list.
String firstWord = dataFile.next().toLowerCase();
Map<String, List<String>> follows = new HashMap<>();
int nWords = 0;
while (dataFile.hasNext())
{
String secondWord = dataFile.next().toLowerCase();
nWords++;
if (nWords % 1000 == 0)
{
System.out.println(nWords + " words");
}
//and put words into list if not already there
//check if this word is already in the list
if (follows.containsKey(firstWord))
{
//add the next word to its linked list
List<String> list = follows.get(firstWord);
if (!list.contains(secondWord))
{
list.add(secondWord);
}
}
else
{
//create new linked list for this word and then add next word
List<String> list = new LinkedList<>();
list.add(secondWord);
follows.put(firstWord, list);
}
//go on to next word
firstWord = secondWord;
}
The map will look like this:
the: [cat, red]
cat: [walks]
to: [the]
red: [tree]
walks: [to]
I also made the following changes to your implementation:
Don't add duplicates to the list of following words. Note that a Set would be a more appropriate data structure for this task, but you clearly state that a requirement is to use LinkedList.
Use String.toLowerCase() to move all strings to lower case, so that "the" and "The" are treated equivalently. (Be sure you apply this to the initial value of firstWord as well, which doesn't appear in the code you provided.)
Note that both this solution and your original attempt assume that punctuation has already been removed.
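If the outer structure really has to stay an ArrayList of LinkedLists, the look-up has to compare firstWord against the head of each inner list instead of calling contains on the outer list. The following is only a sketch under that assumption; the class name FollowsListDemo and the helpers findList and buildFollows are hypothetical names introduced for illustration, and a String is scanned instead of your data file.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Scanner;

public class FollowsListDemo {

    // Returns the inner list whose first element is `word`, or null if there is none.
    // This is the O(n) scan that the Map version avoids.
    static LinkedList<String> findList(List<LinkedList<String>> follows, String word) {
        for (LinkedList<String> list : follows) {
            if (list.getFirst().equals(word)) {
                return list;
            }
        }
        return null;
    }

    // Builds the list of lists: each inner list starts with a word, followed by
    // the distinct words that come directly after it in the text.
    static List<LinkedList<String>> buildFollows(String text) {
        List<LinkedList<String>> follows = new ArrayList<>();
        Scanner words = new Scanner(text);
        if (!words.hasNext()) {
            return follows;
        }
        String firstWord = words.next().toLowerCase();
        while (words.hasNext()) {
            String secondWord = words.next().toLowerCase();
            LinkedList<String> list = findList(follows, firstWord);
            if (list == null) {
                // first time we see this word: start a new inner list headed by it
                list = new LinkedList<>();
                list.add(firstWord);
                follows.add(list);
            }
            // skip the head element when checking for duplicate followers
            if (!list.subList(1, list.size()).contains(secondWord)) {
                list.add(secondWord);
            }
            firstWord = secondWord;
        }
        return follows;
    }

    public static void main(String[] args) {
        // punctuation is assumed to be removed already, as in the answer above
        for (LinkedList<String> list : buildFollows("The cat walks to the red tree")) {
            System.out.println(list);
        }
    }
}
```

Every inner list is created with its head word already in place, so getFirst() never sees an empty list. The last word of the text gets no list of its own unless it also appears earlier.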

You should not program against concrete implementation classes; program against their interfaces instead, which eases development (among other benefits). So, instead of type casting every now and then, declare your variable as List and name the concrete class only when initializing it. Since you haven't posted the relevant declarations, here is an example:
List<List<String>> listOfListOfString = new LinkedList<>(); //assuming Java 7 or later used
List<String> listOne = new ArrayList<>();
listOne.add("hello");
listOne.add("world");
listOfListOfString.add(listOne);
List<String> listTwo = new ArrayList<>();
listTwo.add("bye");
listTwo.add("world");
listOfListOfString.add(listTwo);
for (List<String> list : listOfListOfString) {
System.out.println(list);
}
This will print:
[hello, world]
[bye, world]
Note that now you can change the implementation of any of listOne or listTwo to LinkedList:
List<String> listOne = new LinkedList<>();
//...
List<String> listTwo = new LinkedList<>();
And the code will behave the same. No need to do any typecast to make it work.
Related:
What does it mean to "program to an interface"?

Related

How to find the number of unique words in array list

So I am trying to create a for loop to find unique elements in an ArrayList.
I already have an ArrayList storing 20 places from user input (repeats are allowed), but I am stuck on how to count the number of different places in the list, excluding duplicates. (I would like to avoid using hash.)
Input:
[park, park, sea, beach, town]
Output:
[Number of unique places = 4]
Here's a rough example of the code I'm trying to make:
public static void main(String[] args) {
ArrayList<City> place = new ArrayList<>();
Scanner sc = new Scanner(System.in);
for(...) { // this is just to receive 20 inputs from users using the scanner
...
}
// This is where I am lost on creating a for loop...
}
You can use a Set for that.
https://docs.oracle.com/javase/7/docs/api/java/util/Set.html
Store the list data in the Set. A Set will not contain duplicates, so the size of the set is the number of elements without duplicates.
Use this method to get the set size:
https://docs.oracle.com/javase/7/docs/api/java/util/Set.html#size()
Sample code:
List<String> citiesWithDuplicates =
Arrays.asList(new String[] {"park", "park", "sea", "beach", "town"});
Set<String> cities = new HashSet<>(citiesWithDuplicates);
System.out.println("Number of unique places = " + cities.size());
If you are able to use Java 8, you can use the distinct method of Java streams:
long numOfUniquePlaces = list.stream().distinct().count(); // count() returns a long
Otherwise, using a set is the easiest solution. Since you don't want to use "hash", use a TreeSet (although HashSet is in most cases the better solution). If that is not an option either, you'll have to manually check for each element whether it's a duplicate or not.
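The TreeSet suggestion might look like the sketch below. It assumes the places are plain Strings (which are Comparable, as TreeSet requires) rather than the City objects from the question; the class name UniqueWithTreeSet and the method countUnique are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class UniqueWithTreeSet {

    // Copies the list into a TreeSet, which keeps one copy of each value
    // (ordered by compareTo rather than by hash code), and returns its size.
    static int countUnique(List<String> places) {
        Set<String> unique = new TreeSet<>(places);
        return unique.size();
    }

    public static void main(String[] args) {
        List<String> places = Arrays.asList("park", "park", "sea", "beach", "town");
        System.out.println("Number of unique places = " + countUnique(places));
    }
}
```

For the sample input this prints "Number of unique places = 4".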
One way that comes to mind (without using Set or hashvalues) is to make a second list.
ArrayList<City> places = new ArrayList<>();
//Fill array
ArrayList<String> uniquePlaces = new ArrayList<>();
for (City city : places){
if (!uniquePlaces.contains(city.getPlace())){
uniquePlaces.add(city.getPlace());
}
}
//number of unique places:
int uniqueCount = uniquePlaces.size();
Note that this is not super efficient =D
If you do not want to use implementations of the Set or Map interfaces (which would solve your problem with one line of code) and you want to stick with ArrayList, I suggest using something like the Collections.sort() method. It will sort your elements. Then iterate through the sorted list, comparing each element with its neighbor to count the duplicates. This trick can make your iteration problem easier to solve.
Anyway, I strongly recommend using one of the implementations of Set interface.
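A minimal sketch of the sort-then-compare idea, assuming the places are Strings; the class name SortedUniqueCount and the method countUnique are hypothetical. After sorting, equal values sit next to each other, so an element is new exactly when it differs from the one before it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortedUniqueCount {

    static int countUnique(List<String> places) {
        // sort a copy so the caller's list keeps its original order
        List<String> sorted = new ArrayList<>(places);
        Collections.sort(sorted);
        int count = 0;
        for (int i = 0; i < sorted.size(); i++) {
            // the first element, or any element that differs from its predecessor, is new
            if (i == 0 || !sorted.get(i).equals(sorted.get(i - 1))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> places = Arrays.asList("park", "park", "sea", "beach", "town");
        System.out.println("Number of unique places = " + countUnique(places));
    }
}
```

The sort costs O(n log n) and the counting pass is O(n), which beats the quadratic contains-based loops once the list gets large.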
Use the following approach. It adds an element to the distinct list only at its last occurrence, so each value is counted once no matter how many duplicates there are.
List<String> citiesWithDuplicates = Arrays.asList(new String[] {
"park", "park", "sea", "beach", "town", "park", "beach" });
List<String> distinctCities = new ArrayList<String>();
int currentIndex = 0;
for (String city : citiesWithDuplicates) {
int index = citiesWithDuplicates.lastIndexOf(city);
if (index == currentIndex) {
distinctCities.add(city);
}
currentIndex++;
}
System.out.println("[ Number of unique places = "
+ distinctCities.size() + "]");
Well if you do not want to use any HashSets or similar options, a quick and dirty nested for-loop like this for example does the trick (it is just slow as hell if you have a lot of items (20 would be just fine)):
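int differentCount = 0;
for (int i = 0; i < place.size(); i++) {
    boolean same = false;
    // compare only with earlier elements, so an element never matches itself
    // (City must override equals() for this to compare by value)
    for (int j = 0; j < i; j++) {
        if (place.get(i).equals(place.get(j))) {
            same = true;
            break;
        }
    }
    if (!same)
        differentCount++;
}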
System.out.printf("Number of unique places = %d\n",differentCount);

How to transform each set of two elements of a source list into a transformed list?

I have a List<String> with elements like:
"<prefix-1>/A",
"<prefix-1>/B",
"<prefix-2>/A",
"<prefix-2>/B",
"<prefix-3>/A",
"<prefix-3>/B",
that is, for every <prefix>, there are two entries: <prefix>/A, <prefix>/B. (My list is already sorted, the prefixes might have different length.)
I want the list of prefixes:
"<prefix-1>",
"<prefix-2>",
"<prefix-3>",
What is a good way to transform a source list when multiple elements (always a constant number of them) correspond to one element in the transformed list?
Thank you for your consideration
If the prefixes are always a constant length, you can trim them out and put them into a Set:
List<String> elements = // initialize here
Set<String> prefixes = new HashSet<String>();
for( String element : elements) {
String prefix = element.substring(0,"<prefix-n>".length());
prefixes.add(prefix);
}
// Prefixes now has a unique set of prefixes.
You can do the same thing with regular expressions if you have a variable length prefix, or if you have more complex conditions.
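For the variable-length case, one possible regular expression captures everything before the last slash. This is a sketch under the assumption that every element has the form prefix/suffix; the class name PrefixExtractor and the method prefixes are hypothetical. It uses a LinkedHashSet instead of the HashSet above so that the prefixes keep the order of the source list.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrefixExtractor {

    // Group 1 is everything up to (not including) the last '/'.
    private static final Pattern PREFIX = Pattern.compile("^(.*)/[^/]*$");

    static Set<String> prefixes(List<String> elements) {
        Set<String> result = new LinkedHashSet<>();
        for (String element : elements) {
            Matcher m = PREFIX.matcher(element);
            // elements without any '/' are skipped
            if (m.matches()) {
                result.add(m.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> elements = Arrays.asList(
                "<prefix-1>/A", "<prefix-1>/B", "<longer-prefix-2>/A", "<longer-prefix-2>/B");
        System.out.println(prefixes(elements));
    }
}
```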
Here is a solution that does not change the order of prefixes in the result. Since the elements are pre-sorted, you can take elements until you find a prefix that differs from the last taken element, and add new elements to the result, like this:
List<String> res = new ArrayList<String>();
String last = null;
for (String s : src) {
String cand = s.substring(0, s.lastIndexOf('/'));
// initially, last is null, so the first item will always be taken
if (!cand.equals(last)) {
// The assignment of last happens together with addition.
// If you think it's not overly readable, you can move it out.
res.add(last = cand);
}
}
Here is a demo on ideone.
If the number of structurally similar elements is always the same, then you can just loop over the beginning of the list to find out this number, and then skip elements to construct the rest.
public List<String> getMyList(String prefix){
List<String> selected= new ArrayList<String>();
for(String s:mainList){
if(s.endsWith(prefix.toLowerCase())) // or .contains(), depending on
selected.add(s); // what you want exactly
}
return selected;
}

How to find total number of different items within an arraylist.

I've done some searching but I wasn't able to find a valid solution. I have an arraylist storing Strings such as gum, socks, OJ, dog food...
I am having trouble iterating over the list to determine the total number of different types of items.
ie.
ArrayList<String> Store = new ArrayList<String>();
this.Store.add("Gum");
this.Store.add("Gum");
this.Store.add("Socks");
this.Store.add("Candy");
The list has 4 total items, but only three different kinds of items (Gum, Socks, Candy).
How would I design a method to calculate the 3?
What Bhesh Gurung said, but in code:
int numUnique = new HashSet<String>(Store).size();
If what you actually have is StoreItems and need to go through getName() then I would do
Set<String> itemNames = new HashSet<String>();
for (StoreItem item : Store)
itemNames.add(item.getName());
int numUnique = itemNames.size();
Use a Set (HashSet) whose size will give you what you are looking for.
This looks like homework... So, if you do not understand the HashSet solution proposed above (or doing the same with a HashMap), think about doing something like this:
Create a new ArrayList
Take an element and check to see if it exists in the new ArrayList
If it is present in the new ArrayList, do nothing. Else add it.
Do this until you have examined the last element of the ArrayList.
Then, the size of the new array list should be the number you are looking for.
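The steps above can be sketched as follows. The class name DistinctCounter and the method countDistinct are hypothetical names chosen for this example; the sample data is the list from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DistinctCounter {

    static int countDistinct(List<String> items) {
        // step 1: create a new ArrayList
        List<String> seen = new ArrayList<>();
        // steps 2-4: examine every element and add it only if it is not already present
        for (String item : items) {
            if (!seen.contains(item)) {
                seen.add(item);
            }
        }
        // step 5: the size of the new list is the number of distinct items
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> store = Arrays.asList("Gum", "Gum", "Socks", "Candy");
        System.out.println(countDistinct(store)); // prints 3
    }
}
```

Because contains() scans the new list each time, this takes quadratic time in the worst case, which is fine for small lists.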
You can use the lastIndexOf method and loop through the arraylist.
int count = 0;
for (int i = 0; i < arraylist.size(); i++) {
    // count an element only at its last occurrence in the list
    if (i == arraylist.lastIndexOf(arraylist.get(i))) {
        count++;
    }
}
Tested.

Merged List from Sources and Prefixes

I was given the problem below in school. I have solved it according to my understanding, and my solution is below. Can someone please help me find a better solution?
Question:
Produce a software application that creates a filtered merging of two case-insensitive sorted lists. The first input list is designated as the Source list and the other as the Prefixes list. The application will produce a merged list containing items that come from the Source and Prefixes lists using the following algorithm:
An item X is in the merged list if and only if one of the following is true:
a) X is from the Source list and there is an item Y in the Prefixes list that is a case-insensitive string prefix for X.
b) X is from the Prefixes list and there is no item in the Source list for which X is a case-insensitive string prefix.
The completed merged list should be in the same sort order as the items in the original two lists.
My Solution:
public ArrayList<String> merge(List<String> srcList, List<String> preList) {
// If Prefixes list is empty then there cannot be a new merge list
if (preList.isEmpty()) {
return null;
}
int i = 0, j = 0;
int sourcesListSize = srcList.size();
int prefixesListSize = preList.size();
ArrayList<String> mergeList = new ArrayList<String>();
// Loop through Sources list until end of the list is reached
//ASSUMPTION: Both SourceList and PrefixList are already sorted.
while (i < sourcesListSize && j<prefixesListSize) {
mergeList.add(preList.get(j).concat(srcList.get(i)));
i++;
j++;
}
// If Prefixes list still have items, then add it to mergeList
while (j < prefixesListSize) {
mergeList.add(preList.get(j));
j++;
}
return mergeList;
}
Input:
Source list: {"pple","ow","enver","pic","ull"}
PrefixList: {"a","c","d","e","f"}
MergeList={"apple","cow","denver","epic","full"}
Is my understanding correct? Is there a better solution?
Since this is homework I'll try not to give too much away, but per the definition of prefix, here are some examples:
"a" is a prefix of "Apple"
"cow" is a prefix of "Cow"
"g" is not a prefix of "Zoo"
"col" is not a prefix of "cool"
Based on that, what will the MergeList be for the following? Hint: there will be items from both SourceList and PrefixList in the correct MergeList. Post your solution and I'll critique it. Once you understand how this part works, you'll have a much better idea of how to code the solution.
SourceList : {"Apple","Pepper","Denver", "Garage", "Zoo"}
PrefixList : {"a","d","pe","xylophone","e"}
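To check the prefix examples above without solving the exercise itself, a case-insensitive prefix test can be written with String.regionMatches. This is only a sketch of that single building block; the class name PrefixCheck and the method isPrefixIgnoreCase are hypothetical, and the merge logic is deliberately left out.

```java
public class PrefixCheck {

    // True if `prefix` is a case-insensitive prefix of `item`.
    // regionMatches returns false when `prefix` is longer than `item`.
    static boolean isPrefixIgnoreCase(String prefix, String item) {
        return item.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    public static void main(String[] args) {
        System.out.println(isPrefixIgnoreCase("a", "Apple"));   // true
        System.out.println(isPrefixIgnoreCase("cow", "Cow"));   // true
        System.out.println(isPrefixIgnoreCase("g", "Zoo"));     // false
        System.out.println(isPrefixIgnoreCase("col", "cool"));  // false
    }
}
```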

How to remove all specific elements from Vector

I actually have a solution for the problem in the title, but my approach seems to waste resources by creating extra List objects.
So my question is: is there a more efficient approach?
In my case, I want to remove the extra " " and extra "a" entries from a Vector.
My vector includes:
{"a", "rainy", " ", "day", "with", " ", "a", "cold", "wind", "day", "a"}
Here is my code:
List lt = new LinkedList();
lt = new ArrayList();
lt.add("a");
lt.add(" ");
vec1.removeAll(lt);
As you can see, the Vector contains extra spaces. That happens because I use the Vector to read and chunk the words from a Word document, and sometimes the document contains extra spaces caused by human error.
Your current approach does suffer the problem that deleting an element from a Vector is an O(N) operation ... and you are potentially doing this M times (5 in your example).
Assuming that you have multiple "stop words" and that you can change the data structures, here's a version that should (in theory) be more efficient:
public List<String> removeStopWords(
        List<String> input, HashSet<String> stopWords) {
    List<String> output = new ArrayList<String>(input.size());
    for (String elem : input) {
        // keep only the elements that are not stop words
        if (!stopWords.contains(elem)) {
            output.add(elem);
        }
    }
    return output;
}
// This could be saved somewhere, assuming that you are always filtering
// out the same stopwords.
HashSet<String> stopWords = new HashSet<String>();
stopWords.add(" ");
stopWords.add("a");
... // and more
List<String> newList = removeStopWords(list, stopWords);
Points of note:
The above creates a new list. If you have to reuse the existing list, clear it and then addAll the new list elements. (This is another O(N-M) step ... so don't if you don't have to.)
If there are multiple stop words then using a HashSet will be more efficient; e.g. if done as above. I'm not sure exactly where the break even point is (versus using a List), but I suspect it is between 2 and 3 stopwords.
The above creates a new list, but it only copies N - M elements. By contrast, the removeAll algorithm when applied to a Vector could copy O(NM) elements.
Don't use a Vector unless you need a thread-safe data structure. An ArrayList has a similar internal data structure, and doesn't incur synchronization overheads on each call.
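If the existing list does have to be reused, the clear-then-addAll step mentioned in the points above could look like this sketch. It assumes an ArrayList of Strings instead of a Vector; the class name StopWordFilter and the method removeStopWordsInPlace are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StopWordFilter {

    // Removes every stop word from `list` in place by filtering into a
    // temporary list and then copying the surviving elements back.
    static void removeStopWordsInPlace(List<String> list, Set<String> stopWords) {
        List<String> kept = new ArrayList<>(list.size());
        for (String elem : list) {
            if (!stopWords.contains(elem)) {
                kept.add(elem);
            }
        }
        list.clear();
        list.addAll(kept);
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList(
                "a", "rainy", " ", "day", "with", " ", "a", "cold", "wind", "day", "a"));
        Set<String> stopWords = new HashSet<>(Arrays.asList(" ", "a"));
        removeStopWordsInPlace(words, stopWords);
        System.out.println(words); // [rainy, day, with, cold, wind, day]
    }
}
```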
