Merged List from Sources and Prefixes

I was given the problem below at school and have solved it according to my understanding; my solution follows. Can someone please help me find a better solution?
Question:
Produce a software application that creates a filtered merging of two case-insensitive sorted lists. The first input list is designated as the Source list and the other as the Prefixes list. The application will produce a merged list containing items that come from the Source and Prefixes lists using the following algorithm:
An item X is in the merged list if and only if one of the following is true:
a) X is from the Source list and there is an item Y in the Prefixes list that is a case-insensitive string prefix for X.
b) X is from the Prefixes list and there is no item in the Source list for which X is a case-insensitive string prefix.
The completed merged list should be in the same sort order as the items in the original two lists.
My Solution:
public ArrayList<String> merge(List<String> srcList, List<String> preList) {
// If Prefixes list is empty then there cannot be a new merge list
if (preList.isEmpty()) {
return null;
}
int i = 0, j = 0;
int sourcesListSize = srcList.size();
int prefixesListSize = preList.size();
ArrayList<String> mergeList = new ArrayList<String>();
// Loop through Sources list until end of the list is reached
//ASSUMPTION: Both SourceList and PrefixList are already sorted.
while (i < sourcesListSize && j<prefixesListSize) {
mergeList.add(preList.get(j).concat(srcList.get(i)));
i++;
j++;
}
// If Prefixes list still have items, then add it to mergeList
while (j < prefixesListSize) {
mergeList.add(preList.get(j));
j++;
}
return mergeList;
}
Input:
Source list: {"pple","ow","enver","pic","ull"}
PrefixList: {"a","c","d","e","f"}
MergeList: {"apple","cow","denver","epic","full"}
Is my understanding correct? Is there a better solution?

Since this is homework I'll try not to give too much away, but per the definition of prefix, here are some examples:
"a" is a prefix of "Apple"
"cow" is a prefix of "Cow"
"g" is not a prefix of "Zoo"
"col" is not a prefix of "cool"
Based on that, what will the MergeList be for the following? Hint: there will be items from both SourceList and PrefixList in the correct MergeList. Post your solution and I'll critique it. Once you understand how this part works, you'll have a much better idea of how to code the solution.
SourceList : {"Apple","Pepper","Denver", "Garage", "Zoo"}
PrefixList : {"a","d","pe","xylophone","e"}
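The prefix definition used in the examples above can be checked in code. This is only a minimal sketch of that one test, not a solution to the merge itself; the class name PrefixCheck and the method name isPrefixIgnoreCase are made up for illustration. It relies on the standard String.regionMatches overload that takes an ignoreCase flag.

```java
final class PrefixCheck {
    // Hypothetical helper: true when 'prefix' is a case-insensitive prefix of 'item'.
    static boolean isPrefixIgnoreCase(String prefix, String item) {
        // Compares the first prefix.length() characters of 'item' with 'prefix',
        // ignoring case. Returns false if 'item' is shorter than 'prefix'.
        return item.regionMatches(true, 0, prefix, 0, prefix.length());
    }
}
```

With this helper, isPrefixIgnoreCase("a", "Apple") and isPrefixIgnoreCase("cow", "Cow") are true, while isPrefixIgnoreCase("g", "Zoo") and isPrefixIgnoreCase("col", "cool") are false, matching the four examples.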

Related

Finding a word in file storing it in an array list and making sure that word isn't accounted for more than once?

public ArrayList<String> getWords()
{
int size1 = lines.size();
int size2 = 0;
int counter3 = 0;
ArrayList<Integer> checkthewords;
for (int x = 0; x < size1; x++)
{
size2 = lines.get(x).substring(x).length();
for (int y = 0; y < size2; y++)
{
if (Character.isLetter(charAt(((lines.get(x)).indexOf(x, z + x)))))
{
words.set(z, lines.get(x).substring(x,z + 1));
}
else
{
checkthewords.set(counter3, words);
counter3++;
}
if (checkthewords.get(x).equals(checkthewords.get(counter3)))
{
}
}
}
return words;
}
The method above is called getWords(). I am trying to read words from a file and store them in the ArrayList checkthewords, and I want to make sure that no word is stored in checkthewords more than once.
I have the if statement:
if (Character.isLetter(charAt(((lines.get(x)).indexOf(x, z + x)))))
but I don't know where to go from there.

I'm pretty sure your code won't run at the moment. You are doing some strange things in there and I don't really understand it.
Try to approach this one step at a time.
The first step is to get the word from the file. Make sure you can parse the line and extract the word you want.
Then you need to check if the word exists in your checkthewords list. If it doesn't exist, add it. You can use the contains method provided by List to see if the list contains something.
if(!checkthewords.contains(word)) {
// it's not in the list yet, add it
checkthewords.add(word);
}
Also when you create your checkthewords list, you don't initialise it (so it's null):
ArrayList<String> checkthewords;
should be:
ArrayList<String> checkthewords = new ArrayList<String>();
And you shouldn't use checkthewords.set() like that. set is used to replace an existing element, not to add a new element. You could easily be setting an element that doesn't exist yet, which throws an IndexOutOfBoundsException. Use checkthewords.add(word) to add something to your list.
See the ArrayList documentation.
set(int index, E element)
Replaces the element at the specified position in this list with the specified element.
It seems like you're overthinking this. Keep it simple. :)
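Putting the two steps together, a rough sketch might look like the following. It assumes that a "word" is any run of letters and that the file has already been read into a list of lines; the class name UniqueWordsList and the method name uniqueWords are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class UniqueWordsList {
    // Step 1: split each line into words. Step 2: add a word only if it is new.
    static List<String> uniqueWords(List<String> lines) {
        List<String> checkthewords = new ArrayList<>();
        for (String line : lines) {
            // Assumption: anything that is not a letter separates words.
            for (String word : line.split("[^\\p{L}]+")) {
                // split() can yield an empty first token when a line starts
                // with a separator, so skip empty strings.
                if (!word.isEmpty() && !checkthewords.contains(word)) {
                    checkthewords.add(word);
                }
            }
        }
        return checkthewords;
    }
}
```

Note that contains() scans the whole list each time, so this is fine for small files but slow for large ones.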

You should use Set in Java to store elements when duplicates are not allowed.
A collection that contains no duplicate elements.
If you want to retain insertion order as well then use LinkedHashSet which is Hash table and linked list implementation of the Set interface, with predictable iteration order.
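As a small illustration of the LinkedHashSet suggestion, the sketch below removes duplicates while keeping the order of first appearance. The names UniqueWordsSet and uniqueWords are hypothetical, and the input is assumed to be a list of words that has already been extracted.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class UniqueWordsSet {
    // Returns the distinct words in the order in which they first appear.
    static List<String> uniqueWords(List<String> words) {
        // The set silently ignores elements it already contains.
        Set<String> seen = new LinkedHashSet<>(words);
        return new ArrayList<>(seen);
    }
}
```

A plain HashSet would also drop the duplicates, but its iteration order is unspecified.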
See also:
HashSet vs TreeSet vs LinkedHashSet
HashSet vs LinkedHashSet

How to add to an arraylist of linkedlists?

I am sorry if this is a stupid question but I am new to Java linkedlists and arraylists.
What I wish to do is this:
I have a text file that I read word by word. I want to create an ArrayList of LinkedLists, with each unique word in the text followed in its linked list by the words that follow it in the text.
Consider this piece of text: The cat walks to the red tree.
I want the Arraylist of LinkedLists to be like this:
The - cat - red
|
cat - walks
|
to - the
|
red - tree
What I have now is this:
while(dataFile.hasNext()){
secondWord = dataFile.next();
nWords++;
if(nWords % 1000 ==0) System.out.println(nWords+" words");
//and put words into list if not already there
//check if this word is already in the list
if(follows.contains(firstWord)){
//add the next word to it's linked list
((LinkedList)(firstWord)).add(secondWord);
}
else{
//create new linked list for this word and then add next word
follows.add(new LinkedList<E>().add(firstWord));
((LinkedList)(firstWord)).add(secondWord);
}
//go on to next word
firstWord = secondWord;
}
And it gives me plenty of errors.
What is the best way to do this? (With linked lists. I know hash tables and binary trees are better, but I need to use linked lists.)

An ArrayList is not the best data structure for your outer list, and at least part of your difficulty stems from incorrect use of a list of lists.
In your implementation, presumably follows is an ArrayList of LinkedLists declared like this:
ArrayList<LinkedList<String>> follows = new ArrayList<>();
The result of follows.contains(firstWord) will never be true, because follows contains elements of type LinkedList, not String. firstWord is a String, and so would not be an element of follows, but would be the first element of a LinkedList which is an element of follows.
The solution offered below uses a Map, or more specifically a HashMap, for the outer structure follows. A Map is preferable because when searching for the first word, the expected look-up time is O(1) using a hash map versus O(n) for a list.
String firstWord = dataFile.next().toLowerCase();
Map<String, List<String>> follows = new HashMap<>();
int nWords = 0;
while (dataFile.hasNext())
{
String secondWord = dataFile.next().toLowerCase();
nWords++;
if (nWords % 1000 == 0)
{
System.out.println(nWords + " words");
}
//and put words into list if not already there
//check if this word is already in the list
if (follows.containsKey(firstWord))
{
//add the next word to its linked list
List<String> list = follows.get(firstWord);
if (!list.contains(secondWord))
{
list.add(secondWord);
}
}
else
{
//create new linked list for this word and then add next word
List<String> list = new LinkedList<>();
list.add(secondWord);
follows.put(firstWord, list);
}
//go on to next word
firstWord = secondWord;
}
The map will look like this:
the: [cat, red]
cat: [walks]
to: [the]
red: [tree]
walks: [to]
I also made the following changes to your implementation:
Don't add duplicates to the list of following words. Note that a Set would be a more appropriate data structure for this task, but you clearly state that a requirement is to use LinkedList.
Use String.toLowerCase() to move all strings to lower case, so that "the" and "The" are treated equivalently. (Be sure you apply this to the initial value of firstWord as well, which doesn't appear in the code you provided.)
Note that both this solution and your original attempt assume that punctuation has already been removed.
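If the outer structure really must be an ArrayList of LinkedLists, as in the question, a sketch along the following lines is one possibility. It is not the Map-based solution above: each inner list stores the word itself at index 0, followed by its distinct followers, and lookups are a linear scan. The names FollowsLists, build and find are made up, and the text is assumed to be whitespace-separated with punctuation already removed.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

final class FollowsLists {
    // Each inner list holds a word at index 0, then the distinct words seen right after it.
    static List<LinkedList<String>> build(String text) {
        List<LinkedList<String>> follows = new ArrayList<>();
        String[] words = text.trim().toLowerCase().split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            LinkedList<String> entry = find(follows, words[i]);
            if (entry == null) {
                // First time this word is seen: start a new list headed by the word.
                entry = new LinkedList<>();
                entry.add(words[i]);
                follows.add(entry);
            }
            // Skip the head (index 0) when checking for duplicate followers.
            if (!entry.subList(1, entry.size()).contains(words[i + 1])) {
                entry.add(words[i + 1]);
            }
        }
        return follows;
    }

    // Linear scan over the outer list: O(n) per lookup, unlike the HashMap version.
    static LinkedList<String> find(List<LinkedList<String>> follows, String word) {
        for (LinkedList<String> entry : follows) {
            if (entry.getFirst().equals(word)) {
                return entry;
            }
        }
        return null;
    }
}
```

For the sample sentence "The cat walks to the red tree", build returns [[the, cat, red], [cat, walks], [walks, to], [to, the], [red, tree]].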

You should not work directly with concrete class implementations; use their interfaces instead to ease development (among other reasons). So, instead of type casting every now and then, declare your variable as List and name the concrete class only when initializing it. Since you haven't posted the relevant code to redefine it, here is an example of this:
List<List<String>> listOfListOfString = new LinkedList<>(); //assuming Java 7 or later used
List<String> listOne = new ArrayList<>();
listOne.add("hello");
listOne.add("world");
listOfListOfString.add(listOne);
List<String> listTwo = new ArrayList<>();
listTwo.add("bye");
listTwo.add("world");
listOfListOfString.add(listTwo);
for (List<String> list : listOfListOfString) {
System.out.println(list);
}
This will print:
[hello, world]
[bye, world]
Note that now you can change the implementation of any of listOne or listTwo to LinkedList:
List<String> listOne = new LinkedList<>();
//...
List<String> listTwo = new LinkedList<>();
And the code will behave the same. No need to do any typecast to make it work.
Related:
What does it mean to "program to an interface"?

Howto transform each set of two elements of a source list into a transformed list?

I have a List<String> with elements like:
"<prefix-1>/A",
"<prefix-1>/B",
"<prefix-2>/A",
"<prefix-2>/B",
"<prefix-3>/A",
"<prefix-3>/B",
that is, for every <prefix>, there are two entries: <prefix>/A, <prefix>/B. (My list is already sorted, the prefixes might have different length.)
I want the list of prefixes:
"<prefix-1>",
"<prefix-2>",
"<prefix-3>",
What is a good way to transform a source list, when multiple (but always a constant amount of elements) correspond to one element in the transformed list?
Thank you for your consideration

If the prefixes are always a constant length, you can trim them out and put them into a Set:
List<String> elements = // initialize here
Set<String> prefixes = new HashSet<String>();
for( String element : elements) {
String prefix = element.substring(0,"<prefix-n>".length());
prefixes.add(prefix);
}
// Prefixes now has a unique set of prefixes.
You can do the same thing with regular expressions if you have a variable length prefix, or if you have more complex conditions.
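For the variable-length case, a regular-expression version might look like the sketch below. It assumes every entry ends in a literal "/A" or "/B", as in the question; the names RegexPrefixes, ENTRY and prefixes are invented. It also swaps the HashSet for a LinkedHashSet so the prefixes come out in their original order.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexPrefixes {
    // Group 1 captures everything before the final "/A" or "/B" (assumed suffixes).
    private static final Pattern ENTRY = Pattern.compile("(.*)/[AB]");

    static Set<String> prefixes(List<String> elements) {
        Set<String> result = new LinkedHashSet<>();
        for (String element : elements) {
            Matcher m = ENTRY.matcher(element);
            // matches() requires the whole string to fit the pattern;
            // entries that do not fit are skipped.
            if (m.matches()) {
                result.add(m.group(1));
            }
        }
        return result;
    }
}
```

Because the group is greedy, a prefix that itself contains a slash, such as "long/y" in "long/y/A", is kept whole.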

Here is a solution that does not change the order of prefixes in the result. Since the elements are pre-sorted, you can take elements until you find a prefix that differs from the last taken element, and add new elements to the result, like this:
List<String> res = new ArrayList<String>();
String last = null;
for (String s : src) {
String cand = s.substring(0, s.lastIndexOf('/'));
// initially, last is null, so the first item will always be taken
if (!cand.equals(last)) {
// The assignment of last happens together with addition.
// If you think it's not overly readable, you can move it out.
res.add(last = cand);
}
}

If the number of structurally similar elements is always the same, then you can just loop over the beginning of the list to find out this number, and then skip elements to construct the rest.
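A rough sketch of that idea follows. It assumes the list is sorted, every entry contains a '/', and every prefix has the same number of entries; the names SkipByGroup, prefixOf and prefixes are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SkipByGroup {
    // Assumption: every entry contains at least one '/'.
    static String prefixOf(String s) {
        return s.substring(0, s.lastIndexOf('/'));
    }

    static List<String> prefixes(List<String> src) {
        List<String> result = new ArrayList<>();
        if (src.isEmpty()) {
            return result;
        }
        // Count how many leading entries share the first prefix.
        String first = prefixOf(src.get(0));
        int groupSize = 1;
        while (groupSize < src.size() && prefixOf(src.get(groupSize)).equals(first)) {
            groupSize++;
        }
        // Take one entry per group and skip the rest of that group.
        for (int i = 0; i < src.size(); i += groupSize) {
            result.add(prefixOf(src.get(i)));
        }
        return result;
    }
}
```

If the group size is not actually constant, this silently produces wrong results, so the approach that compares each prefix with the previous one is the safer choice.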

public List<String> getMyList(String prefix) {
    List<String> selected = new ArrayList<String>();
    for (String s : mainList) {
        // endsWith() or contains(), depending on what you want exactly
        if (s.endsWith(prefix.toLowerCase()))
            selected.add(s);
    }
    return selected;
}

Finding index of duplicated element in arraylist

I'm trying to find the index position of the duplicates in an arraylist of strings. I'm having trouble figuring out a way to efficiently loop through the arraylist and report the index of the duplicate. My initial thought was to use Collections.binarySearch() to look for a duplicate, but I'm not sure how I would be able to compare the elements of the arraylist to each other with binarySearch. The only other thought I had would involve looping through the list, which is quite massive, too many times to even be feasible. I have limited java knowledge so any help is appreciated.

Not elegant, but should work:
Map<String, List<Integer>> indexList = new HashMap<String, List<Integer>>();
for (int i = 0; i < yourList.size(); i++) {
String currentString = yourList.get(i);
List<Integer> indexes = indexList.get(currentString);
if (indexes == null) {
indexList.put(currentString, indexes = new LinkedList<Integer>());
}
indexes.add(i);
if (indexes.size() > 1) {
// found duplicate, do what you like
}
}
// if you skip the last if in the for loop you can do this:
for (String string : indexList.keySet()) {
if (indexList.get(string).size() > 1) {
// String string has multiple occurrences
// List of corresponding indexes:
List<Integer> indexes = indexList.get(string);
// do what you want
}
}

It sounds like you're out of luck.
You will have to inspect every element (i.e. iterate through the whole list). Think about it logically - if you could avoid this, it means that there's one element that you haven't inspected. But this element could be any value, and so could be a duplicate of another list element.
Binary searches are a smart way to reduce the number of elements checked when you are aware of some relationship that holds across the list - so that checking one element gives you information about the others. For instance, for a sorted list if the middle element is greater than 5, you know that every element after it is also greater than five.
However, I don't think there's a way to make such an inference when it comes to duplicate checking. You'd have to sort the list in terms of "number of elements that this duplicates" (which is begging the question), otherwise no tests you perform on element x will give you insight into whether y is a duplicate.

This may not be a memory-efficient solution, but I guess it is what you were looking for. Maybe this program could be further improved.
import java.io.*;
import java.util.*;
class ArrayList2_CountingDuplicates
{
public static void main(String[] args)throws IOException
{
ArrayList<String> als1=new ArrayList<String>();
ArrayList<String> als2=new ArrayList<String>();
int arr[];
int n,i,j,c=0;
String s;
BufferedReader p=new BufferedReader(new InputStreamReader(System.in));
n=Integer.parseInt(p.readLine());
arr=new int[n];
for(i=0;i<n;i++)
als1.add(p.readLine());
for(i=0;i<n;i++)
{
s=als1.get(i);
als1.remove(i);
als2.add(s);
arr[c]=1;
while(als1.contains(s))
{
j=als1.indexOf(s);
als1.remove(j);
arr[c]=arr[c]+1;
}
n=n-arr[c];
c=c+1;
i=-1;
}
for(i=0;i<c;i++)
System.out.println(als2.get(i)+" has frequency "+arr[i]);
}
}

I was looking for such a method and eventually I came up with my own solution with a more functional approach to solve the problem.
public <T> Map<T, List<Integer>> findDuplicatesWithIndexes(List<T> elems) {
return IntStream.range(0, elems.size())
.boxed()
.collect(Collectors.groupingBy(elems::get))
.entrySet().stream()
.filter(e -> e.getValue().size() > 1)
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
}
It returns a map consisting of duplicated elements as the keys and list of all indexes of repeating element as the value.
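For anyone who wants to try it, here is the same method wrapped in a class with the imports it needs. The class name DuplicateIndexes is made up, and the method is made static only so it can be called without an instance. One caveat worth knowing: Collectors.groupingBy rejects null keys, so this sketch assumes the list contains no null elements.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class DuplicateIndexes {
    static <T> Map<T, List<Integer>> findDuplicatesWithIndexes(List<T> elems) {
        return IntStream.range(0, elems.size())
                .boxed()
                // Group every index by the element stored at that index.
                .collect(Collectors.groupingBy(elems::get))
                .entrySet().stream()
                // Keep only elements that occur at more than one index.
                .filter(e -> e.getValue().size() > 1)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }
}
```

For the list ["a", "b", "a", "c", "b", "a"], the result maps "a" to [0, 2, 5] and "b" to [1, 4], and "c" is absent. The iteration order of the returned map is not specified.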

Remove multiple elements from ArrayList

I have a bunch of indexes and I want to remove elements at these indexes from an ArrayList. I can't do a simple sequence of remove()s because the elements are shifted after each removal. How do I solve this?

To remove elements at indexes:
Collections.sort(indexes, Collections.reverseOrder());
for (int i : indexes)
strs.remove(i);
Or, using the Stream API from Java 8:
indexes.sort(Comparator.reverseOrder());
indexes.stream().mapToInt(i -> i).forEach(strs::remove);

Sort the indices in descending order and then remove them one by one. If you do that, there's no way a remove will affect any indices that you later want to remove.
How you sort them will depend on the collection you are using to store the indices. If it's a list, you can do this:
List<Integer> indices;
Collections.sort(indices, new Comparator<Integer>() {
public int compare(Integer a, Integer b) {
//todo: handle null
return b.compareTo(a);
}
});
Edit
@aioobe found the helper that I failed to find. Instead of the above, you can use:
Collections.sort(indices, Collections.reverseOrder());
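A self-contained sketch of the descending-order removal is shown below. The names RemoveByIndexes and removeAt are invented, and the sketch assumes the indexes are distinct and within bounds. It also shows a detail that is easy to trip over: List has both remove(int index) and remove(Object o), so the loop variable should be a primitive int.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class RemoveByIndexes {
    // Removes the elements at the given positions. 'indexes' is copied, not modified.
    static <T> void removeAt(List<T> list, List<Integer> indexes) {
        List<Integer> sorted = new ArrayList<>(indexes);
        Collections.sort(sorted, Collections.reverseOrder());
        for (int index : sorted) {
            // 'index' is a primitive int, so this calls remove(int index).
            // With an Integer variable it would call remove(Object o) instead.
            list.remove(index);
        }
    }
}
```

Starting from ["a", "b", "c", "d", "e"] and the indexes 1, 3, 0, the list ends up as ["c", "e"].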

I came here for removing elements in specific range (i.e., all elements between 2 indexes), and found this:
list.subList(indexStart, indexEnd).clear()

You can remove the elements starting from the largest index downwards, or if you have references to the objects you wish to remove, you can use the removeAll method.

You might want to use the subList method with the range of indexes you would like to remove, and then call clear() on it.
(Note that the second parameter is exclusive: in this example I pass 2, meaning only indexes 0 and 1 will be removed.)
import java.util.ArrayList;

public class SubListClearDemo {
    public static void main(String[] args) {
        ArrayList<String> animals = new ArrayList<String>();
        animals.add("cow");
        animals.add("dog");
        animals.add("chicken");
        animals.add("cat");
        // Removes indexes 0 and 1; the end index 2 is exclusive.
        animals.subList(0, 2).clear();
        for (String s : animals)
            System.out.println(s);
    }
}
the result will be:
chicken
cat

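You can remove the indexes in reverse order. If the indexes are consecutive, like 1, 2, 3, you can remove the whole range at once. Note that ArrayList.removeRange(fromIndex, toIndex) is protected, so it can only be called from a subclass; from ordinary code, list.subList(1, 4).clear() has the same effect (the end index is exclusive).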

I think nanda's answer was the correct one:
List<T> toRemove = new LinkedList<T>();
for (T t : masterList) {
if (t.shouldRemove()) {
toRemove.add(t);
}
}
masterList.removeAll(toRemove);

You can sort the indices as many said, or you can use an iterator and call remove()
List<String> list = new ArrayList<String>();
list.add("0");
list.add("1");
list.add("2");
list.add("3");
list.add("4");
list.add("5");
list.add("6");
List<Integer> indexes = new ArrayList<Integer>();
indexes.add(2);
indexes.add(5);
indexes.add(3);
int cpt = 0;
Iterator<String> it = list.iterator();
while(it.hasNext()){
it.next();
if(indexes.contains(cpt)){
it.remove();
}
cpt++;
}
It depends on what you need, but sorting will be faster in most cases.

Use Guava! The method you are looking for is Iterators.removeAll(Iterator removeFrom, Collection elementsToRemove)

If you have a great many elements to remove (and a long list), it may be faster to iterate over the list and add all elements that are not to be removed to a new list, since each remove() step in an ArrayList shifts all elements after the removed one. In this case, if your index list is not already sorted (so that you cannot iterate over it in parallel with the main list), you may want to use a HashSet, BitSet or a similar O(1)-access structure for the contains() check:
/**
* Creates a new List containing all elements of {@code original},
* apart from those with an index in {@code indices}.
* Neither the original list nor the indices collection is changed.
* @return a new list containing only the remaining elements.
*/
public <X> List<X> removeElements(List<X> original, Collection<Integer> indices) {
// wrap for faster access.
indices = new HashSet<Integer>(indices);
List<X> output = new ArrayList<X>();
int len = original.size();
for(int i = 0; i < len; i++) {
if(!indices.contains(i)) {
output.add(original.get(i));
}
}
return output;
}

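Sort your list of indexes in descending order first. For example, 2, 12, 9, 7, 3 becomes 12, 9, 7, 3, 2.
Then remove them one by one:
// indexes is an int[] sorted in descending order; source_array is the ArrayList
for (int i = 0; i < indexes.length; i++)
{
source_array.remove(indexes[i]);
}
This should resolve your problem.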

If the elements you wish to remove are all grouped together, you can do a subList(start, end).clear() operation.
If the elements you wish to remove are scattered, it may be better to create a new ArrayList, add only the elements you wish to include, and then copy back into the original list.
Edit: I realize now this was not a question of performance but of logic.

If you want to remove positions X through the end of the list:
//a is the ArrayList
a.subList(X, a.size()).clear();

Assuming your indexes array is sorted (eg: 1, 3, 19, 29), you can do this:
for (int i = 0; i < indexes.size(); i++){
originalArray.remove(indexes.get(i) - i);
}

A more efficient method, which I guess I have not seen above, is to create a new ArrayList, copy into it only the elements whose indices survive, and finally reassign the reference.

I ended up here with a similar query, and @aioobe's answer helped me figure out the solution.
However, if you are populating the list of indices to delete yourself, you might want to consider using this:
indices.add(0, i);
If the indices are found in ascending order, inserting each one at the front leaves the list in descending order. This eliminates the need for (the costly) reverse-sorting of the list before iterating over it and removing elements from the main ArrayList.
