Random accessing array, how to skip duplicates? - java

I have an XML array that I access to pull a random question from. How would I go about making sure no duplicates are pulled? My current code follows.
private void getQuestion() {
// TODO Auto-generated method stub
res = getResources();
qString = res.getStringArray(R.array.questions);
rQuestion = qString[rgenerator.nextInt(qString.length)];
tokens = new StringTokenizer(rQuestion, ":");
wordCount = tokens.countTokens();
sep = new String[wordCount];
wArray = 0;
while (tokens.hasMoreTokens()) {
sep[wArray] = tokens.nextToken();
wArray++;
}
}
Any help would be appreciated.

The Fisher-Yates shuffle is an algorithm that is more or less designed for this purpose.
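As a minimal sketch of that idea, assuming the questions are already loaded into a String array (the class name FisherYates and the placeholder data are hypothetical, not from the question), the shuffle could look like this:

```java
import java.util.Random;

final class FisherYates {
    // Shuffles the array in place by swapping each position, from the end
    // towards the front, with a randomly chosen position at or before it.
    static void shuffle(String[] items, Random random) {
        for (int i = items.length - 1; i > 0; i--) {
            int j = random.nextInt(i + 1); // 0 <= j <= i
            String tmp = items[i];
            items[i] = items[j];
            items[j] = tmp;
        }
    }

    public static void main(String[] args) {
        String[] questions = {"q1:a:b", "q2:c:d", "q3:e:f"}; // placeholder data
        shuffle(questions, new Random());
        // Walking the shuffled array front to back never repeats a question.
        for (String q : questions) {
            System.out.println(q);
        }
    }
}
```

After one shuffle, reading the array in order gives every question exactly once, so no "already used" bookkeeping is needed.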

You are better off putting that array of questions in a list and using Collections.shuffle(). After that, simply iterate through the list. More information can be found at this related answer.
This solution costs some memory for the extra list, but remember that the strings themselves are not copied, only the references to the questions. For best performance, use a list with random access (ArrayList), or use one in place of the array. If you don't, the shuffle method will create one internally.
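A rough sketch of the shuffle-then-iterate approach follows. The class QuestionDeck and its next() method are hypothetical names chosen for illustration; only Collections.shuffle and the standard collection classes come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

final class QuestionDeck {
    private final Iterator<String> remaining;

    QuestionDeck(String[] questions) {
        // Copy the references into an ArrayList so the original array is untouched.
        List<String> shuffled = new ArrayList<String>(Arrays.asList(questions));
        Collections.shuffle(shuffled);
        remaining = shuffled.iterator();
    }

    // Returns the next unseen question, or null once every question has been used.
    String next() {
        return remaining.hasNext() ? remaining.next() : null;
    }
}
```

A null return from next() can serve as the signal to start the end-of-game activity.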

If you want a fast way of getting only the unique values from an array, this link has a very fast method. The code below uses an ArrayList, but it is not hard to convert a string array to an ArrayList, or to just use ArrayLists instead.
e.g. new ArrayList<String>(Arrays.asList(myArray));
In short, you use a HashSet to keep only the unique values, using this method:
public static ArrayList<String> GetUniqueValues(Collection<String> values)
{
return new ArrayList<String>(new HashSet<String>(values));
}
Then use it like so:
ArrayList<String> x = new ArrayList<String>();
x.add("abc");
x.add("abc");
x.add("abc");
x.add("def");
x.add("def");
x.add("ghi");
for (String y : GetUniqueValues(x))
Log.d("something", y); // log each unique value
This yields "abc", "def" and "ghi", although a HashSet does not guarantee the order in which they come out.
To be clear I agree with Travis to ask why you have duplicates. The above is to answer the question.

I figured it out. I switched it to
private void getQuestion() {
res = getResources();
qString = res.getStringArray(R.array.questions);
arrayLength = qString.length;
qTotal = arrayLength;
}
private void getRandom() {
rnd = rgenerator.nextInt(arrayLength);
rQuestion = qString[rnd];
qString[rnd] = "used";
seperate();
}
private void seperate() {
if (!"used".equals(rQuestion)) { // compare string contents, not references
tokens = new StringTokenizer(rQuestion, ":");
wordCount = tokens.countTokens();
sep = new String[wordCount];
wArray = 0;
while (tokens.hasMoreTokens()) {
sep[wArray] = tokens.nextToken();
wArray++;
}
qNumber++;
} else {
if (qNumber < qTotal) {
getRandom();
} else {
startActivity(new Intent("com.example.END"));
}
}
}
It gets the array from resources, then pulls a random question from the array. It then sets that slot to "used" and splits the question. It also checks whether the pulled question is "used" and, if it is, pulls another question. Once all questions are "used", it goes to the end-game activity.

Related

How can I retrieve the value in a Hashmap stored in an arraylist type hashmap?

I am a beginner in Java. Basically, I have loaded each text document and stored the individual words of each document in a hashmap. After that, I stored all the hashmaps in an ArrayList. Now I am stuck on how to retrieve all the words from the hashmaps that are in the ArrayList!
private static long numOfWords = 0;
private String userInputString;
private static long wordCount(String data) {
long words = 0;
int index = 0;
boolean prevWhiteSpace = true;
while (index < data.length()) {
//Intialise character variable that will be checked.
char c = data.charAt(index++);
//Determine whether it is a space.
boolean currWhiteSpace = Character.isWhitespace(c);
//If previous is a space and character checked is not a space,
if (prevWhiteSpace && !currWhiteSpace) {
words++;
}
//Assign current character's determination of whether it is a spacing as previous.
prevWhiteSpace = currWhiteSpace;
}
return words;
} //
public static ArrayList StoreLoadedFiles()throws Exception{
final File f1 = new File ("C:/Users/Admin/Desktop/dataFiles/"); //specify the directory to load files
String data=""; //reset the words stored
ArrayList<HashMap> hmArr = new ArrayList<HashMap>(); //array of hashmap
for (final File fileEntry : f1.listFiles()) {
Scanner input = new Scanner(fileEntry); //load files
while (input.hasNext()) { //while there are still words in the document, continue to load all the words in a file
data += input.next();
input.useDelimiter("\t"); //similar to split function
} //while loop
String textWords = data.replaceAll("\\s+", " "); //remove all found whitespaces
HashMap<String, Integer> hm = new HashMap<String, Integer>(); //Creates a Hashmap that would be renewed when next document is loaded.
String[] words = textWords.split(" "); //store individual words into a String array
for (int j = 0; j < numOfWords; j++) {
int wordAppearCount = 0;
if (hm.containsKey(words[j].toLowerCase().replaceAll("\\W", ""))) { //replace non-word characters
wordAppearCount = hm.get(words[j].toLowerCase().replaceAll("\\W", "")); //remove non-word character and retrieve the index of the word
}
if (!words[j].toLowerCase().replaceAll("\\W", "").equals("")) {
//Words stored in hashmap are in lower case and have special characters removed.
hm.put(words[j].toLowerCase().replaceAll("\\W", ""), ++wordAppearCount);//index of word and string word stored in hashmap
}
}
hmArr.add(hm);//stores every single hashmap inside an ArrayList of hashmap
} //end of for loop
return hmArr; //return hashmap ArrayList
}
public static void LoadAllHashmapWords(ArrayList m){
for(int i=0;i<m.size();i++){
m.get(i); //stuck here!
}
Firstly, your logic won't work correctly. In the StoreLoadedFiles() method you iterate through the words with for (int j = 0; j < numOfWords; j++) {. The numOfWords field is initialized to zero and never updated, so this loop won't execute at all. You should use the length of the words array instead.
Having said that, to retrieve the values from a list of hashmaps, first iterate through the list and take the entry set of each hashmap. Map.Entry is the key/value pair that you store in the hashmap. When you invoke map.entrySet() it returns a java.util.Set<Map.Entry<Key, Value>>. A set is returned because the keys are unique.
So a complete program will look like this.
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map.Entry;
import java.util.Scanner;
public class FileWordCounter {
public static List<HashMap<String, Integer>> storeLoadedFiles() {
final File directory = new File("C:/Users/Admin/Desktop/dataFiles/");
List<HashMap<String, Integer>> listOfWordCountMap = new ArrayList<HashMap<String, Integer>>();
Scanner input = null;
StringBuilder data;
try {
for (final File fileEntry : directory.listFiles()) {
input = new Scanner(fileEntry);
input.useDelimiter("\t");
data = new StringBuilder();
while (input.hasNext()) {
data.append(input.next());
}
input.close();
String wordsInFile = data.toString().replaceAll("\\s+", " ");
HashMap<String, Integer> wordCountMap = new HashMap<String, Integer>();
for(String word : wordsInFile.split(" ")){
String strippedWord = word.toLowerCase().replaceAll("\\W", "");
int wordAppearCount = 0;
if(strippedWord.length() > 0){
if(wordCountMap.containsKey(strippedWord)){
wordAppearCount = wordCountMap.get(strippedWord);
}
wordCountMap.put(strippedWord, ++wordAppearCount);
}
}
listOfWordCountMap.add(wordCountMap);
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} finally {
if(input != null) {
input.close();
}
}
return listOfWordCountMap;
}
public static void loadAllHashmapWords(List<HashMap<String, Integer>> listOfWordCountMap) {
for(HashMap<String, Integer> wordCountMap : listOfWordCountMap){
for(Entry<String, Integer> wordCountEntry : wordCountMap.entrySet()){
System.out.println(wordCountEntry.getKey() + " - " + wordCountEntry.getValue());
}
}
}
public static void main(String[] args) {
List<HashMap<String, Integer>> listOfWordCountMap = storeLoadedFiles();
loadAllHashmapWords(listOfWordCountMap);
}
}
Since you are a beginner in Java programming, I would like to point out a few best practices that you could start using from the beginning.
1. Closing resources: In the loop that reads the files you open a Scanner with Scanner input = new Scanner(fileEntry); but you never close it. This leaks resources such as open file handles. You should always use a try-catch-finally block and close resources in the finally block.
2. Avoid unnecessary redundant calls: If an operation gives the same result on every iteration of a loop, move it outside the loop to avoid redundant calls. In your case, for example, setting the scanner delimiter with input.useDelimiter("\t"); is a one-time operation after the scanner is initialized, so you could move it outside the while loop.
3. Use StringBuilder instead of String: Repeated string manipulations such as concatenation should be done using a StringBuilder (or StringBuffer when you need synchronization) instead of += or +. String is immutable, meaning its value cannot be changed, so each concatenation creates a new String object. This results in a lot of unused instances in memory, whereas a StringBuilder is mutable and its contents can be changed in place.
4. Naming convention: The usual naming convention in Java is to start with a lower-case letter and capitalize the first letter of each following word. So it is standard practice to name a method storeLoadedFiles as opposed to StoreLoadedFiles. (This could be opinion based ;))
5. Give descriptive names: It is good practice to give descriptive names, as it helps later code maintenance. For example, wordCountMap is a better name than hm. Anyone who goes through your code in future will understand it better and faster with descriptive names. Again, opinion based.
6. Use generics as much as possible: This avoids explicit casts and lets the compiler check the types for you.
7. Avoid repetition: Similar to point 2, if an operation produces the same output and is needed multiple times, store the result in a variable and use the variable. In your case you were calling words[j].toLowerCase().replaceAll("\\W", "") multiple times. The result is the same every time, but it creates unnecessary instances and repetition. So you could assign it to a String and use that String elsewhere.
8. Try using the for-each loop wherever possible: This relieves you from taking care of indexing.
These are just suggestions. I tried to include most of them in my code, but I won't say it is perfect. Since you are a beginner, if you start applying these best practices now they will become ingrained. Happy coding.. :)
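On Java 7 or later, try-with-resources is an alternative to the try-catch-finally pattern recommended above. It is a different technique from the one the answer's program uses, shown here only as a sketch. The names ReadWholeFile and readAll are hypothetical.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

final class ReadWholeFile {
    // Reads every tab-delimited token of the file into one string.
    static String readAll(File file) throws FileNotFoundException {
        StringBuilder data = new StringBuilder();
        // The Scanner is closed automatically when the try block exits,
        // whether normally or because of an exception.
        try (Scanner input = new Scanner(file)) {
            input.useDelimiter("\t"); // set once, before the loop
            while (input.hasNext()) {
                data.append(input.next());
            }
        }
        return data.toString();
    }
}
```

This combines three of the points above: the resource is always closed, the delimiter is set once, and the text is accumulated in a StringBuilder.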
for (HashMap<String, Integer> map : m) {
for(Entry<String,Integer> e:map.entrySet()){
//your code here
}
}
Or, if you are using Java 8, you can use lambdas:
m.stream().forEach((map) -> {
map.entrySet().stream().forEach((e) -> {
//your code here
});
});
But first you have to change the method signature to public static void LoadAllHashmapWords(List<HashMap<String,Integer>> m); otherwise you would have to use a cast.
P.S. Are you sure your extraction method works? I tested it a bit and got a list of empty hashmaps every time.

separating unique values in an algorithm

I am decomposing a series of 90,000+ strings into a discrete list of the individual, non-duplicated pairs of words that are included in the strings with the rxcui id values associated with each string. I have developed a method which tries to accomplish this, but it is producing a lot of redundancy. Analysis of the data shows there are about 12,000 unique words in the 90,000+ source strings, after I clean and format the contents of the strings.
How can I change the code below so that it avoids creating the redundant rows in the destination 2D ArrayList (shown below the code)?
public static ArrayList<ArrayList<String>> getAllWords(String[] tempsArray){//int count = tempsArray.length;
int fieldslenlessthan2 = 0;//ArrayList<String> outputarr = new ArrayList<String>();
ArrayList<ArrayList<String>> twoDimArrayList= new ArrayList<ArrayList<String>>();
int idx = 0;
for (String s : tempsArray) {
String[] fields = s.split("\t");//System.out.println(" --- fields.length is: "+fields.length);
if(fields.length>1){
ArrayList<String> row = new ArrayList<String>();
System.out.println("fields[0] is: "+fields[0]);
String cleanedTerms = cleanTerms(fields[1]);
String[] words = cleanedTerms.split(" ");
for(int j=0;j<words.length;j++){
String word=words[j].trim();
word = word.toLowerCase();
if(isValidWord(word)){//outputarr.add(word);
System.out.println("words["+j+"] is: "+word);
row.add(word_id);//WORD_ID NEEDS TO BE CREATED BY SOME METHOD.
row.add(fields[0]);
row.add(word);
twoDimArrayList.add(row);
idx += 1;
}
}
}else{fieldslenlessthan2 += 1;}
}
System.out.println("........... fieldslenlessthan2 is: "+fieldslenlessthan2);
return twoDimArrayList;
}
The output of the above method currently looks like the following, with many rxcui values for some name values, and with many name values for some rxcui:
How do I change the code above so that the output is a list of unique pairs of name/rxcui values, summarizing all relevant data from the current output while removing only the redundancies?
If you just need a Collection of all words, use a HashSet. Sets are primarily used for contains logic. If you need to associate a value with your string, use a HashMap.
public HashSet<String> getUniqueWords(String[] stringArray) {
HashSet<String> uniqueWords = new HashSet<String>();
for (String str : stringArray) {
uniqueWords.add(str);
}
return uniqueWords;
}
This will give you a collection of all the unique Strings in your array. If you need an ID use a HashMap
String[] strList; // your String array
int idCounter = 0;
HashMap<String, Integer> stringIDMap = new HashMap<String, Integer>();
for (String str : strList) {
if (!stringIDMap.containsKey(str)) { // HashMap has containsKey, not contains
stringIDMap.put(str, idCounter); // the int is autoboxed to Integer
idCounter++;
}
}
This will provide you a HashMap with unique String keys and unique Integer values. To get an id for a String you do this:
stringIDMap.get("myString"); // returns the Integer ID associated with the String "myString"
UPDATE
Based on the question update from the OP. I recommend creating an object that holds the String value and the rxcui. You can then place these in a Set or HashMap using a similar implementation to the one provided above.
public MyObject(String str, int rxcui); // The constructor for your new object
MyObject mo1 = new MyObject("hello", 5);
Either
mySet.add(mo1);
will work or
myMap.put(mo1.getStr(), mo1.getRxcui());
What is the purpose of the unique word ID? Is the word itself not unique enough since you are not keeping duplicates?
A very basic way would be to keep a counter going as you are checking new words. For each word that doesn't already exist you could increase the counter and use the new value as the unique id.
Lastly, might I suggest you use a HashMap instead. It would allow you to both insert and retrieve words in O(1) time. I am not entirely sure what you are going for, but I think the HashMap might give you more range.
Edit2:
It would be something a little more along these lines. This should help you out.
public static Set<DataPair> getAllWords(String[] tempsArray) {
Set<DataPair> set = new HashSet<>();
for (String row : tempsArray) {
// PARSE YOUR STRING DATA
// the way you were doing it seemed fine but something like this
String[] rowArray = row.split(" ");
String word = rowArray[1];
int id = Integer.parseInt(rowArray[0]);
DataPair pair = new DataPair(word, id);
set.add(pair);
}
return set;
}
class DataPair {
private String word;
private int id;
public DataPair(String word, int id) {
this.word = word;
this.id = id;
}
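@Override
public boolean equals(Object o) {
if (o instanceof DataPair) {
return ((DataPair) o).word.equals(word) && ((DataPair) o).id == id;
}
return false;
}
// A HashSet only de-duplicates correctly when hashCode agrees with equals.
@Override
public int hashCode() {
return 31 * word.hashCode() + id;
}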
}

Using a string to write to an array

I am trying to tidy up my code. Originally I was using this method of writing to arrays, which is ridiculously long when I have to repeat it 20 times:
if (ant.getAntNumber() == 3)
{
numbers3.add(ant.getCol());
numbers3y.add(ant.getRow());
}
if (ant.getAntNumber() == 4)
{
numbers4.add(ant.getCol());
numbers4y.add(ant.getRow());
}
I attempted to use a for loop to do it, but I can't figure out how to add to the array using the string value, because Java treats it as a string rather than as the name of the array:
for (int j = 0; j<maxAnts; j++)
{
String str = "numbers" + j;
String str2 = "numbers" + j + "y";
//this part doesnt work
str.add(ant.getCol());
}
Any suggestions would be helpful
In Java, you cannot use the value of a String object to reference an actual variable name. Java will think you're attempting to call add on the String object, a method which doesn't exist, and that gives you the compiler error you're seeing.
To avoid the repetition, you need to add your Lists to two master lists that you can index.
In your question, you mention arrays, but you call add, so I'm assuming that you're really referring to Lists of some sort.
List<List<Integer>> numbers = new ArrayList<List<Integer>>(20);
List<List<Integer>> numbersy = new ArrayList<List<Integer>>(20);
// Add 20 ArrayList<Integer>s to each of the above lists in a loop here.
Then you can bounds-check ant.getAntNumber() and use it as an index into your master lists.
int antNumber = ant.getAntNumber();
// Make sure it's within range here.
numbers.get(antNumber).add(ant.getCol());
numbersy.get(antNumber).add(ant.getRow());
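Putting those two fragments together, a minimal sketch could look like the following. The wrapper class AntPositions and its record, colsOf and rowsOf methods are hypothetical names, and the bounds check simply rejects out-of-range ant numbers.

```java
import java.util.ArrayList;
import java.util.List;

final class AntPositions {
    private final List<List<Integer>> cols = new ArrayList<List<Integer>>();
    private final List<List<Integer>> rows = new ArrayList<List<Integer>>();

    AntPositions(int maxAnts) {
        // Pre-create one inner list per ant so get(antNumber) never fails in range.
        for (int i = 0; i < maxAnts; i++) {
            cols.add(new ArrayList<Integer>());
            rows.add(new ArrayList<Integer>());
        }
    }

    // Records a position; returns false if antNumber is out of range.
    boolean record(int antNumber, int col, int row) {
        if (antNumber < 0 || antNumber >= cols.size()) {
            return false;
        }
        cols.get(antNumber).add(col);
        rows.get(antNumber).add(row);
        return true;
    }

    List<Integer> colsOf(int antNumber) { return cols.get(antNumber); }
    List<Integer> rowsOf(int antNumber) { return rows.get(antNumber); }
}
```

With this, the twenty if blocks collapse into a single call such as positions.record(ant.getAntNumber(), ant.getCol(), ant.getRow()).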
How about this?
Ant[] aAnt = new Ant[20];
//Fill the ant-array
int[] aColumns = new int[aAnt.length];
int[] aRows = new int[aAnt.length];
for(int i = 0; i < aAnt.length; i++) {
aColumns[i] = aAnt[i].getCol();
aRows[i] = aAnt[i].getRow();
}
or with lists:
List<Integer> columnList = new ArrayList<Integer>(aAnt.length); // List is an interface and cannot be instantiated
List<Integer> rowList = new ArrayList<Integer>(aAnt.length);
for(Ant ant : aAnt) {
columnList.add(ant.getCol());
rowList.add(ant.getRow());
}
or with a col/row object:
class Coordinate {
public final int yCol;
public final int xRow;
public Coordinate(int y_col, int x_row) {
yCol = y_col;
xRow = x_row;
}
}
//use it with
List<Coordinate> coordinateList = new ArrayList<Coordinate>(aAnt.length);
for(Ant ant : aAnt) {
coordinateList.add(new Coordinate(ant.getCol(), ant.getRow()));
}
A straight-forward port of your code would be to use two Map<Integer, Integer> which store X and Y coordinates. From your code it seems like ant numbers are unique, i.e., we only have to store a single X and Y value per ant number. If you need to store multiple values per ant number, use a List<Integer> as value type of the Map instead.
Map<Integer, Integer> numbersX = new HashMap<Integer, Integer>();
Map<Integer, Integer> numbersY = new HashMap<Integer, Integer>();
for(Ant ant : ants) {
int number = ant.getAntNumber();
numbersX.put(number, ant.getCol());
numbersY.put(number, ant.getRow());
}
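For the multiple-values case mentioned above, a sketch with a List<Integer> as the map's value type might look like this. Only the column side is shown, and the class AntTrails with its recordCol and colsOf methods is a hypothetical illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AntTrails {
    private final Map<Integer, List<Integer>> colsByAnt =
            new HashMap<Integer, List<Integer>>();

    // Appends a column value, creating the ant's list on first use.
    void recordCol(int antNumber, int col) {
        List<Integer> cols = colsByAnt.get(antNumber);
        if (cols == null) {
            cols = new ArrayList<Integer>();
            colsByAnt.put(antNumber, cols);
        }
        cols.add(col);
    }

    // Returns the recorded columns, or an empty list for an unknown ant.
    List<Integer> colsOf(int antNumber) {
        List<Integer> cols = colsByAnt.get(antNumber);
        return cols == null ? new ArrayList<Integer>() : cols;
    }
}
```

Unlike the indexed master lists, a map does not need the number of ants to be known up front.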

Can I create an array of sets?

Here is what I am trying to do.
I am reading in a list of words with each having a level of complexity. Each line has a word followed by a comma and the level of the word. "watch, 2" for example. I wish to put all of the words of a given level into a set to ensure their uniqueness in that level. There are 5 levels of complexity, so ideally I'd like an array with 5 elements, each of which is a set.
I can then add words to each of the sets as I read them in. Later on, I wish to pull out a random word of a specified level.
I'm happy with everything except how to create an array of sets. I've read several other posts here that seem to agree that this can't be done exactly as I would hope, but I can't find a good work around. (No, I'm not willing to have 5 sets in a switch statement. Goes against the grain.)
Thanks.
You can use a map. Use the level as the key and, as the value, the set that contains the words for that level. When a random word is requested from a level, get the value (the set, in this case) using the level as the key, and pick a random element from it. This also scales if you increase the number of levels.
public static void main(String[] args) {
Map<Integer, Set<String>> levelSet = new HashMap<Integer, Set<String>>();
//Your code goes here to get the level and word
//
String word="";
int level=0;
addStringToLevel(levelSet,word,level);
}
private static void addStringToLevel(Map<Integer, Set<String>> levelSet,
String word, int level) {
if(levelSet.get(level) == null)
{
// this means this is the first string added for this level
// so create a container to hold the object
levelSet.put(level, new HashSet<String>());
}
Set<String> wordContainer = levelSet.get(level);
wordContainer.add(word);
}
private static String getStringFromLevel(Map<Integer, Set<String>> levelSet,
int level) {
if(levelSet.get(level) == null)
{
return null;
}
Set<String> wordContainer = levelSet.get(level);
return "";// return a random string from wordContainer
}
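The code above leaves the random pick unimplemented. One possible way to fill it in, offered as an assumption rather than the answerer's intent, is to copy the set into a list and index it randomly (RandomPick and randomElement are hypothetical names):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Set;

final class RandomPick {
    // Returns a uniformly chosen element, or null if the set is empty.
    static String randomElement(Set<String> words, Random random) {
        if (words.isEmpty()) {
            return null;
        }
        // A Set has no get(index), so copy it into a list first.
        List<String> copy = new ArrayList<String>(words);
        return copy.get(random.nextInt(copy.size()));
    }
}
```

The copy costs O(n) per pick. If picks are frequent, keeping a parallel List per level next to each Set avoids the repeated copying.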
If you are willing to use Guava, try SetMultimap. It will take care of everything for you.
SetMultimap<Integer, String> map = HashMultimap.create();
map.put(5, "value");
The collection will take care of creating the inner Set instances for you unlike the array or List solutions which require either pre-creating the Sets or checking that they exist.
Consider using a List instead of an array.
Doing so might make your life easier.
List<Set<String>> wordSetLevels = new ArrayList<Set<String>>();
// ...
for (int i = 0; i < 5; i++) {
wordSetLevels.add(new HashSet<String>());
}
wordSetLevels = Collections.unmodifiableList(wordSetLevels);
// ...
wordSetLevels.get(2).add("watch");
import java.util.HashSet;
import java.util.List;
import java.util.Set;
public class Main {
private Set<String>[] process(List<String> words) {
@SuppressWarnings("unchecked")
Set<String>[] arrayOfSets = new Set[5];
for(int i=0; i<arrayOfSets.length; i++) {
arrayOfSets[i] = new HashSet<String>();
}
for(String word: words) {
int index = getIndex(word);
String val = getValue(word);
arrayOfSets[index].add(val);
}
return arrayOfSets;
}
private int getIndex(String str) {
//TODO Implement
return 0;
}
private String getValue(String str) {
//TODO Implement
return "";
}
}

Get the string with the highest value in java (strings have the same 'base' name but different suffixes)

I have a list with some strings in it:
GS_456.java
GS_456_V1.java
GS_456_V2.java
GS_460.java
GS_460_V1.java
And it goes on. I want a list with the strings with the highest value:
GS_456_V2.java
GS_460_V1.java
.
.
.
I can only think of using lots of for statements, but isn't there a more practical way? I'd like to avoid using too many for statements, since I'm already using them a lot when I execute some queries.
EDIT: The strings with the V1, V2,.... are the names of recent classes created. When someone creates a new version of GS_456 for example, they'll do it and add its version at the end of the name.
So, GS_456_V2 is the most recent version of the GS_456 java class. And it goes on.
Thanks in advance.
You will want to process the file names in two steps.
Step 1: split the list into sublists, with one sublist per file name (ignoring suffix).
Here is an example that splits the list into a Map:
private static Map<String, List<String>> nameMap = new HashMap<String, List<String>>();
private static void splitEmUp(final List<String> names)
{
for (String current : names)
{
List<String> listaly;
// "GS_456_V2.java" splits into ["GS", "456", "V2", "java"]; index 1 is the base number.
String[] splitaly = current.split("_|\\.");
listaly = nameMap.get(splitaly[1]);
if (listaly == null)
{
listaly = new LinkedList<String>();
nameMap.put(splitaly[1], listaly);
}
listaly.add(current);
}
}
Step 2: find the highest version for each name. Here is an example:
private static List<String> findEmAll()
{
List<String> returnValue = new LinkedList<String>();
Set<String> keySet = nameMap.keySet();
for (String key : keySet)
{
List<String> listaly = nameMap.get(key);
String highValue = null;
if (listaly.size() == 1)
{
highValue = listaly.get(0);
}
else
{
int highVersion = 0;
for (String name : listaly)
{
String[] versions = name.split("_V|\\.");
if (versions.length == 3)
{
int versionNumber = Integer.parseInt(versions[1]);
if (versionNumber > highVersion)
{
highValue = name;
highVersion = versionNumber;
}
}
}
}
returnValue.add(highValue);
}
return returnValue;
}
I guess you don't want simply the lexicographic order (the solution would be obvious).
First, remove the ".java" part and split your string on the character "_".
int dotIndex = string.indexOf(".");
String[] parts = string.substring(0, dotIndex).split("_");
You are interested in parts[1] and parts[2]. The first is easy, it's just a number.
int fileNumber = Integer.parseInt(parts[1]);
The second one is always of the form "VX" with X being a number. But this part may not exist (if it's the base version of the file). In which case we can say that version is 0.
int versionNumber = parts.length < 3 ? 0 : Integer.parseInt(parts[2].substring(1));
Now you can compare based on these two numbers.
To make things simple, build a class FileIdentifier based on this:
class FileIdentifier {
int fileNumber;
int versionNumber;
}
Then write a function that creates a FileIdentifier from a file name, with logic based on what I explained earlier.
FileIdentifier getFileIdentifierFromFileName(String filename){ /* .... */ }
Then you make a comparator on String, in which you get the FileIdentifier for the two strings and compare upon FileIdentifier members.
Then, to get the string with "the highest value", you simply put all your strings in a list, and use Collections.sort, providing the comparator.
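As a rough sketch of the parsing logic described above, here is a variation that keeps the highest version per file number in a single pass with a map instead of sorting. It assumes every name has the form GS_<number>[_V<version>].java. The class LatestVersions and its methods are hypothetical names, and an int pair stands in for the FileIdentifier class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class LatestVersions {
    // Parses "GS_456_V2.java" into {456, 2}; a missing suffix counts as version 0.
    static int[] parse(String fileName) {
        String base = fileName.substring(0, fileName.indexOf('.'));
        String[] parts = base.split("_");
        int fileNumber = Integer.parseInt(parts[1]);
        int version = parts.length < 3 ? 0 : Integer.parseInt(parts[2].substring(1));
        return new int[] {fileNumber, version};
    }

    // Keeps, for each file number, the name with the highest version.
    static List<String> latest(List<String> names) {
        Map<Integer, String> best = new TreeMap<Integer, String>();
        for (String name : names) {
            int[] id = parse(name);
            String current = best.get(id[0]);
            if (current == null || parse(current)[1] < id[1]) {
                best.put(id[0], name);
            }
        }
        return new ArrayList<String>(best.values());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "GS_456.java", "GS_456_V1.java", "GS_456_V2.java",
                "GS_460.java", "GS_460_V1.java");
        System.out.println(latest(names)); // [GS_456_V2.java, GS_460_V1.java]
    }
}
```

Comparing the parsed integers rather than the raw strings matters once versions reach two digits, since "V10" sorts before "V2" lexicographically.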
