Getting rid of duplicates with arrayHash - java

I need to implement a hashing algorithm where collision handling is done via linear probing. I know that by definition the hash table won’t store duplicates. I am not allowed to pre-process the word list file to eliminate duplicates before loading the words into the hash table. My method arrayHashWithLinearProbing() is supposed to handle this part. Below is my class and some other information to help understand what I am doing. (Note: I am reading from a text file.) I need help so that the hash table will not store duplicates.
import java.io.*;
import java.util.*;
import java.math.*;
public class HashWithLinearProbing {
int tableSize = 423697;
BigInteger ts = new BigInteger(Integer.toString(tableSize));
Hash[] hashTable;
public HashWithLinearProbing() {
hashTable = new Hash[tableSize];
for (int i = 0; i < tableSize; i++) {
hashTable[i] = new Hash();
}
}
public int hashVal(String p) {
String s = "";
for (int i = 0; i < p.length(); i++) {
s = s + (int) p.charAt(i);
}
BigInteger bi = new BigInteger(s);
BigInteger k = bi.remainder(ts);
int j = k.intValue();
return j;
}
public void arrayHashWithLinearProbing(String s) {
// you will implement a hash table using the hashVal() method described
// above.
// Linear probing will use the "next" field to chain all the values that
// have collisions.
// this method will store String s at a linearly probed location
// starting from hashVal(s) in the table
// if s is not already in the table. Essentially, you will not have any
// duplicates.
int hashIndex = hashVal(s);
int previousI = -1;
int i = hashIndex;
// Find the next slot
do {
if (hashTable[i].val.isEmpty()) {
hashTable[i].val = s;
break;
}
Hash hash = hashTable[i];
if (hash.val.equals(s)) {
hash.val = s;
break;
}
previousI = i;
i = (i + 1) % tableSize;
hashTable[previousI].next = i;
} while (i != hashIndex);
}
Thank you so much for your help in advance
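For what it's worth, the core of a no-duplicates insert under linear probing is: walk the probe sequence from the home slot and stop at the first empty slot (store the word) or at an equal word (store nothing). Below is a minimal sketch under stated assumptions: the Hash class is not shown in the question, so a plain String[] with null for empty slots stands in for it, and String.hashCode() stands in for hashVal(). The names ProbingSet, add and size are illustrative only.

```java
// Sketch only: String[] replaces the Hash[] table and String.hashCode()
// replaces hashVal(); both substitutions are assumptions.
class ProbingSet {
    private final String[] table;
    private int count = 0;

    ProbingSet(int tableSize) {
        table = new String[tableSize];
    }

    // Returns true if s was stored, false if it was already present
    // or the table has no free slot left.
    boolean add(String s) {
        int start = Math.floorMod(s.hashCode(), table.length);
        int i = start;
        do {
            if (table[i] == null) {       // empty slot: s is not in the table yet
                table[i] = s;
                count++;
                return true;
            }
            if (table[i].equals(s)) {     // same word on the probe path: duplicate
                return false;
            }
            i = (i + 1) % table.length;   // linear probing: try the next slot
        } while (i != start);
        return false;                     // wrapped around: table is full
    }

    int size() {
        return count;
    }
}
```

Because a duplicate always hashes to the same home slot, it follows the same probe path as the original and meets it before reaching any empty slot, as long as nothing is ever deleted from the table.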

Related

How do I improve the runtime of my algorithm?

The aim is given a file, with the 1st line as the number of lines available, find how many pair of lines are permutations of each other. Example would be that AABA is a permutation of BAAA. The code is written in java. This is my current code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Arrays;
public class SpeedDemon {
public class Data{
byte[] dataValues;
byte duplicate=1;
int hashcode;
public Data(byte[] input) {
dataValues= new byte[128];
for (byte x : input) {
if (x==10){
break;
}
dataValues[x]++;
}
hashcode = Arrays.hashCode(dataValues);
}
public boolean equal(Data o){
return this.hashcode==o.hashcode&&Arrays.equals(o.dataValues, this.dataValues);
}
}
public int processData(String fileName){
try {
BufferedReader reader = new BufferedReader(new FileReader(fileName));
int size = Integer.parseInt(reader.readLine());
int arr_size = 2;
while (arr_size < size) {
arr_size *= 2;
}
Data[] map = new Data[arr_size];
int z = 0;
Data data;
int j;
for (int i = 0; i < size; i++) {
data = new Data(reader.readLine().getBytes());
j = data.hashcode;
j ^= (j >>> 16);
j &= (arr_size - 1);
while (true) {
if (map[j] == null) {
map[j] = data;
break;
} else {
if (map[j].equal(data)) {
z += map[j].duplicate++;
break;
} else {
j = j == arr_size - 1 ? 0 : j + 1;
}
}
}
}
return z;
}catch(Exception ex){ }
return 0;
}
public static void main(String[] args) {
System.out.println(new SpeedDemon().processData(args[0]));
}
}
I would like to know if there is any way to improve the time efficiency of the program. It is part of my class contest and some people have managed runtimes around 25% faster. I tried different array sizes and this seems to work the best.
Multiply arr_size by 4. You need a lot of free slots to make open addressing efficient, and depending on what size is, you may not have very many right now.
Specify a larger buffer size on your buffered reader to reduce the I/O count. 32768 would be reasonable.
Then work on efficiency in Data. Both the hashing and the comparison operations iterate through all 128 possible byte values, which is unnecessary.
Are you sure your code even gets the correct answer? It doesn't seem likely.
The easiest way to determine if two strings are permutations of each other is to sort the strings and compare them. With that in mind, an easier and faster way to code this up would be to use a Map. Something like this:
    Create a new Map where the key and value are both strings
    for each line of the file
        s = read string from file
        sortedString = sort(s) // sort characters in the string
        if (map.contains(sortedString))
            you found a duplicate
        else
            map.insert(sortedString, string) // the key is the sorted string
    end for
There are other ways to do this, but that's the easiest way I know of, and probably the fastest.
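The pseudocode above only detects that a permutation was seen before, while the question asks for the number of pairs. One way to adapt it, sketched here in Java, is to store a count per sorted key and add the number of earlier matches for each new line. The class and method names (PermutationPairs, countPairs) are made up for illustration, and the input is taken as a list of lines rather than a file.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PermutationPairs {
    // Counts pairs of lines that are permutations of each other.
    static long countPairs(List<String> lines) {
        Map<String, Integer> seen = new HashMap<>();
        long pairs = 0;
        for (String line : lines) {
            char[] chars = line.toCharArray();
            Arrays.sort(chars);                  // permutations share one sorted form
            String key = new String(chars);
            int earlier = seen.getOrDefault(key, 0);
            pairs += earlier;                    // pairs with every earlier match
            seen.put(key, earlier + 1);
        }
        return pairs;
    }
}
```

Whether this beats the hand-rolled open-addressing table in the question would have to be measured; it is mainly simpler to get right.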

Altering the value of k in kNN algorithm - Java

I have applied the kNN algorithm to classify handwritten digits. The digits are initially in 8*8 format and are flattened into a 1*64 vector.
As it stands, my code applies the kNN algorithm but only with k = 1. I'm not entirely sure how to alter the value of k; after attempting a couple of things I kept getting errors thrown. If anyone could help push me in the right direction it would be really appreciated. The training dataset can be found here and the validation set here.
ImageMatrix.java
import java.util.*;
public class ImageMatrix {
private int[] data;
private int classCode;
private int curData;
public ImageMatrix(int[] data, int classCode) {
assert data.length == 64; //expects exactly 64 values (8*8 image)
this.data = data;
this.classCode = classCode;
}
public String toString() {
return "Class Code: " + classCode + " Data :" + Arrays.toString(data) + "\n"; //outputs readable
}
public int[] getData() {
return data;
}
public int getClassCode() {
return classCode;
}
public int getCurData() {
return curData;
}
}
ImageMatrixDB.java
import java.util.*;
import java.io.*;
import java.util.ArrayList;
public class ImageMatrixDB implements Iterable<ImageMatrix> {
private List<ImageMatrix> list = new ArrayList<ImageMatrix>();
public ImageMatrixDB load(String f) throws IOException {
try (
FileReader fr = new FileReader(f);
BufferedReader br = new BufferedReader(fr)) {
String line = null;
while((line = br.readLine()) != null) {
int lastComma = line.lastIndexOf(',');
int classCode = Integer.parseInt(line.substring(1 + lastComma));
int[] data = Arrays.stream(line.substring(0, lastComma).split(","))
.mapToInt(Integer::parseInt)
.toArray();
ImageMatrix matrix = new ImageMatrix(data, classCode); // Classcode->100% when 0 -> 0% when 1 - 9..
list.add(matrix);
}
}
return this;
}
public void printResults(){ //output results
for(ImageMatrix matrix: list){
System.out.println(matrix);
}
}
public Iterator<ImageMatrix> iterator() {
return this.list.iterator();
}
/// kNN implementation ///
public static int distance(int[] a, int[] b) {
int sum = 0;
for(int i = 0; i < a.length; i++) {
sum += (a[i] - b[i]) * (a[i] - b[i]);
}
return (int)Math.sqrt(sum);
}
public static int classify(ImageMatrixDB trainingSet, int[] curData) {
int label = 0, bestDistance = Integer.MAX_VALUE;
for(ImageMatrix matrix: trainingSet) {
int dist = distance(matrix.getData(), curData);
if(dist < bestDistance) {
bestDistance = dist;
label = matrix.getClassCode();
}
}
return label;
}
public int size() {
return list.size(); //returns size of the list
}
public static void main(String[] argv) throws IOException {
ImageMatrixDB trainingSet = new ImageMatrixDB();
ImageMatrixDB validationSet = new ImageMatrixDB();
trainingSet.load("cw2DataSet1.csv");
validationSet.load("cw2DataSet2.csv");
int numCorrect = 0;
for(ImageMatrix matrix:validationSet) {
if(classify(trainingSet, matrix.getData()) == matrix.getClassCode()) numCorrect++;
} //285 correct
System.out.println("Accuracy: " + (double)numCorrect / validationSet.size() * 100 + "%");
System.out.println();
}
In the for loop of classify you are finding the single training example that is closest to a test point. You need to replace that with code that finds the k training points closest to the test data. Then call getClassCode for each of those k points and find the majority (i.e. the most frequent) class code among them. classify will then return that majority class code.
You may break ties (i.e. two or more class codes sharing the highest count) in any way that suits your needs.
I am really inexperienced in Java, but just by looking around the language reference, I came up with the implementation below.
public static int classify(ImageMatrixDB trainingSet, int[] curData, int k) {
int label = 0;
int[][] distances = new int[trainingSet.size()][2];
int i=0;
// Place distances in an array to be sorted
for(ImageMatrix matrix: trainingSet) {
distances[i][0] = distance(matrix.getData(), curData);
distances[i][1] = matrix.getClassCode();
i++;
}
Arrays.sort(distances, (int[] lhs, int[] rhs) -> lhs[0]-rhs[0]);
// Find frequencies of each class code
i = 0;
Map<Integer,Integer> majorityMap;
majorityMap = new HashMap<Integer,Integer>();
while(i < k && i < distances.length) { // never read past the training set
if( majorityMap.containsKey( distances[i][1] ) ) {
int currentValue = majorityMap.get(distances[i][1]);
majorityMap.put(distances[i][1], currentValue + 1);
}
else {
majorityMap.put(distances[i][1], 1);
}
++i;
}
// Find the class code with the highest frequency
int maxVal = -1;
for (Map.Entry<Integer, Integer> entry: majorityMap.entrySet()) {
int entryVal = entry.getValue();
if(entryVal > maxVal) {
maxVal = entryVal;
label = entry.getKey();
}
}
return label;
}
All you need to do is add k as a parameter. Keep in mind, however, that the code above does not handle ties in any particular way.
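If a deterministic tie rule is wanted, one option (an assumption on my part, not something the question asks for) is to let the class whose nearest member is closest win. The sketch below isolates just the voting step: it takes the class codes already sorted by distance, nearest first, and the names KnnVote and vote are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class KnnVote {
    // labelsByDistance: class codes of the training points, nearest first.
    // Returns the most frequent code among the first k; on a tie, the code
    // that appears earliest (i.e. has the nearest neighbour) wins.
    static int vote(int[] labelsByDistance, int k) {
        int limit = Math.min(k, labelsByDistance.length);
        Map<Integer, Integer> counts = new HashMap<>();
        for (int i = 0; i < limit; i++) {
            counts.merge(labelsByDistance[i], 1, Integer::sum);
        }
        int best = -1;        // returned only if there are no neighbours at all
        int bestCount = 0;
        for (int i = 0; i < limit; i++) {
            int c = counts.get(labelsByDistance[i]);
            if (c > bestCount) {   // strict '>' keeps the earlier label on ties
                bestCount = c;
                best = labelsByDistance[i];
            }
        }
        return best;
    }
}
```

With the sorted distances array from the answer above, the input would be the second column (distances[i][1]) read in order.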

java programming & finding the mode of an array

I have a task where I need to find the mode of an array, which means I am looking for the int that occurs most frequently. I have more or less finished that, but the task also says that if two modes have the same frequency, I should return the smallest int, e.g. {1,1,1,2,2,2} should give 1 (in my file, which uses that array, it gives 2).
public class theMode
{
public theMode()
{
int[] testingArray = new int[] {1,1,1,2,2,2,4};
int mode=findMode(testingArray);
System.out.println(mode);
}
public int findMode(int[] testingArray)
{
int modeWeAreLookingFor = 0;
int frequencyOfMode = 0;
for (int i = 0; i < testingArray.length; i++)
{
int currentIndexOfArray = testingArray[i];
int frequencyOfEachInArray = howMany(testingArray,currentIndexOfArray);
if (frequencyOfEachInArray > frequencyOfMode)
{
modeWeAreLookingFor = currentIndexOfArray;
frequencyOfMode = modeWeAreLookingFor;
}
}
return modeWeAreLookingFor;
}
public int howMany(int[] testingArray, int c)
{
int howManyOfThisInt=0;
for(int i=0; i < testingArray.length;i++)
{
if(testingArray[i]==c){
howManyOfThisInt++;
}
}
return howManyOfThisInt;
}
public static void main(String[] args)
{
new theMode();
}
}
As you can see, my algorithm returns the last mode found, if that is the right way to explain it.
I'd approach it differently. Using a map, you could use each unique number as the key and its count as the value. Step through the array and, for each number found, check the map to see if there is a key with that value. If one is found, increment its value by 1; otherwise create a new entry with the value 1.
Then you can check the value of each map entry to see which has the highest count. If the current key has a higher count than the previous key, then it is the "current" answer. But several keys may have the same count, so you need to store each 'winning' answer.
One way to approach this is to check each map entry and remove every entry whose count is less than the current highest count. What you will be left with is a map of all "highest counts". If your map has only one entry, then its key is the answer; otherwise you will need to compare the set of keys to determine the lowest.
Hint: You're updating modeWeAreLookingFor when you find an integer with a strictly higher frequency. What if you find an integer that has the same frequency as modeWeAreLookingFor?
Extra exercise: In the first iteration of the main loop, you compute the frequency of '1'. On the second iteration (and the third, and the fourth), you re-compute this value. You may save some time if you store the result of the first computation. This could be done with a Map.
Java code conventions state that method names and variable names should start with a lower case character. You would get better syntax coloring and easier-to-read code if you followed this convention.
This might work with a little modification:
http://www.toves.org/books/java/ch19-array/index.html#fig2
if ((count > maxCount) || (count == maxCount && nums[i] < maxValue)) {
maxValue = nums[i];
maxCount = count;
}
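To show where that condition would sit, here is a minimal sketch of a complete method built around it. Only the if statement comes from the snippet above; the surrounding loops, the class name LowestMode and the empty-array check are my own assumptions, not the linked figure's code.

```java
class LowestMode {
    // Returns the most frequent value; among equally frequent values, the smallest.
    static int findMode(int[] nums) {
        if (nums.length == 0) {
            throw new IllegalArgumentException("empty array has no mode");
        }
        int maxValue = nums[0];
        int maxCount = 0;
        for (int i = 0; i < nums.length; i++) {
            // Count how often nums[i] occurs in the whole array.
            int count = 0;
            for (int j = 0; j < nums.length; j++) {
                if (nums[j] == nums[i]) {
                    count++;
                }
            }
            // Higher frequency wins; on equal frequency the smaller value wins.
            if ((count > maxCount) || (count == maxCount && nums[i] < maxValue)) {
                maxValue = nums[i];
                maxCount = count;
            }
        }
        return maxValue;
    }
}
```

This is still quadratic, like the code in the question; the map-based approach avoids the repeated counting.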
Since it seems there is no other way, I used a HashMap after all. I am stuck once again on the logic of comparing frequencies while at the same time picking the lowest integer when frequencies are equal.
public void theMode()
{
for (Integer number: intAndFrequencyMap.keySet())
{
int key = number;
int value = intAndFrequencyMap.get(number);
System.out.println("the integer: " +key + " exists " + value + " time(s).");
int lowestIntegerOfArray = 0;
int highestFrequencyOfArray = 0;
int theInteger = 0;
int theModeWanted = 0;
if (value > highestFrequencyOfArray)
{
highestFrequencyOfArray = value;
theInteger = number;
}
else if (value == highestFrequencyOfArray)
{
if (number < theInteger)
{
number = theInteger;
}
else if (number > theInteger)
{
}
else if (number == theInteger)
{
number = theInteger;
}
}
}
}
Completed:
import java.util.Arrays;
public class TheMode
{
//Probably not the most effective solution, but works without hashmap
//or any sorting algorithms
public TheMode()
{
int[] testingArray = new int[] {2,3,5,4,2,3,3,3};
int mode = findMode(testingArray);
System.out.println(Arrays.toString(testingArray));
System.out.println("The lowest mode is: " + mode);
int[] test2 = new int[] {3,3,2,2,1};
int mode2=findMode(test2);
System.out.println(Arrays.toString(test2));
System.out.println("The lowest mode is: " +mode2);
int[] test3 = new int[] {4,4,5,5,1};
int mode3 = findMode(test3);
System.out.println(Arrays.toString(test3));
System.out.println("The lowest mode is: " +mode3);
}
public int findMode(int[] testingArray)
{
int modeWeAreLookingFor = 0;
int frequencyOfMode = 0;
for (int i = 0; i < testingArray.length; i++)
{
int currentIndexOfArray = testingArray[i];
int countIntegerInArray = howMany(testingArray, currentIndexOfArray);
if (countIntegerInArray == frequencyOfMode)
{
if (modeWeAreLookingFor > currentIndexOfArray)
{
modeWeAreLookingFor = currentIndexOfArray;
}
}
else if (countIntegerInArray > frequencyOfMode)
{
modeWeAreLookingFor = currentIndexOfArray;
frequencyOfMode = countIntegerInArray;
}
}
return modeWeAreLookingFor;
}
public int howMany(int[] testingArray, int c)
{
int howManyOfThisInt=0;
for(int i=0; i < testingArray.length;i++)
{
if(testingArray[i]==c){
howManyOfThisInt++;
}
}
return howManyOfThisInt;
}
public static void main(String[] args)
{
new TheMode();
}
}
Glad you managed to solve it. As you will now see, there is more than one way to approach a problem. Here's what I meant by using a map:
package util;
import java.util.HashMap;
import java.util.Map;
public class MathUtil {
public static void main(String[] args) {
MathUtil app = new MathUtil();
int[] numbers = {1, 1, 1, 2, 2, 2, 3, 4};
System.out.println(app.getMode(numbers));
}
public int getMode(int[] numbers) {
int mode = 0;
Map<Integer, Integer> numberMap = getFrequencyMap(numbers);
int highestCount = 0;
for (int number : numberMap.keySet()) {
int currentCount = numberMap.get(number);
if (currentCount > highestCount) {
highestCount = currentCount;
mode = number;
} else if (currentCount == highestCount && number < mode) {
mode = number;
}
}
return mode;
}
private Map<Integer,Integer> getFrequencyMap(int[] numbers){
Map<Integer, Integer> numberMap = new HashMap<Integer, Integer>();
for (int number : numbers) {
if (numberMap.containsKey(number)) {
int count = numberMap.get(number);
count++;
numberMap.put(number, count);
} else {
numberMap.put(number, 1);
}
}
return numberMap;
}
}

Implement reading of contact from phone book Using Hashtable in J2ME

I ran into a bind when I had to sort the data read from the phone's PIM. In doing this I lost the order that tied each contact name to its telephone number, because I used 2 separate vectors, as illustrated below.
Before sorting
Nna - +445535533
Ex - +373773737
Ab - +234575757
After sorting (which is wrong)
Ab - +445535533
Ex - +373773737
Nna - +234575757
This is undesired: sorting breaks the index-to-index correspondence between the two vectors, so a selected name (in a multiple-choice list box) gets the wrong number.
Alternatively, I used a hashtable, with the intention of using the names as keys and the numbers as values.
But this pairing means duplicate names cannot be used as keys. So I reversed it, i.e. used the phone numbers as keys instead.
I don't want to sound like a cry baby, so I'll stop here and show you the code in the hope that you will understand it.
MY QUESTION
1. Is there a better way/algorithm to implement this?
2. How do I implement getSelectedItems() in such a way that it grabs the numbers for the selected indexes of a MULTIPLE CHOICE LIST from a hashtable?
import java.util.Enumeration;
import java.util.Vector;
import java.util.Hashtable;
import javax.microedition.lcdui.List;
import javax.microedition.pim.Contact;
import javax.microedition.pim.ContactList;
import javax.microedition.pim.PIM;
import javax.microedition.pim.PIMException;
/**
*
* @author nnanna
*/
public class LoadContacts implements Operation {
private boolean available;
private Vector telNames = new Vector();
Vector telNumbers = new Vector();
Hashtable Listcontact = new Hashtable();
private String[] names;
public Vector getTelNames() {
return telNames;
}
public Hashtable getListcontact() {
return Listcontact;
}
public void execute() {
try {
// go through all the lists
String[] allContactLists = PIM.getInstance().listPIMLists(PIM.CONTACT_LIST);
if (allContactLists.length != 0) {
for (int i = 0; i < allContactLists.length; i++) {
System.out.println(allContactLists[i]);
System.out.println(allContactLists.length);
loadNames(allContactLists[i]);
System.out.println("Execute()");
}
} else {
available = false;
}
} catch (PIMException e) {
available = false;
} catch (SecurityException e) {
available = false;
}
}
private void loadNames(String name) throws PIMException, SecurityException {
ContactList contactList = null;
try {
contactList = (ContactList) PIM.getInstance().openPIMList(PIM.CONTACT_LIST, PIM.READ_ONLY, name);
// First check that the fields we are interested in are supported(MODULARIZE)
if (contactList.isSupportedField(Contact.FORMATTED_NAME) && contactList.isSupportedField(Contact.TEL)) {
Enumeration items = contactList.items();
Hashtable temp = new Hashtable();
while (items.hasMoreElements()) {
Contact contact = (Contact) items.nextElement();
int telCount = contact.countValues(Contact.TEL);
int nameCount = contact.countValues(Contact.FORMATTED_NAME);
if (telCount > 0 && nameCount > 0) {
String contactName = contact.getString(Contact.FORMATTED_NAME, 0);
// go through all the phone availableContacts
for (int i = 0; i < telCount; i++) {
System.out.println("Read Telno");
int telAttributes = contact.getAttributes(Contact.TEL, i);
String telNumber = contact.getString(Contact.TEL, i);
Listcontact.put(telNumber, contactName);
temp.put(contactName, telNumber);
}
names = getSortedList();
// Listcontact = temp;
System.out.println(temp + "-------");
System.out.println(Listcontact + "*******");
shortenName(contactName, 20);
}
available = true;
}
} else {
available = false;
}
} finally {
// always close it
if (contactList != null) {
contactList.close();
}
}
}
private void shortenName(String name, int length) {
if (name.length() > length) {
name = name.substring(0, 17) + "...";
}
}
public Vector getSelectedItems(List lbx) {
boolean[] arrSel = new boolean[lbx.size()];
Vector selectedNumbers = new Vector();
int selected = lbx.getSelectedFlags(arrSel);
String selectedString;
String result = "";
for (int i = 0; i < arrSel.length; i++) {
if (arrSel[i]) {
selectedString = lbx.getString(lbx.getSelectedFlags(arrSel));
result = result + " " + i;
System.out.println(Listcontact.get(selectedString));
// System.out.println(telNumbers.elementAt(i));
}
}
return selectedNumbers;
}
private String[] sortResults(String data[]) {
RecordSorter sorter = new RecordSorter();
boolean changed = true;
while (changed) {
changed = false;
for (int j = 0; j < (data.length - 1); j++) {
String a = data[j], b = data[j + 1];
if (a != null && b != null) {
int order = sorter.compare(a.getBytes(), b.getBytes());
if (order == RecordSorter.FOLLOWS) {
changed = true;
data[j] = b;
data[j + 1] = a;
}
}
}
}
return data;
}
public String[] getNames() {
return names;
}
Vector elements = new Vector();
private String[] getValueArray(Hashtable value) {
System.out.println(Listcontact + " c");
Enumeration e = value.elements();
while (e.hasMoreElements()) {
elements.addElement(e.nextElement());
}
String[] elementsArray = new String[elements.size()];
elements.copyInto(elementsArray);
elements.removeAllElements();
System.out.println(elementsArray + " k");
return elementsArray;
}
public void getDuplicates(Vector realValue) {
Vector duplicate = new Vector();
Enumeration e = realValue.elements();
for (int i = 0; e.hasMoreElements(); i++) {
if (duplicate.isEmpty() || !duplicate.elementAt(i).equals(e.nextElement())) {
break;
} else {
duplicate.addElement(e.nextElement());
}
}
}
public String[] getSortedList() {
return sortResults(getValueArray(Listcontact));
}
}
Let me restate your requirement: you want a method that reads the contacts from the native phonebook and then sorts them alphabetically by name.
The approach is as follows.
Replace the vectors and hashtables in your code with a single vector, say contactListVector, containing elements of type ContactItem (this class is explained below). Fundamentally, a contact's name and number(s) are linked together in a ContactItem, so you do not have to worry about their mappings, which reduces the use of redundant data structures.
class ContactItem {
private String name;
private String tnumber; //this can also be a data structure
//for storing multiple numbers
ContactItem( String name, String tnumber) {
this.name = name;
this.tnumber = tnumber;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getTnumber() {
return tnumber;
}
public void setTnumber(String tnumber) {
this.tnumber = tnumber;
}
}
You can reuse the sorting algorithm on contactListVector by comparing the member variable ContactItem.name of the vector elements. You can also apply different sorts on the numbers and/or the names. There are also plenty of Java ME libraries with better sorting algorithms already implemented; use them if need be.
I would recommend performing the sort once on the contactListVector elements at the end of your loadNames(...) method, maybe in the finally block, guarded by a boolean flag. The current sorting call on every iteration over the items enumeration is expensive and time-consuming.
You can also serialize and deserialize ContactItem objects to persist your contact list.
Let me know if you need detailed explanation.
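To make the idea concrete, here is a minimal sketch of sorting such a vector by name. It uses a raw Vector with casts and a hand-written insertion sort so that it stays close to what CLDC-era Java offers, but it has only been reasoned about as plain Java. The nested ContactItem is a cut-down stand-in for the class above, and ContactSort and sortByName are made-up names.

```java
import java.util.Vector;

class ContactSort {
    // Cut-down stand-in for the ContactItem class above: one name, one number.
    static class ContactItem {
        final String name;
        final String tnumber;

        ContactItem(String name, String tnumber) {
            this.name = name;
            this.tnumber = tnumber;
        }
    }

    // In-place insertion sort by name. Name and number travel together
    // because whole ContactItem objects are moved, not separate fields.
    static void sortByName(Vector contacts) {
        for (int i = 1; i < contacts.size(); i++) {
            ContactItem current = (ContactItem) contacts.elementAt(i);
            int j = i - 1;
            while (j >= 0
                    && ((ContactItem) contacts.elementAt(j)).name.compareTo(current.name) > 0) {
                contacts.setElementAt(contacts.elementAt(j), j + 1);  // shift right
                j--;
            }
            contacts.setElementAt(current, j + 1);
        }
    }
}
```

After sorting, the item at list index i supplies both the text to display and the number to dial, so the selected index from the list box maps straight back to the right number.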
What about inserting the contact names and numbers into a RecordStore, so that you can later sort them by creating a class which implements RecordComparator?
This statement in your code makes no sense:
selectedString = lbx.getString(lbx.getSelectedFlags(arrSel))
Per the lcdui List API documentation, the above will return the string located at the index equal to the number of selected elements. Why would you need that?
If you need to output selected text for debugging purposes, use lbx.getString(i) instead.
To implement getSelectedItems() in such a way that it grabs the selected indexes of a MULTIPLE CHOICE LIST, do roughly as follows:
public Vector getSelectedItems(List lbx) {
boolean[] arrSel = new boolean[lbx.size()];
Vector selectedNumbers = new Vector();
int selected = lbx.getSelectedFlags(arrSel);
System.out.println("selected: [" + selected + "] elements in list");
String selectedString;
String result = "";
for (int i = 0; i < arrSel.length; i++) {
if (arrSel[i]) {
// here, i is the selected index
selectedNumbers.addElement(new Integer(i)); // add i to result
selectedString = lbx.getString(i); // already declared above, so only assign here
System.out.println("selected [" + selectedString
+ "] text at index: [" + i + "]");
}
}
return selectedNumbers;
}
As for sorting, just drop the Hashtable and use a Vector of properly designed objects instead, as suggested in another answer, with your own sorting algorithm or one from a third-party J2ME library.
I would suggest having a Contact class with a name and a Vector of numbers. Then, instead of sorting the array of names, sort the array of contacts.

Stack overflow error for large inputs in Java

I'm writing a Java program that searches for and outputs cycles in a graph. I am using an adjacency list for storing my graph, with the lists stored as LinkedLists. My program takes an input formatted with the first line as the number of nodes in the graph and each subsequent line 2 nodes that form an edge e.g.:
3
1 2
2 3
3 1
My problem is that when the inputs get very large (the large graph I am using has 10k nodes and I don't know how many edges, the file is 23mb of just edges) I am getting a java.lang.StackOverflowError, but I don't get any errors with small inputs. I'm wondering if it would be better to use another data structure to form my adjacency lists or if there is some method I could use to avoid this error, as I'd rather not just have to change a setting on my local installation of Java (because I have to be sure this will run on other computers that I can't control the settings on as much). Below is my code, the Vertex class and then my main class. Thanks for any help you can give!
Vertex.java:
package algorithms311;
import java.util.*;
public class Vertex implements Comparable {
public int id;
public LinkedList adjVert = new LinkedList();
public String color = "white";
public int dTime;
public int fTime;
public int prev;
public Vertex(int idnum) {
id = idnum;
}
public int getId() {
return id;
}
public int compareTo(Object obj) {
Vertex vert = (Vertex) obj;
return id-vert.getId();
}
@Override public String toString(){
return "Vertex # " + id;
}
public void setColor(String newColor) {
color = newColor;
}
public String getColor() {
return color;
}
public void setDTime(int d) {
dTime = d;
}
public void setFTime(int f) {
fTime = f;
}
public int getDTime() {
return dTime;
}
public int getFTime() {
return fTime;
}
public void setPrev(int v) {
prev = v;
}
public int getPrev() {
return prev;
}
public LinkedList getAdjList() {
return adjVert;
}
public void addAdj(int a) { //adds a vertex id to this vertex's adj list
adjVert.add(a);
}
}
CS311.java:
package algorithms311;
import java.util.*;
import java.io.*;
public class CS311 {
public static final String GRAPH= "largegraph1";
public static int time = 0;
public static LinkedList[] DFS(Vertex[] v) {
LinkedList[] l = new LinkedList[2];
l[0] = new LinkedList();
l[1] = new LinkedList(); //initialize the array with blank lists, otherwise we get a nullpointerexception
for(int i = 0; i < v.length; i++) {
v[i].setColor("white");
v[i].setPrev(-1);
}
time = 0;
for(int i = 0; i < v.length; i++) {
if(v[i].getColor().equals("white")) {
l = DFSVisit(v, i, l);
}
}
return l;
}
public static LinkedList[] DFSVisit(Vertex[] v, int i, LinkedList[] l) { //params are a vertex of nodes and the node id you want to DFS from
LinkedList[] VOandBE = new LinkedList[2]; //two lists: visit orders and back edges
VOandBE[0] = l[0]; // l[0] is visit Order, a linked list of ints
VOandBE[1] = l[1]; // l[1] is back Edges, a linked list of arrays[2] of ints
VOandBE[0].add(v[i].getId());
v[i].setColor("gray"); //color[vertex i] <- GRAY
time++; //time <- time+1
v[i].setDTime(time); //d[vertex i] <- time
LinkedList adjList = v[i].getAdjList(); // adjList for the current vertex
for(int j = 0; j < adjList.size(); j++) { //for each v in adj[vertex i]
if(v[(Integer)adjList.get(j)].getColor().equals("gray") && v[i].getPrev() != v[(Integer)adjList.get(j)].getId()) { // if color[v] = gray and Predecessor[u] != v do
int[] edge = new int[2]; //pair of vertices
edge[0] = i; //from u
edge[1] = (Integer)adjList.get(j); //to v
VOandBE[1].add(edge);
}
if(v[(Integer)adjList.get(j)].getColor().equals("white")) { //do if color[v] = WHITE
v[(Integer)adjList.get(j)].setPrev(i); //then "pi"[v] <- vertex i
DFSVisit(v, (Integer)adjList.get(j), VOandBE); //DFS-Visit(v)
}
}
VOandBE[0].add(v[i].getId());
v[i].setColor("black");
time++;
v[i].setFTime(time);
return VOandBE;
}
public static void main(String[] args) {
try {
// --Read First Line of Input File
// --Find Number of Vertices
FileReader file1 = new FileReader("W:\\Documents\\NetBeansProjects\\algorithms311\\src\\algorithms311\\" + GRAPH);
BufferedReader bReaderNumEdges = new BufferedReader(file1);
String numVertS = bReaderNumEdges.readLine();
int numVert = Integer.parseInt(numVertS);
System.out.println(numVert + " vertices");
// --Make Vertices
Vertex vertex[] = new Vertex[numVert];
for(int k = 0; k <= numVert - 1; k++) {
vertex[k] = new Vertex(k);
}
// --Adj Lists
FileReader file2 = new FileReader("W:\\Documents\\NetBeansProjects\\algorithms311\\src\\algorithms311\\" + GRAPH);
BufferedReader bReaderEdges = new BufferedReader(file2);
bReaderEdges.readLine(); //skip first line, that's how many vertices there are
String edge;
while((edge = bReaderEdges.readLine()) != null) {
StringTokenizer ST = new StringTokenizer(edge);
int vArr[] = new int[2];
for(int j = 0; ST.hasMoreTokens(); j++) {
vArr[j] = Integer.parseInt(ST.nextToken());
}
vertex[vArr[0]-1].addAdj(vArr[1]-1);
vertex[vArr[1]-1].addAdj(vArr[0]-1);
}
for(int i = 0; i < vertex.length; i++) {
System.out.println(vertex[i] + ", adj nodes: " + vertex[i].getAdjList());
}
LinkedList[] l = new LinkedList[2];
l = DFS(vertex);
System.out.println("");
System.out.println("Visited Nodes: " + l[0]);
System.out.println("");
System.out.print("Back Edges: ");
for(int i = 0; i < l[1].size(); i++) {
int[] q = (int[])(l[1].get(i));
System.out.println("[" + q[0] + "," + q[1] + "] ");
}
for(int i = 0; i < l[1].size(); i++) { //iterate through the list of back edges
int[] q = (int[])(l[1].get(i)); // q = pair of vertices that make up a back edge
int u = q[0]; // edge (u,v)
int v = q[1];
LinkedList cycle = new LinkedList();
if(l[0].indexOf(u) < l[0].indexOf(v)) { //check if u is before v
for(int z = l[0].indexOf(u); z <= l[0].indexOf(v); z++) { //if it is, look for u first; from u to v
cycle.add(l[0].get(z));
}
}
else if(l[0].indexOf(v) < l[0].indexOf(u)) {
for(int z = l[0].indexOf(v); z <= l[0].indexOf(u); z++) { //otherwise v comes first; go from v to u
cycle.add(l[0].get(z));
}
}
System.out.println("");
System.out.println("Cycle detected! : " + cycle);
if((cycle.size() & 1) != 0) {
System.out.println("Cycle is odd, graph is not 2-colorable!");
}
else {
System.out.println("Cycle is even, we're okay!");
}
}
}
catch (IOException e) {
System.out.println("AHHHH");
e.printStackTrace();
}
}
}
The issue is most likely the recursive calls in DFSVisit. If you don't want to go with the 'easy' answer of increasing Java's stack size when you call the JVM, you may want to consider rewriting DFSVisit to use an iterative algorithm instead of recursive. While Depth First Search is more easily defined in a recursive manner, there are iterative approaches to the algorithm that can be used.
For example: this blog post
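As a rough illustration of the iterative idea, here is a sketch that keeps its own stack of vertices on the heap and remembers, per vertex, which neighbour to try next. It only produces the first-visit order; the back-edge and timestamp bookkeeping from DFSVisit is left out. The adjacency representation (List<List<Integer>>) and the names IterativeDfs and visitOrder are assumptions for the example, not taken from the question's code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class IterativeDfs {
    // Returns vertices in the order a recursive DFS from 'start' would first visit them.
    static List<Integer> visitOrder(List<List<Integer>> adj, int start) {
        boolean[] visited = new boolean[adj.size()];
        int[] next = new int[adj.size()];   // index of the next neighbour to try, per vertex
        List<Integer> order = new ArrayList<>();
        Deque<Integer> stack = new ArrayDeque<>();

        visited[start] = true;
        order.add(start);
        stack.push(start);
        while (!stack.isEmpty()) {
            int u = stack.peek();
            if (next[u] < adj.get(u).size()) {
                int w = adj.get(u).get(next[u]++);
                if (!visited[w]) {          // corresponds to the recursive call on a white vertex
                    visited[w] = true;
                    order.add(w);
                    stack.push(w);
                }
            } else {
                stack.pop();                // all neighbours done: the recursive call would return here
            }
        }
        return order;
    }
}
```

The depth of this search is limited by heap memory rather than by the thread's call stack. It assumes the inner lists give fast indexed access (e.g. ArrayList); with LinkedList, get(j) is linear in j.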
The stack is a region in memory that is used for storing execution context and passing parameters. Every time your code invokes a method, a little bit of stack is used, and the stack pointer is increased to point to the next available location. When the method returns, the stack pointer is decreased and the portion of the stack is freed up.
If an application uses recursion heavily, the stack quickly becomes a bottleneck, because if there is no limit to the recursion depth, there is no limit to the amount of stack needed. So you have two options: increase the Java stack (-Xss JVM parameter, and this will only help until you hit the new limit) or change your algorithm so that the recursion depth is not as deep.
I am not sure if you were looking for a generic answer, but from a brief glance at your code it appears that your problem is recursion.
If you're sure your algorithm is correct and the depth of recursive calls you're making isn't accidental, then solutions without changing your algorithm are:
add to the JVM command line e.g. -Xss128m to set a 128 MB stack size (not a good solution in multi-threaded programs as it sets the default stack size for every thread not just the particular thread running your task);
run your task in its own thread, which you can initialise with a stack size specific to just that thread (and set the stack size within the program itself)-- see my example in the discussion of fixing StackOverflowError, but essentially the stack size is a parameter to the Thread() constructor;
don't use recursive calls at all-- instead, mimic the recursive calls using an explicit Stack or Queue object (this arguably gives you a bit more control).
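For the second option in the list above, the four-argument Thread constructor takes a requested stack size in bytes. A minimal sketch follows; the depth method is just a deliberately deep recursion standing in for DFSVisit, and BigStackRunner and runWithStack are made-up names. Note that the Thread documentation describes the stack size as a suggestion that some platforms may ignore, so this is not guaranteed to help everywhere.

```java
class BigStackRunner {
    // Deliberately deep recursion, standing in for a recursive DFSVisit.
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }

    // Runs the recursion on a worker thread created with the requested stack size (bytes).
    static int runWithStack(final int n, long stackBytes) {
        final int[] result = new int[1];
        Thread worker = new Thread(null, new Runnable() {
            public void run() {
                result[0] = depth(n);
            }
        }, "dfs-worker", stackBytes);
        worker.start();
        try {
            worker.join();   // wait for the worker; join also makes its result visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return result[0];
    }
}
```

A call such as BigStackRunner.runWithStack(100000, 256L * 1024 * 1024) asks for a 256 MB stack for that one thread only, without changing the JVM-wide -Xss default.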
