Graph data structure

Let's say I have MyClass { private LargeMatrix mtrx; hashCode(){...} }.
JGraphT (and maybe all graph data structures) seems to use a hash table to map the vertices. Will that affect speed when I use MyClass as the vertex type instead of the Strings l1, l2, l3?
What are the pros and cons in that case? Should I override hashCode() (and leave the matrix out of the hash)? Is there a graph that uses references instead of a hash table?
My code was:
package ann;
import org.jgrapht.DirectedGraph;
import org.jgrapht.graph.DefaultEdge;
import org.jgrapht.graph.SimpleDirectedGraph;
/**
* @author marmoush
*
*/
public class Network
{
DirectedGraph<String, DefaultEdge> diGraph;
String l1="hello1";
String l2="hello1";
String l3="hello3";
/**
*
*/
public Network()
{
diGraph = new SimpleDirectedGraph<String, DefaultEdge>(DefaultEdge.class);
diGraph.addVertex(l1);
diGraph.addVertex(l2);
diGraph.addVertex(l3);
diGraph.addEdge(l1, l2);
System.out.println(diGraph.containsEdge(l1,l2));
// TODO Auto-generated constructor stub
}
}
Exception in thread "main" java.lang.IllegalArgumentException: loops not allowed
at org.jgrapht.graph.AbstractBaseGraph.addEdge(Unknown Source)
at ann.Network.<init>(Network.java:28)
at test.TestNetwork.main(TestNetwork.java:9)
Because (I think) l1.hashCode()==l2.hashCode()
EDIT:
The matrices might be all zeros or all ones at times, and they change over time, so I would have to come up with something else that differentiates between those objects, and that seems like a stupid solution. Why can't the graph just select vertices by their position in a vector or something?
Should I reinvent the wheel with a graph that uses Vectors instead of hash tables, or is there a workaround?

I would create a Vertex class and implement equals() and hashCode() appropriately. The speed impact is small; hashing can even be faster than with Strings if the id of the vertex is numeric.
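To make that suggestion concrete, here is a minimal sketch. The class name MatrixVertex, the numeric id field, and the double[][] payload standing in for LargeMatrix are all assumptions for illustration, not part of JGraphT or of the question's code. The point it shows is that equality and hashing depend only on the id, so the matrix can change without affecting lookups in any hash-based structure.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical vertex wrapper: identity comes from a numeric id,
// not from the (large, mutable) payload it carries.
final class MatrixVertex {
    private final int id;   // assumed to be unique per vertex
    double[][] matrix;      // stand-in for the LargeMatrix payload; may change freely

    MatrixVertex(int id, double[][] matrix) {
        this.id = id;
        this.matrix = matrix;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MatrixVertex && ((MatrixVertex) o).id == id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id); // cheap; never touches the matrix
    }
}

class MatrixVertexDemo {
    public static void main(String[] args) {
        Set<MatrixVertex> vertices = new HashSet<>();
        MatrixVertex a = new MatrixVertex(1, new double[2][2]);
        MatrixVertex b = new MatrixVertex(2, new double[2][2]); // same contents, different id
        vertices.add(a);
        vertices.add(b);
        a.matrix[0][0] = 1.0; // mutating the payload does not break lookups
        System.out.println(vertices.size() + " " + vertices.contains(a)); // 2 true
    }
}
```

If no id is available, leaving equals() and hashCode() at their java.lang.Object defaults gives reference identity, which may be what "a graph that uses references" is really asking for.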

Well, it does make sense:
your object l1 is the same as l2. Even if it were stored at a different location in memory, for the graph it is the same vertex.
Is there a reason why you need two identical vertices in the graph? Maybe there is a workaround.

String literals are interned in Java, so l1 and l2 are guaranteed to refer to the exact same String object. It's a nice feature of the language, and it speeds up string processing a lot, but it will trip you up in situations like this.
That's why you're getting the loop exception: the edge from l1 to l2 starts and ends at the same vertex. I suspect that you want an undirected graph structure here.
For reference, see the Java Language Specification, section 3.10.5:
Literal strings within the same class (§8) in the same package (§7) represent references to the same String object (§4.3.1).
Literal strings within different classes in the same package represent references to the same String object.
Literal strings within different classes in different packages likewise represent references to the same String object.
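A small sketch of what this means in practice, using only the standard library. The helper distinctKeys is a hypothetical name introduced here; it counts how many distinct keys a hash-based collection sees. Note that even a separately allocated String with the same contents is equal under equals(), so it would presumably still collide as a vertex in any hash-based graph.

```java
import java.util.Arrays;
import java.util.HashSet;

class StringIdentityDemo {
    // Hypothetical helper: how many distinct keys a hash-based collection would see.
    static int distinctKeys(String... labels) {
        return new HashSet<>(Arrays.asList(labels)).size();
    }

    public static void main(String[] args) {
        String l1 = "hello1";
        String l2 = "hello1";
        String l3 = new String("hello1"); // a distinct object with the same contents

        System.out.println(l1 == l2);      // true: both literals refer to one interned object
        System.out.println(l1 == l3);      // false: different objects
        System.out.println(l1.equals(l3)); // true: equal contents, so the same hash key
        System.out.println(distinctKeys(l1, l2, l3, "hello3")); // 2
    }
}
```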

The problem is that this kind of graph doesn't allow loops. You have to switch to a graph type based on AbstractBaseGraph, where you can set the variable loopAllowed to true, or you could try to change this variable in your SimpleDirectedGraph.
Problem: I wasn't able to change the variable in SimpleDirectedGraph, but you can use other graph types that allow you to do it.
I hope this helps.

Related

How to store an object of variable type in a field in Java?

I have created an object in Java, called "Edge". This object is a directed edge for a graph that stores a reference to the node at its start and the node at its end.
These nodes are currently a "Vertex" object, however I would like to also store the origin of the dual edge in the same object, so that it would return a "Face" object instead. Currently I only support returning a Vertex, as shown in the code below.
public class Edge{
private Vertex org; //this is the line that I want to be able to store a
//face given certain conditions (the index being 1 or 3)
private Face left; //this is a line where I want to store a face normally
//but store a Vertex if this is a dual edge
private int index; //a number from 0-3, where 1 and 3 are in the dual graph
public Vertex Org(){
return org;
}
}
I was wondering if there was a way of defining the function Org() and the field org in such a way that it could be either a Face or a Vertex. I was also wondering if there was a way of using generic types that could become a "Vertex" or a "Face" depending on the index parameter. An example of what I tried is below.
public class Edge<T>{
public T org;
private T Org(){
return org;
}
}
However, this does not seem a very elegant solution and it only works for getting the origin and not the left Face/Vertex.
I was wondering if there was a way of storing a field that can be one of two possible object types, or another simple way of getting around this problem.

You don't want to return totally different types of objects from the same method, because what would you do with them after you returned them?
Suppose you did:
Object faceOrVertex = edge.org(); // returns face or vertex
So now you have to decide what to do with your face or vertex. You will need to write:
if (faceOrVertex instanceof Face) {
// cast to Face and do face stuff
} else {
// cast to Vertex and do vertex stuff
}
You might as well call two different methods returning known types.
Generics won't help you here. They don't remove the need for different types.
If your Face and Vertex classes have common features and you want to treat them in a common way, the solution is to declare an interface with the common methods and have the Face and Vertex classes implement it. Without knowing exactly what you want to do with the result of the org method, though, it is impossible to recommend something specific.
I suggest you first implement the solution with two different methods and then perhaps look later for common blocks of code that you could refactor into shared logic.
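As a rough sketch of the two-method approach: the factory names primal and dual, the accessors orgVertex() and orgFace(), and the empty Vertex and Face classes are placeholders invented for this example. Each accessor returns a known type and fails loudly if it is called on the wrong kind of edge.

```java
// Hypothetical minimal types; the real Vertex and Face would carry geometry.
class Vertex { }
class Face { }

class Edge {
    private final Vertex orgVertex; // set for primal edges (index 0 or 2)
    private final Face orgFace;     // set for dual edges (index 1 or 3)
    private final int index;        // 0-3, where 1 and 3 are in the dual graph

    private Edge(Vertex orgVertex, Face orgFace, int index) {
        this.orgVertex = orgVertex;
        this.orgFace = orgFace;
        this.index = index;
    }

    static Edge primal(Vertex org, int index) {
        if (index != 0 && index != 2) {
            throw new IllegalArgumentException("primal edges use index 0 or 2");
        }
        return new Edge(org, null, index);
    }

    static Edge dual(Face org, int index) {
        if (index != 1 && index != 3) {
            throw new IllegalArgumentException("dual edges use index 1 or 3");
        }
        return new Edge(null, org, index);
    }

    boolean isDual() {
        return index % 2 == 1;
    }

    Vertex orgVertex() {
        if (isDual()) throw new IllegalStateException("dual edge: call orgFace()");
        return orgVertex;
    }

    Face orgFace() {
        if (!isDual()) throw new IllegalStateException("primal edge: call orgVertex()");
        return orgFace;
    }
}
```

A caller checks isDual() once and then works with a statically typed result, so no instanceof test or cast is needed.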

Your Vertex and Face are two different types, so a single method is never going to work unless you have an interface that both Vertex and Face implement. Say you have defined an interface GraphElement. Now your class would become something like:
public interface GraphElement {
// operations
}
class Edge<V extends GraphElement, F extends GraphElement> {
V vertex;
F face;
int index;
GraphElement org() {
// processing code
// index 1 or 3 marks a dual edge, whose origin is a face
return index % 2 == 1 ? face : vertex;
}
}

Storing 15,000 items in Java

I have a document with 15,000 items. Each item contains 6 variables (strings and integers). I have to copy all of these into some sort of two-dimensional array. What is the best way to do it?
Here are my ideas so far:
1. Make a GIANT 2D array or array list the same way you make any other array.
Pros: Simple. Cons: Messy (I would create a class just for this), a huge amount of code, if I make a mistake it will be impossible to find where it is, and all variables would have to be strings, even the ints, which will make my job harder down the road.
2. Make a new class with a super that takes in all the variables I need.
Create each item as a new instance of this class.
Add all of the instances to a 2D array or array list.
Pros: Simple, less messy, easier to find a mistake, not all the variables need to be strings, which makes it much easier later when I don't have to convert string to int, and a little less typing for me. Cons: Slower? Will instances make my array compile slower? And will they make the overall array slow when I'm searching for items in it?
These ideas don't seem all too great :( and before I start the three-week, five-hours-a-day process of adding these items I would like to find the best way so I won't have to do it again... Suggestions on my current ideas or any new ideas?
Data example:
0: 100, west, sports, 10.89, MA, united
*not actual data

Your second option seems good. You can create a class containing all the fields of an item and create an array of that class.
You may use the following:
1. Read the document using buffered reader, so that memory issues will not occur.
2. Create a class containing your items.
3. Create a List of type you need and store the elements into it.
Let me know in case you face further problems.
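The three steps above might look roughly like this. The field names in Item are guesses, since the question only shows the sample line "100, west, sports, 10.89, MA, united", and the comma-separated layout without the leading row number is also an assumption. The second sample line in main is invented test data. In real use the BufferedReader would wrap a FileReader instead of a StringReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical field names; only the six-column shape comes from the question.
class Item {
    final int code;
    final String region;
    final String category;
    final double price;
    final String state;
    final String label;

    Item(int code, String region, String category, double price, String state, String label) {
        this.code = code;
        this.region = region;
        this.category = category;
        this.price = price;
        this.state = state;
        this.label = label;
    }

    // Parses one comma-separated line of exactly six fields.
    static Item parse(String line) {
        String[] p = line.split(",");
        if (p.length != 6) {
            throw new IllegalArgumentException("expected 6 fields: " + line);
        }
        return new Item(Integer.parseInt(p[0].trim()), p[1].trim(), p[2].trim(),
                Double.parseDouble(p[3].trim()), p[4].trim(), p[5].trim());
    }
}

class ItemLoader {
    // Reads line by line, so the whole document is never held as one string.
    static List<Item> load(BufferedReader in) throws IOException {
        List<Item> items = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            if (!line.trim().isEmpty()) {
                items.add(Item.parse(line));
            }
        }
        return items;
    }

    public static void main(String[] args) throws IOException {
        String sample = "100, west, sports, 10.89, MA, united\n"
                + "101, east, music, 5.50, NY, acme\n"; // second line is made-up data
        List<Item> items = load(new BufferedReader(new StringReader(sample)));
        System.out.println(items.size() + " items, first price " + items.get(0).price);
    }
}
```

With a loader like this there is no need to type 15,000 entries by hand; ints stay ints and a malformed line is reported with its content.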

If you already have the document with the 15,000 * 6 items, in my experience you would be better served by writing a program that parses it with regular expressions and outputs the contents of the Java array in the format you want. With such a parsing program in place, it will also be very easy to change the format of the 15,000 lines if you want to generate it differently.
As to the final format, I would have an ArrayList of your bean. From your text so far, you don't necessarily need a super that takes in the variables, unless you need subtypes that are differentiated.
You'll probably run out of static space in a single class. So what I do is break up a big class like that into a file with a bunch of nested classes that each hold a 64K (or smaller) part of the data as static final arrays, and then I merge them together in the main class in that file.
I have this in a class of many names to fix:
class FixName{
static String[][] testStrings;
static int add(String[][] aTestStrings, int lastIndex){
for(int i=0; i<aTestStrings.length; ++i) {
testStrings[++lastIndex]=aTestStrings[i];
}
return lastIndex;
}
static {
testStrings = new String[
FixName1.testStrings.length
+FixName2.testStrings.length
+FixName3.testStrings.length
+FixName4.testStrings.length
/**/ ][];
int lastIndex=-1;
lastIndex=add(FixName1.testStrings,lastIndex);
lastIndex=add(FixName2.testStrings,lastIndex);
lastIndex=add(FixName3.testStrings,lastIndex);
lastIndex=add(FixName4.testStrings,lastIndex);
/**/ }
}
class FixName1 {
static String[][] testStrings = {
{"key1","name1","other1"},
{"key2","name2","other2"},
//...
{"keyN","nameN","otherN"}
};
}

Create a wrapper (Item) if you have not already (your question does not state this clearly).
If the number of elements is fixed, i.e. 15,000, use an array; otherwise use a LinkedList (write your own linked list or use the Collections framework).
If there are other operations that you need to support on this collection of items, such as further inserts or (in particular) searches, use a balanced binary search tree.
From my understanding of the question, I would say a linked list is the better option.

If the items have a unique property (name, id, row number or any other unique identifier), I recommend using a HashMap with a wrapper around the item. If you are going to do any kind of lookup on your data (find the item with id x and do operation y), this is the fastest option and is also very clean; it just requires a wrapper, and you can use a data structure that is already implemented.
If you are not doing any lookups and need to process the items en masse in no specific order, I would recommend an ArrayList; it is very optimized, as it is the most commonly used collection in Java. You would still need the wrapper to keep things clean, and a list is far cleaner than an array at almost no extra cost.
There is little point in making your own collection, as your needs are not extremely specific. Just use one that is already implemented and never worry about your code breaking; if it does, it is Oracle's fault ;)
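For the lookup case, a minimal sketch could look like the following. Row and RowIndex are hypothetical names, and the int id is an assumed unique key; the question does not say which of its six columns, if any, is unique.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper; only the id matters for this sketch.
class Row {
    final int id;
    final String name;

    Row(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

class RowIndex {
    private final Map<Integer, Row> byId = new HashMap<>();

    void add(Row row) {
        byId.put(row.id, row); // a later row with the same id replaces the earlier one
    }

    // Returns the row with the given id, or null if there is none.
    Row find(int id) {
        return byId.get(id);
    }

    int size() {
        return byId.size();
    }
}
```

Lookups by id take expected constant time regardless of whether the map holds 15 or 15,000 rows.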

How can two Java list elements access each other?

The root of the problem for me is that Java does not allow references.
The problem can be summarized succinctly. Imagine you have a List of Blob objects:
class Blob {
public int xpos;
public int ypos;
public int mass;
public boolean dead;
private List<Object> giganticData;
public void blobMerge(Blob aBlob) {
. . .
if (. . .) {
this.dead = true;
} else {
aBlob.dead = true;
}
}
}
If two blobs are close enough, they should be merged, meaning one of the two blobs being compared should take on the attributes of the other (in this case, adding the mass and merging the giganticData sets) and the other should be marked for deletion from the list.
Setting aside the problem of how to optimally identify adjacent blobs, a stackoverflow question in its own right, how do you keep the blobMerge() logic in the Blob class? In C or C++ this would be straightforward, as you could just pass one Blob a pointer to the other and the "host" could do anything it likes to the "guest".
However, blobMerge() as implemented above in Java will operate on a copy of the "guest" Blob, which has two problems. 1) There is no need to incur the heavy cost of copying giganticData, and 2) the original copy of the "guest" Blob will remain unaffected in the containing list.
I can only see two ways to do this:
1) Pass the copies in, doing everything twice. In other words, Blob A hosts Blob B and Blob B hosts Blob A. You end up with the right answer, but have done way more work than necessary.
2) Put the blobMerge() logic in the Class that contains the containing List. However, this approach scales very poorly when you start subclassing Blob (BlueBlob, RedBlob, GreenBlob, etc.) such that the merge logic is different for every permutation. You end up with most of the subclass-specific code in the generic container that holds the list.
I've seen something about adding References to Java with a library, but the idea that you have to use a library to use a Reference put me off that idea.

Why would it operate on a copy? Java passes references to objects. And references are very much like C++ pointers.

Um... a reference is passed not a copy of the entire object. The original object will be modified and no data is actually moved around.
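A short demonstration, trimmed down from the question's Blob class. The merge rule used here (the heavier blob survives) is an assumption, since the question elides the actual condition. The point is that blobMerge() receives a reference to the very object stored in the list, so its changes are visible through the list and nothing is copied.

```java
import java.util.ArrayList;
import java.util.List;

class Blob {
    int mass;
    boolean dead;
    final List<Object> giganticData = new ArrayList<>();

    Blob(int mass) {
        this.mass = mass;
    }

    // Assumed rule: the heavier blob survives and absorbs the other.
    void blobMerge(Blob guest) {
        Blob survivor = this.mass >= guest.mass ? this : guest;
        Blob absorbed = survivor == this ? guest : this;
        survivor.mass += absorbed.mass;
        survivor.giganticData.addAll(absorbed.giganticData); // copies references, not the data
        absorbed.dead = true;
    }
}

class BlobDemo {
    public static void main(String[] args) {
        List<Blob> blobs = new ArrayList<>();
        blobs.add(new Blob(5));
        blobs.add(new Blob(3));

        blobs.get(0).blobMerge(blobs.get(1)); // passes a reference to the list's own element

        System.out.println(blobs.get(0).mass + " " + blobs.get(1).dead); // 8 true
        blobs.removeIf(b -> b.dead); // sweep the blobs marked for deletion
        System.out.println(blobs.size()); // 1
    }
}
```

Subclasses such as BlueBlob or RedBlob could override blobMerge() and the merge logic would stay inside the Blob hierarchy.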

Help matching fields between two classes

I'm not too experienced with Java yet, and I'm hoping someone can steer me in the right direction because right now I feel like I'm just beating my head against a wall...
The first class is called MeasuredParams, and it's got 40+ numeric fields (height, weight, waistSize, wristSize - some int, but mostly double). The second class is a statistical classifier called Classifier. It's been trained on a subset of the MeasuredParams fields. The names of the fields that the Classifier has been trained on is stored, in order, in an array called reqdFields.
What I need to do is load a new array, toClassify, with the values stored in the fields from MeasuredParams that match the field list (including order) found in reqdFields. I can make any changes necessary to the MeasuredParams class, but I'm stuck with Classifier as it is.
My brute-force approach was to get rid of the fields in MeasuredParams and use an ArrayList instead, and store the field names in an Enum object to act as an index pointer. Then loop through the reqdFields list, one element at a time, and find the matching name in the Enum object to find the correct position in the ArrayList. Load the value stored at that position into toClassify, and then continue on to the next element in reqdFields.
I'm not sure how exactly I would search through the Enum object - it would be a lot easier if the field names were stored in a second arrayList. But then the index positions between the two would have to stay matched, and I'm back to using an Enum. I think. I've been running around in circles all afternoon, and I keep thinking there must be an easier way of doing it. I'm just stuck right now and can't see past what I've started.
Any help would be GREATLY appreciated. Thanks so much!
Michael

You're probably better off using a Map rather than a List, you can use the enum as the key and get the values out.
Map<YourEnumType,ValueType> map = new HashMap<YourEnumType,ValueType>();

@Tom's recommendation to use a Map is the preferred approach. Here's a trivial example that constructs such a Map for use by a static lookup() method.
private enum Season {
WINTER, SPRING, SUMMER, FALL;
private static Map<String, Season> map = new HashMap<String, Season>();
static {
for (Season s : Season.values()) {
map.put(s.name(), s);
}
}
public static Season lookup(String name) {
return map.get(name);
}
}
Note that every enum type has two implicitly declared static methods:
public static E[] values();
public static E valueOf(String name);
The values() method returns an array that is handy for constructing the Map. Alternatively, the array may be searched directly. The methods are implicit; they will appear in the javadoc of your enum when it is generated.
Addendum: As suggested by @Bert F, an EnumMap may be advantageous. See Effective Java, Second Edition, Item 33: Use EnumMap instead of ordinal indexing, for a compelling example of using EnumMap to associate enums.
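Applied to the original problem, an EnumMap-based MeasuredParams might be sketched as follows. The enum Param, its four constants, and the set() and toClassify() methods are assumptions for illustration; the constants are deliberately spelled like the field names so that the built-in valueOf() can translate the strings in reqdFields.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical subset of the 40+ measured fields; constant names match the
// strings the classifier lists in reqdFields.
enum Param { height, weight, waistSize, wristSize }

class MeasuredParams {
    private final Map<Param, Double> values = new EnumMap<>(Param.class);

    void set(Param p, double v) {
        values.put(p, v);
    }

    // Builds the classifier input in exactly the order given by reqdFields.
    double[] toClassify(String[] reqdFields) {
        double[] out = new double[reqdFields.length];
        for (int i = 0; i < reqdFields.length; i++) {
            Param p = Param.valueOf(reqdFields[i]); // IllegalArgumentException on an unknown name
            Double v = values.get(p);
            if (v == null) {
                throw new IllegalStateException("no value for " + p);
            }
            out[i] = v;
        }
        return out;
    }
}
```

Usage would be along the lines of params.set(Param.height, 180.0) followed by params.toClassify(reqdFields); no index bookkeeping between two lists is required.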

is there a performance hit when using enum.values() vs. String arrays?

I'm using enumerations to replace String constants in my java app (JRE 1.5).
Is there a performance hit when I treat the enum as a static array of names in a method that is called constantly (e.g. when rendering the UI)?
My code looks a bit like this:
public String getValue(int col) {
return ColumnValues.values()[col].toString();
}
Clarifications:
I'm concerned with a hidden cost related to enumerating values() repeatedly (e.g. inside paint() methods).
I can now see that all my scenarios include some int => enum conversion - which is not Java's way.
What is the actual price of extracting the values() array? Is it even an issue?
Android developers
Read Simon Langhoff's answer below, which was pointed out earlier by Geeks On Hugs in the accepted answer's comments: Enum.values() must do a defensive copy.

For enums, in order to maintain immutability, the backing array is cloned every time you call the values() method. This means that it will have a performance impact. How much depends on your specific scenario.
I have been monitoring my own Android app and found that this simple call used 13.4% of CPU time in my specific case!
In order to avoid cloning the values array, I decided to simply cache the values in a private field and then loop through those values whenever needed:
private final static Protocol[] values = Protocol.values();
After this small optimisation my method call only took a negligible 0.0% of CPU time.
In my use case this was a welcome optimisation. However, it is important to note that this approach trades away the immutability of your enum's values: who knows what people might put into your values array once you give them a reference to it!?
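The cloning is easy to observe. In this sketch the Protocol constants and the fromOrdinal helper are made up for the example; the cached array is kept private so that callers cannot modify it.

```java
enum Protocol { HTTP, FTP, SSH } // hypothetical constants

class ValuesCloneDemo {
    // Cloned once at class initialization and never exposed.
    private static final Protocol[] CACHED = Protocol.values();

    static Protocol fromOrdinal(int i) {
        return CACHED[i]; // no allocation per call
    }

    public static void main(String[] args) {
        Protocol[] a = Protocol.values();
        Protocol[] b = Protocol.values();
        System.out.println(a == b); // false: each call returns a fresh array

        a[0] = null; // mutating one copy ...
        System.out.println(Protocol.values()[0]); // ... does not affect the enum: HTTP
        System.out.println(fromOrdinal(2)); // SSH
    }
}
```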

Enum.values() gives you a reference to an array, and iterating over an array of enums costs the same as iterating over an array of strings. Meanwhile, comparing enum values to other enum values can actually be faster than comparing strings to strings.
Meanwhile, if you're worried about the cost of invoking the values() method versus already having a reference to the array, don't worry. Method invocation in Java is (now) blazingly fast, and any time it actually matters to performance, the method invocation will be inlined by the compiler anyway.
So, seriously, don't worry about it. Concentrate on code readability instead, and use Enum so that the compiler will catch it if you ever try to use a constant value that your code wasn't expecting to handle.
If you're curious about why enum comparisons might be faster than string comparisons, here are the details:
It depends on whether the strings have been interned or not. For Enum objects, there is always only one instance of each enum value in the system, and so each call to Enum.equals() can be done very quickly, just as if you were using the == operator instead of the equals() method. In fact, with Enum objects, it's safe to use == instead of equals(), whereas that's not safe to do with strings.
For strings, if the strings have been interned, then the comparison is just as fast as with an Enum. However, if the strings have not been interned, then the String.equals() method actually needs to walk the list of characters in both strings until either one of the strings ends or it discovers a character that is different between the two strings.
But again, this likely doesn't matter, even in Swing rendering code that must execute quickly. :-)
@Ben Lings points out that Enum.values() must do a defensive copy, since arrays are mutable and it's possible you could replace a value in the array that is returned by Enum.values(). This means that you do have to consider the cost of that defensive copy. However, copying a single contiguous array is generally a fast operation, assuming that it is implemented "under the hood" using some kind of memory-copy call, rather than naively iterating over the elements in the array. So, I don't think that changes the final answer here.

As a rule of thumb: before thinking about optimizing, do you have any evidence that this code could slow down your application?
Now, the facts.
Enums are, for a large part, syntactic sugar scattered across the compilation process. As a consequence, the values() method defined for an enum class returns a static collection (that is to say, one loaded at class initialization) whose performance can be considered roughly equivalent to that of an array.

If you're concerned about performance, then measure.
From the code, I wouldn't expect any surprises, but 90% of all performance guesswork is wrong. If you want to be safe, consider moving the enums up into the calling code (i.e. public String getValue(ColumnValues value) {return value.toString();}).

use this:
private enum ModelObject { NODE, SCENE, INSTANCE, URL_TO_FILE, URL_TO_MODEL,
ANIMATION_INTERPOLATION, ANIMATION_EVENT, ANIMATION_CLIP, SAMPLER, IMAGE_EMPTY,
BATCH, COMMAND, SHADER, PARAM, SKIN }
private static final ModelObject int2ModelObject[] = ModelObject.values();

If you're iterating through your enum values just to look for a specific value, you can statically map the enum values to integers. This pushes the performance impact on class load, and makes it easy/low impact to get specific enum values based on a mapped parameter.
import java.util.HashMap;
import java.util.Map;
public enum ExampleEnum {
value1(1),
value2(2),
valueUndefined(Integer.MAX_VALUE);
private final int enumValue;
// Built once at class load; maps each int code to its constant.
private static final Map<Integer, ExampleEnum> enumMap = new HashMap<Integer, ExampleEnum>();
ExampleEnum(int value){
enumValue = value;
}
static {
for (ExampleEnum exampleEnum: ExampleEnum.values()) {
enumMap.put(exampleEnum.enumValue, exampleEnum);
}
}
public static ExampleEnum getExampleEnum(int value) {
// Map has containsKey(), not contains(); unknown codes fall back to valueUndefined
return enumMap.containsKey(value) ? enumMap.get(value) : valueUndefined;
}
}

I think yes. And it is more convenient to use Constants.
