how to get the real value of data that has been hashed? - java

How can I get the real value of data that has been hashed? Is it possible to still recover the original value once you have the hash code?
Or is there any code that can reverse the output?
String ida = new String(txtID.getText().toString());
int idb = ida.hashCode();
codeD.setText("result: " + ida );
I already get the hash code of txtID (the value the user entered), but now I want to get back the real value that was hashed without using ida.

Short answer: no.
The long answer:
A hash is meant to be a quick, one-way calculation to roughly identify some item. In Java, hash codes are usually used when putting something into a Map. The hash code helps tell one object from another when it is used as a key in a HashMap. It is not meant to store the data, only to be different enough to make collisions rare. It is quite possible for two different objects to have the same hash.
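To make the collision point concrete, here is a small sketch (the class and method names are only for illustration). The strings "Aa" and "BB" are different, yet String.hashCode() gives the same number for both, so the number alone cannot tell you which string it came from:

```java
final class HashCollisionDemo {
    // True when two strings share a hash code, whether or not they are equal.
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());        // 2112
        System.out.println("BB".hashCode());        // 2112
        System.out.println("Aa".equals("BB"));      // false
        System.out.println(sameHash("Aa", "BB"));   // true
    }
}
```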

As CodeChimp explained, what you are trying to do is not hashing, but encryption/decryption.
This example will help you do it: http://www.example-code.com/java/aes_dataStream.asp
If you are like me and enjoy reinventing the wheel, this could be fun to implement: http://en.wikipedia.org/wiki/Affine_cipher
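For comparison with hashing, here is a rough sketch of a reversible round trip using the standard javax.crypto API. It is not taken from the linked example; the class name, the choice of AES in GCM mode, and the key and IV handling are assumptions made for illustration. In a real program the key has to be stored somewhere safe and the IV must never be reused with the same key:

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class AesRoundTrip {
    private static final int IV_BYTES = 12;   // 96-bit IV, a common size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    // Generates a fresh random 128-bit AES key.
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Generates a random IV; use a new one for every message.
    static byte[] newIv() {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    static byte[] encrypt(SecretKey key, byte[] iv, String plainText) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            return cipher.doFinal(plainText.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Unlike a hash, this recovers the original text, given the same key and IV.
    static String decrypt(SecretKey key, byte[] iv, byte[] cipherText) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            return new String(cipher.doFinal(cipherText), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] iv = newIv();
        byte[] encrypted = encrypt(key, iv, "user-42");
        System.out.println(decrypt(key, iv, encrypted));
    }
}
```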

Related

Efficient data structure that checks for existence of String

I am writing a program which will add a growing number or unique strings to a data structure. Once this is done, I later need to constantly check for existence of the string in it.
If I were to use an ArrayList I believe checking for the existence of some specified string would iterate through all items until a matching string is found (or reach the end and return false).
However, with a HashMap I know that in constant time I can simply use the key as a String and return any non-null object, making this operation faster. However, I am not keen on filling a HashMap where the value is completely arbitrary. Is there a readily available data structure that uses hash functions, but doesn't require a value to be placed?
If I were to use an ArrayList I believe checking for the existence of some specified string would iterate through all items until a matching string is found
Correct, checking a list for an item is linear in the number of entries of the list.
However, I am not keen on filling a HashMap where the value is completely arbitrary
You don't have to: Java provides a HashSet<T> class, which is very much like a HashMap without the value part.
You can put all your strings there, and then check for presence or absence of other strings in constant time:
Set<String> knownStrings = new HashSet<String>();
... // Fill the set with strings
if (knownStrings.contains(myString)) {
...
}
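A complete version of the same idea might look like the sketch below. The wrapper class and the sample strings are made up for the example; only HashSet and its add/contains methods come from the JDK:

```java
import java.util.HashSet;
import java.util.Set;

final class KnownStrings {
    private final Set<String> known = new HashSet<>();

    // Returns false if the string was already present.
    boolean add(String s) {
        return known.add(s);
    }

    // Expected constant-time membership test.
    boolean contains(String s) {
        return known.contains(s);
    }

    public static void main(String[] args) {
        KnownStrings strings = new KnownStrings();
        strings.add("alpha");
        strings.add("beta");
        System.out.println(strings.contains("alpha")); // true
        System.out.println(strings.contains("gamma")); // false
    }
}
```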
It depends on many factors, including the number of strings you have to feed into that data structure (do you know the number in advance, or have a basic idea?), and what you expect the hit/miss ratio to be.
A very efficient data structure to use is a trie or a radix tree; they are basically made for that. For an explanation of how they work, see the wikipedia entry (a followup to the radix tree definition is in this page). There are Java implementations (one of them is here; however I have a fixed set of strings to inject, which is why I use a builder).
If your number of strings is really huge and you don't expect a minimal miss ratio then you might also consider using a Bloom filter; the catch is that it is probabilistic: it can report false positives, but it gives very quick and reliable answers to "not there". Here also, there are implementations in Java (Guava has an implementation for instance).
Otherwise, well, a HashSet...
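To show what "probabilistic" means for a Bloom filter, here is a deliberately tiny sketch built on java.util.BitSet. It is not Guava's implementation; the class name, the two hash functions and the sizes are assumptions chosen for brevity. A false answer from mightContain is always right; a true answer can occasionally be wrong:

```java
import java.util.BitSet;

final class TinyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    TinyBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // A second, independent-ish hash; never zero, so the step between probes is not zero.
    private static int secondHash(String s) {
        int h = 7;
        for (int k = 0; k < s.length(); k++) {
            h = h * 131 + s.charAt(k);
        }
        return h | 1;
    }

    // Derives the i-th bit position from the two base hashes (double hashing).
    private int index(String s, int i) {
        return Math.floorMod(s.hashCode() + i * secondHash(s), size);
    }

    void add(String s) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(s, i));
        }
    }

    // false means "definitely not added"; true means "probably added".
    boolean mightContain(String s) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(s, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        TinyBloomFilter filter = new TinyBloomFilter(1024, 3);
        filter.add("alice");
        System.out.println(filter.mightContain("alice"));
    }
}
```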
A HashSet is probably the right answer, but if you choose (for simplicity, eg) to search a list it's probably more efficient to concatenate your words into a String with separators:
String wordList = "$word1$word2$word3$word4$...";
Then create a search argument with your word between the separators:
String searchArg = "$" + searchWord + "$";
Then search with, say, contains:
boolean wordFound = wordList.contains(searchArg);
You can maybe make this a tiny bit more efficient by using StringBuilder to build the searchArg.
As others mentioned, HashSet is the way to go. But if the size is going to be large and you are fine with false positives (for example when checking whether a username exists), you can use a Bloom filter (a probabilistic data structure) as well.

Best practice for using array as the key of memoization in Java

I am doing some algorithm problems in Java, and from time to time a problem needs memoization to optimize speed. Often, the key is an array. What I usually use is
HashMap<ArrayList<Integer>, Integer> mem;
The main reason to use ArrayList<Integer> instead of int[] here is that the hashCode() of a primitive array is calculated based on the reference, whereas for ArrayList<Integer> the actual contents are hashed and compared, which is the desired behavior.
However, it is not very efficient and code can be pretty lengthy as well. So I am wondering if there is any best practice for this kind of memoization in Java? Thanks.
UPDATE: As many have pointed out in the comments, it is a very bad idea to use mutable objects as the key of a HashMap, and I totally agree.
And I am going to clarify the question a little bit more: when I use this type of memoization, I will NOT change the ArrayList<Integer> once it is inserted to the map. Normally the array represents some status, and I need to cache the corresponding value for that status in case it is visited again.
So please do not focus on how bad it is to use a mutable object as the key to a HashMap. Do suggest some better way to do this kind of memoization please.
UPDATE2: So at last I choose the Arrays.toString() approach since I am doing algorithm problems on TopCoder/Codeforces, and it is just dirty and fast to code.
However, I do think HashMap is the more reasonable and readable way to do this.
You can create a new class, Key, put an array with some numbers in it as a field, and implement your own hashCode() and equals() based on the contents of the array.
It will improve the readability as well:
HashMap<Key, Integer> mem;
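One possible shape for such a key class is sketched below. The name IntArrayKey is made up; Arrays.hashCode and Arrays.equals are the JDK helpers that work on array contents. The defensive copy in the constructor is an extra assumption that keeps the key from changing after it has been put into the map:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class IntArrayKey {
    private final int[] values;

    IntArrayKey(int[] values) {
        // Copy so that later changes to the caller's array cannot alter the key.
        this.values = values.clone();
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof IntArrayKey
                && Arrays.equals(values, ((IntArrayKey) other).values);
    }

    @Override
    public int hashCode() {
        // Content-based hash, unlike int[].hashCode(), which is identity-based.
        return Arrays.hashCode(values);
    }

    public static void main(String[] args) {
        Map<IntArrayKey, Integer> mem = new HashMap<>();
        mem.put(new IntArrayKey(new int[] {1, 2, 3}), 42);
        // A different array object with the same contents finds the cached value.
        System.out.println(mem.get(new IntArrayKey(new int[] {1, 2, 3})));
    }
}
```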
If your ArrayList contains usually 3-4 elements, I would not worry much about performance. Your approach is OK. But as others pointed out, your key is thus mutable which is a bad idea.
Another approach is to append all elements of the ArrayList together using some separator (say #) and thus have this kind of string for key: 123#555#66678 instead of an ArrayList of these 3 integers. You can just call Arrays.toString(int[]) and get a decent string key out of an array of integers.
I would choose the second approach.
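As a sketch of the string-key approach: Arrays.toString(int[]) turns the array {123, 555, 66678} into "[123, 555, 66678]", which works as an immutable map key. The class name, the counter and the expensive() stand-in below are hypothetical and only there to make the example self-contained:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class StringKeyMemo {
    private final Map<String, Integer> memo = new HashMap<>();
    int computations = 0; // counts cache misses, for demonstration only

    // Builds the map key from the array contents.
    static String key(int[] state) {
        return Arrays.toString(state);
    }

    int solve(int[] state) {
        String k = key(state);
        Integer cached = memo.get(k);
        if (cached != null) {
            return cached;
        }
        computations++;
        int result = expensive(state);
        memo.put(k, result);
        return result;
    }

    // Hypothetical stand-in for the real, costly computation.
    private static int expensive(int[] state) {
        int sum = 0;
        for (int v : state) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        StringKeyMemo m = new StringKeyMemo();
        System.out.println(key(new int[] {123, 555, 66678})); // [123, 555, 66678]
        System.out.println(m.solve(new int[] {1, 2, 3}));
        System.out.println(m.solve(new int[] {1, 2, 3}));     // served from the cache
    }
}
```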
If the input array is large, the main problem seems to be the efficiency of lookup. On the other hand, your computation is probably much more expensive than that, so you've got some CPU cycles to spare.
Lookup time will depend both on the hashcode calculation and on the brute-force equals needed to pinpoint the key in a hash bucket. This is why the array as a key is out of the question.
The suggestion already given by user:XpressOneUp, creating a class which wraps the array and provides its custom hash code, seems like your best bet and you can optimize hashcode calculation to involve only some array elements. You'll know best which elements are the most salient.
If the values in the array are small integers, then here is a way to do it efficiently:
HashMap<String, Integer> map = new HashMap<>();
// Joins the elements with a separator so that different lists give different keys.
public String encode(List<Integer> arr) {
    StringBuilder key = new StringBuilder();
    for (int value : arr) {
        key.append(value).append(',');
    }
    return key.toString();
}
Use the encode method to convert your array to a unique string, then use that string to add and look up values in the HashMap.

Prevent treemap merging on collision

Edit: I should have probably mentioned that I am extremely new to Java programming. I just started with the language about two weeks ago.
I have tried looking for an answer to this question, but so far I haven't found one, so that is why I am asking it here.
I am writing Java code for a Dungeons and Dragons Initiative Tracker and I am using a TreeMap for its ability to sort on entry. I am still very new to Java, so I don't know everything that is out there.
My problem is that when I have two of the same keys, the tree merges the values such that one of the values no longer exists. I understand this can be desirable behavior but in my case I cannot have that happen. I was hoping there would be an elegant solution to fix this behavior. So far what I have is this:
TreeMap<Integer,Character> initiativeList = new TreeMap<Integer,Character>(Collections.reverseOrder());
Character [] cHolder = new Character[3];
out.println("Thank you for using the Initiative Tracker Project.");
cHolder[0] = new Character("Fred",2);
cHolder[1] = new Character("Sam",3,23);
cHolder[2] = new Character("John",2,23);
for(int i = 0; i < cHolder.length; ++i)
{
initiativeList.put(cHolder[i].getInitValue(), cHolder[i]);
}
out.println("Initiative List: " + initiativeList);
Character is a class that I have defined that keeps track of a player's character name and initiative values.
Currently the output is this:
Initiative List: {23=John, 3=Fred}
I considered using a TreeMap with some sort of subCollection but I would also run into a similar problem. What I really need to do is just find a way to disable the merge. Thank you guys for any help you can give me.
EDIT: In Dungeons and Dragons, a character rolls a 20-sided die and then adds their initiative mod to the result to get their total initiative. Sometimes two players can get the same values. I've thought about having the key formatted like this:
Key = InitiativeValue.InitiativeMod
So for Sam his key would be 23.3 and John's would be 23.2. I understand that I would need to change the key type to float instead of int.
However, even with that two players could have the same Initiative Mod and roll the same Initiative Value. In reality this happens more than you might think. So for example,
Say both Peter and Scott join the game. They both have an initiative modifier of 2, and they both roll a 10 on the 20 sided dice. That would make both of their Initiative values 12.
When I put them into the existing map, they both need to show up even though they have the same value.
Initiative List: {23=John, 12=Peter, 12=Scott, 3=Fred}
I hope that helps to clarify what I am needing.
If I understand you correctly, you have a bunch of characters and their initiatives, and want to "invert" this structure to key by initiative ID, with the value being all characters that have that initiative. This is perfectly captured by a MultiMap data structure, of which one implementation is the Guava TreeMultimap.
There's nothing magical about this. You could achieve something similar with a
TreeMap<Initiative,List<Character>>
This is not exactly how a Guava multimap is implemented, but it's the simplest data structure that could support what you need.
If I were doing this I would write my own class that wrapped the above TreeMap and provided an add(K key, V value) method that handled the duplicate detection and list management according to your specific requirements.
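A minimal sketch of such a wrapper is below. It stores plain names instead of the asker's Character class to stay self-contained, and the class name InitiativeTracker is invented for the example. Characters with the same initiative end up in the same list instead of overwriting each other:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

final class InitiativeTracker {
    // Highest initiative first, as in the question.
    private final TreeMap<Integer, List<String>> byInitiative =
            new TreeMap<>(Collections.reverseOrder());

    // Creates the list for a new initiative value, then appends the name to it.
    void add(int initiative, String name) {
        byInitiative.computeIfAbsent(initiative, k -> new ArrayList<>()).add(name);
    }

    @Override
    public String toString() {
        return byInitiative.toString();
    }

    public static void main(String[] args) {
        InitiativeTracker tracker = new InitiativeTracker();
        tracker.add(3, "Fred");
        tracker.add(23, "John");
        tracker.add(12, "Peter");
        tracker.add(12, "Scott");
        System.out.println("Initiative List: " + tracker);
        // Initiative List: {23=[John], 12=[Peter, Scott], 3=[Fred]}
    }
}
```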
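You say you are "...using a TreeMap for its ability to sort on entry..." - but maybe you could just use a TreeSet instead. You'll need to implement a suitable compareTo method on your Character class that performs the comparison that you want; and I strongly recommend that you implement hashCode and equals too. Note that a TreeSet treats two elements as duplicates when compareTo returns 0, so the comparison needs a tie-breaker (for example the name) to keep characters with equal initiative.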
Then, when you iterate through the TreeSet, you'll get the Character objects in the appropriate order. Note that Map classes are intended for lookup purposes, not for ordering.
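Here is a rough sketch of the TreeSet variant. It uses a Comparator rather than compareTo, and a made-up Combatant class in place of the asker's Character; both are assumptions for the example. The comparator breaks ties by name, so two characters with the same initiative both stay in the set (two characters with the same name and the same initiative would still collapse into one):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class Combatant {
    final String name;
    final int initiative;

    Combatant(String name, int initiative) {
        this.name = name;
        this.initiative = initiative;
    }

    // Highest initiative first; equal initiatives are ordered by name.
    static final Comparator<Combatant> TURN_ORDER =
            Comparator.comparingInt((Combatant c) -> c.initiative)
                    .reversed()
                    .thenComparing(c -> c.name);

    // Collects the names in the set's iteration order.
    static List<String> namesInTurnOrder(TreeSet<Combatant> combatants) {
        List<String> names = new ArrayList<>();
        for (Combatant c : combatants) {
            names.add(c.name);
        }
        return names;
    }

    public static void main(String[] args) {
        TreeSet<Combatant> combatants = new TreeSet<>(TURN_ORDER);
        combatants.add(new Combatant("Fred", 3));
        combatants.add(new Combatant("Scott", 12));
        combatants.add(new Combatant("Peter", 12));
        combatants.add(new Combatant("John", 23));
        System.out.println(namesInTurnOrder(combatants)); // [John, Peter, Scott, Fred]
    }
}
```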

Reflection, hashtable or methods for performance?

I'm trying to write a Java program to decode and encode Ogg streams. I've got a decoder working but I didn't like the fact that I had duplicate code so I started writing something like that:
Decoder oggDecoder = new Decoder(
new StringDecoder( "Ogg" ),
new IntDecoder( "something" )//, ...
);
I wrote encoders and decoders for some "types" and then use them to build the whole thing.
But then I don't know how to store the result. I have 3 options I know:
- keep the data in an array of bytes and provide a get( String name ) and set( String name, Object value ) methods that will work directly on the bytes.
- use a dictionary.
- use a class and use reflection to set the properties.
I'm not that much into performance and if it's slow I don't really care as long as it's fast enough to read music. Meaning that I know writing the functions myself would make it faster but I want to write just one function working for all the properties.
So what do you think would be the fastest way of doing this?
Another way to ask this question would be:
Given a set of field names as an array of String, what is the most appropriate data structure to store the corresponding values that got decoded from a byte stream:
- keep them as byte
- store them in a dictionary
- store them in a class using reflection
Thanks in advance for your answer.
KISS - just use a HashMap<String, byte[]>. No reflection needed.
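A small sketch of what that could look like follows. The class name, the field names "capture" and "serial", and the little-endian byte order are all assumptions for the example rather than something taken from the asker's decoder. The raw bytes stay in the map and are only interpreted when a caller asks for a typed value:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

final class DecodedFields {
    private final Map<String, byte[]> fields = new HashMap<>();

    // Stores a copy of the raw bytes under the given field name.
    void set(String name, byte[] raw) {
        fields.put(name, raw.clone());
    }

    // Interprets the stored bytes as ASCII text.
    String getString(String name) {
        return new String(fields.get(name), StandardCharsets.US_ASCII);
    }

    // Interprets the first four stored bytes as a little-endian int (assumed byte order).
    int getInt(String name) {
        return ByteBuffer.wrap(fields.get(name)).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        DecodedFields page = new DecodedFields();
        page.set("capture", "OggS".getBytes(StandardCharsets.US_ASCII));
        page.set("serial", new byte[] {1, 0, 0, 0});
        System.out.println(page.getString("capture")); // OggS
        System.out.println(page.getInt("serial"));     // 1
    }
}
```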
Update
I don't think I understood at first what you want, but now I think what you are looking for is a heterogeneous map structure.
Here's a question that might be of more use to you.

A two way String hash function

I want to get a unique numeric representation of a String. I know there are lots of ways of doing this; my question is which do you think is the best? I don't want to have negative numbers, so the hashCode() function in Java is not so good, although I could override it ... but I'd rather not, since I am not so confident and don't want to accidentally break something.
My Strings are all semantic-web URIs. The reason for the numeric representation is that when I display the data for a URI on a page I need something to pass into the query string or put into various fields in my JavaScript. The URI itself is too unwieldy and looks bad when you have a URI as a value in a URI.
Basically I want to have a class called Resource which will look like this
class Resource {
    int id;
    String uri;
    String value; // this is the label or human readable name
    // .... other code/getters/setters here
    public int getId() {
        return id = stringToIntFunction(uri);
    }
    private int stringToIntFunction(String uri) {
        // do magic here
    }
}
Can you suggest a function that would do this if:
It had to be two way, that is you could also recover the original string from the numeric value
It doesn't have to be two way
Also are there other issues that are important that I am not considering?
If you want it to be reversible, you're in trouble. Hashes are designed to be one-way.
In particular, given that an int has 32 bits of information, and a char has 16 bits of information, requiring reversibility means you can only have strings of zero, one or two characters (and even that's assuming that you're happy to encode "" as "\0\0" or something similar). That's assuming you don't have any storage, of course. If you can use storage, then just store numbers sequentially... something like:
private int stringToIntFunction(String uri) {
Integer existingId = storage.get(uri);
if (existingId != null) {
return existingId.intValue();
}
return storage.put(uri);
}
Here storage.put() would increase a counter internally, store the URI as being associated with that counter value, and return it. My guess is that that's not what you're after though.
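One way the storage idea could be fleshed out is sketched below, using an in-memory map and list. The class and method names are invented for the example, and a real application would probably persist the mapping. Ids are handed out sequentially, are never negative, and can be turned back into the URI:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class UriRegistry {
    private final Map<String, Integer> idByUri = new HashMap<>();
    private final List<String> uriById = new ArrayList<>();

    // Returns the existing id for the URI, or assigns the next free one.
    int idFor(String uri) {
        Integer existing = idByUri.get(uri);
        if (existing != null) {
            return existing;
        }
        int id = uriById.size(); // the list index doubles as the counter
        uriById.add(uri);
        idByUri.put(uri, id);
        return id;
    }

    // Reverse lookup: recovers the URI from its id.
    String uriFor(int id) {
        return uriById.get(id);
    }

    public static void main(String[] args) {
        UriRegistry registry = new UriRegistry();
        int id = registry.idFor("http://example.org/resource/a");
        System.out.println(id + " -> " + registry.uriFor(id));
    }
}
```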
Basically, to perform a reversible encryption, I'd use a standard encryption library having converted the string to a binary format first (e.g. using UTF-8). I would expect the result to be a byte[].
If it doesn't have to be reversible, I'd consider just taking the absolute value of the normal hashCode() result (but mapping Integer.MIN_VALUE to something specific, as its absolute value can't be represented as an int).
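The special case matters because Math.abs(Integer.MIN_VALUE) is still Integer.MIN_VALUE. A small sketch follows; mapping that value to 0 is an arbitrary choice, and the class name is made up. The result is non-negative but still not unique, since different strings can share a hash code:

```java
final class NonNegativeHash {
    // Maps any int hash to a non-negative int.
    static int of(int hash) {
        // Math.abs cannot represent -Integer.MIN_VALUE, so handle it separately.
        return hash == Integer.MIN_VALUE ? 0 : Math.abs(hash);
    }

    static int of(String s) {
        return of(s.hashCode());
    }

    public static void main(String[] args) {
        System.out.println(Math.abs(Integer.MIN_VALUE)); // -2147483648
        System.out.println(of(Integer.MIN_VALUE));       // 0
        System.out.println(of(-5));                      // 5
    }
}
```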
Hashes are one way only (that's part of the reason they have a fixed length regardless of the input size). If you need two-way, you're looking at something like Base64 encoding.
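For the two-way case, a sketch using the JDK's java.util.Base64 with the URL-safe alphabet is below (the class name is made up). Keep in mind that this is an encoding, not encryption, and that the token comes out roughly a third longer than the URI, so it does not help with the "unwieldy" complaint:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class UriToken {
    // Encodes the URI with the URL-safe Base64 alphabet and no '=' padding.
    static String encode(String uri) {
        return Base64.getUrlEncoder()
                .withoutPadding()
                .encodeToString(uri.getBytes(StandardCharsets.UTF_8));
    }

    // Recovers the original URI from the token.
    static String decode(String token) {
        return new String(Base64.getUrlDecoder().decode(token), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String token = encode("http://example.org/resource/a");
        System.out.println(token);
        System.out.println(decode(token));
    }
}
```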
Why can't you have negative numbers? Where do the URIs come from? Are they in a database? Why not use the Database Key ID? If they are not in a database, can you generate them for the user given a set of variables/parameters? (So the query string only contains things like foo=1&bar=two and you generate the URL on the Server or JavaScript side)
Given all the remarks made above (a hash function is one way), I would go for 2 possible solutions:
Use some encrypting function to get a long string representing your URL (you'll get something like param=456ab894ce897b98f; this could be longer or shorter depending on the URL). See DES encryption for instance, or base64url.
Keep track of the URLs in a database (could be also a simple file-based database such as SQLite). Then you'll effectively have an integer <=> URL equivalence.
"Unique representation" implies that the Java-supplied String.hashCode() would be useless - you'd soon come across two URIs that share the same hash code.
Any two-way scheme is going to result in an unwieldy string - unless you store the URIs in a database and use the record ID as your unique identifier.
As far as one-way goes - an MD5 hash would be considerably more unique (but by no means unique) than the simple hashcode - but might be verging on "unwieldy" depending on your definition!
Q1: If you want to recover the string from the number then you could use:
1a: an encryption of the string, which is going to be the same size, or longer, unless you zip the string first. This will give an array of random looking bytes, which could be displayed as Base-64.
1b: a database, or a map, and the number is the index of the string in the map/database.
Q2: The string does not have to be recoverable.
Various ideas are possible here. You can display the hash in hex or in Base-64 to avoid negative signs. The only non-alphanumeric characters in Base-64 are '+', '/' and '='. For an almost unique hash you will need something of cryptographic size, MD5 (128 bits), SHA-1 (160 bits) or SHA-2 (256 or 512 bits).
An MD5 hash looks like "d131dd02c5e6eec4693d9a0698aff95c" in hex; the larger the hash the less likely a collision is.
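A sketch of producing such a hex string with the JDK's MessageDigest is below; the class name is made up, and MD5 is used only because it is the example above (it is not suitable for security purposes). Swapping in "SHA-256" gives a longer digest the same way:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HexDigest {
    // Returns the MD5 digest of the text as 32 lowercase hex characters.
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to 0..255 so negative bytes print as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("abc")); // 900150983cd24fb0d6963f7d28e17f72
    }
}
```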
rossum
