As short as possible unique ID - Java

I'm making a tool for optimizing scripts, and I now want to compress all the names in a script to the minimum length.
I got a function started for it, but it is buggy and stops working once the length exceeds 2.
Is there an easier way to do this? I just need a pattern that generates Strings starting from a -> z, then aa -> az, ba -> bz, and so on.
public String getToken() {
String result = ""; int i = 0;
while(i < length){
result = result + charmap.substring(positions[i], positions[i]+1);
positions[length]++;
if (positions[current] >= charmap.length()){
positions[current] = 0;
if ( current < 1 ) {
current++;length++;
}else{
int i2 = current-1;
while( i2 > -1 ){
positions[i2]++;
if(positions[i2] < charmap.length()){
break;
}else if( i2 > 0 ){
positions[i2] = 0;
}else{
positions[i2] = 0;
length++;current++;
}
i2--;
}
}
}
i++;
}
return result;
}
Unlike the other questions, I don't just want to increment an integer; the length increases too much.

Here's one I used
public class AsciiID {
private static final String alphabet=
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
private int currentId;
public String nextId() {
int id = currentId++;
StringBuilder b = new StringBuilder();
do {
b.append(alphabet.charAt(id % alphabet.length()));
} while((id /=alphabet.length()) != 0);
return b.toString();
}
}

I would use a base 36 or base 64 (depending on case sensitivity) library: keep a plain integer counter and, just before output, convert the integer to a base 36/64 number. That way you can think in terms of a sequence, which is easier, and the output value is handled by a trusted library.

You can use:
Integer.toString(i++, Character.MAX_RADIX)
It's base 36. The result is not as compact as Base64, but it is a one-line implementation.
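To make the one-liner concrete, here is a minimal sketch of what the base 36 tokens look like. The class and method names are only illustrative. Note that some tokens start with a digit, which may not be a valid identifier in the script language being compressed; that is an assumption to check for your tool.

```java
class Radix36Demo {
    // Hypothetical helper: renders a non-negative counter value in base 36 (0-9, a-z).
    public static String toBase36(int id) {
        return Integer.toString(id, Character.MAX_RADIX);
    }

    public static void main(String[] args) {
        System.out.println(toBase36(0));    // 0
        System.out.println(toBase36(35));   // z
        System.out.println(toBase36(36));   // 10
        System.out.println(toBase36(1295)); // zz
    }
}
```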

You could search for a library that operates on numbers of any radix, say 27, 37 or more. Then you output that number as an alphanumeric string (like hex, but with a-zA-Z0-9).

Well, let's assume we can only output ASCII (for Unicode this problem gets... complicated). A quick look shows that its printable characters are in the range [32,126], which is 95 characters including the space. So to get the most efficient representation we have to encode a given integer in base 95, so to speak, and add 32 to each generated digit to get the char.
How do you do that? Look up how Sun does it in Integer.toString() and adapt it accordingly. Well, that is probably more complex than necessary - just think about how you convert a number into radix 2 and adapt that. In its simplest form that's basically a loop with one division and one modulo.
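The "loop with one division and modulo" could be sketched roughly as follows. This assumes the full printable range [32,126] (95 characters, space included) is acceptable in the output; the class and method names are made up for illustration.

```java
class PrintableAscii {
    private static final int FIRST = 32;           // ' ' (space), first printable ASCII char
    private static final int RADIX = 126 - 32 + 1; // 95 printable characters

    // Encodes a non-negative id as a base-95 number using printable ASCII digits.
    public static String encode(int id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        StringBuilder sb = new StringBuilder();
        do {
            sb.append((char) (FIRST + id % RADIX)); // least significant digit first
            id /= RADIX;
        } while (id != 0);
        return sb.reverse().toString();             // most significant digit first
    }

    public static void main(String[] args) {
        System.out.println("[" + encode(94) + "]"); // [~]
        System.out.println("[" + encode(95) + "]"); // [! ]
    }
}
```

If some characters (space, quotes, backslash) are not safe in your output format, a lookup string of allowed characters is the safer variant.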

In your tool you need to create a dictionary that contains a unique integer id for each unique string, along with the string itself. When adding strings to the dictionary, you increment the id for each newly added unique string. Once the dictionary is complete, you can simply convert the ids to Strings using something like this:
static final String CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
static final int CHARS_LENGTH = CHARS.length();
public String convert(int id) {
StringBuilder sb = new StringBuilder();
do {
sb.append(CHARS.charAt(id % CHARS_LENGTH));
id = id / CHARS_LENGTH;
} while(id != 0);
return sb.toString();
}
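A minimal sketch of the dictionary part described above, assuming the names are fed in one at a time. NameDictionary and shortNameFor are hypothetical names; convert is the same loop as in the snippet above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NameDictionary {
    private static final String CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    // original name -> id, in order of first appearance
    private final Map<String, Integer> ids = new LinkedHashMap<>();

    // Returns the short replacement for a name, assigning a new id on first sight.
    public String shortNameFor(String original) {
        Integer id = ids.get(original);
        if (id == null) {
            id = ids.size();          // next unused id
            ids.put(original, id);
        }
        return convert(id);
    }

    static String convert(int id) {
        StringBuilder sb = new StringBuilder();
        do {
            sb.append(CHARS.charAt(id % CHARS.length()));
            id = id / CHARS.length();
        } while (id != 0);
        return sb.toString();
    }

    public static void main(String[] args) {
        NameDictionary dict = new NameDictionary();
        System.out.println(dict.shortNameFor("counter")); // A
        System.out.println(dict.shortNameFor("total"));   // B
        System.out.println(dict.shortNameFor("counter")); // A
    }
}
```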

This function generates the Nth bijective number (except the zeroth). For a given character set, this is the shortest coding possible. (The zeroth would be an empty string.)
If there were 10 possible characters, 0-9, it generates, in order:
10 strings of length 1, from "0" to "9"
10*10 strings of length 2, from "00" to "99"
10*10*10 strings of length 3, from "000" to "999"
etc.
The example uses 93 characters, because I just happened to need those for Json.
private static final char[] ALLOWED_CHARS =
" !#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~"
.toCharArray();
private static final AtomicInteger uniqueIdCounter = new AtomicInteger();
public static String getToken() {
int id = uniqueIdCounter.getAndIncrement();
return toBijectiveNumber(id, ALLOWED_CHARS);
}
public static String toBijectiveNumber(int id, char[] allowedChars) {
assert id >= 0;
StringBuilder sb = new StringBuilder(8);
// long, so that divisor * allowedChars.length cannot overflow for large ids
long divisor = 1;
int length = 1;
while (id >= divisor * allowedChars.length) {
divisor *= allowedChars.length;
length++;
id -= divisor;
}
for (int i = 0; i < length; i++) {
sb.append(allowedChars[(int) ((id / divisor) % allowedChars.length)]);
divisor /= allowedChars.length;
}
return sb.toString();
}
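For the exact sequence asked for in the question (a -> z, then aa -> az, ba -> bz, ...), the same bijective idea can be written against a plain a-z alphabet. This is only a sketch; LetterSequence and nameFor are illustrative names, and the index is assumed to start at 0.

```java
class LetterSequence {
    // 0 -> "a", 25 -> "z", 26 -> "aa", 51 -> "az", 52 -> "ba", ...
    public static String nameFor(int index) {
        if (index < 0) {
            throw new IllegalArgumentException("index must be non-negative");
        }
        StringBuilder sb = new StringBuilder();
        long n = index + 1L;                  // bijective numbering is 1-based
        while (n > 0) {
            n--;                              // shift so each digit is in 0..25
            sb.append((char) ('a' + n % 26)); // least significant letter first
            n /= 26;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(nameFor(0));   // a
        System.out.println(nameFor(25));  // z
        System.out.println(nameFor(26));  // aa
        System.out.println(nameFor(52));  // ba
        System.out.println(nameFor(702)); // aaa
    }
}
```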

Related

Count all possible decoding Combination of the given binary String in Java

Suppose we have a string of binary values in which some portions may correspond to specific letters, for example:
A = 0
B = 00
C = 001
D = 010
E = 0010
F = 0100
G = 0110
H = 0001
For example, if we assume the string "00100", we can have 5 different possibilities:
ADA
AF
CAA
CB
EA
I have to extract the exact number of combinations using Dynamic programming.
But I have difficulty in the formulation of subproblems and in the composition of the corresponding vector of solutions.
I appreciate any indications of the correct algorithm formulation.
class countString {
static int count(String a, String b, int m, int n) {
if ((m == 0 && n == 0) || n == 0)
return 1;
if (m == 0)
return 0;
if (a.charAt(m - 1) == b.charAt(n - 1))
return count(a, b, m - 1, n - 1) +
count(a, b, m - 1, n);
else
return count(a, b, m - 1, n);
}
public static void main(String[] args) {
Locale.setDefault(Locale.US);
ArrayList<String> substrings = new ArrayList<>();
substrings.add("0");
substrings.add("00");
substrings.add("001");
substrings.add("010");
substrings.add("0010");
substrings.add("0100");
substrings.add("0110");
substrings.add("0001");
if (args.length != 1) {
System.err.println("ERROR - execute with: java countString -filename- ");
System.exit(1);
}
try {
Scanner scan = new Scanner(new File(args[0])); // not important
String S = "00100";
int count = 0;
for(int i=0; i<substrings.size(); i++){
count = count + count(S,substrings.get(i),S.length(),substrings.get(i).length());
}
System.out.println(count);
} catch (FileNotFoundException e) {
System.out.println("File not found " + e);
}
}
}
In essence, Dynamic Programming is an enhanced brute-force approach.
As with brute force, we need to generate all possible results. But unlike plain brute force, the problem is divided into smaller subproblems, and the previously computed result of each subproblem is stored and reused.
Since you are using recursion, you need to apply the so-called memoization technique to store and reuse the intermediate results. In this case, a HashMap is a perfect means of storing the results.
But before applying memoization, and in order to understand it better, it makes sense to start with a clean and simple recursive solution that works correctly, and only then enhance it with DP.
Plain Recursion
Every recursive implementation should contain two parts:
Base case - represents a simple edge case (or a set of edge cases) for which the outcome is known in advance. For this problem there are two edge cases: when the length of the given string is 0, the result is 1 (an empty binary string "" decodes to an empty string of letters ""); when it is impossible to decode the given binary string, the result is 0 (in the solution below this resolves naturally while the recursive case is executed).
Recursive case - the part of the solution where the recursive calls are made and where the main logic resides. In the recursive case, we need to find each "binary letter" at the beginning of the string and then call the method recursively, passing the substring (without the "letter"). The results of these recursive calls are accumulated in the total count that is returned from the method.
To implement this logic we need only two arguments: the binary string to analyze and a list of binary letters:
public static int count(String str, List<String> letters) {
if (str.isEmpty()) { // base case - a combination was found
return 1;
}
// recursive case
int count = 0;
for (String letter: letters) {
if (str.startsWith(letter)) {
count += count(str.substring(letter.length()), letters);
}
}
return count;
}
This concise solution already produces the correct result. Now, let's turn this brute-force version into a DP-based solution by applying memoization.
Dynamic Programming
As I said earlier, a HashMap is a perfect means of storing the intermediate results because it allows us to associate a count (the number of combinations) with a particular string and then retrieve this number almost instantly (O(1) expected time per lookup).
Here is how it might look:
public static int count(String str, List<String> letters, Map<String, Integer> vocab) {
if (str.isEmpty()) { // base case - a combination was found
return 1;
}
if (vocab.containsKey(str)) { // result was already computed and present in the map
return vocab.get(str);
}
int count = 0;
for (String letter: letters) {
if (str.startsWith(letter)) {
count += count(str.substring(letter.length()), letters, vocab);
}
}
vocab.put(str, count); // storing the total `count` into the map
return count;
}
main()
public static void main(String[] args) {
List<String> letters = List.of("0", "00", "001", "010", "0010", "0100", "0110", "0001"); // binary letters
System.out.println(count("00100", letters, new HashMap<>())); // DP
System.out.println(count("00100", letters)); // brute-force recursion
}
Output:
5 // DP
5 // plain recursion
Hope this helps.
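For comparison, the same subproblems can also be filled in bottom-up with an array instead of a map, which avoids recursion and substring keys. This is a sketch, not part of the answer above: ways[i] is assumed to mean "number of decodings of the suffix starting at index i", the class name is made up, and all letters are assumed to be non-empty.

```java
import java.util.List;

class DecodingCounter {
    // Counts the ways to split str into a sequence of the given binary letters.
    public static long count(String str, List<String> letters) {
        long[] ways = new long[str.length() + 1];
        ways[str.length()] = 1;                       // empty suffix: one (empty) decoding
        for (int i = str.length() - 1; i >= 0; i--) {
            for (String letter : letters) {
                if (str.startsWith(letter, i)) {      // letter matches at position i
                    ways[i] += ways[i + letter.length()];
                }
            }
        }
        return ways[0];
    }

    public static void main(String[] args) {
        List<String> letters = List.of("0", "00", "001", "010", "0010", "0100", "0110", "0001");
        System.out.println(count("00100", letters)); // 5
        System.out.println(count("000", letters));   // 3
    }
}
```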
The idea is to build every possible concatenation of these values and check whether the input starts with it. If it does not, move on to the next index.
If you have test cases ready, you can verify it further.
I have tested it with only 2-3 values.
public int getCombo(String[] array, int startingIndex, String val, String input) {
int count = 0;
for (int i = startingIndex; i < array.length; i++) {
String matchValue = val + array[i];
if (matchValue.length() <= input.length()) {
// if value matches then count + 1
if (matchValue.equals(input)) {
count++;
System.out.println("match Found---->" + count); // omit this println; it's only for testing.
return count;
} else if (input.startsWith(matchValue)) { // checking whether the input is starting with the new value
// search further combos
count += getCombo(array, 0, matchValue, input);
}
}
}
return count;
}
In main Method
String[] arr = substrings.toArray(new String[0]);
int count = 0;
for (int i = 0; i < arr.length; i++) {
System.out.println("index----?> " + i);
//adding this condition for single inputs i.e "0","010";
if(arr[i].equals(input))
count++;
else
count = count + getCombo(arr, 0, arr[i], input);
}
System.out.println("Final count : " + count);
My test results :
input : 00100
Final count 5
input : 000
Final count 3

Java - Help converting letter to integer, adding 5, then converting back to letter

First off, here is my code so far
public int encrypt() {
/* This method will apply a simple encrypted algorithm to the text.
* Replace each character with the character that is five steps away from
* it in the alphabet. For instance, 'A' becomes 'F', 'Y' becomes '~' and
* so on. Builds a string with these new encrypted values and returns it.
*/
text = toLower;
encrypt = "";
int eNum = 0;
for (int i = 0; i <text.length(); i++) {
c = text.charAt(i);
if ((Character.isLetter(c))) {
eNum = (int) - (int)'a' + 5;
}
}
return eNum;
}
(text is the inputted string by the way. And toLower makes the string all lower case to make it easier converting.)
I got most of my assignment done, but one part of it is tasking me with moving every letter inputted 5 spaces over. A becomes F, B becomes G, etc.
So far I have the letter converted to a number, but I am having trouble adding to it and then converting it back to a letter.
When I run the program and I enter my input such as "abc" I get '8'. It just adds them all up.
Any help would be much appreciated, and I can post the full code if necessary.
A few issues -
First of all - in eNum = (int) - (int)'a' + 5; you do not need the first (int) - I believe; you can just do eNum = (int)c + 5;. Your expression always results in a negative integer.
Instead of returning eNum, you should convert it to a character, append it to a string, and return the string at the end (or you can create a character array of the same length as the string, keep storing the characters in the array, and return a string created from the character array).
Instead of using 'a' in the expression, you should use c, which denotes the current character at the ith index.
I am guessing not all of the variables in your code are member variables (instance variables) of the class, so you should declare them with a data type in your code.
Example changes to your code -
String text = toLower; //if toLower is not correct, use a correct variable to get the data to encrypt from.
String encrypt = "";
for (int i = 0; i <text.length(); i++) {
char c = text.charAt(i);
if ((Character.isLetter(c))) {
encrypt += (char)((int)c + 5);
}
}
return encrypt;
//Just a quick conversion for testing
String yourInput = "AbC".toLowerCase();
String convertedString = "";
for (int i = 0; i < yourInput.length(); i++) {
char c = yourInput.charAt(i);
int num = c; // the char's numeric code, e.g. 'a' -> 97
num = (num + 5) % 128; // if you somehow manage to pass 127, start at 0 again using modulus to prevent errors
convertedString += (char) num; // convert the number back to a character
}
System.out.println(convertedString); // prints fgh
Hope this is what you're looking for.
Try something like this, I believe this has several advantages:
public String encrypt(String in) {
String workingCopy = in.toLowerCase();
StringBuilder out = new StringBuilder();
for (int i = 0; i < workingCopy.length(); i++) {
char c = workingCopy.charAt(i);
if ((Character.isLetter(c))) {
out.append((char)(c + 5));
}
}
return out.toString();
}
This code is a little bit verbose, but perhaps then it is easier to follow. I introduced the StringBuilder because it is more efficient than doing string = string + x
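The step that trips people up is that char + int yields an int, so the result has to be cast back to char before it is appended. A small sketch with the offset as a parameter follows; ShiftCipher and shift are illustrative names. Unlike the method above, this variant keeps non-letters unchanged instead of dropping them, which is an assumption about what the assignment wants.

```java
class ShiftCipher {
    // Moves every letter `offset` positions along the character table; other chars are kept.
    public static String shift(String in, int offset) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            // c + offset is an int, so cast it back to char before appending
            out.append(Character.isLetter(c) ? (char) (c + offset) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(shift("abc", 5)); // fgh
        System.out.println(shift("y", 5));   // ~
    }
}
```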

Flip a Hex String

Following up on another question asked here, "Split a Hex String without spaces and flip it", I am writing this new question to state the problem more clearly.
I have a hex string like this:
Hex_string = 2B00FFEC
What I need is to reverse the byte order of the hex string so that it starts from the last characters, like this:
Flipped_hex_string = ECFF002B
In the other question I asked for a way to achieve this using the .split() method, but there should be a better way to do it.
The simplest way I can think of is:
String s = "2B00FFEC";
StringBuilder result = new StringBuilder();
for (int i = 0; i <=s.length()-2; i=i+2) {
result.append(new StringBuilder(s.substring(i,i+2)).reverse());
}
System.out.println(result.reverse().toString()); //op :ECFF002B
In the comments, the OP constrains the string to exactly 8 characters.
A purely numeric answer (inspired by idioms for converting endianness) saves going to and from strings.
n is an int:
int m = ((n>>24)&0xff) | // byte 3 to byte 0
((n<<8)&0xff0000) | // byte 1 to byte 2
((n>>8)&0xff00) | // byte 2 to byte 1
((n<<24)&0xff000000); // byte 0 to byte 3
If you need to convert this back to hexadecimal (note that Integer.toHexString omits leading zeros and uses lowercase digits), use
String s = Integer.toHexString(m);
and if you need to set n from hexadecimal, use
int n = (int)Long.parseLong(hex_string, 16);
where hex_string is your initial string. You need to go via the Long parser because Integer.parseInt rejects values above 7FFFFFFF, which are negative when stored in an int.
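On Java 8 or later, the same numeric idea can be expressed with library calls: Integer.parseUnsignedInt accepts the full 32-bit range and Integer.reverseBytes does the byte swap. A sketch, assuming the input is exactly 8 hex digits as the OP stated (HexFlip and flip are illustrative names):

```java
class HexFlip {
    // Reverses the byte order of an 8-digit hex string, e.g. 2B00FFEC -> ECFF002B.
    public static String flip(String hex) {
        int n = Integer.parseUnsignedInt(hex, 16); // accepts values up to FFFFFFFF
        int m = Integer.reverseBytes(n);           // swaps byte 0<->3 and byte 1<->2
        return String.format("%08X", m);           // zero-padded, upper-case hex
    }

    public static void main(String[] args) {
        System.out.println(flip("2B00FFEC")); // ECFF002B
        System.out.println(flip("2B000000")); // 0000002B
    }
}
```

The "%08X" format keeps leading zeros, which a plain Integer.toHexString would drop.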
You could do something like:
String a = "456789AB";
char[] ca = a.toCharArray();
StringBuilder sb = new StringBuilder(a.length());
for (int i = 0; i<a.length();i+=2)
{
sb.insert(0, ca, i, 2);
}
This also extends to longer Strings if needed
Perhaps you should try something as simple as this:
public static String flip(final String hex){
final StringBuilder builder = new StringBuilder(hex.length());
for(int i = hex.length(); i > 1; i-=2)
builder.append(hex.substring(i-2, i));
return builder.toString();
}
public static void main(String args[]){
System.out.println(flip("2B00FFEC"));
}
The output is: ECFF002B
Next time you ask a question, perhaps you should show us some code you've written to solve your problem (and then ask us why your code doesn't work, rather than asking us to solve your problem). You will not learn anything from us just providing answers without you knowing how they work.
This method seems to do what you want
String changeHexOrder(String s) {
char[] arr = s.toCharArray();
char tmp;
//change positions of [i, i + 1 , , , , , ,length - i - 2, length - i - 1]
for (int i = 0; i < arr.length / 2; i += 2) {
tmp = arr[i];
arr[i] = arr[arr.length-i-2];
arr[arr.length-i-2] = tmp;
tmp = arr[i+1];
arr[i+1] = arr[arr.length-i-1];
arr[arr.length-i-1] = tmp;
}
return new String(arr);
}
This worked for me:
StringBuilder lsbToMsb = new StringBuilder();
// walk backwards through input, two hex digits (one byte) at a time
for (int i = input.length(); i > 0; i -= 2)
{
lsbToMsb.append(input.substring(i - 2, i));
}
String lsbMsb = lsbToMsb.toString();

How can I generate a random 7-character alphanumeric string?

I tried something like this, but it is not working. As the parameter of the base48Encode method I pass the current system time in milliseconds.
private static final String CHARACTER_SET = "23456789abcdefghijkmnpqrstuvwxyzABCDEFGHIJKLMNPQRSTUVWXYZ";
public static String base48Encode(double d) {
Double num = Double.valueOf(d);
Integer length = CHARACTER_SET.length();
String encodeString = new String();
while (num > length) {
encodeString = CHARACTER_SET.charAt(num.intValue() % length) + encodeString;
num = Math.ceil(new Double(num / length) - 1);
}
encodeString = CHARACTER_SET.charAt(num.intValue()) + encodeString;
return encodeString;
}
I must not get duplicate values in any scenario.
It's not possible to 100% guarantee a unique value (especially given a string of 7 characters) due to the Birthday Paradox. Given a character set containing 48 characters, selecting 7 at random, you'd have a 1% chance of collision after only 110,000 random values.
You can help mitigate this by doing two things.
Use a larger character set.
Increase the length of your random value.
Using a character set of 64 characters and selecting 10 at random would greatly decrease your chance of a collision: you would reach a 1% chance only after about 160,000,000 random values.
Rather than using currentTimeMillis to generate a value, which would cause a collision if you generated two values within the same millisecond, I'd suggest just using the Random class (which is seeded from the current time down to the nanosecond).
private static final String CHARACTER_SET = "23456789abcdefghijkmnpqrstuvwxyzABCDEFGHIJKLMNPQRSTUVWXYZ";
private static Random rnd = new Random();
public static String randomString(int length){
StringBuilder builder = new StringBuilder();
for(int i = 0; i < length; i++){
builder.append(CHARACTER_SET.charAt(rnd.nextInt(CHARACTER_SET.length())));
}
return builder.toString();
}
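The collision figures quoted above can be sanity-checked with the usual birthday-problem approximation, n ~= sqrt(2 * N * ln(1 / (1 - p))), where N is the number of possible strings and p the target collision probability. This is a rough estimate rather than an exact computation, and the class and method names are illustrative.

```java
class CollisionEstimate {
    // Approximate number of random values after which a collision has probability p.
    public static double valuesForCollisionProbability(int alphabetSize, int length, double p) {
        double space = Math.pow(alphabetSize, length); // N = number of possible strings
        return Math.sqrt(2.0 * space * Math.log(1.0 / (1.0 - p)));
    }

    public static void main(String[] args) {
        // 48 characters, length 7: on the order of 1.1e5 values for a 1% chance
        System.out.printf("%.0f%n", valuesForCollisionProbability(48, 7, 0.01));
        // 64 characters, length 10: on the order of 1.5e8 values for a 1% chance
        System.out.printf("%.0f%n", valuesForCollisionProbability(64, 10, 0.01));
    }
}
```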

How to generate strings that share the same hashcode in Java?

An existing system written in Java uses the hashcode of a string as its routing strategy for load balancing.
Now, I cannot modify the system but need to generate strings that share the same hashcode to test the worst condition.
I provide those strings from commandline and hope the system will route all these strings into the same destination.
Is it possible to generate a large number of strings that share the same hashcode?
To make this question clear:
String[] getStringsInSameHashCode(int number){
//return an array in length "number"
//Every element of the array share the same hashcode.
//The element should be different from each other
}
Remarks: Any hashCode value is acceptable. There is no constraint on what the string is. But they should be different from each other.
EDIT:
Overriding methods of the String class is not acceptable because I feed those strings from the command line.
Instrumentation is also not acceptable because it would have some impact on the system.
See the test method below. Basically, two 2-character strings collide as long as
a1*31 + b1 = a2*31 + b2, which means (a1 - a2)*31 = b2 - b1:
public void testHash()
{
System.out.println("A:" + ((int)'A'));
System.out.println("B:" + ((int)'B'));
System.out.println("a:" + ((int)'a'));
System.out.println(hash("Aa".hashCode()));
System.out.println(hash("BB".hashCode()));
System.out.println(hash("Aa".hashCode()));
System.out.println(hash("BB".hashCode()));
System.out.println(hash("AaAa".hashCode()));
System.out.println(hash("BBBB".hashCode()));
System.out.println(hash("AaBB".hashCode()));
System.out.println(hash("BBAa".hashCode()));
}
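You will get the following (hash(...) is a helper that is not shown here; the raw hashCode() of both "Aa" and "BB" is 2112):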
A:65
B:66
a:97
2260
2260
2260
2260
2019172
2019172
2019172
2019172
Edit: someone said this is not straightforward enough, so I added the part below.
@Test
public void testN() throws Exception {
List<String> l = HashCUtil.generateN(3);
for(int i = 0; i < l.size(); ++i){
System.out.println(l.get(i) + "---" + l.get(i).hashCode());
}
}
AaAaAa---1952508096
AaAaBB---1952508096
AaBBAa---1952508096
AaBBBB---1952508096
BBAaAa---1952508096
BBAaBB---1952508096
BBBBAa---1952508096
BBBBBB---1952508096
Below is the source code. It might not be efficient, but it works:
public class HashCUtil {
private static String[] base = new String[] {"Aa", "BB"};
public static List<String> generateN(int n)
{
if(n <= 0)
{
return null;
}
List<String> list = generateOne(null);
for(int i = 1; i < n; ++i)
{
list = generateOne(list);
}
return list;
}
public static List<String> generateOne(List<String> strList)
{
if((null == strList) || (0 == strList.size()))
{
strList = new ArrayList<String>();
for(int i = 0; i < base.length; ++i)
{
strList.add(base[i]);
}
return strList;
}
List<String> result = new ArrayList<String>();
for(int i = 0; i < base.length; ++i)
{
for(String str: strList)
{
result.add(base[i] + str);
}
}
return result;
}
}
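The same "Aa"/"BB" building blocks can also fill the exact signature from the question by treating each index as a binary number, one block per bit. This is a sketch under that assumption; the class name SameHash is made up.

```java
class SameHash {
    // Returns `number` distinct strings that all share one hashCode.
    public static String[] getStringsInSameHashCode(int number) {
        if (number < 1) {
            return new String[0];
        }
        // Smallest block count such that 2^blocks >= number (at least one block).
        int blocks = Math.max(1, 32 - Integer.numberOfLeadingZeros(number - 1));
        String[] result = new String[number];
        for (int i = 0; i < number; i++) {
            StringBuilder sb = new StringBuilder(2 * blocks);
            for (int bit = blocks - 1; bit >= 0; bit--) {
                // "Aa" and "BB" have the same hashCode, so either choice keeps the hash equal
                sb.append(((i >> bit) & 1) == 0 ? "Aa" : "BB");
            }
            result[i] = sb.toString();
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : getStringsInSameHashCode(3)) {
            System.out.println(s + "---" + s.hashCode());
        }
    }
}
```

All strings have the same length, so the shared hash value depends only on the block count, not on the index.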
look at String.hashCode()
public int hashCode() {
int h = hash;
if (h == 0) {
int off = offset;
char val[] = value;
int len = count;
for (int i = 0; i < len; i++) {
h = 31*h + val[off++];
}
hash = h;
}
return h;
}
I think finding an equal-hash string for a long string is too hard, but it is easy to find an equal-hash string for a short string (2 or 3 characters).
Look at the equation below. (Sorry, I can't post an image because I'm a new member.)
Notice that "FB" and "Ea" have the same hashcode, and any two strings like s1+"FB"+s2 and s1+"Ea"+s2 will have the same hashcode.
So the easy solution is to take any 2-char substring of the existing string and replace it with a 2-char substring that has the same hashcode.
Example: we have the string "helloworld".
Take the 2-char substring "he": hashcode("he") = 'h'*31 + 'e' = ('h'*31 + 31) + ('e' - 31) = ('h'+1)*31 + 'F' = 'i'*31 + 'F' = hashcode("iF")
So the desired string is "iFlloworld".
We have increased 'h' by 1; we can also increase it by 2, or 3, etc. (but the result will be wrong if the char value overflows).
The code below runs well for a small level; it goes wrong if the level is big enough to make the char value overflow. I will fix it later if you want (this code changes the first 2 chars, but I will edit it to change the last 2 chars, because the first 2 chars are multiplied by the largest powers).
public static String samehash(String s, int level) {
if (s.length() < 2)
return s;
String sub2 = s.substring(0, 2);
char c0 = sub2.charAt(0);
char c1 = sub2.charAt(1);
c0 = (char) (c0 + level);
c1 = (char) (c1 - 31 * level);
String newsub2 = new String(new char[] { c0, c1 });
String re = newsub2 + s.substring(2);
return re;
}
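One possible way to guard against the overflow mentioned above is to do the arithmetic in int and refuse levels that push either character outside the char range. This is a sketch of that guard, not the author's promised fix; the class name and the choice of returning null are assumptions.

```java
class HashPreservingEdit {
    // Returns a different string with the same hashCode as s,
    // or null if the adjusted characters would leave the valid char range.
    public static String sameHash(String s, int level) {
        if (s.length() < 2 || level == 0) {
            return null;
        }
        int c0 = s.charAt(0) + level;       // first char goes up by `level`
        int c1 = s.charAt(1) - 31 * level;  // second char compensates by 31 * level
        if (c0 < 0 || c0 > Character.MAX_VALUE || c1 < 0 || c1 > Character.MAX_VALUE) {
            return null;
        }
        return "" + (char) c0 + (char) c1 + s.substring(2);
    }

    public static void main(String[] args) {
        System.out.println(sameHash("helloworld", 1)); // iFlloworld
        System.out.println(sameHash("helloworld", 4)); // null ('e' - 124 is negative)
    }
}
```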
I was wondering if there was a "universal" solution; e.g. some constant string XYZ, such that
s.hashCode() == (s + XYZ).hashCode()
for any string s. Finding such a string involves solving a fairly complicated equation ... which was beyond my rusty mathematical ability. But then it dawned on me that h == 31*h + ch is always true when h and ch are both zero!
Based on that insight, the following method should create a different String with the same hashcode as its argument:
public String collider(String s) {
return "\0" + s;
}
If NUL characters are problematic for you, prepending any string whose hashcode is zero would work too ... albeit that the colliding strings would be longer than if you used zero.
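Since the trick can be repeated, prepending 0, 1, 2, ... NUL characters gives as many distinct strings with the same hashcode as needed. A minimal sketch with made-up names; whether NUL characters survive being passed on a command line is a separate question that this does not address.

```java
class NulPrefixCollider {
    // Returns `number` distinct strings (s with 0, 1, 2, ... leading NULs), all with s's hashCode.
    public static String[] colliders(String s, int number) {
        String[] result = new String[number];
        StringBuilder prefix = new StringBuilder();
        for (int i = 0; i < number; i++) {
            result[i] = prefix + s;  // i NUL characters followed by s
            prefix.append('\0');
        }
        return result;
    }

    public static void main(String[] args) {
        for (String c : colliders("hello", 3)) {
            System.out.println(c.length() + " chars, hash " + c.hashCode());
        }
    }
}
```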
Given a String X, the String Y = "\u0096\0\0ɪ\0ˬ" + X will share the same hashcode as X.
Explanation:
String.hashCode() returns an int, and int arithmetic in Java wraps around, so every int X satisfies X = X + 2 * (Integer.MAX_VALUE + 1). Here, Integer.MAX_VALUE = 2^31 - 1 (^ denotes exponentiation), so 2 * (Integer.MAX_VALUE + 1) = 2^32.
So we only need to find a String M whose hashcode, computed without overflow, is a multiple of 2^32, i.e. hashcode(M) % 2^32 = 0.
I found "\u0096\0\0ɪ\0ˬ": the char value of \u0096 is 150, of \0 is 0, of ɪ is 618, and of ˬ is 748, so its hashcode is 150 * 31^5 + 618 * 31^2 + 748 = 2^32, which wraps to 0.
It is up to you which string you use; I picked this one.
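The arithmetic can be checked directly. The sketch below spells the prefix with escapes only (\u026A is ɪ, \u02EC is ˬ) so it does not depend on the source file encoding; the class name is illustrative.

```java
class ZeroHashPrefix {
    // Char values 150, 0, 0, 618, 0, 748: 150*31^5 + 618*31^2 + 748 = 2^32, which wraps to 0.
    static final String PREFIX = "\u0096\0\0\u026A\0\u02EC";

    // Returns a longer string with the same hashCode as s.
    public static String collider(String s) {
        return PREFIX + s;
    }

    public static void main(String[] args) {
        System.out.println(PREFIX.hashCode());                                  // 0
        System.out.println(collider("hello").hashCode() == "hello".hashCode()); // true
    }
}
```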
You can instrument the java.lang.String class so its method hashCode() will always return the same number.
I suppose Javassist is the easiest way to do such an instrumentation.
In short:
obtain an instance of java.lang.instrument.Instrumentation by using a Java-agent (see package java.lang.instrument documentation for details)
redefine java.lang.String class by using Instrumentation.redefineClasses(ClassDefinition[]) method
The code will look like (roughly):
ClassPool classPool = new ClassPool(true);
CtClass stringClass = classPool.get("java.lang.String");
CtMethod hashCodeMethod = stringClass.getDeclaredMethod("hashCode", null);
hashCodeMethod.setBody("{return 0;}");
byte[] bytes = stringClass.toBytecode();
ClassDefinition[] classDefinitions = new ClassDefinition[] {new ClassDefinition(String.class, bytes)};
instrumentation.redefineClasses(classDefinitions);// this instrumentation can be obtained via Java-agent
Also don't forget that agent manifest file must specify Can-Redefine-Classes: true to be able to use redefineClasses(ClassDefinition[]) method.
String s = "Some String";
for (int i = 0; i < SOME_VERY_BIG_NUMBER; ++i) {
String copy = new String(s);
// Do something with copy.
}
Will this work for you? It just creates a lot of copies of the same String literal that you can then use in your testing.
