Length of a String with surrogate characters in it - java

I am having trouble counting the length of a String that contains surrogate characters.
My String is:
String val1 = "\u5B66\uD8F0\uDE30";
The problem is that \uD8F0\uDE30 is one character, not two, so the length of the String should be 2.
But when I calculate the length with val1.length(), it gives 3, which is not what I want. How can I fix this and get the actual length of the String?

You can use codePointCount(beginIndex, endIndex) to count the number of code points in your String instead of using length().
val1.codePointCount(0, val1.length())
See the following example,
String val1 = "\u5B66\uD8F0\uDE30";
System.out.println("character count: " + val1.length());
System.out.println("code points: "+ val1.codePointCount(0, val1.length()));
output
character count: 3
code points: 2
FYI, you cannot print an individual supplementary character from a String using charAt() either, because charAt() returns a single char, which is only one half of a surrogate pair.
To print individual supplementary characters from a String, use codePointAt and offsetByCodePoints(index, codePointOffset), like this:
for (int i = 0; i < val1.codePointCount(0, val1.length()); i++) {
    System.out.println("character at " + i + ": " + val1.codePointAt(val1.offsetByCodePoints(0, i)));
}
gives,
character at 0: 23398
character at 1: 311856
for Java 8
You can use val1.codePoints(), which returns an IntStream of all code points in the sequence.
Since you are interested in length of your String, use,
val1.codePoints().count();
to print code points,
val1.codePoints().forEach(a -> System.out.println(a));
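The calls above print numeric code point values. If the goal is to get each character back as text, one possible sketch is shown below. The class and method names (CodePointDemo, codePointLength, codePointAsString) are made up for illustration; only the String and Character methods are from the JDK.

```java
public class CodePointDemo {
    // Number of Unicode code points in s (a surrogate pair counts once).
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // Text of the code point at the given code point index (not char index).
    static String codePointAsString(String s, int codePointIndex) {
        int charIndex = s.offsetByCodePoints(0, codePointIndex);
        return new String(Character.toChars(s.codePointAt(charIndex)));
    }

    public static void main(String[] args) {
        String val1 = "\u5B66\uD8F0\uDE30";
        for (int i = 0; i < codePointLength(val1); i++) {
            String ch = codePointAsString(val1, i);
            // ch.length() is 1 for a BMP character and 2 for a supplementary one
            System.out.println("code point " + i + " uses " + ch.length() + " char(s)");
        }
    }
}
```

Each returned String holds exactly one code point, so it can be printed or compared as a whole character even when it is stored as two chars.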

Related

Hashmap in for loop not reading all the input

This is for AOC day 2. The input is something along the lines of
"6-7 z: dqzzzjbzz
13-16 j: jjjvjmjjkjjjjjjj
5-6 m: mmbmmlvmbmmgmmf
2-4 k: pkkl
16-17 k: kkkkkkkkkkkkkkkqf
10-16 s: mqpscpsszscsssrs
..."
It's formatted like 'min-max letter: password' and separated by line. I'm supposed to find how many passwords meet the minimum and maximum requirements. I put all of that input into a string variable and used Pattern.quote("\n") to separate the lines into a string array. This worked fine. Then I replaced everything except the numbers (the letters and the '-') using the pattern Pattern.compile("[^0-9]|-"), running it for every index in the array and using .trim() to cut off the whitespace at the start and end of each string. This is all working fine; I'm getting the desired output, like 6 7 and 13 16.
However, now I want to try and split this string into two. This is my code:
HashMap<Integer,Integer> numbers = new HashMap<Integer,Integer>();
for(int i = 0; i < inputArray.length; i++){
String [] xArray = x[i].split(Pattern.quote(" "));
int z = Integer.valueOf(xArray[0]);
int y = Integer.valueOf(xArray[1]);
System.out.println(z);
System.out.println(y);
numbers.put(z, y);
}
System.out.println(numbers);
So, first I make a HashMap which will store <min, max> values. Then the for loop (which runs 1000 times) splits every entry, such as 6 7 and 13 16, into two at the " ". The System.out.println(z); and System.out.println(y); calls are working as intended.
6
7
13
16
...
This output goes on to give me 2000 integers, one per line. That's exactly what I want. However, System.out.println(numbers); is outputting:
{1=3, 2=10, 3=4, 4=7, 5=6, 6=9, 7=12, 8=11, 9=10, 10=18, 11=16, 12=13, 13=18, 14=16, 15=18, 16=18, 17=18, 18=19, 19=20}
I have no idea where to even start with debugging this. I made a test file with an array that is formatted like "even, odd" integers all the way up to 100. Using this exact same code (I did change the variable names), I'm getting a better output. It's not exactly desired since it starts at 350=351 and then goes to like 11=15 and continues in a non-chronological order but at least it contains all the 100 keys and values.
Also, completely unrelated question but is my formatting of the for loop fine? The extra space at the beginning and the end of the code?
Edit: I want my expected output to be something like {6=7, 13=16, 5=6, 2=4, 16=17...}. Basically, the hashmap would have the minimum and maximum as the key and value and it'd be in chronological order.
The problem with your code is that you're trying to put in a nail with a saw. A hashmap is not the right tool to achieve what you want, since
Keys are unique. If you try to input the same key multiple times, the first input will be overwritten
The order of items in a HashMap is undefined.
A hashmap expresses a key-value-relationship, which does not exist in this context
A better data structure to save your passwords would probably just be an ArrayList<IntegerPair>, where you would have to define IntegerPair yourself, since Java doesn't have a built-in type that combines two other types.
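A minimal sketch of that idea follows. IntegerPair is the name suggested above; its fields, the parse helper, and the PairDemo class are assumptions made for illustration, and the input is assumed to be already cleaned to the "min max" form described in the question.

```java
import java.util.ArrayList;
import java.util.List;

public class PairDemo {
    // Hypothetical helper type: holds one min/max pair.
    static final class IntegerPair {
        final int min;
        final int max;

        IntegerPair(int min, int max) {
            this.min = min;
            this.max = max;
        }

        @Override
        public String toString() {
            return min + "=" + max;
        }
    }

    // Turns lines like "6 7" into pairs, keeping input order and duplicates.
    static List<IntegerPair> parse(String[] lines) {
        List<IntegerPair> pairs = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.trim().split(" ");
            pairs.add(new IntegerPair(Integer.parseInt(parts[0]), Integer.parseInt(parts[1])));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String[] cleaned = {"6 7", "13 16", "5 6", "6 9"};
        System.out.println(parse(cleaned)); // [6=7, 13=16, 5=6, 6=9]
    }
}
```

Unlike the HashMap, the list keeps both entries that start with 6 and preserves the order in which the lines were read.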
I think you are complicating the task unnecessarily. I would proceed as follows:
split the input using the line separator
for each line remove : and split using the spaces to get an array with length 3
build from the array in step two:
3.1. the min/max char count from array[0]
3.2. character classes for the letter and its negation
3.3. remove from the password all letters that do not correspond to the given one and check if the length of what remains is in range.
Something like:
public static void main(String[] args){
String input = "6-7 z: dqzzzjbzz\n" +
"13-16 j: jjjvjmjjkjjjjjjj\n" +
"5-6 m: mmbmmlvmbmmgmmf\n" +
"2-4 k: pkkl\n" +
"16-17 k: kkkkkkkkkkkkkkkqf\n" +
"10-16 s: mqpscpsszscsssrs\n";
int count = 0;
for(String line : input.split("\n")){
String[] temp = line.replace(":", "").split(" "); //[6-7, z, dqzzzjbzz]
String minMax = "{" + (temp[0].replace('-', ',')) + "}"; //{6,7}
String letter = "[" + temp[1] + "]"; //[z]
String letterNegate = "[^" + temp[1] + "]"; //[^z]
if(temp[2].replaceAll(letterNegate, "").matches(letter + minMax)){
count++;
}
}
System.out.println(count + " passwords are valid");
}

indexOf method asking for a char that appears multiple times?

String str = "Aardvark";
str.indexOf('a');
I was wondering what index str would return if it asked for a certain character and the string contained multiple of it. For example, aardvark: would the method return index 0, for the first instance it saw the char? There are 3 'a' chars in the word, so which would it return?
One additional question (couldn't fit it in the original question)
What is the difference between
str.indexOf('a');
and
str.indexOf("a");
I know the first is a char and the second is a String, but if str = "Aardvark", wouldn't the second statement return -1 or some sort of error, because "a" refers to a single-character String, not one char of a string?
I'm very sorry if this was unclear, I couldn't really think of a better way to pose my question. Thanks in advance!
indexOf() will return the index of the first occurrence of the string/char.
Like you say, one looks for a char and the other for a substring. "a" will be found, as "a" is a substring of "Aardvark".
It would return the first occurrence.
To get the second occurrence, you would have to use
indexOf(char c, int lookafterfirstindex);
indexOf can also take those two parameters instead of just the char.
Link to API Doc:
https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#indexOf(java.lang.String,%20int)
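As a rough illustration of the two-argument form, the sketch below walks from one match to the next. The class and helper names (IndexOfDemo, nthIndexOf) are invented for this example. Note that the search is case-sensitive, so in "Aardvark" the uppercase 'A' at index 0 does not match 'a'.

```java
public class IndexOfDemo {
    // Index of the n-th occurrence (1-based) of ch in s, or -1 if there are fewer than n.
    static int nthIndexOf(String s, char ch, int n) {
        int index = -1;
        for (int i = 0; i < n; i++) {
            // continue searching just after the previous match
            index = s.indexOf(ch, index + 1);
            if (index == -1) {
                return -1;
            }
        }
        return index;
    }

    public static void main(String[] args) {
        String str = "Aardvark";
        System.out.println(str.indexOf('a'));         // 1
        System.out.println(str.indexOf("a"));         // 1
        System.out.println(nthIndexOf(str, 'a', 2));  // 5
        System.out.println(nthIndexOf(str, 'a', 3));  // -1
        System.out.println(str.lastIndexOf('a'));     // 5
    }
}
```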
Here is a simple example:
String text = "abcd_a";
System.out.println("Index of a: "+ text.indexOf('a')); // Index of a: 0
System.out.println("Index of a: "+ text.indexOf("a")); // Index of a: 0
System.out.println("Index of b: "+ text.indexOf('b')); // Index of b: 1
System.out.println("Index of c: "+ text.indexOf('c')); // Index of c: 2
System.out.println("Index of z: "+ text.indexOf('z')); // Index of z: -1
simple index of:
indexOf(char/string) will always return the first index of the occurrence.
from index:
There is also indexOf(char/string, int fromIndex) - which will search from a given position in your string.
last index:
There is a lastIndexOf(char/string) - which will search last occurrence.
Regarding char vs String: I would use the char overload if I only need to look up a single character. It avoids the substring-matching work of the String overload, although the difference is likely to be small in practice.
Java String Spec

Find the letter that occur most times from user with using tables [duplicate]

I got a task from my university today:
Write a program that reads a (short) text from the user and prints the so-called max letter, that is, the letter with the greatest number of occurrences in the given text.
It is enough to look at English letters (A-Z), and uppercase and lowercase letters should not be distinguished when counting occurrences.
For example, if text = " Ada bada ", the program should print the most common character, which in this example is a.
This is an introductory course, so in this submission we do not need to use the Scanner class. We have not covered it much yet.
The program will use the input dialog (showInputDialog) to get the text from the user.
Info: The program shall not use a while (true/false) loop, a "return" statement, or a "break" statement.
I've been struggling with how I can get char values into a table.. am I correct I need to use array to search for most common character? I think I need to use the binarySearch, but that only supports int not char.
I'd be happy with any answers, hints, or solutions. If you're very kind, a full working program would be great, but again, please don't use the things I have listed in the "Info" section above.
My code:
String text = showInputDialog("Write a short text: ");
//format string to char
String a = text;
char c = a.charAt(4);
/* With this layout it picks out character number 4 in the text and prints it.
 * I could always go with many char c... but that wouldn't be clean program
 * code. I think I need to make it into a for-loop. I have only worked with
 * for-loops with numbers, not char (letters). Help? :)
 */
out.print( text + "\n" + c)
//each letter into 1 char, into table
//search for most used letter
Here's the common logic:
split your string into chars
loop over the chars
store the occurrences in a hash, with the letter as key and its number of occurrences as value
return the letter with the highest value in the hash
As for how to split a string into chars, etc., you can use Google. :)
Here's a similar question.
A common school exercise is to calculate the frequency of a letter in a given String. The only thing you have to do here is find which letter has the maximum frequency. Here's some code that illustrates it:
String s = text; // value entered by the user
char max_alpha=' '; int max_freq=0, ct=0;
char c;
for(int i=0;i<s.length();i++){
c=s.charAt(i);
if((c>='a'&&c<='z')||(c>='A'&&c<='Z')){
for(int j=0;j<s.length();j++){
if(s.charAt(j)==c)
ct++;
} //for j
}
if(ct>max_freq){
max_freq=ct;
max_alpha=c;
}
ct=0;
s=s.replace(c,'*');
}
System.out.println("Letter appearing maximum times is "+max_alpha);
System.out.println(max_alpha+" appears "+max_freq+" times");
NOTE: This program presumes that all characters in the string are in the same case, i.e., uppercase or lowercase. You can convert the string to a particular case just after getting the input.
I guess this is not a good assignment if you are unsure how to start. I wish you had better teachers!
So you have a text, as:
String text = showInputDialog("Write a short text: ");
The next thing is to have a loop which goes through each letter of this text, and gets each char of it:
for (int i=0;i<text.length();i++) {
char c=text.charAt(i);
}
Then comes the calculation. The easiest thing is to use a HashMap. I am unsure if this is a good topic for a beginners' course, so I guess a more beginner-friendly solution would be a better fit.
Make an array of integers - this is the "table" you are referring to.
Each item in the array will correspond to the occurrences of one letter, e.g. histogram[0] will count how many "A", histogram[1] will count how many "B" you have found.
int[] histogram = new int[26]; // assume English alphabet only
for (int i=0;i<histogram.length;i++) {
histogram[i]=0;
}
for (int i=0;i<text.length();i++) {
char c=Character.toUpperCase(text.charAt(i));
if ((c>=65) && (c<=90)) {
// it is a letter, histogram[0] contains occurrences of "A", etc.
histogram[c-65]=histogram[c-65]+1;
}
}
Then finally find the biggest occurrence with a for loop...
int candidate=0;
int max=0;
for (int i=0;i<histogram.length;i++) {
if (histogram[i]>max) {
// this has higher occurrence than our previous candidate
max=histogram[i];
candidate=i; // this is the index of char, i.e. 0 if A has the max occurrence
}
}
And print the result:
System.out.println(Character.toString((char)(candidate+65)));
Note how messy this all comes as we use ASCII codes, and only letters... Not to mention that this solution does not work at all for non-English texts.
If you have the power of generics and hashmaps, and know some more string functions, this mess can be simplified as:
// requires: import java.util.HashMap; import java.util.Map;
String text = showInputDialog("Write a short text: ");
Map<Character,Integer> histogram=new HashMap<Character,Integer>();
for (int i=0;i<text.length();i++) {
    char c=Character.toUpperCase(text.charAt(i));
    if (histogram.containsKey(c)) {
        // we know this letter, increment its occurrence
        int occurrence=histogram.get(c);
        histogram.put(c,occurrence+1);
    }
    else {
        // we dunno this letter yet, it is the first occurrence
        histogram.put(c,1);
    }
}
char candidate=' ';
int max=0;
for (Character c:histogram.keySet()) {
    if (histogram.get(c)>max) {
        // this has higher occurrence than our previous candidate
        max=histogram.get(c);
        candidate=c; // this is the char itself
    }
}
System.out.println(candidate);
small print: i didn't run this code but it shall be ok.

How to place spaces in between input textfield

I am trying to place spaces in between a number that has been entered in a textfield. I am using the following code:
for(int i = 0; i <= 2; i++)
{
char cijfer = tf1.getText().charAt(i);
char getal1 = tf1.getText().charAt(0);
char getal2 = tf1.getText().charAt(1);
char getal3 = tf1.getText().charAt(2);
}
String uitvoerGetal = getal1 + " " + getal2 + " " + getal3;
I suppose I don't understand the charAt() function yet, does anyone have a link explaining it in a way so I might be able to make this work too? Thanks in advance!
Example:
public class Test {
public static void main(String args[]) {
String s = "Strings are immutable";
char result = s.charAt(8);
System.out.println(result);
}
}
This produces the following result:
a
In more detail, from the Java docs:
public char charAt(int index)
Returns the char value at the specified index. An index ranges from 0 to length() - 1. The first char value of the sequence is at index 0, the next at index 1, and so on, as for array indexing.
If the char value specified by the index is a surrogate, the surrogate value is returned.
Specified by:
charAt in interface CharSequence
Parameters:
index - the index of the char value.
Returns:
the char value at the specified index of this string. The first char value is at index 0.
Throws:
IndexOutOfBoundsException - if the index argument is negative or not less than the length of this string.
Put simply, you can't. You can't add a space to an int, because int is meant to store an integer value only. Change the int to a String to store the spaces in between.
Okay, let's see what's wrong with your code...
Your for-loop is 1-based instead of the standard 0-based. That's not good at all.
You're attempting to assign a char to a String (3 times), the first call to charAt is correct, but for some reason you then switch to using a String?
Finally you're attempting to assign a String to an int, which is just completely nonsensical.
You have a number of problems, but well done on an honest attempt.
First up, the indexes in a string are zero-based, so charAt(0) gives you the first character, charAt(1) gives you the second character, and so on.
Secondly, repeating all your calls to charAt three times is probably unnecessary.
Thirdly, you must be careful with your types. The return value from charAt is a char, not a String, so you can't assign it to a String variable. Likewise, on the last line, don't assign a String to an int variable.
Lastly, I don't think you've thought about what happens if the text field doesn't contain enough characters.
Bearing these points in mind, please try again, and ask for further help if you need it.
Try following code
String text = tf1.getText(); // get string from jtextfield
StringBuilder finalString = new StringBuilder();
for(int index = 0; index <text.length(); index++){
finalString.append(text.charAt(index) + " "); // add spaces
}
tf1.setText(finalString.toString().trim()); // set string to jtextfield

Shortening a string in Java

I have a requirement to shorten a 6 character string like "ABC123" into a unique 4 character string. It has to be repeatable so that the input string will always generate the same output string. Does anyone have any ideas how to do this?
It is not possible to do a fully unique mapping from a 6 character string to a 4 character string. This is an example of a simple hash function. Because the range space is smaller than the domain space, you are necessarily going to have some hash collisions. You can try to minimize the number of collisions based on the type of data you're going to be accepting, but ultimately it's impossible to map every 6 character string to a unique 4 character string, you would run out of 4 character strings.
You need some restrictions on the input string, otherwise math will inevitably bite you.
For example, let's assume you know that it consists of upper case letters and digits only. Therefore, there are 36^6 possible input strings.
The result needs to have fewer restrictions, e.g. you allow 216 different characters (printable extended ASCII or something like that).
By pure coincidence, 216^4 = 36^6, so what you need is a mapping. That's easy, just use the algorithm for converting number representations from one radix to another.
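A rough sketch of such a radix conversion is shown below, under stated assumptions: the input uses only 0-9 and A-Z, and the 216 output symbols are arbitrarily taken to be the characters U+0100 to U+01D7. That output alphabet, and the names RadixPack, shorten, and restore, are choices made for this example, not something the answer specifies.

```java
public class RadixPack {
    // Assumed input alphabet: 36 symbols, so 36^6 possible inputs.
    static final String INPUT_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static final int OUT_RADIX = 216;       // 216^4 == 36^6
    static final char OUT_BASE = '\u0100';  // assumption: output symbols are U+0100..U+01D7

    // Reads the input as a base-36 number and rewrites it as 4 base-216 digits.
    static String shorten(String input) {
        if (input.length() != 6) {
            throw new IllegalArgumentException("expected 6 characters");
        }
        long value = 0;
        for (int i = 0; i < 6; i++) {
            int digit = INPUT_ALPHABET.indexOf(input.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("unsupported character: " + input.charAt(i));
            }
            value = value * 36 + digit;
        }
        char[] out = new char[4];
        for (int i = 3; i >= 0; i--) {
            out[i] = (char) (OUT_BASE + value % OUT_RADIX);
            value /= OUT_RADIX;
        }
        return new String(out);
    }

    // Inverse of shorten: base-216 digits back to the 6 character base-36 string.
    static String restore(String packed) {
        long value = 0;
        for (int i = 0; i < 4; i++) {
            value = value * OUT_RADIX + (packed.charAt(i) - OUT_BASE);
        }
        char[] out = new char[6];
        for (int i = 5; i >= 0; i--) {
            out[i] = INPUT_ALPHABET.charAt((int) (value % 36));
            value /= 36;
        }
        return new String(out);
    }

    public static void main(String[] args) {
        String packed = shorten("ABC123");
        System.out.println(packed.length() + " chars, restores to " + restore(packed));
    }
}
```

Because both sides have exactly the same number of possible values, the mapping is one-to-one and reversible, but only as long as the input really stays within the assumed 36 symbols.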
Not sure this can be done, as I would bet there are some business constraints (like a user has to be able to type in the key).
The idea is to "hash" down the value into a smaller number of places. This requires a character set large enough to handle all combinations.
Let's assume the original key is case insensitive: you have 26 + 10 = 36 characters, and 36 raised to the 6th power gives 2,176,782,336 unique combinations. To match this in only 4 characters, you have to use a character set with 216 unique characters, as 216^4 is 2,176,782,336, which is the smallest number whose 4th power covers all the combinations of a case insensitive key with numbers. (Case sensitivity plus numerics only takes you to 62.)
If we take the standard US keyboard, we have 26 letters x 2 cases = 52
10 numbers
10 special characters on number keys
11 other special character keys * 2 = 22
This is 94 unique characters, or less than half the uniques you need just to get a case insensitive 6 digit code into 4 digits. Now, on the Planet Klingon, where keyboards are much larger ... ;-)
If the key is case sensitive, your character set has to expand to 489 unique characters to fit in a 4 digit "hash". Ouch!
Assumption: The input string can only have characters with ASCII decimal values below 128... otherwise, as others have stated, this won't work.
public class Foo {
public static int crunch(String str) {
int y = 0;
int limit = str.length() > 6 ? 6 : str.length();
for (int i = 0; i < limit; ++i) {
y += str.charAt(i) * (limit - i);
}
return y;
}
public static void main(String[] args) {
String[] words = new String[]{
"abcdef", "acdefb", "fedcba", "}}}}}}", "ZZZZZZ", "123", "!"
};
for (int idx = 0; idx < words.length; ++idx) {
System.out.printf("in=%-6s out=%04d\n",
words[idx], crunch(words[idx]));
}
}
}
Generates:
in=abcdef out=2072
in=acdefb out=2082
in=fedcba out=2107
in=}}}}}} out=2625
in=ZZZZZZ out=1890
in=123    out=0298
in=!      out=0033
You have to make assumptions about the range of values the characters can have and what is an acceptable encoded character. There are any number of ways you can do this. You could pack the String to 1, 2, 3, 4 or 5 characters depending on your assumptions.
One simple example which would give you 4 characters is to assume the last three letters are a number.
public static String pack(String text) {
return text.substring(0, 3) + (char) Integer.parseInt(text.substring(3));
}
public static String unpack(String text) {
return text.substring(0, 3) + ("" + (1000 + text.charAt(3))).substring(1);
}
public static void main(String[] args) throws IOException {
String text = "ABC123";
String packed = pack(text);
System.out.println("packed length= " + packed.length());
String unpacked = unpack(packed);
System.out.println("unpacked= '" + unpacked + '\'');
}
prints
packed length= 4
unpacked= 'ABC123'
