Java String Mutability - java.lang.NoSuchFieldException: offset - java

I'm new to Java and I saw a Q&A section here with two examples where String immutability is circumvented. Upon testing MutableString.java:
import java.lang.reflect.Field;
public class MutableString {
    public static void main(String[] args) {
        String s = "Immutable";
        String t = "Notreally";
        mutate(s, t);
        StdOut.println(t);
        // strings are interned so this doesn't even print "Immutable" (!)
        StdOut.println("Immutable");
    }
    // change the first min(|s|, |t|) characters of s to t
    public static void mutate(String s, String t) {
        try {
            Field val = String.class.getDeclaredField("value");
            Field off = String.class.getDeclaredField("offset");
            val.setAccessible(true);
            off.setAccessible(true);
            int offset = off.getInt(s);
            char[] value = (char[]) val.get(s);
            for (int i = 0; i < Math.min(s.length(), t.length()); i++)
                value[offset + i] = t.charAt(i);
        }
        catch (Exception e) { e.printStackTrace(); }
    }
}
I received the following error:
java.lang.NoSuchFieldException: offset
Any input on the following would be greatly appreciated:
a) why do I get this exception
b) how do I check which fields exist in a class (Java strings specifically)

Disclaimer: these kinds of hacks are interesting lessons in learning and fun trivia. But they are definitely not something that you want to use in any production code. It will lead to pain.
By their very nature, such a hack always depends on implementation details of the classes that are hacked.
In your case you seem to be using a String implementation that doesn't have an offset field, but uses some other mechanism (or maybe just a different name!).
For example, I've just reviewed the Oracle Java 7 String class and it no longer has the offset field (which was used in Java 6 and earlier to share the char[] among substrings)!*
You can use Class.getDeclaredFields() to check which fields this implementation does define:
for (Field f : String.class.getDeclaredFields()) {
    System.out.println(f);
}
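To answer (b) for any class, not just String, the same call can be wrapped in a small helper. This is only a minimal sketch: the class name FieldLister, the method describeFields and the nested Sample class are made up for illustration. It reads field metadata only, so it does not need setAccessible.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the fields a class declares itself (inherited fields are not included).
final class FieldLister {
    // Hypothetical sample class with known fields, used only for the demo.
    static final class Sample {
        private int count;
        public static final String NAME = "sample";
    }

    // Returns one "modifiers type name" line per declared field.
    static List<String> describeFields(Class<?> type) {
        List<String> lines = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            String modifiers = Modifier.toString(f.getModifiers());
            lines.add((modifiers + " " + f.getType().getSimpleName() + " " + f.getName()).trim());
        }
        return lines;
    }

    public static void main(String[] args) {
        // The output for String depends on the JDK in use; that is the point of checking.
        for (String line : describeFields(String.class)) {
            System.out.println(line);
        }
    }
}
```

Running it against String.class on your own JDK shows at a glance whether a field such as offset exists there.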
For a version of that hack that works with Java 7, you could do this:
public static void mutate(String s, String t) {
    try {
        Field val = String.class.getDeclaredField("value");
        val.setAccessible(true);
        char[] value = (char[]) val.get(s);
        for (int i = 0; i < Math.min(s.length(), t.length()); i++)
            value[i] = t.charAt(i);
    }
    catch (Exception e) { e.printStackTrace(); }
}
Of course, this too will break if the internals of String change again.
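One way to make such a hack fail more gracefully is to inspect the layout before touching it. The sketch below (class and method names are hypothetical) only reads metadata, which needs no setAccessible call. As far as I know, from Java 9 onward the value field is a byte[] rather than a char[], and recent JDKs reject setAccessible on String's private fields unless the JVM is started with --add-opens java.base/java.lang=ALL-UNNAMED, so the char[] cast above should not be expected to work there.

```java
import java.lang.reflect.Field;

// Hypothetical helper for checking String's internal layout on the running JDK.
final class StringInternals {
    // True if the class itself declares a field with the given name.
    static boolean hasField(Class<?> type, String name) {
        for (Field f : type.getDeclaredFields()) {
            if (f.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    // Simple type name of String's "value" field, or "absent" if there is none.
    static String valueFieldType() {
        try {
            return String.class.getDeclaredField("value").getType().getSimpleName();
        } catch (NoSuchFieldException e) {
            return "absent";
        }
    }

    public static void main(String[] args) {
        System.out.println("has offset field: " + hasField(String.class, "offset"));
        System.out.println("value field type: " + valueFieldType());
    }
}
```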
* Here's an email that talks about that change; it seems that sharing the char[] only led to improved performance in a few special cases.

Related

Get variable value of string in java [duplicate]

This question already has answers here:
Get variable by name from a String
(6 answers)
Closed 4 years ago.
I would like a user to be able to tell my code that it should do something when a certain variable has a certain value. I have written a simple code sample of what I would like this to look like, and I hope you can make sense of it. Is it in any way possible to create a String and have Java check whether the variable with that name is equal to the value of another variable?
int turn = 1;
String variable = "turn";
int compareToThisValue = 1;
// toVariable() does not exist; it only shows the behaviour I am looking for
if (variable.toVariable() == compareToThisValue) {
    System.out.println("Yes it works thank you guys!");
}
I guess the following code can help. It uses Java reflection to get the job done. If you have other requirements, it can be tweaked accordingly.
import java.lang.reflect.*;
class Test {
    int turn = 1;
    boolean checkValueVariable(String variableName, int value) throws Exception {
        Field[] fields = this.getClass().getDeclaredFields();
        for (Field field : fields) {
            if (field.getName().equals(variableName))
                return field.getInt(this) == value;
        }
        return false;
    }
    public static void main(String... args) {
        Test test = new Test();
        String variableName = "turn";
        int variableValue = 1;
        try {
            System.out.println(test.checkValueVariable(variableName, variableValue));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
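If the set of names is under your control, a different technique avoids reflection altogether: keep the values in a Map keyed by name. This is a sketch of that alternative, not the approach from the answer above; the class NamedValues and its methods are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical alternative to reflection: store "variables" in a map keyed by name.
final class NamedValues {
    private final Map<String, Integer> values = new HashMap<>();

    void set(String name, int value) {
        values.put(name, value);
    }

    // True only if a value is stored under that name and equals the expected one.
    boolean hasValue(String name, int expected) {
        Integer current = values.get(name);
        return current != null && current == expected;
    }

    public static void main(String[] args) {
        NamedValues vars = new NamedValues();
        vars.set("turn", 1);
        System.out.println(vars.hasValue("turn", 1));
        System.out.println(vars.hasValue("score", 1));
    }
}
```

Unlike the reflective version, an unknown name simply yields false, and only int values that were explicitly registered can be queried.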

How to get the base pointer of Unicode character?

Currently I have "codePointAt" which returns the code point of the character from the string.
Is there any API or other way to get the base pointer of the current character?
public class Testclass {
    public static void main(String[] args) {
        String unicodeString = "कागज़";
        int currentPoint = unicodeString.codePointAt(0);
        // Now currentPoint = 0x0915
        // I need currentPoint = 0x0900
    }
}
Note: I cannot compute the base pointer by simple addition/subtraction, because the base points of different languages start at different ones/tens place values. For example:
Armenian - 0530-058F - Base pointer 0x0530(ten's place value)
Devanagari - 0900-097F - Base pointer 0x0900(hundred's place value)
Currently I'm using if-else blocks to get the base pointer, which is neither dynamic nor concise. :-(
int basePointer = -1; // -1 means no known block
if (currentPoint >= 0x0600 && currentPoint <= 0x06FF) // Arabic
{
    basePointer = 0x0600;
}
else if (currentPoint >= 0x0900 && currentPoint <= 0x097F) // Devanagari
{
    basePointer = 0x0900;
}
OK, after thinking about this for a bit, here is a way to do it just using the Java API. It consists of three parts:
Regenerating the inaccessible block base table blockStarts in Character.UnicodeBlock into a Map
Using Character.UnicodeBlock.of(int) to look up the block name given the codepoint
Using the Map to lookup the Unicode block base given the block name
Note that regenerating the block base table is relatively slow at approx 10-15 ms on my machine, so it would probably be best to generate this once and reuse. I've left the rudimentary timing code in place.
// Requires: import java.util.HashMap; import java.util.Map;
private static final int SUPPLEMENTARY_PRIVATE_USE_AREA_A_BASE = 0x0F0000;
private static final int SUPPLEMENTARY_PRIVATE_USE_AREA_B_BASE = 0x100000;
private static final Character.UnicodeBlock SUPPLEMENTARY_PRIVATE_USE_AREA_A =
        Character.UnicodeBlock.of(SUPPLEMENTARY_PRIVATE_USE_AREA_A_BASE);
private static final Character.UnicodeBlock SUPPLEMENTARY_PRIVATE_USE_AREA_B =
        Character.UnicodeBlock.of(SUPPLEMENTARY_PRIVATE_USE_AREA_B_BASE);
public static Map<Character.UnicodeBlock, Integer> makeUnicodeBlockBaseMap() {
    long startNanos = System.nanoTime();
    Map<Character.UnicodeBlock, Integer> unicodeBases = new HashMap<>();
    // Unicode blocks start on multiples of 16 (0x10) code points.
    for (int cp = 0x00000; cp < SUPPLEMENTARY_PRIVATE_USE_AREA_A_BASE; cp += 0x10) {
        Character.UnicodeBlock ucb = Character.UnicodeBlock.of(cp);
        if (ucb != null) {
            // Keep only the first (lowest) code point seen for each block.
            unicodeBases.putIfAbsent(ucb, cp);
        }
    }
    // These blocks are huge, so add them manually.
    unicodeBases.put(SUPPLEMENTARY_PRIVATE_USE_AREA_A, SUPPLEMENTARY_PRIVATE_USE_AREA_A_BASE);
    unicodeBases.put(SUPPLEMENTARY_PRIVATE_USE_AREA_B, SUPPLEMENTARY_PRIVATE_USE_AREA_B_BASE);
    long endNanos = System.nanoTime();
    System.out.format("Total time = %.3f s%n", (endNanos - startNanos) / 1e9);
    return unicodeBases;
}
public static void main(String[] args) {
    Map<Character.UnicodeBlock, Integer> unicodeBlockBases = makeUnicodeBlockBaseMap();
    String unicodeString = "कागज़";
    int currentPoint = unicodeString.codePointAt(0);
    Character.UnicodeBlock ucb = Character.UnicodeBlock.of(currentPoint);
    System.out.println(ucb); // DEVANAGARI
    System.out.format("0x%04X%n", unicodeBlockBases.get(ucb)); // 0x0900
}
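If only a few lookups are needed, the table can be skipped entirely. The sketch below is an alternative that I am assuming is acceptable for occasional use: it walks downward from the code point until Character.UnicodeBlock.of reports a different block. The class and method names are hypothetical, and the walk is slow for very large blocks such as the private use areas.

```java
// Hypothetical helper: finds the first code point of the Unicode block containing a code point.
final class BlockBaseFinder {
    // Returns the block's first code point, or -1 if the code point belongs to no block.
    static int blockBase(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            return -1;
        }
        Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
        if (block == null) {
            return -1;
        }
        int base = codePoint;
        // Step down while the previous code point is still in the same block.
        while (base > 0 && Character.UnicodeBlock.of(base - 1) == block) {
            base--;
        }
        return base;
    }

    public static void main(String[] args) {
        int cp = "\u0915".codePointAt(0); // DEVANAGARI LETTER KA
        System.out.format("0x%04X%n", blockBase(cp));
    }
}
```

For repeated lookups, the precomputed map from the answer above remains the better fit, since each call here costs up to one block length of UnicodeBlock.of calls.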
You can put the start/end positions into SortedMaps for each language and check the code points:
private static final SortedMap<Integer, Integer> startToBase = new TreeMap<>();
private static final SortedMap<Integer, Integer> endToBase = new TreeMap<>();
static {
    // Fill the SortedMaps:
    // latin
    startToBase.put(0, 0);
    endToBase.put(0x00ff, 0);
    // ...
}
// Or load this from a web service, table or anything you find comfortable
public static final int baseCodePoint(int codePoint) {
    // Base of the greatest range start that is <= codePoint
    int baseFromStart = startToBase.get(startToBase.headMap(codePoint + 1).lastKey());
    // Base of the smallest range end that is >= codePoint
    int baseFromEnd = endToBase.get(endToBase.tailMap(codePoint).firstKey());
    if (baseFromStart == baseFromEnd) {
        return baseFromStart;
    }
    throw new IllegalArgumentException(codePoint + " is unknown.");
}
This is what I have done, thanks to Gábor Bakos for the inspiration:
TreeMap<Integer, Integer> languageCodePoints = new TreeMap<>();
languageCodePoints.put(0x0020, 0x007E);
languageCodePoints.put(0x00A0, 0x00FF);
languageCodePoints.put(0x0100, 0x017F);
languageCodePoints.put(0x0900, 0x097F); // Devanagari
// And so on for all other languages; I referred to ISO/IEC 10646:2010
// for the code points of the languages present
In the function I only used this:
String unicodeString = "कागज़";
int currentPoint = unicodeString.codePointAt(0);
int startCodePoint = languageCodePoints.floorKey(currentPoint);
Now startCodePoint is 0x900, which is what I needed. I think this is a pretty simple way. :-P
The one drawback is that I have to maintain the languageCodePoints TreeMap for new language entries, but that is far better than switch/if-else.
Thanks to all for such kind support. :-)
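One caveat with floorKey alone: the end values stored in the map are never consulted, so a code point that falls in a gap between ranges is attributed to the preceding range, and a code point below the first key makes floorKey return null. A minimal sketch that also checks the end of the range, with hypothetical class and method names:

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical lookup table: range start -> range end (both inclusive).
final class RangeBaseLookup {
    private final TreeMap<Integer, Integer> ranges = new TreeMap<>();

    void addRange(int start, int end) {
        ranges.put(start, end);
    }

    // Returns the start of the range containing codePoint, or -1 if no range contains it.
    int baseOf(int codePoint) {
        Map.Entry<Integer, Integer> entry = ranges.floorEntry(codePoint);
        if (entry == null || codePoint > entry.getValue()) {
            return -1;
        }
        return entry.getKey();
    }

    public static void main(String[] args) {
        RangeBaseLookup lookup = new RangeBaseLookup();
        lookup.addRange(0x0020, 0x007E);
        lookup.addRange(0x0900, 0x097F); // Devanagari
        System.out.format("0x%04X%n", lookup.baseOf(0x0915));
    }
}
```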
You can use bit manipulation to find the base pointers, something like this:
switch (codePoint & 0xffffff00) {
    case 0x0600: // Arabic
    case 0x0900: // Devanagari, though you might need to check it is at most 0x097F
    case 0x0000: // Latin
    default: // Something else
}
Ah, sorry, I think Armenian requires further processing, but hopefully the general idea is applicable to most languages.
public static int baseCodePoint(int codePoint) {
    switch (codePoint & 0xffffff00) {
        // Blocks that do not start on or fill a 0x100 range. A code point that fails
        // its range check falls through the remaining cases to default.
        case 0x0900: if (codePoint < 0x0980) return 0x0900;
        case 0x0500: if (codePoint >= 0x0530 && codePoint <= 0x058F) return 0x0530;
        // case ...: other bases where it is not the real base
        // Handling regular base pointers
        default: return codePoint & 0xffffff00;
    }
}

Is this data abstraction example right or not?

I gave the following code in an interview. I want to know whether it is right or not.
public class DataAbstraction
{
    public static void main (String args[])
    {
        MyDetails obj = new MyDetails();
        obj.setNumebr(10);
        obj.incrementBy(20);
        int num = obj.getMumber();
        System.out.println(num);
    }
}
class MyDetails
{
    private int n;
    public void setNumebr(int i)
    {
        n = i;
    }
    public void incrementBy(int i)
    {
        n = n + i;
    }
    public int getMumber()
    {
        return n;
    }
}
So please check it and correct me if I was wrong.
There are many forms of abstractions in software. I would say that this is an example of data abstraction (though I would usually call it encapsulation). You could, if you would like to, change the member variable n to be of type... String(!), without changing the public interface of MyDetails.
Put differently: The details in the MyDetails class are hidden from the client code. The fact that MyDetails stores an int is abstracted away and it could be changed, for instance like this:
class MyDetails
{
    private String n; // changed internal detail
    public void setNumebr(int i)
    {
        n = "" + i;
    }
    public void incrementBy(int i)
    {
        // parenthesize so the ints are added before being converted to a String
        n = "" + (getMumber() + i);
    }
    public int getMumber()
    {
        return Integer.parseInt(n);
    }
}
Have a look at the Wikipedia article on data abstraction for further details:
Abstraction > Data abstraction
Since there aren't enough details in the question, it's guessing time again:
1) No, it's wrong. It contains various spelling errors like "getMumber" and "setNumebr".
2) Yes, if we ignore the spelling errors, the methods seem to do what one would expect from their names.
3) No, it doesn't launch the rocket and it doesn't scale to multiprocessor machines (assuming these were the requirements).

How to convert a string to float and avoid using try/catch in Java?

There are some situations in which I need to convert a string to float or some other numerical data type, but there is a chance of getting nonconvertible values such as "-" or "/", and I can't verify all the values beforehand to remove them.
I want to avoid using try/catch for this. Is there any other way of doing a proper conversion in Java, something similar to C#'s TryParse?
The simplest thing I can think of is java.util.Scanner. However, this approach requires a new Scanner instance for each String.
String data = ...;
Scanner n = new Scanner(data);
if (n.hasNextInt()) { // check whether the next token is an int
    int i = n.nextInt();
} else {
    // not an int
}
Next, you could write a regex pattern to check the String (it gets complex if it must also reject values that are too big) and then call Integer.parseInt() after checking the string against it.
Pattern p = Pattern.compile("insert regex to test string here");
String data = ...;
Matcher m = p.matcher(data);
// warning: depending on the regex used, this may
// only check part of the string
if (m.matches()) {
    int i = Integer.parseInt(data);
}
However both of these parse the string twice, once to test the string and a second time to get the value. Depending on how often you get invalid strings catching an exception may be faster.
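A single-pass option from the JDK that reports failure without throwing is NumberFormat.parse with a ParsePosition, which returns null when nothing could be parsed. The sketch below assumes Locale.ROOT formatting ('.' as the decimal separator) is what the input uses; the class and method names are hypothetical, and the accepted syntax is not identical to Float.parseFloat, so treat it as an approximation rather than a drop-in replacement.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper: parses a float without throwing on bad input.
final class LenientFloatParser {
    // Returns the parsed value, or null if the whole string is not a number.
    static Float tryParse(String text) {
        if (text == null || text.isEmpty()) {
            return null;
        }
        // NumberFormat instances are not thread-safe, so create one per call.
        NumberFormat format = NumberFormat.getInstance(Locale.ROOT);
        format.setGroupingUsed(false); // do not accept "1,000"
        ParsePosition position = new ParsePosition(0);
        Number parsed = format.parse(text, position);
        // parse() stops at the first character it cannot use; require that it consumed everything.
        if (parsed == null || position.getIndex() != text.length()) {
            return null;
        }
        return parsed.floatValue();
    }

    public static void main(String[] args) {
        System.out.println(tryParse("10.5"));
        System.out.println(tryParse("-"));
        System.out.println(tryParse("/"));
    }
}
```

The check on position.getIndex() matters: without it, input such as "12abc" would be accepted as 12.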
Unfortunately, there is no such method in Java. Java has no out parameters, so such a method would need either to return a null Float to signal an error or to take a FloatHolder object that the method can modify:
public class FloatHolder {
    private float value;
    public void setValue(float value) {
        this.value = value;
    }
    public float getValue() {
        return this.value;
    }
}
public static boolean tryParseFloat(String s, FloatHolder holder) {
    try {
        float value = Float.parseFloat(s);
        holder.setValue(value);
        return true;
    }
    catch (NumberFormatException e) {
        return false;
    }
}
This is an old question, but since none of the answers mention this (and I wasn't aware of it myself until seeing it in a merge request written by a colleague), I want to point potential readers to the Guava Floats and Ints classes.
With the help of these classes, you can write code like this:
Integer i = Ints.tryParse("10");
Integer j = Ints.tryParse("invalid");
Float f = Floats.tryParse("10.1");
Float g = Floats.tryParse("invalid.value");
The result will be null if the value is an invalid int or float, and you can then handle it in any way you like. (Be careful not to unbox it straight into an int/float, since this will trigger a NullPointerException if the value is an invalid integer/floating-point value.)
Note that these methods are marked as "beta", but they are quite useful anyway and we use them in production.
For reference, here are the Javadocs for these classes:
https://google.github.io/guava/releases/snapshot-jre/api/docs/com/google/common/primitives/Ints.html
https://google.github.io/guava/releases/snapshot-jre/api/docs/com/google/common/primitives/Floats.html
Java does not provide built-in tryParse-style methods. One solution is to create your own tryParse method, put the try/catch code inside it, and then use this method across your application without needing try/catch at every call site.
A sample function could look like this:
public static Long parseLong(String value) {
    if (value == null || value.isEmpty()) {
        return null;
    }
    try {
        return Long.valueOf(value);
    }
    catch (NumberFormatException e) {
        // fall through and return null for unparseable input
    }
    return null;
}
Regular expressions helped me solve this issue. Here is how:
Get the string input.
Use the expression that matches one or more digits.
Parse if it is a match.
String s = "1111";
int i = s.matches("^[0-9]+$") ? Integer.parseInt(s) : -1;
if (i != -1)
    System.out.println("Integer");
else
    System.out.println("Not an integer");
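Note that a pattern of one or more digits accepts arbitrarily long digit strings, so Integer.parseInt can still throw for input such as "99999999999". A sketch of one way around that, assuming it is acceptable to reject the few ten-digit values that would still fit in an int: limit the match to nine digits, and return an OptionalInt instead of a sentinel. The class and method names are hypothetical.

```java
import java.util.OptionalInt;
import java.util.regex.Pattern;

// Hypothetical helper: regex-checked int parsing that cannot overflow.
final class SafeIntParser {
    // An optional minus sign followed by 1 to 9 digits always fits in an int.
    private static final Pattern SMALL_INT = Pattern.compile("-?[0-9]{1,9}");

    static OptionalInt tryParse(String text) {
        if (text == null || !SMALL_INT.matcher(text).matches()) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(Integer.parseInt(text));
    }

    public static void main(String[] args) {
        System.out.println(tryParse("1111"));
        System.out.println(tryParse("99999999999"));
    }
}
```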

How does this normalize function work?

I was doing a JUnit tutorial and I came across this normalize function that was being tested. It was defined like this:
public static String normalizeWord(String word) {
    try {
        int i;
        Class<?> normalizerClass = Class.forName("java.text.Normalizer");
        Class<?> normalizerFormClass = null;
        Class<?>[] nestedClasses = normalizerClass.getDeclaredClasses();
        for (i = 0; i < nestedClasses.length; i++) {
            Class<?> nestedClass = nestedClasses[i];
            if (nestedClass.getName().equals("java.text.Normalizer$Form")) {
                normalizerFormClass = nestedClass;
            }
        }
        assert normalizerFormClass.isEnum();
        Method methodNormalize = normalizerClass.getDeclaredMethod(
                "normalize",
                CharSequence.class,
                normalizerFormClass);
        Object nfcNormalization = null;
        Object[] constants = normalizerFormClass.getEnumConstants();
        for (i = 0; i < constants.length; i++) {
            Object constant = constants[i];
            if (constant.toString().equals("NFC")) {
                nfcNormalization = constant;
            }
        }
        return (String) methodNormalize.invoke(null, word, nfcNormalization);
    } catch (Exception ex) {
        return null;
    }
}
How does this function work? What is it actually doing?
It does the same as:
import java.text.Normalizer;
try {
    return Normalizer.normalize(word, Normalizer.Form.NFC);
} catch (Exception ex) {
    return null;
}
Except that all operations are performed via Reflection.
It's using reflection to call
java.text.Normalizer.normalize(word, java.text.Normalizer.Form.NFC);
Presumably to allow it to run on Java versions before 1.6, which don't have this class.
This function offers Unicode string normalization.
In Unicode, you can represent the same thing in several ways. For example, take a character with an accent. You can represent it composed, using one single Unicode character, or decomposed (the original letter without the accent, followed by the modifier, i.e. the accent).
The class was introduced in Java 6. For Java 5, there's a Sun proprietary class.
See class info.olteanu.utils.TextNormalizer in Phramer project (http://sourceforge.net/projects/phramer/ , www.phramer.org ) for a way to get a normalizer both in Java 5 (Sun JDK) and in Java 6, without any compilation issues (the code will compile in any version >= 5 and will run on both JVMs, although Sun discarded the Java 5 proprietary class).
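The composed versus decomposed distinction described above can be seen directly with java.text.Normalizer on Java 6 or later. A small sketch follows; the class NormalizeDemo and its method names are invented for illustration.

```java
import java.text.Normalizer;

// Hypothetical demo of composed (NFC) versus decomposed (NFD) forms.
final class NormalizeDemo {
    static String toNfc(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    static String toNfd(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        String decomposed = "e\u0301"; // 'e' followed by COMBINING ACUTE ACCENT
        String composed = "\u00e9";    // LATIN SMALL LETTER E WITH ACUTE
        System.out.println(toNfc(decomposed).equals(composed));
        System.out.println(toNfd(composed).equals(decomposed));
    }
}
```

Normalizing both sides to the same form before comparing is what makes the two spellings of the accented letter compare equal.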
