Convert code with pointers in C to Java code - java

I am having some difficulty in understanding how to write the below piece of code using String or char[] in Java.
void xyz(char *a, int startIndex, int endIndex)
{
int j;
for (j = startIndex; j <= endIndex; j++)
{
doThis((a+startIndex), (a+j));
xyz(a, startIndex+1, endIndex);
}
}
Here char *a points to the starting location of the char name[]
The above are just some random functions, but I just want the logic of how to use char* and character index char[] in Java

Based on the rephrased question from the comment thread:
You cannot change the characters of a Java String. If you need to modify a sequence of characters, use StringBuilder, which supports setCharAt(int, char), insert(int, char), and append(char). You can use new StringBuilder(myString) to convert a String to a StringBuilder, and stringBuilder.toString() to convert back.
This is perfectly legit Java code -- it's not code smelly, it's just the way you work with mutable character sequences.
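As a minimal sketch of that round trip (the class and method names here are made up for illustration, not taken from the question):

```java
class StringBuilderDemo {
    // Returns a modified copy of the input, built with StringBuilder.
    static String edit(String input) {
        StringBuilder sb = new StringBuilder(input); // String -> StringBuilder
        sb.setCharAt(0, 'J');  // overwrite the character at index 0
        sb.insert(1, 'e');     // insert 'e' before index 1
        sb.append('!');        // add a character at the end
        return sb.toString();  // StringBuilder -> String
    }

    public static void main(String[] args) {
        System.out.println(edit("hllo")); // prints "Jello!"
    }
}
```

The original String passed in is never touched; only the StringBuilder's internal buffer changes.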

A char* in C is, as you noted, pointing to the start of your character array (which is how C manages Strings).
In C the size of a char is one byte, and pointers always point to the start of a byte. Your C String is an array of characters, so adding 1 to a pointer moves the start of your string right by one character.
That means that the C code:
char *a;
// Set the String here
a = a + 1;
translates in Java to something like:
String a;
// Set the String here
a = a.substring(1);
or if you are using a char array:
char[] a;
// Set the array contents here
char[] copyTo = new char[a.length - 1];
System.arraycopy(a, 1, copyTo, 0, a.length - 1);
a = copyTo;
Java will be a bit more careful about protecting you than C will be, though. For instance, if you have a zero length string, the C code has the potential to either segfault (crashing the application) or give you a gibberish string full of memory junk (then, eventually, crash the application), whereas the Java code will throw an exception (normally an IndexOutOfBoundsException) which you can, hopefully, handle cleanly.
Remember, though, that Strings in Java are immutable. You cannot change them, you can only create new Strings. Fortunately, String has several built-in methods which allow you to do a lot of the standard actions, like replacing part of the String with another and returning the result. A character array is mutable, and you can change the characters within it, but you will lose a lot of the nice benefits you get from using the proper String class.
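If you would rather not copy anything, one common approach is to keep the array as it is and pass an index alongside it, so that the C expression a + startIndex becomes the pair (a, startIndex). Here is a rough sketch of the xyz function from the question done that way. The body of doThis is unknown, so the version below is a purely hypothetical stand-in that only counts how often it is called:

```java
class PointerToIndex {
    static int calls = 0; // demo only: counts doThis invocations

    // Hypothetical stand-in for the C doThis(char*, char*): each pointer
    // becomes "the array plus an index into it".
    static void doThis(char[] a, int i, int j) {
        calls++; // a real version would work on a[i] and a[j] here
    }

    // Same control flow as the C xyz; no pointer arithmetic is needed.
    static void xyz(char[] a, int startIndex, int endIndex) {
        for (int j = startIndex; j <= endIndex; j++) {
            doThis(a, startIndex, j);          // was doThis(a+startIndex, a+j)
            xyz(a, startIndex + 1, endIndex);
        }
    }

    public static void main(String[] args) {
        char[] name = "abc".toCharArray();
        xyz(name, 0, name.length - 1);
        System.out.println("doThis was called " + calls + " times");
    }
}
```

Because the array itself is shared, anything doThis writes into a[i] or a[j] is visible to the caller, just as it would be through the C pointers.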

Simple Answer:
You can't do exactly that. Java is pass by value only (object references are themselves passed by value). You don't have access to memory location information, so you can't do arithmetic with it.
Longer Answer:
It looks like you are passing in a string for manipulation. You have several options to simulate that.
You can convert the string to an array of characters and then pass in a char[]. If your manipulations are completely custom rather than any sort of standard string operation, this is probably what you need to do. Keep in mind that you can't change the size of the array passed in, nor can you make the caller's variable point at a new array after the function completes (again, Java is pass by value only). Only the values of the existing elements of the array can be modified.
You can pass in the String and use the String methods, such as substring() (which your begin and end indexes seem to suggest), but this may not meet your needs. Note that strings are immutable, however, and you can only get a result out via the return statement.
If you really need to modify the contents of the object passed in you can pass a StringBuilder, StringBuffer or CharBuffer object and modify away.
There's a hack that can also be used to circumvent pass by value, but it's poor style except in special situations. Pass in an array of whatever you need to modify, so in this case an array of arrays of characters would allow you to set a new sub-array and effectively achieve pass by reference, but try not to do this :)
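To make the char[] option concrete, here is a small sketch (method names invented for the example) showing which changes the caller can see and which it cannot:

```java
class PassByValueDemo {
    // Changing an element is visible to the caller: both references
    // point at the same array object.
    static void modifyElement(char[] chars) {
        chars[0] = 'X';
    }

    // Reassigning the parameter only changes the method's local copy
    // of the reference; the caller's array is untouched.
    static void reassign(char[] chars) {
        chars = new char[] {'n', 'e', 'w'};
    }

    public static void main(String[] args) {
        char[] data = {'a', 'b', 'c'};
        modifyElement(data);
        reassign(data);
        System.out.println(new String(data)); // prints "Xbc"
    }
}
```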

If your method modifies the values, you can't use String because it is immutable; you can use StringBuilder instead.
If your methods already rely on char arrays and you need the offsets you can use a CharBuffer to wrap an array. It does not support String operations but supports views for sub ranges, which seems to be what you use in the doThis() method.
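A rough sketch of the CharBuffer idea, assuming all you need is a window starting at some offset into an existing array (the helper name view is made up):

```java
import java.nio.CharBuffer;

class CharBufferDemo {
    // Returns a view of array[offset .. offset + length) that shares storage
    // with the array; index 0 of the view corresponds to array[offset].
    static CharBuffer view(char[] array, int offset, int length) {
        return CharBuffer.wrap(array, offset, length).slice();
    }

    public static void main(String[] args) {
        char[] name = "abcdef".toCharArray();
        CharBuffer sub = view(name, 2, 3);    // roughly "name + 2" in C
        sub.put(0, 'X');                      // writes through to name[2]
        System.out.println(sub);              // prints "Xde"
        System.out.println(new String(name)); // prints "abXdef"
    }
}
```

Unlike a C pointer, the view knows its own bounds, so reading or writing past the end throws an exception instead of touching neighbouring memory.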

Related

Source code of a String Class in Java

I can't figure out what these lines do:
this.value = "".value;
and
int len2 = anotherString.value.length;
There is an array called "value" at the beginning of the class. It's an array of chars.
How does it work?
Since the value field contains the characters for the String instance, what you see is the constructor using the value field of the (usually) interned empty string constant.
This way, all empty strings created with the no-argument constructor use the same array instance. Strings are immutable, so they can, and usually do, share the underlying char array to save memory.

How to access array elements by pointer arithmetic in Java

Given the following declaration in C, I can apply '+' to the address, and access
other elements.
char toto[5];
In other words, applying the + operator, as in
toto+0x04
accesses a different array element.
Is there another way to implement this operation in Java?
Many Thanks
If I'm right you want to access the element at that position in the array. You can do this in java.
char foo[] = new char[]{'1', '2','3','4', '5'};
char fooAtPositionFour = foo[4];
and assign a new value this way:
foo[4] = 'x';
Technically no, since toto+4 is an address and Java's memory management is quite different from C's. However, you can get the equivalent of *(toto+4) with toto[4] ;)
Almost always there is another way in java to implement what you want.
You rarely process char[] directly. Using String and StringBuilder is preferred.
String toto = "hello";
String lastChar = toto.substring(4); // from index 4 (the fifth character) to the end
To answer your question literally: you can use the sun.misc.Unsafe class with get/putByte or get/putChar to do pointer arithmetic, but I suggest avoiding it unless you really, really need to.
In fact, I need to isolate the last four bytes, where the param is a char[].
Do you need the last four bytes, or the last char? The last char is then toto[toto.length-1].
For the last four bytes you would need to turn the char array (UTF-16 in Java, I've no idea what the encoding would be in C) into a byte array then take the last four bytes.
new String(toto).getBytes("THE_CHAR_ENCODING_YOU_WANT_TO_USE")
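Put together, that could look something like the sketch below. The helper name is made up, and the right charset depends entirely on what the C side expects, so treat the two used here as examples only:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LastBytes {
    // Encodes the chars with the given charset and returns the last four
    // bytes of the result (or all of them if there are fewer than four).
    static byte[] lastFourBytes(char[] chars, Charset charset) {
        byte[] all = new String(chars).getBytes(charset);
        int from = Math.max(0, all.length - 4);
        return Arrays.copyOfRange(all, from, all.length);
    }

    public static void main(String[] args) {
        char[] toto = "hello".toCharArray();
        // UTF-8 uses one byte per ASCII char, UTF-16BE uses two,
        // so the same input gives different "last four bytes".
        System.out.println(Arrays.toString(lastFourBytes(toto, StandardCharsets.UTF_8)));
        System.out.println(Arrays.toString(lastFourBytes(toto, StandardCharsets.UTF_16BE)));
    }
}
```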

How can I create a character array in Java without a specified length?

char ret[] = {};
doesn't seem to work, and I'm not sure what the best way to do this is.
Any help would be greatly appreciated.
Arrays must have a fixed length.
If your goal is to have a dynamically expandable list, consider a List instead. Every time you add an item with the add() method, the list grows as needed.
List<Character> chars = new ArrayList<Character>();
// ...
See also:
Java Tutorials - Trail: Collections - The List Interface
You're probably looking for an ArrayList<Character>.
char[] ret = new char[0];
this will create an empty array. But you can use it only as a placeholder for cases when you don't have an array to pass.
If you don't know the size initially, but want to add to the array, then use an ArrayList<Character>
The best way to have an extensible array of char without the overhead of a Character object for each char is to use a StringBuilder. This allows you to build an array of char in a wide variety of ways. Once you are finished, you can use getChars() to copy the characters out into a char[].
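A minimal sketch of that approach, with an invented helper name:

```java
class GrowableChars {
    // Collects any number of pieces in a StringBuilder (which grows as
    // needed) and copies the result out into a right-sized char[].
    static char[] collect(String... pieces) {
        StringBuilder sb = new StringBuilder();
        for (String piece : pieces) {
            sb.append(piece);
        }
        char[] result = new char[sb.length()];
        sb.getChars(0, sb.length(), result, 0); // copy sb[0..length) into result
        return result;
    }

    public static void main(String[] args) {
        char[] ret = collect("ab", "c", "def");
        System.out.println(ret.length);      // prints 6
        System.out.println(new String(ret)); // prints "abcdef"
    }
}
```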
you should use:
char[] ret = {};
You cannot create an array without a length, only one with a length of 0. And why would you want that?
Your question doesn't really make sense; an array always has some length. But is this what you were thinking of?
char[] ret = null;
This creates a reference to an array, but initialises it to null. There is no actual array yet.
Methinks you are still thinking in C or Pascal. The functionality you want is probably a Java String. char[] is uncommon in Java. (You can iterate through the chars of a string with charAt, among other possibilities.)
You probably come from PHP or another programming language that hasn't got real arrays. (An array in PHP is a sort of dictionary/hashmap.)
Arrays in Java are fixed length.
For no fixed length, you can use a Vector. If you want to index with something other than integers from 0 to length - 1, you can use a Dictionary. In current Java, ArrayList and Map (for example HashMap) are the usual choices over the legacy Vector and Dictionary classes.

Why does appending "" to a String save memory?

I used a variable with a lot of data in it, say String data.
I wanted to use a small part of this string in the following way:
this.smallpart = data.substring(12,18);
After some hours of debugging (with a memory visualizer) I found out that the object's field smallpart remembered all the data from data, although it only contained the substring.
When I changed the code into:
this.smallpart = data.substring(12,18)+"";
..the problem was solved! Now my application uses very little memory!
How is that possible? Can anyone explain this? I think this.smallpart kept referencing towards data, but why?
UPDATE:
How can I clear the big String then? Will data = new String(data.substring(0,100)) do the thing?
Doing the following:
data.substring(x, y) + ""
creates a new (smaller) String object, and throws away the reference to the String created by substring(), thus enabling garbage collection of this.
The important thing to realise is that substring() gives a window onto an existing String - or rather, the character array underlying the original String. Hence it will consume the same memory as the original String. This can be advantageous in some circumstances, but problematic if you want to get a substring and dispose of the original String (as you've found out).
Take a look at the substring() method in the JDK String source for more info.
EDIT: To answer your supplementary question, constructing a new String from the substring will reduce your memory consumption, provided you bin any references to the original String.
NOTE (Jan 2013). The above behaviour has changed in Java 7u6. The flyweight pattern is no longer used and substring() will work as you would expect.
If you look at the source of substring(int, int), you'll see that it returns:
new String(offset + beginIndex, endIndex - beginIndex, value);
where value is the original char[]. So you get a new String but with the same underlying char[].
When you do, data.substring() + "", you get a new String with a new underlying char[].
Actually, your use case is the only situation where you should use the String(String) constructor:
String tiny = new String(huge.substring(12,18));
When you use substring, it doesn't actually create a new string. It still refers to your original string, with an offset and size constraint.
So, to allow your original string to be collected, you need to create a new string (using new String, or what you've got).
I think this.smallpart kept referencing towards data, but why?
Because Java strings consist of a char array, a start offset and a length (and a cached hashCode). Some String operations like substring() create a new String object that shares the original's char array and simply has different offset and/or length fields. This works because the char array of a String is never modified once it has been created.
This can save memory when many substrings refer to the same basic string without replicating overlapping parts. As you have noticed, in some situations, it can keep data that's not needed anymore from being garbage collected.
The "correct" way to fix this is the new String(String) constructor, i.e.
this.smallpart = new String(data.substring(12,18));
BTW, the overall best solution would be to avoid having very large Strings in the first place, and to process any input in smaller chunks, a few KB at a time.
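A small sketch of the copying fix, with a made-up helper name. Note that this only demonstrates the values involved; whether the two strings share a backing array is an internal detail that depends on the JDK version (the sharing behaviour described here applies to JDKs before Java 7u6) and is not observable from this code:

```java
class SubstringCopy {
    // Returns data[begin, end) as a String with its own backing array.
    // On JDKs where substring() shares the original array, the explicit
    // copy lets the large string be collected; on later JDKs substring()
    // already copies, so the extra constructor call is harmless.
    static String smallPart(String data, int begin, int end) {
        return new String(data.substring(begin, end));
    }

    public static void main(String[] args) {
        String data = "0123456789abcdefghijkl";
        System.out.println(smallPart(data, 12, 18)); // prints "cdefgh"
    }
}
```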
In Java strings are immutable objects, and once a string is created it remains in memory until it's cleaned up by the garbage collector (and this cleaning is not something you can take for granted).
When you call the substring method, Java does not create a truly new string, but just stores a range of characters inside the original string.
So, when you created a new string with this code:
this.smallpart = data.substring(12, 18) + "";
you actually created a new string when you concatenated the result with the empty string.
That's why.
As documented by jwz in 1997:
If you have a huge string, pull out a substring() of it, hold on to the substring and allow the longer string to become garbage (in other words, the substring has a longer lifetime) the underlying bytes of the huge string never go away.
Just to sum up, if you create lots of substrings from a small number of big strings, then use
String substring = string.substring(5,23);
since you only use the space needed to store the big strings. But if you are extracting just a handful of small strings from lots of big strings, then
String substring = new String(string.substring(5,23));
will keep your memory use down, since the big strings can be reclaimed when no longer needed.
That you call new String is a helpful reminder that you really are getting a new string, rather than a reference to the original one.
Firstly, calling java.lang.String.substring creates a new window onto the original String, using an offset and a length, instead of copying the relevant part of the underlying array.
If we take a closer look at the substring method, we will notice a call to the String(int, int, char[]) constructor that passes it the whole char[] representing the string. That means the substring will occupy as much memory as the original string.
OK, but why does + "" result in less memory being needed than without it?
Doing a + on strings is implemented via a StringBuilder.append method call. Looking at the implementation of this method in the AbstractStringBuilder class tells us that it eventually does an arraycopy of just the part we really need (the substring).
Any other workaround??
this.smallpart = new String(data.substring(12,18));
this.smallpart = data.substring(12,18).intern();
Appending "" to a string will sometimes save memory.
Let's say I have a huge string containing a whole book, one million characters.
Then I create 20 strings containing the chapters of the book as substrings.
Then I create 1000 strings containing all paragraphs.
Then I create 10,000 strings containing all sentences.
Then I create 100,000 strings containing all the words.
I still only use 1,000,000 characters. If you add "" to each chapter, paragraph, sentence and word, you use 5,000,000 characters.
Of course it's entirely different if you only extract one single word from the whole book, and the whole book could be garbage collected but isn't because that one word holds a reference to it.
And it's again different if you have a one million character string and remove tabs and spaces at both ends, making say 10 calls to create a substring. The way Java works or worked avoids copying a million characters each time. There is compromise, and it's good if you know what the compromises are.

String can't change. But int, char can change

I've read that in Java an object of type String can't change. But int and char variables can. Why is it? Can you give me an example?
Thank you.
(I am a newer -_- )
As bzabhi said, strings are immutable in Java. This means that a string object will never change. This does not mean you cannot change string variables, just that you cannot change the underlying memory representation of the string. For example:
String str = "Hello";
str += " World!";
Following the execution of these lines, str will point to a new string in memory. The original "Hello" string still exists in memory, but most likely it will not be there for long. Assuming that there are no extenuating circumstances, nothing will be pointing at the original string, so it will be garbage collected.
I guess the best way to put this would be to say that when line 2 of the example executes, a new string in memory is created from the concatenation of the original string and the string being added to it. The str variable, which is just a reference to a memory location, is then changed to point at the new string that was just created.
I am not particularly knowledgeable on the point, but, as I understand it, this is what happens with all "non-primitive" values. Anything that at some point derives from Object follows these rules. Primitive values, such as ints, bools, chars, floats and doubles allow the actual value in memory to be changed. So, from this:
int num = 5;
num += 2;
the actual value in memory changes. Rather than creating a new object and changing the reference, this code sample will simply change the value in memory for the num variable.
As for why this is true, it is simply a design decision by the makers of Java. I'm sure someone will comment on why this was made, but that isn't something I know.
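One way to see this for yourself is to keep a second reference to the original string before reassigning the variable. A minimal sketch (names invented for the example):

```java
class ImmutabilityDemo {
    // Returns {the original string, the string str refers to afterwards}.
    static String[] demo() {
        String str = "Hello";
        String original = str; // second reference to the same String object
        str += " World!";      // builds a new String; str now refers to it
        return new String[] {original, str};
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println(result[0]); // prints "Hello": the old object is unchanged
        System.out.println(result[1]); // prints "Hello World!"
    }
}
```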
int and char can't change either. As with strings, you can put a different value into the same variable, but an integer itself doesn't change. 3 will always be 3; you can't modify it to be 4.
String is an immutable type (the value inside of it cannot change). The same is true for all primitive types (boolean, byte, char, short, int, long, float, and double).
int x;
String s;
x = 1;
x = 2;
s = "hello";
s = "world";
x++; // x = x + 1;
x--; // x = x - 1;
As you can see, in no case can you alter the constant value (1, 2, "hello", "world") but you can alter where they are pointing (if you warp your mind a bit and say that an int variable points at a constant int value).
I'm not sure that it is possible to show (by example) that Strings cannot change. But you can confirm this by reading the description section of Javadoc for the String class, then reading the methods section and noting that there are no methods that can change a String.
EDIT: There are many reasons why Strings are designed to be immutable in Java. The most important reason is that immutable Strings are easier to use correctly than mutable ones. And if you do need the mutable equivalent of a String for some reason, you can use the StringBuilder (or StringBuffer) class.
It's also worthwhile to note that since strings are immutable, if they are passed into a method, they can't be modified inside the method in a way that is visible outside the method's scope.
public void changeIt(String s) {
// I can't do anything to s here that changes the value of the
// original string object passed into this method
}
public void changeIt(SomeObject o) {
// if SomeObject is mutable, I can do things to it that will
// be visible outside of this method call
}
This little article can probably explain it better than I can: http://www.jchq.net/tutorial/09_02Tut.htm
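Here is a runnable version of that contrast, using StringBuilder as the mutable type (an assumption for the example; any mutable class would behave the same way):

```java
class ChangeItDemo {
    // Reassigning s only changes this method's local reference.
    static void changeIt(String s) {
        s = s + " changed";
    }

    // append() mutates the object that the caller also refers to.
    static void changeIt(StringBuilder sb) {
        sb.append(" changed");
    }

    public static void main(String[] args) {
        String s = "text";
        changeIt(s);
        System.out.println(s);  // prints "text"

        StringBuilder sb = new StringBuilder("text");
        changeIt(sb);
        System.out.println(sb); // prints "text changed"
    }
}
```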
Strings are immutable in java. Nevertheless, you can still append or prepend values to strings. By values, I mean primitive data types or other strings.
However, a StringBuffer is mutable, i.e. it can be changed in memory (a new memory block doesn't have to be allocated), which makes it quite efficient. Also, consider the following example:
StringBuffer mystringbuffer = new StringBuffer(5000);
for (int i = 0; i<=1000; i++)
{
mystringbuffer.append("Number ").append(i).append('\n');
}
System.out.print (mystringbuffer);
Rather than creating one thousand strings, we create a single object (mystringbuffer), which can expand in length. We can also set a recommended starting capacity (in this case, 5000 characters), which means that the buffer doesn't have to keep requesting more memory each time something is appended to it.
While a StringBuffer won't improve efficiency in every situation, if your application uses strings that grow in length, it would be efficient. Code can also be clearer with StringBuffers, because the append method saves you from having to use long assignment statements.
