String.substring() making a copy of the underlying char[] value [closed] - java

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
A question about performance considerations for String.substring(). Prior to Java 1.7.0_06, String.substring() returned a new String object that shared the same underlying char array as its parent, but with a different offset and length. To avoid keeping a very large string in memory when only a small substring needed to be kept, programmers used to write code like this:
s = new String(queryReturningHugeHugeString().substring(0,3));
From 1.7.0_06 onwards, it has not been necessary to create a new String because in Oracle's implementation of String, substrings no longer share their underlying char array.
My question is: can we rely on Oracle (and other vendors) not going back to char[] sharing in some future release, and simply do s = s.substring(...), or should we explicitly create a new String just in case some future release of the JRE starts using a sharing implementation again?

The actual representation of a String is an internal implementation detail, so you can never be sure. However, according to public talks by Oracle engineers (most notably @shipilev), it's very unlikely to be changed back. The change was made not only to prevent possible memory leaks but also to simplify the String internals. With simpler strings it's easier to implement optimization techniques such as String deduplication or Compact Strings.
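For reference, here is a minimal sketch of the two idioms being compared. The class and method names are made up for illustration. Both methods return equal strings; the only difference is whether you depend on substring() itself doing the copy.

```java
// Hypothetical helper class, not part of any library.
final class SubstringCopy {
    // Defensive idiom from the question: the String(String) constructor
    // always produces a fresh String object, however substring() is implemented.
    static String prefixCopy(String huge, int length) {
        return new String(huge.substring(0, length));
    }

    // Plain idiom: relies on substring() not sharing the parent's array,
    // which is how Oracle's implementation has behaved since 1.7.0_06.
    static String prefixPlain(String huge, int length) {
        return huge.substring(0, length);
    }

    public static void main(String[] args) {
        String huge = "abcdefghijklmnopqrstuvwxyz";
        System.out.println(prefixCopy(huge, 3).equals(prefixPlain(huge, 3)));
    }
}
```

The extra new String(...) costs one more allocation and copy, so whether it is worth keeping depends on how much you trust the implementation detail described above.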


Why are Java substrings bad? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I have recently started using Java for the first time (I used to primarily use C, C++ or Assembly before this) and ran into substrings. I know that Java characters and strings take up at least double the space the character or string should take. But why are substrings bad? I have been advised by a lot of people to avoid them if possible on processing intensive platforms but Strings are used everywhere in web services which can be very processing intensive, so I am curious as to why so many people have this opinion.
This may be related to how substring() was previously implemented. In earlier Java versions, calling substring() on a long String kept the original String's data in memory, because the substring and the original shared the internal char[]. This could cause memory issues when a small substring kept a large original array alive unnecessarily.
Since Java 7 update 6, and therefore in Java 8, this is no longer the case (the relevant part of the internal char[] is copied), so you can freely take substrings of even long Strings.

Why didn't they design array index to start from 1? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
In many programming languages, the array index begins with 0. Is there a reason why it was designed so?
In my opinion, it would have been more convenient if the length of the array were equal to the last index. We could avoid most ArrayIndexOutOfBounds exceptions.
I can understand it for a language like C. C is an old language and its developers may not have thought about these issues and the discomfort. But modern languages like Java had a chance to redefine the design. Why did they choose to keep it the same?
Is it somehow related to the way operating systems work, or did they actually want to continue with the familiar behaviour or design structure (even though new programmers face a lot of problems related to this)?
An array index is just a memory offset.
So the first element of an array is at the memory it is already pointing to, which is simply
*(arr) == *(arr+0).
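To make the offset idea concrete, here is a small sketch of the address arithmetic. Java does not expose real memory addresses, so the base address and element size below are made-up numbers and elementAddress is a hypothetical helper, used purely for illustration.

```java
// Illustration only: models "address = base + index * elementSize".
final class IndexOffset {
    static long elementAddress(long base, int index, int elementSize) {
        return base + (long) index * elementSize;
    }

    public static void main(String[] args) {
        long base = 1000;     // assumed start address of the array
        int elementSize = 4;  // e.g. a 4-byte int
        // Index 0 needs no adjustment: it is the base address itself.
        System.out.println(elementAddress(base, 0, elementSize));
        System.out.println(elementAddress(base, 3, elementSize));
    }
}
```

With 1-based indexing the same formula would need an extra subtraction, base + (index - 1) * elementSize, on every access.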

How to get the length of a type after serialization [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I have run into a problem in Java. I have a map, such as Map<K,V>, where K and V can be arbitrary types, e.g. int, Long, String, Time, etc.
After the map is serialized, can I get the length of K or V? Can I write a common method to implement this idea? Something like:
public long getLength(Object obj) {
    // how to get the length of this obj? obj can be any type
}
How could I do that?
Nope.
But you can approximate though, here's a nice article about a sizeof function.
The reason is that you can change the default binary serialization (which is kinda verbose). There are comparisons of these tools; the last time I looked into this topic, Kryo was the most compact (10x smaller than the default Java binary serialization, because it exports neither redundant nor verification-related data).
Here's a comparison about the tools.
There's no way to get the length of an object after serialization except by serializing it.
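A minimal sketch of "just serialize it and count", assuming the default Java serialization and that the object implements Serializable. The class name is made up; getLength follows the signature from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper class, not part of any library.
final class SerializedSize {
    // Serializes obj into an in-memory buffer and returns the number of bytes written.
    static long getLength(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            // Thrown, for example, when a nested field is not serializable.
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        System.out.println(getLength("hello"));
        System.out.println(getLength(Long.valueOf(42L)));
    }
}
```

Note that the result includes the stream header and class descriptors, so the size of a key or value measured on its own will generally differ from what it contributes inside a serialized map. A different serializer would give different numbers again.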

Java naming convention for identifiers that begin with a number [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I have to deal with a domain object whose real name is 351K-Report. According to the Java naming convention, it's forbidden to use a number at the beginning of an identifier.
I don't want to fully spell out the number. And, I also think that it's a bad idea to place an underline in front of the number.
But what is the recommended alternative?
UPDATE
There are also other reports, like SpecReport, TopReport, LF10Report and so on. So I'm very doubtful that inverting parts of the noun changes the meaning of the whole project.
Maybe reverse it. For example:
report351K
Allowing identifiers to start with a digit would be very bad.
Imagine this:
int 1d = 3;
double d = 1d * 2;
What would be d?
Alternatives:
Since a leading _ usually indicates a class member, I would use report351K.
If you really want to keep the number first, then _351KReport, but I don't think you should do this. Try to come up with a name that is meaningful and at the same time convenient in Java.
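If you want to check candidate names quickly, here is a rough sketch using the standard Character.isJavaIdentifierStart and Character.isJavaIdentifierPart methods. The class and method names are made up, and the check ignores reserved words, so treat it as an approximation rather than a full validator.

```java
// Hypothetical helper: approximates the lexical rule for Java identifiers.
final class IdentifierCheck {
    static boolean looksLikeIdentifier(String name) {
        if (name.isEmpty() || !Character.isJavaIdentifierStart(name.charAt(0))) {
            return false; // a digit or '-' cannot start an identifier
        }
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true; // keywords such as "int" are not filtered out here
    }

    public static void main(String[] args) {
        for (String name : new String[] {"351KReport", "report351K", "_351KReport", "351K-Report"}) {
            System.out.println(name + " -> " + looksLikeIdentifier(name));
        }
    }
}
```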

Immutable container for a byte[] that supports subsequences, like String.substring() [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 5 years ago.
String.substring() efficiently reuses the underlying char[] within the String object. I'm wondering if there is an immutable container for byte[] arrays that supports a similar kind of efficient reuse of the underlying array.
Extra points if it can handle things like efficient append and prepend. Even more points if it's packaged for Maven.
Anyone know of such a thing?
The most suitable thing that comes to mind without going outside the base SDK is the java.nio buffers, such as ByteBuffer.
There's the Protocol Buffers ByteString. From the JavaDoc:
Immutable sequence of bytes. Substring is supported by sharing the reference to the immutable underlying bytes, as with String. Concatenation is likewise supported without copying (long strings) by building a tree of pieces in RopeByteString.
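A rough sketch of the ByteBuffer route, using only java.nio. The class and method names are made up. The returned view is read-only and shares the backing array without copying, but it is not truly immutable: anyone who still holds the original byte[] can change what the view sees.

```java
import java.nio.ByteBuffer;

// Hypothetical helper class, not part of any library.
final class ByteSlices {
    // Returns a read-only view of data[from, to) backed by the same array (no copy).
    static ByteBuffer subsequence(byte[] data, int from, int to) {
        return ByteBuffer.wrap(data, from, to - from) // position = from, limit = to
                .slice()                              // new buffer whose index 0 is data[from]
                .asReadOnlyBuffer();                  // writes through the view now throw
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5};
        ByteBuffer view = subsequence(data, 1, 4);
        System.out.println(view.remaining());
        System.out.println(view.get(0));
    }
}
```

If you need real immutability, make the view over a private copy of the array that nothing else can reach, or use something like the ByteString mentioned above.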
