stringBuffer insert method using capacity as a parameter - java

public class example{
public static void main(String args[]) {
StringBuffer s1 = new StringBuffer(10);
s1.insert(0,avaffffffffffffffffffffffffffffffffffffffffffvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv");
System.out.println(s1);
}
}
the output of this code is coming as avaffffffffffffffffffffffffffffffffffffffffffvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv.
what is the use of parameter 10 in the StringBuffer class's method?
if 10 is the size of Buffer and 0 is the offset of insert method then how will we get the whole string as an output?

From JavaDoc:
A string buffer is like a String, but can be modified. At any
point in time it contains some particular sequence of characters, but
the length and content of the sequence can be changed through certain
method calls
10 is just initial capacity (continue reading JavaDoc):
Every string buffer has a capacity. As long as the length of the character sequence contained in the string buffer does not exceed the capacity, it is not necessary to allocate a new internal buffer array. If the internal buffer overflows, it is automatically made larger.

Read the docs:
capacity - the initial capacity.
So it is not the 'size'.

what is the use of parameter 10 in the StringBuffer class's method? if 10 is the size of Buffer and 0 is the offset of insert method then how will we get the whole string as an output?
The answer is that you can make the initial capacity smaller if you know that it is not likely to be used up. When a string buffer is created, memory has to be allocated. The default size is 16, but if you only want to use it for one character, you can specify that is initial capacity is 1, and since it will only resize when you add more than one character to it, you can avoid wasting memory.
The same applies to the parameters in things like HashSet(n). It will resize if you add elements too it, but if you know exactly how many elements it is going to have you can save a little memory and the operations needed for resizing by specifying its size exactly.

Related

Difference between length() and capacity() methods in StringBuilder

I was trying to figure out when to use or why capacity() method is different from length() method of StringBuilder or StringBuffer classes.
I have searched on Stack Overflow and managed to come up with this answer, but I didn't understand its distinction with length() method. I have visited this website also but this helped me even less.
StringBuilder is for building up text. Internally, it uses an array of characters to hold the text you add to it. capacity is the size of the array. length is how much of that array is currently filled by text that should be used. So with:
StringBuilder sb = new StringBuilder(1000);
sb.append("testing");
capacity() is 1000 (there's room for 1000 characters before the internal array needs to be replaced with a larger one), and length() is 7 (there are seven meaningful characters in the array).
The capacity is important because if you try to add more text to the StringBuilder than it has capacity for, it has to allocate a new, larger buffer and copy the content to it, which has memory use and performance implications*. For instance, the default capacity of a StringBuilder is currently 16 characters (it isn't documented and could change), so:
StringBuilder sb = new StringBuilder();
sb.append("Singing:");
sb.append("I am the very model of a modern Major General");
...creates a StringBuilder with a char[16], copies "Singing:" into that array, and then has to create a new array and copy the contents to it before it can add the second string, because it doesn't have enough room to add the second string.
* (whether either matters depends on what the code is doing)
The length of the string is always less than or equal to the capacity of the builder. The length is the actual size of the string stored in the builder, and the capacity is the maximum size that it can currently fit.
The builder’s capacity is automatically increased if more characters are added to exceed its capacity. Internally, a string builder is an array of characters, so the builder’s capacity is the size of the array. If the builder’s capacity is exceeded, the array is replaced by a new array. The new array size is 2 * (the previous array size + 1).
Since you are new to Java, I would suggest you this tip also regarding StringBuilder's efficiency:
You can use newStringBuilder(initialCapacity) to create a StringBuilder with a specified initial capacity. By carefully choosing the initial capacity, you can make your program more efficient. If the capacity is always larger than the actual length of the builder, the JVM will never need to reallocate memory for the builder. On the other hand, if the capacity is too large, you will waste memory space. You can use the trimToSize() method to reduce the capacity to the actual size.
I tried to explain it the best terms I could so I hope it was helpful.

In Java 8, why is the default capacity of ArrayList now zero?

As I recall, before Java 8, the default capacity of ArrayList was 10.
Surprisingly, the comment on the default (void) constructor still says: Constructs an empty list with an initial capacity of ten.
From ArrayList.java:
/**
* Shared empty array instance used for default sized empty instances. We
* distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
* first element is added.
*/
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
...
/**
* Constructs an empty list with an initial capacity of ten.
*/
public ArrayList() {
this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
Technically, it's 10, not zero, if you admit for a lazy initialisation of the backing array. See:
public boolean add(E e) {
ensureCapacityInternal(size + 1);
elementData[size++] = e;
return true;
}
private void ensureCapacityInternal(int minCapacity) {
if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
}
ensureExplicitCapacity(minCapacity);
}
where
/**
* Default initial capacity.
*/
private static final int DEFAULT_CAPACITY = 10;
What you're referring to is just the zero-sized initial array object that is shared among all initially empty ArrayList objects. I.e. the capacity of 10 is guaranteed lazily, an optimisation that is present also in Java 7.
Admittedly, the constructor contract is not entirely accurate. Perhaps this is the source of confusion here.
Background
Here's an E-Mail by Mike Duigou
I have posted an updated version of the empty ArrayList and HashMap patch.
http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/
This revised implementation introduces no new fields to either class. For ArrayList the lazy allocation of the backing array occurs only if the list is created at default size. According to our performance analysis team, approximately 85% of ArrayList instances are created at default size so this optimization will be valid for an overwhelming majority of cases.
For HashMap, creative use is made of the threshold field to track the requested initial size until the bucket array is needed. On the read side the empty map case is tested with isEmpty(). On the write size a comparison of (table == EMPTY_TABLE) is used to detect the need to inflate the bucket array. In readObject there's a little more work to try to choose an efficient initial capacity.
From: http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-April/015585.html
In java 8 default capacity of ArrayList is 0 until we add at least one object into the ArrayList object (You can call it lazy initialization).
Now question is why this change has been done in JAVA 8?
Answer is to save memory consumption. Millions of array list objects are created in real time java applications. Default size of 10 objects means that we allocate 10 pointers (40 or 80 bytes) for underlying array at creation and fill them in with nulls.
An empty array (filled with nulls) occupy lot of memory .
Lazy initialization postpones this memory consumption till moment you will actually use the array list.
Please see below code for help.
ArrayList al = new ArrayList(); //Size: 0, Capacity: 0
ArrayList al = new ArrayList(5); //Size: 0, Capacity: 5
ArrayList al = new ArrayList(new ArrayList(5)); //Size: 0, Capacity: 0
al.add( "shailesh" ); //Size: 1, Capacity: 10
public static void main( String[] args )
throws Exception
{
ArrayList al = new ArrayList();
getCapacity( al );
al.add( "shailesh" );
getCapacity( al );
}
static void getCapacity( ArrayList<?> l )
throws Exception
{
Field dataField = ArrayList.class.getDeclaredField( "elementData" );
dataField.setAccessible( true );
System.out.format( "Size: %2d, Capacity: %2d%n", l.size(), ( (Object[]) dataField.get( l ) ).length );
}
Response: -
Size: 0, Capacity: 0
Size: 1, Capacity: 10
Article Default capacity of ArrayList in Java 8 explains it in details.
If the very first operation that is done with an ArrayList is to pass addAll a collection which has more than ten elements, then any effort put into creating an initial ten-element array to hold the ArrayList's contents would be thrown out the window. Whenever something is added to an ArrayList it's necessary to test whether the size of the resulting list will exceed the size of the backing store; allowing the initial backing store to have size zero rather than ten will cause this test to fail one extra time in the lifetime of a list whose first operation is an "add" which would require creating the initial ten-item array, but that cost is less than the cost of creating a ten-item array that never ends up getting used.
That having been said, it might have been possible to improve performance further in some contexts if there were a overload of "addAll" which specified how many items (if any) would likely be added to the list after the present one, and which could use that to influence its allocation behavior. In some cases code which adds the last few items to a list will have a pretty good idea that the list is never going to need any space beyond that. There are many situations where a list will get populated once and never modified after that. If at the point code knows that the ultimate size of a list will be 170 elements, it has 150 elements and a backing store of size 160, growing the backing store to size 320 will be unhelpful and leaving it at size 320 or trimming it to 170 will be less efficient than simply having the next allocation grow it to 170.
The question is 'why?'.
Memory profiling inspections (for example (https://www.yourkit.com/docs/java/help/inspections_mem.jsp#sparse_arrays) shows that empty (filled with nulls) arrays occupy tons of memory .
Default size of 10 objects means that we allocate 10 pointers (40 or 80 bytes) for underlying array at creation and fill them in with nulls. Real java applications create millions of array lists.
The introduced modification removes^W postpone this memory consumption till moment you will actually use the array list.
After above question I gone through ArrayList Document of Java 8. I found the default size is still 10 only.
ArrayList default size in JAVA 8 is stil 10. The only change made in JAVA 8 is that if a coder adds elements less than 10 then the remaining arraylist blank places are not specified to null. Saying so because I have myself gone through this situation and eclipse made me look into this change of JAVA 8.
You can justify this change by looking at below screenshot. In it you can see that ArrayList size is specified as 10 in Object[10] but the number of elements displayed are only 7. Rest null value elements are not displayed here. In JAVA 7 below screenshot is same with just a single change which is that the null value elements are also displayed for which the coder needs to write code for handling null values if he is iterating complete array list while in JAVA 8 this burden is removed from the head of coder/developer.
Screen shot link.

Why do we need capacity in StringBuilder

As we know, there is a attribute in StringBuilder called capacity, it is always larger than the length of StringBuilder object. However, what is capacity used for? It will be expanded if the length is larger than the capacity. If it does something, can someone give an example?
You can use the initial capacity to save the need to re-size the StringBuilder while appending to it, which costs time.
If you know if advance how many characters would be appended to the StringBuilder and you specify that size when you create the StringBuilder, it will never have to be re-sized while it is being used.
If, on the other hand, you don't give an initial capacity, or give a too small intial capacity, each time that capacity is reached, the storage of the StringBuilder has to be increased, which involves copying the data stored in the original storage to the larger storage.
The string builder has to store the string that is being built somewhere. It does so in an array of characters. The capacity is the length of this array. Once the array would overflow, a new (longer) array is allocated and contents are transferred to it. This makes the capacity rise.
If you are not concerned about performance, simply ignore the capacity. The capacity might get interesting once you are about to construct huge strings and know their size upfront. Then you can request a string builder with a capacity being equal to the expected size (or slightly larger if you are not sure about the size).
Example when building a string with a content size of 1 million:
StringBuilder sb = new StringBuilder(1000000);
for(int i = 0; i < 1000000; i++){
sb.append("x");
}
Initializing the string builder with one million will make it faster in comparison to a default string builder which has to copy its array repeatedly.
StringBuilder is backed by an array of characters. The default capacity is 16 + the length of the String argument. If you append to the StringBuilder and the number of characters cannot be fit in the array, then the capacity will have to be changed which will take time. So, if you have some idea about the the number of characters that you might have, initialize the capacity.
The answer is: performance. As the other answers already say, StringBuilder uses an internal array of some original size (capacity). Every time the building up string gets to large for the array to hold it, StringBuilder has to allocate a new, larger array, copy the data from the previous array to the new one and delete the previous array.
If you know beforehand what size the resulting string might be and pass that information to the constructor, StringBuilder can create a large enough array right away and thus can avoid the allocating and copying.
While for small strings, the performance gain is negelectible, it make quite a difference if you build really large strings.

What's a good resizable, random access, efficient byte vector class in Java?

I'm trying to find a class for storing a vector of bytes in Java, which supports: random access (so I can get or set a byte anywhere), resizing (so I can either append stuff to the end or else manually change the size), reasonable efficiency (I might be storing megabytes of data in these things), all in memory (I don't have a file system). Any suggestions?
So far the candidates are:
byte[]. Not resizable.
java.util.Vector<Byte>. Evil. Also painfully inefficient.
java.io.ByteArrayOutputStream. Not random-access.
java.nio.ByteBuffer. Not resizable.
org.apache.commons.collections.primitives.ArrayByteList. Not resizable. Which is weird, because it'll automatically resize if you add stuff to it, you just can't change the size explicitly!
pure RAM implementation of java.nio.channels.FileChannel. Can't find one. (Remember I don't have a file system.)
Apache Commons VFS and the RAM filesystem. If I must I will, but I really like something lighter weight...
This seems like such an odd omission that I'm sure I must have missed something somewhere. I just figure out what. What am I missing?
I would consider a class that wraps lots of chunks of byte[] arrays as elements of an ArrayList or Vector.
Make each chunk a multiple of e.g. 1024 bytes, so you accessor functions can take index >> 10 to access the right element of the ArrayList, and then index & 0x3ff to access the specific byte element of that array.
This will avoid the wastage of treating each byte as a Byte object, at the expensive of wastage of whatever's left over at the end of the last chunk.
In those cases, I simply initialize a reference to an array of reasonable length, and when it gets too small, create and copy it to a larger one by calling Arrays.copyOf(). e.g.:
byte[] a = new byte[16];
if (condition) {
a = Arrays.copyOf(a, a.length*2);
}
And we may wrap that in a class if needed...
ArrayList is more efficient than Vector because it's not synchronized.
To improve efficiency you can start with a decent initial capacity. See the constructor:
ArrayList(int initialCapacity)
I think the default is only 16, while you probably may need much bigger size.
Despite what it seems, ArrayList is very efficient even with a million record! I wrote a small test program that adds a million record to an ArrayList, without declaring an initial capacity and on my Linux, 5 year old laptop (Core 2 T5200 cpu), it takes around 100 milliseconds only to fill the whole list. If I declare a 1 million bytes initial space it takes around 60-80 ms, but If I declare 10,000 items it can take around 130-160ms, so maybe is better to not declare anything unless you can make a really good guess of the space needed.
About the concern for memory usage, it take around 8 Mb of memory, which I consider totally reasonable, unless you're writing phone software.
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;
import org.obliquid.util.StopWatch;
public class ArrayListTest {
public static void main(String args[]) {
System.out.println("tot=" + Runtime.getRuntime().totalMemory()
+ " free=" + Runtime.getRuntime().freeMemory());
StopWatch watch = new StopWatch();
List<Byte> bytes = new ArrayList<Byte>();
for (int i = 0; i < 1000000; i++) {
bytes.add((byte) (i % 256 - 128));
}
System.out.println("Elapsed: "
+ watch.computeElapsedMillisSeconds());
System.out.println("tot=" + Runtime.getRuntime().totalMemory()
+ " free=" + Runtime.getRuntime().freeMemory());
}
}
As expected, Vector performs a little worse in the range of 160-200ms.
Example output, without specifying a start size and with ArrayList implementation.
tot=31522816 free=31023176
Elapsed: 74
tot=31522816 free=22537648

What is the capacity of a StringBuffer?

When I run this code:
StringBuffer name = new StringBuffer("stackoverflow.com");
System.out.println("Length: " + name.length() + ", capacity: " + name.capacity());
it gives output:
Length: 17, capacity: 33
Obvious length is related to number of characters in string, but I am not sure what capacity is?
Is that number of characters that StringBuffer can hold before reallocating space?
See: JavaSE 6 java.lang.StringBuffer capacity()
But your assumption is correct:
The capacity is the amount of storage available for newly inserted characters, beyond which an allocation will occur
It's the size of internal buffer. As Javadoc says:
Every string buffer has a capacity. As long as the length of the
character sequence contained in the string buffer does not exceed the
capacity, it is not necessary to allocate a new internal buffer array.
If the internal buffer overflows, it is automatically made larger.
Yes, you're correct, see the JavaDoc for more information:
As long as the length of the character sequence contained in the string buffer does not exceed the capacity, it is not necessary to allocate a new internal buffer array. If the internal buffer overflows, it is automatically made larger.
Internally StringBuffer uses a char array in order to store characters. Capacity is the initial size of that char array.
More INFO can be found from http://download.oracle.com/javase/6/docs/api/java/lang/StringBuffer.html
From http://download.oracle.com/javase/6/docs/api/java/lang/StringBuffer.html#capacity%28%29
public int capacity()
Returns the current capacity. The capacity is the amount of storage available for newly inserted characters, beyond which an allocation will occur.
Also from the same document
As of release JDK 5, this class has been supplemented with an equivalent class designed for use by a single thread, StringBuilder. The StringBuilder class should generally be used in preference to this one, as it supports all of the same operations but it is faster, as it performs no synchronization.
Yes, it's exactly that. You can think of StringBuffer as being a bit like a Vector<char> in that respect (except obviously you can't use char as a type argument in Java...)
Every string buffer has a capacity. As long as the length of the
character sequence contained in the string buffer does not exceed the
capacity, it is not necessary to allocate a new internal buffer array.
If the internal buffer overflows, it is automatically made larger.
From: http://download.oracle.com/javase/1.4.2/docs/api/java/lang/StringBuffer.html
StringBuffer has a char[] in which it keeps the strings that you append to it. The amount of memory currently allocated to that buffer is the capacity. The amount currently used is the length.
Taken from the official J2SE documentation
The capacity is the amount of storage available for newly inserted characters, beyond which an allocation will occur.
Its generally length+16, which is the minimum allocation, but once the number of character ie its size exceed the allocated one, StringBuffer also increases its size (by fixed amount), but by how much amount will be assigned,we can't calculate it.
"Every string buffer has a capacity. As long as the length of the character sequence contained in the string buffer does not exceed the capacity, it is not necessary to allocate a new internal buffer array. If the internal buffer overflows, it is automatically made larger."
http://download.oracle.com/javase/1.3/docs/api/java/lang/StringBuffer.html
-see capacity() and ensurecapacity()
Capacity is amount of storage available for newly inserted characters.It is different from length().The length() returns the total number of characters and capacity returns value 16 by default if the number of characters are less than 16.But if the number of characters are more than 16 capacity is number of characters + 16.
In this case,no of characters=17
SO,Capacity=17+16=33
Ivan, just read the documentation for capacity() - it directly answers your question...
The initial capacity of StringBuffer/StringBuilder class is 16.
For the first time if the length of your String becomes >16.
The capacity of StringBuffer/StringBuilder class increases to 34 i.e [(16*2)+2]
But when the length of your String becomes >34 the capacity of StringBuffer/StringBuilder class becomes exactly equal to the current length of String.
It is already too late for the answer, But hoping this might help someone.
When we use default constructor of StringBuffer then the capacity amount get allocated is 16
StringBuffer name = new StringBuffer();
System.out.println("capacity: " + name.capacity()); /*Output - 16*/
But in case of String argument Constructor of StringBuffer the capacity calculation is like below
StringBuffer sb = new StringBuffer(String x);
Capacity = default StringBuffer Capacity + x.length()
Solution:
StringBuffer name = new StringBuffer("stackoverflow.com");
System.out.println("Length: " + name.length() + ", capacity: " + name.capacity());
Capacity Calculation: capacity = 16 + 17

Categories