I am learning Java JNI and trying to understand GetStringUTFChars and ReleaseStringUTFChars. I still can't understand ReleaseStringUTFChars.
From what I have read in some articles, in most cases GetStringUTFChars returns a reference to the original string data and not a copy. So what does ReleaseStringUTFChars actually release: the jstring, the const char* (if copied), or both?
I would understand it better if I got an answer to the question below.
In the code below, do I need to call ReleaseStringUTFChars in a for loop, or only once (with any one of the const char* pointers)?
#define array_size 10
const char* chr[array_size];
jboolean blnIsCopy;
for (int i = 0; i < array_size; i++) {
chr[i] = env->GetStringUTFChars(myjstring, &blnIsCopy);
printf((bool)blnIsCopy ? "true\n" : "false\n"); //displays always true
printf("Address = %p\n\n",chr[i]); //displays different address
}
// Call ReleaseStringUTFChars in a for loop, or is a single call enough?
for (int i = 0; i < array_size; i++) {
env->ReleaseStringUTFChars(myjstring, chr[i]);
}
Thanks in advance.
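Get/ReleaseStringUTFChars must always be called in pairs, regardless of whether a copy is returned. In your code that means the release loop is needed: every pointer returned by GetStringUTFChars gets its own ReleaseStringUTFChars call.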
In practice, you pretty much always get a copy (at least with the JVM implementations I checked: OpenJDK and Dalvik) so that the GC is free to move the original array. It obviously can't collect it because you've got a reference to the string but it'll still move objects around.
There is also a GetStringCritical/ReleaseStringCritical call pair, which will always attempt to return a pointer to the original array (though in theory it may still return a copy). This makes it faster but it comes at a cost: the GC must not move the array until you release it. Again, in practice this is usually implemented by establishing a mutex with the GC, and incrementing a lock count for Get and decrementing it for Release. This means these must be called in pairs too, otherwise the lock count will never get back to zero and GC will probably never run. Please note: Get/ReleaseStringCritical also comes with other limitations which are less relevant to this question but are no less important.
I am trying to free the memory of t_data, which is assigned from the variable dummy (the code is below). As soon as I free t_data the program throws a heap corruption error, but if I instead copy everything from body into newly allocated memory for t_data, everything works fine. The delete is called somewhere down the line in another class method (not shown here); it just uses the t_data pointer to delete the memory.
jshortArray val = (jshortArray)(m_pJVMInstance->m_pEnv->CallStaticObjectMethod(m_imageJ_cls, method_id, arr, (jint)t, (jint)c));
jsize len = m_pJVMInstance->m_pEnv->GetArrayLength(val);
jshort* body = m_pJVMInstance->m_pEnv->GetShortArrayElements(val, 0);
unsigned short int* dummy = reinterpret_cast<unsigned short int*>(body);
//t_data = dummy; //NOTE: Once you free t_data later exception is thrown.
t_data = new unsigned short int[len];
for (int i = 0; i < len; i++) {
unsigned short int test = *(body + i);
*((unsigned short int*)t_data + i) = test;
}
I am trying to figure out a way where I don't have to run the for loop to copy the body data to t_data and am still able to free the memory. (The for loop takes too much time for big images.)
What Michael said was correct and that indeed solved the problem. Referring to his comment:
Yes, definitely don't call free or delete on the pointer returned by GetShortArrayElements, because you don't know what GetShortArrayElements did internally. It might not have allocated any memory at all. Some implementations just pin the Java array to avoid having it moved by the GC, and then return a pointer to the actual Java array contents. Just call ReleaseShortArrayElements when you're done with the pointer. – Michael
I came across a question when I read the code of sun.misc.Unsafe.java.
Is CAS a spin-like loop?
At first, I thought CAS was just an atomic operation at a low level. However, when I tried to find the source code of the function compareAndSwapInt, I found C++ code like this:
jbyte Atomic::cmpxchg(jbyte exchange_value, volatile jbyte* dest, jbyte compare_value) {
assert(sizeof(jbyte) == 1, "assumption.");
uintptr_t dest_addr = (uintptr_t)dest;
uintptr_t offset = dest_addr % sizeof(jint);
volatile jint* dest_int = (volatile jint*)(dest_addr - offset);
jint cur = *dest_int;
jbyte* cur_as_bytes = (jbyte*)(&cur);
jint new_val = cur;
jbyte* new_val_as_bytes = (jbyte*)(&new_val);
new_val_as_bytes[offset] = exchange_value;
while (cur_as_bytes[offset] == compare_value) {
jint res = cmpxchg(new_val, dest_int, cur);
if (res == cur) break;
cur = res;
new_val = cur;
new_val_as_bytes[offset] = exchange_value;
}
return cur_as_bytes[offset];
}
I saw "while" and "break" in this atomic function.
Is it a kind of spinning?
related code links:
http://hg.openjdk.java.net/jdk8u/jdk8u20/hotspot/file/190899198332/src/share/vm/prims/unsafe.cpp
http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/07011844584f/src/share/classes/sun/misc/Unsafe.java
http://hg.openjdk.java.net/jdk8u/jdk8u20/hotspot/file/55fb97c4c58d/src/share/vm/runtime/atomic.cpp
CAS is a single operation that returns a value of 1 or 0, meaning the operation has succeeded or not. Since you are doing a compareAndSwapInt you want this operation to succeed, so the operation gets repeated until it works.
I think you are also confusing this with a spin lock. That basically means: do something while this value is "1" (for example), and all other threads wait until this value is zero (via compareAndSwap), which in effect means that some thread is done with the work and has released the lock (this is referred to as release/acquire semantics).
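As a rough illustration of "repeat until it works", here is a minimal Java sketch of a CAS retry loop built on java.util.concurrent.atomic.AtomicInteger. The class name CasRetrySketch and the helper incrementWithCas are made up for this example; real code would normally just call incrementAndGet.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasRetrySketch {
    // Hypothetical helper: increments the counter with an explicit CAS retry loop.
    static int incrementWithCas(AtomicInteger counter) {
        while (true) {
            int current = counter.get();   // read the value we expect to replace
            int next = current + 1;        // compute the new value from it
            // The CAS itself is one atomic step; it fails only if another
            // thread changed the counter between get() and compareAndSet().
            if (counter.compareAndSet(current, next)) {
                return next;
            }
            // Failed: loop and try again with a fresh read.
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                incrementWithCas(counter);
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(counter.get()); // prints 20000
    }
}
```

The loop lives around the CAS, not inside it: each compareAndSet call either succeeds or fails immediately, and no thread ever holds a lock while the others wait.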
The CAS operation is not a spin, it's an atomic operation at hardware level. On x86 and SPARC processors CAS is a single instruction, and it supports int and long operands.
Indeed the Atomic::cmpxchg int / long overloads are generated on x86 using a single cmpxchgl/cmpxchgq instruction.
What you're looking at is an Atomic::cmpxchg single-byte overload, which works around the CAS instruction's limitation to simulate CAS at byte level. It does so by performing a CAS for an int located at the same address as the byte, then checking just one byte out of it and repeating if CAS fails because of a change in the other 3 bytes. The compare-and-swap is still atomic, it just needs to be re-tried sometimes because it covers more bytes than is necessary.
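The same workaround can be sketched in Java, assuming an AtomicInteger stands in for the 4-byte word that contains the target byte. The class ByteCasSketch, the helper compareAndSwapByte and its byte-index convention (0 = least significant byte) are assumptions made for this illustration, not part of HotSpot or the JDK.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ByteCasSketch {
    // Hypothetical helper: CAS on a single byte inside an int word.
    // index 0 is the least significant byte, index 3 the most significant.
    // Returns the byte value observed before the swap attempt finished.
    static byte compareAndSwapByte(AtomicInteger word, int index, byte expected, byte update) {
        int shift = index * 8;
        int mask = 0xFF << shift;
        int cur = word.get();
        // Keep trying only while the target byte still holds the expected value.
        while ((byte) (cur >>> shift) == expected) {
            int next = (cur & ~mask) | ((update & 0xFF) << shift);
            if (word.compareAndSet(cur, next)) {
                return expected; // swap succeeded
            }
            // The int CAS failed, possibly because one of the other three
            // bytes changed. Re-read the word and re-check the target byte.
            cur = word.get();
        }
        return (byte) (cur >>> shift); // target byte differs: no swap
    }

    public static void main(String[] args) {
        AtomicInteger word = new AtomicInteger(0x11223344);
        byte old = compareAndSwapByte(word, 1, (byte) 0x33, (byte) 0x7F);
        System.out.println(Integer.toHexString(old));        // prints 33
        System.out.println(Integer.toHexString(word.get())); // prints 11227f44
    }
}
```

The retry only happens when the wider int CAS fails while the byte of interest is still unchanged, which mirrors the while/break structure in the C++ code above.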
CAS is typically a hardware instruction just like integer addition or comparison, for example (only slower). The instruction itself may be broken down into several steps of so-called microcode, and might indeed contain a low-level loop or a blocking wait for another processor component. However, these are implementation details of the processor architecture. Remember the saying that any problem in CS can be solved by adding another layer of indirection? This also applies here. An atomic operation in Java may actually involve the following layers:
The Java method signature.
A C(++) JNI method to implement it.
A C(++) "compiler intrinsic" such as GCC's __atomic_compare_exchange.
The actual processor instruction.
The microcode that implements this instruction.
Additional layers to be used by said microcode, such as cache coherency protocols and the like.
My recommendation is not to worry about how all of this works unless one of the following cases applies:
For some reason, it doesn't work. This is likely due to a platform bug.
It is too slow.
Unit tests can help you identify the former case. Benchmarking can help you identify the latter case. But it should be pointed out that if the CAS provided to you by Java is slow, chances are that you will not be able to write a faster one yourself. Therefore, your best bet in this case would be to change your data structures or data flows so as to further reduce the amount of thread synchronization required.
What is the best way to make a JNI call as fast as possible?
I need to call a .dll to communicate with an external measurement box.
At the moment I read the values of the box over JNI with a statically loaded library.
public class myJni{
static {
System.loadLibrary("myJniDll");
}
public native double Get4(String para);
}
Very simple, as you can see.
On the C side I use:
HINSTANCE hInstLibrary = LoadLibrary("my_64.dll");
typedef void(*FunctionFunc)();
JNIEXPORT jdouble JNICALL my_Get4
(JNIEnv * penv, jclass clazz, jstring Para)
{
typedef double(__stdcall *Get4)(char FAR *lpszPara);
Get4 _Get4;
FunctionFunc _FunctionFunc2;
_Get4 = (Get4)GetProcAddress(hInstLibrary, "my_Get4");
_FunctionFunc2 = (FunctionFunc)GetProcAddress(hInstLibrary, "Function");
const char *nativeString = penv->GetStringUTFChars(Para, 0);
const char* parameter = nativeString;
double ret = _Get4((char*)parameter);
penv->ReleaseStringUTFChars(Para, nativeString);
return ret;
}
The code needs about 20 ms to get the value from the COM port unit. The lag when the value is changing doesn't "feel" good. When I change the value, it is noticeable that it needs time to go through the JNI.
Has anyone got some tweaks to get it down to about 10 ms?
Edit: Gil's suggestion to skip the function pointer lookup made a huge impact. It's now less "laggy". Still not as fast as I want, but OK.
The unit on the COM port is a measurement device that reports values to six decimal places (0.000000). So the lag shows up as the last 4 digits not changing smoothly but skipping much of the scale when the value changes.
You can skip loading the function pointer for each call:
static Get4 _Get4 = NULL;
static FunctionFunc _FunctionFunc2 = NULL;
if(!_Get4)
_Get4 = (Get4)GetProcAddress(hInstLibrary, "my_Get4");
if(!_FunctionFunc2)
_FunctionFunc2 = (FunctionFunc)GetProcAddress(hInstLibrary, "Function");
This will save a lot of time.
The other answers offer some useful optimizations (viewed in isolation), but I'm pessimistic that they will give you the amount of speed-up that you desire.
If this method really takes 20 milliseconds per call, amortized over a number of calls, then I can confidently predict that the vast majority of that time is spent in either the call to Get4, or in the call to GetStringUTFChars. Neither of those can be optimized, so the chances of getting a 50% speedup are (IMO) non-existent.
You don't state which of these methods does anything resembling 'get the value of the COM port unit', but you don't need to get the native function addresses every time you call this method. They won't change. Stick them into static variables the first time. As a matter of fact you don't need to dynamically load 'my_64.dll' at all. Statically link to it.
I was browsing through the Java code today and I noticed something.
int[] m = mag;
int len = m.length;
int[] xm = xInt.mag;
if (len != xm.length)
return false;
(This is in the BigInteger class, which can be found by unzipping src.zip. It's in the equals method.) Why is an entirely new variable m created when it is only used once? Why isn't the code just int len = mag.length? I saw this in another method also (bitLength), and again, m is only used once. Is there any advantage to doing this or is it just a mistake by the creators of this class?
Edit: as @usernametbd pointed out, it is used a bit later:
for (int i = 0; i < len; i++)
if (xm[i] != m[i])
return false;
But they still could have just used mag. Why would an entirely new variable be made?
In a different function (in the same class, bitLength), a new variable m is made and it's only used a single time.
Because mag is a field and m is a local variable. Access to a local variable may be faster, though modern JITs can create such a substitute local variable automatically.
BTW, you should have said which method you had in mind (I found it to be equals()), and cited the original source (it is available) rather than a decompiled one.
A bit (a few lines) further down, they use
for (int i = 0; i < len; i++)
if (xm[i] != m[i])
return false;
So m isn't completely isolated. They certainly could've used mag instead, but it's just a design choice.
Reading length (a public final member of an array) is a constant-time operation in Java. It is not the same in C++: there you first have to get the array size in bytes and then divide that result by the size of int to get the element count (maybe there is a better way). I think the developer kept a reflex from his C++ days and stored the value in a local variable to use it several times.
Why is it important to you? The statement is not copying an array, just copying a reference -- a pointer. And "m" will likely be allocated into a register, whereas the JVM standard requires that "mag" must usually be refetched from the object -- the JITC can't freely optimize away field references.
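To make the idiom discussed in this thread concrete, here is a small sketch of a comparison that reads each field once into a local and then works only with the locals. The class LocalCopySketch and the method sameMagnitude are invented for illustration; the sketch only mimics the shape of the BigInteger snippet and is not the JDK source.

```java
public class LocalCopySketch {
    private final int[] mag; // field, read through the object reference

    LocalCopySketch(int... mag) {
        this.mag = mag.clone();
    }

    boolean sameMagnitude(LocalCopySketch other) {
        int[] m = mag;        // copies the reference, not the array
        int[] xm = other.mag; // same for the other object's field
        int len = m.length;
        if (len != xm.length) {
            return false;
        }
        // From here on only locals are touched, which a JIT can keep in registers.
        for (int i = 0; i < len; i++) {
            if (xm[i] != m[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        LocalCopySketch a = new LocalCopySketch(1, 2, 3);
        LocalCopySketch b = new LocalCopySketch(1, 2, 3);
        LocalCopySketch c = new LocalCopySketch(1, 2, 4);
        System.out.println(a.sameMagnitude(b)); // prints true
        System.out.println(a.sameMagnitude(c)); // prints false
    }
}
```

Whether this is measurably faster than using the field directly depends on the JIT, as the answers above note. The behaviour is the same either way.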
Say I have a simple PHP loop like this one:
// Bad example
$array = array('apple','banana','cucumber');
for ($i = 0; $i < count($array); $i++) {
echo $array[$i];
}
I know this is bad practice. It's better not to use count() in the loop condition.
// Nice example
$array = array('apple','banana','cucumber');
$limit = count($array);
for ($i = 0; $i < $limit; $i++) {
// do something...
}
In Java, I would do it this way:
// Bad example?
String[] array = {"apple","banana","cucumber"};
for(int i = 0; i < array.length; i++){
System.out.println(array[i]);
}
Question: Isn't the above bad practice too? Or is it just the same as the example below?
// Nice example?
String[] array = {"apple","banana","cucumber"};
int limit = array.length;
for(int i = 0; i < limit; i++){
System.out.println(array[i]);
}
Any decent compiler/interpreter should automatically optimise the first example to match the second (semantically speaking anyway, if not exactly literally), and probably the third to match the fourth. It's known as a loop invariant optimisation, where the compiler recognises that an entity (variable, expression, etc) does not vary within the loop (i.e. is invariant) and removes it to outside the loop (loosely speaking).
It's not bad practice at all anymore, if it ever was.
The "bad" examples you use are not equivalent, and thus are not comparable - even if they seem so on the surface. Using this description:
for (initialization; termination; increment) {
statement(s)
}
(which is descriptive of both PHP and Java loops), the initialization statement is executed once, at the start of the loop. The termination statement and the increment are executed for each iteration of the loop.
The reason it is bad practice to use PHP's count in the termination statement is that, for each iteration, the count function call occurs. In your Java example, array.length is not a function call but a reference to a public member. Therefore, the termination statements used in your examples are not equivalent behavior. We expect a function call to be more costly than a property reference.
It is bad practice to place a function call (or call a property that masks a function) in the termination statement of a for loop in any language which has the described loop mechanics. That's what makes the PHP example "bad", and it would be equally bad if you used a count-type function in Java for loop's termination statement. The real question, then, is whether Java's Array.length does indeed mask a function call - the answer to that is "no" (see the potential duplicate question, and/or check out http://leepoint.net/notes-java/data/arrays/arrays.html)
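A small Java sketch can make the difference visible. The helper size below is a hypothetical method standing in for a count-style call; it counts how often it is invoked when used as the termination expression, whereas array.length is a plain field read.

```java
public class LoopConditionSketch {
    static int calls = 0;

    // Hypothetical count-style method: records every invocation.
    static int size(String[] a) {
        calls++;
        return a.length;
    }

    public static void main(String[] args) {
        String[] array = {"apple", "banana", "cucumber"};

        // Method call in the termination expression: evaluated before every
        // iteration, plus once more for the final check that ends the loop.
        for (int i = 0; i < size(array); i++) {
            // do something...
        }
        System.out.println(calls); // prints 4

        // Field read in the termination expression: no call is involved.
        int visited = 0;
        for (int i = 0; i < array.length; i++) {
            visited++;
        }
        System.out.println(visited); // prints 3
    }
}
```

For three elements the method runs four times, which is the cost the PHP "bad example" pays with count(); the array.length version has no such call to hoist.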
The main difference is that count() is a function whereas array.length is a property and therefore not different from a limit variable.
They are not the same. In the Java "nice example" you are not reading the length of the array every time. Instead, you are storing it in the limit variable and using that to stop the loop, rather than evaluating array.length on every iteration of the for loop.
EDIT: Both of the things that you thought were "bad practice" are bad practice and the "nice examples" are the more efficient ways (at least in theory). But it is true that in implementation there will not be any noticeable difference.
In Java this doesn't matter: an array has this attribute as a constant (public final int).
The difference is that in Java arrays have a fixed size and cannot grow, so there is no need to count the elements every time length is accessed.