What is the core technology of kryo? [closed] - java

Kryo is really fast and small. What's the secret here?
I have been diving into its code for a while, but I still need some guidance.
Thanks.

From their page:
The 2.22 release fixes many reported issues and improves stability and
performance. It also introduces a number of new features, most notably
that it can use Unsafe to read and write object memory directly. This
is the absolute fastest way to do serialization, especially for large
primitive arrays.
It uses direct bytecode-level access to the fields, via sun.misc.Unsafe or the ASM library. Kryo was fast even before it started using Unsafe. The general answer, I think, is that performance is their highest priority. Java's reflection is not that slow when used carefully, i.e. when the java.lang.reflect.Field and java.lang.reflect.Method objects are cached. I set up an experiment that sorted an array with two different comparators: one used direct field access and the other used a cached Field. There was only about a 2x difference, which is unnoticeable once I/O is involved.
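To make that experiment concrete, here is a rough sketch along the same lines (the Item class, field names and array size are illustrative, not taken from the original benchmark): one comparator reads the field directly, the other goes through a cached java.lang.reflect.Field.

import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Comparator;

public class ReflectionSortBenchmark {
    static class Item { int weight; Item(int w) { weight = w; } }

    public static void main(String[] args) throws Exception {
        Item[] direct = new Item[100_000];
        for (int i = 0; i < direct.length; i++) {
            direct[i] = new Item((int) (Math.random() * 1_000));
        }
        Item[] reflective = direct.clone(); // same elements, independent ordering

        // Comparator using direct field access.
        Comparator<Item> byField = (a, b) -> Integer.compare(a.weight, b.weight);

        // Comparator using a cached Field; the lookup happens once, not per comparison.
        Field weight = Item.class.getDeclaredField("weight");
        weight.setAccessible(true);
        Comparator<Item> byReflection = (a, b) -> {
            try {
                return Integer.compare(weight.getInt(a), weight.getInt(b));
            } catch (IllegalAccessException e) {
                throw new AssertionError(e);
            }
        };

        long t0 = System.nanoTime();
        Arrays.sort(direct, byField);
        long t1 = System.nanoTime();
        Arrays.sort(reflective, byReflection);
        long t2 = System.nanoTime();

        System.out.printf("direct: %d ms, cached reflection: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
    }
}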
FieldSerializer:
By default, most classes will end up using FieldSerializer. It
essentially does what hand written serialization would, but does it
automatically. FieldSerializer does direct assignment to the object's
fields. If the fields are public, protected, or default access
(package private), bytecode generation is used for maximum speed (see
ReflectASM). For private fields, setAccessible and cached reflection
is used, which is still quite fast.
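For context, a minimal Kryo round-trip that relies on the default FieldSerializer might look like the sketch below. The Person class and its field values are made up for illustration; the Kryo, Output and Input classes are Kryo's standard API.

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

public class KryoRoundTrip {
    static class Person {
        String name;
        int age;
    }

    public static void main(String[] args) {
        Kryo kryo = new Kryo();
        kryo.register(Person.class); // FieldSerializer is the default serializer

        Person in = new Person();
        in.name = "Ada";
        in.age = 36;

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Output output = new Output(bytes)) {
            kryo.writeObject(output, in); // serialize field by field
        }
        try (Input input = new Input(new ByteArrayInputStream(bytes.toByteArray()))) {
            Person out = kryo.readObject(input, Person.class); // deserialize back
            System.out.println(out.name + " " + out.age);
        }
    }
}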

Related

Avoid array initialization in Java [closed]

Is it possible to avoid array zeroing/initialization in Java?
I need to quickly allocate fresh byte arrays of fixed length that will completely be filled with predefined data. Therefore, I don't need the JVM to automatically zero the array on instantiation and I certainly don't mind if the array contains junk. All I need is constant time array allocation, which unfortunately becomes O(n) due to the mentioned zeroing issue.
Would using unsafe help?
The JVM always initializes arrays. But you can reuse the same array, so it is initialized only once.
The class sun.misc.Unsafe is officially undocumented.
From http://mishadoff.com/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/
Avoid initialization
The allocateInstance method can be useful when you need to skip the object initialization phase, bypass security checks in a constructor, or create an instance of a class that has no public constructor.
Some resources indicate it may be removed in Java 9.
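As a rough illustration of what allocateInstance does, the sketch below obtains the Unsafe instance via reflection (the usual workaround, since there is no supported way to get it) and allocates an object without running its constructor or field initializers. The Guarded class is made up for the example.

import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class AllocateWithoutConstructor {
    static class Guarded {
        int value = 42; // field initializer runs only when a constructor runs
        Guarded() {
            System.out.println("constructor ran");
        }
    }

    public static void main(String[] args) throws Exception {
        // Steal the Unsafe singleton; the class is officially undocumented.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        // No constructor runs and no initializers are applied: value is 0,
        // and "constructor ran" is never printed.
        Guarded g = (Guarded) unsafe.allocateInstance(Guarded.class);
        System.out.println(g.value);
    }
}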
What kind of application do you have where array initialization became a performance bottleneck?
In response to "Sadly I can't because these arrays are held on to for an undefined period of time by an asynchronous networking library. Reusing them results in overwriting partially unsent messages.":
Then use a pool and reuse only arrays that are not currently in use. You will have to manage that, though. So, is array creation really that much of an issue?
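A minimal sketch of that pooling idea, assuming fixed-length buffers and that the networking library eventually hands each array back once the message has been fully sent:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ByteArrayPool {
    private final BlockingQueue<byte[]> free;
    private final int length;

    public ByteArrayPool(int poolSize, int length) {
        this.free = new ArrayBlockingQueue<>(poolSize);
        this.length = length;
        for (int i = 0; i < poolSize; i++) {
            free.offer(new byte[length]); // pay the zeroing cost once, up front
        }
    }

    /** Borrow a buffer; falls back to a fresh allocation if the pool is empty. */
    public byte[] acquire() {
        byte[] buf = free.poll();
        return buf != null ? buf : new byte[length];
    }

    /** Return a buffer that is no longer referenced by in-flight messages. */
    public void release(byte[] buf) {
        if (buf.length == length) {
            free.offer(buf); // silently dropped if the pool is already full
        }
    }
}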

How can Heap Pollution cause a security flaw [closed]

I came across this rule in the CERT Secure Coding Standard for Java: Heap Pollution. I understand this can cause the program to throw an exception at runtime, but I can't understand how it could cause a security issue like a DoS. Could someone explain a scenario where an attacker could exploit heap pollution?
An attacker would need to be able to create an arbitrary object. If you expose Java Serialization, for example, this is possible. You can construct objects via Java Serialization that would not be valid in terms of generics and can thus cause exceptions to occur.
However, there are more serious problems to worry about, such as deserializing objects that execute code in ways that were not intended. Unfortunately, some common libraries allow this, e.g. http://www.darkreading.com/informationweek-home/why-the-java-deserialization-bug-is-a-big-deal/d/d-id/1323237
In theory, parameterised types could be accepted by trusted code from an untrusted source (through serialisation, but also simply from untrusted code). In theory, values passed indirectly could behave differently when called through methods on a common supertype (notably toString, which may emit unexpected escape characters or change its value between calls, and equals, which may lie or, in a malicious implementation, alter the argument object).
In practice this does not happen. The Java library's parameterised types are generally untrustworthy themselves. Trustworthy parameterised types holding untrusted objects are uncommon, and where they are used there is typically an implicit checked cast even when using methods from Object.
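For a concrete picture of heap pollution itself (independent of serialization), here is a small self-contained example where a raw-type alias lets an Integer slip into a List<String>, and the ClassCastException only surfaces later, far from the offending line:

import java.util.ArrayList;
import java.util.List;

public class HeapPollutionDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();

        // Raw-type assignment defeats the generic type check and pollutes the heap.
        List raw = strings;
        raw.add(42); // compiles with an unchecked warning; an Integer now lives in a List<String>

        // The failure surfaces here, at the implicit cast inserted by the compiler.
        String s = strings.get(0); // throws ClassCastException at runtime
        System.out.println(s);
    }
}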

Is extracting a large number of String literals a good idea? [closed]

In a legacy code base I am dealing with, there are a vast number of String literals. A large number of them are duplicates. For example, the string "userID" is used in, say, 500 places. There are maybe a thousand such literals which are used repeatedly. IntelliJ IDEA's static code analysis suggests that I extract them as constants. If the IDE does this refactoring automatically for me, without me typing a single line of code, should I go for it?
Is it a good idea, in general, to extract many such duplicate string literals as constants? This will obviously avoid duplication, will provide single point of access, declaration, etc.
However, some of these literals only come into play when the code that uses them is actually executed. If I declare all literals as constants (static final), then all of them will be loaded together. In that context, is it a good idea to declare all those literals as constants? Could you provide some pointers on garbage collection and memory considerations in such scenarios? What are the best practices here?
Some notes: I know that string literals are interned, so I don't think I would be saving any memory in the worst case. Also, it seems that JDK 7 puts those strings in the heap rather than in permgen. I saw a couple of questions like mine, but I feel this one is different, so I am posting it here.
Thanks
All String literals are interned automatically. From JDK 7 onwards, they are GCed when the class that defines them (more precisely, the classloader that loaded that class) gets GCed, provided no other class refers to them, though this happens rarely. Making them static final and putting them into a common class is indeed useless from a memory-saving perspective, but useful from a design perspective because it provides a single point of access.
The same String literals are shared across all classes in the JVM, so no new Strings are created. Effectively, putting them into one class and accessing them from that place makes your code more structured and more readable.
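As an illustration, the refactoring the IDE proposes would typically produce a constants holder along these lines (the class name RequestKeys and the constant names here are hypothetical):

// Groups the shared literals in one place; values are unchanged, so interning
// behaves exactly as before.
public final class RequestKeys {
    public static final String USER_ID = "userID";
    public static final String SESSION_ID = "sessionID";

    private RequestKeys() {
        // no instances; this class only holds constants
    }
}

A call site then reads something like params.get(RequestKeys.USER_ID) instead of repeating the raw literal, which is where the single point of access and the readability gain come from.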
My suggestion: don't tinker with legacy code unless it makes a lot of difference. The trade-offs are yours to choose.

Is it better to use local variables or chain methods inline? [closed]

If I have a series of method invocations, the value of each used for the next call, should I store them in local variables, like so:
DynamicForm filledForm = Form.form().bindFromRequest();
String shareIdStr = filledForm.get("data[shareId]");
UUID shareId = UUID.fromString(shareIdStr);
Share share = Share.find.byId(shareId);
or as a single invocation chain, like so:
Share share = Share.find.byId(UUID.fromString(Form.form().bindFromRequest().get("data[shareId]")));
In this case, the only value that is used again is share. Perhaps the answer is somewhere in-between, or is something completely different. What's your opinion?
Not chaining methods:
Advantages
Enhances readability.
Gives an opportunity for reuse.
Pinpointing exceptions (if any) becomes easier.
Debugging becomes easier, i.e. setting a breakpoint on a specific invocation is easy.
Disadvantages
Increases the length (I won't say size) of the code.
IDE warnings (if any) about unused variables.
Chaining methods:
Advantages
Reduces the need for creating multiple temporary variables.
Is syntactic sugar.
Reduces the number of lines to be written.
Disadvantages
Reduces readability of the code.
Commenting on particular methods in the chain becomes difficult.
Debugging the whole chain of invocations becomes very difficult.
The first way is only useful if you re-use these variables later in the method. If not, Eclipse will tell you they are not used. So the second way is better, I think.
To clarify a long line of code, I like to write it like this:
Share share = Share.find
.byId(UUID.fromString(Form.form()
.bindFromRequest()
.get("data[shareId]")
)
);
You can only compare these two forms if you consider you will not reuse variables. Otherwise, it doesn't make sense to compare them.
Generally the first variant gives your code more readability and potentially makes it easier to maintain.
Personally I develop a lot for embedded systems where the target platform has big constraints on computation power and size. Therefore I typically inline the code, so that my bytecode is smaller.
If I am to develop an application to run on a powerful server, or even the regular PC, then I would most likely opt for variant one.
It depends on how you want to read your code. Local variables are useful if you are going to use them again. Otherwise, proceed with the chained invocation.

How do standard libraries implement hash tables in practice? [closed]

Some programming languages, such as Python, Java and C++11, have hash tables (sometimes under different names and with extended functionality) as part of their standard library. I would like to understand, from a high-level algorithmic point of view, what has been implemented. Specifically:
What function of the keys is used to give the location to place the data (i.e. what is the hash function that is used)?
Which algorithms do they use for resolving collisions? As an example, do any of them use simple chaining?
Is there any use of randomness to pick the hash functions?
For Java,
How are the hash functions themselves computed?
They are provided by the key class itself via int hashCode(); HashMap then applies an additional bit-spreading step to the returned value before choosing a bucket.
Which algorithms do they use for resolving collisions? As an example, do any of them use simple chaining?
Typically simple chaining. Java 8 converts bins with many collisions into balanced trees (a change originally proposed for String keys).
Is there any use of randomness to pick the hash functions?
No, except for String keys, where randomized alternative hashing can be enabled to mitigate hash-collision DoS attacks.
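To tie the answer together, here is a minimal sketch of a separately chained hash table in the spirit of what is described above. It is not the JDK implementation: the real HashMap also spreads the hash bits, resizes when the load factor is exceeded, and (in Java 8) treeifies long chains.

import java.util.LinkedList;

public class ChainedHashMap<K, V> {
    private static class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry<K, V>>[] buckets;

    @SuppressWarnings("unchecked")
    public ChainedHashMap(int capacity) {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) buckets[i] = new LinkedList<>();
    }

    // hashCode() of the key picks the bucket; the sign bit is masked off so
    // the modulo result is never negative.
    private int indexFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % buckets.length;
    }

    public void put(K key, V value) {
        LinkedList<Entry<K, V>> chain = buckets[indexFor(key)];
        for (Entry<K, V> e : chain) {
            if (e.key.equals(key)) { e.value = value; return; } // update existing key
        }
        chain.add(new Entry<>(key, value)); // collision: append to the chain
    }

    public V get(K key) {
        for (Entry<K, V> e : buckets[indexFor(key)]) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }
}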
