The code of method specialStateTransition(int, IntStream) is exceeding the 65535 bytes


I have a pretty big grammar that I don't want to break into multiple smaller grammars, but the generated lexer file gives the following error:
The code of method specialStateTransition(int, IntStream) is exceeding the 65535 bytes
I am using ANTLR-3.2. Please tell me how to remove this compiler error.
Thanks
Preeti

Method specialStateTransition is not always generated. It may be related to some tokens that share common prefixes with other tokens.
See this question/answer for a case where specialStateTransition completely vanished by reformulating just one such token.

I had the same problem recently and managed to fix it by changing the options for the ANTLR code generation tool.
C: java org.antlr.Tool -Xmaxinlinedfastates [a number less than 60] grammar.g
Using this option forces the code generator to create a table of DFA states rather than many nested if statements.
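To illustrate the difference between the two styles, here is a minimal hand-written sketch. It is not what ANTLR actually generates; the class name, the states and the two-letter alphabet are all made up for the example. The point is only that the inlined version grows with every state, while the table lookup keeps the method body the same size.

```java
public class DfaStyles {
    // Inlined style: one branch per state and input character,
    // so the method body grows with the size of the DFA.
    static int nextInlined(int state, char c) {
        if (state == 0) {
            if (c == 'a') return 1;
            if (c == 'b') return 2;
        } else if (state == 1) {
            if (c == 'b') return 2;
        }
        return -1; // no transition
    }

    // Table style: transitions are looked up in an array,
    // so the lookup method stays small however many states exist.
    // Hypothetical DFA: columns are the inputs 'a' and 'b'.
    static final int[][] TABLE = {
        { 1, 2 },   // state 0
        { -1, 2 },  // state 1
        { -1, -1 }, // state 2
    };

    static int nextTable(int state, char c) {
        if (c != 'a' && c != 'b') return -1; // input outside the alphabet
        return TABLE[state][c - 'a'];
    }

    public static void main(String[] args) {
        // Both styles describe the same transitions.
        System.out.println(nextInlined(0, 'a') == nextTable(0, 'a'));
    }
}
```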

You can't: you will have to refactor your code. The limit is inherent to Java class files.
From Section 4.10 (Limitations of the Java Virtual Machine) of the VM specification:
The amount of code per non-native, non-abstract method is limited to
65536 bytes by the sizes of the indices in the exception_table of the
Code attribute (§4.7.3), in the LineNumberTable attribute (§4.7.8),
and in the LocalVariableTable attribute (§4.7.9).

Related

limit set by 'FEATURE_SECURE_PROCESSING'

I used my own XSLT transformer in Java (XSLTTransformator), but the transformation is very big and I got this error:
Caused by: javax.xml.transform.TransformerConfigurationException: JAXP0801002: the compiler encountered an XPath expression containing '107' operators that exceeds the '100' limit set by 'FEATURE_SECURE_PROCESSING'.
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:990)
at com.aspp.dms.ruleengine.transformation.TemplatesCache.retrieveUncached(TemplatesCache.java:44)
at com.aspp.dms.ruleengine.transformation.TemplatesCache.retrieveUncached(TemplatesCache.java:21)
at com.gratex.java.util.SoftValueCache.get(SoftValueCache.java:41)
at com.aspp.dms.ruleengine.transformation.XSLTTransformator.transform(XSLTTransformator.java:73)
Can you please help me find the correct JVM argument to solve my problem? Something like -DxpathOperatorsLimit=150
thank you

That behaviour seems to come from new limits under FEATURE_SECURE_PROCESSING, which Oracle introduced in a recent update of Java. See: https://www.oracle.com/java/technologies/javase/11-0-15-relnotes.html
They introduced three parameters:
1. jdk.xml.xpathExprGrpLimit: limits the number of groups an XPath expression can contain. Default 10.
2. jdk.xml.xpathExprOpLimit: limits the number of operators an XPath expression can contain. Default 100.
3. jdk.xml.xpathTotalOpLimit: limits the total number of XPath operators in an XSL stylesheet. Default 10000.
Your problem is on #2 (JAXP0801002, default 100).
We got a very similar issue on #3 (JAXP0801003, default 10.000), with this message (quoted, so google will find it):
ERROR: 'JAXP0801003: the compiler encountered XPath expressions with an accumulated '10.002' operators that exceeds the '10.000' limit set by 'FEATURE_SECURE_PROCESSING'.'
FATAL ERROR: 'JAXP0801003: the compiler encountered XPath expressions with an accumulated '10.002' operators that exceeds the '10.000' limit set by 'FEATURE_SECURE_PROCESSING'.'
We wasted 2 days in getting away of that sh*t.
We added some parameters to the java call:
java -Djdk.xml.xpathExprGrpLimit=0 -Djdk.xml.xpathExprOpLimit=0 -Djdk.xml.xpathTotalOpLimit=0 -Xmx2g -Xms512m -XX:-UseGCOverheadLimit ....
Parameters 1, 2 and 3 solve the issue. A value of "0" turns the limit off. As XPath expressions can now get huge, it might be advisable to set the heap size and change the behaviour of the garbage collector (parameters 4-6).
I hope it will help you too. Have fun!

If you are running into this problem on a MacBook M1 or newer, try a compatible JDK version from either Amazon Corretto or Zulu. I had to try multiple versions before one worked, specifically 8.275.
https://www.azul.com/downloads/?version=java-8-lts&os=macos&architecture=arm-64-bit&package=jdk&show-old-builds=true
Hope this helps!

Rahul Meena
-Djdk.xml.xpathExprGrpLimit=0 -Djdk.xml.xpathExprOpLimit=0 -Djdk.xml.xpathTotalOpLimit=0
Put these three options in your restart files.

I encountered this error on my M1 MacBook Pro with Oracle OpenJDK 17. It also occurs in our production environment, which is based on OpenJDK 17 and CentOS.
That answer certainly helped me, but I cannot modify all the JVM configurations in our production cluster.
So I just set these three parameters in Java code:
System.setProperty("jdk.xml.xpathExprGrpLimit", "0");
System.setProperty("jdk.xml.xpathExprOpLimit", "0");
System.setProperty("jdk.xml.xpathTotalOpLimit", "0");
Note: set these properties before the TransformerFactory is initialized.
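Put together, the ordering could look like the sketch below. The class name, the helper methods and the tiny sample stylesheet are made up for illustration; only the three property names and the value "0" come from the answers above.

```java
import java.io.StringReader;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

public class XPathLimitsSetup {
    // Hypothetical minimal stylesheet, only here so the example compiles something.
    static final String SAMPLE_XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"/\"><out/></xsl:template>"
        + "</xsl:stylesheet>";

    // Turn the three XPath limits off ("0" means no limit, as described above).
    static void disableXPathLimits() {
        System.setProperty("jdk.xml.xpathExprGrpLimit", "0");
        System.setProperty("jdk.xml.xpathExprOpLimit", "0");
        System.setProperty("jdk.xml.xpathTotalOpLimit", "0");
    }

    // Create the factory only after the properties are set, then compile the stylesheet.
    static Templates compile(String xslt) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        return factory.newTemplates(new StreamSource(new StringReader(xslt)));
    }

    public static void main(String[] args) throws Exception {
        disableXPathLimits();          // must come first
        Templates templates = compile(SAMPLE_XSLT);
        System.out.println(templates != null);
    }
}
```

Calling disableXPathLimits() as early as possible, for example at the top of main, avoids another part of the application creating a factory before the properties are in place.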

Why do indexes to constantpool take up different amount of bytes in classfile format

I have been learning about the Java class file format and I was wondering why an opcode followed by a constant pool index sometimes takes up two bytes for the index in the class file, as with invokestatic, while opcodes such as ldc with an integer index take up only one byte. Is there any pattern behind this? I am asking because I am writing a simple bytecode manipulation library and I would like to know whether to write a constant pool index as a byte or a short without hard-coding every single instruction into the library.

Yes, there is a pattern: every instruction except ldc takes a two-byte constant pool index. (ldc_w is the two-byte form of ldc, for indices that do not fit in one byte.)
Presumably the designers of the bytecode format decided that loading constants was such a common task that they should provide a shorter instruction for it.
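For a manipulation library, that rule reduces to a single special case. The following is a rough sketch, assuming the caller only passes opcodes that actually take a constant pool index; the class and method names are invented for the example.

```java
import java.io.DataOutputStream;
import java.io.IOException;

public class ConstantPoolIndexWidth {
    static final int LDC = 0x12;          // one-byte constant pool index
    static final int LDC_W = 0x13;        // two-byte form of ldc
    static final int INVOKESTATIC = 0xb8; // two-byte constant pool index

    // Width in bytes of the constant pool index that follows the opcode.
    // Assumes the opcode is one that takes a constant pool index at all.
    static int indexWidth(int opcode) {
        return opcode == LDC ? 1 : 2;
    }

    // Writes the opcode followed by its constant pool index.
    // Note: some instructions (invokeinterface, invokedynamic, multianewarray)
    // have further operand bytes after the two-byte index; those are not written here.
    static void writeInstruction(DataOutputStream out, int opcode, int index) throws IOException {
        out.writeByte(opcode);
        if (indexWidth(opcode) == 1) {
            if (index > 0xFF) {
                throw new IllegalArgumentException("index does not fit in one byte, use ldc_w");
            }
            out.writeByte(index);
        } else {
            out.writeShort(index); // big-endian, as the class file format requires
        }
    }
}
```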

Why do we need a magic number in the beginning of the .class file?

I read a few posts here about the magic number 0xCAFEBABE in the beginning of each java .class file and wanted to know why it is needed - what is the purpose of this marking?
Is it still needed anymore? or is it just for backwards compatibility now?
Couldn't find a post that answers this - nor did I see the answer in the java spec

The magic number is basically an identifier for a file format. A JPEG for example always starts with FFD8. It is not necessary for Java itself, it simply helps to identify the file-type. You can read more about magic numbers here.

See: http://www.artima.com/insidejvm/whyCAFEBABE.html
EDIT: and http://radio-weblogs.com/0100490/2003/01/28.html
Some answers:
Well, they presumably had to pick
something as their magic number to
identify class files, and there's a
limit to how many Java or coffee
related words you can come up with
using just the letters A-F :-)
-
As to why the magic number is
3405691582 (0xCAFEBABE), well my guess
is that (a) 32-bit magic numbers are
easier to handle and more likely to be
unique, and (b) the Java team wanted
something with the Java-coffee
metaphor, and since there's no 'J' or
'V' in hexadecimal, settled for
something with CAFE in it. I guess
they figured "CAFE BABE" was sexier
than something like "A FAB CAFE" or
"CAFE FACE", and definitely didn't
like the implications of "CAFE A FAD"
(or worse, "A BAD CAFE").
-
Don't know why I missed this before,
but they could have used the number
12648430, if you choose to read the
hex zeros as the letter 'O'. That
gives you 0xC0FFEE, or 0x00C0FFEE to
specify all 32 bits. OO COFFEE? Object
Oriented, of course... :-)
-
I originally saw 0xCAFEBABE as a magic
number used by NeXTSTEP. NX used "fat
binaries", which were basically
binaries for different platforms stuck
together in one executable file. If
you were running on NX Intel, it would
run the Intel binary; if on HP, it
would run the HP binary. 0xCAFEBABE
was the magic number to distinguish
either the Intel or the Motorola
binaries ( can't remember which ).

Magic numbers are a common technique to make things, such as files, identifiable.
The idea is that you just have to read the first few bytes of the file to know if this is most likely a Java class file or not. If the first bytes are not equal to the magic number, then you know for sure that it is not a valid Java class file.
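As a small sketch of that check (the class and method names are made up for the example), reading the first four bytes as a big-endian int and comparing them to 0xCAFEBABE is enough:

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class MagicNumberCheck {
    static final int CLASS_MAGIC = 0xCAFEBABE;

    // Returns true if the stream starts with the class file magic number.
    // A match only means "most likely a class file"; a mismatch rules it out.
    static boolean looksLikeClassFile(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        try {
            return data.readInt() == CLASS_MAGIC; // readInt reads 4 bytes, big-endian
        } catch (EOFException e) {
            return false; // fewer than 4 bytes available
        }
    }
}
```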

It's fairly common practice with binary files to have some sort of fixed identifier at the beginning (e.g. zip files begin with the characters PK). This reduces the possibility of accidentally trying to interpret the wrong sort of file as a class file.

Analyzing memory with MAT - question about UTF characters

I get an .hprof file and I'm analyzing it with Eclipse Memory Analyser (MAT).
I run Top Component report and, in Duplicate Strings section, MAT detects some String instances with identical content.
I'm working on String.intern() and other fixes myself, but that is not my question here.
That report shows me duplicated Strings like these:
\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000....
\u000a\u0009\u0009
\u000a\u0009\u0009\u0009\u0009
And so on.
Other Strings are readable, but, how about these ones? I'm thinking they are from XML parsing (I use JibX in my app).
My questions are:
Where do you think these strings are coming from? How can I analyse them better?
If they are from XML parsing or something else, how can I clean/clear them after parsing? Maybe the JibX 1.0.1 release is too old for these issues?
Any suggestion about these UTF-8 like Strings would be very appreciated. Thanks in advance.

You can right-click on the suspicious String and select List Objects/With Incoming References. This will show you the objects that reference your Strings.

It is interesting to see Strings with many \u0000 characters. That is very uncommon, given that Strings are not 0-terminated in Java, so they were probably created by a String(byte[]) constructor, or maybe a String(byte[], encoding) constructor, from byte arrays containing 0s.
I would use a profiler and analyse the call graphs of these constructors. Then you will find the culprit.
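To show how such Strings can arise, here is a minimal sketch with invented names. It assumes a buffer that is only partly filled: decoding the whole array turns every unused 0 byte into a \u0000 character, while decoding only the used part does not. Whether this is what happens in your application is something only the profiler can confirm.

```java
import java.nio.charset.StandardCharsets;

public class NulStringDemo {
    // Decodes the whole buffer, including any unused trailing 0 bytes.
    static String fromWholeBuffer(byte[] buffer) {
        return new String(buffer, StandardCharsets.UTF_8);
    }

    // Decodes only the first 'used' bytes, so no \u0000 padding ends up in the String.
    static String fromUsedPart(byte[] buffer, int used) {
        return new String(buffer, 0, used, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[8]; // Java zero-fills new arrays
        buffer[0] = 'a';
        buffer[1] = 'b';
        System.out.println(fromWholeBuffer(buffer).length()); // 8 characters, 6 of them \u0000
        System.out.println(fromUsedPart(buffer, 2));          // ab
    }
}
```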

Java and JVM confusion (if Java can handle a large string why can't groovy?)

I recently ran into an issue with Groovy where I was attempting to deal with a very large string (100k characters). I got an error that said the string could not be more than 65,535 characters. I did some searches to try to find out more info and ran across this link that said the problem was with the JVM - https://issues.apache.org/jira/browse/GROOVY-2382.
I thought Java ran on the JVM as well and in Java I have had much larger strings. Just trying to understand. Can anyone shed some light on this for me. Thank you in advance.
Sean

This is a limitation on string literals, i.e. Strings in the source code.
It is not a problem for Strings read from a File or some other InputStream.
You should move your huge String into a separate text file.

Looking at the source for java.lang.String, the limit is that of Integer.MAX_VALUE, which is pretty big.
So yes, there is a limit, but 100K is nowhere near it.
The limit that the Groovy bug refers to is that of a string literal, which isn't the same as creating a very big string.
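The literal limit exists because a string literal is stored in the class file's constant pool, where its length is a 16-bit field. A String built at runtime never goes through the constant pool, as this small sketch shows (the class and method names are invented for the example):

```java
public class LargeStringDemo {
    // Builds a String of the requested length at runtime.
    // No string literal is involved, so the 65,535 limit does not apply.
    static String build(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + i % 26));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String big = build(100_000);
        System.out.println(big.length());
    }
}
```

Reading the text from a file, as suggested above, works for the same reason: the content arrives at runtime instead of being compiled into the class file.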
