Forth Interpreter in Java

I found a simple Forth interpreter implemented in Java, but I don't understand what using it would gain me.
What could be the advantage of the Forth interpreter?
If the final compiled code executed by the JVM is still bytecode, what would the Forth interpreter be doing?
Will it help in writing efficient/tight programs?
Will I write my code in Forth and have the interpreter convert it to Java?
Your thoughts...

The author on the page describes it as implementing a subset of FORTH and being suitable for incorporating in other applications; presumably it is intended to provide a scripting capability for an application. It's fairly unlikely that the system works by spitting out Java or JVM bytecodes; it almost certainly uses an interpreter written in Java.
Traditionally, a FORTH interpreter can be implemented in a very small memory footprint. I know someone that implemented one on a COSMAC and the core interpreter was 30 bytes long. The stack oriented byte code was also very compact as it did not need to specify the location of operands - it just read from the stack and deposited the result on the top of the stack. This made it popular in embedded systems circles where the small overhead of the interpreter was more than offset by the compact representation of the program logic.
These days it's less important as machines tend to be much larger, although digitalross makes a good point about other situations where FORTH is still used.
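To make the stack-oriented model concrete, here is a minimal sketch of an evaluator that reads operands from a stack and leaves the result on top. It is not the interpreter from the question; the class name MiniForth and the handful of supported words are assumptions chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not the interpreter linked in the question.
class MiniForth {
    private final Deque<Integer> stack = new ArrayDeque<>();

    // Evaluates whitespace-separated tokens and returns the top of the stack.
    int eval(String source) {
        for (String token : source.trim().split("\\s+")) {
            switch (token) {
                case "+": { int b = stack.pop(), a = stack.pop(); stack.push(a + b); break; }
                case "-": { int b = stack.pop(), a = stack.pop(); stack.push(a - b); break; }
                case "*": { int b = stack.pop(), a = stack.pop(); stack.push(a * b); break; }
                case "dup": stack.push(stack.peek()); break;
                case "swap": { int b = stack.pop(), a = stack.pop(); stack.push(b); stack.push(a); break; }
                case "drop": stack.pop(); break;
                // Anything else is treated as an integer literal.
                default: stack.push(Integer.parseInt(token));
            }
        }
        return stack.peek();
    }

    public static void main(String[] args) {
        // (2 + 3) * 4 in postfix form; no operand locations are encoded.
        System.out.println(new MiniForth().eval("2 3 + 4 *")); // prints 20
    }
}
```

Note that no word names its operands, which is where the compactness of the program representation comes from.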

Will it help in writing
efficient/tight programs?
That's debatable.
I'm sure that FORTH folks will tell you that it is fast. But I doubt that the execution speed of a FORTH program running on a FORTH interpreter implemented in Java will match the speed of an equivalent program implemented directly in Java. For a start, the JIT compiler won't be able to do as good a job of optimizing the FORTH interpreter as it can for the plain Java version.
If by "tight" you mean "using less memory", I think that the difference will be marginal. Remember that in both the "FORTH in Java" and "plain Java" cases you have all of the memory overheads of a Java JVM. This is likely to swamp any comparison of FORTH code density versus equivalent compiled Java code density.

Not a bytecode translator
The answers to your questions are: "see below, sort of, and no".
It's just a program that takes some input and produces some output. The input is a Forth script. Except for some very major systems, it's rare to actually produce bytecode. JRuby, Clojure, Scala: big systems like that do produce bytecode.
However, your Forth interpreter is probably just that: a script interpreter that happens to be written in Java. The input it accepts is a program of sorts, so you do end up with a nice double-indirect execution: Forth executing via a bytecode interpreter, executing via the JVM, running on the CPU.
Now, if you ran that on a CPU emulator, or wrote an interpreter in Forth, you could make it triple-indirect. (And in a sense it is already, because your Intel CPU is translating most x86 opcodes into micro-ops before executing them. :-)
Anyway, the point is that a program written in a rather static language like Java might want to take some complex user input and execute it, or perhaps the author of the program has things that are more easily done in Forth, and this allows him to write in both Java and Forth.
You should think about all of this until you understand it.

It does allow you to write efficient/tight programs, partly because the ability to define defining words (words executing at compile time) lets you effectively build a Domain Specific Language (DSL). Forth also encourages refactoring (otherwise the stack juggling simply becomes incomprehensible ...), and thus the code will be tight.

7th is IMHO closer to the original design of Forth than any other RPN language on the JVM. There is an editor with line numbering and a code beautifier. There is a matching implementation of the "interpiler" and of the dictionary with vocabulary/current/context.
By necessity, most of the hardware-dependent words (those that store to or fetch from a specific memory address) are missing. Words for memory arithmetic would be fairly pointless on the JVM anyway.
Some useful additions to the Forth syntax have been made in 7th:
the word help
7th is object oriented
there is Perl-like pattern matching
complex numbers and arrays are part of the language
a mechanism that can be redirected or turned into a no-op, to optionally send output to the stack or the console and to prevent execution of file I/O words, e.g. while testing

There are several Forth systems that implement a Forth interpreter in Java. There are two that I know of that actually compile the Forth source into a JVM class and allow you to execute the Forth code directly, without the need for the interpreter.
HolonJ by Wolf Wejgaard
Misty Beach Forth

The main advantage of a FORTH-like interpreter is its interactivity: I enter a word and get a response immediately. If it needs an intermediate step first, to generate a file or something, this advantage is gone. The second thing is the POL (Problem Oriented Language) aspect: the language must be seamlessly extensible. So if a FORTH-like language is unable to compile new words, it's worthless.
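The extensibility point can be sketched as follows: a tiny evaluator whose dictionary grows at run time through colon definitions. This is a simplified illustration under my own assumptions (the class ExtensibleForth, the method names, and storing a word's body as a token list that is re-interpreted on each call); a real Forth compiles the body when the word is defined.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: user-defined words are stored as token lists.
class ExtensibleForth {
    private final Deque<Integer> stack = new ArrayDeque<>();
    private final Map<String, List<String>> dictionary = new HashMap<>();

    // Interprets one line of input immediately, as an interactive prompt would.
    void eval(String source) {
        run(Arrays.asList(source.trim().split("\\s+")));
    }

    int top() {
        return stack.peek();
    }

    private void run(List<String> tokens) {
        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);
            if (token.equals(":")) {
                // ": name body... ;" adds a new word to the dictionary.
                String name = tokens.get(++i);
                List<String> body = new ArrayList<>();
                while (!tokens.get(++i).equals(";")) {
                    body.add(tokens.get(i));
                }
                dictionary.put(name, body);
            } else if (dictionary.containsKey(token)) {
                run(dictionary.get(token)); // execute a user-defined word
            } else if (token.equals("dup")) {
                stack.push(stack.peek());
            } else if (token.equals("*")) {
                stack.push(stack.pop() * stack.pop());
            } else if (token.equals("+")) {
                stack.push(stack.pop() + stack.pop());
            } else {
                stack.push(Integer.parseInt(token));
            }
        }
    }

    public static void main(String[] args) {
        ExtensibleForth forth = new ExtensibleForth();
        forth.eval(": square dup * ;"); // define a new word
        forth.eval("3 square");         // use it right away
        System.out.println(forth.top()); // prints 9
    }
}
```

New words can build on earlier ones (for example ": cube dup square * ;"), which is the seamless growth of vocabulary described above.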

Related

AnyLogic memory error: how to know how much the threshold is exceeded?

I have a lot of road traffic and markup elements, charts, nodes and arcs within my Main agent. When running the simulation it throws the following error:
Description: The code of method _createPersistentElementsBP4_xjal() is exceeding the 65535 bytes limit.
I read this article: https://noorjax.com/2018/10/17/your-agent-is-too-big-memory-problem/
However, I would like to know by how much I have exceeded the limit. Is there any way of getting this information? If it is not that far over the threshold, I can make some modifications to drop below it. Otherwise it is painful to create so many new agents, etc.

This is a Java Virtual Machine (JVM) restriction on the Java bytecode size for the method body (i.e., the compiled code size) as I understand it (e.g., see Baeldung's description which links to the relevant JVM specification details). Thus, even though you can see the generated Java source code for the offending method, it isn't actually the length of that that is the limitation (though obviously the length of the source code correlates to some degree with the size of the compiled bytecode).
[As such, I'm surprised if Felipe's idea of reducing variable name lengths makes any difference since they're not stored explicitly like that in the bytecode...]
So, no, you can't tell how much you've exceeded it by (unless I guess you actually interrogate compiled class files and know exactly what you're doing). Even though it is AnyLogic's code generation that is 'causing' the problem, any such situation will normally always be something that you could re-architect better (as with Felipe's example) from an object-oriented (or data structuring) design perspective in the model.

Efficient bit manipulation in Java?

I am a Java programmer, and I have recently started competitive coding (Codechef, Hackerearth, etc..)
I have a feeling that bit manipulation is really very slow in Java when it comes to very large input values. From my experience, the same C/C++ code that runs all test cases successfully gets a "time limit exceeded" in Java. (By "same" I mean converted from Java with the same algorithm, same strategy, etc.; I am not changing my logic when converting from Java to C/C++ or vice versa.) I am aware that most competitive programming sites allow 2x execution time for Java programs, but it still exceeds the time limit.
In languages like C++, we have functions like __builtin_popcount which can exploit CPU inbuilt functions that are very fast. Such things are not available in Java. Some functions like java.lang.Integer.bitCount() will only work for a 32-bit int.
So should we prefer going with C++ for such problems? Should we even consider solving bit manipulation type problems using Java? If not, then are there any super fast efficient tricks rather than applying our own logic?
(There is also the fact that machines with different architectures will take different amounts of time, but let's ignore that. My question is in the context of competitive programming.)

Long has bitCount too, as well as lowestOneBit, numberOfLeadingZeros, numberOfTrailingZeros, et cetera.
And then there is BitSet, more for boolean purposes.
In small, short-running programs those operations are horribly slow, because the just-in-time compiler does not kick in.
Java is not C, but dropping from a Java program into C/C++ for these kinds of operations is often not worth it because of the JNI overhead.
So far I have found Java sufficient for reaching similar performance, even where one would not expect it. Allocation is where Java can beat C/C++ at times. But nobody beats C at the level of raw calculation, for instance in how n % 12 is computed.
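As a small sketch of the 64-bit methods mentioned above: the helper class BitOps and its popcount method are hypothetical names of mine, while the Long methods are part of java.lang. HotSpot typically treats several of these as intrinsics once the code has been JIT-compiled, though whether a hardware instruction is used depends on the JVM and the CPU.

```java
// Hypothetical helper; only the java.lang.Long calls are standard API.
class BitOps {
    // Counts the set bits across an array of 64-bit words.
    static int popcount(long[] words) {
        int total = 0;
        for (long w : words) {
            total += Long.bitCount(w);
        }
        return total;
    }

    public static void main(String[] args) {
        long x = 40L; // binary 101000
        System.out.println(Long.bitCount(x));              // 2
        System.out.println(Long.lowestOneBit(x));          // 8
        System.out.println(Long.numberOfTrailingZeros(x)); // 3
        System.out.println(Long.numberOfLeadingZeros(x));  // 58
        System.out.println(popcount(new long[] {-1L, 1L})); // 65
    }
}
```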

Sensibility of converting Matlab program in Java to improve performance

Hi
I have a somewhat hypothetical question. We've just written some code implementing a genetic algorithm to find a solution to a sudoku game, as part of a Computational Intelligence course project. Unfortunately it runs very slowly, which limits our ability to perform an adequate number of runs to find the optimal parameters. The question is whether reprogramming the whole thing (the code base is not that big) in Java would be a viable way to boost the speed of the software. We really need a 10x performance improvement, and I am doubtful that a Java version would be that much snappier. Any thoughts?
Thanks
=== Update 1 ===
Here is the code of the function that is computationally most expensive. It's a GA fitness function that iterates through the population (different sudoku boards) and computes, for each row and column, how many elements are duplicates. The parameter n is passed in and is currently set to 9. That is, the function computes how many elements in the range 1 to 9 come up more than once in a row. The higher the number, the lower the fitness of the board, meaning that it is a weak candidate for the next generation.
The profiler reports that the two lines calling intersect in the for loop cause the poor performance, but we don't know how to really optimize the code. It follows below:
function [fitness, finished,d, threshold]=fitness(population_, n)
    finished=false;
    threshold=false;
    V=ones(n,1);
    d=zeros(size(population_,2),1);
    s=[1:1:n];
    for z=1:size(population_,2)
        board=population_{z};
        t=0;
        l=0;
        for i=1:n
            l=l+n-length(intersect(s,board(:,i)'));
            t=t+n-length(intersect(s,board(i,:)));
        end
        k=sum(abs(board*V-t));
        f=t+l+k/50;
        if t==2 &&l==2
            threshold=true;
        end
        if f==0
            finished=true;
        else
            fitness(z)=1/f;
            d(z)=f;
        end
    end
end
=== Update 2 ===
Found a solution here: http://www.mathworks.com/matlabcentral/answers/112771-how-to-optimize-the-following-function
Using histc(V, 1:9), it's much faster :)

This is rather impossible to say without viewing your code, knowing if you use parallelization, etc. Indeed, as MrAzzaman says, profiling is the first thing to do. If you find a single bottleneck, especially if it is loop-heavy, it might be sufficient to write that part in C and connect it to Matlab via MEX.
For genetic algorithms, I believe a 10x speed increase is more likely than not. I do not quite agree with MrAzzaman here: in some cases (for loops, working with dynamic objects) Matlab is much, much slower than C/C++/Java. That is not to say that Matlab is always slow, for it is not, but there are plenty of algorithms where it would be slow.
I.e., I'd say that if you don't spend much time looping over things, don't use objects, and are not limited by Matlab's data structures, you might be OK with Matlab. That said, if I were to write GAs in Java or Matlab, I'd pick the former (and I'm using Matlab a lot more than Java these days, so it's not just a matter of habit).
Btw. if you don't want to program it yourself, have a look at JGAP, it's a rather useful Java library for GAs.

OK, the first step is just to write a faster MATLAB function. Save the new languages for later.
I'm going to make the assumption that the board is full of valid guesses: that is, each entry is in [1, 9]. Now, what we're really looking for are duplicate entries in each row/column. To find duplicates, we sort. On a sorted row, if any element is equal to its neighbor, we have a duplicate. In MATLAB, the diff function does sliding pairwise differencing, and a zero in its output means that two neighboring values are equal. Both sort and diff operate on entire matrices, so no need for looping. Here's the code for the columnwise check:
l=sum(sum(diff(sort(board)) == 0));
The rowwise check is exactly the same, just using the transpose. Now let's put that in a test harness to compare results and timing with the previous version:
n = 9;
% Generate a test board: random integers from 1:n
board = randi(n, n);
s = 1:n;
K=1000; % number of iterations to use for timing
% Repeat current code for comparison
tic
for k=1:K
    t=0;
    l=0;
    for i=1:n
        l=l+n-length(intersect(s,board(:,i)'));
        t=t+n-length(intersect(s,board(i,:)));
    end
end
toc
% New code based on sort/diff for finding repeated values
tic
for k=1:K
    l2=sum(sum(diff(sort(board)) == 0));
    t2=sum(sum(diff(sort(board.')) == 0));
end
toc
% Check that reported values match
disp([l l2])
disp([t t2])
I encourage you to break down the sort/diff/sum code, and build it up on a sample board right at the command line, and try to understand exactly how it works.
On my system, the new code is about 330x faster.
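Since the question asks about a Java port, here is a sketch of how the same duplicate count might look there, using a "seen" table per row and column instead of sort/diff. The class and method names are hypothetical, and, as above, it assumes every entry is a valid guess in [1, n].

```java
// Hypothetical sketch; assumes an n-by-n board with entries in 1..n.
class SudokuFitness {
    // Returns the number of duplicated entries summed over all rows
    // and all columns (the t + l of the original function).
    static int duplicateCount(int[][] board) {
        int n = board.length;
        int duplicates = 0;
        for (int i = 0; i < n; i++) {
            boolean[] seenInRow = new boolean[n + 1];
            boolean[] seenInCol = new boolean[n + 1];
            for (int j = 0; j < n; j++) {
                int r = board[i][j]; // j-th entry of row i
                int c = board[j][i]; // j-th entry of column i
                if (seenInRow[r]) duplicates++; else seenInRow[r] = true;
                if (seenInCol[c]) duplicates++; else seenInCol[c] = true;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        int[][] board = {{1, 1, 1}, {2, 2, 2}, {3, 3, 3}};
        // Each row repeats its value twice; the columns have no repeats.
        System.out.println(duplicateCount(board)); // prints 6
    }
}
```

Whether this beats the vectorized MATLAB version would need to be measured; the point is only that the per-board work is a single pass over the entries.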

For traditional GA applications for study and research purposes, it is better to use a language that compiles to native machine code, like C or C++. I used those when working with genetic programming in the past, and they are really fast.
However, if you are planning to put this inside a more modern type of application that can be deployed in a web container or run on a mobile device, on different operating systems, etc., then Java is your best alternative, as it is platform independent.
Another thing that can be important is concurrency. For example, suppose you want to put your GA on the Internet, with a growing number of users connected concurrently who all want to solve a different sudoku. Java applications are very good at scaling horizontally and work well with a large number of concurrent connections.
Another benefit of migrating to Java is the number of libraries and frameworks you can use. The Java universe is so big that you can find useful tools for almost any kind of application.
Java is compiled for a virtual machine, but it is important to note that current JVMs perform very well and are able to optimize programs. For example, they find which methods are most heavily used and compile them to native code, which means that for some applications a Java program will be almost as fast as a natively compiled C program.
Matlab is a platform that is very useful for engineering training and for math, vector and matrix based calculations, as well as for some control work with Simulink. I used these products during my electrical engineering bachelor's degree. However, their goal is mainly to be a tool for academic purposes; I definitely wouldn't go for Matlab if I wanted to build a production application for the real world. It is not scalable, it is expensive to maintain and fine-tune, and there are not a lot of infrastructure providers that support this kind of technology.
About the complexity of rewriting your code in Java: Matlab and Java syntax are pretty similar, and they live in the same paradigm (procedural, OOP). Even if you are not using OO in your code, it can easily be rewritten in Java. The painful part will be Matlab's shortcuts for math structures like matrices, and passing functions as parameters.
For the matrix stuff, there are a lot of Java libraries like EJML that will make your life easier. As for assigning functions to variables and then passing them as parameters to other functions, Java is not currently able to do that (Java 8 will be, with lambda expressions), but you can get equivalent functionality by using classes as closures. These may be the only slightly painful things you will find when migrating.
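For readers on Java 8 or later, passing a function as a parameter can be sketched with a lambda, alongside the older anonymous-class style the answer alludes to. The integrate helper is a made-up example of mine; note that java.util.function.DoubleUnaryOperator itself only exists since Java 8, so on older versions you would declare a small interface of your own.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical example of a method that takes a function as an argument.
class FunctionPassing {
    // Numerically integrates f over [a, b] with the midpoint rule.
    static double integrate(DoubleUnaryOperator f, double a, double b, int steps) {
        double h = (b - a) / steps;
        double sum = 0.0;
        for (int i = 0; i < steps; i++) {
            sum += f.applyAsDouble(a + (i + 0.5) * h);
        }
        return sum * h;
    }

    public static void main(String[] args) {
        // Java 8+ style: a lambda expression.
        double viaLambda = integrate(x -> x * x, 0.0, 1.0, 1000);

        // Older style: an anonymous class acting as the closure.
        double viaAnonymousClass = integrate(new DoubleUnaryOperator() {
            @Override
            public double applyAsDouble(double x) {
                return x * x;
            }
        }, 0.0, 1.0, 1000);

        // Both approximate the integral of x^2 over [0, 1], i.e. 1/3.
        System.out.println(viaLambda + " " + viaAnonymousClass);
    }
}
```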

What can I do in Java code to optimize for CPU caching?

When writing a Java program, do I have influence on how the CPU will utilize its cache to store my data? For example, if I have an array that is accessed a lot, does it help if it's small enough to fit in one cache line (typically 64 bytes on current x86-64 CPUs)? What if I keep a much-used object within that limit: can I expect the memory used by its members to be close together and to stay in cache?
Background: I'm building a compressed digital tree, that's heavily inspired by the Judy arrays, which are in C. While I'm mostly after its node compression techniques, Judy has CPU cache optimization as a central design goal and the node types as well as the heuristics for switching between them are heavily influenced by that. I was wondering if I have any chance of getting those benefits, too?
Edit: The general advice of the answers so far is: don't try to micro-optimize machine-level details when you're as far away from the machine as you are in Java. I totally agree, so I felt I had to add some (hopefully) clarifying comments to better explain why I think the question still makes sense. These are below:
There are some things that are just generally easier for computers to handle because of the way they are built. I have seen Java code run noticeably faster on compressed data (from memory), even though the decompression had to use additional CPU cycles. If the data were stored on disk, it's obvious why that is so, but of course in RAM it's the same principle.
Now, computer science has lots to say about what those things are, for example, locality of reference is great in C and I guess it's still great in Java, maybe even more so, if it helps the optimizing runtime to do more clever things. But how you accomplish it might be very different. In C, I might write code that manages larger chunks of memory itself and uses adjacent pointers for related data.
In Java, I can't (and don't want to) know much about how memory is going to be managed by a particular runtime. So I have to take optimizations to a higher level of abstraction, too. My question is basically, how do I do that? For locality of reference, what does "close together" mean at the level of abstraction I'm working on in Java? Same object? Same type? Same array?
In general, I don't think that abstraction layers change the "laws of physics", metaphorically speaking. Doubling your array in size every time you run out of space is a good strategy in Java, too, even though you don't call malloc() anymore.
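The doubling strategy carries over directly. A minimal sketch, with a hypothetical GrowableIntArray class standing in for whatever structure you are building:

```java
import java.util.Arrays;

// Hypothetical growable array of primitives that doubles when full.
class GrowableIntArray {
    private int[] data = new int[4];
    private int size = 0;

    void add(int value) {
        if (size == data.length) {
            // Doubling keeps the amortized cost of add() constant.
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    int size() {
        return size;
    }

    int capacity() {
        return data.length;
    }

    public static void main(String[] args) {
        GrowableIntArray a = new GrowableIntArray();
        for (int i = 0; i < 10; i++) {
            a.add(i);
        }
        System.out.println(a.size() + " of " + a.capacity()); // prints 10 of 16
    }
}
```

Because the backing store is a single int[], the elements stay contiguous, which is also the kind of "close together" that Java lets you reason about.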

The key to good performance with Java is to write idiomatic code, rather than trying to outwit the JIT compiler. If you write your code to try to influence it to do things in a certain way at the native instruction level, you are more likely to shoot yourself in the foot.
That isn't to say that common principles like locality of reference don't matter. They do, but I would consider the use of arrays and such to be performance-aware, idiomatic code, but not "tricky."
HotSpot and other optimizing runtimes are extremely clever about how they optimize code for specific processors. (For an example, check out this discussion.) If I were an expert machine language programmer, I'd write machine language, not Java. And if I'm not, it would be unwise to think that I could do a better job of optimizing my code than the experts.
Also, even if you do know the best way to implement something for a particular CPU, the beauty of Java is write-once-run-anywhere. Clever tricks to "optimize" Java code tend to make optimization opportunities harder for the JIT to recognize. Straightforward code that adheres to common idioms is easier for an optimizer to recognize. So even when you get the best Java code for your testbed, that code might perform horribly on a different architecture, or at best, fail to take advantage of enhancements in future JITs.
If you want good performance, keep it simple. Teams of really smart people are working to make it fast.

If the data you're crunching is primarily or wholly made up of primitives (eg. in numeric problems), I would advise the following.
Allocate a flat structure of fixed-size arrays of primitives at initialisation time, and make sure the data therein is periodically compacted/defragmented (0->n, where n is the smallest max index possible given your element count), to be iterated over using a for loop. This is the only way to guarantee contiguous allocation in Java, and compaction further serves to improve locality of reference. Compaction is beneficial because it reduces the need to iterate over unused elements, reducing the number of conditionals: as the for loop iterates, termination occurs earlier, and less iteration = less movement through the heap = fewer chances for a cache miss. While compaction creates an overhead in and of itself, it may be done only periodically (with respect to your primary areas of processing) if you so choose.
Even better, you can interleave values in these pre-allocated arrays. For instance, if you are representing spatial transforms for many thousands of entities in 2D space, and are processing the equations of motion for each such, you might have a tight loop like
// array: a pre-allocated primitive array (e.g. float[]), 6 slots per entity
int axIdx, ayIdx, vxIdx, vyIdx, xIdx, yIdx;
//Acceleration, velocity, and displacement for each
//of x and y totals 6 elements per entity.
for (axIdx = 0; axIdx < array.length; axIdx += 6)
{
    ayIdx = axIdx+1;
    vxIdx = axIdx+2;
    vyIdx = axIdx+3;
    xIdx = axIdx+4;
    yIdx = axIdx+5;
    //velocity1 = velocity0 + acceleration
    array[vxIdx] += array[axIdx];
    array[vyIdx] += array[ayIdx];
    //displacement1 = displacement0 + velocity
    array[xIdx] += array[vxIdx];
    array[yIdx] += array[vyIdx];
}
This example ignores such issues as rendering of those entities using their associated (x,y)... rendering always requires non-primitives (thus, references/pointers). If you do need such object instances, then you can no longer guarantee locality of reference, and will likely be jumping around all over the heap. So if you can split your code into sections where you have primitive-intensive processing as shown above, then this approach will help you a lot. For games at least, AI, dynamic terrain, and physics can be some of the most processor-intensive aspects, and are all numeric, so this approach can be very beneficial.

If you are down to where an improvement of a few percent makes a difference, use C where you'll get an improvement of 50-100%!
If you think that the ease of use of Java makes it a better language to use, then don't screw it up with questionable optimizations.
The good news is that Java will do a lot of stuff beneath the covers to improve your code at runtime, but it almost certainly won't do the kind of optimizations you're talking about.
If you decide to go with Java, just write your code as clearly as you can, don't take minor optimizations into account at all. (Major ones like using the right collections for the right job, not allocating/freeing objects inside a loop, etc. are still worth while)

So far the advice is pretty strong, in general it's best not to try and outsmart the JIT. But as you say some knowledge about the details is useful sometimes.
Regarding memory layout for objects, Sun's JVM (now Oracle's) lays out an object's fields in memory by type (i.e. doubles and longs first, then ints and floats, then shorts and chars, after that bytes and booleans, and finally object references). You can get more details here.
Local variables (that is, references and primitive types) are usually kept on the stack.
As Nick mentions, the best way to ensure the memory layout in Java is by using primitive arrays. That way you can make sure that data is contiguous in memory. Be careful about array sizes though, GCs have trouble with large arrays. It also has the downside that you have to do some memory management yourself.
On the upside, you can use a Flyweight pattern to get Object-like usability while keeping fast performance.
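One way the flyweight idea might look in practice is a reusable cursor object that presents slices of a flat primitive array as if they were objects. The names below (ParticleStore, Cursor, the four-float record layout) are assumptions made up for this sketch.

```java
// Hypothetical sketch: object-like access over one contiguous float[].
class ParticleStore {
    private static final int STRIDE = 4; // x, y, vx, vy per particle
    private final float[] data;

    ParticleStore(int count) {
        data = new float[count * STRIDE];
    }

    // A single reusable view; moving it allocates nothing.
    final class Cursor {
        private int base;

        Cursor moveTo(int index) {
            base = index * STRIDE;
            return this;
        }

        float x() { return data[base]; }
        float y() { return data[base + 1]; }

        void set(float x, float y, float vx, float vy) {
            data[base] = x;
            data[base + 1] = y;
            data[base + 2] = vx;
            data[base + 3] = vy;
        }

        // position += velocity for the particle under the cursor
        void step() {
            data[base] += data[base + 2];
            data[base + 1] += data[base + 3];
        }
    }

    Cursor cursor() {
        return new Cursor();
    }

    int count() {
        return data.length / STRIDE;
    }

    public static void main(String[] args) {
        ParticleStore store = new ParticleStore(1000);
        ParticleStore.Cursor c = store.cursor();
        c.moveTo(0).set(1f, 2f, 0.5f, -1f);
        for (int i = 0; i < store.count(); i++) {
            c.moveTo(i).step(); // walks the array front to back
        }
        System.out.println(c.moveTo(0).x() + ", " + c.y()); // prints 1.5, 1.0
    }
}
```

The data stays in one array, and the calling code still reads like it is working with objects.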
If you need the extra oomph in performance, generating your own bytecode on the fly helps with some problems, as long as the generated code is executed enough times and your VM's native code cache doesn't get full (which disables the JIT for all practical purposes).

To the best of my knowledge: No. You pretty much have to be writing in machine code to get that level of optimization. With assembly you're a step away because you no longer control where things are stored. With a compiler you're two steps away because you don't even control the details of the generated code. With Java you're three steps away because there's a JVM interpreting your code on the fly.
I don't know of any constructs in Java that let you control things on that level of detail. In theory you could indirectly influence it by how you organize your program and data, but you're so far away that I don't see how you could do it reliably, or even know whether or not it was happening.

Why does Java have fixed data type sizes, unlike C?

In C, as we know, the size of data types (e.g. int) can vary depending on the compiler and hardware.
But why is the size of data types constant in Java? Why don't we have the flexibility of different data type sizes in Java, depending on the compiler?

The JVM (Java Virtual Machine) is designed to be platform independent. If data type sizes were different across platforms, then cross-platform consistency is sacrificed.
The JVM isolates the program from the underlying OS and platform. This can make life difficult for performing system-specific work, but the benefits are that you can write-once, run-anywhere (this is largely true, with some unfortunate issues. Write-once, test-everywhere is a much more practical approach).

If data type size varies on different platforms you lose portability.

To get a really comprehensive answer to this, you'd need to do a great deal of historical reading from the early days of Java. Certainly, the designers could have included a more complicated primitive type system. However, when Java burst onto the broad stage, it was aimed at applets. Code to run in a browser, organizing complex UI, didn't (and doesn't) need to know whether it is running on the infamous MNS-49 (7 7-bit chars per word), or the Honeywell 68000 (4 9-bit chars per word), or a boring modern processor. It's much more important that anyone can code bit arithmetic on an int and know what's going to happen after 32 shifts.
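As a small illustration of that predictability: in Java an int is always 32 bits and the shift distance for an int uses only its low five bits, so the results below are the same on every platform. The ShiftDemo class is just a hypothetical wrapper for the demonstration.

```java
// Hypothetical demo class; the behaviour shown is fixed by the Java language.
class ShiftDemo {
    static int shiftLeft(int value, int distance) {
        return value << distance;
    }

    public static void main(String[] args) {
        System.out.println(Integer.SIZE);     // 32
        System.out.println(shiftLeft(1, 31)); // -2147483648 (the sign bit)
        System.out.println(shiftLeft(1, 32)); // 1, because 32 & 31 == 0
        System.out.println(-1 >>> 28);        // 15 (unsigned shift)
        System.out.println(-1 >> 28);         // -1 (sign-extending shift)
    }
}
```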

The flexibility of C here has some advantages (reduced memory/storage consumption if you use 32 instead of 64 bits), but these advantages tend to become less relevant as hardware improves (C was designed in the 70s).
This flexibility, however, comes with severe interoperability and long-term problems (Y2038 bugs).
In contrast, a Java object has some storage overhead anyway, so saving 4 bytes on each Date object would be quite pointless and only troublesome.
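The Y2038 issue can be reproduced in Java by deliberately storing seconds since the epoch in a 32-bit int, which is what a 32-bit time_t amounts to. This is only a sketch; Y2038Demo and its method are hypothetical names.

```java
import java.time.Instant;

// Hypothetical demo of a timestamp kept in a signed 32-bit counter.
class Y2038Demo {
    // Interprets a 32-bit count of seconds since 1970-01-01T00:00:00Z.
    static Instant fromInt32(int seconds) {
        return Instant.ofEpochSecond(seconds);
    }

    public static void main(String[] args) {
        int last = Integer.MAX_VALUE;
        System.out.println(fromInt32(last)); // 2038-01-19T03:14:07Z

        int wrapped = last + 1;              // overflows to Integer.MIN_VALUE
        System.out.println(fromInt32(wrapped)); // 1901-12-13T20:45:52Z

        // With a 64-bit long, one second later is simply the next instant.
        System.out.println(Instant.ofEpochSecond((long) last + 1)); // 2038-01-19T03:14:08Z
    }
}
```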

Because that's Java. See the Java language specification.

The idea of Java was "Write once, run anywhere" without recompiling. That means every VM has the same data sizes. Of course, on 64-bit machines it uses 64-bit references, but you don't have access to those, so it doesn't matter.
It works pretty well, but one thing I do wish is that we could get 64-bit array indexes. This didn't really matter back in the day, but for large memory-mapped files it's a huge pain. You have to break them up into 2 GB chunks.
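The chunking workaround can be sketched as a long-indexed wrapper that splits each index into a chunk number and an offset. The version below uses plain heap arrays and a hypothetical ChunkedByteArray class; with memory-mapped files the same index split would apply to one mapped buffer per chunk.

```java
// Hypothetical sketch: 64-bit indexing on top of int-indexed chunks.
class ChunkedByteArray {
    private final int chunkSize;
    private final byte[][] chunks;
    private final long length;

    ChunkedByteArray(long length, int chunkSize) {
        this.length = length;
        this.chunkSize = chunkSize;
        int chunkCount = (int) ((length + chunkSize - 1) / chunkSize);
        chunks = new byte[chunkCount][];
        for (int i = 0; i < chunkCount; i++) {
            long remaining = length - (long) i * chunkSize;
            chunks[i] = new byte[(int) Math.min(chunkSize, remaining)];
        }
    }

    byte get(long index) {
        return chunks[(int) (index / chunkSize)][(int) (index % chunkSize)];
    }

    void set(long index, byte value) {
        chunks[(int) (index / chunkSize)][(int) (index % chunkSize)] = value;
    }

    long length() {
        return length;
    }

    public static void main(String[] args) {
        // A tiny chunk size keeps the demo small; in practice each chunk
        // would be close to the 2 GB limit.
        ChunkedByteArray a = new ChunkedByteArray(10, 4);
        a.set(9L, (byte) 7);
        System.out.println(a.get(9L)); // prints 7
    }
}
```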

The C language gets an advantage from varying data type sizes. At the time, main memory was scarce, and every programmer had to write space-optimized code.
Nowadays space is no longer an issue, and a portable program is much preferable.
That is why, to keep Java portable, Java does not support varying-size data types.

Because Java is a platform-independent language. That is why the size of each data type is fixed in Java.
