How to perform high-precision calculations with a mutable class in Java?

I'm designing a tool using Java 6 that will read data from medical devices.
Each medical device manufacturer implements its own firmware/protocol. Vendors (like me) write their own interface that
uses the manufacturer's firmware commands to acquire data from the medical device. Most firmwares output data in a cryptic fashion, so the vendor receiving it is supposed to scale it by doing some calculations on it, in order to figure out the true value.
It's safe to assume that medical data precision is as important as financial data precision.
I've come to the conclusion of using BigDecimal to do all numerical calculations and store the final value. I'll be receiving a new set of data almost every second, which means I'll be doing calculations and updating the same set of values every second. Example: data coming across from a ventilator for each breath.
Since BigDecimal is immutable, I'm worried about the number of objects generated in the heap every second, especially since the tool will have to scale up to read data from, let's say, 50 devices at the same time.
I can increase the heap size and all that, but still, here are my questions...
Questions
Is there any mutable cousin of BigDecimal I could use?
Is there any existing open-source framework for doing something like this?
Is Java the right language for this kind of functionality?
Should I look into Apfloat? But Apfloat is immutable too. How about JScience?
Is there any math library for Java I can use for high precision?
I'm aiming for a precision of up to 10 digits only; I don't need more than that. So what's the best library or course of action for this type of precision?

I would recommend that, before you jump to the conclusion that BigDecimal doesn't suit your needs, you actually profile your scenario. It is not a foregone conclusion that its immutable nature will have a significant impact in your case. A modern JVM is very good at allocating and destroying large quantities of short-lived objects.
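For context, here is a minimal sketch of the kind of per-reading scaling the question describes, using BigDecimal with a 10-digit MathContext. The raw value, scale factor and offset are hypothetical stand-ins; a real device's protocol documentation defines its own conversion.

import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class ReadingScaler {
    // 10 significant digits, matching the precision target stated in the question
    private static final MathContext MC = new MathContext(10, RoundingMode.HALF_UP);

    static BigDecimal scale(long rawValue, BigDecimal factor, BigDecimal offset) {
        // trueValue = raw * factor + offset, rounded to 10 significant digits
        return BigDecimal.valueOf(rawValue).multiply(factor, MC).add(offset, MC);
    }

    public static void main(String[] args) {
        BigDecimal factor = new BigDecimal("0.001220703125"); // hypothetical: 5/4096 of full scale
        BigDecimal offset = new BigDecimal("-40");             // hypothetical zero offset
        System.out.println(scale(53248, factor, offset));      // prints 25.00000000
    }
}

Profiling a loop over a method like this at the real data rate (50 devices at one reading per second is only 50 calls per second) will show very quickly whether the short-lived BigDecimal garbage is even measurable.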

The double primitive type offers approximately 16 decimal digits of precision, so why not just use that? Then you won't be touching the heap at all.

The garbage collector should do a decent enough job of cleaning the objects up. However, if you still want to get around the immutability, you can always access the backing fields of BigDecimal using reflection; that way you can create a wrapper class around BigDecimal that does what you want.

If you only need 10 digits of precision you can simply use a double, which gives you about 15-16 decimal digits.

You say specifically that you need only 10 significant digits. You don't say whether you mean binary or decimal digits, but standard 64-bit IEEE floating point (Java double) offers a 53-bit significand (roughly 15-16 decimal digits), which sounds like it more than meets your needs.
However, I do recommend that you put some thought into the numerical stability of whatever operations you apply to the input numbers. For example, Math.log() and Math.exp() can have unexpected effects depending on the range of the inputs (in some cases, you might find Math.log1p() and Math.expm1() to be more appropriate, but again, that depends on the specific operations you're performing).
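To make that point concrete, here is a tiny example (the value of x is arbitrary) showing why log1p exists: for x very close to zero, 1.0 + x rounds away most of x before the logarithm is even taken.

public class StabilityDemo {
    public static void main(String[] args) {
        double x = 1e-15;
        System.out.println(Math.log(1.0 + x)); // about 1.11e-15: roughly 10% off the true value
        System.out.println(Math.log1p(x));     // about 1.00e-15: correct to machine precision
    }
}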

Related

Why would I use byte, double, long, etc. when I could just use int?

I'm a pretty new programmer and I'm using Java. My teacher said that there were many types of integers. I don't know when to use them. I know they have different sizes, but why not use the biggest size all the time? Any reply would be awesome!!!
Sometimes, when you're building massive applications that could take up 2+ GB of memory, you really want to be restrictive about what primitive type you want to use. Remember:
int takes up 32 bits of memory
short takes up 16 bits of memory, 1/2 that of int
byte is even smaller, 8 bits.
See this java tutorial about primitive types: http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
The space taken up by each type really matters if you're handling large data sets. For example, if your program has an array of 1 million ints, then you're taking up about 3.81 MB of RAM. Now let's say you know for certain that those 1,000,000 numbers are only going to be in the range of 1-10. Why not, then, use a byte array? 1 million bytes only take up about 976 kilobytes, less than 1 MB.
You always want to use the number type that is just "large" enough to fit, just as you wouldn't put an extra-large T-shirt on a newborn baby.
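Here is a quick sketch of that arithmetic, with the per-element sizes spelled out in comments (array headers and JVM layout details vary, but the roughly 4x ratio is the point):

public class ArrayFootprint {
    public static void main(String[] args) {
        int n = 1000000;
        int[] asInts = new int[n];    // ~4 bytes per element -> ~4,000,000 bytes (~3.81 MiB) plus the array header
        byte[] asBytes = new byte[n]; // ~1 byte per element  -> ~1,000,000 bytes (~976 KiB)  plus the array header
        // If every value fits in -128..127 (like the 1-10 range above), the byte[]
        // carries the same information in roughly a quarter of the memory.
        System.out.println(asInts.length + " ints vs " + asBytes.length + " bytes");
    }
}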
So you could. Memory is cheap these days and assuming you are writing a simple program it is probably not a big deal.
Back in the days when memory was expensive, you needed to be a lot more careful with how much memory you use.
Let's say you are word processing on an IBM 5100, one of the first PCs from the 70s, which came with as little as 16 KB of RAM (unimaginable these days). If you used 64-bit values all day, you could keep at most 2,048 characters without leaving any memory for the word-processing program itself; that's not enough to hold what I'm typing right now!
Knowing that English has a limited number of characters and symbols, if you choose ASCII to represent the text, you would use 8 bits (1 byte) per character, which lets you go up to about 16,000 characters, and that's quite a bit more room for typing.
Generally you will use a data type that's just big enough to hold the biggest number you might need, to save on memory. Let's say you are writing a database for the IRS to keep track of all the tax IDs: if you are able to save 1 bit of memory per record, that's billions of bits, hundreds of megabytes, of memory savings.
The ones that can hold higher numbers use more memory. Using more memory is bad. One int versus one byte is not a big difference right now, but if you write big programs in the future, the memory used adds up.
Also, you said double in the title. A double is not like an int. It does hold a number, but it can have decimal places (e.g. 2.36), unlike an int, which can only hold whole numbers like 8.
Because we like to be professional, and also because a byte uses less memory than an int, and a double can include decimal places.

Which is faster, int to String or String to int?

This may seem like a fairly basic question, for which I apologise in advance.
I'm writing an Android app that uses a set of predefined numbers. At the moment I'm working with int values, but at some point I will probably need to use float and double values, too.
The numbers are used for two things. First, I need to display them to the user, for which I need a String (I'm creating a custom View and drawing the String on a Canvas). Second, I will be using them in a sort of calculator, for which they obviously need to be int (or float/double).
Since the numbers are the same whether they are used as String or int, I only want to store them once (this will also reduce errors if I need to change any of them; I'll only need to change them in the one place).
My question is: should I store them as String or as int? Is it faster to write an int as a String, or to parse an int from a String? My gut tells me that parsing would take more time/resources, so I should store them as ints. Am I right?
Actually, your gut may be wrong (and I emphasise may, see my comments below on measuring). To convert a string to an integer requires a series of multiply/add operations. To convert an integer to a string requires division/modulo. It may well be that the former is faster than the latter.
But I'd like to point out that you should measure, not guess! The landscape is littered with the corpses of algorithms that relied on incorrect assumptions.
I would also like to point out that, unless your calculator is expected to do huge numbers of calculations each second (and I'm talking millions if not billions), the difference will almost certainly be irrelevant.
In the vast majority of user-interactive applications, 99% of all computer time is spent waiting for the user to do something.
My advice is to do whatever makes your life easier as a developer and worry about performance if (and only if) it becomes an issue. And, just to clarify, I would suggest that storing them in native form (not as strings) would be easiest for a calculator.
I did a test on a 1,000,000-element array of int and String. I only timed the conversions, and the results say:
Case 1, from int to String: 1,000,000 conversions in an average of 344 ms
Case 2, from String to int: 1,000,000 conversions in an average of 140 ms
Conclusion: your gut was wrong :)!
And I join the others in saying that this is not what is going to make your application slow. Better to concentrate on making it simpler and safer.
I'd say that's not really relevant. What should matter more is type safety: since you have numbers int (or float and double) would force you to use numbers and not store "arbitrary" data (which String would allow to some extent).
The best thing is to do a bench test. Write two loops:
one that converts 100,000 values from numeric to String
one that converts 100,000 values from String to numeric
And measure the time elapsed by getting System.currentTimeMillis() before and after each loop; a minimal sketch follows.
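Here is one way such a timing harness could look (loop sizes and the test value are arbitrary; the sink variable only exists so the JIT cannot discard the conversions as dead code):

public class ConversionBench {
    public static void main(String[] args) {
        int n = 1000000;   // arbitrary; big enough for currentTimeMillis() to be meaningful
        long sink = 0;     // consumed below so the JIT cannot remove the loops entirely

        long start = System.currentTimeMillis();
        for (int i = 0; i < n; i++) {
            String s = Integer.toString(i);       // int -> String
            sink += s.length();
        }
        long intToString = System.currentTimeMillis() - start;

        start = System.currentTimeMillis();
        for (int i = 0; i < n; i++) {
            sink += Integer.parseInt("123456");   // String -> int
        }
        long stringToInt = System.currentTimeMillis() - start;

        System.out.println("int->String: " + intToString + " ms, String->int: "
                + stringToInt + " ms (sink=" + sink + ")");
    }
}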
But personally, if I needed to do calculations on these numbers, I would store them in their native format (int or float) and only convert them to String for display. This is more a question of design and maintainability than of execution speed. Focusing on execution speed is sometimes counterproductive: gaining a few microseconds nobody will notice is not worth sacrificing the design and the robustness (of course, some compromises may have to be made when it really is a question of saving a lot of CPU time). This reading may interest you.
A human who is using the calculator will not notice a performance difference, but, as others have said, using strings as your internal representation is a bad idea since you don't get type safety that way.
You will most likely get into maintenance problems later on if you decide to use strings.
It's better design practice to have the view displayed to the user being derived from the underlying data, rather than the other way around - at some point you might decide to render the calculator using your own drawing functions or fixed images, and having your data as strings would be a pain here.
That being said, neither of these operations are particularly time consuming using modern hardware.
Parsing is a slow thing, printing a number is not. The internal representation as a number allows you to compute, which is probably what you intend to do with your numbers. Storing numbers as, well, numbers (ints, floats, decimals) also takes up less space than their string representations, so … you'll probably want to go with storing them as ints, floats, or whatever they are.
You are writing an application for mobile devices, where memory consumption is a huge deal.
Storing an int is cheap, storing a String is expensive. Go for int.
Edit: more explanation. Storing an int between -2^31 and 2^31-1 costs 32 bits, no matter what the number is. Storing it in a String costs 16 bits (one char) per digit of its base-10 representation, plus the String object's own overhead.

Java resources to do number crunching?

What are the best resources for learning 'number crunching' in Java? I am referring to things like correct methods of decimal number processing, best practices, APIs, notable idioms for performance, and common pitfalls (and their solutions) when coding number processing in Java.
This question seems a bit open ended and open to interpretation. As such, I will just give two short things.
1) Decimal precision - never assume that two floating point (or double) numbers are equal, even if you went through the exact same steps to calculate them both. Due to a number of issues with rounding in various situations, you often cannot be certain that a decimal number is exactly what you expect. If you do double myNumber = calculateMyNumber(), then do a bunch of things, and then come back and check if (myNumber == calculateMyNumber()), that evaluation could be false even if you have not changed the calculations done in calculateMyNumber().
2) There are limitations in the size and precision of numbers that you can keep track of. If you have int myNumber = 2000000000 and check if (myNumber * 2 < myNumber), that will actually evaluate to true, because myNumber * 2 overflows: the 32 bits allocated for an int aren't enough to hold a number that large, so the result wraps around and becomes smaller than it was before. Look into classes that encapsulate large numbers, such as BigInteger and BigDecimal.
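A small demonstration of that overflow, and of BigInteger as the escape hatch (values taken straight from the sentence above):

import java.math.BigInteger;

public class OverflowDemo {
    public static void main(String[] args) {
        int myNumber = 2000000000;
        // 4,000,000,000 does not fit in an int, so it wraps around to -294,967,296
        System.out.println(myNumber * 2 < myNumber);   // true
        // BigInteger grows as needed and gives the mathematically correct result
        System.out.println(BigInteger.valueOf(myNumber).multiply(BigInteger.valueOf(2))); // 4000000000
    }
}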
You will figure stuff like this out as a side effect if you study the computer representations of numbers, or binary representations of numbers.
First, you should learn about floating point math. This is not specific to java, but it will allow you to make informed decisions later about, for example, when it's OK to use Java primitives such as float and double. Relevant topics include (copied from a course that I took on scientific computing):
Sources of error: roundoff, truncation error, incomplete convergence, statistical error, program bugs.
Computer floating point arithmetic and the IEEE standard.
Error amplification through cancellation (see the sketch after this list).
Conditioning, condition number, and error amplification.
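As a concrete illustration of the cancellation item, here is a classic example: computing 1 - cos(x) directly for small x, versus the algebraically identical but stable form 2*sin(x/2)^2 (the value of x is arbitrary):

public class CancellationDemo {
    public static void main(String[] args) {
        double x = 1e-8;
        // cos(x) is so close to 1 that the subtraction cancels every significant digit:
        System.out.println(1.0 - Math.cos(x));   // 0.0 (the true value is about 5.0e-17)
        // The reformulated version avoids the subtraction entirely:
        double half = Math.sin(x / 2.0);
        System.out.println(2.0 * half * half);   // about 5.0e-17
    }
}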
This leads you to decisions about whether to use Java's BigDecimal, BigInteger, etc. There are lots of questions and answers about this already.
Next, you're going to hit performance, including both CPU and memory. You probably will find various rules of thumb, such as "autoboxing a lot is a serious performance problem." But as always, the best thing to do is profile your own application. For example, let's say somebody insists that some optimization is important, even though it affects legibility. If that optimization doesn't matter for your application, then don't do it!
Finally, where possible, it often pays to use a library for numerical stuff. Your first attempt probably will be slower and more buggy than existing libraries. For example, for goodness sake, don't implement your own linear programming routine.

Is it better to cast a whole expression, or just the variable of a different type?

I am using floats for some Android Java game graphics, but the Math library trig functions all return double, so I have to explicitly cast them.
I understand that floats are quicker to process than doubles, and I do not need high precision answers.
e.g. which is better:
screenXf = (float) (shipXf + offsetXf * Math.sin(headingf) - screenMinXf);
or
screenXf = shipXf + offsetXf * (float) (Math.sin(headingf)) - floatScreenMinXf;
I suppose other questions would be 'how can I test this on an emulator without other factors (e.g. PC services) confusing the issue?' and 'Is it going to be different on different hardware anyway?'
Oh dear, that's three questions. Life is never simple :-(
Consider using FloatMath.sin() instead.
FloatMath
Math routines similar to those found in Math. Performs computations on float values directly without incurring the overhead of conversions to and from double.
But note this blurb in the android docs:
http://developer.android.com/guide/practices/design/performance.html#avoidfloat
Designing for Performance
...
In speed terms, there's no difference between float and double on the more modern hardware. Space-wise, double is 2x larger. As with desktop machines, assuming space isn't an issue, you should prefer double to float.
Although this guy #fadden, purportedly one of the guys who wrote the VM, says:
Why are there so many floats in the Android API?
On devices without an FPU, the single-precision floating point ops are much faster than the double-precision equivalents. Because of this, the Android framework provides a FloatMath class that replicates some java.lang.Math functions, but with float arguments instead of double.
On recent Android devices with an FPU, the time required for single- and double-precision operations is about the same, and is significantly faster than the software implementation. (The "Designing for Performance" page was written for the G1, and needs to be updated to reflect various changes.)
His last sentence ("page ... needs to be updated") refers to the page I referenced above, so I wonder if he is referring to the sentence about "no difference" that I quoted.
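For what it's worth, here is how the two variants would look side by side; this is only a sketch, the variable values are made up, and android.util.FloatMath only compiles against an older Android SDK (it was later deprecated and removed):

public class TrigCastSketch {
    public static void main(String[] args) {
        float shipXf = 100f, offsetXf = 25f, headingf = 1.2f, screenMinXf = 10f;

        // Option 1: let Math.sin() work in double and cast the result once:
        float a = shipXf + offsetXf * (float) Math.sin(headingf) - screenMinXf;

        // Option 2: stay in float throughout (mainly a win on old FPU-less devices):
        float b = shipXf + offsetXf * android.util.FloatMath.sin(headingf) - screenMinXf;

        System.out.println(a + " " + b);
    }
}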
This is definitely dependent on the hardware. I know nothing about the target platforms, but on a current PC floats and doubles take the same amount of time, while floats were about twice as fast as doubles on an i386.
Unless your emulator can report the cycle count, you can't find this out on a PC, as its hardware has little in common with the hardware of the target platform. If the target platform were your PC, then I'd recommend http://code.google.com/p/caliper/ for this microbenchmark.

Anyone using short and byte primitive types, in real apps?

I have been programming in Java since 2004, mostly enterprise and web applications. But I have never used short or byte, other than in a toy program just to see how these types work. Even in a for loop of 100 iterations, we usually go with int. And I don't remember ever coming across any code which made use of byte or short, other than some public APIs and frameworks.
Yes, I know you can use a short or byte to save memory in large arrays, in situations where the memory savings actually matter. Does anyone care to practice that? Or is it just something in the books?
[Edited]
Using byte arrays for network programming and socket communication is quite common usage. Thanks, Darren, for pointing that out. Now how about short? Ryan gave an excellent example. Thanks, Ryan.
I use byte a lot. Usually in the form of byte arrays or ByteBuffer, for network communications of binary data.
I rarely use float or double, and I don't think I've ever used short.
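A minimal sketch of the kind of binary packing that answer refers to, using ByteBuffer; the message layout here (a short id, an int sequence number, a float reading) is entirely hypothetical:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class MessagePacker {
    static byte[] pack(short msgId, int sequence, float reading) {
        ByteBuffer buf = ByteBuffer.allocate(2 + 4 + 4).order(ByteOrder.BIG_ENDIAN);
        buf.putShort(msgId).putInt(sequence).putFloat(reading);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] wire = pack((short) 0x0101, 42, 21.5f);
        System.out.println(wire.length + " bytes on the wire"); // 10 bytes
    }
}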
Keep in mind that Java is also used on mobile devices, where memory is much more limited.
I used 'byte' a lot, in C/C++ code implementing functionality like image compression (i.e. running a compression algorithm over each byte of a black-and-white bitmap), and processing binary network messages (by interpreting the bytes in the message).
However I have virtually never used 'float' or 'double'.
The primary usage I've seen for them is while processing data with an unknown structure or even no real structure. Network programming is an example of the former (whoever is sending the data knows what it means but you might not), something like image compression of 256-color (or grayscale) images is an example of the latter.
Off the top of my head grep comes to mind as another use, as does any sort of file copy. (Sure, the OS will do it--but sometimes that's not good enough.)
The Java language itself makes it unreasonably difficult to use the byte or short types. Whenever you perform any operation on a byte or short value, Java promotes it to an int first, and the result of the operation is returned as an int. Also, they're signed, and there are no unsigned equivalents, which is another frequent source of frustration.
So you end up using byte a lot because it's still the basic building block of all things cyber, but the short type might as well not exist.
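A short illustration of the promotion rule being complained about here:

public class PromotionDemo {
    public static void main(String[] args) {
        byte a = 10, b = 20;
        // byte c = a + b;        // does not compile: a + b is promoted to int
        byte c = (byte) (a + b);  // an explicit narrowing cast is required

        short s = 1000;
        // s = s + 1;             // does not compile either...
        s += 1;                   // ...but compound assignment sneaks the cast in for you

        System.out.println(c + " " + s);
    }
}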
Until today I hadn't noticed how seldom I use them.
I've used byte for network-related stuff, but most of the time it was for my own tools/learning. In work projects these things are handled by frameworks (JSP, for instance).
Short? Almost never.
Long? Neither.
My preferred integer type is always int: for loops, counters, etc.
When data comes from somewhere else (a database, for instance) I use the proper type, but for literals I always use int.
I use bytes in lots of different places, mostly involving low-level data processing. Unfortunately, the designers of the Java language made bytes signed. I can't think of any situation in which having negative byte values has been useful. Having a 0-255 range would have been much more helpful.
I don't think I've ever used shorts in any proper code. I also never use floats (if I need floating point values, I always use double).
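For reference, the usual workaround for the signed-byte complaint is masking with 0xFF to recover the 0-255 value the byte was really carrying:

public class UnsignedByteDemo {
    public static void main(String[] args) {
        byte raw = (byte) 0xF0;     // stored as -16 because Java bytes are signed
        int unsigned = raw & 0xFF;  // 240, the 0-255 value the data actually carried
        System.out.println(raw + " -> " + unsigned);
    }
}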
I agree with Tom. Ideally, in high-level languages we shouldn't be concerned with the underlying machine representations. We should be able to define our own ranges or use arbitrary precision numbers.
When we are programming for electronic devices like mobile phones, we use byte and short. In that case we should take care with memory management.
It's perhaps more interesting to look at the semantics of int. Are those arbitrary limits and silent truncation what you want? Application-level code really wants arbitrary-sized integers; it's just that Java has no way of expressing those reasonably.
I have used bytes when saving State while doing model checking. In that application the space savings are worth the extra work. Otherwise I never use them.
I found I was using byte variables when doing some low-level image processing. The .Net GDI+ draw routines were really slow so I hand-rolled my own.
Most times, though, I stick with signed integers unless I am forced to use something larger, given the problem constraints. Any sort of physics modeling I do usually requires floats or doubles, even if I don't need the precision.
Apache POI was using short quite a few times, probably because of Excel's row/column number limits.
A few months ago they changed to int, replacing
createCell(short columnIndex)
with
createCell(int column).
On in-memory data grids, it can be useful.
The concept of a data grid like Gemfire is to have a huge distributed map.
When you don't have enough memory you can overflow to disk with an LRU strategy, but the keys of all entries of your map remain in memory (at least with Gemfire).
Thus it is very important to give your keys a small footprint, particularly if you are handling very large data sets.
For the entry values, when you can, it's also better to use the appropriate type with a small memory footprint...
I have used shorts and bytes in Java apps communicating with custom USB or serial micro-controllers to receive 10-bit values wrapped in 2 bytes as shorts.
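As an illustration, here is a sketch of unpacking such a 10-bit sample from two bytes; the bit layout is made up, since the real one depends on the micro-controller's protocol:

public class TenBitUnpacker {
    // Keep the low 2 bits of the high byte and all 8 bits of the low byte.
    static short unpack(byte high, byte low) {
        return (short) (((high & 0x03) << 8) | (low & 0xFF));
    }

    public static void main(String[] args) {
        System.out.println(unpack((byte) 0x02, (byte) 0xFF)); // 767, i.e. binary 10 1111 1111
    }
}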
bytes and shorts are extensively used in Java Card development. Take a look at my answer to Are there any real life uses for the Java byte primitive type?.
