Which is faster, int to String or String to int? - java

This may seem like a fairly basic question, for which I apologise in advance.
I'm writing an Android app that uses a set of predefined numbers. At the moment I'm working with int values, but at some point I will probably need to use float and double values, too.
The numbers are used for two things. First, I need to display them to the user, for which I need a String (I'm creating a custom View and drawing the String on a Canvas). Second, I will be using them in a sort of calculator, for which they obviously need to be int (or float/double).
Since the numbers are the same whether they are used as String or int, I only want to store them once (this will also reduce errors if I need to change any of them; I'll only need to change them in the one place).
My question is: should I store them as String or as int? Is it faster to write an int as a String, or to parse an int from a String? My gut tells me that parsing would take more time/resources, so I should store them as ints. Am I right?

Actually, your gut may be wrong (and I emphasise may, see my comments below on measuring). To convert a string to an integer requires a series of multiply/add operations. To convert an integer to a string requires division/modulo. It may well be that the former is faster than the latter.
But I'd like to point out that you should measure, not guess! The landscape is littered with the corpses of algorithms that relied on incorrect assumptions.
I would also like to point out that, unless your calculator is expected to do huge numbers of calculations each second (and I'm talking millions if not billions), the difference will almost certainly be irrelevant.
In the vast majority of user-interactive applications, 99% of all computer time is spent waiting for the user to do something.
My advice is to do whatever makes your life easier as a developer and worry about performance if (and only if) it becomes an issue. And, just to clarify, I would suggest that storing them in native form (not as strings) would be easiest for a calculator.
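To illustrate (just a minimal sketch with made-up values; the point is only that the String is derived from the int at display time, not stored separately):
public class CalculatorValues {
    // the predefined numbers, stored once, in native form
    private static final int[] VALUES = {1, 2, 5, 10, 20, 50};

    // used directly by the calculation code
    public static int valueAt(int index) {
        return VALUES[index];
    }

    // used by the custom View when drawing the text on the Canvas
    public static String displayTextAt(int index) {
        return String.valueOf(VALUES[index]);
    }
}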

I did a test on arrays of 1,000,000 ints and Strings. I only timed the conversions themselves, and the results say:
Case 1, from int to String: 1,000,000 conversions in an average of 344 ms
Case 2, from String to int: 1,000,000 conversions in an average of 140 ms
Conclusion: your gut was wrong :)!
And I join the others in saying: this is not what is going to make your application slow. Better to concentrate on making it simpler and safer.

I'd say that's not really relevant. What should matter more is type safety: since you are dealing with numbers, int (or float and double) would force you to store numbers and not "arbitrary" data (which String would allow to some extent).

The best thing is to run a benchmark. Write two loops:
one that converts 100,000 values from numeric to String
one that converts 100,000 values from String to numeric
And measure the elapsed time by reading System.currentTimeMillis() before and after each loop (see the sketch below).
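A rough sketch of such a benchmark (the array size is arbitrary; a serious measurement would also account for JIT warm-up and make sure the results are used so the loops are not optimised away):
public class ConversionBenchmark {
    public static void main(String[] args) {
        final int n = 100000;
        int[] ints = new int[n];
        String[] strings = new String[n];
        for (int i = 0; i < n; i++) {
            ints[i] = i;
            strings[i] = Integer.toString(i);
        }

        long checksum = 0; // consume results so the JIT cannot drop the loops

        long start = System.currentTimeMillis();
        for (int i = 0; i < n; i++) {
            checksum += Integer.toString(ints[i]).length(); // numeric -> String
        }
        System.out.println("int -> String: " + (System.currentTimeMillis() - start) + " ms");

        start = System.currentTimeMillis();
        for (int i = 0; i < n; i++) {
            checksum += Integer.parseInt(strings[i]); // String -> numeric
        }
        System.out.println("String -> int: " + (System.currentTimeMillis() - start) + " ms");

        System.out.println("(checksum: " + checksum + ")");
    }
}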
But personally, if I needed to do calculations on these numbers, I would store them in their native format (int or float) and only convert them to String for display. This is more a question of design and maintainability than a question of execution speed. Focusing on execution speed is sometimes counterproductive: gaining a few microseconds nobody will notice is not worth sacrificing the design and the robustness (of course, some compromises may have to be made when it is a question of saving a lot of CPU time). This reading may interest you.

A human who is using the calculator will not notice a performance difference, but, as others have said, using strings as your internal representation is a bad idea, since you lose type safety.
You will most likely get into maintenance problems later on if you decide to use strings.

It's better design practice to have the view displayed to the user being derived from the underlying data, rather than the other way around - at some point you might decide to render the calculator using your own drawing functions or fixed images, and having your data as strings would be a pain here.
That being said, neither of these operations is particularly time-consuming on modern hardware.

Parsing is a slow thing, printing a number is not. The internal representation as a number allows you to compute, which is probably what you intend to do with your numbers. Storing numbers as, well, numbers (ints, floats, decimals) also takes up less space than their string representations, so … you'll probably want to go with storing them as ints, floats, or whatever they are.

You are writing an application for mobile devices, where memory consumption is a huge deal.
Storing an int is cheap, storing a String is expensive. Go for int.
Edit: more explanation. Storing an int between -2^31 and 2^31-1 costs 32 bits, no matter what the number is. Storing it in a String costs 16 bits per digit of its base-10 representation.

Related

Best way to define a constant identifier?

In my java application one of my objects has exactly one value from a set of values. Now I wonder how to define them to increase the performance:
private static final String ITEM_TYPE1 = "type1";
private static final int ITEM_TYPE1 = 1;
Is defining an int better than a String? (I will need to convert the value to a String anyway, so I am inclined to define it as a String, but I fear for performance, because comparing ints is maybe simpler than comparing Strings.)
EDIT: I am aware of enums, but I just want to know whether ints perform better than Strings or not. This depends on how the JDK and JRE handle it under the hood (on Android, Dalvik or ART...).
In my java application one of my objects has exactly one value from a set of values
That is what java enums are for.
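A minimal sketch of what that could look like (the names are loosely based on the constants in the question, nothing more):
public enum ItemType {
    TYPE1("type1"),
    TYPE2("type2");

    private final String label;

    ItemType(String label) {
        this.label = label;
    }

    // a String form is still available for display or serialisation,
    // without giving up type safety in the rest of the code
    public String label() {
        return label;
    }
}
Comparing enum constants is a plain reference comparison (==), and each constant exists exactly once, so there is no int-versus-String performance question left to worry about.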
Regarding the question "do ints have better performance than strings": that is almost beside the point here.
You are talking about static constants. Even if they are used a hundred or a thousand times in your app, performance doesn't matter here. What matters is to write code that is easy to read and maintain. Then the JIT can kick in and turn it into nicely optimized machine code.
Please understand: premature optimisation is the root of all evil! Good or bad performance of your app depends on many other factors, definitely not on representing constants as ints or strings.
Beyond that: the type of some thing in Java should reflect its nature. If it is a string, make it a string (like when you want to mainly use it as string, and concatenate it to other strings). When you have numbers and deal with them as numbers, make it an int.
First of all, an int always has a fixed size in memory; on most systems it's 4 bytes (in Java it is always 4 bytes).
A String is a complex type, which means it takes not only the bytes of the actual string data but also additional data, such as the length of the string and so on.
So if you have the choice between String and int, you should always choose int. It takes up less space and is faster to operate on.

Using chars as digits,

I am writing a simulation that will use a lot of memory and needs to be fast. Instead of ints I am using chars (16 bits instead of 32). I need to operate on them as if these chars were ints.
To achieve that I have done something like
char a = 1;
char b = 2;
System.out.println(a*1 + b*1); // it gives me 3 in the console, so it has int-like behavior
I don't know what's going on "under the mask" when I multiply char with an integer. Is this the fastest way to do it?
Thank you for help!
Performance wise it's not worth using char instead of int, because all modern hardware architectures are optimized for 32- or 64-bit wide memory and register access.
The only reason to use char would be if you want to reduce the memory footprint, i.e. if you work with large amounts of data.
Additional info: Performance of built-in types : char vs short vs int vs. float vs. double
A char is simply a number (not a 32-bit int, but still a number) that is normally shown as its character representation. By multiplying it with an integer, the compiler performs an implicit widening conversion from char to int; that's why you get 3 printed on the console instead of the character it represents.
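A small sketch that makes the widening explicit (and shows the cast needed to get back to a char):
public class CharArithmetic {
    public static void main(String[] args) {
        char a = 1;
        char b = 2;

        int sum = a + b;         // both chars are promoted to int before the addition
        System.out.println(sum); // prints 3

        // the result of the arithmetic is an int, so storing it back
        // into a char requires an explicit cast
        char c = (char) (a + b);
        System.out.println((int) c); // prints 3
    }
}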
You should use ints. The size of the type will not affect the fetch speed of the data, nor the processing time, unless it is composed of multiple computer words, which is unlikely. There is no reason to do this. Moreover, chars are stored as integers and are therefore effectively the same size. An alternative would be to use small ints, but again this is not helpful for what you want.
Also, I notice your code has System.out.println, which means you are using Java. There are several implications of this. The first is that no, you are not going to be going fast or using very little memory. Period, it will not happen. There is a large amount of overhead involved in running the JVM, JIT, garbage collector and other parts of the Java platform. If efficiency is a relevant factor, you are starting off wrong. The second implication is that your choice of data types will have little impact on processing time, because they will be mapped onto the same physical hardware. Only the virtual machine distinguishes between them, and in the case of primitives there is no difference anyway.

int in java runs out of 2 billion, any alternative?

I used an int as the PK in a Java application, and now it has reached the max int value (about 2 billion). The DB can store numbers larger than that, but a Java int can only hold around 2 billion.
I am unable to change int to long to align with the DB, because it's a huge effort.
Apart from that, does anybody have another approach?
The maximum value for an Integer in Java is 2,147,483,647. If you need a bigger number, you'll have to change to a long. You could also use the negative range of the Integer, but if you've already hit the maximum, the likelihood is that you'll run out of room pretty soon.
However, if you don't have 2 billion elements in the DB, you could reuse the unused primary keys. This would probably be inefficient, because you'd have to search for unused keys.
I'd suggest just going through the effort of changing the code. Putting in the effort now will pay off in the long run.
I am unable to change int to long to align to DB. because it's huge effort.
You have no alternatives in the long term. Start coding.
Actually, if you are methodical about it, you will probably find that it is not a huge effort at all. Java IDEs are good for helping you with this sort of change.
#jjnguy suggested you let the keys wrap around to negative. That would give you 2 billion or so extra keys, but:
you will probably use up the second 2 billion quicker than the first 2 billion, and
it is possible that your application (or the database) depends on keys always increasing.
So I would avoid that, unless roll-over was imminent.
I'm assuming you haven't used negative values for the IDs. If that's the case, you can let the value overflow to negative values. This will give you some time to refactor your code to use a long to store your data.
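Just to illustrate what the wrap-around looks like (int arithmetic overflows silently in Java, no exception is thrown):
public class KeyOverflow {
    public static void main(String[] args) {
        int key = Integer.MAX_VALUE; // 2,147,483,647
        key++;                       // wraps around silently
        System.out.println(key);     // -2147483648, i.e. Integer.MIN_VALUE
    }
}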
You just need to replace int with a larger type such as long; there is no other way.
Everyone here suggests changing to long, which also seems to me the most straightforward approach. However, you ask for a different one.
You could also create another column with long values, copy the values over from the PK, and set that column as the PK from then on. Although I see this as technically more work, maybe for your situation it is better, and strictly speaking it is an answer to your question about another approach.
(Maybe you have some sort of sharded setup with a thousand shards which you can't possibly all swap over at the same time.) Either way, be very careful! Run tests!
Use the BigInteger class in Java. You can get large values, and by large I mean really very large. Please refer to this link: How to use BigInteger?
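For example (this only shows that BigInteger itself has no fixed limit; whether it maps sensibly onto your database's key type is a separate question):
import java.math.BigInteger;

public class BigIntegerDemo {
    public static void main(String[] args) {
        BigInteger key = BigInteger.valueOf(Long.MAX_VALUE);
        key = key.add(BigInteger.ONE);  // still exact, no overflow
        System.out.println(key);        // 9223372036854775808
    }
}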

How to perform high precision calculations with mutable class in Java?

I'm designing a tool using Java 6, that will read data from Medical Devices.
Each Medical Device manufacturer implements its own Firmware/Protocol. Vendors (like me) write their own interface that
uses the manufacturer's firmware commands to acquire data from the Medical Device. Most firmwares will output data in a cryptic fashion, so the vendor receiving it is supposed to scale it by doing some calculations on it, in order to figure out the true value.
It's safe to assume that medical data precision is as important as financial data precision, etc.
I've come to the conclusion of using BigDecimal to do all numerical calculations and store the final value. I'll be receiving a new set of data almost every second, which means I'll be doing calculations and updating the same set of values every second. Example: data coming across from a ventilator for each breath.
Since BigDecimal is immutable, I'm worried about the number of objects generated in the heap every second, especially since the tool will have to scale up to read data from, let's say, 50 devices at the same time.
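For reference, the per-sample scaling I have in mind looks roughly like this (the scale factor and names are made up; the point is that every operation returns a new BigDecimal, since the class is immutable):
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class DeviceScaler {
    // hypothetical calibration factor from the device protocol
    private static final BigDecimal SCALE = new BigDecimal("0.0153");

    public static BigDecimal scale(int rawSample) {
        // each call allocates new BigDecimal instances
        return new BigDecimal(rawSample)
                .multiply(SCALE, MathContext.DECIMAL64)
                .setScale(10, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println(scale(4096)); // 62.6688000000
    }
}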
I can increase the heap size and all that, but still, here are my questions...
Questions
Is there any mutable cousin of BigDecimal I could use?
Is there any existing opensource framework to do something like this?
Is Java the right language for this kind of functionality?
Should I look into Apfloat? But Apfloat is immutable too. How about JScience?
Is there any math library for Java I can use for high precision?
I'm aiming for a precision of up to 10 digits only. I don't need more than that. So what's the best library or course of action for this type of precision?
I would recommend that, before you jump to the conclusion that BigDecimal doesn't suit your needs, you actually profile your scenario. It is not a foregone conclusion that its immutable nature is going to have a significant impact in your scenario. A modern JVM is very good at allocating and destroying large quantities of objects.
The double primitive type offers approximately 16 digits of decimal precision, why not just use that? Then you won't be touching the heap at all.
The garbage collector should do a decent enough job of cleaning the objects up. However, if you still want to avoid immutable numbers, you can always access the backing data of BigDecimal using reflection; that way you can create a wrapper class around BigDecimal that does what you want.
If you only need 10 digits of precision you can simply use a double, which has 16 digits of precision.
You say specifically that you need only 10 significant digits. You don't say whether you're considering binary or decimal digits, but standard 64-bit IEEE floating point (Java double) offers 52 binary digits (approximately 16 decimal digits), which sounds like it more than meets your needs.
However, I do recommend that you put some thought into the numerical stability of whatever operations you apply to the input numbers. For example, Math.log() and Math.exp() can have unexpected effects depending on the range of the inputs (in some cases you might find Math.log1p() and Math.expm1() to be more appropriate, but again, that depends on the specific operations you're performing).
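A tiny example of the kind of effect I mean (the exact printed digits will vary, but the first line visibly loses accuracy for small x while the second does not):
public class NumericalStability {
    public static void main(String[] args) {
        double x = 1e-12;

        // naive form: 1.0 + x already rounds away part of x
        System.out.println(Math.log(1.0 + x));

        // log1p(x) computes log(1 + x) without forming 1 + x first,
        // so it stays accurate for small x (Math.expm1 is the counterpart for exp(x) - 1)
        System.out.println(Math.log1p(x));
    }
}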

What data-structure should I use to create my own "BigInteger" class?

As an optional assignment, I'm thinking about writing my own implementation of the BigInteger class, where I will provide my own methods for addition, subtraction, multiplication, etc.
This will be for arbitrarily long integer numbers, even hundreds of digits long.
While doing the math on these numbers digit by digit isn't hard, what do you think the best data structure would be to represent my "BigInteger"?
At first I was considering using an array, but then I was thinking I could still potentially overflow (run out of array slots) after a large addition or multiplication. Would this be a good case for a linked list, since I can tack on digits with O(1) time complexity?
Is there some other data-structure that would be even better suited than a linked list? Should the type that my data-structure holds be the smallest possible integer type I have available to me?
Also, should I be careful about how I store my "carry" variable? Should it, itself, be of my "BigInteger" type?
Check out the book C Interfaces and Implementations by David R. Hanson. It has 2 chapters on the subject, covering the vector structure, word size and many other issues you are likely to encounter.
It's written for C, but most of it is applicable to C++ and/or Java. And if you use C++ it will be a bit simpler because you can use something like std::vector to manage the array allocation for you.
Always use the smallest int type that will do the job you need (bytes). A linked list should work well, since you won't have to worry about overflowing.
If you use binary trees (whose leaves are ints), you get all the advantages of the linked list (unbounded number of digits, etc.) with simpler divide-and-conquer algorithms. In this case you do not have a single base but many, depending on the level at which you're working.
If you do this, you need to use a BigInteger for the carry. You may consider it an advantage of the "linked list of ints" approach that the carry can always be represented as an int (and this is true for any base, not just base 10, as most answers seem to assume you should use... in any base, the carry is always a single digit).
I might as well say it: it would be a terrible waste to use base 10 when you can use 2^30 or 2^31.
Accessing elements of linked lists is slow. I think arrays are the way to go, with lots of bound checking and run time array resizing as needed.
Clarification: Traversing a linked list and traversing an array are both O(n) operations. But traversing a linked list requires dereferencing a pointer at each step. Just because two algorithms have the same complexity does not mean they take the same time to run. The overhead of allocating and deallocating n nodes in a linked list will also be much heavier than the memory management of a single array of size n, even if the array has to be resized a few times.
Wow, there are some… interesting answers here. I'd recommend reading a book rather than try to sort through all this contradictory advice.
That said, C/C++ is also ill-suited to this task. Big-integer arithmetic is a kind of extended-precision math. Most CPUs provide instructions to handle extended-precision math at comparable or the same speed (bits per instruction) as normal math. When you add, say, 2^31 + 2^31 in 32-bit arithmetic, the answer wraps to 0… but there is also a special carry output from the processor's ALU which a program can read and use.
C++ cannot access that flag, and there's no way in C either. You have to use assembler.
Just to satisfy curiosity, you can use the standard Boolean arithmetic to recover carry bits etc. But you will be much better off downloading an existing library.
I would say an array of ints.
An array is indeed a natural fit. I think it is acceptable to throw an OverflowException when you run out of space in memory. The teacher will see attention to detail.
A multiplication roughly doubles the number of digits; an addition increases it by at most 1. It is easy to create a sufficiently big array to store the result of your operation.
The carry in multiplication is at most a one-digit number (9*9 = 81, i.e. digit 1, carry 8). A single int will do.
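A sketch of that addition, using a little-endian int array of digits (base 10 here for readability; the same code works with digits in base 2^30, and the carry never needs more than a single int):
public class DigitAddition {
    static final int BASE = 10;

    // a and b hold digits least-significant first, e.g. 123 -> {3, 2, 1}
    static int[] add(int[] a, int[] b) {
        int n = Math.max(a.length, b.length);
        int[] result = new int[n + 1]; // one extra slot for a final carry
        int carry = 0;
        for (int i = 0; i < n; i++) {
            int da = i < a.length ? a[i] : 0;
            int db = i < b.length ? b[i] : 0;
            int sum = da + db + carry;
            result[i] = sum % BASE;
            carry = sum / BASE;
        }
        result[n] = carry;
        return result;
    }

    public static void main(String[] args) {
        // 987 + 46 = 1033 -> [3, 3, 0, 1]
        System.out.println(java.util.Arrays.toString(
                add(new int[]{7, 8, 9}, new int[]{6, 4})));
    }
}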
std::vector<bool> or std::vector<unsigned int> is probably what you want. You will have to push_back() or resize() on them as you need more space for multiplications, etc. Also, remember to push_back the correct sign bits if you're using two's complement.
I would say a std::vector of char (since it only has to hold 0-9), if you plan to work in BCD.
If not BCD, then use a vector of int (you didn't make it clear).
Much less space overhead than a linked list.
And all the advice says 'use vector unless you have a good reason not to'.
As a rule of thumb, use std::vector instead of std::list, unless you need to insert elements in the middle of the sequence very often. Vectors tend to be faster, since they are stored contiguously and thus benefit from better spatial locality (a major performance factor on modern platforms).
Make sure you use elements that are natural for the platform. If you want to be platform independent, use long. Remember that unless you have some special compiler intrinsics available, you'll need a type at least twice as large to perform multiplication.
I don't understand why you'd want carry to be a big integer. Carry is a single bit for addition and element-sized for multiplication.
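In Java terms, the double-width trick looks like this (the limb values are chosen arbitrarily; the low 32 bits stay in place and the high 32 bits become the carry into the next limb):
public class LimbMultiply {
    public static void main(String[] args) {
        int a = 0xFFFFFFFF; // a limb, treated as unsigned
        int b = 0xFFFFFFFF;

        long product = (a & 0xFFFFFFFFL) * (b & 0xFFFFFFFFL); // full 64-bit product
        int low = (int) product;      // lower 32 bits of the result
        long carry = product >>> 32;  // upper 32 bits, carried into the next limb

        System.out.println(Long.toHexString(product)); // fffffffe00000001
        System.out.println(Integer.toHexString(low));  // 1
        System.out.println(Long.toHexString(carry));   // fffffffe
    }
}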
Make sure you read Knuth's Art of Computer Programming, algorithms pertaining to arbitrary precision arithmetic are described there to a great extent.
