I have written code implementing Euler's method to find an approximate value for x(10) and compare it to the value of x(10) given by the exact solution of the separable ODE. However, my code displays a chaotic number for x(10). Can you please identify the major error?
Thank you.
//@(#)euler.java
//This method attempts to find solutions to dx/dt = (e^t)(sin(x)) via
//Euler's iterative method and find an approximate value for x(10)
import java.text.DecimalFormat;

public class euler
{
    public static void main(String[] Leonhard)
    {
        DecimalFormat df = new DecimalFormat("#.0000");
        double h = (1.0/3.0); // h is the step-size
        double t_0 = 0; // initial condition
        double x_0 = .3; // initial condition
        double x_f = 10; // I want to find x(10) using this method and compare it to an exact value of x(10)
        double[] t_k;
        t_k = new double[ (int)( ( x_f - x_0 ) / h ) + 1 ] ; // these two arrays hold the values of x_k and t_k
        double[] x_k;
        x_k = new double[ (int)( ( x_f - x_0 ) / h ) + 1 ] ;
        int i; // the counter
        System.out.println( "k\t t_k\t x_k" ); // table header
        for ( i = 0; i < (int)( ( x_f - x_0 ) / h ) + 1; i++ )
        {
            if ( i == 0 ) // this if statement handles the initial conditions
            {
                t_k[i] = t_0;
                x_k[i] = x_0;
            }
            else if ( i > 0 )
            {
                t_k[i] += i*h;
                x_k[i] = x_k[i-1] + h*( Math.exp(t_k[i-1]))*(Math.sin(x_k[i-1]) );
            }
            System.out.println( i + " " + df.format(t_k[i]) + " " + df.format( x_k[i]) );
        }
    }
}
Your code seems to work. The problem is that Euler's method is a fairly simplistic way of approximately integrating a differential equation. Its accuracy is strongly dependent upon the step size you're using, as you noticed.
I ran your code and compared it with another implementation of the same algorithm. The results overlap in the regime where the approximation is working, and for quite a while beyond. They only differ once the method breaks down strongly.
A thing to note is that the Euler method doesn't work very well for this particular differential equation at the point you wish to reach. A step size of 1/3 is much too big to begin with, but even if you choose a much smaller step size, e.g. 1/10000, the method tends to break down before reaching t=10. Something like exp(t)sin(x) is hard to deal with. The real solution becomes flat, approaching pi, so sin(x) should go to zero, making the derivative zero as well. However, exp(t) blows up, so any small error in x is multiplied by a huge factor and the computed derivative is numerically unstable.
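To make the step-size dependence concrete, here is a minimal sketch (not the asker's program) that runs forward Euler with several step counts and compares the result with the closed-form solution. Separating variables gives ln tan(x/2) = e^t + C, so with x(0) = 0.3 the exact solution is x(t) = 2 atan(tan(0.15) exp(e^t - 1)), which tends to pi. The names EulerDemo, euler and exact are illustrative assumptions.

```java
// Sketch: forward Euler for dx/dt = e^t * sin(x), x(0) = 0.3, compared with
// the closed-form solution obtained by separating variables.
// EulerDemo, euler and exact are illustrative names, not from the original post.
class EulerDemo {
    // Right-hand side of the ODE.
    static double f(double t, double x) {
        return Math.exp(t) * Math.sin(x);
    }

    // Forward Euler from t = 0 to t = tEnd using `steps` equal steps.
    static double euler(double x0, double tEnd, int steps) {
        double h = tEnd / steps;
        double x = x0;
        for (int k = 0; k < steps; k++) {
            x += h * f(k * h, x);
        }
        return x;
    }

    // Exact solution: x(t) = 2 * atan( tan(x0/2) * exp(e^t - 1) ).
    static double exact(double x0, double t) {
        return 2.0 * Math.atan(Math.tan(x0 / 2.0) * Math.exp(Math.exp(t) - 1.0));
    }

    public static void main(String[] args) {
        double x0 = 0.3;
        for (double tEnd : new double[] {1.0, 10.0}) {
            System.out.println("t = " + tEnd + ", exact x = " + exact(x0, tEnd));
            for (int steps : new int[] {30, 1_000, 100_000}) {
                System.out.println("  steps = " + steps + ", Euler x = " + euler(x0, tEnd, steps));
            }
        }
    }
}
```

For tEnd = 1 the step counts should agree closely with the exact value. For tEnd = 10 the factor e^t grows to roughly 22026, so the Euler results can be expected to depend heavily on the step count.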
I have a program that takes in anywhere from 20,000 to 500,000 velocity vectors and must output these vectors multiplied by some scalar. The program allows the user to set a variable accuracy, which is basically just how many decimal places to truncate to in the calculations. The program is quite slow at the moment, and I discovered that it's not because of multiplying a lot of numbers, it's because of the method I'm using to truncate floating point values.
I've already looked at several solutions on here for truncating decimals, like this one, and they mostly recommend DecimalFormat. This works great for formatting decimals once or twice to print nice user output, but is far too slow for hundreds of thousands of truncations that need to happen in a few seconds.
What is the most efficient way to truncate a floating-point value to n number of places, keeping execution time at utmost priority? I do not care whatsoever about resource usage, convention, or use of external libraries. Just whatever gets the job done the fastest.
EDIT: Sorry, I guess I should have been more clear. Here's a very simplified version of what I'm trying to illustrate:
import java.util.*;
import java.lang.*;
import java.text.DecimalFormat;
import java.math.RoundingMode;

public class MyClass {
    static class Vector{
        float x, y, z;
        @Override
        public String toString(){
            return "[" + x + ", " + y + ", " + z + "]";
        }
    }

    public static ArrayList<Vector> generateRandomVecs(){
        ArrayList<Vector> vecs = new ArrayList<>();
        Random rand = new Random();
        for(int i = 0; i < 500000; i++){
            Vector v = new Vector();
            v.x = rand.nextFloat() * 10;
            v.y = rand.nextFloat() * 10;
            v.z = rand.nextFloat() * 10;
            vecs.add(v);
        }
        return vecs;
    }

    public static void main(String args[]) {
        int precision = 2;
        float scalarToMultiplyBy = 4.0f;
        ArrayList<Vector> velocities = generateRandomVecs();
        System.out.println("First 10 raw vectors:");
        for(int i = 0; i < 10; i++){
            System.out.print(velocities.get(i) + " ");
        }
        /*
        This is the code that I am concerned about
        */
        DecimalFormat df = new DecimalFormat("##.##");
        df.setRoundingMode(RoundingMode.DOWN);
        long start = System.currentTimeMillis();
        for(Vector v : velocities){
            /* Highly inefficient way of truncating*/
            v.x = Float.parseFloat(df.format(v.x * scalarToMultiplyBy));
            v.y = Float.parseFloat(df.format(v.y * scalarToMultiplyBy));
            v.z = Float.parseFloat(df.format(v.z * scalarToMultiplyBy));
        }
        long finish = System.currentTimeMillis();
        long timeElapsed = finish - start;
        System.out.println();
        System.out.println("Runtime: " + timeElapsed + " ms");
        System.out.println("First 10 multiplied and truncated vectors:");
        for(int i = 0; i < 10; i++){
            System.out.print(velocities.get(i) + " ");
        }
    }
}
The reason it is very important to do this is that a different part of the program will store trigonometric values in a lookup table. The lookup table will be generated to n places beforehand, so any velocity vector that has a float value to 7 places (e.g. 5.2387471) must be truncated to n places before lookup. Truncation is needed instead of rounding because in the context of this program, it is OK if a vector is slightly less than its true value, but not greater.
Lookup table for 2 decimal places:
...
8.03 -> -0.17511085919
8.04 -> -0.18494742685
8.05 -> -0.19476549993
8.06 -> -0.20456409661
8.07 -> -0.21434223706
...
Say I wanted to look up the cosines of each element in the vector {8.040844, 8.05813164, 8.065688} in the table above. Obviously, I can't look up these values directly, but I can look up {8.04, 8.05, 8.06} in the table.
What I need is a very fast method to go from {8.040844, 8.05813164, 8.065688} to {8.04, 8.05, 8.06}
The fastest way, which will introduce rounding error, is going to be to multiply by 10^n, call Math.rint, and to divide by 10^n.
That's...not really all that helpful, though, considering the introduced error, and -- more importantly -- that it doesn't actually buy anything. Why drop decimal points if it doesn't improve efficiency or anything? If it's about making the values shorter for display or the like, truncate then, but until then, your program will run as fast as possible if you just use full float precision.
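For reference, the scale-and-cast idea can be sketched as follows. This variant truncates toward zero with an integer cast rather than rounding with Math.rint, since the question asks for truncation. The names Truncate, key and truncate are illustrative assumptions. Because floats are binary, a value that sits extremely close to a decimal boundary may land one step lower than expected, and v * scale must stay within the int range.

```java
// Sketch: truncate toward zero to n decimal places by scaling and casting.
// Truncate, key and truncate are illustrative names, not from the original post.
class Truncate {
    // Returns v * scale with the fraction dropped, e.g. 8.040844 -> 804 for
    // scale = 100. The result can serve directly as a lookup-table index or key.
    static int key(float v, int scale) {
        return (int) (v * scale);
    }

    // Returns v truncated to the decimal places implied by scale (100 -> 2 places).
    static float truncate(float v, int scale) {
        return (float) key(v, scale) / scale;
    }

    public static void main(String[] args) {
        float[] values = {8.040844f, 8.05813164f, 8.065688f};
        for (float v : values) {
            System.out.println(v + " -> key " + key(v, 100) + ", truncated " + truncate(v, 100));
        }
    }
}
```

Using the integer key for the table lookup avoids converting back to a float at all, which sidesteps a second source of representation error.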
I have a crossword puzzle and a list of words which can be used to solve it (words can be placed multiple times or not even once). There is always a solution for the given crossword and word list.
I searched for clues on how to solve this problem and found out that it is NP-Complete. My maximal crossword size is 250 by 250, the maximal length of the list (amount of words which can be used to solve it) is 200. My goal is to solve crosswords of this size by brute force/backtracking, which should be possible within a few seconds (this is a rough estimation by me, correct me if I am wrong).
For example:
A list of given words which can be used to solve the crossword:
can
music
tuna
hi
The given empty crossword (X are fields which cannot be filled out, the empty fields need to be filled):
The solution:
Now my current approach is to represent the crossword as a 2-D array and search for empty spaces (2 iterations over the crossword). Then I match words to empty spaces depending on their length, then I try all combinations of words to empty spaces which have the same length. This approach got very messy very fast, I got lost trying to implement this, is there a more elegant solution?
The basic idea you have is pretty sensible:
Identify slots on the board.
Try each slot with each word that fits.
If every slot can be filled without conflict, it is solved.
It's an excellent plan.
The next step is to translate it into a design.
For a small program like this we can go straight to pseudo code.
The gist of it, as explained by other answers, is recursion:
1  Draw a slot from the slot pool.
2  If slot pool is empty (all slots filled), stop solving.
3  For each word with correct length:
4      If part of the slot is filled, check conflict.
5          If the word does not fit, continue the loop to next word.
       // No conflict
6      Fill the slot with the word.
       // Try next slot (down a level)
7      Recur from step 1.
8      If the recur found no solution, revert (take the word back) and try next.
   // None of them works
9  If no words yield a solution, an upper level needs to try another word.
   Revert (put the slot back) and go back.
Below is a short but complete example that I cooked up from your requirements.
There is more than one way to skin a cat.
My code swaps steps 1 and 2, and combines steps 4 to 6 in one fill loop.
Key points:
Use a formatter to fit the code to your style.
The 2D board is stored in a linear character array in row-major order.
This allows the board to be saved by clone() and restored by arraycopy.
On creation, the board is scanned for slots in two passes from two directions.
The two slot lists are solved by the same loop, differing mainly in how the slots are filled.
The recursion is logged, so you can see how it works.
Many assumptions are made: no single-letter slots, all words in the same case, the board is correct, etc.
Be patient. Learn whatever is new and give yourself time to absorb it.
Source:
import java.awt.Point;
import java.util.*;
import java.util.function.BiFunction;
import java.util.function.Supplier;
import java.util.stream.Stream;

public class Crossword {

    public static void main ( String[] args ) {
        new Crossword( Arrays.asList( "5 4 4\n#_#_#\n_____\n#_##_\n#_##_\ntuna\nmusic\ncan\nhi".split( "\n" ) ) );
        new Crossword( Arrays.asList( "6 6 4\n##_###\n#____#\n___#__\n#_##_#\n#____#\n##_###\nnice\npain\npal\nid".split( "\n" ) ) );
    }

    private final int height, width; // Board size
    private final char[] board; // Current board state. _ is unfilled. # is blocked. other characters are filled.
    private final Set<String> words; // List of words
    private final Map<Point, Integer> vertical = new HashMap<>(), horizontal = new HashMap<>(); // Vertical and horizontal slots

    private String indent = ""; // For formatting log
    private void log ( String message, Object... args ) { System.out.println( indent + String.format( message, args ) ); }

    private Crossword ( List<String> lines ) {
        // Parse input data
        final int[] sizes = Stream.of( lines.get(0).split( "\\s+" ) ).mapToInt( Integer::parseInt ).toArray();
        width = sizes[0]; height = sizes[1];
        board = String.join( "", lines.subList( 1, height+1 ) ).toCharArray();
        words = new HashSet<>( lines.subList( height+1, lines.size() ) );
        // Find horizontal slots then vertical slots
        for ( int y = 0, size ; y < height ; y++ )
            for ( int x = 0 ; x < width-1 ; x++ )
                if ( isSpace( x, y ) && isSpace( x+1, y ) ) {
                    for ( size = 2 ; x+size < width && isSpace( x+size, y ) ; size++ ); // Find slot size
                    horizontal.put( new Point( x, y ), size );
                    x += size; // Skip past this horizontal slot
                }
        for ( int x = 0, size ; x < width ; x++ )
            for ( int y = 0 ; y < height-1 ; y++ )
                if ( isSpace( x, y ) && isSpace( x, y+1 ) ) {
                    for ( size = 2 ; y+size < height && isSpace( x, y+size ) ; size++ ); // Find slot size
                    vertical.put( new Point( x, y ), size );
                    y += size; // Skip past this vertical slot
                }
        log( "A " + width + "x" + height + " board, " + vertical.size() + " vertical, " + horizontal.size() + " horizontal." );
        // Solve the crossword, horizontal first then vertical
        final boolean solved = solveHorizontal();
        // Show board, either fully filled or totally empty.
        for ( int i = 0 ; i < board.length ; i++ ) {
            if ( i % width == 0 ) System.out.println();
            System.out.print( board[i] );
        }
        System.out.println( solved ? "\n" : "\nNo solution found\n" );
    }

    // Helper functions to check or set board cell
    private char get ( int x, int y ) { return board[ y * width + x ]; }
    private void set ( int x, int y, char character ) { board[ y * width + x ] = character; }
    private boolean isSpace ( int x, int y ) { return get( x, y ) == '_'; }

    // Fit all horizontal slots, when success move to solve vertical.
    private boolean solveHorizontal () {
        return solve( horizontal, this::fitHorizontal, "horizontally", this::solveVertical );
    }
    // Fit all vertical slots, report success when done
    private boolean solveVertical () {
        return solve( vertical, this::fitVertical, "vertically", () -> true );
    }

    // Recur each slot, try every word in a loop. When all slots of this kind are filled successfully, run next stage.
    private boolean solve ( Map<Point, Integer> slot, BiFunction<Point, String, Boolean> fill, String dir, Supplier<Boolean> next ) {
        if ( slot.isEmpty() ) return next.get(); // If finished, move to next stage.
        final Point pos = slot.keySet().iterator().next();
        final int size = slot.remove( pos );
        final char[] state = board.clone();
        /* Try each word */ indent += "  "; // Two spaces, matching the two characters removed below
        for ( String word : words ) {
            if ( word.length() != size ) continue;
            /* If the word fit, recur. If recur success, done! */ log( "Trying %s %s at %d,%d", word, dir, pos.x, pos.y );
            if ( fill.apply( pos, word ) && solve( slot, fill, dir, next ) )
                return true;
            /* Doesn't match. Restore board and try next word */ log( "%s failed %s at %d,%d", word, dir, pos.x, pos.y );
            System.arraycopy( state, 0, board, 0, board.length );
        }
        /* No match. Restore slot and report failure */ indent = indent.substring( 0, indent.length() - 2 );
        slot.put( pos, size );
        return false;
    }

    // Try fit a word to a slot. Return false if there is a conflict.
    private boolean fitHorizontal ( Point pos, String word ) {
        final int x = pos.x, y = pos.y;
        for ( int i = 0 ; i < word.length() ; i++ ) {
            if ( ! isSpace( x+i, y ) && get( x+i, y ) != word.charAt( i ) ) return false; // Conflict
            set( x+i, y, word.charAt( i ) );
        }
        return true;
    }
    private boolean fitVertical ( Point pos, String word ) {
        final int x = pos.x, y = pos.y;
        for ( int i = 0 ; i < word.length() ; i++ ) {
            if ( ! isSpace( x, y+i ) && get( x, y+i ) != word.charAt( i ) ) return false; // Conflict
            set( x, y+i, word.charAt( i ) );
        }
        return true;
    }
}
Exercise: You can rewrite the recursion as iteration, which is faster and can support bigger boards.
Once that's done it can be made multi-threaded and run even faster.
You are right, the problem is NP-complete. So your best chance is to solve it by brute force (if you find a polynomial algorithm please tell me, we can both be rich =)).
I suggest you take a look at backtracking. It will allow you to write an elegant (and yet slow, given your input size) solution to the crossword problem.
If you need more inspirational material, take a look at this solver that uses backtracking as a method to navigate the solution tree.
Note that there are algorithms out there that might in practice perform better than pure brute force (even though they are still of exponential complexity).
Also, a quick search on Scholar reveals a good number of papers on the topic that you might want to take a look at, such as the following:
using genetic algorithm
using a probabilistic approach
A crossword puzzle is a constraint satisfaction problem, which is generally NP-complete, but there are many solvers that will apply efficient algorithms to a constraint problem that you specify. The Z3 SMT solver can solve these problems very easily and at scale. All you have to do is write a Java program that transforms the crossword puzzle into an SMT problem the solver can understand and then hands it to the solver. Z3 has Java bindings, so it should be pretty simple. I have written the Z3 code for solving the first example below. It should not be difficult for you to follow the pattern in your Java program to specify arbitrarily large crossword puzzles.
; Declare each possible word as string literals
(define-const str1 String "tuna")
(define-const str2 String "music")
(define-const str3 String "can")
(define-const str4 String "hi")
; Define a function that returns true if the given String is equal to one of the possible words defined above.
(define-fun validString ((s String)) Bool
(or (= s str1) (or (= s str2) (or (= s str3) (= s str4)))))
; Declare the strings that need to be solved
(declare-const unknownStr1 String)
(declare-const unknownStr2 String)
(declare-const unknownStr3 String)
(declare-const unknownStr4 String)
; Assert the correct lengths for each of the unknown strings.
(assert (= (str.len unknownStr1) 4))
(assert (= (str.len unknownStr2) 5))
(assert (= (str.len unknownStr3) 3))
(assert (= (str.len unknownStr4) 2))
; Assert each of the unknown strings is one of the possible words.
(assert (validString unknownStr1))
(assert (validString unknownStr2))
(assert (validString unknownStr3))
(assert (validString unknownStr4))
; Where one word in the crossword puzzle intersects another assert that the characters at the intersection point are equal.
(assert (= (str.at unknownStr1 1) (str.at unknownStr2 1)))
(assert (= (str.at unknownStr2 3) (str.at unknownStr4 1)))
(assert (= (str.at unknownStr2 4) (str.at unknownStr3 0)))
; Solve the model
(check-sat)
(get-model)
I recommend the Z3 SMT solver, but there are plenty of other constraint solvers. There is no need for you to implement your own constraint solving algorithm any more than there is a need for you to implement your own sorting algorithm.
To make this problem easier to solve, I'll break this down into smaller, easier problems. Note that I am not including code/algorithms, as I believe that will not help here (If we wanted the best Code, there would be indexes and databases and black magic that makes your head explode just seeing it). Instead, this answer tries to answer the question by talking about methods of thought that will help the OP tackle this problem (and future ones) using the method that works best for the reader.
What you need to know
This answer assumes you know how to do the following
Create and use Objects that have properties and functions
Pick a data structure that works (not necessarily good) for what you want to do with its contents.
Modeling your space
So, it's easy enough to load your crossword into an n by m matrix (2D array, hereafter 'grid'), but this is very hard to work with programmatically. So let's start by parsing your crossword from a grid into a legitimate object.
As far as your program needs to know, each entry in the crossword has 5 properties.
An X-Y coordinate in the grid for the first letter
A direction (down or across)
Word length
Word value
Map of bound indexes
Key: Index of word that is shared with another entry
Value: Entry that index is shared with
(You can make this a tuple and include the shared index from the other entry for easy reference)
You can find these in the grid based on these rules while scanning.
If Row_1_up is closed and Row_1_down is open, this is the start index of a down word. (Scan down for length. For bound indexes, the left or right space will be open. Scan left to get the linked entry's coord-id.)
Same as 1 but rotated for across words (You can do this at the same time as the scan for 1)
In your crossword object, you can store the entries using the coordinate+direction as the key for easy reference and easy conversion to/from text grid form.
Using your model
You should now have an object containing a collection of crossword entries, which contain their relevant index bindings. You now need to find a set of values that will satisfy all your entries.
Your entry objects should have helper methods like isValidEntry(str) that answer the question: given this value and the current state of the crossword, can I put this word here? By making each object in your model responsible for its own level of logic, the code for the problem one thought layer up can just call the logic without worrying about its implementation (in this example, your solver doesn't have to worry about the logic of whether a value is valid; it can just ask isValidEntry for that).
If you have done the above right, solving the problem is then a simple matter of iterating over all words for all entries to find a solution.
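As one possible reading of the model described above, the following sketch scans a grid for entries and gives each entry an isValidEntry check. All names here (Entry, CrosswordModel, findEntries, the '_' and '#' cell markers) are hypothetical choices for illustration, and the bound-index map is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the entry model. All names are hypothetical; '_' marks an open,
// unfilled cell and '#' a closed one. Any other character is a placed letter.
class Entry {
    final int row, col, length;
    final boolean across;

    Entry(int row, int col, int length, boolean across) {
        this.row = row; this.col = col; this.length = length; this.across = across;
    }

    // True if the word has the right length and agrees with every letter
    // already placed in this entry's cells.
    boolean isValidEntry(String word, char[][] grid) {
        if (word.length() != length) return false;
        for (int i = 0; i < length; i++) {
            char cell = across ? grid[row][col + i] : grid[row + i][col];
            if (cell != '_' && cell != word.charAt(i)) return false;
        }
        return true;
    }
}

class CrosswordModel {
    // A cell is open if it lies on the grid and is not blocked.
    static boolean open(char[][] g, int r, int c) {
        return r >= 0 && r < g.length && c >= 0 && c < g[r].length && g[r][c] != '#';
    }

    // A cell starts an entry when the previous cell in that direction is closed
    // (or off the grid) and the next cell in that direction is open.
    static List<Entry> findEntries(char[][] g) {
        List<Entry> entries = new ArrayList<>();
        for (int r = 0; r < g.length; r++) {
            for (int c = 0; c < g[r].length; c++) {
                if (!open(g, r, c)) continue;
                if (!open(g, r, c - 1) && open(g, r, c + 1)) {
                    int len = 1;
                    while (open(g, r, c + len)) len++;
                    entries.add(new Entry(r, c, len, true));
                }
                if (!open(g, r - 1, c) && open(g, r + 1, c)) {
                    int len = 1;
                    while (open(g, r + len, c)) len++;
                    entries.add(new Entry(r, c, len, false));
                }
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        char[][] grid = { "#_#_#".toCharArray(), "_____".toCharArray(),
                          "#_##_".toCharArray(), "#_##_".toCharArray() };
        for (Entry e : findEntries(grid)) {
            System.out.println((e.across ? "across" : "down") + " at (" + e.row + "," + e.col + "), length " + e.length);
        }
    }
}
```

A solver built on top of this only needs to loop over entries and candidate words and ask isValidEntry before placing a word, which is the separation of responsibilities the answer argues for.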
List of sub problems
For reference, here is my list of sub problems that you need to write something to solve.
How can I ideally model my work-space that is easy for me to work with?
For each piece of my model, what does it need to know? What logic can it handle for me?
How can I transform my text input into a usable model object?
How do I solve my problem using my model objects? (For you, it is iterate all words/all entries to find a valid set. Maybe using recursion)
I just implemented a solver in Scala for such puzzles. I am just using recursion to solve the problem. In short, for each word, I find all possible slots, pick a slot and fill it with the word, and try to solve the remaining partial puzzle recursively. If the puzzle cannot be filled with the rest of the words, it tries another slot, and so on; otherwise, the puzzle is solved.
Here is the link to my code:
https://github.com/mysilver/AMP/blob/master/Crossword.scala
public static double testElmanWithAnnealing(NeuralDataSet trainingSet,
        NeuralDataSet validation,int maxEpoch)
{
    // create an elman network
    ElmanPattern pattern = new ElmanPattern();
    pattern.setActivationFunction(new ActivationTANH());
    pattern.setInputNeurons(trainingSet.getInputSize());
    pattern.addHiddenLayer(8);
    pattern.setOutputNeurons(trainingSet.getIdealSize());
    BasicNetwork network = (BasicNetwork)pattern.generate();
    network.reset();
    // set up a hybrid strategy of resilient + simulated annealing
    CalculateScore score = new TrainingSetScore(trainingSet);
    final MLTrain trainAlt = new NeuralSimulatedAnnealing(
            network, score, 10, 2, 100);
    final MLTrain trainMain =
            new ResilientPropagation(network, trainingSet);
    trainMain.addStrategy(
            new HybridStrategy(trainAlt,0.00001,100,3));
    int epoch = 0;
    do {
        trainMain.iteration();
        System.out
                .println("Epoch #" + epoch + " Error:" + trainMain.getError());
        epoch++;
    } while(trainMain.getError() > 0.01 && epoch < maxEpoch);
    int trueStuff = 0;
    int falseStuff = 0;
    for(MLDataPair pair: validation ) {
        final MLData output = network.compute(pair.getInput());
        System.out.println(
                "actual=" + output.getData(0) + ",ideal=" + pair.getIdeal().getData(0));
        if(output.getData(0) * pair.getIdeal().getData(0) > 0)
            trueStuff++;
        else
            falseStuff++;
    }
    System.out.println("true classifications:" + trueStuff);
    System.out.println("false classifications:" + falseStuff);
    return network.calculateError(validation);
}
I have 8 floating-point inputs, normalized using a simple min/max scheme to values between -1 and 1.
I am trying to classify into either a negative value or a positive value (binary classification), so in the training and validation sets the ideal is either 1 or -1.
The network always produces the same result, or at most one or two distinct results. For example: -0.05686225929855484 around 90% of the time and some other values occasionally.
Am I using Encog wrong? Does anything in the code stand out to you as a bug?
Can I do anything to punish such behaviour of the neural network?
This is even worse than a random guess; surely there's a way to get better predictions.
Thanks in advance.
I tried to make a program (in Java) that calculates pi with the Chudnovsky algorithm but it has the output NaN (Not a Number). Please help me find mistakes in my code, or improve my code. (I don't have a lot of Java programming knowledge)
You can find Chudnovsky's algorithm here:
https://en.wikipedia.org/wiki/Chudnovsky_algorithm
here is my code:
package main;

public class Class1 {
    public static void main(String[] args)
    {
        double nr1=0,nr2=0,nr3=0,pi=0;
        int fo1=1, fo2=1, fo3=1;
        for(int i=0; i<=20; i++){
            for(int fl1=1; fl1<=(6*i); fl1++){fo1 = fo1 * fl1;}
            for(int fl2=1; fl2<=(3*i); fl2++){fo2 = fo2 * fl2;}
            for(int fl3=1; fl3<=(i); fl3++){fo3 = fo3 * fl3;}
            nr1 = ( (Math.pow(-1, i)) * (fo1) * ((545140134*i) + 13591409) );
            nr2 = ( (fo2) * (Math.pow(fo3, i)) * ( Math.pow(Math.pow(640320, 3), (i+(1/2)) )) );
            nr3 = 12 * (nr1/nr2);
        }
        pi = 1/nr3;
        System.out.println((Math.PI));
        System.out.println(pi);
    }
}
There are many issues here.
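As Andy mentioned, 1/2 is not 0.5: in integer arithmetic it evaluates to 0.
You are using integers to compute things like 120!, which is far outside the range of any primitive integer type (and cannot be represented exactly by a double either).
fo1, fo2 and fo3 should be re-initialized inside each iteration of the outer loop, otherwise they grow even bigger.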
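The first two points can be reproduced in isolation. The sketch below is only a demonstration with illustrative names (ArithmeticPitfalls, intFactorial, bigFactorial): it shows integer versus floating-point division, and an int factorial that silently wraps around from 13! onward next to an exact BigInteger one.

```java
import java.math.BigInteger;

// Sketch illustrating integer division and int overflow; names are illustrative.
class ArithmeticPitfalls {
    // Factorial in int arithmetic: silently wraps around once n >= 13.
    static int intFactorial(int n) {
        int result = 1;
        for (int i = 2; i <= n; i++) result *= i;
        return result;
    }

    // Factorial in arbitrary-precision arithmetic: exact for any n.
    static BigInteger bigFactorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) result = result.multiply(BigInteger.valueOf(i));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(1 / 2);    // integer division
        System.out.println(1 / 2.0);  // floating-point division
        System.out.println(intFactorial(13) + " vs " + bigFactorial(13));
    }
}
```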
It is not trivial to fix it. You can take a look at
Error calculating pi using the Chudnovsky algorithm - Java
and
http://www.craig-wood.com/nick/articles/pi-chudnovsky/
for some hints, but don't expect built-in primitive types to work with that algorithm.
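As a rough illustration of what a non-primitive version could look like, here is a sketch using BigInteger and BigDecimal. It is not taken from either linked page and favors clarity over speed. It uses the rearranged form pi = 426880 sqrt(10005) / S, where S is the sum over k of (6k)! (13591409 + 545140134 k) / ((3k)! (k!)^3 (-640320^3)^k). The class name, method names, the hand-written Newton square root and the choice of 5 terms at 60 digits are all assumptions made for this example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

// Sketch: Chudnovsky series with BigInteger/BigDecimal instead of primitives.
// ChudnovskyPi and its methods are illustrative names.
class ChudnovskyPi {
    static BigInteger factorial(int n) {
        BigInteger r = BigInteger.ONE;
        for (int i = 2; i <= n; i++) r = r.multiply(BigInteger.valueOf(i));
        return r;
    }

    // Newton iteration for the square root, starting from a double estimate.
    static BigDecimal sqrt(BigDecimal x, MathContext mc) {
        BigDecimal guess = new BigDecimal(Math.sqrt(x.doubleValue()), mc);
        BigDecimal two = BigDecimal.valueOf(2);
        // Each iteration roughly doubles the number of correct digits.
        for (int digits = 15; digits < 2 * mc.getPrecision(); digits *= 2) {
            guess = guess.add(x.divide(guess, mc)).divide(two, mc);
        }
        return guess;
    }

    // Sums the first `terms` terms of the series and returns pi to roughly
    // the precision of mc (each term contributes about 14 digits).
    static BigDecimal pi(int terms, MathContext mc) {
        BigInteger c3 = BigInteger.valueOf(640320).pow(3).negate(); // -640320^3
        BigDecimal sum = BigDecimal.ZERO;
        for (int k = 0; k < terms; k++) {
            BigInteger num = factorial(6 * k)
                    .multiply(BigInteger.valueOf(13591409L + 545140134L * k));
            BigInteger den = factorial(3 * k)
                    .multiply(factorial(k).pow(3))
                    .multiply(c3.pow(k));
            sum = sum.add(new BigDecimal(num).divide(new BigDecimal(den), mc), mc);
        }
        BigDecimal front = BigDecimal.valueOf(426880).multiply(sqrt(BigDecimal.valueOf(10005), mc), mc);
        return front.divide(sum, mc);
    }

    public static void main(String[] args) {
        System.out.println(pi(5, new MathContext(60)).toPlainString());
        System.out.println(Math.PI);
    }
}
```

Note that the factorials are recomputed from scratch for every term and (k!) is raised to the third power, which are the two places where the code in the question goes wrong beyond the overflow itself.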
I am trying to write a well-optimised bit of code to create a number X digits in length (where X is read from a runtime properties file), based on a DB-generated sequence number (Y), which is then used as a folder name when saving a file.
I've come up with three ideas so far, the fastest of which is the last one, but I'd appreciate any advice people may have on this...
1) Instantiate a StringBuilder with initial capacity X. Append Y. While length < X, insert a zero at pos zero.
2) Instantiate a StringBuilder with initial capacity X. While length < X, append a zero. Create a DecimalFormat based on StringBuilder value, and then format the number when it's needed.
3) Create a new int of Math.pow( 10, X ) and add Y. Use String.valueOf() on the new number and then substring(1) it.
The second one can obviously be split into outside-loop and inside-loop sections.
So, any tips? Using a for-loop of 10,000 iterations, I'm getting similar timings from the first two, and the third method is approximately ten-times faster. Does this seem correct?
Full test-method code below...
// Setup test variables
int numDigits = 9;
int testNumber = 724;
int numIterations = 10000;
String folderHolder = null;
DecimalFormat outputFormat = new DecimalFormat( "#,##0" );

// StringBuilder test
long before = System.nanoTime();
for ( int i = 0; i < numIterations; i++ )
{
    StringBuilder sb = new StringBuilder( numDigits );
    sb.append( testNumber );
    while ( sb.length() < numDigits )
    {
        sb.insert( 0, 0 );
    }
    folderHolder = sb.toString();
}
long after = System.nanoTime();
System.out.println( "01: " + outputFormat.format( after - before ) + " nanoseconds" );
System.out.println( "Sanity check: Folder = \"" + folderHolder + "\"" );

// DecimalFormat test
before = System.nanoTime();
StringBuilder sb = new StringBuilder( numDigits );
while ( sb.length() < numDigits )
{
    sb.append( 0 );
}
DecimalFormat formatter = new DecimalFormat( sb.toString() );
for ( int i = 0; i < numIterations; i++ )
{
    folderHolder = formatter.format( testNumber );
}
after = System.nanoTime();
System.out.println( "02: " + outputFormat.format( after - before ) + " nanoseconds" );
System.out.println( "Sanity check: Folder = \"" + folderHolder + "\"" );

// Substring test
before = System.nanoTime();
int baseNum = (int)Math.pow( 10, numDigits );
for ( int i = 0; i < numIterations; i++ )
{
    int newNum = baseNum + testNumber;
    folderHolder = String.valueOf( newNum ).substring( 1 );
}
after = System.nanoTime();
System.out.println( "03: " + outputFormat.format( after - before ) + " nanoseconds" );
System.out.println( "Sanity check: Folder = \"" + folderHolder + "\"" );
I would stop doing optimizations based on micro-benchmarks and go for something that looks elegant codewise, such as String.format("%0"+numDigits+"d", testNumber)
Use String.format("%0[length]d", i)
For length of 8 it would be
String out = String.format("%08d", i);
It's slower, but the time spent typing and debugging the more complex code will probably exceed the total extra time ever used during execution.
In fact, if you add up all the man-hours already spent discussing this, it most likely exceeds the execution time savings by a large factor.
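Since the width in the question comes from a properties file, the format pattern has to be assembled at runtime. A minimal sketch of that, with FormatPad and pad as illustrative names:

```java
// Sketch: zero-padding with a width that is only known at runtime.
// FormatPad and pad are illustrative names.
class FormatPad {
    static String pad(int value, int width) {
        // Builds a pattern such as "%09d" for width = 9.
        return String.format("%0" + width + "d", value);
    }

    public static void main(String[] args) {
        System.out.println(pad(724, 9));
        System.out.println(pad(-2745, 9));
    }
}
```

For negative numbers the sign counts toward the width and is placed before the zeros, which may or may not be what a folder name should look like.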
Inserting padding characters one by one is obviously slow. If performance is really that big a concern, you could use predefined string constants of lengths 1..n-1 instead (where n is the biggest expected length), stored in an ArrayList at the corresponding indexes.
If n is very big, at least you could still insert in bigger chunks instead of single chars.
But overall, as others pointed out too, optimization is only feasible if you have profiled your application under real circumstances and found which specific piece of code is the bottleneck. Then you can focus on that (and of course profile again to verify that your changes actually improve performance).
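A variation on the predefined-padding idea is sketched below. Instead of a list of separate constants, it keeps one constant string of zeros and takes the needed prefix with substring. PrefixPad, MAX_WIDTH and pad are illustrative names, MAX_WIDTH is an assumed upper bound on the padding, and only non-negative values are handled.

```java
// Sketch of the "predefined padding" idea: one constant string of zeros, with
// the needed prefix taken by substring. All names here are illustrative.
class PrefixPad {
    static final int MAX_WIDTH = 32; // assumed upper bound on the padding length
    static final String ZEROS = new String(new char[MAX_WIDTH]).replace('\0', '0');

    // Left-pads a non-negative value with zeros. Values that are already at
    // least `width` characters long are returned unchanged.
    static String pad(int value, int width) {
        String digits = Integer.toString(value);
        int missing = width - digits.length();
        if (missing <= 0) return digits;
        return ZEROS.substring(0, missing) + digits;
    }

    public static void main(String[] args) {
        System.out.println(pad(724, 9));
        System.out.println(pad(1234567890, 9));
    }
}
```

Whether this actually beats the other variants should be checked by profiling the real application, as the answer recommends.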
Here is a solution that is basically the same thing as your StringBuilder with two optimizations:
It directly writes to an array, bypassing the StringBuilder overhead
It does the operations in reverse instead of insert(0), which requires an arraycopy each time
It also assumes that numDigits will be >= the number of characters actually required, but will properly handle negative numbers:
before = System.nanoTime();
String arrString=null;
for ( int j = 0; j < numIterations; j++ ){
    char[] arrNum = new char[numDigits];
    int i = numDigits-1;
    boolean neg = testNumber<0;
    for(int tmp = neg?-testNumber:testNumber;tmp>0;tmp/=10){
        arrNum[i--] = (char)((tmp%10)+48);
    }
    while(i>=0){
        arrNum[i--]='0';
    }
    if(neg)arrNum[0]='-';
    arrString = new String(arrNum);
}
after = System.nanoTime();
System.out.println( "04: " + outputFormat.format( after - before ) + " nanoseconds" );
System.out.println( "Sanity check: Folder = \"" + arrString + "\"" );
This method well outperformed your samples on my machine for negatives and was comparable for positives:
01: 18,090,933 nanoseconds
Sanity check: Folder = "000000742"
02: 22,659,205 nanoseconds
Sanity check: Folder = "000000742"
03: 2,309,949 nanoseconds
Sanity check: Folder = "000000742"
04: 6,380,892 nanoseconds
Sanity check: Folder = "000000742"
01: 14,933,369 nanoseconds
Sanity check: Folder = "0000-2745"
02: 21,685,158 nanoseconds
Sanity check: Folder = "-000002745"
03: 3,213,270 nanoseconds
Sanity check: Folder = "99997255"
04: 1,255,660 nanoseconds
Sanity check: Folder = "-00002745"
Edit: I noticed your tests reused some of the objects within the iteration loop, which I had not done in mine (such as not recalculating baseNum in the substring version). When I altered the tests to be consistent (not reusing any objects / calculations), my version performed better than yours:
01: 18,377,935 nanoseconds
Sanity check: Folder = "000000742"
02: 69,443,911 nanoseconds
Sanity check: Folder = "000000742"
03: 6,410,263 nanoseconds
Sanity check: Folder = "000000742"
04: 996,622 nanoseconds
Sanity check: Folder = "000000742"
Of course as others have mentioned micro benchmarking is incredibly difficult / "fudgy" with all of the optimization performed by the VM and the inability to control them.
This probably related link discusses many of the ways to do it. I would recommend the Apache option, StringUtils; it may or may not be the absolute fastest, but it's usually one of the easiest to understand, and has had the )&### pounded out of it, so it probably won't break in some unforeseen edge case. ;)