optimization - converting std input to integer array in java


I want to read each line of input, store the numbers in an int[] array, perform some calculations, then move on to my next line of input as fast as possible.
Input (stdin)
2 4 8
15 10 5
12 14 3999 -284 -71
0 -213 18 4 2
0
This is a pure optimization problem and not entirely good practice in the real world, as I'm assuming perfect input. I'm interested in how to improve my current method for taking input from stdin and representing it as an integer array. I have seen methods that use Scanner and its nextInt method; however, I've read in multiple places that Scanner is a lot slower than BufferedReader.
Can this taking in of input step be improved?
Current Method
BufferedReader bufferedInput = new BufferedReader(new InputStreamReader(System.in));
String line;
String[] lineArray;
try{
    // a line with just "0" indicates end of std input
    // (compare with equals(): != on Strings compares references, not contents)
    while((line = bufferedInput.readLine()) != null && !line.equals("0")){
        lineArray = line.split("\\s+"); // is "\\s+" the optimized regex
        int arrlength = lineArray.length;
        int[] lineInt = new int[arrlength];
        for(int i = 0; i < arrlength; i++){
            lineInt[i] = Integer.parseInt(lineArray[i]);
        }
        // Perform some operations on lineInt, then regenerate a new
        // lineInt with inputs from next line of stdin
    }
}catch(IOException e){
}
Judging from other questions (e.g. "Difference between parseInt and valueOf in java?"), parseInt seems to be the most efficient method for converting strings to integers. Any enlightenment would be of great help.
Thank you :)
Edit 1: removed GCD information and 'algorithm' tag
Edit 2: (hopefully) made question more concise, grammatical fix ups

First of all, I just want to point out that it is totally pointless to optimize in your particular example.
For your example, most people would agree that the best solution is not the optimal one. Rather, the most readable solution will be the best.
Having said that, if you want the most optimal solution, then don't use Scanner, don't use BufferedReader.readLine(), don't use String.split and don't use Integer.parseInt(...).
Instead read characters one at a time using BufferedReader.read() and parse and convert them to int by hand. You also need to implement your own "extendable array of int" type that behaves like an ArrayList<Integer>.
This is a lot of (unnecessary) work, and many more lines of code to maintain. BAD IDEA ...
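For the curious, here is a minimal sketch of what that hand-rolled approach could look like. It is an illustration under the question's "perfect input" assumption, not a tuned implementation: the class name FastIntLineReader and the method readIntLine are made up for this example, the growable array is a plain int[] that doubles when full, and lines are assumed to end in "\n" or "\r\n".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;

class FastIntLineReader {
    /**
     * Reads one line of space-separated ints; returns null at end of stream.
     * Assumes perfect input: only digits, '-', spaces and "\n" or "\r\n".
     */
    static int[] readIntLine(Reader in) {
        int[] buf = new int[8];      // hand-rolled growable int array
        int size = 0;
        int current = 0;
        boolean negative = false;
        boolean inNumber = false;
        try {
            for (;;) {
                int c = in.read();
                if (c >= '0' && c <= '9') {
                    current = current * 10 + (c - '0');
                    inNumber = true;
                    continue;
                }
                if (c == '-') {
                    negative = true;
                    continue;
                }
                // Anything else (space, '\r', '\n', end of stream) ends a number.
                if (inNumber) {
                    if (size == buf.length) {
                        buf = Arrays.copyOf(buf, size * 2);
                    }
                    buf[size++] = negative ? -current : current;
                    current = 0;
                    negative = false;
                    inNumber = false;
                }
                if (c == '\n') {
                    return Arrays.copyOf(buf, size);
                }
                if (c == -1) {
                    return size == 0 ? null : Arrays.copyOf(buf, size);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Reader in = new BufferedReader(new InputStreamReader(System.in));
        int[] line;
        // A line holding just 0 marks the end of input, as in the question.
        while ((line = readIntLine(in)) != null
                && !(line.length == 1 && line[0] == 0)) {
            System.out.println(Arrays.toString(line)); // stand-in for the real calculation
        }
    }
}
```

Even in this small form it is clearly more code to read and maintain than the readLine/split/parseInt version, which is the point made above.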

I second what Stephen said, the speed of parsing is likely to massively outperform the speed of actual I/O done, therefore improving parsing won't give you much.
Seriously, don't do this unless you've built the whole system, profiled it and found that inefficient parsing is what keeps it from hitting its performance targets.
But strictly just as an exercise, and because the general principle may be useful elsewhere, here's an example of how to parse it straight from a string.
The assumptions are:
You will use a sensible encoding, where the characters 0..9 are consecutive.
The only characters in the stream will be 0..9, minus sign and space.
All the numbers are well-formed.
Another important caveat is that for the sake of simplicity I used ArrayList, which is a bad idea for storing primitives, the overhead of boxing/unboxing probably wipes out all improvement in parsing speed. In the real world I'd use a list variant custom-made for primitives.
public static List<Integer> parse(String s) {
    List<Integer> ret = new ArrayList<Integer>();
    int sign = 1;
    int current = 0;
    boolean inNumber = false;
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (c >= '0' && c <= '9') { //we assume a sensible encoding
            current = current * 10 + sign * (c-'0');
            inNumber = true;
        }
        else if (c == ' ' && inNumber) {
            ret.add(current);
            current = 0;
            inNumber = false;
            sign = 1;
        }
        else if (c == '-') {
            sign = -1;
        }
    }
    if (inNumber) {
        ret.add(current);
    }
    return ret;
}

Related

NZEC error in Hackerearth problem in java

I'm trying to solve this HackerEarth problem: https://www.hackerearth.com/practice/basic-programming/input-output/basics-of-input-output/practice-problems/algorithm/anagrams-651/description/
I have tried searching through the internet but couldn't find the ideal solution to solve my problem
This is my code:
String a = new String();
String b = new String();
a = sc.nextLine();
b = sc.nextLine();
int t = sc.nextInt();
int check = 0;
int againCheck =0;
for (int k =0; k<t; k++)
{
for (int i =0; i<a.length(); i++)
{
char ch = a.charAt(i);
for (int j =0; j<b.length(); j++)
{
check =0;
if (ch != b.charAt(j))
{
check=1;
}
}
againCheck += check;
}
}
System.out.println(againCheck*againCheck);
I expect the output to be 4, but it is showing the "NZEC" error
Can anyone help me, please?

The requirements state [1] that the input is a number (N) followed by 2 x N lines. Your code is reading two strings followed by a number. It is probably throwing an InputMismatchException when it attempts to parse the 3rd line of input as a number.
Hints:
It pays to read the requirements carefully.
Read this article on CodeChef about how to debug a NZEC: https://discuss.codechef.com/t/tutorial-how-to-debug-an-nzec-error/11221. It explains techniques such as catching exceptions in your code and printing out a Java stacktrace so that you can see what is going wrong.
1 - Admittedly, the requirements are not crystal clear. But in the sample input the first line is a number.
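To make both hints concrete, here is a small sketch that reads the input in the order described above (a count N, then N pairs of lines) and prints the stack trace if anything goes wrong. The class name NzecDebugSketch and the helper readCases are invented for this illustration, and the per-case logic is left as a placeholder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class NzecDebugSketch {
    /** Reads a count N, then N pairs of lines (the order suggested by the sample input). */
    static List<String[]> readCases(Scanner sc) {
        int n = Integer.parseInt(sc.nextLine().trim());
        List<String[]> cases = new ArrayList<>();
        for (int k = 0; k < n; k++) {
            String a = sc.nextLine();
            String b = sc.nextLine();
            cases.add(new String[] {a, b});
        }
        return cases;
    }

    public static void main(String[] args) {
        try {
            for (String[] c : readCases(new Scanner(System.in))) {
                System.out.println(c[0] + " / " + c[1]); // per-case logic goes here
            }
        } catch (Exception e) {
            // Print the trace to stdout so it shows up in the judge's output.
            e.printStackTrace(System.out);
        }
    }
}
```

Feeding this the lines in the wrong order (strings first, number last) fails on the first line with a NumberFormatException, and the printed trace says so instead of leaving only a bare NZEC.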

As I've written in other answers as well, it is best to write your code like this when submitting on sites:
def myFunction():
    try:
        pass  # MY LOGIC HERE
    except Exception as E:
        print("ERROR Occurred : {}".format(E))
This will clearly show you what error you are facing in each test case. For a site like hacker earth, that has several input problems in various test cases, this is a must.
Coming to your question, NZEC stands for : NON ZERO EXIT CODE
This could mean any and everything from input error to server earthquake.

Regardless of hacker-whatsoever.com I am going to give two useful things:
An easier algorithm, so you can code it yourself, because your algorithm will not work as you expect;
A Java 8+ solution with totally a different algorithm, more complex but more efficient.
SIMPLE ALGORITHM
In your solution you have a typical double for loop that you use to check whether every char in a is also in b. That part is good, but the rest can be discarded. Try to implement this:
For each character of a, find the first occurrence of that character in b.
If there is a match, remove that character from a and b.
The number of remaining characters in both strings is the number of deletions you have to perform to transform them into strings that have the same characters, aka anagrams. So, return the sum of the lengths of a and b.
NOTE: It is important that you keep track of what you already encountered: with your approach you would have counted the same character several times!
As you can see, it's just pseudocode for a naive algorithm. It's just a hint to help you with your studying. In fact this algorithm has a worst-case complexity of O(n^2) (because of the nested loop), which is generally bad. Now, a better solution.
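If you get stuck, the naive steps above could be sketched roughly like this. It is one possible reading of the pseudocode, not the only one; the class name AnagramDeletionsNaive and the method deletionsNeeded are placeholders chosen for this example.

```java
class AnagramDeletionsNaive {
    /** Counts the deletions needed to turn a and b into anagrams of each other. */
    static int deletionsNeeded(String a, String b) {
        StringBuilder restA = new StringBuilder(a);
        StringBuilder restB = new StringBuilder(b);
        int i = 0;
        while (i < restA.length()) {
            // First occurrence in b of the current character of a (-1 if absent).
            int j = restB.indexOf(String.valueOf(restA.charAt(i)));
            if (j >= 0) {
                // Matched: remove it from both so it is never counted twice.
                restA.deleteCharAt(i);
                restB.deleteCharAt(j);
            } else {
                i++; // unmatched characters stay and will be counted as deletions
            }
        }
        return restA.length() + restB.length();
    }

    public static void main(String[] args) {
        System.out.println(deletionsNeeded("cde", "abc"));
    }
}
```

Removing matched characters from both builders is what implements the "keep track of what you already encountered" note.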
BETTER SOLUTION
My algorithm is just O(n). It works this way:
I build a map. (If you don't know what that is, put simply it's a data structure that stores "key-value" pairs.) In this case the keys are characters, and the values are integer counters bound to the respective character.
Every time a character is found in a, its counter increases by 1;
Every time a character is found in b, its counter decreases by 1;
Now every counter represents the difference between the number of times its character is present in a and in b. So, the sum of the absolute values of the counters is the solution!
To implement it, I actually add an entry to the map whenever I find a character for the first time, instead of pre-constructing a map with the whole alphabet. I also made heavy use of lambda expressions, to give you a very different perspective.
Here's the code:
import java.util.HashMap;
public class HackerEarthProblemSolver {
    private static final String a = "", // replace with your input string
                                b = ""; // replace with your input string
    static int sum = 0; //the result; a static field because a lambda cannot modify a local variable
    public static void main (String[] args){
        HashMap<Character,Integer> map = new HashMap<>(); //creating the map
        for (char c: a.toCharArray()){ //for each character in a
            map.computeIfPresent(c, (k,i) -> i+1); //+1 to its counter
            map.computeIfAbsent(c , k -> 1); //initialize its counter to 1 (0+1)
        }
        for (char c: b.toCharArray()){ //for each character in b
            map.computeIfPresent(c, (k,i) -> i-1); //-1 to its counter
            map.computeIfAbsent(c , k -> -1); //initialize its counter to -1 (0-1)
        }
        map.forEach((k,i) -> sum += Math.abs(i) ); //summing the absolute values of the counters
        System.out.println(sum);
    }
}
Basically both solutions just count how many letters the two strings have in common, but with a different approach.
Hope I helped!

How to make a program to print the series : 1,12,123...12345678910,1234567891011...& so on to the nth term?

//I tried this one but output was wrong for tenth term
import java.io.*;
public class series
{
public static void main(String args[])throws IOException
{
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
int n,i,i1=0,s=0,c=0;
System.out.println("Enter the term of the series you want to get");
n=Integer.parseInt(in.readLine());
for (i=1;i<=n;i++)
{
i1=i;
while (i1!=0)
{
c+=1;
i1=i1/10;
}
s=(int)(s*(Math.pow(10,c))+i);
c=0;
System.out.print(s+" ");
}
}
}

I don't know why you are using your current approach. I would go about this by keeping track of the previous term which was printed.
StringBuilder term = new StringBuilder("");
final int N = 20;
for (int i=1; i <= N; ++i) {
term.append(i);
if (i > 1) System.out.print(",");
System.out.print(term.toString());
}
Edit: The reason I suggest using a string to display each term is that your requirement appears to mainly be one of presentation. That is, you're not actually doing any math with each term, so why not just avoid a numeric type completely, which also avoids things like overflow and potential loss of precision.

Whilst Tim's answer is neat, I think that the exercise is sufficiently basic that StringBuilder is beyond its scope (*).
Instead, you can use a nested for loop:
final int N = 20;
for (int i=1; i <= N; ++i) {
if (i > 1) System.out.print(",");
for (int a = 1; a <= i; ++a) {
System.out.print(a);
}
}
(This is also going to be more memory efficient, since there is no need to keep reallocating the StringBuilder's internal buffer as i increases. But this is really of secondary (or even lesser) concern.)
(*) Yes, you could do the same without StringBuilder, just using String concatenation; but that would be inefficient in ways that beginners may not "get", and so it is something that is best just steered around. Nested loops are far more generally useful than string concatenation in whatever form as a concept to get your head around.

The main reason you get wrong output from the 10th term onward is integer overflow: 12345678910 does not fit in an int. You can use long to get a few more terms, but a definitely better solution is to use Strings, as already answered.
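A quick way to see the limits for yourself is to build each term as text and compare it with the numeric ranges. The sketch below does that; the class name SeriesOverflowDemo and the method term are made up for this illustration. By my count, long only lasts through the 14th term (19 digits), and the 15th term already has 21 digits.

```java
import java.math.BigInteger;

class SeriesOverflowDemo {
    /** Builds the n-th term (1, 12, 123, ...) as text, so no numeric type can overflow. */
    static String term(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        BigInteger tenth = new BigInteger(term(10));
        boolean tooBigForInt = tenth.compareTo(BigInteger.valueOf(Integer.MAX_VALUE)) > 0;
        System.out.println(tenth + " exceeds Integer.MAX_VALUE: " + tooBigForInt);
    }
}
```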

Executing code N times and other code N+1 times

The question is about while-loops in which I need some code to be executed N times and some other code N+1 times. It is NOT about concatenating Strings; I just use this as a badly coded yet short example.
Let me explain my question by providing an example.
Say I want to concatenate N+1 Strings, by glueing them with "\n", for example. I will have N+1 lines of text then, but I only need to add N times "\n".
Is there any boilerplate solution for this type of loop in which you have to execute some code N times and other code N+1 times? I'm NOT asking for solution to concatenate Strings! That is just a (bad) example. I'm looking for the general solution.
The problem I have with this is code duplication, so to code my example I'll do this (bad pseudo code, I know I have to use StringBuilder etc.):
String[] lines = <some array of dimension N+1>;
String total = lines[0];
for (int i = 1; i < N + 1; i++){
total += "\n" + lines[i];
}
The problem becomes worse if the code that has to be executed N+1 times, becomes larger, of course. Then I would do something like
codeA(); // adding the line of text
for (int i = 1; i < N + 1; i++){
codeB(); // adding the "\n"
codeA();
}
To remove the duplication, you can do this differently by checking inside the loop, too, but I find this quite stupid, as I know beforehand that the check is pre-determined: it will only be false in the first iteration:
for (int i = 0; i < N + 1; i++){
if (i > 0){
codeB(); // adding the "\n"
}
codeA();
}
Is there any solution for this, a sort of while-loop that initializes once with codeA() and then keeps looping over codeB() and codeA()?
People must have run into this before, I guess. Just wondering if there are any beautiful solutions for this.

To my disappointment, I believe that there is no construct that satisfies the conditions as you have stated them, and I will attempt to explain why (though I can't prove it in a strictly mathematical way).
The requirements of the problem are:
1) We have two parts of code: codeA() and codeB()
2) The two parts are executed a different number of times, N and N+1
3) We want to avoid adding a condition inside the loop
4) We want to execute each part only as many times as strictly necessary
2) is a direct consequence of 1). If we didn't have two parts of code we would not need a different number of executions. We would have a single loop body.
4) is again a consequence of 1). There is no redundant execution if we have a single loop body. We can control its execution through the loop's condition
So the restrictions are basically 1) and 3).
Now inside the loop we need to answer two questions on each iteration: a) do we execute codeA()? and b) do we execute codeB()? We simply do not have enough information to decide since we only have a single condition (the condition of the loop) and that condition will be used to decide if both of the code parts would be executed or not.
So we need to break 1) and/or 3). Either we add the extra condition inside the loop or we delegate the decision to some other code (thus not having two parts anymore).
Apparently an example of delegation could be (I am using the string concatenation example):
String [] lines = ...
for (int i = 0; i < N; i++){
// delegate to a utility class LineBuilder (perhaps an extension of StringBuilder) to concatenate lines
// this class would still need to check a condition e.g. for the first line to skip the "\n"
// since we have delegated the decisions we do not have two code parts inside the loop
lineBuilder.addLine( lines[i] );
}
Now a more interesting case of delegation would be if we could delegate the decision to the data itself (this might be worth keeping in mind). Example:
List<Line> lines = Arrays.asList(
new FirstLine("Every"), // note this class is different
new Line("word"),
new Line("on"),
new Line("separate"),
new Line("line") );
StringBuffer sb = new StringBuffer();
for (Line l : lines) {
// Again the decision is delegated. Data knows how to print itself
// Line would return: "\n" + s
// FirstLine would return: s
sb.append( l.getPrintVersion() );
}
Of course, all of the above does not mean that you couldn't implement a class that tries to solve the problem. I believe, though, that this is beyond the scope of your original question, not to mention that it would be overkill for simple loops.
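For completeness, the Line and FirstLine classes used in the second example are not library classes; they could be sketched along these lines. The behaviour of getPrintVersion() follows the comments in the loop above, while the field name s and the helper LineDemo.join are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.List;

class Line {
    protected final String s;

    Line(String s) {
        this.s = s;
    }

    /** An ordinary line prints itself preceded by the separator. */
    String getPrintVersion() {
        return "\n" + s;
    }
}

class FirstLine extends Line {
    FirstLine(String s) {
        super(s);
    }

    /** The first line prints itself without a leading separator. */
    @Override
    String getPrintVersion() {
        return s;
    }
}

class LineDemo {
    /** The loop body has a single part; the data decides what gets appended. */
    static String join(List<Line> lines) {
        StringBuilder sb = new StringBuilder();
        for (Line l : lines) {
            sb.append(l.getPrintVersion());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Line> lines = Arrays.<Line>asList(
                new FirstLine("Every"), new Line("word"), new Line("on"));
        System.out.println(join(lines));
    }
}
```

The condition has not disappeared: it has moved into the choice of class made when the list is built.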

Concatenating Strings like this is a bad idea and a much bigger issue IMHO.
However to answer your question I would do
String sep = "";
StringBuilder sb= new StringBuilder();
for(String s: lines) {
sb.append(sep).append(s);
sep = "\n";
}
String all = sb.toString();
Note: there is usually a good way to avoid needing to create this String at all, such as processing the lines as you get them. It is hard to say without more context.

This kind of thing is fairly common, like when you build sql. This is the pattern that I follow:
String[] lines ...//init somehow;
boolean firstTime = true;
StringBuilder sb = new StringBuilder();
for (int i = 0; i < lines.length; i++){
    if(firstTime) firstTime = false;
    else sb.append('\n');
    sb.append(lines[i]);
}
Note that this is not written the same way as the first example in the question, repeated here:
String[] lines = <some array of dimension N+1>;
String total = lines[0];
for (int i = 1; i < N + 1; i++){
    total += "\n" + lines[i];
}
Assuming you have an array of [0] = 'line1' and [1] = 'line2', that version also produces line1\nline2, but it has to handle the first element outside the loop and builds the result with repeated String concatenation.
The example I provided is clear, and does not perform poorly. In fact a much bigger performance gain is made by utilizing StringBuilder/Buffer. Having clear code is essential for the pro.

Personally, I run into the same problem most of the time. In the String example I use a StringBuilder as you said, and just delete the extra characters appended at the end:
StringBuilder sb = new StringBuilder();
for(int i=0; i<N; i++) {
    sb.append(array[i]).append("\n");
}
sb.delete(sb.length() - 1, sb.length()); // check first that sb is not empty
In the general case I suppose there is no other way than adding the if you suggested. To make the code clearer, I would do the check at the end of the for loop:
StringBuilder sb = new StringBuilder();
for(int i=0; i<N; i++) {
    sb.append(array[i]);
    if(i < N - 1) {
        sb.append("\n");
    }
}
But I totally agree that it is sad to have this double logic.

which code is more efficient?

Which of the following is an efficient way to reverse the words in a string?
public String Reverse(StringTokenizer st){
    String[] words = new String[st.countTokens()];
    int i = 0;
    while(st.hasMoreTokens()){
        words[i] = st.nextToken(); i++;
    }
    String output = "";
    for(int j = words.length-1; j >= 0; j--)
        output += words[j]+" ";
    return output;
}
OR
public String Reverse(StringTokenizer st, String output){
if(!st.hasMoreTokens()) return output;
output = st.nextToken()+" "+output;
return Reverse(st, output);}
public String ReverseMain(StringTokenizer st){
return Reverse(st, "");}
While the first way seems more readable and straightforward, there are two loops in it. In the second method, I've tried doing it in a tail-recursive way, but I am not sure whether Java optimizes tail-recursive code.

You could do this in just one loop:
public String Reverse(StringTokenizer st){
    int length = st.countTokens();
    String[] words = new String[length];
    int i = length - 1;
    while(i >= 0){
        words[i] = st.nextToken(); i--;
    }
    return String.join(" ", words); // Java 8+; the array is already in reverse order
}

But I am not sure whether java does optimize tail-recursive code.
It doesn't. Or at least the Sun/Oracle Java implementations don't, up to and including Java 7.
References:
"Tail calls in the VM" by John Rose @ Oracle.
Bug 4726340 - RFE: Tail Call Optimization
I don't know whether this makes one solution faster than the other. (Test it yourself ... taking care to avoid the standard micro-benchmarking traps.)
However, the fact that Java doesn't implement tail-call optimization means that the 2nd solution is liable to run out of stack space if you give it a string with a large (enough) number of words.
Finally, if you are looking for a more space-efficient way to implement this, there is a clever way that uses just a StringBuilder.
Create a StringBuilder from your input String
Reverse the characters in the StringBuilder using reverse().
Step through the StringBuilder, identifying the start and end offset of each word. For each start/end offset pair, reverse the characters between the offsets. (You have to do this using a loop.)
Turn the StringBuilder back into a String.
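The four steps above could be sketched as follows. This is an illustration only: the class name ReverseWordsInPlace and the method reverseWords are invented here, and the code assumes words are separated by single spaces and contain no surrogate pairs.

```java
class ReverseWordsInPlace {
    static String reverseWords(String input) {
        // Steps 1 and 2: copy into a StringBuilder and reverse all characters.
        StringBuilder sb = new StringBuilder(input).reverse();
        int start = 0;
        // Step 3: find each word's [start, i) range and reverse it back.
        for (int i = 0; i <= sb.length(); i++) {
            if (i == sb.length() || sb.charAt(i) == ' ') {
                for (int lo = start, hi = i - 1; lo < hi; lo++, hi--) {
                    char tmp = sb.charAt(lo);
                    sb.setCharAt(lo, sb.charAt(hi));
                    sb.setCharAt(hi, tmp);
                }
                start = i + 1;
            }
        }
        // Step 4: back to a String.
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseWords("the quick brown fox"));
    }
}
```

Apart from the StringBuilder itself, this allocates nothing per word, which is where the space saving comes from.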

You can test this by timing both of them on a large number of inputs,
e.g. reverse 100000000 strings and see how many seconds it takes. You could also compare start and end system timestamps to get the exact difference between the two functions.
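A bare-bones version of such a timing harness might look like the sketch below. The class name ReverseTiming and the method timeNanos are made up for this example, and the warm-up of one tenth of the iterations is an arbitrary choice. Treat the numbers it gives as rough: the JIT compiler can still distort them, for instance by optimizing away work whose result is never used.

```java
class ReverseTiming {
    /** Runs the task a few times to warm up, then returns the elapsed nanoseconds for the timed runs. */
    static long timeNanos(Runnable task, int iterations) {
        for (int i = 0; i < iterations / 10; i++) {
            task.run(); // warm-up runs, not timed
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Stand-in workload; swap in calls to the two Reverse variants to compare them.
        long elapsed = timeNanos(
                () -> new StringBuilder("the quick brown fox").reverse().toString(),
                1_000_000);
        System.out.println("elapsed ms: " + elapsed / 1_000_000);
    }
}
```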

StringTokenizer is not deprecated but if you read the current JavaDoc...
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
String[] strArray = str.split(" ");
StringBuilder sb = new StringBuilder();
for (int i = strArray.length - 1; i >= 0; i--)
    sb.append(strArray[i]).append(" ");
String reversedWords = sb.substring(0, sb.length() - 1); // strip trailing space

Compare first three characters of two strings

Strings s1 and s2 will always be of length 1 or higher.
How can I speed this up?
int l1 = s1.length();
if (l1 > 3) { l1 = 3; }
if (s2.startsWith(s1.substring(0,l1)))
{
// do something..
}
Regex maybe?

Rewrite to avoid object creation
Your instincts were correct. The creation of new objects (substring()) is not very fast and it means that each one created must incur g/c overhead as well.
This might be a lot faster:
static boolean fastCmp(String s1, String s2) {
return s1.regionMatches(0, s2, 0, 3);
}
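One caveat worth checking: regionMatches with a fixed length of 3 returns false whenever either string is shorter than three characters, whereas the code in the question compares only as many characters as s1 has (up to three). A variant that keeps the question's behaviour while still avoiding substring() might look like this; the class name PrefixCompare and the method firstThreeMatch are placeholders for this sketch.

```java
class PrefixCompare {
    /** True if s2 starts with the first min(3, s1.length()) characters of s1. */
    static boolean firstThreeMatch(String s1, String s2) {
        int len = Math.min(3, s1.length());
        // regionMatches returns false (rather than throwing) if s2 is shorter than len.
        return s2.regionMatches(0, s1, 0, len);
    }

    public static void main(String[] args) {
        System.out.println(firstThreeMatch("abcdef", "abcxyz"));
        System.out.println(firstThreeMatch("ab", "abz"));
    }
}
```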

This seems pretty reasonable. Is this really too slow for you? You sure it's not premature optimization?

if (s2.startsWith(s1.substring(0, Math.min(3, s1.length())))) { ... }
Btw, there is nothing slow in it. startsWith has complexity O(n)
Another option is to compare the char values, which might be more efficient:
boolean match = true;
for (int i = 0; i < Math.min(Math.min(s1.length(), 3), s2.length()); i++) {
if (s1.charAt(i) != s2.charAt(i)) {
match = false;
break;
}
}

My java isn't that good, so I'll give you an answer in C#:
int len = Math.Min(s1.Length, Math.Min(s2.Length, 3));
for(int i=0; i< len; ++i)
{
if (s1[i] != s2[i])
return false;
}
return true;
Note that unlike yours and Bozho's, this does not create a new string, which would be the slowest part of your algorithm.

Perhaps you could do this
if (s1.length() > 3 && s2.length() > 3 && s1.indexOf (s2.substring (0, 3)) == 0)
{
// do something..
}

There is context missing here:
What are you trying to scan for? What type of application? How often is it expected to run?
These things are important because different scenarios call for different solutions:
If this is a one-time scan then this is probably unneeded optimization. Even for a 20MB text file, it wouldn't take more than a couple of minutes in the worst case.
If you have a set of inputs and for each of them you're scanning all the words in a 20MB file, it might be better to sort/index the 20MB file to make it easy to look up matches and skip the 99% of unnecessary comparisons. Also, if inputs tend to repeat themselves it might make sense to employ caching.
Other solutions might also be relevant, depending on the actual problem.
But if you boil it down only to comparing the first 3 characters of two strings, I believe the code snippets given here are as good as you're going to get - they're all O(1)*, so there's no drastic optimization you can do.
*The only place where this might not hold true is if getting the length of the string is O(n) rather than O(1) (which is the case for the strlen function in C++), which is not the case for Java and C# string objects.
