Why use bit shifting instead of a for loop? - java

I created the following code to find the parity of a binary number (i.e., output 1 if the number of 1s in the binary word is odd, and 0 if the number of 1s is even).
public class CalculateParity {
String binaryword;
int totalones = 0;
public CalculateParity(String binaryword) {
this.binaryword = binaryword;
getTotal();
}
public int getTotal() {
for(int i=0; i<binaryword.length(); i++) {
if (binaryword.charAt(i) == '1'){
totalones += 1;
}
}
return totalones;
}
public int calcParity() {
if (totalones % 2 == 1) {
return 1;
}
else {
return 0;
}
}
public static void main(String[] args) {
CalculateParity bin = new CalculateParity("1011101");
System.out.println(bin.calcParity());
}
}
However, all of the solutions I find online almost always deal with using bit shift operators, XORs, unsigned shift operations, etc., like this solution I found in a data structure book:
public static short parity(long x){
short result = 0;
while (x != 0) {
result ^= (x & 1);
x >>>= 1;
}
return result;
}
Why is this the case? What makes bitwise operators more of a valid/standard solution than the solution I came up with, which is simply iterating through a binary word of type String? Is a bitwise solution more efficient? I appreciate any help!

The code that you have quoted uses a loop as well (i.e., while):
public static short parity(long x){
short result = 0;
while (x != 0) {
result ^= (x & 1);
x >>>= 1;
}
return result;
}
You need to acknowledge that you are using a string that you know beforehand will be composed only of digits, and conveniently in a binary representation. Naturally, given those constraints, one does not need bitwise operations; instead one just parses char-by-char and does the desired computations.
On the other hand, if you receive a long as a parameter, as in the method that you have quoted, then it comes in handy to use bitwise operations to go through the number one bit at a time and perform the desired computation.
One could also convert the long into a string and apply the same logic that you have applied, but one would first have to convert that long into binary. That approach would add unnecessary steps and more code, and would perform worse. The same probably applies vice versa if you have a String with your constraints. Nevertheless, a String is not a number, even if it is composed only of digits, which makes a type that represents a number (e.g., long) an even more desirable choice.
Another thing that you are missing is that you did some of the heavy lifting yourself by already converting a number to binary and encoding it into a String: new CalculateParity("1011101");. So you kind of skipped a step there. Now try to use your approach, but this time with "93", and find the parity.
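To make that exercise concrete, here is a minimal sketch that computes the parity of the number 93 both ways: directly with the shift-and-XOR loop, and by first converting to a binary string with Long.toBinaryString() and then counting characters as in the question. The class and method names are made up for illustration.

```java
class ParityDemo {
    // Parity straight from the number: XOR the lowest bit into the result,
    // then shift right (unsigned) until no set bits remain.
    static int parityByShift(long x) {
        int result = 0;
        while (x != 0) {
            result ^= (int) (x & 1);
            x >>>= 1;
        }
        return result;
    }

    // Parity via a String: the extra conversion step the question skipped.
    static int parityViaString(long x) {
        String bits = Long.toBinaryString(x); // 93 becomes "1011101"
        int ones = 0;
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                ones++;
            }
        }
        return ones % 2;
    }
}
```

Both methods return 1 for 93, whose binary form is 1011101 (five 1s). The string route has to perform the binary conversion that the shift loop gets for free.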

If you want to know whether the number of 1s in a String is even, I think the method below is better.
If you convert a String longer than 64 characters to a long, an error will occur.
Both of the methods you mention are O(n), so their performance will not differ much, but
the shift method is more precise and uses slightly fewer CPU cycles.
private static boolean isEven(String s){
char[] chars = s.toCharArray();
int i = 0;
for(char c : chars){
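i ^= c; // '0' is 48 and '1' is 49, so only the lowest bit tracks the 1s
}
return (i & 1) == 0; // test the low bit only; the upper bits depend on the string length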
}

You use a string based method for a string input. Good choice.
The code you quote uses an integer-based method for an integer input. An equally good choice.

Count the Characters in a String Recursively & treat "eu" as a Single Character

I am new to Java, and I'm trying to figure out how to count characters in a given string and treat a combination of two characters "eu" as a single character, while still counting every other character as one character.
And I want to do that using recursion.
Consider the following example.
Input:
"geugeu"
Desired output:
4 // g + eu + g + eu = 4
Current output:
2
I've been trying a lot and still can't seem to figure out how to implement it correctly.
My code:
public static int recursionCount(String str) {
if (str.length() == 1) {
return 0;
}
else {
String ch = str.substring(0, 2);
if (ch.equals("eu")) {
return 1 + recursionCount(str.substring(1));
}
else {
return recursionCount(str.substring(1));
}
}
}
OP wants to count all characters in a string but adjacent characters "ae", "oe", "ue", and "eu" should be considered a single character and counted only once.
The code below does that:
public static int recursionCount(String str) {
int n;
n = str.length();
if(n <= 1) {
return n; // return 1 if one character left or 0 if empty string.
}
else {
String ch = str.substring(0, 2);
if(ch.equals("ae") || ch.equals("oe") || ch.equals("ue") || ch.equals("eu")) {
// consider as one character and skip next character
return 1 + recursionCount(str.substring(2));
}
else {
// don't skip next character
return 1 + recursionCount(str.substring(1));
}
}
}
Recursion explained
In order to address a particular task using Recursion, you need a firm understanding of how recursion works.
And the first thing you need to keep in mind is that every recursive solution should (either explicitly or implicitly) contain two parts: Base case and Recursive case.
Let's have a look at them closely:
Base case - a part that represents a simple edge-case (or a set of edge-cases), i.e. a situation in which recursion should terminate. The outcome for these edge-cases is known in advance. For this task, the base case is when the given string is empty, and since there's nothing to count, the return value should be 0. That is sufficient for the algorithm to work; outcomes for other inputs should be derived from the recursive case.
Recursive case - the part of the method where recursive calls are made and where the main logic resides. Every recursive call eventually hits the base case and starts building its return value.
In the recursive case, we need to check whether the given string starts with a particular string like "eu". For that we don't need to generate a substring (keep in mind that object creation is costly). Instead we can use the method String.startsWith(), which checks whether the bytes of the provided prefix match the bytes at the beginning of this string, which is cheaper (reminder: starting from Java 9, String is backed by an array of bytes, and each character is represented with either one or two bytes depending on the character encoding). We also don't need to worry about the length of the string, because if the string is shorter than the prefix, startsWith() will return false.
Implementation
That said, here's how an implementation might look:
public static int recursionCount(String str) {
if(str.isEmpty()) {
return 0;
}
return str.startsWith("eu") ?
1 + recursionCount(str.substring(2)) : 1 + recursionCount(str.substring(1));
}
Note that besides being able to implement a solution, you also need to evaluate its time and space complexity.
In this case, because we are creating a new string with every call, time complexity is quadratic, O(n^2) (reminder: creating the new string requires allocating memory and copying the bytes of the original string). Worst-case space complexity would also be O(n^2).
There's a way of solving this problem recursively in linear time, O(n), without generating a new string at every call. For that we need to introduce a second argument, the current index, and each recursive call should advance this index by either 1 or 2 (I'm not going to implement this solution and am leaving it for the OP/reader as an exercise).
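For readers who want to check their attempt at the index-based variant, one possible sketch follows. It assumes only the "eu" pair needs special treatment, and the helper name count and the class name are invented for illustration. It relies on the two-argument String.startsWith(prefix, offset), which tests the prefix at a given position without creating a substring.

```java
class EuCounter {
    // Public entry point: start counting from index 0.
    static int recursionCount(String str) {
        return count(str, 0);
    }

    // Counts characters from 'index' to the end, treating "eu" as one character.
    private static int count(String str, int index) {
        if (index >= str.length()) {
            return 0; // base case: nothing left to count
        }
        // Advance by 2 when "eu" starts at this index, otherwise by 1.
        int step = str.startsWith("eu", index) ? 2 : 1;
        return 1 + count(str, index + step);
    }
}
```

For the input "geugeu" this returns 4. No new strings are allocated, although the recursion depth still grows linearly with the input length, so very long strings can overflow the call stack.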
In addition
In addition, here's a concise and simple non-recursive solution using String.replace():
public static int count(String str) {
return str.replace("eu", "_").length();
}
If you need to handle multiple combinations of characters (which were listed in the first version of the question), you can make use of regular expressions with String.replaceAll():
public static int count(String str) {
return str.replaceAll("ue|au|oe|eu", "_").length();
}

Map words to single characters

I'm building a hash function which should map any String (max length 100 characters) to a single [A-Z] character (I'm using it for sharding purposes).
I came up with this simple Java function, is there any way to make it faster?
public static final char stringToChar(final String s) {
long counter = 0;
for (char c : s.toCharArray()) {
counter += c;
}
return (char)('A'+(counter%26));
}
A quick trick to get an even distribution across the "shards" is to use a hash function.
I suggest this method, which uses the default Java String.hashCode() function:
public static char getShardLabel(String string) {
int hash = string.hashCode();
// using Math.floorMod instead of operator % because '%' can produce negative outputs
int hashMod = Math.floorMod(hash, 26);
return (char)('A'+(hashMod));
}
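The comment about '%' is easiest to see with a negative hash value. The sketch below uses a hypothetical helper, labelFor, that takes the hash directly so that a negative value can be passed in; it is only an illustration of the floorMod step, not a replacement for getShardLabel.

```java
class ShardLabelDemo {
    // Maps any int hash (including negative ones) to a label in 'A'..'Z'.
    static char labelFor(int hash) {
        // floorMod always returns a value in 0..25 for a positive modulus.
        return (char) ('A' + Math.floorMod(hash, 26));
    }
}
```

In Java, -7 % 26 evaluates to -7, which would produce a character before 'A', while Math.floorMod(-7, 26) is 19, so labelFor(-7) is 'T'. String.hashCode() can return negative values, so this case does occur in practice.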
As pointed out here this method is considered "even enough".
Based on a quick test it looks faster than the solution you suggested.
On 80kk strings of various lengths:
getShardLabel took 65 milliseconds
stringToChar took 571 milliseconds

Dynamic Programming approach - Interleaving Parentheses

Below is my code for the problem described on https://community.topcoder.com/stat?c=problem_statement&pm=14635. It keeps track of possible interleaves (as described in the problem description given) through a static variable countPossible.
public class InterleavingParentheses{
public static int countPossible = 0;
public static Set<String> dpyes = new HashSet<>(); //used for dp
public static Set<String> dpno = new HashSet<>(); //used for dp
public static void numInterleaves(char[] s1, char[] s2, int size1, int size2){
char[] result = new char[size1+size2];
numInterleavesHelper(result,s1,s2,size1,size2,0,0,0);
}
public static void numInterleavesHelper(char[] res, char[] s1, char[] s2, int size1, int size2, int pos, int start1, int start2){
if (pos == size1+size2){
if (dpyes.contains(new String(res))){
countPossible+=1;
}
else{
if(dpno.contains(new String(res))){
countPossible+=0;
}
else if (isValid(res)){
dpyes.add(new String(res));
countPossible+=1;
}
else{
dpno.add(new String(res));
}
}
}
if (start1 < size1){
res[pos] = s1[start1];
numInterleavesHelper(res,s1,s2,size1,size2,pos+1,start1+1,start2);
}
if (start2 < size2){
res[pos] = s2[start2];
numInterleavesHelper(res,s1,s2,size1,size2,pos+1,start1,start2+1);
}
}
private static boolean isValid(char[] string){
//basically checking to see if parens are balanced
LinkedList<Character> myStack = new LinkedList<>();
for (int i=0; i<string.length; i++){
if (string[i] == "(".charAt(0)){
myStack.push(string[i]);
}
else{
if (myStack.isEmpty()){
return false;
}
if (string[i] == ")".charAt(0)){
myStack.pop();
}
}
}
return myStack.isEmpty();
}
}
I use the Scanner class to put in the input strings s1 = "()()()()()()()()()()()()()()()()()()()()" and s2 = "()()()()()()()()()()()()()()()()()" into this function and while the use of the HashSet greatly lowers the time because duplicate interleaves are accounted for, large input strings still take up a lot of time. The sizes of the input strings are supposed to be at most 2500 characters and my code is not working for strings that long. How can I modify this to make it better?
Your dp set is only used at the end, so at best you can save an O(n), but you've already done many O(n) operations to reach that point, so the algorithmic complexity is about the same. For dp to be effective, you need to be reducing O(2^n) operations to, say, O(n^2).
As one of the test cases has an answer of 487,340,184, for your program to produce this answer it would need that number of calls to numInterleavesHelper, because each call can only increment countPossible by 1. The fact that the question asks for the answer "modulo 10^9 + 7" also indicates that a large number is expected as an answer.
This rules out things like creating every possible resulting string, most string manipulation, and counting 1 string at a time. Even if you optimized it, the number of iterations alone makes it infeasible.
Instead, think of algorithms that have about 10,000,000 iterations. Each string has a length of 2500. These constraints were chosen on purpose so that 2500 * 2500 fits within this number of iterations, suggesting a 2D dp solution.
If you create an array:
int[][] ways = new int[2501][2501];
then you want the answer to be:
ways[2500][2500]
Here ways[x][y] is the number of ways of creating valid strings where x characters have been taken from the first string, and y characters have been taken from the second string. Each time you add a character, you have 2 choices, taking from the first string or taking from the second. The new number of ways is the sum of the previous ones, so:
ways[x][y] = ways[x-1][y] + ways[x][y-1]
You also need to check that each string is valid. They're valid if each time you add a character, the number of opening parens minus the number of closing parens is 0 or greater, and this number is 0 at the end. The number of parens of each type in every prefix of s1 and s2 can be precalculated to make this a constant-time check.
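Putting the recurrence and the validity check together, a rough sketch might look like the following. It assumes both inputs contain only '(' and ')', counts interleavings modulo 10^9 + 7, and uses invented names (InterleaveCount, countInterleavings, prefixBalance); it has not been run against the TopCoder tests.

```java
class InterleaveCount {
    static final int MOD = 1_000_000_007;

    // balance[i] = (number of '(') - (number of ')') in the first i characters.
    static int[] prefixBalance(String s) {
        int[] balance = new int[s.length() + 1];
        for (int i = 0; i < s.length(); i++) {
            balance[i + 1] = balance[i] + (s.charAt(i) == '(' ? 1 : -1);
        }
        return balance;
    }

    static int countInterleavings(String s1, String s2) {
        int n = s1.length(), m = s2.length();
        int[] bal1 = prefixBalance(s1);
        int[] bal2 = prefixBalance(s2);
        if (bal1[n] + bal2[m] != 0) {
            return 0; // the full string can never be balanced
        }
        // ways[x][y]: valid prefixes built from x chars of s1 and y chars of s2.
        int[][] ways = new int[n + 1][m + 1];
        for (int x = 0; x <= n; x++) {
            for (int y = 0; y <= m; y++) {
                if (bal1[x] + bal2[y] < 0) {
                    continue; // more ')' than '(' so far: leave ways[x][y] at 0
                }
                if (x == 0 && y == 0) {
                    ways[0][0] = 1; // the empty prefix
                    continue;
                }
                long total = 0;
                if (x > 0) total += ways[x - 1][y]; // last char came from s1
                if (y > 0) total += ways[x][y - 1]; // last char came from s2
                ways[x][y] = (int) (total % MOD);
            }
        }
        return ways[n][m];
    }
}
```

The balance of a prefix depends only on x and y, not on the order in which the characters were taken, which is why a single table entry per (x, y) pair is enough. For two strings of length 2500 the table holds about 6.25 million ints (roughly 25 MB).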

optimization - converting std input to integer array in java

I want to read each line of input, store the numbers in an int[] array, perform some calculations, then move on to my next line of input as fast as possible.
Input (stdin)
2 4 8
15 10 5
12 14 3999 -284 -71
0 -213 18 4 2
0
This is a pure optimization problem and not entirely good practice in the real world, as I'm assuming perfect input. I'm interested in how to improve my current method for taking input from stdin and representing it as an integer array. I have seen methods using Scanner and its nextInt method, but I've read in multiple places that Scanner is a lot slower than BufferedReader.
Can this taking in of input step be improved?
Current Method
BufferedReader bufferedInput = new BufferedReader(new InputStreamReader(System.in));
String line;
String[] lineArray;
try{
// a line with just "0" indicates end of std input
while((line = bufferedInput.readLine()) != null && !line.equals("0")){ // compare content, not references
lineArray = line.split("\\s+"); // is "\\s+" the optimized regex
int arrlength = lineArray.length;
int[] lineInt = new int[arrlength];
for(int i = 0; i < arrlength; i++){
lineInt[i] = Integer.parseInt(lineArray[i]);
}
// Perform some operations on lineInt, then regenerate a new
// lineInt with inputs from next line of stdin
}
}catch(IOException e){
}
Judging from other questions (e.g., "Difference between parseInt and valueOf in java?"), parseInt seems to be the most efficient method for converting strings to integers. Any enlightenment would be of great help.
Thank you :)
Edit 1: removed GCD information and 'algorithm' tag
Edit 2: (hopefully) made question more concise, grammatical fix ups
First of all, I just want to point out that it is totally pointless optimizing in your particular example.
For your example, most people would agree that the best solution is not the optimal one. Rather, the most readable solution will be the best.
Having said that, if you want the most optimal solution, then don't use Scanner, don't use BufferedReader.readLine(), don't use String.split and don't use Integer.parseInt(...).
Instead read characters one at a time using BufferedReader.read() and parse and convert them to int by hand. You also need to implement your own "extendable array of int" type that behaves like an ArrayList<Integer>.
This is a lot of (unnecessary) work, and many more lines of code to maintain. BAD IDEA ...
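Purely to show what that extra work involves, here is a rough sketch of the hand-rolled approach: it reads one character at a time from a Reader and grows a plain int[] as needed. It assumes perfect input (digits, minus signs, spaces and line breaks only); the class and method names are made up, and no speed-up has been measured.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;

class LineIntReader {
    // Reads one line of space-separated ints; returns null at end of stream.
    static int[] readLine(Reader in) {
        int[] buf = new int[8]; // hand-made "extendable array of int"
        int size = 0;
        int current = 0;
        boolean negative = false, inNumber = false, sawAny = false;
        try {
            int ch;
            while ((ch = in.read()) != -1 && ch != '\n') {
                sawAny = true;
                if (ch >= '0' && ch <= '9') {
                    current = current * 10 + (ch - '0');
                    inNumber = true;
                } else if (ch == '-') {
                    negative = true;
                } else if (inNumber) { // a separator ends the current number
                    if (size == buf.length) buf = Arrays.copyOf(buf, size * 2);
                    buf[size++] = negative ? -current : current;
                    current = 0;
                    negative = false;
                    inNumber = false;
                }
            }
            if (inNumber) { // last number on the line has no trailing space
                if (size == buf.length) buf = Arrays.copyOf(buf, size * 2);
                buf[size++] = negative ? -current : current;
            }
            if (ch == -1 && !sawAny) {
                return null; // nothing left to read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Arrays.copyOf(buf, size); // trim to the numbers actually read
    }
}
```

In practice the Reader would be a BufferedReader wrapped around System.in, and the caller would stop when the returned array is a single 0. Compare the amount of code here with the few lines in the question to see why this is rarely worth it.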
I second what Stephen said, the speed of parsing is likely to massively outperform the speed of actual I/O done, therefore improving parsing won't give you much.
Seriously, don't do this unless you've built the whole system, profiled it and found that inefficient parsing is what keeps it from hitting its performance targets.
But strictly just as an exercise, and because the general principle may be useful elsewhere, here's an example of how to parse it straight from a string.
The assumptions are:
You will use a sensible encoding, where the characters 0..9 are consecutive.
The only characters in the stream will be 0..9, minus sign and space.
All the numbers are well-formed.
Another important caveat is that for the sake of simplicity I used ArrayList, which is a bad idea for storing primitives, the overhead of boxing/unboxing probably wipes out all improvement in parsing speed. In the real world I'd use a list variant custom-made for primitives.
public static List<Integer> parse(String s) {
List<Integer> ret = new ArrayList<Integer>();
int sign = 1;
int current = 0;
boolean inNumber = false;
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
if (c >= '0' && c <= '9') { //we assume a sensible encoding
current = current * 10 + sign * (c-'0');
inNumber = true;
}
else if (c == ' ' && inNumber) {
ret.add(current);
current = 0;
inNumber = false;
sign = 1;
}
else if (c == '-') {
sign = -1;
}
}
if (inNumber) {
ret.add(current);
}
return ret;
}

Java: looking for the fastest way to check String for presence of Unicode chars in certain range

I need to implement a very crude language identification algorithm. In my world, there are only two languages: English and not-English. I have an ArrayList<String> and I need to determine if each String is likely in English or the other language, which has its Unicode chars in a certain range. So what I want to do is to check each String against this range using some type of "presence" test. If it passes the test, I say the String is not English, otherwise it's English. I want to try two types of tests:
TEST-ANY: If any char in the string falls within the range, the string passes the test
TEST-ALL: If all chars in the string fall within the range, the string passes the test
Since the array might be very long, I need to implement this very efficiently. What would be the fastest way of doing this in Java?
Thx
UPDATE: I am specifically checking for non-English by looking at a specific range of Unicodes rather than checking for whether the characters are ASCII, in part to take care of the "resume" problem mentioned below. What I am trying to figure out is whether Java provides any classes/methods that essentially implement TEST-ANY or TEST-ALL (or another similar test) as efficiently as possible. In other words, I am trying to avoid reinventing the wheel, especially if the wheel invented before me is better anyway.
Here's how I ended up implementing TEST-ANY:
// TEST-ANY
String str = "wordToTest";
int UrangeLow = 1234; // can get range from e.g. http://www.utf8-chartable.de/unicode-utf8-table.pl
int UrangeHigh = 2345;
for(int iLetter = 0; iLetter < str.length() ; iLetter++) {
int cp = str.codePointAt(iLetter);
if (cp >= UrangeLow && cp <= UrangeHigh) {
// word is NOT English
return;
}
}
// word is English
return;
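For comparison, both tests can be written compactly with String.codePoints() (available since Java 8), which also steps correctly over surrogate pairs. This is only a sketch with invented names (RangeTests, anyInRange, allInRange), and the range bounds are whatever you choose. A plain index loop may well be faster, so measure before relying on it.

```java
class RangeTests {
    // TEST-ANY: true if at least one code point lies in [low, high].
    static boolean anyInRange(String s, int low, int high) {
        return s.codePoints().anyMatch(cp -> cp >= low && cp <= high);
    }

    // TEST-ALL: true if every code point lies in [low, high].
    // Note that allMatch returns true for an empty string.
    static boolean allInRange(String s, int low, int high) {
        return s.codePoints().allMatch(cp -> cp >= low && cp <= high);
    }
}
```

With the Cyrillic block (0x0400 to 0x04FF) as the range, a string that mixes Latin and Cyrillic letters passes TEST-ANY but fails TEST-ALL, because the Latin letters and the space fall outside the range.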
I really don't think that this solution is ideal for determining language, but if you want to check to see if a string is all ascii, you could do something like this:
public static boolean isASCII(String s){
boolean ret = true;
for(int i = 0; i < s.length() ; i++) {
if(s.charAt(i)>=128){
ret = false;
break;
}
}
return ret;
}
So then if you try this:
boolean r = isASCII("Hello");
r would equal true. But if you try:
boolean r = isASCII("Grüß dich");
then r would equal false. I haven't tested performance, but this would work reasonably fast, because all it does is compare a character to the number 128.
But as @AlexanderPogrebnyak mentioned in the comments above, this will return false if you give it "résumé". Be aware of that.
Update:
I am specifically checking for non-English by looking at a specific range of Unicodes rather than checking for whether the characters are ASCII
But ASCII is a range in Unicode: its first 128 code points, which UTF-8 encodes exactly as ASCII does. Unicode is a superset of ASCII. What the code @mP. and I provided does is check whether each character is in a certain range. I chose that range to be ASCII, which is any Unicode character that has a decimal value of less than 128. You can just as well choose any other range. The reason I chose ASCII is that it's the one with the Latin alphabet, the Arabic numerals, and some other common characters that would normally be in an 'English' string.
public static boolean isAscii( String s ){
int length = s.length();
for( int i = 0; i < length; i++){
final char c = s.charAt( i );
if( c > 'z' ){
return false;
}
}
return true;
}
@Hassan thanks for picking up the typo; I replaced the test against big Z with little z.
