Difference between this Java and C code? - java

I tried to convert some C code to Java, but it behaves slightly differently. It is XOR encryption, and for some data it returns the same results, so I know it is pretty close, but for other data the results differ.
C code (runs on x86 Windows, compiled with Borland Builder):
void Crypt(unsigned char *data,long len)
{
char key[] = "X02$B:";
ULONG crypt_ptr;
long x;
for(crypt_ptr=0,x=0;x<len;x++)
{
data[x] ^= key[crypt_ptr];
if(crypt_ptr < (sizeof(key) - 2))
key[crypt_ptr] += key[crypt_ptr + 1];
else
key[crypt_ptr] += key[0];
if(!key[crypt_ptr])
key[crypt_ptr] += (char) 1;
if(++crypt_ptr >= sizeof(key) - 1)
crypt_ptr = 0;
}
}
Java Code (which runs on the Android platform if it matters):
public static void Crypt(byte[] data,int offset,int len)
{
// EDIT: Changing this to byte[] instead of char[] seems to have fixed code
//char[] key = {'X','0','2','$','B',':'};
byte[] key = {'X','0','2','$','B',':'};
int size_of_key = 7;
int crypt_ptr;
int x;
for(crypt_ptr=0,x=0;x<len;x++)
{
data[x+offset] ^= key[crypt_ptr];
if(crypt_ptr < (size_of_key - 2))
key[crypt_ptr] += key[crypt_ptr + 1];
else
key[crypt_ptr] += key[0];
if(key[crypt_ptr] == 0)
key[crypt_ptr] += (char) 1;
if(++crypt_ptr >= size_of_key - 1)
crypt_ptr = 0;
}
}
I have confirmed that the data going in to each function is the same, and for the Java version I am passing in the correct offset value within the byte array. As mentioned, it works sometimes, so I don't think it is a major/obvious issue, more like some minor issue between signed vs unsigned values. If it helps at all, the first byte that was different was at byte 125 (index 124 if zero-based) in the data. I didn't see a pattern though, like every 125 bytes, it was pretty much random after that. The data is only 171 bytes, and I can figure out how to post as an attachment probably if needed, but I don't think it is.

I guess it's because char is 16-bit in Java. So when you increment the key with key[crypt_ptr] += (char) 1 or add two chars with key[crypt_ptr] += key[crypt_ptr + 1], it behaves differently from C (where char is 8-bit).
Try to use bytes everywhere instead of chars, just use symbol codes for initialization.
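To see where the two types diverge, here is a small sketch (the class and method names are made up for illustration). When a sum reaches 0x100 it wraps to zero in 8 bits, which is exactly the case the zero check in the C loop reacts to, but it stays at 256 in a 16-bit Java char, so the check never fires:

```java
final class KeyWidthDemo {
    // Adds two values the way an 8-bit C char would: the result wraps modulo 256.
    static byte addAsByte(byte a, byte b) {
        return (byte) (a + b);
    }

    // Adds two values as Java chars: the result wraps modulo 65536 instead.
    static char addAsChar(char a, char b) {
        return (char) (a + b);
    }

    public static void main(String[] args) {
        // 0xC0 + 0x40 = 0x100: zero in 8 bits, 256 in 16 bits.
        byte b = addAsByte((byte) 0xC0, (byte) 0x40);
        char c = addAsChar((char) 0xC0, (char) 0x40);
        System.out.println(b == 0); // true
        System.out.println(c == 0); // false
    }
}
```

Once one key element differs like this, every later XOR that uses it differs too, which would explain output that matches for a while and then looks random.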

Your key values need to be 8-bit. Try
byte[] key = "X02$B:".getBytes();
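Putting that together, a byte-based port might look like the sketch below. It is untested against the Borland build; the class name is hypothetical, and the explicit US_ASCII charset is an assumption made so the key does not depend on the platform default encoding. It uses key.length (6) where the C code uses sizeof(key) - 1.

```java
import java.nio.charset.StandardCharsets;

final class XorCrypt {
    // XORs len bytes of data, starting at offset, with a key that mutates as it goes.
    static void crypt(byte[] data, int offset, int len) {
        byte[] key = "X02$B:".getBytes(StandardCharsets.US_ASCII);
        int p = 0;
        for (int x = 0; x < len; x++) {
            data[offset + x] ^= key[p];
            // Same branch as the C code: add the next key byte, or key[0] for the last one.
            int other = (p < key.length - 1) ? p + 1 : 0;
            key[p] += key[other]; // compound assignment narrows back to 8 bits
            if (key[p] == 0) {
                key[p] = 1;
            }
            if (++p >= key.length) {
                p = 0;
            }
        }
    }

    public static void main(String[] args) {
        byte[] data = "round trip".getBytes(StandardCharsets.US_ASCII);
        crypt(data, 0, data.length);
        crypt(data, 0, data.length); // the key stream does not depend on the data
        System.out.println(new String(data, StandardCharsets.US_ASCII));
    }
}
```

Because the key evolution never looks at the data, applying crypt twice restores the input, which makes a convenient sanity check.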

Why don't you show us an example where the differences manifest themselves?
BTW, I would write:
char[] key = {'X','0','2','$','B',':', '\0'};

Related

Reverse Engineering - What's Kind of Algorithm In This Code?

Can someone help me figure out what type of algorithm this is?
public class hf1 {
public static final String[] f3650a = {"0sFU#W>Ao*BT64?[L5aONSK.'"...};
public static final String[] f3651b = {" \bB\u0017\u001e)YBN\u001eT/\u001e4V\u0001ZT/6VV"...};
public static final String[] f3652c = {"\u0000\u0000\u0013\u00006\u0000\\\u0000\u0000¢\u0000¹\u0000¿\u0000"...};
public static String m5192a(int i) {
int i2 = i / 4096;
int i3 = i % 4096;
int i4 = i + 1;
int i5 = i4 / 4096;
int i6 = i4 % 4096;
String[] strArr = f3652c;
String str = strArr[i2];
String str2 = strArr[i5];
int i7 = i3 * 2;
int charAt = ((str.charAt(i7 + 1) & 65535) << 16) | (str.charAt(i7) & 65535);
int i8 = i6 * 2;
int charAt2 = ((str2.charAt(i8 + 1) << 16) | str2.charAt(i8)) - charAt;
char[] cArr = new char[charAt2];
for (int i9 = 0; i9 < charAt2; i9++) {
int i10 = charAt + i9;
int charAt3 = f3651b[i10 / 8192].charAt(i10 % 8192) & 65535;
cArr[i9] = f3650a[charAt3 / 8192].charAt(charAt3 % 8192);
}
return new String(cArr);
}
}
If I call m5192a(1), passing any int index as the parameter, the code returns a String. It looks like a way of hiding plain strings in the source code.
Does anyone have an idea for a possible reverse of this code, to transform a plain string into this form? Is this a known technique with a name?
I'm not sure about a "known technique with a name" - perhaps keywords like "inverse function" or "bidirectionalization" might help, e.g. In pure functional languages, is there an algorithm to get the inverse function?
For this particular function, luckily it seems pretty straightforward to invert as long as the f365xx string constants abide by some certain properties. Notice that cArr is filled one-by-one and each character is independent of each other. We can try to use both the characters and the length of the string to try to decode the input i. Specifically,
A single character will give us candidates for i10, which will give us candidates for charAt. Repeating over all the characters, hopefully only one candidate will work for every character, and that candidate will give us i7/i2 -> i2/i3 -> i if the strings are "benevolent".
If the strings are not benevolent, then the length of the string will give us candidates for charAt2 which, together with the candidates for charAt, will give us candidates for i8/i6. Hopefully there is only one candidate pair i6/i8 -> i5/i6 -> i4 -> i if the string constants are benevolent.
If the strings are still not benevolent, then it is unsolvable since the function is not one-to-one.
I'll outline the algorithm for the 2 bullet points now, but as a disclaimer, I haven't tested this and it's just pseudocode.
From a character
Let's say you start with the first character, cArr[0]. If we focus on the 3 lines inside the for loop,
int i10 = charAt + i9;
int charAt3 = f3651b[i10 / 8192].charAt(i10 % 8192) & 65535;
cArr[i9] = f3650a[charAt3 / 8192].charAt(charAt3 % 8192);
Then we can get charAt3 by finding where cArr[0] appears in f3650a. For example,
charAt3candidates = []
for i = 0 to length(f3650a)-1:
indexCandidates = allIndexesOf(f3650a[i], cArr[0])  // every position where the character occurs
for indexCandidate in indexCandidates:
charAt3candidate = i * 8192 + indexCandidate
if indexCandidate >= 8192 or charAt3candidate >= 65536:
continue
charAt3candidates.append(charAt3candidate)
Note that we filter out indexCandidate >= 8192 and charAt3candidate >= 65536 due to modulo's and bitwise and's that make greater values impossible (x & 65535 == x % 65536 since 65536 = 2^16).
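The candidate search above could be sketched in Java roughly as follows. This is only an illustration: the class name, method name and the tiny tables in main are hypothetical, and the real f3650a contents are elided in the question.

```java
import java.util.ArrayList;
import java.util.List;

final class CandidateSearch {
    static final int CHUNK = 8192;

    // Returns every value v = i * 8192 + idx such that table[i].charAt(idx) == target,
    // keeping only values that the original "% 8192" and "& 65535" could have produced.
    static List<Integer> candidates(String[] table, char target) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < table.length; i++) {
            String chunk = table[i];
            int idx = chunk.indexOf(target);
            while (idx >= 0 && idx < CHUNK) {
                int value = i * CHUNK + idx;
                if (value < 65536) {
                    result.add(value);
                }
                idx = chunk.indexOf(target, idx + 1);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up tables, just to show the shape of the result.
        System.out.println(candidates(new String[] {"ab", "ba"}, 'a')); // [0, 8193]
    }
}
```

The same helper can be reused for the f3651b step by passing that table and each charAt3 candidate (cast to char) as the target.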
Do a similar process to find the candidates for i10 and subtract i9 to get candidates for charAt. If you have more than one candidate, repeat this process for every character in cArr and keep only the candidates of charAt which are valid for every character in cArr. Even after repeating for every character in cArr, you may still have many candidates.
For each candidate of charAt, use the right-half of the expression for charAt:
charAt % 65536 = (str.charAt(i7) & 65535)
combined with str = f3652c[i2] to try to find i2/i7 the same way we tried to find charAt3 and i10. Note that the left half of the expression, ((str.charAt(i7 + 1) & 65535) << 16), doesn't matter (for now). Hope and pray that this only gives you one possible choice of i2/i7. If so, use it to find i2/i3 -> i. If there's more than one option, then we have to try using the length of the string.
From the length of the string
The length of the string gives us charAt2, and we can use a similar process as before to find all the candidates for i8 and str2 (i5). Note in this process we have to try with every possible candidate for charAt. Hopefully there is only one candidate pair for i8/i5 remaining. If so, use it to find i5/i6 -> i4 -> i. If there are more than 1 candidate, then the function is not one-to-one and it's impossible to invert.
Good luck!

ByteArray to DoubleArray in Kotlin

I want to extrapolate a byte array into a double array.
I know how to do it in Java. But the AS converter doesn't work for this... :-D
This is the class I want to write in Kotlin:
class ByteArrayToDoubleArrayConverter {
public double[] invoke(byte[] bytes) {
double[] doubles = new double[bytes.length / 2];
int i = 0;
for (int n = 0; n < bytes.length; n = n + 2) {
doubles[i] = (bytes[n] & 0xFF) | (bytes[n + 1] << 8);
i = i + 1;
}
return doubles;
}
}
This would be a typical example of what results are expected:
class ByteArrayToDoubleArrayConverterTest {
@Test
fun `check typical values`() {
val bufferSize = 8
val bytes = ByteArray(bufferSize)
bytes[0] = 1
bytes[1] = 0
bytes[2] = 0
bytes[3] = 1
bytes[4] = 0
bytes[5] = 2
bytes[6] = 1
bytes[7] = 1
val doubles = ByteArrayToDoubleArrayConverter().invoke(bytes)
assertTrue(1.0 == doubles[0])
assertTrue(256.0 == doubles[1])
assertTrue(512.0 == doubles[2])
assertTrue(257.0 == doubles[3])
}
}
Any idea? Thanks!!!
I think this would be clearest with a helper function.  Here's an extension function that uses a lambda to convert pairs of bytes into a DoubleArray:
inline fun ByteArray.mapPairsToDoubles(block: (Byte, Byte) -> Double)
= DoubleArray(size / 2){ i -> block(this[2 * i], this[2 * i + 1]) }
That uses the DoubleArray constructor which takes an initialisation lambda as well as a size, so you don't need to loop through setting values after construction.
The required function then simply needs to know how to convert each pair of bytes into a double.  Though it would be more idiomatic as an extension function rather than a class:
fun ByteArray.toDoubleSamples() = mapPairsToDoubles{ a, b ->
(a.toInt() and 0xFF or (b.toInt() shl 8)).toDouble()
}
You can then call it with e.g.:
bytes.toDoubleSamples()
(.toXxx() is the conventional name for a function which returns a transformed version of an object.  The standard name for this sort of function would be toDoubleArray(), but that normally converts each value to its own double; what you're doing is more specialised, so a more specialised name would avoid confusion.)
The only awkward thing there (and the reason why the direct conversion from Java fails) is that Kotlin is much more fussy about its numeric types, and won't automatically promote them the way Java and C do; it also doesn't have byte overloads for its bitwise operators.  So you need to call toInt() explicitly on each byte before you can call and and shl, and then call toDouble() on the result.
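For comparison, here is a Java sketch that avoids the manual masking and shifting by letting ByteBuffer assemble each pair. It assumes, as the original Java expression effectively does, that each pair is a signed 16-bit little-endian value; the class name is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class LittleEndianShorts {
    // Reads consecutive byte pairs as signed 16-bit little-endian values.
    // A trailing odd byte, if any, is ignored.
    static double[] toDoubles(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        double[] out = new double[bytes.length / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getShort(); // short widens to double implicitly
        }
        return out;
    }

    public static void main(String[] args) {
        double[] d = toDoubles(new byte[] {1, 0, 0, 1, 0, 2, 1, 1});
        System.out.println(java.util.Arrays.toString(d)); // [1.0, 256.0, 512.0, 257.0]
    }
}
```

The same ByteBuffer calls are available from Kotlin, so this is another way to sidestep the explicit toInt() conversions.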
The result is code that is a lot shorter, hopefully much more readable, and also very efficient!  (No intermediate arrays or lists, and — thanks to the inline — not even any unnecessary function calls.)
(It's a bit more awkward than most Kotlin code, as primitive arrays aren't as well-supported as reference-based arrays — which are themselves not as well-supported as lists.  This is mainly for legacy reasons to do with Java compatibility.  But it's a shame that there's no chunked() implementation for ByteArray, which could have avoided the helper function, though at the cost of a temporary list.)

First Byte's Bit Off-By-One after Steganography

Currently working on a Steganography project where, given a message in bytes and the number of bits to modify per byte, hide a message in an arbitrary byte array.
In the first decoded byte of the resulting message, the value has its first (leftmost) bit set to '1' instead of '0'. For example, when using message "Foo".getBytes() and maxBits = 1 the result is "Æoo", not "Foo" (0b01000110 gets changed to 0b11000110). With message "Æoo".getBytes() and maxBits = 1 the result is "Æoo", meaning the bit is not getting flipped as far as I can tell.
Only certain values of maxBits for certain message bytes cause this error, for example "Foo" encounters this problem at maxBits equal to 1, 5, and 6, whereas "Test" encounters this problem at maxBits equal to 1, 3, and 5. Only the resulting first character ends up with its first bit set, and this problem only occurs at the specified values of this.maxBits related to the initial data.
Why, for certain values of maxBits, is the first bit of the
resulting decoded message always 1?
Why do different inputs have different values for maxBits that
work fine, and others that do not?
What is the pattern with the value of maxBits and the
resulting erroneous results in relation to the original data?
Encode and Decode Methods:
public byte[] encodeMessage(byte[] data, byte[] message) {
byte[] encoded = data;
boolean[] messageBits = byteArrToBoolArr(message);
int index = 0;
for (int x = 0; x < messageBits.length; x++) {
encoded[index] = messageBits[x] ? setBit(encoded[index], x % this.maxBits) : unsetBit(encoded[index], x % this.maxBits);
if (x % this.maxBits == 0 && x != 0)
index++;
}
return encoded;
}
public byte[] decodeMessage(byte[] data) {
boolean[] messageBits = new boolean[data.length * this.maxBits];
int index = 0;
for (int x = 0; x < messageBits.length; x++) {
messageBits[x] = getBit(data[index], x % this.maxBits);
if (x % this.maxBits == 0 && x != 0)
index++;
}
return boolArrToByteArr(messageBits);
}
Unset, Set, and Get Methods:
public byte unsetBit(byte data, int pos) {
return (byte) (data & ~((1 << pos)));
}
public byte setBit(byte data, int pos) {
return (byte) (data | ((1 << pos)));
}
public boolean getBit(byte data, int pos) {
return ((data >>> pos) & 0x01) == 1;
}
Conversion Methods:
public boolean[] byteArrToBoolArr(byte[] b) {
boolean bool[] = new boolean[b.length * 8];
for (int x = 0; x < bool.length; x++) {
bool[x] = false;
if ((b[x / 8] & (1 << (7 - (x % 8)))) > 0)
bool[x] = true;
}
return bool;
}
public byte[] boolArrToByteArr(boolean[] bool) {
byte[] b = new byte[bool.length / 8];
for (int x = 0; x < b.length; x++) {
for (int y = 0; y < 8; y++) {
if (bool[x * 8 + y]) {
b[x] |= (128 >>> y);
}
}
}
return b;
}
Sample Code and Output:
test("Foo", 1);//Æoo
test("Foo", 2);//Foo
test("Foo", 3);//Foo
test("Foo", 4);//Foo
test("Foo", 5);//Æoo
test("Foo", 6);//Æoo
test("Foo", 7);//Foo
test("Foo", 8);//Foo
test("Test", 1);//Ôest
test("Test", 2);//Test
test("Test", 3);//Ôest
test("Test", 4);//Test
test("Test", 5);//Ôest
test("Test", 6);//Test
test("Test", 7);//Test
test("Test", 8);//Test
private static void test(String s, int x) {
BinaryModifier bm = null;
try {
bm = new BinaryModifier(x);//Takes maxBits as constructor param
} catch (BinaryException e) {
e.printStackTrace();
}
System.out.println(new String(bm.decodeMessage(bm.encodeMessage(new byte[1024], s.getBytes()))));
return;
}
Your logic of incrementing index has two flaws, which overwrite the first bit of the first letter. Obviously, the bug is expressed when the overwriting bit is different to the first bit.
if (x % this.maxBits == 0 && x != 0)
index++;
The first problem has to do with embedding only one bit per byte, i.e. maxBits = 1. After you have embedded the very first bit and reached the above conditional, x is still 0, since it will be incremented at the end of the loop. You should be incrementing index at this point, but x != 0 prevents you from doing so. Therefore, the second bit will also be embedded in the first byte, effectively overwriting the first bit. Since this logic also exists in the decode method, you read the first two bits from the first byte.
More specifically, if you embed a 00 or 11, it will be fine. But a 01 will be read as 11 and a 10 will be read as 00, i.e., whatever value the second bit has. If the first letter has an ASCII code less than or equal to 63 (00xxxxxx), or greater than or equal to 192 (11xxxxxx), it will come out fine. For example:
# -> # : 00100011 (35) -> 00100011 (35)
F -> Æ : 01000110 (70) -> 11000110 (198)
The second problem has to do with the x % this.maxBits == 0 part. Consider the case where we embed 3 bits per byte. After the 3rd bit, when we reach the conditional we still have x = 2, so the check x % this.maxBits == 0 is false (2 % 3 is 2). After we have embedded a 4th bit, we do have x = 3 and we're allowed to move on to the next byte. However, this extra 4th bit will be written at the 0th position of the first byte, since x % this.maxBits will be 3 % 3 = 0. So again, we have a bit overwriting our very first bit. However, after the first cycle the modulo operation will correctly write only 3 bits per byte, so the rest of our message will be unaffected.
Consider the binary for "F", which is 01000110. By embedding N bits per byte, we effectively embed the following groups in the first few bytes.
1 bit 01 0 0 0 1 1 0
2 bits 010 00 11 0x
3 bits 0100 011 0xx
4 bits 01000 110x
5 bits 010001 10xxxx
6 bits 0100011 0xxxxx
7 bits 01000110
8 bits 01000110x
As you can see, for groups of 5 and 6 bits, the last bit of the first group is 1, which will overwrite our initial 0 bit. For all other cases the overwrite doesn't affect anything. Note that for 8 bits, we end up using the first bit of the second letter. If that happened to have an ASCII code greater than or equal to 128, it would again overwrite the very first 0 bit.
To address all problems, use either
for (int x = 0; x < messageBits.length; x++) {
// code in the between
if ((x + 1) % this.maxBits == 0)
index++;
}
or
for (int x = 0; x < messageBits.length; ) {
// code in the between
x++;
if (x % this.maxBits == 0)
index++;
}
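An alternative sketch sidesteps the running index altogether by deriving both the byte index and the bit position from x. This is not the asker's BinaryModifier class: the names are illustrative, and the decoder takes the message length as a parameter, which is an assumption made to keep the example short.

```java
import java.util.Arrays;

final class LsbCodec {
    // Hides message in the low maxBits bits of each cover byte (maxBits in 1..8).
    static byte[] encode(byte[] cover, byte[] message, int maxBits) {
        int totalBits = message.length * 8;
        if ((totalBits + maxBits - 1) / maxBits > cover.length) {
            throw new IllegalArgumentException("cover too small");
        }
        byte[] out = Arrays.copyOf(cover, cover.length);
        for (int x = 0; x < totalBits; x++) {
            boolean bit = ((message[x / 8] >> (7 - x % 8)) & 1) == 1;
            int index = x / maxBits; // which cover byte
            int pos = x % maxBits;   // which bit inside it
            if (bit) {
                out[index] |= (byte) (1 << pos);
            } else {
                out[index] &= (byte) ~(1 << pos);
            }
        }
        return out;
    }

    // Reads messageLength bytes back using the same index/position mapping.
    static byte[] decode(byte[] data, int messageLength, int maxBits) {
        byte[] message = new byte[messageLength];
        for (int x = 0; x < messageLength * 8; x++) {
            if (((data[x / maxBits] >> (x % maxBits)) & 1) == 1) {
                message[x / 8] |= (byte) (0x80 >>> (x % 8));
            }
        }
        return message;
    }

    public static void main(String[] args) {
        byte[] msg = {70, 111, 111}; // "Foo"
        for (int m = 1; m <= 8; m++) {
            byte[] back = decode(encode(new byte[1024], msg, m), msg.length, m);
            System.out.println(m + ": " + Arrays.equals(msg, back));
        }
    }
}
```

With index and pos both computed from x, there is no separate counter that can fall out of step with the bit position.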
Your code has another potential problem which hasn't been expressed. If your data array has a size of 1024, but you only embed 3 letters, you will affect only the first few bytes, depending on the value of maxBits. However, for the extraction, you define your array to have a size of data.length * this.maxBits. So you end up reading bits from all of the bytes of the data array. This is currently no problem, because your array is populated by 0s, which decode to NUL characters that are invisible when printed. However, if your array had actual numbers, you'd end up reading a lot of garbage past the point of your embedded data.
There are two general ways of addressing this. You either
append a unique sequence of bits at the end of your message (marker), such that when you encounter that sequence you terminate the extraction, e.g. eight 0s, or
you add a few bits before embedding your actual data (header), which will tell you how to extract your data, e.g., how many bytes and how many bits per byte to read.
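The header option could look roughly like the sketch below, which prefixes the message with its length as a 4-byte big-endian int before embedding. The class and method names are hypothetical, and the 4-byte size is just one possible choice.

```java
import java.nio.ByteBuffer;

final class LengthHeader {
    // Returns [4-byte big-endian length][message bytes].
    static byte[] withHeader(byte[] message) {
        return ByteBuffer.allocate(4 + message.length)
                .putInt(message.length)
                .put(message)
                .array();
    }

    // Reads the length first, then exactly that many bytes; anything after is ignored.
    static byte[] stripHeader(byte[] framed) {
        ByteBuffer buf = ByteBuffer.wrap(framed);
        int length = buf.getInt();
        byte[] message = new byte[length];
        buf.get(message);
        return message;
    }

    public static void main(String[] args) {
        byte[] framed = withHeader(new byte[] {70, 111, 111});
        System.out.println(framed.length);              // 7
        System.out.println(stripHeader(framed).length); // 3
    }
}
```

The framed array is what would be passed to the encoder; on extraction, the first 32 bits tell the decoder how many more bits to read.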
One thing you're probably going to run afoul of is the nature of character encoding.
When you call s.getBytes() you are turning the string to bytes using your JVM's default encoding. Then you modify the bytes and you create a new String from the modified bytes again using the default encoding.
So the question is what is that encoding and precisely how does it work. For example, the encoding may well in some cases only be looking at the lower 7 bits of a byte relating to the character, then your setting of the top bit won't have any effect on the string created from the modified bytes.
If you really want to tell if your code is working right, do your testing by directly examining the byte[] being produced by your encode and decode methods, not by turning the modified bytes into strings and looking at the strings.

Byte to "Bit"array

A byte is the smallest numeric datatype Java offers, but yesterday I came into contact with byte streams for the first time. At the beginning of every packet a marker byte is sent which gives further instructions on how to handle the packet. Every bit of the byte has a specific meaning, so I need to split the byte into its 8 bits.
You could probably convert the byte to a boolean array or create a switch for every case, but that certainly can't be the best practice.
How is this possible in Java, and why are there no bit datatypes in Java?
Because there is no bit data type that exists on the physical computer. The smallest unit you can allocate on most modern computers is a byte, which is also known as an octet or 8 bits. When you display a single bit you are really just pulling that bit out of the byte with arithmetic and adding it to a new byte, which still uses 8 bits of space. If you want to put bit data inside of a byte you can, but it will be stored as at least a single byte no matter what programming language you use.
You could load the byte into a BitSet. This abstraction hides the gory details of manipulating single bits.
import java.util.BitSet;
public class Bits {
public static void main(String[] args) {
byte[] b = new byte[]{10};
BitSet bitset = BitSet.valueOf(b);
System.out.println("Length of bitset = " + bitset.length());
for (int i=0; i<bitset.length(); ++i) {
System.out.println("bit " + i + ": " + bitset.get(i));
}
}
}
$ java Bits
Length of bitset = 4
bit 0: false
bit 1: true
bit 2: false
bit 3: true
You can ask for any bit, but the length tells you that all the bits past length() - 1 are set to 0 (false):
System.out.println("bit 75: " + bitset.get(75));
bit 75: false
Have a look at java.util.BitSet.
You might use it to interpret the byte read and can use the get method to check whether a specific bit is set like this:
byte b = (byte) stream.read(); // InputStream.read() returns an int (-1 at end of stream)
final BitSet bitSet = BitSet.valueOf(new byte[]{b});
if (bitSet.get(2)) {
state.activateComponentA();
} else {
state.deactivateComponentA();
}
state.setFeatureBTo(bitSet.get(1));
On the other hand, you can create your own bitmask easily and convert it to a byte array (or just byte) afterwards:
final BitSet output = BitSet.valueOf(ByteBuffer.allocate(1));
output.set(3, state.isComponentXActivated());
if (state.isY){
output.set(4);
}
final byte[] bytes = output.toByteArray(); // empty when no bit is set
final byte w = bytes.length == 0 ? 0 : bytes[0];
How is this possible in java why are there no bit datatypes in java?
There are no bit data types in most languages. And most CPU instruction sets have few (if any) instructions dedicated to addressing single bits. You can think of the lack of these as a trade-off between (language or CPU) complexity and need.
Manipulating a single bit can be thought of as a special case of manipulating multiple bits, and languages as well as CPUs are equipped for the latter.
Very common operations like testing, setting, clearing, inverting as well as exclusive or are all supported on the integer primitive types (byte, short/char, int, long), operating on all bits of the type at once. By choosing the parameters appropriately you can select which bits to operate on.
If you think about it, a byte array is a bit array where the bits are grouped in packages of 8. Addressing a single bit in the array is relatively simple using logical operators (AND &, OR |, XOR ^ and NOT ~).
For example, testing if bit N is set in a byte can be done using a logical AND with a mask where only the bit to be tested is set:
public boolean testBit(byte b, int n) {
int mask = 1 << n; // equivalent of 2 to the nth power
return (b & mask) != 0;
}
Extending this to a byte array is no magic either, each byte consists of 8 bits, so the byte index is simply the bit number divided by 8, and the bit number inside that byte is the remainder (modulo 8):
public boolean testBit(byte[] array, int n) {
int index = n >>> 3; // divide by 8
int mask = 1 << (n & 7); // n modulo 8
return (array[index] & mask) != 0;
}
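Setting, clearing and inverting follow the same index and mask arithmetic. A minimal sketch, using the same bit numbering as testBit above (the class and method names are illustrative):

```java
final class BitOps {
    static boolean testBit(byte[] array, int n) {
        return (array[n >>> 3] & (1 << (n & 7))) != 0;
    }

    static void setBit(byte[] array, int n) {
        array[n >>> 3] |= (byte) (1 << (n & 7));  // OR turns the bit on
    }

    static void clearBit(byte[] array, int n) {
        array[n >>> 3] &= (byte) ~(1 << (n & 7)); // AND with the inverted mask turns it off
    }

    static void flipBit(byte[] array, int n) {
        array[n >>> 3] ^= (byte) (1 << (n & 7));  // XOR inverts it
    }

    public static void main(String[] args) {
        byte[] flags = new byte[2];
        setBit(flags, 10);                      // bit 2 of the second byte
        System.out.println(flags[1]);           // 4
        System.out.println(testBit(flags, 10)); // true
    }
}
```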
Here is a sample; I hope it is useful for you!
DatagramSocket socket = new DatagramSocket(6160, InetAddress.getByName("0.0.0.0"));
socket.setBroadcast(true);
while (true) {
byte[] recvBuf = new byte[26];
DatagramPacket packet = new DatagramPacket(recvBuf, recvBuf.length);
socket.receive(packet);
String bitArray = toBitArray(recvBuf);
System.out.println(Integer.parseInt(bitArray.substring(0, 8), 2)); // convert first byte binary to decimal
System.out.println(Integer.parseInt(bitArray.substring(8, 16), 2)); // convert second byte binary to decimal
}
public static String toBitArray(byte[] byteArray) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < byteArray.length; i++) {
sb.append(String.format("%8s", Integer.toBinaryString(byteArray[i] & 0xFF)).replace(' ', '0'));
}
return sb.toString();
}

Circular increment: Which is "better"?

When you have a circular buffer represented as an array, and you need the index to wraparound (i.e., when you reach the highest possible index and increment it), is it "better" to:
return (++i == buffer.length) ? 0: i;
Or
return ++i % buffer.length;
Has using the modulo operator any drawbacks? Is it less readable than the first solution?
EDIT:
Of course it should be ++i instead of i++, changed that.
EDIT 2:
One interesting note: I found the first line of code in ArrayBlockingQueue's implementation by Doug Lea.
Update: OP has admitted in a comment that it should have been pre-increment instead. Most of the other answers missed this. There lies proof that the increment in this scenario leads to horrible readability: there's a bug, and most people couldn't see it.
The most readable version is the following:
return (i == buffer.length-1) ? 0 : i+1;
Using ++ adds an unnecessary side effect to the check (not to mention that I strongly feel that you should've used pre-increment instead).
What's the problem with the original code? Let's have a look, shall we?
return (i++ == N) ? 0 : i; // OP's original, slightly rewritten
So we know that:
i is post-incremented, so when i == N-1 before the return statement, this will return N instead of wrapping to 0 immediately
Is this intended? Most of the time, the intent is to use N as an exclusive upper bound
The variable name i suggests a local variable by naming convention, but is it really?
Need to double check if it's a field, due to side-effect
In comparison:
return (i == N-1) ? 0 : i+1; // proposed alternative
Here we know that:
i is not modified, doesn't matter if it's local variable or field
When i == N-1, the returned value is 0, which is more typical scenario
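As a self-contained sketch of the side-effect-free version (the class and method names are illustrative), the caller owns the state and the helper is a pure function:

```java
final class RingIndex {
    // Returns the index that follows i in a buffer of the given length, without modifying anything.
    static int next(int i, int length) {
        return (i == length - 1) ? 0 : i + 1;
    }

    public static void main(String[] args) {
        int[] buffer = {0, 1, 2};
        int i = 0;
        for (int j = 0; j < 5; j++) {
            i = next(i, buffer.length); // the only place where i changes
            System.out.println(buffer[i]); // prints 1, 2, 0, 1, 2
        }
    }
}
```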
The % approach
Alternatively, you can also use the % version as follows:
return (i+1) % N;
What's the problem with %? Well, the problem is that even though most people think it's the modulo operator, it's NOT! It's the remainder operator (JLS 15.17.3). A lot of people often get this confused. Here's a classic example:
boolean isOdd(int n) {
return (n % 2) == 1; // does this work???
}
That code is broken!!! It returns false for all negative values! The problem is that -1 % 2 == -1, although mathematically -1 = 1 (mod 2).
% can be tricky, and that's why I recommend the ternary operator version instead. The most important part, though, is to remove the side-effect of the increment.
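If a true modulo is wanted, Java 8 and later provide Math.floorMod, whose result takes the sign of the divisor. A short sketch (the class and method names are illustrative) contrasting it with %, including the case of stepping a circular index backwards:

```java
final class FloorModDemo {
    static boolean isOddBroken(int n) {
        return (n % 2) == 1;              // false for every negative n
    }

    static boolean isOdd(int n) {
        return Math.floorMod(n, 2) == 1;  // works for negative n as well
    }

    // Steps a circular index backwards; plain % would yield -1 for i == 0.
    static int previous(int i, int length) {
        return Math.floorMod(i - 1, length);
    }

    public static void main(String[] args) {
        System.out.println(isOddBroken(-3)); // false
        System.out.println(isOdd(-3));       // true
        System.out.println(previous(0, 3));  // 2
    }
}
```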
See also
Wikipedia: modulo operation
Don't ask me to choose between two options which both contain postincrement (*) mixed with expression evaluation. I'll say "none".
(*) Update: It was later fixed to preincrement.
Wouldn't the i++ % buffer.length version have the drawback that it keeps incrementing i, which could lead to it hitting some sort of max_int/max_long/max_whatever limit?
Also, I would split this into
i = (i++ == buffer.length) ? 0 : i;
return i;
since otherwise you'd most likely have a bug.
The first one will give you an ArrayIndexOutOfBoundsException because i is never actually reset to 0.
The second one will (probably) give you an overflow error (or related undesirable effect) when i == Integer.MAX_VALUE (which might not actually happen in your case, but isn't good practice, IMHO).
So I'd say the second one is "more correct", but I would use something like:
i = (i+1) % buffer.length;
return i;
Which I think has neither of the two problems.
I went ahead and tested everyone's code, and was sad to find that only one of the previous posts (at the time of this post's writing) works. (Which one? Try them all to find out! You might be surprised!)
public class asdf {
static int i=0;
static int[] buffer = {0,1,2};
public static final void main(String args[]){
for(int j=0; j<5; j++){
System.out.println(buffer[getIndex()]);
}
}
public static int getIndex(){
// return (++i == buffer.length) ? 0: i;
// return ++i % buffer.length;
// i = (i++ == buffer.length) ? 0 : i;
// return i;
// i++;
// if (i >= buffer.length)
// {
// i = 0;
// }
// return i;
// return (i+1 == buffer.length) ? 0 : i+1;
i = (i+1) % buffer.length;
return i;
}
}
Expected output is:
1
2
0
1
2
Apologies in advance if there's a coding error on my part and I accidentally insult someone! x.x
PS: +1 for the previous comment about not using post-increment with equality checks (I can't actually upmod posts yet =/ )
I prefer the conditional approach. Even with an unsigned type, the modulo operation has a drawback: it gives the wrong result when the counter itself rolls over to zero.
Example:
255 % 7 == 3
So if you use a byte (unsigned char), for example, the counter rolls over from 255 to 0, and 0 % 7 is 0 rather than the 4 that 256 % 7 would give, so the sequence does not rotate correctly. Use a test (an if or the ternary operator) for correctness.
For performance, if the buffer length is a power of 2 (i.e. 2, 4, 8, 16, 32, 64, ...), use the & operator.
So if the buffer length is 16, use:
n & 15
If buffer length is 64, use 63:
n & 63
Those rotate correctly even if the number goes back to zero. By the way, if the length is a power of 2, even the modulo/remainder approach would fit the bill, i.e. it will rotate correctly. But I can hazard a guess that the & operation is faster than the % operation.
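Java has no unsigned byte, but the same effect can be shown with an ever-increasing int counter that overflows. In this sketch (the names are illustrative) the mask keeps rotating across the overflow, while % with a non-power-of-2 length jumps and even goes negative:

```java
final class WrapDemo {
    // Valid only when powerOfTwoLength is a power of 2.
    static int maskIndex(int counter, int powerOfTwoLength) {
        return counter & (powerOfTwoLength - 1);
    }

    static int remainderIndex(int counter, int length) {
        return counter % length;
    }

    public static void main(String[] args) {
        int last = Integer.MAX_VALUE;
        int wrapped = last + 1; // overflows to Integer.MIN_VALUE
        System.out.println(maskIndex(last, 16) + " -> " + maskIndex(wrapped, 16));          // 15 -> 0
        System.out.println(remainderIndex(last, 7) + " -> " + remainderIndex(wrapped, 7));  // 1 -> -2
    }
}
```

Keeping the index itself bounded, as in i = (i + 1) % length or the conditional version, avoids the overflow case entirely.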
I think the second solution has the clear advantage that it works, whereas the first does not. The first solution will always return zero when i becomes bigger than buffer.length because i is never reset.
The modulo operator has no drawbacks.
Surely it would be more readable to use an if:
i++;
if (i >= buffer.length)
{
i = 0;
}
return i;
Depends a bit if buffer.length ever changes.
This is very subjective and depends on what your colleagues are used to see. I would personally prefer the first option, as it expresses explicitly what the code does, i.e. if the buffer length is reached, reset to 0. You don't have to perform any mathematical thinking or even know what the modulo does (of course you should! :)
Personally, I prefer the modulo approach. When I see modulo, I immediately think of range limiting and looping but when I see the ternary operator, I always want to think more carefully about it simply because there are more terms to look at. Readability is subjective though, as you already pointed out in your tagging, and I suspect that most people will disagree with my opinion.
However, performance is not subjective. Modulo implies a division operation, which is often slower than a comparison against zero. Obviously, this is more difficult to determine in Java since we're not compiling to native code until the JIT compiler kicks in.
My advice would be to write whichever you feel is most appropriate (so long as it works!) and get a colleague (assuming you have one) to assess it. If they disagree, ask another colleague, then go with the majority vote. #codingbydemocracy
It is also worth noting, that if our buffer has length of power of 2 then very efficient bit manipulation will work:
idx = (idx + 1) & (length - 1)
You can use also bit manipulation:
idx = (idx + 1) & ((idx + 1 - length) >> 31)
But it's not faster than the if-variant on my machine.
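The trick relies on the arithmetic right shift: while the incremented index is still below the length, the difference is negative and shifting it by 31 gives an all-ones mask; once it equals the length, the difference is 0 and so is the mask. A Java sketch of the same idea (the names are illustrative; it assumes 0 <= idx < length):

```java
final class BranchFreeWrap {
    // Returns idx + 1, or 0 when idx + 1 reaches length, without a branch.
    static int next(int idx, int length) {
        int n = idx + 1;
        int mask = (n - length) >> 31; // -1 (all ones) while n < length, 0 when n == length
        return n & mask;
    }

    public static void main(String[] args) {
        int k = 0;
        for (int j = 0; j < 12; j++) {
            k = next(k, 10);
            System.out.print(k + " ");
        }
        System.out.println();
    }
}
```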
Here is some code to compare running time in C#:
Stopwatch sw = new Stopwatch();
long cnt = 0;
int k = 0;
int modulo = 10;
sw.Start();
k = 0;
cnt = 0;
for ( int j=0 ; j<100000000 ; j++ ) {
k = (k+1) % modulo;
cnt += k;
}
sw.Stop();
Console.WriteLine( "modulo cnt=" + cnt.ToString() + " " + sw.Elapsed.ToString() );
sw.Reset();
sw.Start();
k = 0;
cnt = 0;
for (int j = 0; j < 100000000; j++) {
if ( ++k == modulo )
k = 0;
cnt += k;
}
sw.Stop();
Console.WriteLine( "if cnt=" + cnt.ToString() + " " + sw.Elapsed.ToString() );
sw.Reset();
sw.Start();
k = 0;
cnt = 0;
for (int j = 0; j < 100000000; j++) {
++k;
k = k&((k-modulo)>>31);
cnt += k;
}
sw.Stop();
Console.WriteLine( "bit cnt=" + cnt.ToString() + " " + sw.Elapsed.ToString() );
The Output:
modulo cnt=450000000 00:00:00.6406035
if cnt=450000000 00:00:00.2058015
bit cnt=450000000 00:00:00.2182448
I prefer the modulo operator for the simple reason that it is shorter. And any programmer should be able to dream in modulo, since it is almost as common as the plus operator.
