Why doesn't Java provide a bit-level read API?

I am using Java's ByteBuffer to write some basic data into streams. In one case I must transfer a list of booleans from one machine to another over the internet, so I want the buffer to be as small as possible.
I know the usual way of doing this is to use the buffer like this:
public final void writeBool(boolean b) throws IOException {
writeByte(b ? 1 : 0);
}
public final void writeByte(int b) throws IOException {
if (buffer.remaining() < Byte.BYTES) {
flush();
}
buffer.put((byte) b);
}
public boolean readBool(long pos) {
return readByte(pos) == 1;
}
public int readByte(long pos) {
return buffer.get((int)pos) & 0xff;
}
This converts each boolean into a byte and stores it in the buffer.
But I'm wondering: why not put a single bit into the buffer, so that one byte can represent eight booleans?
The code might look like this, but Java doesn't have a writeBit function:
public final void writeBool(boolean b) throws IOException {
// java doesn't have it.
buffer.writeBit(b ? 0x1 : 0x0);
}
public final boolean readBool(long pos) throws IOException {
// java doesn't have it
return buffer.getBit(pos) == 0x01;
}
So I think the only way to do that is to pack eight booleans into a byte and write that, using something like ((0x1f >>> 4) & 0x01) == 1 to check whether the fifth boolean is true. But if I can get a byte, why can't I just get a bit?
Is there some other reason that Java doesn't let us operate on individual bits?

Yeah, so I mean why not create a BitBuffer?
That would be a question for the Java / OpenJDK development team, if you want a definitive answer. However I expect they would make these points:
Such a class would have extremely limited utility in real applications.
Such a class is unnecessary, given that an application doing (notionally) bit-oriented I/O can be implemented using ByteBuffer and a small amount of "bit-twiddling".
There is the technical issue that mainstream operating systems and mainstream network protocols only support I/O down to the granularity of a byte (see note 1 below). So, for example, file lengths are recorded in bytes, and creating a file containing precisely 42 bits of data is problematic.
Anyway, there is nothing stopping you from designing and writing your own BitBuffer class; e.g. as a wrapper for ByteBuffer. And sharing it with other people who need such a thing.
Or looking on (say) Github for a Java class called BitBuffer.
1 - Indeed I don't know of any operating system, file system or network protocol that has a smaller granularity than this.
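The wrapper idea mentioned above can be sketched roughly as follows. This is only an illustration, not a JDK class: BitPacker, pack and get are hypothetical names, and the bit order (least significant bit first) is an arbitrary choice. Because the transport still works in whole bytes, the receiver has to learn the number of booleans by some other means, for example a length prefix.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of the JDK: stores eight booleans per byte in a ByteBuffer.
final class BitPacker {
    private BitPacker() {}

    // Packs the flags into ceil(n / 8) bytes; flag i lands in byte i / 8, bit i % 8.
    static ByteBuffer pack(boolean[] flags) {
        ByteBuffer buffer = ByteBuffer.allocate((flags.length + 7) / 8);
        for (int i = 0; i < flags.length; i++) {
            if (flags[i]) {
                int index = i / 8;
                // Absolute get/put, so the buffer's position is left untouched.
                buffer.put(index, (byte) (buffer.get(index) | (1 << (i % 8))));
            }
        }
        return buffer;
    }

    // Reads flag number bitIndex back out of a packed buffer.
    static boolean get(ByteBuffer buffer, int bitIndex) {
        return ((buffer.get(bitIndex / 8) >>> (bitIndex % 8)) & 1) == 1;
    }
}
```

Ten booleans then occupy two bytes instead of ten. If a ready-made class is preferred, java.util.BitSet with toByteArray() and valueOf(byte[]) covers similar ground.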

Related

Omitting an instance field at run time in Java

Java's assert mechanism allows putting in assertions that have essentially no run-time cost (aside from a bigger class file) when assertions are disabled. But this may not cover all situations.
For instance, many of Java's collections feature "fail-fast" iterators that attempt to detect when you're using them in a thread-unsafe way. But this requires both the collection and the iterator itself to maintain extra state that would not be needed if these checks weren't there.
Suppose someone wanted to do something similar, but allow the checks to be disabled and if they are disabled, it saves a few bytes in the iterator and likewise a few more bytes in the ArrayList, or whatever.
Alternatively, suppose we're doing some sort of object pooling that we want to be able to turn on and off at runtime; when it's off, it should just use Java's garbage collection and take no room for reference counts, like this (note that the code as written is very broken):
class MyClass {
static final boolean useRefCounts = my.global.Utils.useRefCounts();
static {
if(useRefCounts)
int refCount; // want instance field, not local variable
}
void incrementRefCount(){
if(useRefCounts) refCount++; // only use field if it exists;
}
/**return true if ready to be collected and reused*/
boolean decrementAndTestRefCount(){
// rely on Java's garbage collector if ref counting is disabled.
return useRefCounts && --refCount == 0;
}
}
The trouble with the above code is that the static block makes no sense. But is there some trick using low-powered magic to make something along these lines work? (If high-powered magic is allowed, the nuclear option is to generate two versions of MyClass and arrange to put the correct one on the class path at start time.)

NOTE: You might not need to do this at all. The JIT is very good at inlining constants known at runtime, especially booleans, and at optimising away code that isn't used.
The int field is not ideal; however, if you are using a 64-bit JVM, the object size might not change.
On the OpenJDK/Oracle JVM (64-bit), the header is 12 bytes by default. Object alignment is 8 bytes, so an object without fields uses 16 bytes. The field adds 4 bytes, bringing the total to exactly 16 bytes, so the aligned size is unchanged.
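The size arithmetic above can be written out as a small sketch. It only reproduces the calculation with the figures quoted in this answer (12-byte header, 8-byte alignment) and does not inspect a real JVM; ObjectSizeEstimate and alignedSize are hypothetical names.

```java
// Back-of-the-envelope arithmetic only; the constants are the defaults quoted in the answer.
final class ObjectSizeEstimate {
    static final int HEADER_BYTES = 12; // assumed default header size on a 64-bit JVM
    static final int ALIGNMENT = 8;     // assumed default object alignment

    private ObjectSizeEstimate() {}

    // Rounds header + fields up to the next multiple of ALIGNMENT.
    static int alignedSize(int fieldBytes) {
        int raw = HEADER_BYTES + fieldBytes;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }
}
```

With these assumptions alignedSize(0) and alignedSize(4) both come out at 16, which is why the extra int may cost nothing, while a second int would push the object to 24 bytes.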
To answer the question, you need two classes (unless you use generated code or hacks)
class MyClass {
static final boolean useRefCounts = my.global.Utils.useRefCounts();
public static MyClass create() {
return useRefCounts ? new MyClassPlus() : new MyClass();
}
void incrementRefCount() {
}
boolean decrementAndTestRefCount() {
return false;
}
}
class MyClassPlus extends MyClass {
int refCount; // want instance field, not local variable
void incrementRefCount() {
refCount++; // only use field if it exists;
}
boolean decrementAndTestRefCount() {
return --refCount == 0;
}
}

If you accept a slightly higher overhead in the case you’re using your ref count, you may resort to external storage, i.e.
class MyClass {
static final WeakHashMap<MyClass,Integer> REF_COUNTS
= my.global.Utils.useRefCounts()? new WeakHashMap<>(): null;
void incrementRefCount() {
if(REF_COUNTS != null) REF_COUNTS.merge(this, 1, Integer::sum);
}
/**return true if ready to be collected and reused*/
boolean decrementAndTestRefCount() {
return REF_COUNTS != null
&& REF_COUNTS.compute(this, (me, i) -> --i == 0? null: i) == null;
}
}
There is a behavioral difference for the case that someone invokes decrementAndTestRefCount() more often than incrementRefCount(). While your original code silently runs into a negative ref count, this code will throw a NullPointerException. I prefer failing with an exception in this case…
The code above will leave you with the overhead of a single static field in case you’re not using the feature. Most JVMs should have no problems eliminating the conditionals regarding the state of a static final variable.
Note further that the code allows MyClass instances to get garbage collected while having a non-zero ref count, just like when it was an instance field, but also actively removes the mapping when the count reaches the initial state of zero again, to minimize the work needed for cleanup.

Modular Design Patterns

I've started drawing plugs in Java, like connectors using bezier curves, but just the visual stuff.
Then I begin wondering about making some kind of modular thing, with inputs and outputs. However, I'm very confused on decisions about how to implement it. Let's say for example, a modular synthesizer, or Pure Data / MaxMSP concepts, in which you have modules, and any module has attributes, inputs and outputs.
I wonder if you know what keywords should I use to search something to read about. I need some basic examples or abstract ideas concerning this kind of interface. Is there any some design pattern that fits this idea?

Since you're asking for keywords: try real-time design patterns. Heavily object-oriented designs are often a performance bottleneck in real-time applications, since all the objects (and, I guess, polymorphism to some extent) add overhead.
Why a real-time application? The graph you provided looks very sophisticated:
you process the incoming data multiple times in parallel, split it up, merge it and so on.
Every node in the graph adds different effects and performs different computations, and some computations may take longer than others. This means that, in order to produce uniform data (sound), you have to keep the data in sync. This is no trivial task.
I guess some other keywords would be: sound processing, filter. Or you could ask companies that work in that area for literature.
Leaving the time sensitivity aside, I constructed a little OOP example;
maybe an approach like this is sufficient for less complex scenarios:
public class ConnectionCable implements Runnable, Closeable {
private final InputLine in;
private final OutputLine out;
public ConnectionCable(InputLine in, OutputLine out) {
this.in = in;
this.out = out;
// cable connects open lines and closes them upon connection
if (in.isOpen() && out.isOpen()) {
in.close();
out.close();
}
}
@Override
public void run() {
byte[] data = new byte[1024];
// cable connects output line to input line
while (out.read(data) > 0)
in.write(data);
}
@Override
public void close() throws IOException {
in.open();
out.open();
}
}
interface Line {
void open();
void close();
boolean isOpen();
boolean isClosed();
}
interface InputLine extends Line {
int write(byte[] data);
}
interface OutputLine extends Line {
int read(byte[] data);
}

Equivalent of a debug macro in Java [duplicate]

This question already has answers here:
#ifdef #ifndef in Java
(8 answers)
Closed 9 years ago.
I'm writing a program that reads structures from a file. For debugging purposes, it would be very convenient if I could have a compile-time toggle that prints the names and values of everything read which could be disabled for better performance/code size in the production version. In C, I could use the preprocessor like such to accomplish this:
#ifdef DEBUG
#define READ(name, in) { name = read(in); printf(#name ": %d\n", name); }
#else
#define READ(name, in) { name = read(in); }
#endif
void myreader(mystream_t *in)
{
int a, b, c;
READ(a, in);
READ(b, in);
READ(c, in);
}
Is there any way I can reproduce this construct? I thought about this:
private static final boolean DEBUG_ENABLED = true;
private int debugRead(MyInputStream in, String name) {
int val = in.read();
if (DEBUG_ENABLED) {
System.out.println(String.format("%s: %d", name, val));
}
return val;
}
public MyReader(MyInputStream in) {
int a, b, c;
a = debugRead(in, "a");
b = debugRead(in, "b");
c = debugRead(in, "c");
}
However, this requires me to type the name of all the variables twice as well as storing the strings corresponding to all the names even on the release version. Is there a better approach to this?
EDIT: The biggest concern I have is code verbosity. The last thing I want is to have my code cluttered with debug/print/trace statements that obscure the actual reading logic.

I'm not sure if this is an apples to apples solution, but one thing that's idiomatic in Java is to use a logging framework. With it, you can execute log.debug wherever you might need debugging. SLF4j is a common facade for logging frameworks. You could just use it with JUL logging.
Usually you leave the logging code there and you configure the logger externally to either print or not print the messages.
If you're using SLF4j, the debug message will look like this:
log.debug("Setting the creation timestamp to {}", timestamp);
Most loggers can be configured to tell you what time, class and method the logging message came from.
This has some pros and cons compared to what you're used to.
Cons
I have to admit, this will take a lot of effort to learn when all you really want right now is a System.out.println.
Most of the loggers are configured on the classpath. The classpath is a non-trivial part of java to learn but you will have to learn it eventually anyway. It's really important to understand.
It won't automatically print out the name of the variable passed in. AFAIK, you'll have to write that detail yourself
Pros
Using a logger is very robust. You can leave the code there for production and dev mode and just configure the verbosity appropriately
It can automatically print out the context of the class/method and date of the message and other things like that
You can configure the output in lots of different ways. For example, "output to the console and log.txt but when that file becomes > 100mb, rollover the old data to log2.txt and keep at most 5 log files."
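As a rough illustration of the logging approach using only the JDK's built-in java.util.logging (JUL), the following sketch wraps a read so that the message is built only when debug output is enabled. LoggedReads, logged and setDebug are hypothetical names chosen for this example, and the variable name still has to be passed as a string.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; class and method names are made up for illustration.
final class LoggedReads {
    private static final Logger LOG = Logger.getLogger(LoggedReads.class.getName());

    private LoggedReads() {}

    // Returns the value unchanged; the message is built only if FINE is enabled.
    static int logged(String name, int value) {
        // The supplier is evaluated only when the logger would accept a FINE record.
        LOG.fine(() -> name + ": " + value);
        return value;
    }

    // Switches debug output on or off at run time rather than at compile time.
    static void setDebug(boolean enabled) {
        LOG.setLevel(enabled ? Level.FINE : Level.INFO);
    }
}
```

A call site would look like int a = LoggedReads.logged("a", in.read());. With the JDK's default configuration the console handler only shows INFO and above, so its level has to be lowered as well before FINE messages actually appear.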

To the general problem of emulating C/C++ macros in Java, there are several solutions. The particular case of logging is usually resolved in a simpler way. The simplest, conceptually closest, and puristic form is abstracting the macro in an interface and producing alternative implementations:
public class Sample {
class mystream_t {
}
public int read(mystream_t is) {
return 0 ;
}
static final boolean DEBUG= false ;
interface ReadType {
public void apply(int[] name,mystream_t in);
}
ReadType READ; {
if( DEBUG ) {
READ= new ReadType(){
public void apply(int[] name,mystream_t in) {
name[0]= read(in) ; System.out.printf("#name: %d\n",name[0]);
}
};
} else {
READ= new ReadType(){
public void apply(int[] name,mystream_t in) {
name[0]= read(in) ;
}
};
}
}
void myreader(mystream_t in) {
int[] a= new int[1], b= new int[1], c= new int[1];
READ.apply(a, in);
READ.apply(b, in);
READ.apply(c, in);
}
}
This makes use of a simple, static form of code injection. I tried to make the code as close as possible to the original.
The second most relevant way of emulating C/C++ macros in Java requires annotations and annotation processing. It is even closer to C/C++ macros, but it requires more effort and relies on a mechanism that arguably is not part of the core language.
And the third one is using an Aspect-Oriented Programming framework like AspectJ.

Parsing field access flags in java

I have an assignment wherein I have to parse the field access flags of a java .class file.
The specification for a .class file can be found here: Class File Format (page 26 & 27 have the access flags and hex vals).
This is fine, I can do this no worries.
My issue is that there is a large number of combinations.
I know the public, private and protected are mutually exclusive, which reduces the combinations somewhat. Final and transient are also mutually exclusive. The rest however are not.
At the moment, I have a large switch statement to do the comparison. I read in the hex value of the access flag and then increment a counter, depending on if it is public, private or protected. This works fine, but it seems quite messy to just have every combination listed in a switch statement. i.e. public static, public final, public static final, etc.
I thought of doing modulo on the access flag and the appropriate hex value for public, private or protected, but public is 0x0001, so that won't work.
Does anyone else have any ideas as to how I could reduce the amount of cases in my switch statement?

What is the problem? The specification says that it's a bit flag, that means that you should look at a value as a binary number, and that you can test if a specific value is set by doing a bitwise AND.
E.g.
/*
ACC_VOLATILE = 0x0040 = 01000000
ACC_PUBLIC = 0x0001 = 00000001
Public and volatile = 01000001
*/
// Parentheses are required: in Java, relational operators bind tighter than &.
publicCount += (flag & ACC_PUBLIC) != 0 ? 1 : 0;
volatileCount += (flag & ACC_VOLATILE) != 0 ? 1 : 0;
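For reference, the JDK already ships these bit masks in java.lang.reflect.Modifier, so a flags word can be decoded without hand-written constants. The sketch below is one possible way to use it; FieldFlags and describe are hypothetical names, and masking with Modifier.fieldModifiers() deliberately drops bits that are not source-level field modifiers (such as the synthetic and enum flags).

```java
import java.lang.reflect.Modifier;

// Sketch: decoding a field's access_flags word with the JDK's own bit masks.
final class FieldFlags {
    private FieldFlags() {}

    // Turns the modifier bits into their keyword form, e.g. 0x0041 becomes "public volatile".
    static String describe(int accessFlags) {
        return Modifier.toString(accessFlags & Modifier.fieldModifiers());
    }

    // Equivalent to (accessFlags & 0x0001) != 0.
    static boolean isPublic(int accessFlags) {
        return Modifier.isPublic(accessFlags);
    }
}
```

Counting then reduces to one independent test per flag of interest, with no need to enumerate combinations.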

If you are trying to avoid a pattern like this one I just stole:
if ((access_flag & ACC_PUBLIC) != 0)
{
    publicCount++;
}
if ((access_flag & ACC_FINAL) != 0)
{
    finalCount++;
}
...
It's a great instinct. I make it a rule never to write code that looks redundant like that. Not only is it error-prone and more code in your class, but copy & paste code is really boring to write.
So the big trick is to make this access "generic" and easy to understand from the calling class: pull out all the repeated crap, leave just the "meat", and push the complexity into the generic routine.
So an easy way to call such a method would be something like this: pass an array of bitfields, each holding some combination of flags to be counted, plus the flags you are interested in (so that you don't waste time testing flags you don't care about):
int[] counts = sumUpBits(arrayOfFlagBitfields, ACC_PUBLIC | ACC_FINAL | ACC_...);
That's really clean, but then how do you access the return fields? I was originally thinking something like this:
System.out.println("Number of public classes="+counts[findBitPosition(ACC_PUBLIC)]);
System.out.println("Number of final classes="+counts[findBitPosition(ACC_FINAL)]);
Most of the boilerplate here is gone except the need to change the bitfields to their position. I think two changes might make it better--encapsulate it in a class and use a hash to track positions so that you don't have to convert bitPosition all the time (if you prefer not to use the hash, findBitPosition is at the end).
Let's try a full-fledged class. How should this look from the caller's point of view?
BitSummer bitSums=new BitSummer(arrayOfFlagBitfields, ACC_PUBLIC, ACC_FINAL);
System.out.println("Number of public classes="+bitSums.getCount(ACC_PUBLIC));
System.out.println("Number of final classes="+bitSums.getCount(ACC_FINAL));
That's pretty clean and easy--I really love OO! Now you just use the bitSums to store your values until they are needed (It's less boilerplate than storing them in class variables and more clear than using an array or a collection)
So now to code the class. Note that the constructor uses variable arguments now--less surprise/more conventional and makes more sense for the hash implementation.
By the way, I know this seems like it would be slow and inefficient, but it's probably not bad for most uses--if it is, it can be improved, but this should be much shorter and less redundant than the switch statement (which is really the same as this, just unrolled--however this one uses a hash & autoboxing which will incur an additional penalty).
import java.util.HashMap;

public class BitSummer {
    // sums stores the counts as <flag, count>
    private final HashMap<Integer, Integer> sums = new HashMap<Integer, Integer>();

    // Constructor does all the work, the rest is just an easy lookup.
    public BitSummer(int[] arrayOfFlagBitfields, int... positionsToCount) {
        // Loop over each bitfield we want to count
        for (int bitfield : arrayOfFlagBitfields) {
            // and over each flag to check
            for (int flag : positionsToCount) {
                // Count this bitfield only if it has the flag set
                if ((bitfield & flag) != 0) {
                    // merge starts at 1 when the flag has no entry yet (get would return null)
                    sums.merge(flag, 1, Integer::sum);
                }
            }
        }
    }

    // Return the count for a given flag (0 if it was never seen)
    public int getCount(int bit) {
        return sums.getOrDefault(bit, 0);
    }
}
I didn't test this but I think it's fairly close. I wouldn't use it for processing video packets in realtime or anything, but for most purposes it should be fast enough.
As for maintenance: the code may look long compared to the original example, but if you have more than 5 or 6 fields to check, this will actually be shorter than the chained if statements, significantly less error-prone and more maintainable, and also more interesting to write.
If you really feel the need to eliminate the hashtable you could easily replace it with a sparse array with the flag position as the index (for instance the count of a flag 00001000/0x08 would be stored in the fourth array position). This would require a function like this to calculate the bit position for array access (both storing in the array and retrieving)
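private int findBitPosition(int flag) {
    int ret = 0;
    // Count the right shifts that leave a non-zero value;
    // for a single-bit flag this is its zero-based bit index (0x08 gives 3).
    while ((flag >>>= 1) != 0)
        ret++;
    return ret;
}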
That was fun.
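The array-based alternative described above could look roughly like this. It is a sketch under the assumption that the flags fit into a 16-bit access_flags word; BitCounts is a hypothetical name, and Integer.numberOfTrailingZeros stands in for the hand-written findBitPosition.

```java
// Hypothetical array-backed variant of the counting idea; names are illustrative.
final class BitCounts {
    // One slot per bit of a 16-bit access_flags word.
    private final int[] counts = new int[16];

    BitCounts(int[] bitfields) {
        for (int bitfield : bitfields) {
            // Visit only the bits that are set, lowest first; rest &= rest - 1 clears the lowest set bit.
            for (int rest = bitfield & 0xFFFF; rest != 0; rest &= rest - 1) {
                counts[Integer.numberOfTrailingZeros(rest)]++;
            }
        }
    }

    // flag must be a single-bit mask within the low 16 bits, such as 0x0001 or 0x0010.
    int getCount(int flag) {
        return counts[Integer.numberOfTrailingZeros(flag)];
    }
}
```

This variant counts every flag in a single pass, so the caller does not need to list the flags of interest up front.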

I'm not sure that's what you're looking for, but I would use if-cases with binary AND to check if a flag is set:
if ((access_flag & ACC_PUBLIC) != 0)
{
    // class is public
}
if ((access_flag & ACC_FINAL) != 0)
{
    // class is final
}
....

Java MemoryMapping big files

Java limits a MappedByteBuffer to 2 GB, which makes it tricky to use for mapping big files. The usually recommended approach is to use an array of MappedByteBuffer and index into it like this:
long PAGE_SIZE = Integer.MAX_VALUE;
MappedByteBuffer[] buffers;
private int getPage(long offset) {
return (int) (offset / PAGE_SIZE);
}
private int getIndex(long offset) {
return (int) (offset % PAGE_SIZE);
}
public byte get(long offset) {
return buffers[getPage(offset)].get(getIndex(offset));
}
This works for single bytes, but it requires rewriting a lot of code if you want to handle reads and writes that are bigger and may cross page boundaries (getLong() or get(byte[])).
The question: what is your best practice for this kind of scenario? Do you know of any working solution or code that can be reused without reinventing the wheel?

Have you checked out dsiutils' ByteBufferInputStream?
Javadoc
The main usefulness of this class is that of making it possible creating input streams that are really based on a MappedByteBuffer.
In particular, the factory method map(FileChannel, FileChannel.MapMode) will memory-map an entire file into an array of ByteBuffer and expose the array as a ByteBufferInputStream. This makes it possible to access easily mapped files larger than 2GiB.
long length()
long position()
void position(long newPosition)
Is that something you were thinking of? It's LGPL too.
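If pulling in a library is not an option, the boundary-crossing case from the question can be handled by hand. The following is a minimal sketch, not the library's implementation: PagedBuffer is a hypothetical name, the pages are assumed to keep ByteBuffer's default big-endian order, and in real use each page would be a MappedByteBuffer obtained from FileChannel.map.

```java
import java.nio.ByteBuffer;

// Sketch of reads that may straddle page boundaries; names are hypothetical.
final class PagedBuffer {
    private final ByteBuffer[] pages;
    private final long pageSize;

    // Every page except possibly the last is assumed to hold exactly pageSize bytes.
    PagedBuffer(ByteBuffer[] pages, long pageSize) {
        this.pages = pages;
        this.pageSize = pageSize;
    }

    byte get(long offset) {
        return pages[(int) (offset / pageSize)].get((int) (offset % pageSize));
    }

    // Big-endian long, matching ByteBuffer.getLong in its default byte order.
    long getLong(long offset) {
        ByteBuffer page = pages[(int) (offset / pageSize)];
        int index = (int) (offset % pageSize);
        if ((long) index + Long.BYTES <= page.limit()) {
            return page.getLong(index); // fast path: the value lies inside one page
        }
        long value = 0;
        for (int i = 0; i < Long.BYTES; i++) {
            // Slow path: assemble the value byte by byte across the boundary.
            value = (value << 8) | (get(offset + i) & 0xFFL);
        }
        return value;
    }
}
```

The same fast-path and slow-path split carries over to get(byte[]) and to writes; only reads that actually straddle a page pay for the byte-wise loop.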
