I have read The Garbage Collection Handbook. It says that card tables use a byte map instead of a bitmap, and that this is faster. Is that due to the CPU cache line? As far as I know, a cache line is normally 64 bytes, so if we write a single byte the contention still exists: other CPUs will still have the line invalidated, the same as with a bitmap. Can anyone help me with this?
Not sure I got the context right but in general:
bit map access
requires address manipulation and a read/write of the whole BYTE/WORD/..., as most architectures do not support bit-granular memory access.
So for an 8-bit bit map like:
BYTE map[];
the code to read it is:
read_bit = (map[bit >> 3] >> (bit & 7)) & 1;
set:
map[bit >> 3] |= 1 << (bit & 7);
clear:
map[bit >> 3] &= 255 ^ (1 << (bit & 7));
where bit is the index of the bit you want to access. As you can see, masking and bit shifts are needed.
BYTE map access
this can be accessed directly on most architectures
read_byte = map[byte];
set:
map[byte]=1;
clear:
map[byte]=0;
Where byte is the index of the BYTE you want to access. As you can see, memory space is wasted if just a boolean value is stored in a single BYTE.
So unless you have specific HW designed to work with bit maps and bit planes, BYTE maps are faster ... but to every rule there is an exception: in algorithms where you already have the masked address and bit masks computed, bit maps can be as fast as or faster than byte maps ...
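To make the trade-off concrete, here is a minimal Java sketch of a card table in both styles (the names and sizes are mine, not from the handbook):

// Card table as a byte map: one byte per card.
final byte[] byteCards = new byte[1 << 20];

void markByteCard(int card) {
    byteCards[card] = 1;                     // one blind store
}

// Card table as a bit map: one bit per card, 8 cards per byte.
final byte[] bitCards = new byte[(1 << 20) >> 3];

void markBitCard(int card) {
    bitCards[card >> 3] |= 1 << (card & 7);  // load, OR, store back
}

As I understand it, this read-modify-write is the usual argument for byte maps in GC write barriers: the byte map can be dirtied with a single unconditional store, while the bit map must load the byte, OR in the bit, and store it back (atomically, if mutator threads race). The cache-line sharing the question describes exists in both layouts.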
I am experimenting with x86 instruction emulation in Java (just for fun) and ran into a problem with the "override prefixes" an instruction may have.
A prefix can change the behavior of an instruction
For example, with the "operand size override prefix" you can change the size of the operands: 16 bit to 32 bit or vice versa.
The problem is: when the program runs in 16-bit mode, all the operations are done with chars (a char is 16 bits wide); when the operand size changes to 32 bit, I would like to run the operations with integers. So I have redundant code. My idea now is to implement the operations on byte arrays, for example an algorithm for addition of two byte arrays. The advantage would be that you could simply switch between different modes, even 128 bit and so on. But on the other hand, addition of byte arrays may not be as performant as addition of two integers...
Do you know a better way to do this?
What do you think about it?
I think you need to model memory as an array of bytes, because x86 supports unaligned loads / stores. You should probably decode instructions into load / ALU / store (where each part is optional, e.g. add eax, ecx only needs ALU, not load or store).
You only have to write the code once to make an int32 from 4 bytes, or to store 4 bytes from an int32. Or if Java lets you get an Int reference to an arbitrarily-aligned 4 bytes, then you could use that as a source or destination operand when the operand-size is 32 bits.
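A minimal Java sketch of that (little-endian, as on x86; mem, load32 and store32 are placeholder names):

// Guest memory as a flat byte array; x86 allows unaligned access,
// so a 32-bit load/store is just 4 byte accesses, little-endian.
byte[] mem = new byte[1 << 20];

int load32(int addr) {
    return (mem[addr] & 0xFF)
         | (mem[addr + 1] & 0xFF) << 8
         | (mem[addr + 2] & 0xFF) << 16
         | (mem[addr + 3] & 0xFF) << 24;
}

void store32(int addr, int value) {
    mem[addr]     = (byte) value;
    mem[addr + 1] = (byte) (value >> 8);
    mem[addr + 2] = (byte) (value >> 16);
    mem[addr + 3] = (byte) (value >> 24);
}

Alternatively, java.nio.ByteBuffer.wrap(mem).order(ByteOrder.LITTLE_ENDIAN) gives you getInt(addr) and putInt(addr, value) with the same semantics.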
If you can write type-generic versions of add, sub, etc., in Java, you can reuse the same code for each operand-size. So you'd have one switch() on the operand-size in the decoder, and dispatch from there to the handler functions for each instruction. If you use a table of pointers (or of Objects with methods), the same object could appear in the 8-bit table and the 32-bit table if it's generic (unlike div or mul, which use AH:AL for 8-bit but (E|R)DX:(E|R)AX for all wider operand sizes).
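For illustration, one way such a size-generic handler could look in Java (the interface and names are assumptions, not a prescription):

// One handler object per instruction; the operand size is a parameter,
// so the same ADD handler can sit in the 8-, 16- and 32-bit tables.
interface AluOp {
    int apply(int a, int b, int sizeBits);
}

AluOp add = (a, b, size) -> {
    int mask = size == 32 ? -1 : (1 << size) - 1;
    return (a + b) & mask;   // truncate the result to the operand size
};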
BTW, the possible load/store sizes x86 supports are byte/word/dword/qword (x87 and i486 cmpxchg8b) / xmm / ymm / zmm, and 6-byte (segment + 32-bit pointer les or far jmp [mem]). And also 10-byte x87 or segment + 64-bit pointer (e.g. far jmp).
The last two are handled internally as two separate loads, e.g. a 6-byte load isn't guaranteed to be atomic (see: Why is integer assignment on a naturally aligned variable atomic on x86?). Only power-of-2 sizes up to 8 bytes are guaranteed atomic (with some alignment restrictions).
For more ideas about emulating x86, see some BOCHS design documents, e.g.
How Bochs Works Under the Hood. It's an interpreting emulator with no JIT / dynamic recompilation, like the one you're writing.
It covers some important ideas like lazy flag handling. Some of the ideas there make the emulator's overall design more complex to gain performance, but lazy flags add only limited complexity and should help a lot.
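A hedged sketch of the lazy-flags idea in Java (one of several possible designs):

// Instead of computing EFLAGS after every ALU op, remember the last
// operation's inputs and result, and derive each flag only when read.
int lastA, lastResult;

void recordAdd(int a, int b) {
    lastA = a;
    lastResult = a + b;
}

boolean zeroFlag() { return lastResult == 0; }
boolean signFlag() { return lastResult < 0; }

boolean carryFlagAfterAdd() {
    // unsigned overflow: the result wrapped below an input
    return Integer.compareUnsigned(lastResult, lastA) < 0;
}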
I am writing a protocol that uses a bit to represent a boolean, but Java's networking APIs don't expose anything smaller than a byte. I want to know why network data is designed around bytes rather than bits. Wouldn't exposing the atomic structure be better?
Because the fundamental packets of IP are defined in terms of bytes rather than bits. All packets are a whole number of bytes rather than bits. Even bit fields are part of a multi-byte field. Bytes, rather than bits, are fundamental to IP.
This is ultimately because computer memory does not have addressable bits, but rather addressable bytes. Any implementation of a networking protocol has to be based on bytes, not bits. And that is also why Java does not provide direct access to bits.
The network bandwidth saving that could be achieved by carrying single bits of payload, compared to the added complexity at both the hardware and software level, is simply not worth it.
Fundamentally, both at the hardware level (registers) and the software level, the minimal unit of data handling is the byte, 8 bits (or octet, if you want to be nitpicky), or a multiple of that. You cannot address memory at the bit level, only at byte granularity. Doing otherwise would be very complicated, down to the silicon level, with no added value.
Whatever the programming language, when you declare and use a boolean, a byte (or a power-of-2 multiple of bytes, why not, as long as it can be loaded from memory into a CPU register) will actually be used to store it, and the language will take care that there are only 2 cases when using it: is this byte all 0 bits, or not. At the machine code/assembly level: load this byte (or multiple bytes, e.g. with a 32-bit-wide register) from its memory address into register FOO, cmp FOO to 0 and, depending on the result, JE (Jump if Equal) to code address BAR, else go on with the next machine code line. Or JNE (Jump if Not Equal) to some other code address. So your Java boolean is not actually stored as a bit. It is, at minimum, a byte.
Even the good old Ethernet frame, not even looking at the actual useful payload, starts with a 56-bit preamble to synchronize devices. 56 bits is 7 bytes. Could the synchronization be done with less than that, with something that is not a whole number of bytes? Maybe, but it would not be worth the effort.
https://en.wikipedia.org/wiki/Ethernet_frame#Preamble_and_start_frame_delimiter
Pedantic edit for nitpickers:
A language such as C has a bit field facility:
https://en.wikipedia.org/wiki/Bit_field
...but don't be fooled: the minimal storage unit at the silicon level for a bit from a bit field will still be a byte. Hence the "field" in "bit fields".
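None of this stops you from packing bits yourself before handing bytes to the socket, of course; a minimal Java sketch (the method names are mine):

// Pack up to 8 booleans into one wire byte; the receiver unpacks
// them using the same bit positions.
byte packFlags(boolean[] flags) {
    byte b = 0;
    for (int i = 0; i < flags.length && i < 8; i++)
        if (flags[i]) b |= 1 << i;
    return b;
}

boolean unpackFlag(byte b, int i) {
    return (b >> i & 1) != 0;
}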
Actually, my question is very similar to this one, but that post focuses on C# only. Recently I read an article saying that Java will 'promote' some short types (like short) to 4 bytes in memory even if some bits are not used, so it can't reduce usage. (Is that true?)
So my question is how languages, especially C, C++ and Java (as Manish said in this post about Java), handle memory allocation of small datatypes. References or any approaches to figure this out are preferred. Thanks.
C/C++ uses only the specified amount of memory but aligns the data (by default) to an address that is a multiple of some value, typically 4 bytes for 32 bit applications or 8 bytes for 64 bit.
So for example if the data is aligned on a 4 or 8 byte boundary then a "char" uses only one byte. An array of 5 chars will use 5 bytes. But the data item that is allocated after the 5 byte char array is placed at an address that skips 3 bytes to keep it correctly aligned.
This is for performance on most processors. There are usually pragmas like "pack" and "align" that can be used to change the alignment or disable it.
In C and C++, different approaches may be taken depending on how you've requested the memory.
For T* p = (T*)malloc(n * sizeof(T)); or T* p = new T[n]; then the data will occupy sizeof(T)*n bytes of memory, so if sizeof(T) is reduced (e.g. to int16_t instead of int32_t) then that space is reduced accordingly. That said, heap allocations tend to have some overheads, so few large allocations are better than a great many allocations for individual data items or very small arrays, where the overheads may be much more significant than small differences in sizeof(T).
For structures, static and stack usage, padding is more significant than for large arrays, as the following data item might be of a different type with different alignment requirements, resulting in more padding.
At the other extreme, you can apply bitfields to effectively pack values into the minimum number of bits they need - very dense compression indeed, though you need to rely on compiler pragmas/attributes if you want explicit control (the Standard leaves it unspecified when a bitfield might start in a new memory "word" - e.g. a 32-bit word for a 32-bit process, 64 for 64 - or wrap across separate words, and where in the word the bits hold data vs. padding, etc.). Data types like C++ bitsets and vector<bool> may be more efficient than arrays of bool (which may well use an int for each element, but that's unspecified in the C++03 Standard).
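On the Java side of the question: a local short is typically widened to a full stack slot, but a short[] really does use two bytes per element, which a rough measurement can show (heap numbers are approximate and JVM-dependent; this is just a sanity check, not a benchmark):

public static void main(String[] args) {
    Runtime rt = Runtime.getRuntime();
    System.gc();
    long before = rt.totalMemory() - rt.freeMemory();
    short[] a = new short[10_000_000];   // 2 bytes per element
    long after = rt.totalMemory() - rt.freeMemory();
    // prints roughly 20,000,000; an int[] of the same length is ~40 MB
    System.out.println((after - before) + " bytes for " + a.length + " shorts");
}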
I am working on implementing some bloom filter variants, and a very useful data structure for this would be a compact multi-bit array; that is, an array where each element is a compact integer of around 4 bits.
Space efficiency is of the utmost importance here, so while a plain integer array would give me the functionality I want, it would be bulkier than necessary.
Before I try to implement this functionality myself with bit arithmetic, I was wondering if anyone knows of a library out there that already provides such a data structure.
Edit: Static size is fine.
The ideal case would be an implementation that is flexible with regard to the number of bits per cell. That might be a bit much to hope for though (no pun intended?).
If you aren't modifying the array after creation, java.util.BitSet does the bit storage for you, but it is slow to access since you have to fetch each bit individually and reassemble the int from its 4 bits yourself.
Having said that, writing it yourself might be the best way to go. The bit arithmetic isn't that difficult since there are only 2 values per byte: the high bits are decoded with (array[i] & 0xF0) >> 4 and the low bits with array[i] & 0x0F.
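Put together, a hand-rolled version might look like this (the class name and nibble order are my own choices):

// Two 4-bit cells per byte: even indices in the low nibble,
// odd indices in the high nibble.
class NibbleArray {
    private final byte[] data;

    NibbleArray(int cells) { data = new byte[(cells + 1) / 2]; }

    int get(int i) {
        int b = data[i >> 1] & 0xFF;
        return (i & 1) == 0 ? b & 0x0F : b >>> 4;
    }

    void set(int i, int v) {             // v must fit in 4 bits
        int idx = i >> 1;
        if ((i & 1) == 0)
            data[idx] = (byte) (data[idx] & 0xF0 | v & 0x0F);
        else
            data[idx] = (byte) (data[idx] & 0x0F | (v & 0x0F) << 4);
    }
}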
Take a look at the compressed BitSet provided by http://code.google.com/p/javaewah/; it allows you to set bits freely and keeps memory usage low via the compression algorithms it uses.
I.e. something like
EWAHCompressedBitmap32 set = new EWAHCompressedBitmap32();
set.set(0);
set.set(1000000);
will still only occupy a few bytes, not the roughly 125 KB a plain java.util.BitSet would need for bit 1,000,000...
You should be able to map the 4-bit integers onto the BitSet by multiplying the index into the BitSet accordingly.
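For example (a sketch; the put/get names and bit layout are my assumptions):

// Store the j-th 4-bit counter in bits [4*j, 4*j + 4) of the BitSet.
java.util.BitSet bits = new java.util.BitSet();

void put(int j, int value) {
    for (int k = 0; k < 4; k++)
        bits.set(4 * j + k, (value >> k & 1) != 0);
}

int get(int j) {
    int v = 0;
    for (int k = 0; k < 4; k++)
        if (bits.get(4 * j + k)) v |= 1 << k;
    return v;
}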
I have been programming in Java since 2004, mostly enterprise and web applications. But I have never used short or byte, other than in a toy program just to see how these types work. Even in a for loop of 100 iterations, we usually go with int. And I don't remember ever having come across any code that made use of byte or short, other than some public APIs and frameworks.
Yes, I know you can use a short or byte to save memory in large arrays, in situations where the memory savings actually matter. Does anyone do that in practice? Or is it just something in the books?
[Edited]
Using byte arrays for network programming and socket communication is quite common. Thanks, Darren, for pointing that out. Now how about short? Ryan gave an excellent example. Thanks, Ryan.
I use byte a lot. Usually in the form of byte arrays or ByteBuffer, for network communications of binary data.
I rarely use float or double, and I don't think I've ever used short.
Keep in mind that Java is also used on mobile devices, where memory is much more limited.
I used 'byte' a lot, in C/C++ code implementing functionality like image compression (i.e. running a compression algorithm over each byte of a black-and-white bitmap), and processing binary network messages (by interpreting the bytes in the message).
However I have virtually never used 'float' or 'double'.
The primary usage I've seen for them is while processing data with an unknown structure or even no real structure. Network programming is an example of the former (whoever is sending the data knows what it means but you might not), something like image compression of 256-color (or grayscale) images is an example of the latter.
Off the top of my head grep comes to mind as another use, as does any sort of file copy. (Sure, the OS will do it--but sometimes that's not good enough.)
The Java language itself makes it unreasonably difficult to use the byte or short types. Whenever you perform any operation on a byte or short value, Java promotes it to an int first, and the result of the operation is returned as an int. Also, they're signed, and there are no unsigned equivalents, which is another frequent source of frustration.
So you end up using byte a lot because it's still the basic building block of all things cyber, but the short type might as well not exist.
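For the record, here is what the promotion and signedness issues look like in practice:

byte a = 1, b = 2;
// byte c = a + b;          // does not compile: a + b is an int
byte c = (byte) (a + b);    // every byte/short operation needs a cast back

byte raw = (byte) 0xFF;     // bytes are signed, so this is -1
int unsigned = raw & 0xFF;  // the usual mask to recover 0..255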
Until today I hadn't noticed how seldom I use them.
I've used byte for network-related stuff, but most of the time it was for my own tools/learning. In work projects these things are handled by frameworks (JSP, for instance).
Short? Almost never.
Long? Neither.
My preferred integer type is always int: for loops, counters, etc.
When data comes from another place (a database, for instance) I use the proper type, but for literals I always use int.
I use bytes in lots of different places, mostly involving low-level data processing. Unfortunately, the designers of the Java language made bytes signed. I can't think of any situation in which having negative byte values has been useful. Having a 0-255 range would have been much more helpful.
I don't think I've ever used shorts in any proper code. I also never use floats (if I need floating point values, I always use double).
I agree with Tom. Ideally, in high-level languages we shouldn't be concerned with the underlying machine representations. We should be able to define our own ranges or use arbitrary precision numbers.
When we are programming for electronic devices like mobile phones, we use byte and short. In this case we should take care with memory management.
It's perhaps more interesting to look at the semantics of int. Are those arbitrary limits and silent truncation what you want? Application-level code really wants arbitrary-sized integers; it's just that Java has no reasonable way of expressing those.
I have used bytes when saving state while doing model checking. In that application the space savings are worth the extra work. Otherwise I never use them.
I found I was using byte variables when doing some low-level image processing. The .Net GDI+ draw routines were really slow so I hand-rolled my own.
Most times, though, I stick with signed integers unless I am forced to use something larger, given the problem constraints. Any sort of physics modeling I do usually requires floats or doubles, even if I don't need the precision.
Apache POI used short in quite a few places, probably because of Excel's row/column number limits.
A few months ago they changed to int, replacing
createCell(short columnIndex)
with
createCell(int column).
In in-memory data grids, it can be useful.
The concept of a datagrid like Gemfire is to have a huge distributed map.
When you don't have enough memory you can overflow to disk with an LRU strategy, but the keys of all entries of your map remain in memory (at least with Gemfire).
Thus it is very important to make your keys with a small footprint, particularly if you are handling very large datasets.
For the entry values, when you can, it's also better to use the appropriate type with a small memory footprint...
I have used shorts and bytes in Java apps communicating with custom USB or serial microcontrollers, to receive 10-bit values wrapped in 2 bytes as shorts.
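For example, if the device sends the low byte first, the reassembly might look like this (the exact bit layout is device-specific; this is just one plausible framing):

// Reassemble a 10-bit sample sent as two bytes, low byte first.
short decodeSample(byte lo, byte hi) {
    return (short) ((lo & 0xFF) | (hi & 0x03) << 8);   // 0..1023
}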
Bytes and shorts are extensively used in Java Card development. Take a look at my answer to "Are there any real life uses for the Java byte primitive type?".