Saving a binary STL file in Java

I am trying to save some data as an STL file for use on a 3D printer. The STL file has two forms: ASCII and Binary. The ASCII format is relatively easy to understand and create but most 3D printing services require it to be in binary format.
The information about STL Binary is explained on the Wikipedia page here: http://en.wikipedia.org/wiki/STL_(file_format)
I know that I will require the data to be in a byte array but I have no idea how to go about interpreting the information from Wikipedia and creating the byte array. This is what I would like help with.
The code I have so far simply saves an empty byte array:
byte[] bytes = new byte[0];   // nothing useful in here yet
FileOutputStream stream = new FileOutputStream("test.stl");
try {
    stream.write(bytes);
} finally {
    stream.close();
}

If you are starting a new project on an up-to-date Java version, don't bother with OutputStreams; use Channels and ByteBuffers instead.
try (FileChannel ch = new RandomAccessFile("test.stl", "rw").getChannel())
{
    ByteBuffer bb = ByteBuffer.allocate(10000).order(ByteOrder.LITTLE_ENDIAN);
    // ...
    // e.g. store a vertex:
    bb.putFloat(0.0f).putFloat(1.0f).putFloat(42.0f);
    bb.flip();
    ch.write(bb);
    bb.clear();
    // ...
}
This is the standard API that gives you the little-endian byte order the format requires. Then match the data types:
UINT8 means unsigned byte,
UINT32 means unsigned int,
REAL32 means float,
UINT16 means unsigned short,
REAL32[3] means three floats (i.e. an array)
You don’t have to worry about the unsigned nature of the data types as long as you don’t exceed the max values of the corresponding signed Java types.
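For instance, here is a minimal sketch of how those types map onto ByteBuffer calls when writing a header plus one triangle record (the class name and all field values are just placeholders):
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class StlTypeMapping {
    public static void main(String[] args) {
        // 84-byte header block plus one 50-byte triangle record, little-endian as STL requires.
        ByteBuffer bb = ByteBuffer.allocate(84 + 50).order(ByteOrder.LITTLE_ENDIAN);

        bb.put(new byte[80]);                        // UINT8[80]  -> put() raw bytes (header)
        bb.putInt(1);                                // UINT32     -> putInt() (triangle count)
        bb.putFloat(0f).putFloat(0f).putFloat(1f);   // REAL32[3]  -> three putFloat() (normal)
        bb.putFloat(0f).putFloat(0f).putFloat(0f);   // REAL32[3]  vertex 1
        bb.putFloat(1f).putFloat(0f).putFloat(0f);   // REAL32[3]  vertex 2
        bb.putFloat(0f).putFloat(1f).putFloat(0f);   // REAL32[3]  vertex 3
        bb.putShort((short) 0);                      // UINT16     -> putShort() (attribute byte count)

        System.out.println("bytes used: " + bb.position());  // 134 = 84 + 50
    }
}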

This shouldn't be ambiguous. The spec says:
UINT8[80] – Header
UINT32 – Number of triangles
foreach triangle
    REAL32[3] – Normal vector
    REAL32[3] – Vertex 1
    REAL32[3] – Vertex 2
    REAL32[3] – Vertex 3
    UINT16 – Attribute byte count
end
That means a total file size of: 80 + 4 + Number of Triangles * ( 4 * 3 * 4 + 2 ).
So for example, 100 triangles ( 84+100*50 ), produces a 5084 byte file.
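As a sanity check, the size calculation is easy to express in code; a small sketch (the method name is mine):
// Expected size in bytes of a binary STL file with the given triangle count.
static long binaryStlSize(long numberOfTriangles) {
    return 80 + 4 + numberOfTriangles * (4 * 3 * 4 + 2);   // header + count + 50 bytes per triangle
}
// binaryStlSize(100) == 5084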
The following code works; you can optimize it further.
Open the file and write the header:
RandomAccessFile raf = new RandomAccessFile(fileName, "rw");
raf.setLength(0L);
FileChannel ch = raf.getChannel();
ByteBuffer bb = ByteBuffer.allocate(1024).order(ByteOrder.LITTLE_ENDIAN);
byte[] titleByte = new byte[80];          // header is exactly 80 bytes, zero-padded
System.arraycopy(title.getBytes(), 0, titleByte, 0, title.length()); // title must fit in 80 ASCII chars
bb.put(titleByte);
bb.putInt(nofTriangles);                  // number of triangles
bb.flip();                                // prep for writing
ch.write(bb);
In this code, the point vertices and triangle indices are in arrays like this:
Vector3 vertices[ index ]
int indices[ index ][ triangle point number ]
Write the point data:
for (int i = 0; i < nofIndices; i++)      // triangles
{
    bb.clear();
    Vector3 normal = getNormal(indices[i][0], indices[i][1], indices[i][2]);
    bb.putFloat(normal.x);
    bb.putFloat(normal.y);
    bb.putFloat(normal.z);
    for (int j = 0; j < 3; j++)           // the triangle's three vertices
    {
        bb.putFloat(vertices[indices[i][j]].x);
        bb.putFloat(vertices[indices[i][j]].y);
        bb.putFloat(vertices[indices[i][j]].z);
    }
    bb.putShort((short) 0);               // attribute byte count
    bb.flip();
    ch.write(bb);
}
Close the file:
ch.close();
Get the normals:
Vector3 getNormal(int ind1, int ind2, int ind3)
{
    Vector3 p1 = vertices[ind1];
    Vector3 p2 = vertices[ind2];
    Vector3 p3 = vertices[ind3];
    return p1.cpy().sub(p2).crs(p2.x - p3.x, p2.y - p3.y, p2.z - p3.z).nor();
}
See also:
Vector3
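The cpy()/sub()/crs()/nor() calls are from a Vector3 helper class (libGDX has one with exactly these methods). If you don't have such a class, the same normal can be computed with plain floats; a hedged sketch using a float[3]-per-vertex array instead of Vector3 (names are mine; the cross product (p2 - p1) x (p3 - p1) gives the same direction as the one used above):
// Writes the normalized face normal of the triangle (ind1, ind2, ind3) into the buffer.
static void putNormal(ByteBuffer bb, float[][] vertices, int ind1, int ind2, int ind3) {
    float[] p1 = vertices[ind1], p2 = vertices[ind2], p3 = vertices[ind3];
    float ux = p2[0] - p1[0], uy = p2[1] - p1[1], uz = p2[2] - p1[2];
    float vx = p3[0] - p1[0], vy = p3[1] - p1[1], vz = p3[2] - p1[2];
    float nx = uy * vz - uz * vy;
    float ny = uz * vx - ux * vz;
    float nz = ux * vy - uy * vx;
    float len = (float) Math.sqrt(nx * nx + ny * ny + nz * nz);
    if (len == 0f) len = 1f;                  // degenerate triangle: avoid dividing by zero
    bb.putFloat(nx / len).putFloat(ny / len).putFloat(nz / len);
}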

You could generate the file in ASCII and run it through an ASCII-to-binary STL converter.
If the binary format is giving you trouble, it is probably much easier to do it in ASCII first.
http://www.thingiverse.com/thing:39655

Since your question was based on writing a file to send to a 3D printer, I suggest you ditch the STL format file and use an OBJ format file instead. It is much simpler to compose, and it produces a much smaller file. There isn't a binary flavor of OBJ, but it's still a pretty compact file as you will see.
The (abbreviated) spec says:
List all the geometric vertex coordinates as a "v", followed by x, y, z values, like:
v 123.45 234.56 345.67
Then list each triangle as an "f", followed by its vertex indices in CCW order, like:
f 1 2 3
Indices start with 1.
Use a # character to start a comment line. Don't append comments anywhere else in a line.
Blank lines are ok.
There's a whole bunch of other things it supports, like normals, and textures. But if all you want to do is write your geometry to file to import into a 3D printer then OBJ is actually preferred, and this simple content is valid and adequate.
Here is an example of a perfectly valid file composing a 1 unit cube, as imported successfully in Microsoft 3D Viewer (included in Win/10), Autodesk Meshmixer (free download), and PrusaSlicer (free download):
# vertices
v 0 0 0
v 0 1 0
v 1 1 0
v 1 0 0
v 0 0 1
v 0 1 1
v 1 1 1
v 1 0 1
# triangle indices
f 1 3 4
f 1 2 3
f 1 6 2
f 1 5 6
f 1 8 5
f 1 4 8
f 3 7 8
f 3 8 4
f 3 6 7
f 2 6 3
f 5 8 7
f 5 7 6
If you've got data in multiple meshes, you ought to coalesce the vertices to eliminate duplicate points. But because the file is plain text, you can write the whole thing with a PrintWriter object and its println() method, as in the sketch below.
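For example, a minimal sketch of writing such a file in Java (the file name, vertex array and face array here are just placeholders):
import java.io.FileNotFoundException;
import java.io.PrintWriter;

public class ObjWriter {
    public static void main(String[] args) throws FileNotFoundException {
        // One triangle as placeholder data; real code would pass in its own mesh.
        double[][] vertices = { {0, 0, 0}, {1, 0, 0}, {0, 1, 0} };
        int[][] faces = { {1, 2, 3} };               // OBJ indices start at 1, CCW order

        try (PrintWriter out = new PrintWriter("test.obj")) {
            out.println("# vertices");
            for (double[] v : vertices) {
                out.println("v " + v[0] + " " + v[1] + " " + v[2]);
            }
            out.println("# triangle indices");
            for (int[] f : faces) {
                out.println("f " + f[0] + " " + f[1] + " " + f[2]);
            }
        }
    }
}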

Related

What is the Base64 size in bytes of a byte array in Java? [duplicate]

After reading the base64 wiki ...
I'm trying to figure out how the formula works:
Given a string with length n, the base64 length will be 4 * Math.Ceiling((double)s.Length / 3).
I already know that the base64 length must be % 4 == 0 to let the decoder know what the original text length was.
The padding at the end of a sequence can be = or ==.
wiki: The number of output bytes per input byte is approximately 4/3 (33% overhead).
Question:
How does the information above square with the output length?
Each character is used to represent 6 bits (log2(64) = 6).
Therefore 4 chars are used to represent 4 * 6 = 24 bits = 3 bytes.
So you need 4*(n/3) chars to represent n bytes, and this needs to be rounded up to a multiple of 4.
The number of unused padding chars resulting from the rounding up to a multiple of 4 will obviously be 0, 1, 2 or 3.
4 * n / 3 gives the approximate unpadded length (with integer arithmetic, round up: (4 * n + 2) / 3).
Then round up to the nearest multiple of 4 for padding; as 4 is a power of 2, you can use bitwise operations:
((4 * n / 3) + 3) & ~3
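If you want to convince yourself, a quick sketch comparing that expression against java.util.Base64 (used here purely as a reference):
import java.util.Base64;

public class PaddedLengthCheck {
    public static void main(String[] args) {
        for (int n = 0; n <= 20; n++) {
            int predicted = ((4 * n / 3) + 3) & ~3;   // formula above
            int actual = Base64.getEncoder().encodeToString(new byte[n]).length();
            System.out.println(n + ": predicted=" + predicted + ", actual=" + actual);
        }
    }
}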
For reference, the Base64 encoder's length formula is as follows:
As you said, a Base64 encoder given n bytes of data will produce a string of roughly 4n/3 Base64 characters. Put another way, every 3 bytes of data will result in 4 Base64 characters. EDIT: A comment correctly points out that this does not account for padding; the correct formula with padding is 4 * Ceiling(n / 3).
The Wikipedia article shows exactly how the ASCII string Man encoded into the Base64 string TWFu in its example. The input string is 3 bytes, or 24 bits, in size, so the formula correctly predicts the output will be 4 bytes (or 32 bits) long: TWFu. The process encodes every 6 bits of data into one of the 64 Base64 characters, so the 24-bit input divided by 6 results in 4 Base64 characters.
You ask in a comment what the size of encoding 123456 would be. Keeping in mind that every character of that string is 1 byte, or 8 bits, in size (assuming ASCII/UTF-8 encoding), we are encoding 6 bytes, or 48 bits, of data. According to the equation, we expect the output length to be (6 bytes / 3 bytes) * 4 characters = 8 characters.
Putting 123456 into a Base64 encoder creates MTIzNDU2, which is 8 characters long, just as we expected.
Integers
Generally we don't want to use doubles here, to avoid floating point ops and rounding errors; they are just not necessary.
For this it is a good idea to remember how to perform a ceiling division: ceil(x / y) in doubles can be written as (x + y - 1) / y in integers (for non-negative values; beware of overflow).
Readable
If you go for readability you can of course also program it like this (example in Java; for C you could use macros, of course):
public static int ceilDiv(int x, int y) {
    return (x + y - 1) / y;
}

public static int paddedBase64(int n) {
    int blocks = ceilDiv(n, 3);
    return blocks * 4;
}

public static int unpaddedBase64(int n) {
    int bits = 8 * n;
    return ceilDiv(bits, 6);
}

// test only
public static void main(String[] args) {
    for (int n = 0; n < 21; n++) {
        System.out.println("Base 64 padded: " + paddedBase64(n));
        System.out.println("Base 64 unpadded: " + unpaddedBase64(n));
    }
}
Inlined
Padded
We know that we need blocks of 4 characters for each 3 bytes (or fewer). So the formula becomes (for x = n and y = 3):
blocks = (bytes + 3 - 1) / 3
chars = blocks * 4
or combined:
chars = ((bytes + 3 - 1) / 3) * 4
your compiler will optimize out the 3 - 1, so just leave it like this to maintain readability.
Unpadded
Less common is the unpadded variant; for this, remember that we need a character for each 6 bits, rounded up:
bits = bytes * 8
chars = (bits + 6 - 1) / 6
or combined:
chars = (bytes * 8 + 6 - 1) / 6
We can, however, still divide both terms by two (if we want to):
chars = (bytes * 4 + 3 - 1) / 3
Unreadable
In case you don't trust your compiler to do the final optimizations for you (or if you want to confuse your colleagues):
Padded
((n + 2) / 3) << 2
Unpadded
((n << 2) | 2) / 3
So there we are, two logical ways of calculation, and we don't need any branches, bit-ops or modulo ops - unless we really want to.
Notes:
Obviously you may need to add 1 to the calculations to include a null termination byte.
For Mime you may need to take care of possible line termination characters and such (look for other answers for that).
(In an attempt to give a succinct yet complete derivation.)
Every input byte has 8 bits, so for n input bytes we get:
n × 8      input bits
Every 6 bits is an output byte, so:
ceil(n × 8 / 6)  =  ceil(n × 4 / 3)      output bytes
This is without padding.
With padding, we round that up to multiple-of-four output bytes:
ceil(ceil(n × 4 / 3) / 4) × 4  =  ceil(n × 4 / 3 / 4) × 4  =  ceil(n / 3) × 4      output bytes
See Nested Divisions (Wikipedia) for the first equivalence.
Using integer arithmetics, ceil(n / m) can be calculated as (n + m – 1) div m,
hence we get:
(n * 4 + 2) div 3      without padding
(n + 2) div 3 * 4      with padding
For illustration:
 n   with padding       (n + 2) div 3 * 4   without padding    (n * 4 + 2) div 3
---------------------------------------------------------------------------------
 0                                      0                                      0
 1   AA==                               4   AA                                 2
 2   AAA=                               4   AAA                                3
 3   AAAA                               4   AAAA                               4
 4   AAAAAA==                           8   AAAAAA                             6
 5   AAAAAAA=                           8   AAAAAAA                            7
 6   AAAAAAAA                           8   AAAAAAAA                           8
 7   AAAAAAAAAA==                      12   AAAAAAAAAA                        10
 8   AAAAAAAAAAA=                      12   AAAAAAAAAAA                       11
 9   AAAAAAAAAAAA                      12   AAAAAAAAAAAA                      12
10   AAAAAAAAAAAAAA==                  16   AAAAAAAAAAAAAA                    14
11   AAAAAAAAAAAAAAA=                  16   AAAAAAAAAAAAAAA                   15
12   AAAAAAAAAAAAAAAA                  16   AAAAAAAAAAAAAAAA                  16
Finally, in the case of MIME Base64 encoding, two additional bytes (CR LF) are needed for every 76 output bytes, rounded up or down depending on whether a terminating newline is required.
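A rough sketch of that rule (my own helper; it assumes CRLF after every full 76-character line, with the note above deciding whether the trailing partial line gets one too):
// Approximate MIME Base64 length: padded length plus 2 bytes of CRLF per line.
static int mimeBase64Length(int n, boolean terminatingNewline) {
    int padded = ((n + 2) / 3) * 4;                           // padded Base64 length
    int lineBreaks = terminatingNewline
            ? (padded + 75) / 76                              // round up
            : padded / 76;                                    // round down
    return padded + 2 * lineBreaks;
}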
Here is a function to calculate the original size of an encoded Base 64 file as a String in KB:
private Double calcBase64SizeInKBytes(String base64String) {
    Double result = -1.0;
    if (StringUtils.isNotEmpty(base64String)) {
        Integer padding = 0;
        if (base64String.endsWith("==")) {
            padding = 2;
        } else if (base64String.endsWith("=")) {
            padding = 1;
        }
        // use a double division so Math.ceil actually has something to round
        result = (Math.ceil(base64String.length() / 4.0) * 3) - padding;
    }
    return result / 1000;
}
I think the given answers miss the point of the original question, which is how much space needs to be allocated to fit the base64 encoding for a given binary string of length n bytes.
The answer is (floor(n / 3) + 1) * 4 + 1
This includes padding and a terminating null character. You may not need the floor call if you are doing integer arithmetic.
Including padding, a base64 string requires four bytes for every three-byte chunk of the original string, including any partial chunks. One or two bytes extra at the end of the string will still get converted to four bytes in the base64 string when padding is added. Unless you have a very specific use, it is best to add the padding, usually an equals character. I added an extra byte for a null character in C, because ASCII strings without this are a little dangerous and you'd need to carry the string length separately.
For all people who speak C, take a look at these two macros:
// calculate the size of 'output' buffer required for a 'input' buffer of length x during Base64 encoding operation
#define B64ENCODE_OUT_SAFESIZE(x) ((((x) + 3 - 1)/3) * 4 + 1)
// calculate the size of 'output' buffer required for a 'input' buffer of length x during Base64 decoding operation
#define B64DECODE_OUT_SAFESIZE(x) (((x)*3)/4)
Taken from here.
While everyone else is debating algebraic formulas, I'd rather just use BASE64 itself to tell me:
$ echo "Including padding, a base64 string requires four bytes for every three-byte chunk of the original string, including any partial chunks. One or two bytes extra at the end of the string will still get converted to four bytes in the base64 string when padding is added. Unless you have a very specific use, it is best to add the padding, usually an equals character. I added an extra byte for a null character in C, because ASCII strings without this are a little dangerous and you'd need to carry the string length separately."| wc -c
525
$ echo "Including padding, a base64 string requires four bytes for every three-byte chunk of the original string, including any partial chunks. One or two bytes extra at the end of the string will still get converted to four bytes in the base64 string when padding is added. Unless you have a very specific use, it is best to add the padding, usually an equals character. I added an extra byte for a null character in C, because ASCII strings without this are a little dangerous and you'd need to carry the string length separately." | base64 | wc -c
710
So the formula of 3 bytes being represented by 4 base64 characters seems correct: wc counts echo's trailing newline (so the input is really 524 bytes), and ceil(524 / 3) * 4 = 700 output characters plus the line-wrap newlines base64 inserts every 76 columns gives 710.
I don't see the simplified formula in other responses. The logic is covered, but I wanted the most basic form for my embedded use:
Unpadded = ((4 * n) + 2) / 3
Padded = 4 * ((n + 2) / 3)
NOTE: when calculating the unpadded count, we round up the integer division, i.e. add divisor - 1, which is +2 in this case.
Seems to me that the right formula should be:
n64 = 4 * (n / 3) + (n % 3 != 0 ? 4 : 0)
I believe this one is an exact answer when n % 3 is not zero, no?
4 * (n + 3 - n % 3) / 3
Mathematica version :
SizeB64[n_] := If[Mod[n, 3] == 0, 4 n/3, 4 (n + 3 - Mod[n, 3])/3]
Have fun.
Simple implementation in JavaScript:
function sizeOfBase64String(base64String) {
    if (!base64String) return 0;
    const padding = (base64String.match(/(=*)$/) || [])[1].length;
    // 3 decoded bytes for every 4 Base64 characters, minus the padding
    return (base64String.length / 4) * 3 - padding;
}
If anyone is interested in achieving @Pedro Silva's solution in JS, I just ported that same solution to it:
const getBase64Size = (base64) => {
    let padding = base64.length
        ? getBase64Padding(base64)
        : 0
    return ((Math.ceil(base64.length / 4) * 3) - padding) / 1000
}

const getBase64Padding = (base64) => {
    return endsWith(base64, '==')
        ? 2
        : endsWith(base64, '=') ? 1 : 0
}

const endsWith = (str, end) => {
    let charsFromEnd = end.length
    let extractedEnd = str.slice(-charsFromEnd)
    return extractedEnd === end
}
On Windows I wanted to estimate the size of a MIME64-encoded buffer, but none of the precise calculation formulas worked for me, so I ended up with an approximate formula like this:
MIME64 string allocation size (approximate)
= (((4 * ((binary buffer size) + 1)) / 3) + 1)
The last +1 is for the ASCII zero: one extra character must be allocated to store the terminating zero. Why "binary buffer size" gets +1 I'm not sure; I suspect there is some MIME64 termination character, or maybe it is some alignment issue.

String to binary?

I have a very odd situation.
I'm writing a filter engine for another program, and that program has what are called "save areas". Each of those save areas is numbered 0 through 32 (why there are 33 of them, I don't know). They are turned on or off via a binary string:
1 = save area 0 on
10 = save area 1 on, save area 0 off
100 = save area 2 on, save areas 1 and 0 off.
and so on.
I have another program passing in what save areas it needs, but it does so with decimal representations and underscores - 1_2_3 for save areas 1, 2, and 3 for instance.
I would need to convert that example to 1110.
What I came up with is that I can build a string as follows:
I break it up (using split) into savePart[i]. I then iterate through savePart[i] and build strings:
String saveString = padRight("0b1",Integer.parseInt(savePart[i]));
That'll give me a string that reads "0b1000000" in the case of save area 6, for instance.
Is there a way to read that string as if it were a binary number instead? Because if I were to say:
long saveBinary = 0b1000000L
that would totally work.
or, is there a smarter way to be doing this?
long saveBinary = Long.parseLong(saveString, 2);
Note that you'll have to leave off the 0b prefix.
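For example, a small sketch (assuming saveString was built as in the question, for save area 6):
String saveString = "0b1000000";                               // save area 6
long saveBinary = Long.parseLong(saveString.substring(2), 2);  // drop "0b", parse base 2
System.out.println(saveBinary);                                // 64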
This will do it:
String input = "1_2_3";
long areaBits = 0;
for (String numTxt : input.split("_")) {
    areaBits |= 1L << Integer.parseInt(numTxt);
}
System.out.printf("\"%s\" -> %d (decimal) = %<x (hex) = %s (binary)%n",
        input, areaBits, Long.toBinaryString(areaBits));
Output:
"1_2_3" -> 14 (decimal) = e (hex) = 1110 (binary)
Just take each number in the string and treat it as an exponent. Accumulate the total for each exponent found and you will get your answer w/o the need to remove prefixes or suffixes.
// will contain our answer
long answer = 0;
String[] buckets = givenData.split("_");        // each wanted bucket, used as a decimal exponent
for (int x = 0; x < buckets.length; x++) {      // iterate through all exponents found
    long exponent = Long.parseLong(buckets[x]); // get the exponent
    long powerOfTen = 1;
    for (long p = 0; p < exponent; p++) {       // compute 10^exponent ('^' in Java is XOR, not power)
        powerOfTen *= 10;
    }
    answer += powerOfTen;                       // add 10^exponent to our running total
}
answer will now contain our answer in the format 1011010101 (what have you).
In your example, the given data was 1_2_3. The array will contain {"1", "2", "3"}
We iterate through that array...
10^1 + 10^2 + 10^3 = 10 + 100 + 1000 = 1110
I believe this is also why your numbers are 0 - 32. x^0 = 1, so you can dump into the 0 bucket when 0 is in the input.

Liblinear usage format

I am using .NET implementation of liblinear in my C# code by the following nuget package:
https://www.nuget.org/packages/Liblinear/
But in the readme file of liblinear, the format for x is:
struct problem describes the problem:
struct problem
{
    int l, n;
    int *y;
    struct feature_node **x;
    double bias;
};
where `l` is the number of training data. If bias >= 0, we assume
that one additional feature is added to the end of each data
instance. `n` is the number of feature (including the bias feature
if bias >= 0). `y` is an array containing the target values. (integers
in classification, real numbers in regression) And `x` is an array
of pointers, each of which points to a sparse representation (array
of feature_node) of one training vector.
For example, if we have the following training data:
LABEL   ATTR1   ATTR2   ATTR3   ATTR4   ATTR5
-----   -----   -----   -----   -----   -----
1       0       0.1     0.2     0       0
2       0       0.1     0.3     -1.2    0
1       0.4     0       0       0       0
2       0       0.1     0       1.4     0.5
3       -0.1    -0.2    0.1     1.1     0.1
and bias = 1, then the components of problem are:
l = 5
n = 6
y -> 1 2 1 2 3
x -> [ ] -> (2,0.1) (3,0.2) (6,1) (-1,?)
     [ ] -> (2,0.1) (3,0.3) (4,-1.2) (6,1) (-1,?)
     [ ] -> (1,0.4) (6,1) (-1,?)
     [ ] -> (2,0.1) (4,1.4) (5,0.5) (6,1) (-1,?)
     [ ] -> (1,-0.1) (2,-0.2) (3,0.1) (4,1.1) (5,0.1) (6,1) (-1,?)
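(For illustration, the first training row above would be stored as sparse (index, value) nodes; a sketch assuming a FeatureNode(index, value) constructor like the one used in the port linked below:)
// Row 1: label 1, ATTR2 = 0.1, ATTR3 = 0.2, plus the bias feature at index 6 since bias = 1.
// Only non-zero features are stored, with 1-based indices.
FeatureNode[] row1 = new FeatureNode[] {
    new FeatureNode(2, 0.1),
    new FeatureNode(3, 0.2),
    new FeatureNode(6, 1.0)
};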
But in the example showing a Java implementation:
https://gist.github.com/hodzanassredin/6682771
problem.x <- [|
[|new FeatureNode(1,0.); new FeatureNode(2,1.)|]
[|new FeatureNode(1,2.); new FeatureNode(2,0.)|]
|]// feature nodes
problem.y <- [|1.;2.|] // target values
which means his data set is:
1 0 1
2 2 0
So he is not storing the nodes in liblinear's sparse format. Does anyone know the correct format for x for the liblinear implementation?
Though it doesn't address exactly the library you mentioned, I can offer you an alternative. The Accord.NET Framework has recently incorporated all of LIBLINEAR's algorithms in its machine learning namespaces. It is also available through NuGet.
In this library, the direct syntax to create a linear support vector machine from in-memory data is
// Create a simple binary AND
// classification problem:
double[][] problem =
{
// a b a + b
new double[] { 0, 0, 0 },
new double[] { 0, 1, 0 },
new double[] { 1, 0, 0 },
new double[] { 1, 1, 1 },
};
// Get the two first columns as the problem
// inputs and the last column as the output
// input columns
double[][] inputs = problem.GetColumns(0, 1);
// output column
int[] outputs = problem.GetColumn(2).ToInt32();
// However, SVMs expect the output value to be
// either -1 or +1. As such, we have to convert
// it so the vector contains { -1, -1, -1, +1 }:
//
outputs = outputs.Apply(x => x == 0 ? -1 : 1);
After the problem is created, one can learn a linear SVM using
// Create a new linear-SVM for two inputs (a and b)
SupportVectorMachine svm = new SupportVectorMachine(inputs: 2);
// Create a L2-regularized L2-loss support vector classification
var teacher = new LinearDualCoordinateDescent(svm, inputs, outputs)
{
Loss = Loss.L2,
Complexity = 1000,
Tolerance = 1e-5
};
// Learn the machine
double error = teacher.Run(computeError: true);
// Compute the machine's answers for the learned inputs
int[] answers = inputs.Apply(x => Math.Sign(svm.Compute(x)));
This assumes, however, that your data is already in-memory. If you wish to load your data from the
disk, from a file in libsvm sparse format, you can use the framework's SparseReader class.
An example on how to use it can be found below:
// Suppose we are going to read a sparse sample file containing
// samples which have an actual dimension of 4. Since the samples
// are in a sparse format, each entry in the file will probably
// have a much smaller number of elements.
//
int sampleSize = 4;
// Create a new Sparse Sample Reader to read any given file,
// passing the correct dense sample size in the constructor
//
SparseReader reader = new SparseReader(file, Encoding.Default, sampleSize);
// Declare a vector to obtain the label
// of each of the samples in the file
//
int[] labels = null;
// Declare a vector to obtain the description (or comments)
// about each of the samples in the file, if present.
//
string[] descriptions = null;
// Read the sparse samples and store them in a dense vector array
double[][] samples = reader.ReadToEnd(out labels, out descriptions);
Afterwards, one can use the samples and labels vectors as the inputs and outputs of the problem,
respectively.
I hope it helps.
Disclaimer: I am the author of this library. I am answering this question in the sincere hope it
can be useful for the OP, since not long ago I also faced the same problems. If a moderator thinks
this looks like spam, feel free to delete. However, I am only posting this because I think it might
help others. I even came across this question by mistake while searching for existing C#
implementations of LIBSVM, not LIBLINEAR.

Conversion formula from decimal value to byte array not working

I have a formula in a specification for a binary file. The spec gives details of the meaning of the various bytes in the heading.
In particular, one formula states this about 2 of the bytes:
Byte 1 --> 7 6 5 4 3 2 1 0      (bit 7 = 'R', bits 6-0 = high bits of Roll)
Byte 2 --> 7 6 5 4 3 2 1 0      (low bits of Roll)
If 'R' = 0, Roll Angle not available
If 'R' = 1, Roll Angle = [((Byte 84 & 0x7F) << 8) | (Byte 85) - 900] / 10
I need to take a value such as 4.3 and convert it to two bytes such that it will be able to be converted back to 4.3 using the above formula. The part that puzzles me the most is subtracting the 900.
This is what I have so far:
private byte[] getRollBytes(BigDecimal[] positionData) {
    BigDecimal roll = positionData[4];
    roll = roll.multiply(BigDecimal.TEN);
    roll = roll.add(new BigDecimal(900));
    short shortVal = roll.shortValue();
    byte[] rollBytes = new byte[2];
    ByteBuffer headingbuf = ByteBuffer.wrap(rollBytes);
    headingbuf.order(ByteOrder.BIG_ENDIAN);
    headingbuf.putShort(shortVal);
    //set the leftmost bit of the two bytes to 'on', meaning data is available
    rollBytes[0] = (byte) (rollBytes[0] | (0x80));
    //testing the result with my formula doesn't give me the right answer:
    float testFloat = (float) (((((rollBytes[0] & 0x7F) << 8) | rollBytes[1]) - 900) / 10);
    return rollBytes;
}
I think something is getting lost in the conversion between short and byte...
One problem is that you are casting to a short. This can't be correct, because a value like 4.3 is not an integer. You probably want to convert back to a BigDecimal:
BigDecimal testVal = new BigDecimal((((rollBytes[0] & 0x7F) << 8) | (rollBytes[1] & 0xFF)) - 900);
testVal = testVal.divide(BigDecimal.TEN);
Another problem is that you seem to be adding 900 twice—once when computing shortVal and once again with rollBytes[1] = (byte) (rollBytes[1] + 900);
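To see the intended round trip end to end, here is a hedged sketch using plain ints (method names are mine; note the & 0xFF to avoid sign extension and the division by 10.0 rather than integer 10):
// Encode: roll in degrees -> two bytes, with the availability bit 'R' set.
static byte[] encodeRoll(double rollDegrees) {
    int raw = (int) Math.round(rollDegrees * 10) + 900;      // 4.3 -> 943
    byte[] bytes = new byte[2];
    bytes[0] = (byte) (((raw >> 8) & 0x7F) | 0x80);          // high 7 bits plus the R flag
    bytes[1] = (byte) (raw & 0xFF);                          // low 8 bits
    return bytes;
}

// Decode, following the spec's formula.
static double decodeRoll(byte[] bytes) {
    int raw = ((bytes[0] & 0x7F) << 8) | (bytes[1] & 0xFF);  // mask to avoid sign extension
    return (raw - 900) / 10.0;                               // 10.0, not integer division
}
// decodeRoll(encodeRoll(4.3)) == 4.3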

In Java, how do I create a bitmap to solve "knapsack quandary"

I'm in my first programming course and I'm quite stuck right now. Basically, what we are doing is we take 16 values from a text file (on the first line of the file), and there is a single value on the second line. We read those 16 values into an array, and we set that second-line value as our target. I had no problem with that part.
But where I'm having trouble is creating a bitmap to test every possible subset of the 16 values for one that equals the target number.
IE, say we had these numbers:
12 15 20 4 3 10 17 12 24 21 19 33 27 11 25 32
We then pair each value with a bit in a bitmap:
0 1 1 0 0 0 0 1 1 1 0 1 0 0 1 0
Then we only accept the values whose bit is 1:
15 20 12 24 21 33 25
Then we test that subset to see if it equals the "target" number.
We are only allowed to use one array in the problem, and we aren't allowed to use the math class (haven't gotten to it yet).
I understand the concept, and I know that I need to use the shift operators and the bitwise & operator, but I'm truly at a loss. I'm very frustrated, and I was just wondering if anybody could give me any tips.
To generate all possible bit patterns inside an int and thus all possible subsets defined by that bit map would simply require you to start your int at 1 and keep incrementing it to the highest possible value an unsigned short int can hold (all 1s). At the end of each inner loop, compare the sum to the target. If it matches, you got a solution subset - print it out. If not, try the next subset.
Can someone help to explain how to go about doing this? I understand the concept but lack the knowledge of how to implement it.
OK, so you are allowed one array. Presumably, that array holds the first set of data.
So your approach needs to not have any additional arrays.
The bit-vector is simply a mental-model construct in this case. The idea is this: if you try every possible combination (note: NOT permutation), then you are going to find the closest sum to your target. So let's say you have N numbers. That means you have 2^N possible combinations.
The bit-vector approach is to number each combination with 0 to 2^N - 1, and try each one.
Assuming you have less that 32 numbers in the array, you essentially have an outer loop like this:
int numberOfCombinations = 1 << numbers.length;   // 2^N
for (int i = 0; i < numberOfCombinations; ++i) { ... }
For each value of i, you need to go over each number in numbers, deciding to add or skip it based on shifts and bitmasks of i.
So the task is to write an algorithm that, given a set A of non-negative numbers and a goal value k, determines whether there is a subset of A such that the sum of its elements is k.
I'd approach this using induction over A, keeping track of which numbers <= k are sums of a subset of the set of elements processed so far. That is:
boolean[] reachable = new boolean[k + 1];
reachable[0] = true;
for (int a : A) {
    // compute the new reachable
    // hint: what's the relationship between subsets of S and S ∪ {a} ?
}
return reachable[k];
A bitmap is, mathematically speaking, a function mapping a range of numbers onto {0, 1}. A boolean[] maps array indices to booleans. So one could call a boolean[] a bitmap.
One disadvantage of using a boolean[] is that you must process each array element individually. Instead, one could use the fact that a long holds 64 bits, and use bit-shifting and masking operations to process 64 "array" elements at a time. But that sort of micro-optimization is error-prone and rather involved, so it is not commonly done in code that should be reliable and maintainable.
I think you need something like this:
public boolean equalsTarget(int bitmap, int[] numbers, int target) {
    int sum = 0;                                  // running sum of the selected numbers
    int mask = 1;                                 // bitmask used to query the bitmap
    for (int i = 0; i < numbers.length; i++) {    // for each number in our array
        if ((bitmap & mask) > 0) {                // test if the ith bit is 1 (parentheses needed: & binds more loosely than >)
            sum += numbers[i];                    // and add the ith number to the sum if it is
        }
        mask <<= 1;                               // shift the mask bit left by 1
    }
    return sum == target;                         // if the sum equals the target, this bitmap is a match
}
The rest of your code is fairly simple: you just feed every possible value of your bitmap (1..65535) into this method and act on the result; see the driver sketch after the notes below.
P.s.: Please make sure that you fully understand the solution and not just copy it, otherwise you're just cheating yourself. :)
P.p.s: Using int works in this case, as int is 32 bits wide and we only need 16. Be careful with bitwise operations though if you need all the bits, as all primitive integer types (byte, short, int, long) are signed in Java.
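For completeness, a driver for that method might look like the sketch below (numbers and target stand for the array and value read from your file):
// Try every non-empty subset of the 16 numbers: bitmaps 1 .. 65535.
for (int bitmap = 1; bitmap <= 0xFFFF; bitmap++) {
    if (equalsTarget(bitmap, numbers, target)) {
        System.out.println("Matching subset found, bitmap = " + Integer.toBinaryString(bitmap));
        break;   // or keep going to list every matching subset
    }
}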
There are a couple steps in solving this. First you need to enumerate all the possible bit maps. As others have pointed out you can do this easily by incrementing an integer from 0 to 2^n - 1.
Once you have that and can iterate over all the possible bit maps, you just need a way to take a bit map and "apply" it to an array, generating the sum of the elements at all indexes represented by the map. The following method is an example of how to do that:
private static int bitmapSum(int[] input, int bitmap) {
    // a variable for holding the running total
    int sum = 0;
    // iterate over each element in our array,
    // adding only the values specified by the bitmap
    for (int i = 0; i < input.length; i++) {
        int mask = 1 << i;
        if ((bitmap & mask) != 0) {
            // if the index is part of the bitmap, add it to the total
            sum += input[i];
        }
    }
    return sum;
}
This function takes an integer array and a bit map (represented as an integer) and returns the sum of all the elements in the array whose indexes are present in the map.
The key to this function is the ability to determine if a given index is in fact in the bit map. That is accomplished by first creating a bit mask for the desired index and then applying that mask to the bit map to test if that value is set.
Basically we want to build an integer where only one bit is set and all the others are zero. We can then bitwise AND that mask with the bit map and test if a particular position is set by comparing the result to 0.
Lets say we have an 8-bit map like the following:
map: 1 0 0 1 1 1 0 1
---------------
indexes: 7 6 5 4 3 2 1 0
To test the value for index 4 we would need a bit mask that looks like the following:
mask: 0 0 0 1 0 0 0 0
---------------
indexes: 7 6 5 4 3 2 1 0
To build the mask we simply start with 1 and shift it by N:
1: 0 0 0 0 0 0 0 1
shift by 1: 0 0 0 0 0 0 1 0
shift by 2: 0 0 0 0 0 1 0 0
shift by 3: 0 0 0 0 1 0 0 0
shift by 4: 0 0 0 1 0 0 0 0
Once we have this we can apply the mask to the map and see if the value is set:
map: 1 0 0 1 1 1 0 1
mask: 0 0 0 1 0 0 0 0
---------------
result of AND: 0 0 0 1 0 0 0 0
Since the result is != 0 we can tell that index 4 is included in the map.
