Byte array to string gives "???" [closed] - java

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 7 years ago.
So I am trying to write a steganography program in Java.
Here is what I have so far (the important parts):
private void hideMessage(){
    byte[] messageBytes = message.getBytes();
    //message is a string
    int messageLength = messageBytes.length;
    for(int i = messageLength-1; i>=0; i--){
        imageBytes[i+100000] = messageBytes[i];
        //imageBytes is a bitmap image read into a byte array using imageIO
    }
}
and
private void getMessage(){
    int messageLength = 11;
    byte[] messageBytes = new byte[messageLength];
    for(int i = messageLength; i>0; i--){
        messageBytes[i-1] = imageBytes[i+10000];
    }
    message = new String(messageBytes);
}
However this is the output I get for the string:
???????????
What am I doing wrong?

Pay attention to your zeroes. Your comment says 1000, getMessage uses 10000, and hideMessage uses 100000
(reposted as answer since apparently that's all that was wrong)

You can't simply create a string from arbitrary bytes: the bytes must be encodings of characters in the encoding you are using (in your case, the default encoding). If you use bytes that don't map to a character, they will be replaced by a substitute character, typically shown as '?'. The same is true in the other direction: if you have a string with characters which do not map to bytes, the getBytes() method will map them to (byte)'?'. I think one or both of these happened here.
If you are using JPG or a similar lossy image format, it will change the bytes of your image during saving.
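The substitution described in this answer can be reproduced in isolation. The following is a minimal sketch, not the asker's code: the class and method names are made up for illustration, and the charsets are passed explicitly (US-ASCII for encoding, UTF-8 for decoding) so the behaviour does not depend on the platform default.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class ReplacementSketch {

    // Characters that US-ASCII cannot represent are encoded as the byte '?'.
    static byte[] encodeAscii(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    // Bytes that are not valid UTF-8 are decoded as U+FFFD, the replacement
    // character, which many consoles then print as '?'.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Passing an explicit charset on both sides (getBytes and new String) at least makes the round trip predictable, whatever the platform default happens to be.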

If the plan is to actually change part of your bitmap bytes, you'd need to export the image as PNG, as it's lossless. JPEG would probably change the bytes slightly, which isn't a problem for an image, but for text it's obviously critical.
Second, if you're going to pick 100,000 as a fixed position to insert the message, you should set that up as a constant to make it easier and less error-prone. Speaking of which, your current fixed offsets differ by a '0': 10,000 in one method and 100,000 in the other.
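A minimal sketch of the constant suggested above. The class name, the constant name MESSAGE_OFFSET, and the method signatures are assumptions made for illustration (the original methods use fields instead of parameters), and UTF-8 is chosen explicitly rather than relying on the default encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical rewrite of hideMessage/getMessage around one shared offset.
final class OffsetSketch {

    // Single source of truth for where the message lives in the byte array.
    static final int MESSAGE_OFFSET = 100_000;

    // Copies the encoded message into the image bytes at the shared offset.
    static void hideMessage(byte[] imageBytes, String message) {
        byte[] messageBytes = message.getBytes(StandardCharsets.UTF_8);
        System.arraycopy(messageBytes, 0, imageBytes, MESSAGE_OFFSET, messageBytes.length);
    }

    // Reads messageLength bytes (not characters) back from the same offset.
    static String getMessage(byte[] imageBytes, int messageLength) {
        return new String(imageBytes, MESSAGE_OFFSET, messageLength, StandardCharsets.UTF_8);
    }
}
```

Because both methods refer to the same constant, the two offsets can no longer drift apart by a stray zero.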

But you shouldn't edit the raw file; edit an instance of BufferedImage instead, then write it back to a file with ImageIO.
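A rough sketch of the BufferedImage approach, under stated assumptions: the class and method names are invented, the message is stored one byte per pixel in the blue channel of the top row (a deliberately naive scheme chosen only to keep the example short), and the image is round-tripped through PNG in memory rather than through a file.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper class, for illustration only.
final class PngRoundTripSketch {

    // Stores message[x] in the blue channel of pixel (x, 0).
    static BufferedImage hide(BufferedImage image, byte[] message) {
        for (int x = 0; x < message.length; x++) {
            int rgb = image.getRGB(x, 0);
            image.setRGB(x, 0, (rgb & 0xFFFFFF00) | (message[x] & 0xFF));
        }
        return image;
    }

    // Reads length bytes back from the blue channel of the top row.
    static byte[] reveal(BufferedImage image, int length) {
        byte[] out = new byte[length];
        for (int x = 0; x < length; x++) {
            out[x] = (byte) (image.getRGB(x, 0) & 0xFF);
        }
        return out;
    }

    // Encodes the image as PNG (lossless) and decodes it again.
    static BufferedImage pngRoundTrip(BufferedImage image) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            ImageIO.write(image, "png", buffer);
            return ImageIO.read(new ByteArrayInputStream(buffer.toByteArray()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Writing to a real file would use the ImageIO.write overload that takes a File; swapping "png" for "jpg" is the case the answers warn about, since the pixel values are then no longer guaranteed to survive.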

Related

How to encode a ByteArray to a base36 String

To encode a ByteArray to Base64 I could simply use Base64 from java.util. But now I need to change my code to create base36 instead. Unfortunately, java.util doesn't have this functionality. What I need is a function/method that takes a ByteArray and outputs a String containing a base36 representation of it, with no other changes such as cutting off leading zeroes.
There is another question that looks similar, but there are two problems. First, the question got edited so that I don't understand the answer. Second, the answer uses BigInteger, and I'm afraid that converting the ByteArray to a BigInteger could lead to information loss (like leading zeroes).
Similar question on Stack Overflow.
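One possible way around the leading-zero concern, offered only as a sketch and not taken from any answer in this thread: prepend a sentinel byte 0x01 before handing the array to BigInteger, and strip it again after decoding. The class and method names are hypothetical, and the output is not compatible with any standard base36 format.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
final class Base36Sketch {

    // The sentinel 0x01 in front keeps leading zero bytes significant.
    static String encode(byte[] data) {
        byte[] prefixed = new byte[data.length + 1];
        prefixed[0] = 1;
        System.arraycopy(data, 0, prefixed, 1, data.length);
        return new BigInteger(1, prefixed).toString(36);
    }

    // The top byte of the decoded value is always the sentinel 0x01, so
    // toByteArray() adds no sign byte and the first byte can be dropped.
    static byte[] decode(String text) {
        byte[] prefixed = new BigInteger(text, 36).toByteArray();
        return Arrays.copyOfRange(prefixed, 1, prefixed.length);
    }
}
```

The price of the sentinel is a slightly longer string, and decode only makes sense for strings produced by this encode.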

How to inflate in Python some data that was deflated by Peoplesoft (Java)?

DISCLAIMER: Peoplesoft knowledge is not mandatory in order to help me with this one!
How could I extract the data from that Peoplesoft table, from the PUBDATALONG column?
The description of the table is here:
http://www.go-faster.co.uk/peopletools/psiblogdata.htm
Currently I am using a program written in Java, and below is a piece of the code:
Inflater inflater = new Inflater();
byte[] result = new byte[rs.getInt("UNCOMPDATALEN")];
inflater.setInput(rs.getBytes("PUBDATALONG"));
int length = inflater.inflate(result);
System.out.println(new String(result, 0, length, "UTF-8"));
System.out.println();
System.out.println("-----");
System.out.println();
How could I rewrite this using Python?
This question has appeared in other forms on Stack Overflow but had no real answer.
I have a basic understanding of what the code does in Java, but I don't know any library in Python I could work with to achieve the same thing.
Some recommended trying zlib, as it is compatible with the algorithm used by the Java Inflater class, but I did not succeed in doing that.
Considering the below facts from PeopleSoft manual:
When the message is received by the PeopleSoft database, the XML data
is converted to UTF-8 to prevent any UCS2 byte order issues. It is
also compressed using the deflate algorithm prior to storage in the
database.
I tried something like this:
import zlib
import base64
UNCOMPDATALEN = 362 #this value is taken from the DB and is the dimension of the data after decompression.
PUBDATALONG = '789CB3B1AFC8CD51284B2D2ACECCCFB35532D43350B2B7E3E5B2F130F40C8977770D8977F4710D0A890F0E710C090D8EF70F0D09080DB183C8BAF938BAC707FBBBFB783ADA19DAE86388D904B90687FAC0F4DAD940CD70F67771B533B0D147E6DAE8A3A9D5C76B3F00E2F4355C=='
print zlib.decompress(base64.b64decode(PUBDATALONG), 0, 362)
and I get this:
zlib.error: Error -3 while decompressing data: incorrect header check
I am surely doing something wrong, but I am not smart enough to figure it out by myself.
That string is not Base-64 encoded. It is simply hexadecimal. (I have no idea why it ends in ==, which makes it look a little like a Base-64 string.) You should be able to see by inspection that there are no lower case letters, or for that matter upper case letters after F as there would be in a typical Base-64 encoded string of compressed, i.e. random-appearing data.
Remove the equal signs at the end and use .decode("hex") in Python 2, or bytes.fromhex() in Python 3.
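The same point (the column value is hexadecimal text, not Base-64) can be sketched on the Java side. This is an illustration under assumptions: the class and method names are invented, the hex parser is hand-written to avoid depending on a particular JDK version, and deflateToHex exists only so the example can produce its own test data. In Python the equivalent steps would be bytes.fromhex() followed by zlib.decompress().

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, for illustration only.
final class HexInflateSketch {

    // Parses hexadecimal text into bytes, ignoring any trailing '=' signs.
    static byte[] hexToBytes(String hex) {
        String clean = hex.replace("=", "");
        byte[] out = new byte[clean.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(clean.charAt(2 * i), 16);
            int lo = Character.digit(clean.charAt(2 * i + 1), 16);
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Hex-decodes, inflates, and interprets the result as UTF-8.
    static String inflateHex(String hex, int uncompressedLength) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(hexToBytes(hex));
            byte[] result = new byte[uncompressedLength];
            int length = inflater.inflate(result);
            return new String(result, 0, length, StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not zlib data", e);
        } finally {
            inflater.end();
        }
    }

    // Demo helper: deflates short text and returns it as upper-case hex.
    static String deflateToHex(String text) {
        Deflater deflater = new Deflater();
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        byte[] buffer = new byte[text.length() * 2 + 64];
        int n = deflater.deflate(buffer);
        deflater.end();
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < n; i++) {
            hex.append(String.format("%02X", buffer[i]));
        }
        return hex.toString();
    }
}
```

The leading 789C in the question's value is a typical zlib header, which fits with the hex reading.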

Java bit processing of file.txt [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I want to process a file.txt at the binary level by removing every 5th bit if it is equal to 1. Save the new processed binary file and repeat the process until it can no longer find any more 5th bits equal to 1, then save the final file.
Usually you operate on bytes, not bits. If you want to access individual bits, you can use BitSet (assuming the file will fit in memory). For example, to set the bit at index 17 to 1:
final Path path = Paths.get("file.bin");
final BitSet bitSet = BitSet.valueOf(Files.readAllBytes(path));
bitSet.set(17, true);
Files.write(path, bitSet.toByteArray());
All files are already stored as binary. You can get the binary bytes from any file in Java using the Files API. As an example:
// Files.newInputStream only supports reading, so no WRITE option is passed.
try (InputStream is = Files.newInputStream(Paths.get("myFile.pdf"), StandardOpenOption.READ)) {
    byte[] buffer = new byte[1024];
    int bytesRead;
    // read() returns -1 once the end of the file is reached.
    while ((bytesRead = is.read(buffer)) != -1) {
        doSomethingWithBytes(buffer, bytesRead);
    }
}
*plus usual disclaimers about adding error handling & other checks as appropriate for your situation
Note that you will be reading your file in "chunks" of bytes no bigger than your buffer. If you know that your files will be small enough to fit comfortably in memory and your situation demands it, you can build an array that contains all the bytes from the file yourself.
If you wanted to do something with bytes of the file after reading it, you can do something similar using Files.newOutputStream(Path path, OpenOption... options).
To manipulate file bytes (both reading and writing), you could use RandomAccessFile or a ByteBuffer. An example using RandomAccessFile:
public void writeAndRead(byte[] bytes) throws IOException {
    // try-with-resources closes the file even if an exception is thrown.
    try (RandomAccessFile file = new RandomAccessFile("myFile.bin", "rw")) {
        // Write some bytes to the file.
        file.write(bytes);
        // Seek to the beginning of the file.
        file.seek(0);
        // Read back the bytes; readFully keeps reading until the buffer is full.
        byte[] buffer = new byte[bytes.length];
        file.readFully(buffer);
    }
}
My take on this would be something like this:
After reading a byte from your file, you could check its 5th bit value by using bitwise operations.
byte myByte;
int bit;
...
boolean bitValue = (myByte & (1 << bit)) != 0;
After reading one byte, check its 5th bit. If the bit is equal to 1, shift the first 3 bits of the byte to the left (this removes the bit). Now the first bit of your byte is undefined (it can be either 0 or 1). So read the next byte, take its last bit, and insert it into the previous byte's first bit.
Do the same shifting for the next byte until no bytes are left. Afterwards, repeat the process of checking the bits.
You can set a specific bit of a byte by doing this:
myByte |= 1 << bit;
Looking at other questions on Stack Overflow, maybe you could make use of bit-io.
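One pass of the process from the question could also be sketched with BitSet instead of manual shifting. Several things here are assumptions, since the question does not pin them down: bits are numbered the way BitSet.valueOf numbers them (bit 0 is the least significant bit of the first byte), "every 5th bit" means 1-based positions 5, 10, 15 and so on across the whole file, and the shortened result is padded with zero bits to a whole number of bytes. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical helper class, for illustration only.
final class FifthBitSketch {

    // Copies all bits except those at 1-based positions 5, 10, 15, ...
    // that are set to 1, closing the gap left by each dropped bit.
    static byte[] removeSetFifthBits(byte[] input) {
        BitSet in = BitSet.valueOf(input);
        BitSet out = new BitSet();
        int totalBits = input.length * 8;
        int written = 0;
        for (int i = 0; i < totalBits; i++) {
            boolean isFifth = (i + 1) % 5 == 0;
            if (isFifth && in.get(i)) {
                continue; // drop this bit
            }
            out.set(written++, in.get(i));
        }
        // toByteArray() omits trailing zero bytes, so restore the length.
        byte[] packed = out.toByteArray();
        return Arrays.copyOf(packed, (written + 7) / 8);
    }
}
```

Repeating the pass until the output stops changing would give the "until no more 5th bits equal 1" behaviour. Because of the zero padding, bit positions in later passes are relative to the padded data, which may or may not be what the asker intends.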

How to convert "java.nio.HeapByteBuffer" to String

I have a data structure, java.nio.HeapByteBuffer[pos=71098 lim=71102 cap=94870], which I need to convert into an Int (in Scala). The conversion might look simple, but whichever approach I try, I do not get the right conversion. Could you please help me?
Here is my code snippet:
val v : ByteBuffer= map.get("company").get
val utf_str = new String(v, java.nio.charset.StandardCharsets.UTF_8)
println (utf_str)
the output is just "R" ??
I can't see how you can even get that to compile. String has constructors that accept another string or an array, but not a ByteBuffer or any of its parents.
To work with the NIO buffer API, you first write to a buffer, then do a flip before you read from the buffer. There are lots of good resources online about that, this one for example: http://tutorials.jenkov.com/java-nio/buffers.html
How to read that as a string depends entirely on how the characters are encoded inside the buffer. If they are two bytes per character (as strings are in Java/the JVM), you can convert your buffer to a character buffer by using asCharBuffer.
So, for example:
val byteBuffer = ByteBuffer.allocate(7).order(ByteOrder.BIG_ENDIAN);
byteBuffer.putChar('H').putChar('i').putChar('!')
byteBuffer.flip()
val charBuffer = byteBuffer.asCharBuffer
assert(charBuffer.toString == "Hi!")
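For the buffer in the question, two other readings are possible, sketched here in Java (the same calls are available from Scala). The class and method names are hypothetical. The sketch assumes the interesting bytes are the ones between position and limit; pos=71098 and lim=71102 leave exactly four bytes, which is the size of an Int, but that is an inference from the printed state, not something the question confirms.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class ByteBufferDecodeSketch {

    // Decodes the bytes between position and limit as UTF-8. Working on a
    // duplicate leaves the original buffer's position untouched.
    static String remainingAsUtf8(ByteBuffer buffer) {
        return StandardCharsets.UTF_8.decode(buffer.duplicate()).toString();
    }

    // Reads four bytes starting at the current position as an int, using
    // the buffer's byte order, without moving the position.
    static int remainingAsInt(ByteBuffer buffer) {
        return buffer.getInt(buffer.position());
    }
}
```

As a side note, the int 82 stored big-endian is the bytes 0, 0, 0, 0x52, and decoding those as UTF-8 gives three NUL characters followed by "R". That would be consistent with the output in the question, although nothing in the question confirms what the buffer actually holds.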

Convert ASCII byte[] to String

I am trying to pass a byte[] containing ASCII characters to log4j, to be logged into a file using the obvious representation. When I simply pass in the byte[], it is of course treated as an object and the logs are pretty useless. When I try to convert them to strings using new String(byte[] data), the performance of my application is halved.
How can I efficiently pass them in, without incurring the approximately 30us time penalty of converting them to strings?
Also, why does it take so long to convert them?
Thanks.
Edit
I should add that I am optimising for latency here - and yes, 30us does make a difference! Also, these arrays vary from ~100 all the way up to a few thousand bytes.
ASCII is one of the few encodings that can be converted to/from UTF16 with no arithmetic or table lookups so it's possible to convert manually:
String convert(byte[] data) {
StringBuilder sb = new StringBuilder(data.length);
for (int i = 0; i < data.length; ++ i) {
if (data[i] < 0) throw new IllegalArgumentException();
sb.append((char) data[i]);
}
return sb.toString();
}
But make sure it really is ASCII, or you'll end up with garbage.
What you want to do is delay processing of the byte[] array until log4j decides that it actually wants to log the message. This way you could log it at DEBUG level, for example, while testing and then disable it during production. For example, you could:
final byte[] myArray = ...;
Logger.getLogger(MyClass.class).debug(new Object() {
    @Override public String toString() {
        return new String(myArray);
    }
});
Now you don't pay the speed penalty unless you actually log the data, because the toString method isn't called until log4j decides it'll actually log the message!
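The same deferral can be written as a small named class instead of an anonymous one. This is a sketch under assumptions: the class name LazyAscii is made up, the conversions counter exists only to make the laziness observable, and US-ASCII is used because the question says the data is ASCII.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper: the byte[] is only decoded when toString() is called.
final class LazyAscii {

    private final byte[] data;

    // Counts decode calls; present only to demonstrate the deferral.
    int conversions = 0;

    LazyAscii(byte[] data) {
        this.data = data;
    }

    @Override
    public String toString() {
        conversions++;
        return new String(data, StandardCharsets.US_ASCII);
    }
}
```

It would be passed the same way as the anonymous object above, for example debug(new LazyAscii(myArray)). The wrapper keeps a reference to the array rather than a copy, so the array must not be modified before the message is actually written.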
Now I'm not sure what you mean by "the obvious representation" so I've assumed that you mean convert to a String by reinterpreting the bytes as the default character encoding. Now if you are dealing with binary data, this is obviously worthless. In that case I'd suggest using Arrays.toString(byte[]) to create a formatted string along the lines of
[54, 23, 65, ...]
If your data is in fact ASCII (i.e. 7-bit data), then you should be using new String(data, "US-ASCII") instead of depending on the platform default encoding. This may be faster than trying to interpret it as your platform default encoding (which could be UTF-8, which requires more introspection).
You could also speed this up by avoiding the Charset-Lookup hit each time, by caching the Charset instance and calling new String(data, charset) instead.
Having said that: it's been a very, very long time since I've seen real ASCII data in production environment
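A minimal sketch of the cached-Charset suggestion from this answer. The class and method names are hypothetical; StandardCharsets.US_ASCII is a ready-made Charset constant, so no lookup by name happens per call.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class AsciiDecoder {

    // Resolved once, instead of looking up "US-ASCII" by name on every call.
    private static final Charset ASCII = StandardCharsets.US_ASCII;

    static String decode(byte[] data) {
        return new String(data, ASCII);
    }
}
```

Whether this is measurably faster than the manual loop or the default-charset constructor would need to be benchmarked for the array sizes in question.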
Halved performance? How large is this byte array? If it's for example 1MB, then there are certainly more factors to take into account than just "converting" from bytes to chars (which is supposed to be fast enough though). Writing 1MB of data instead of "just" 100bytes (which the byte[].toString() may generate) to a log file is obviously going to take some time. The disk file system is not as fast as RAM memory.
You'll need to change the string representation of the byte array, maybe to some more sensible information, e.g. the name associated with it (filename?), its length and so on. After all, what does that byte array actually represent?
Edit: I don't remember seeing the "approximately 30us" phrase in your question (maybe you edited it in within 5 minutes after asking), but this is actually micro-optimization and it should certainly not cause "halved performance" in general, unless you write them a million times per second (and even then, why would you want to do that? Aren't you overusing logging?).
Take a look here: Faster new String(bytes, cs/csn) and String.getBytes(cs/csn)
