Jars differ but they should not - Java

I have one method to create a jar.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;

public class Test {
    public static void main(String[] args) throws Exception {
        aha();
        aha();
        aha();
        aha();
        Thread.sleep(5000);
        aha();
    }

    private static void aha() throws IOException, NoSuchAlgorithmException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        JarOutputStream jos = new JarOutputStream(baos);
        jos.putNextEntry(new ZipEntry("sd"));
        jos.write("sdf".getBytes());
        jos.close();
        MessageDigest md = MessageDigest.getInstance("sha1");
        byte[] digest = md.digest(baos.toByteArray());
        for (byte b : digest) {
            System.out.print("," + b);
        }
        System.out.println();
    }
}
The output is:
,-57,-44,59,113,-126,-15,71,62,-90,-120,27,36,-3,69,26,-55,63,107,-93,102
,-57,-44,59,113,-126,-15,71,62,-90,-120,27,36,-3,69,26,-55,63,107,-93,102
,-57,-44,59,113,-126,-15,71,62,-90,-120,27,36,-3,69,26,-55,63,107,-93,102
,-57,-44,59,113,-126,-15,71,62,-90,-120,27,36,-3,69,26,-55,63,107,-93,102
,-124,-26,-79,-28,-34,77,-72,83,92,53,30,-13,95,21,-92,55,70,24,-72,39
I need the same digest every time, but the last one differs. How can I make the hashes reproducible?

Although it is almost invisible: if you write a ZipEntry to a JarOutputStream, the underlying ZipOutputStream will initialize the last modification time for you.
if (e.xdostime == -1) {
// by default, do NOT use extended timestamps in extra
// data, for now.
e.setTime(System.currentTimeMillis());
}
You would have to initialize the time manually with setTime to get a constant result.
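For example, a minimal sketch of the fix, using an arbitrary fixed timestamp of 0L (the class name is made up for illustration):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;

public class ReproducibleJar {

    // Same jar-building code as above, but with a fixed entry time,
    // so the digest no longer depends on when the jar was built.
    static byte[] digest() throws IOException, NoSuchAlgorithmException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        JarOutputStream jos = new JarOutputStream(baos);
        ZipEntry entry = new ZipEntry("sd");
        entry.setTime(0L); // constant timestamp; any fixed value works
        jos.putNextEntry(entry);
        jos.write("sdf".getBytes());
        jos.close();
        return MessageDigest.getInstance("SHA-1").digest(baos.toByteArray());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(Arrays.toString(digest()));
        Thread.sleep(1000);
        System.out.println(Arrays.toString(digest()));
    }
}
```

With the timestamp pinned, both printed digests are identical even though a second has passed between the two calls.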

Related

How to get the intermediate compressed size using GZipOutputStream? [duplicate]

I have a BufferedWriter as shown below:
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
new GZIPOutputStream( hdfs.create(filepath, true ))));
String line = "text";
writer.write(line);
I want to find out the bytes written to the file without querying the file like
hdfs = FileSystem.get( new URI( "hdfs://localhost:8020" ), configuration );
filepath = new Path("path");
hdfs.getFileStatus(filepath).getLen();
as it will add overhead and I don't want that.
Also I can't do this:
line.getBytes().length;
as it gives the size before compression.
You can use the CountingOutputStream from the Apache Commons IO library.
Place it between the GZIPOutputStream and the file OutputStream (hdfs.create(..)).
After writing the content to the file you can read the number of written bytes from the CountingOutputStream instance.
If this isn't too late, you are using Java 1.7+, and you don't want to pull in an entire library like Guava or Commons IO, you can just extend GZIPOutputStream and obtain the data from the associated Deflater, like so:
public class MyGZIPOutputStream extends GZIPOutputStream {
public MyGZIPOutputStream(OutputStream out) throws IOException {
super(out);
}
public long getBytesRead() {
return def.getBytesRead();
}
public long getBytesWritten() {
return def.getBytesWritten();
}
public void setLevel(int level) {
def.setLevel(level);
}
}
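A usage sketch of that approach (class and method names here are hypothetical); note that finish() has to be called before reading the final count, and that def.getBytesWritten() counts the compressed payload, not the gzip header and trailer:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

public class GzipByteCount {

    // Same idea as MyGZIPOutputStream above: expose the protected Deflater's counter.
    static class CountingGZIPOutputStream extends GZIPOutputStream {
        CountingGZIPOutputStream(OutputStream out) throws IOException { super(out); }
        long getBytesWritten() { return def.getBytesWritten(); }
    }

    static long compressedSize(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        CountingGZIPOutputStream gz = new CountingGZIPOutputStream(sink);
        gz.write(data);
        gz.finish();                         // flush the deflater so the count is final
        long written = gz.getBytesWritten(); // read before close(), which ends the deflater
        gz.close();
        return written;
    }

    public static void main(String[] args) throws IOException {
        // 10,000 zero bytes are highly compressible, so the count is far below 10,000
        System.out.println(compressedSize(new byte[10_000]) + " compressed bytes for 10000 input bytes");
    }
}
```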
You can make your own descendant of OutputStream and count how many times the write method was invoked.
This is similar to the response by Olaseni, but I moved the counting into the BufferedOutputStream rather than the GZIPOutputStream, and this is more robust, since def.getBytesRead() in Olaseni's answer is not available after the stream has been closed.
With the implementation below, you can supply your own AtomicLong to the constructor so that you can assign the CountingBufferedOutputStream in a try-with-resources block, but still retrieve the count after the block has exited (i.e. after the file is closed).
public static class CountingBufferedOutputStream extends BufferedOutputStream {
private final AtomicLong bytesWritten;
public CountingBufferedOutputStream(OutputStream out) throws IOException {
super(out);
this.bytesWritten = new AtomicLong();
}
public CountingBufferedOutputStream(OutputStream out, int bufSize) throws IOException {
super(out, bufSize);
this.bytesWritten = new AtomicLong();
}
public CountingBufferedOutputStream(OutputStream out, int bufSize, AtomicLong bytesWritten)
throws IOException {
super(out, bufSize);
this.bytesWritten = bytesWritten;
}
@Override
public void write(byte[] b) throws IOException {
super.write(b);
bytesWritten.addAndGet(b.length);
}
@Override
public void write(byte[] b, int off, int len) throws IOException {
super.write(b, off, len);
bytesWritten.addAndGet(len);
}
@Override
public synchronized void write(int b) throws IOException {
super.write(b);
bytesWritten.incrementAndGet();
}
public long getBytesWritten() {
return bytesWritten.get();
}
}

How to calculate message digests in custom output stream?

I would like to implement an OutputStream that can produce MessageDigests. Likewise, I already have an InputStream implementation of it here, which works fine and extends FilterInputStream.
The problem is this: if I'm extending FilterOutputStream, the checksums don't match. If I use FileOutputStream it works fine (although that is not the stream I'd like to be using, as I'd like it to be a bit more generic than that).
public class MultipleDigestOutputStream extends FilterOutputStream
{
public static final String[] DEFAULT_ALGORITHMS = { EncryptionConstants.ALGORITHM_MD5,
EncryptionConstants.ALGORITHM_SHA1 };
private Map<String, MessageDigest> digests = new LinkedHashMap<>();
private File file;
public MultipleDigestOutputStream(File file, OutputStream os)
throws NoSuchAlgorithmException, FileNotFoundException
{
this(file, os, DEFAULT_ALGORITHMS);
}
public MultipleDigestOutputStream(File file, OutputStream os, String[] algorithms)
throws NoSuchAlgorithmException, FileNotFoundException
{
// super(file); // If extending FileOutputStream
super(os);
this.file = file;
for (String algorithm : algorithms)
{
addAlgorithm(algorithm);
}
}
public void addAlgorithm(String algorithm)
throws NoSuchAlgorithmException
{
MessageDigest digest = MessageDigest.getInstance(algorithm);
digests.put(algorithm, digest);
}
public MessageDigest getMessageDigest(String algorithm)
{
return digests.get(algorithm);
}
public Map<String, MessageDigest> getDigests()
{
return digests;
}
public String getMessageDigestAsHexadecimalString(String algorithm)
{
return MessageDigestUtils.convertToHexadecimalString(getMessageDigest(algorithm));
}
public void setDigests(Map<String, MessageDigest> digests)
{
this.digests = digests;
}
@Override
public void write(int b)
throws IOException
{
super.write(b);
System.out.println("write(int b)");
for (Map.Entry entry : digests.entrySet())
{
int p = b & 0xFF;
byte b1 = (byte) p;
MessageDigest digest = (MessageDigest) entry.getValue();
digest.update(b1);
}
}
@Override
public void write(byte[] b)
throws IOException
{
super.write(b);
for (Map.Entry entry : digests.entrySet())
{
MessageDigest digest = (MessageDigest) entry.getValue();
digest.update(b);
}
}
@Override
public void write(byte[] b, int off, int len)
throws IOException
{
super.write(b, off, len);
for (Map.Entry entry : digests.entrySet())
{
MessageDigest digest = (MessageDigest) entry.getValue();
digest.update(b, off, len);
}
}
@Override
public void close()
throws IOException
{
super.close();
}
}
My test case (the asserted checksums have been checked with md5sum and sha1sum):
public class MultipleDigestOutputStreamTest
{
@Before
public void setUp()
throws Exception
{
File dir = new File("target/test-resources");
if (!dir.exists())
{
//noinspection ResultOfMethodCallIgnored
dir.mkdirs();
}
}
@Test
public void testWrite()
throws IOException,
NoSuchAlgorithmException
{
String s = "This is a test.";
File file = new File("target/test-resources/metadata.xml");
ByteArrayOutputStream baos = new ByteArrayOutputStream();
MultipleDigestOutputStream mdos = new MultipleDigestOutputStream(file, baos);
mdos.write(s.getBytes());
mdos.flush();
final String md5 = mdos.getMessageDigestAsHexadecimalString("MD5");
final String sha1 = mdos.getMessageDigestAsHexadecimalString("SHA-1");
Assert.assertEquals("Incorrect MD5 sum!", "120ea8a25e5d487bf68b5f7096440019", md5);
Assert.assertEquals("Incorrect SHA-1 sum!", "afa6c8b3a2fae95785dc7d9685a57835d703ac88", sha1);
System.out.println("MD5: " + md5);
System.out.println("SHA1: " + sha1);
}
}
Could you please advise as to what could be the problem and how to fix it? Many thanks in advance!
If you are using Java 7 or above, you can just use DigestOutputStream.
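For example, a sketch that chains two java.security.DigestOutputStream instances so that a single pass over the data feeds both digests:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.math.BigInteger;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestStreamDemo {

    // Returns {md5, sha1} of data, computed while the bytes are being written through.
    static byte[][] digests(byte[] data) throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        // Chain two DigestOutputStreams so every byte updates both digests.
        try (DigestOutputStream out =
                 new DigestOutputStream(new DigestOutputStream(target, md5), sha1)) {
            out.write(data);
        }
        return new byte[][] { md5.digest(), sha1.digest() };
    }

    public static void main(String[] args) throws Exception {
        byte[][] d = digests("This is a test.".getBytes("UTF-8"));
        System.out.printf("MD5:   %032x%n", new BigInteger(1, d[0]));
        System.out.printf("SHA-1: %040x%n", new BigInteger(1, d[1]));
    }
}
```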
UPDATE
You can extend the abstract MessageDigest class to wrap multiple MessageDigest instances.
SOME CODE
import java.security.DigestException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
public class DigestWrapper extends MessageDigest
{
private final MessageDigest md5;
private final MessageDigest sha1;
// Some methods missing.
// I just implemented them throwing a RuntimeException.
public DigestWrapper() throws NoSuchAlgorithmException
{
super(null);
sha1 = MessageDigest.getInstance("sha-1");
md5 = MessageDigest.getInstance("md5");
}
public byte[] getMD5Digest()
{
return md5.digest();
}
public byte[] getSHA1Digest()
{
return sha1.digest();
}
@Override
public int digest(byte[] buf, int offset, int len) throws DigestException
{
md5.digest(buf, offset, len);
sha1.digest(buf, offset, len);
return 0;
}
@Override
public byte[] digest(byte[] input)
{
md5.digest(input);
sha1.digest(input);
return input;
}
@Override
public void reset()
{
md5.reset();
sha1.reset();
}
@Override
public void update(byte input)
{
md5.update(input);
sha1.update(input);
}
@Override
public void update(byte[] input, int offset, int len)
{
md5.update(input, offset, len);
sha1.update(input, offset, len);
}
@Override
public void update(byte[] input)
{
md5.update(input);
sha1.update(input);
}
}
I have created a project on Github which contains my implementation of the MultipleDigestInputStream and MultipleDigestOutputStream here.
To check how the code can be used, have a look at the following tests:
MultipleDigestInputStreamTest
MultipleDigestOutputStreamTest
Let me know if there is enough interest and I can release it and publish it to Maven Central.

Find file with certain extension and calculate its hash in Java

I want to calculate the MD5 hash of a file that ends with a certain extension in Java. I used two codes for this:
FileSearch.java
public class FileSearch
{
public static File findfile(File file) throws IOException
{
String drive = (new DetectDrive()).USBDetect();
Path start = FileSystems.getDefault().getPath(drive);
Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
@Override
public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
{
if (file.toString().endsWith(".raw"))
{
System.out.println(file);
}
return FileVisitResult.CONTINUE;
}
});
return file;
}
public static void main(String[] args) throws IOException
{
Hash hasher = new Hash();
try
{
if (file.toString().endsWith("raw"))
{
hasher.hash(file);
}
} catch (IOException e)
{
e.printStackTrace();
}
}
}
Hash.java
public class Hash
{
public void hash(File file) throws Exception
{
MessageDigest md = MessageDigest.getInstance("MD5");
FileInputStream fis = new FileInputStream(file);
byte[] dataBytes = new byte[1024];
int nread = 0;
while ((nread = fis.read(dataBytes)) != -1)
{
md.update(dataBytes, 0, nread);
}
byte[] mdbytes = md.digest();
StringBuffer sb = new StringBuffer();
for (int i = 0; i < mdbytes.length; i++)
{
sb.append(Integer.toString((mdbytes[i] & 0xff) + 0x100, 16).substring(1));
}
System.out.println("Digest(in hex format):: " + sb.toString());
}
}
The first code is used to search for the file that ends with .raw, while the second code (not completed yet) is used to get the raw file and then calculate its hash. However, I do not know how to call the first code from the second code to get that raw file. I believe I have to put a string inside new FileInputStream(...), but I need to pass the raw file instead.
Is it possible to do so since both of them contain a main method? Or do I need to change the FileSearch.java without the main method and have a "public String search()" instead and then call it in the second code? I would appreciate if you could show me how to do it the right way.
So the logic consists in these steps:
for each file with the .raw extension
hash the file
You should thus have a method void hash(File file), and call it from your first class.
So, in Hash.java, rename your main method to
public void hash(File file)
And open the file using
FileInputStream fis = new FileInputStream(file);
Then call this hash() method from your first class:
public static void main(String[] args) throws IOException
Hash hasher = new Hash();
...
if (file.toString().endsWith(".raw")) {
hasher.hash(file);
}
...
}
You'll also have to make sure that every FileInputStream you create is properly closed, otherwise you'll quickly run out of file descriptors. The best way to do that is to use the try-with-resources construct: http://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html
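Put together, a sketch of the whole flow (the start directory and class name are placeholders), with the stream closed by try-with-resources:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashWalker {

    // Hex MD5 of one file; try-with-resources guarantees the stream is closed.
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path start = Paths.get(args.length > 0 ? args[0] : "."); // placeholder root
        Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                if (file.toString().endsWith(".raw")) {
                    try {
                        System.out.println(file + " " + md5Hex(file));
                    } catch (NoSuchAlgorithmException e) {
                        throw new IOException(e);
                    }
                }
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```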

Is there a SignatureOutputStream (or equivalent) in Java?

I note that there is a CipherOutputStream in Java, but apparently no SignatureOutputStream. Is this true? Where might I find a SignatureOutputStream?
Here is an example implementation, just for you. It comes with a test function (see below).
package de.fencing_game.paul.examples;
import java.io.*;
import java.security.*;
/**
* This class provides an outputstream which writes everything
* to a Signature as well as to an underlying stream.
*/
public class SignatureOutputStream
extends OutputStream
{
private OutputStream target;
private Signature sig;
/**
* creates a new SignatureOutputStream which writes to
* a target OutputStream and updates the Signature object.
*/
public SignatureOutputStream(OutputStream target, Signature sig) {
this.target = target;
this.sig = sig;
}
public void write(int b)
throws IOException
{
write(new byte[]{(byte)b});
}
public void write(byte[] b)
throws IOException
{
write(b, 0, b.length);
}
public void write(byte[] b, int offset, int len)
throws IOException
{
target.write(b, offset, len);
try {
sig.update(b, offset, len);
}
catch(SignatureException ex) {
throw new IOException(ex);
}
}
public void flush()
throws IOException
{
target.flush();
}
public void close()
throws IOException
{
target.close();
}
}
Here are some test methods to show how to use this:
private static byte[] signData(OutputStream target,
PrivateKey key, String[] data)
throws IOException, GeneralSecurityException
{
Signature sig = Signature.getInstance("SHA1withRSA");
sig.initSign(key);
DataOutputStream dOut =
new DataOutputStream(new SignatureOutputStream(target, sig));
for(String s : data) {
dOut.writeUTF(s);
}
byte[] signature = sig.sign();
return signature;
}
private static void verify(PublicKey key, byte[] signature,
byte[] data)
throws IOException, GeneralSecurityException
{
Signature sig = Signature.getInstance("SHA1withRSA");
sig.initVerify(key);
ByteArrayOutputStream collector =
new ByteArrayOutputStream(data.length);
OutputStream checker = new SignatureOutputStream(collector, sig);
checker.write(data);
if(sig.verify(signature)) {
System.err.println("Signature okay");
}
else {
System.err.println("Signature falsed!");
}
}
/**
* a test method.
*/
public static void main(String[] params)
throws IOException, GeneralSecurityException
{
if(params.length < 1) {
params = new String[] {"Hello", "World!"};
}
KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
KeyPair pair = gen.generateKeyPair();
ByteArrayOutputStream arrayStream = new ByteArrayOutputStream();
byte[] signature = signData(arrayStream, pair.getPrivate(), params);
byte[] data = arrayStream.toByteArray();
verify(pair.getPublic(), signature, data);
// change one byte by one
data[3]++;
verify(pair.getPublic(), signature, data);
data = arrayStream.toByteArray();
verify(pair.getPublic(), signature, data);
// change signature
signature[4]++;
verify(pair.getPublic(), signature, data);
}
The main method signs its command-line parameters with a new (random) private key, then checks the signature with the corresponding public key. It then checks again with slightly altered data and with an altered signature.
Of course, for checking the signature a SignatureInputStream would be more useful; you can create it in precisely the same way.
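Such a SignatureInputStream might look like this (a sketch, mirroring the output stream above):

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.Signature;
import java.security.SignatureException;

/** Reads from an underlying stream and feeds every byte read to a Signature. */
public class SignatureInputStream extends FilterInputStream {

    private final Signature sig;

    public SignatureInputStream(InputStream in, Signature sig) {
        super(in);
        this.sig = sig;
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        if (b != -1) update(new byte[] { (byte) b }, 0, 1);
        return b;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        // FilterInputStream.read(byte[]) delegates here, so this covers both array reads.
        int n = in.read(b, off, len);
        if (n > 0) update(b, off, n);
        return n;
    }

    private void update(byte[] b, int off, int len) throws IOException {
        try {
            sig.update(b, off, len);
        } catch (SignatureException ex) {
            throw new IOException(ex);
        }
    }
}
```

After reading the signed data through this stream with a verify-initialized Signature, call sig.verify(signature) as usual.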

How to Cache InputStream for Multiple Use

I have an InputStream of a file, and I use Apache POI components to read from it like this:
POIFSFileSystem fileSystem = new POIFSFileSystem(inputStream);
The problem is that I need to use the same stream multiple times, and the POIFSFileSystem closes the stream after use.
What is the best way to cache the data from the input stream and then serve more input streams to different POIFSFileSystem ?
EDIT 1:
By cache I meant store for later use, not as a way to speed up the application. Also, is it better to just read the input stream into an array or string and then create input streams for each use?
EDIT 2:
Sorry to reopen the question, but the conditions are somewhat different when working inside desktop and web application.
First of all, the InputStream I get from org.apache.commons.fileupload.FileItem in my Tomcat web app doesn't support marking and thus cannot be reset.
Second, I'd like to be able to keep the file in memory for faster access and fewer I/O problems when dealing with files.
You can decorate the InputStream passed to POIFSFileSystem with a version that responds to close() by calling reset():
class ResetOnCloseInputStream extends InputStream {
private final InputStream decorated;
public ResetOnCloseInputStream(InputStream anInputStream) {
if (!anInputStream.markSupported()) {
throw new IllegalArgumentException("marking not supported");
}
anInputStream.mark( 1 << 24); // magic constant: BEWARE
decorated = anInputStream;
}
@Override
public void close() throws IOException {
decorated.reset();
}
@Override
public int read() throws IOException {
return decorated.read();
}
}
testcase
static void closeAfterInputStreamIsConsumed(InputStream is)
throws IOException {
int r;
while ((r = is.read()) != -1) {
System.out.println(r);
}
is.close();
System.out.println("=========");
}
public static void main(String[] args) throws IOException {
InputStream is = new ByteArrayInputStream("sample".getBytes());
ResetOnCloseInputStream decoratedIs = new ResetOnCloseInputStream(is);
closeAfterInputStreamIsConsumed(decoratedIs);
closeAfterInputStreamIsConsumed(decoratedIs);
closeAfterInputStreamIsConsumed(is);
}
EDIT 2
You can read the entire file into a byte[] (slurp mode) and then pass it to a ByteArrayInputStream.
Try BufferedInputStream, which adds mark and reset functionality to another input stream, and just override its close method:
public class UnclosableBufferedInputStream extends BufferedInputStream {
public UnclosableBufferedInputStream(InputStream in) {
super(in);
super.mark(Integer.MAX_VALUE);
}
@Override
public void close() throws IOException {
super.reset();
}
}
So:
UnclosableBufferedInputStream bis = new UnclosableBufferedInputStream (inputStream);
and use bis wherever inputStream was used before.
This works correctly:
byte[] bytes = getBytes(inputStream);
POIFSFileSystem fileSystem = new POIFSFileSystem(new ByteArrayInputStream(bytes));
where getBytes is like this:
private static byte[] getBytes(InputStream is) throws IOException {
byte[] buffer = new byte[8192];
ByteArrayOutputStream baos = new ByteArrayOutputStream(2048);
int n;
baos.reset();
while ((n = is.read(buffer, 0, buffer.length)) != -1) {
baos.write(buffer, 0, n);
}
return baos.toByteArray();
}
Use the implementation below for more customized use:
public class ReusableBufferedInputStream extends BufferedInputStream
{
private int totalUse;
private int used;
public ReusableBufferedInputStream(InputStream in, Integer totalUse)
{
super(in);
if (totalUse > 1)
{
super.mark(Integer.MAX_VALUE);
this.totalUse = totalUse;
this.used = 1;
}
else
{
this.totalUse = 1;
this.used = 1;
}
}
@Override
public void close() throws IOException
{
if (used < totalUse)
{
super.reset();
++used;
}
else
{
super.close();
}
}
}
What exactly do you mean by "cache"? Do you want the different POIFSFileSystem instances to start at the beginning of the stream? If so, there's absolutely no point caching anything in your Java code; it will be done by the OS, so just open a new stream.
Or do you want to continue reading at the point where the first POIFSFileSystem stopped? That's not caching, and it's very difficult to do. The only way I can think of, if you can't avoid the stream getting closed, would be to write a thin wrapper that counts how many bytes have been read, then open a new stream and skip that many bytes. But that could fail when POIFSFileSystem internally uses something like a BufferedInputStream.
If the file is not that big, read it into a byte[] array and give POI a ByteArrayInputStream created from that array.
If the file is big, then you shouldn't care, since the OS will do the caching for you as best as it can.
[EDIT] Use Apache commons-io to read the File into a byte array in an efficient way. Do not use int read() since it reads the file byte by byte which is very slow!
If you want to do it yourself, use a File object to get the length, create the array, and then a loop which reads bytes from the file. You must loop, since read(byte[], int offset, int len) can read fewer than len bytes (and usually does).
This is how I would implement it, to be safely used with any InputStream:
write your own InputStream wrapper where you create a temporary file to mirror the original stream content
dump everything read from the original input stream into this temporary file
when the stream was completely read you will have all the data mirrored in the temporary file
use InputStream.reset to switch(initialize) the internal stream to a FileInputStream(mirrored_content_file)
from now on you will lose the reference to the original stream (it can be collected)
add a new method release() which will remove the temporary file and release any open stream.
you can even call release() from finalize() to be sure the temporary file is released in case you forget to call release() (most of the time you should avoid using finalize; always call a method to release object resources). See Why would you ever implement finalize()?
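A sketch of that idea (all names hypothetical); note that reset() only yields the complete content if the original stream was fully consumed before the switch, and that the single-byte read path is shown for brevity:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Mirrors everything read from the original stream into a temporary file. */
public class TempFileCachingInputStream extends InputStream {

    private InputStream in;            // current source: the original, then the mirror file
    private final OutputStream mirror; // temp-file sink while the original is being read
    private final Path tempFile;
    private boolean mirroring = true;

    public TempFileCachingInputStream(InputStream original) throws IOException {
        tempFile = Files.createTempFile("stream-cache", ".tmp");
        mirror = Files.newOutputStream(tempFile);
        in = original;
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        if (mirroring && b != -1) mirror.write(b); // dump every byte into the mirror
        return b;
    }

    @Override
    public void reset() throws IOException {
        // Switch the internal stream to read the mirrored content from the temp file.
        if (mirroring) {
            mirror.close();
            mirroring = false;
        }
        in.close(); // drop the previous source (the original stream can now be collected)
        in = Files.newInputStream(tempFile);
    }

    /** Removes the temporary file and releases any open stream. */
    public void release() throws IOException {
        in.close();
        if (mirroring) mirror.close();
        Files.deleteIfExists(tempFile);
    }
}
```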
public static void main(String[] args) throws IOException {
BufferedInputStream inputStream = new BufferedInputStream(IOUtils.toInputStream("Foobar"));
inputStream.mark(Integer.MAX_VALUE);
System.out.println(IOUtils.toString(inputStream));
inputStream.reset();
System.out.println(IOUtils.toString(inputStream));
}
This works. IOUtils is part of commons IO.
This answer iterates on previous ones (1 | 2) based on BufferedInputStream. The main changes are that it allows infinite reuse, and it takes care of closing the original source input stream to free up system resources. Your OS defines a limit on those, and you don't want the program to run out of file handles (that's also why you should always "consume" responses, e.g. with Apache's EntityUtils.consumeQuietly()). EDIT: Updated the code to handle greedy consumers that use read(buffer, offset, length); in that case BufferedInputStream may try hard to look at the source, and this code protects against that use.
public class CachingInputStream extends BufferedInputStream {
public CachingInputStream(InputStream source) {
super(new PostCloseProtection(source));
super.mark(Integer.MAX_VALUE);
}
@Override
public synchronized void close() throws IOException {
if (!((PostCloseProtection) in).decoratedClosed) {
in.close();
}
super.reset();
}
private static class PostCloseProtection extends InputStream {
private volatile boolean decoratedClosed = false;
private final InputStream source;
public PostCloseProtection(InputStream source) {
this.source = source;
}
@Override
public int read() throws IOException {
return decoratedClosed ? -1 : source.read();
}
@Override
public int read(byte[] b) throws IOException {
return decoratedClosed ? -1 : source.read(b);
}
@Override
public int read(byte[] b, int off, int len) throws IOException {
return decoratedClosed ? -1 : source.read(b, off, len);
}
@Override
public long skip(long n) throws IOException {
return decoratedClosed ? 0 : source.skip(n);
}
@Override
public int available() throws IOException {
return source.available();
}
@Override
public void close() throws IOException {
decoratedClosed = true;
source.close();
}
@Override
public void mark(int readLimit) {
source.mark(readLimit);
}
@Override
public void reset() throws IOException {
source.reset();
}
@Override
public boolean markSupported() {
return source.markSupported();
}
}
}
To reuse it, just close it first if it wasn't closed already.
One limitation though is that if the stream is closed before the whole content of the original stream has been read, then this decorator will have incomplete data, so make sure the whole stream is read before closing.
I just add my solution here, as this works for me. It basically is a combination of the top two answers :)
private String convertStreamToString(InputStream is) {
Writer w = new StringWriter();
char[] buf = new char[1024];
Reader r;
is.mark(1 << 24);
try {
r = new BufferedReader(new InputStreamReader(is, "UTF-8"));
int n;
while ((n=r.read(buf)) != -1) {
w.write(buf, 0, n);
}
is.reset();
} catch(UnsupportedEncodingException e) {
Logger.debug(this.getClass(), "Cannot convert stream to string.", e);
} catch(IOException e) {
Logger.debug(this.getClass(), "Cannot convert stream to string.", e);
}
return w.toString();
}
