Byte[] input to Java Scanner - java

I'm writing a plugin for a Java application. External software makes a TCP connection to this application and sends messages to my plugin as UTF-8 encoded JSON objects. Each message is separated by a delimiter. I'm currently using "\u00A1" (¡) as the delimiter.
{ "message": "value" }¡{ "message": "value" }¡{ "message": "value" }...
Since TCP doesn't provide any guarantees about how much data will arrive at a time, the plugin has to receive this stream of data and pull out each individual { "message": "value" } token. Sounds like a great use of java.util.Scanner.
The problem is, the application doesn't provide my plugin direct access to the TCP socket. The plugin receives data as repeated calls to its receiveData(byte[] bytes) method. I need some sort of input stream or channel that Scanner can read from, but that I can also deposit bytes to (from receiveData). Does such a thing exist? If not, any suggestions for implementing one? Or am I way off and is there a better way to approach this?
Note: I originally tried to implement this logic manually. I would take each chunk of received bytes, decode to a string, search for the delimiter, and append to a StringBuilder. Then I realized that this approach isn't valid because the incoming byte[] probably won't end on an even UTF-8 character boundary and would not decode properly. I feel like Scanner is exactly what I want, I just can't figure out how to provide it input.
Edit: The data is streamed continuously to update the display as the application is running. It is not possible to wait until there is no more data to begin parsing.

I ended up creating a circular buffer with a condition variable that implements ReadableByteChannel. Seems to work well enough. Here's a full example:
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.util.Scanner;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
public class Buffer implements ReadableByteChannel {
private final Lock lock = new ReentrantLock();
private final Condition notEmpty = lock.newCondition();
private int readIndex = 0;
private int writeIndex = 0;
private int size = 0;
private final int capacity;
private final byte[] buff;
public Buffer(int capacity) {
this.capacity = capacity;
this.buff = new byte[capacity];
}
/**
* Deposit bytes to the buffer. Will only write until
* buffer is full.
* @param bytes the bytes to add
* @return the number of bytes actually added
*/
public int addBytes(byte[] bytes) {
lock.lock();
int writeCount = 0;
try {
int available = capacity - size;
writeCount = available <= bytes.length ? available : bytes.length;
for (int i = 0; i < writeCount; ++i) {
buff[writeIndex] = bytes[i];
incrementWrite();
++size;
}
// notify callers waiting on read()
notEmpty.signal();
} finally {
lock.unlock();
}
return writeCount;
}
public int addBytes(byte[] bytes, int offset, int length) {
lock.lock();
int writeCount = 0;
try {
int available = capacity - size;
writeCount = available <= length ? available : length;
for (int i = 0; i <writeCount; ++i) {
buff[writeIndex] = bytes[offset + i];
incrementWrite();
++size;
}
notEmpty.signal();
} finally {
lock.unlock();
}
return writeCount;
}
@Override
public int read(ByteBuffer byteBuffer) throws IOException {
lock.lock();
try {
// if the current size is 0, wait until data is added
while (size == 0) {
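// Condition.await() releases the lock while waiting; Object.wait() would
// throw IllegalMonitorStateException because no monitor is held here
notEmpty.await();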
}
int attempt = byteBuffer.remaining();
int readCount = attempt <= size ? attempt : size;
for (int i = 0; i < readCount; ++i) {
byteBuffer.put(buff[readIndex]);
incrementRead();
--size;
}
return readCount;
} catch (InterruptedException ie) {
Thread.currentThread().interrupt();
return 0;
} finally {
lock.unlock();
}
}
@Override
public boolean isOpen() {
return true;
}
@Override
public void close() throws IOException {
// do nothing
}
private final void incrementRead() {
// increment and wrap around if necessary
if (++readIndex == capacity) {
readIndex = 0;
}
}
private final void incrementWrite() {
// increment and wrap around if necessary
if (++writeIndex == capacity) {
writeIndex = 0;
}
}
public static void main(String[] args) {
final Buffer buff = new Buffer(1024);
final Scanner scanner = new Scanner(buff).useDelimiter("!");
Thread readThread = new Thread(new Runnable() {
@Override
public void run() {
while (scanner.hasNext()) {
String message = scanner.next();
System.out.println(message);
if (message.equals("goodbye")) {
return;
}
}
}
});
readThread.start();
buff.addBytes("hello,".getBytes());
buff.addBytes(" world!".getBytes());
buff.addBytes("good".getBytes());
buff.addBytes("bye!".getBytes());
}
}
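The manual approach described in the question (decode each chunk, search for the delimiter, append to a StringBuilder) can also be made to work without Scanner if the decoding is done by a stateful CharsetDecoder instead of new String(bytes). Below is a minimal sketch of that alternative, not the code from the answer above. The class name MessageSplitter and the choice to return completed messages from receiveData are assumptions made for illustration. Bytes of a character that is cut off at a chunk boundary are held back and prepended to the next chunk.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: splits a chunked UTF-8 byte stream into delimited messages. */
public class MessageSplitter {
    private static final String DELIMITER = "\u00A1";

    // Stateful decoder; malformed input is replaced instead of stalling the stream.
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
    // Trailing bytes of a character that was cut off by a chunk boundary.
    private byte[] pending = new byte[0];
    // Decoded text that has not yet been terminated by a delimiter.
    private final StringBuilder text = new StringBuilder();

    /** Feeds one chunk and returns the messages completed by it. */
    public List<String> receiveData(byte[] bytes) {
        ByteBuffer in = ByteBuffer.allocate(pending.length + bytes.length);
        in.put(pending).put(bytes);
        in.flip();

        // UTF-8 never produces more chars than it consumes bytes.
        CharBuffer out = CharBuffer.allocate(in.remaining());
        decoder.decode(in, out, false); // false: more input may follow
        out.flip();
        text.append(out);

        // Whatever the decoder left unread is an incomplete character; keep it.
        pending = new byte[in.remaining()];
        in.get(pending);

        List<String> messages = new ArrayList<>();
        int idx;
        while ((idx = text.indexOf(DELIMITER)) >= 0) {
            messages.add(text.substring(0, idx));
            text.delete(0, idx + DELIMITER.length());
        }
        return messages;
    }

    public static void main(String[] args) {
        String first = "{\"message\":\"a\"}";
        byte[] all = (first + DELIMITER + "{\"message\":\"b\"}" + DELIMITER)
                .getBytes(StandardCharsets.UTF_8);
        // Cut between the two bytes of the delimiter to simulate an unlucky TCP split.
        int cut = first.getBytes(StandardCharsets.UTF_8).length + 1;

        MessageSplitter splitter = new MessageSplitter();
        System.out.println(splitter.receiveData(Arrays.copyOfRange(all, 0, cut)));
        System.out.println(splitter.receiveData(Arrays.copyOfRange(all, cut, all.length)));
    }
}
```

Compared with the channel-plus-Scanner design, this runs entirely on the thread that calls receiveData, so it needs no lock or second thread, at the cost of doing the parsing inside the callback.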

Related

Passing Parameters to Threads while running [duplicate]

I want to implement a circular buffer using two threads: one thread that reads from the buffer and another that writes to it, as follows:
Assuming that process P is the writer process and process Q is the reader process.
After starting both processes from the main class, how can I pass to process P (the writer) the value I want it to write to the buffer each time I want to write to the buffer? Similarly, how can I ask process Q (the reader) to read from the buffer (i.e. how can I call it to return the value that it has read from the buffer)?
I am confused because the implementation of both processes is done in the run() method, and this method is executed when we issue the .start() command. Once started, how can we keep passing and reading values while the thread is running?
I am following the implementation from the example in this question:
Circular Buffer with Threads Consumer and Producer: it get stucks some executions
Shared Variables:
public class BufferCircular {
volatile int[] array;
volatile int p;
volatile int c;
volatile int nElem;
public BufferCircular(int[] array) {
this.array = array;
this.p = 0;
this.c = 0;
this.nElem = 0;
}
public void writeData (int data) {
this.array[p] = data;
this.p = (p + 1) % array.length;
this.nElem++;
}
public int readData() {
int data = array[c];
this.c = (c + 1) % array.length;
this.nElem--;
return data;
}
}
Writer Process:
public class Producer extends Thread {
BufferCircular buffer;
int bufferTam;
int contData;
public Producer(BufferCircular buff) {
this.buffer = buff;
this.bufferTam = buffer.array.length;
this.contData = 0;
}
public void produceData() {
this.contData++;
this.buffer.writeData(contData);
}
public void run() {
for (int i = 0; i < 500; i++) {
while (this.buffer.nElem == this.bufferTam) {
Thread.yield();
}
this.produceData();
}
}
}
Reader Process:
public class Consumer extends Thread {
BufferCircular buffer;
int cont;
public Consumer(BufferCircular buff) {
this.buffer = buff;
this.cont = 0;
}
public void consumeData() {
int data = buffer.readData();
cont++;
System.out.println("data " + cont + ": " + data);
}
public void run() {
for (int i = 0; i < 500; i++) {
while (this.buffer.nElem == 0) {
Thread.yield();
}
this.consumeData();
}
}
}
Main:
import java.util.Random;
public class Main {
public static void main(String[] args) {
Random ran = new Random();
int tamArray = ran.nextInt(21) + 1;
int[] array = new int[tamArray];
BufferCircular buffer = new BufferCircular(array);
Producer producer = new Producer (buffer);
Consumer consumer = new Consumer (buffer);
producer.start();
consumer.start();
try {
producer.join();
consumer.join();
} catch (InterruptedException e) {
System.err.println("Error with Threads");
e.printStackTrace();
}
}
}
Your main thread should not be responsible for passing values to the producer thread, and likewise it should not be responsible for printing the data from the consumer.
Producer: this should be responsible for obtaining the data and inserting it into your queue. In your case it currently increments an int value and passes it along, but it could also read data from a file or a database, or take user input from standard input, and put that data in the queue.
Consumer: this should be responsible for processing the data from the queue, in your case printing the numbers.
Check online
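One common way to hand values to an already running thread is a shared blocking queue from java.util.concurrent. ArrayBlockingQueue is itself a bounded circular buffer, so it can stand in for the hand-written BufferCircular. The sketch below is only an illustration: the class name QueueHandoff, the method sendAll, and the STOP sentinel are assumptions, not part of the question's code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueHandoff {
    // Sentinel value telling the consumer to stop (a convention assumed for this sketch).
    private static final int STOP = Integer.MIN_VALUE;

    /** Sends each value to a running consumer thread and returns the sum the consumer saw. */
    public static long sendAll(int[] values) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(4);
        long[] sum = {0};

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    int v = queue.take(); // blocks while the queue is empty
                    if (v == STOP) {
                        return;
                    }
                    sum[0] += v;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            // The caller keeps passing values long after start() was called.
            for (int v : values) {
                queue.put(v); // blocks while the queue is full
            }
            queue.put(STOP);
            consumer.join(); // join() makes the consumer's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(sendAll(new int[] {1, 2, 3, 4, 5, 6}));
    }
}
```

The thread's run() method never receives parameters after it starts. Both sides instead hold a reference to the same queue, and put() and take() do the blocking and the synchronization.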

Circular Buffer with Threads Consumer and Producer: it get stucks some executions

I'm developing a circular buffer with two Threads: Consumer and Producer.
I'm using active waiting with Thread.yield.
I know that it is possible to do that with semaphores, but I wanted the buffer without semaphores.
Both have a shared variable: bufferCircular.
While the buffer is not full of useful information, the producer writes data at position p of the array, and while there is some useful information, the consumer reads data at position c of the array. The variable nElem in BufferCircular is the number of values that haven't been read yet.
The program works fine about 9 out of 10 times it runs. Then, sometimes, it gets stuck in an infinite loop before showing the last element on screen (number 500 of the for loop), or it just doesn't show any element.
I think it is probably a livelock, but I can't find the mistake.
Shared Variable:
public class BufferCircular {
volatile int[] array;
volatile int p;
volatile int c;
volatile int nElem;
public BufferCircular(int[] array) {
this.array = array;
this.p = 0;
this.c = 0;
this.nElem = 0;
}
public void writeData (int data) {
this.array[p] = data;
this.p = (p + 1) % array.length;
this.nElem++;
}
public int readData() {
int data = array[c];
this.c = (c + 1) % array.length;
this.nElem--;
return data;
}
}
Producer Thread:
public class Producer extends Thread {
BufferCircular buffer;
int bufferTam;
int contData;
public Producer(BufferCircular buff) {
this.buffer = buff;
this.bufferTam = buffer.array.length;
this.contData = 0;
}
public void produceData() {
this.contData++;
this.buffer.writeData(contData);
}
public void run() {
for (int i = 0; i < 500; i++) {
while (this.buffer.nElem == this.bufferTam) {
Thread.yield();
}
this.produceData();
}
}
}
Consumer Thread:
public class Consumer extends Thread {
BufferCircular buffer;
int cont;
public Consumer(BufferCircular buff) {
this.buffer = buff;
this.cont = 0;
}
public void consumeData() {
int data = buffer.readData();
cont++;
System.out.println("data " + cont + ": " + data);
}
public void run() {
for (int i = 0; i < 500; i++) {
while (this.buffer.nElem == 0) {
Thread.yield();
}
this.consumeData();
}
}
}
Main:
import java.util.Random;
public class Main {
public static void main(String[] args) {
Random ran = new Random();
int tamArray = ran.nextInt(21) + 1;
int[] array = new int[tamArray];
BufferCircular buffer = new BufferCircular(array);
Producer producer = new Producer (buffer);
Consumer consumer = new Consumer (buffer);
producer.start();
consumer.start();
try {
producer.join();
consumer.join();
} catch (InterruptedException e) {
System.err.println("Error with Threads");
e.printStackTrace();
}
}
}
Any help will be welcome.
Your problem here is that your BufferCircular methods are sensitive to race conditions. Take for example writeData(). It executes in 3 steps, some of which are also not atomic:
this.array[p] = data; // 1
this.p = (p + 1) % array.length; // 2 not atomic
this.nElem++; // 3 not atomic
Suppose that two threads enter writeData() at the same time. At step 1 they both see the same value of p, and both write to array[p]. The slot is written twice, so the data from the first thread is lost when the second thread overwrites it. Then they execute step 2, and the result is unpredictable: p may end up incremented by 1 or by 2, because p = (p + 1) % array.length consists of three operations (read, compute, write) that the threads can interleave. Step 3 has the same problem: the ++ operator is not atomic either, since it is a read followed by a write, so nElem may also end up incremented by 1 or by 2.
So the result is completely unpredictable, which is why your program sometimes misbehaves.
The simplest solution is to make readData() and writeData() methods serialized. For this, declare them synchronized:
public synchronized void writeData (int data) { //...
public synchronized int readData () { //...
If you have only one producer thread and one consumer thread, race conditions can still occur on operations involving nElem, because the producer increments it while the consumer decrements it. The solution is to use AtomicInteger instead of int:
final AtomicInteger nElem = new AtomicInteger();
and use its incrementAndGet() and decrementAndGet() methods.
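For the single-producer, single-consumer case, the AtomicInteger variant could look roughly like the sketch below. It is an illustration under that assumption only. The class name SpscBuffer and the transfer helper are made up for the example, and the busy-wait with Thread.yield() is kept from the question rather than recommended.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Circular buffer that is safe for exactly one producer thread and one consumer thread. */
public class SpscBuffer {
    private final int[] array;
    private int p = 0; // touched only by the producer
    private int c = 0; // touched only by the consumer
    private final AtomicInteger nElem = new AtomicInteger();

    public SpscBuffer(int capacity) {
        array = new int[capacity];
    }

    public void writeData(int data) {
        while (nElem.get() == array.length) {
            Thread.yield(); // buffer full: active waiting, as in the question
        }
        array[p] = data;
        p = (p + 1) % array.length;
        nElem.incrementAndGet(); // atomic, and publishes the slot to the consumer
    }

    public int readData() {
        while (nElem.get() == 0) {
            Thread.yield(); // buffer empty
        }
        int data = array[c];
        c = (c + 1) % array.length;
        nElem.decrementAndGet(); // atomic, and frees the slot for the producer
        return data;
    }

    /** Moves the values 1..count through a buffer and returns the sum the consumer read. */
    public static long transfer(int count, int capacity) {
        SpscBuffer buffer = new SpscBuffer(capacity);
        long[] sum = {0};
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= count; i++) {
                buffer.writeData(i);
            }
        });
        Thread consumer = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                sum[0] += buffer.readData();
            }
        });
        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(transfer(500, 7));
    }
}
```

With more than one producer or more than one consumer, p and c would be shared as well, and the synchronized version is the safer choice.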

ArrayIndexOutOfBoundsException in a comparison of concurrency approaches

I want to run a comparison of different approaches to concurrency.
But it throws the following exceptions:
Warmup
BaseLine : 21246915
============================
Cycles : 50000
Exception in thread "pool-1-thread-3" Exception in thread "pool-1-thread-5" java.lang.ArrayIndexOutOfBoundsException: 100000
at concurrency.BaseLine.accumulate(SynchronizationComparisons.java:89)
at concurrency.Accumulator$Modifier.run(SynchronizationComparisons.java:39)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)
java.lang.ArrayIndexOutOfBoundsException: 100000
at concurrency.BaseLine.accumulate(SynchronizationComparisons.java:89)
at concurrency.Accumulator$Modifier.run(SynchronizationComparisons.java:39)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)
Here is the code:
import java.util.concurrent.*;
import java.util.concurrent.atomic.*;
import java.util.concurrent.locks.*;
import java.util.*;
import static net.mindview.util.Print.*;
abstract class Accumulator {
public static long cycles = 50000L;
// Number of Modifiers and Readers during each test:
private static final int N = 4;
public static ExecutorService exec = Executors.newFixedThreadPool(N * 2);
private static CyclicBarrier barrier = new CyclicBarrier(N * 2 + 1);
protected volatile int index = 0;
protected volatile long value = 0;
protected long duration = 0;
protected String id = "error";
protected final static int SIZE = 100000;
protected static int[] preLoaded = new int[SIZE];
static {
// Load the array of random numbers:
Random rand = new Random(47);
for (int i = 0; i < SIZE; i++)
preLoaded[i] = rand.nextInt();
}
public abstract void accumulate();
public abstract long read();
private class Modifier implements Runnable {
public void run() {
for (long i = 0; i < cycles; i++)
accumulate();
try {
barrier.await();
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}
private class Reader implements Runnable {
@SuppressWarnings("unused")
private volatile long value;
public void run() {
for (long i = 0; i < cycles; i++)
value = read();
try {
barrier.await();
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}
public void timedTest() {
long start = System.nanoTime();
for (int i = 0; i < N; i++) {
exec.execute(new Modifier());
exec.execute(new Reader());
}
try {
barrier.await();
} catch (Exception e) {
throw new RuntimeException(e);
}
duration = System.nanoTime() - start;
printf("%-13s: %13d\n", id, duration);
}
public static void report(Accumulator acc1, Accumulator acc2) {
printf("%-22s: %.2f\n", acc1.id + "/" + acc2.id, (double) acc1.duration / (double) acc2.duration);
}
}
class BaseLine extends Accumulator {
{
id = "BaseLine";
}
public void accumulate() {
value += preLoaded[index++];
if (index >= SIZE)
index = 0;
}
public long read() {
return value;
}
}
class SynchronizedTest extends Accumulator {
{
id = "synchronized";
}
public synchronized void accumulate() {
value += preLoaded[index++];
if (index >= SIZE)
index = 0;
}
public synchronized long read() {
return value;
}
}
class LockTest extends Accumulator {
{
id = "Lock";
}
private Lock lock = new ReentrantLock();
public void accumulate() {
lock.lock();
try {
value += preLoaded[index++];
if (index >= SIZE)
index = 0;
} finally {
lock.unlock();
}
}
public long read() {
lock.lock();
try {
return value;
} finally {
lock.unlock();
}
}
}
class AtomicTest extends Accumulator {
{
id = "Atomic";
}
private AtomicInteger index = new AtomicInteger(0);
private AtomicLong value = new AtomicLong(0);
public void accumulate() {
// Oops! Relying on more than one Atomic at
// a time doesn't work. But it still gives us
// a performance indicator:
int i = index.getAndIncrement();
value.getAndAdd(preLoaded[i]);
if (++i >= SIZE)
index.set(0);
}
public long read() {
return value.get();
}
}
public class SynchronizationComparisons {
static BaseLine baseLine = new BaseLine();
static SynchronizedTest synch = new SynchronizedTest();
static LockTest lock = new LockTest();
static AtomicTest atomic = new AtomicTest();
static void test() {
print("============================");
printf("%-12s : %13d\n", "Cycles", Accumulator.cycles);
baseLine.timedTest();
synch.timedTest();
lock.timedTest();
atomic.timedTest();
Accumulator.report(synch, baseLine);
Accumulator.report(lock, baseLine);
Accumulator.report(atomic, baseLine);
Accumulator.report(synch, lock);
Accumulator.report(synch, atomic);
Accumulator.report(lock, atomic);
}
public static void main(String[] args) {
int iterations = 5; // Default
if (args.length > 0) // Optionally change iterations
iterations = new Integer(args[0]);
// The first time fills the thread pool:
print("Warmup");
baseLine.timedTest();
// Now the initial test doesn't include the cost
// of starting the threads for the first time.
// Produce multiple data points:
for (int i = 0; i < iterations; i++) {
test();
Accumulator.cycles *= 2;
}
Accumulator.exec.shutdown();
}
}
How can I solve this problem?
The array preLoaded has size 100000, so the valid indices run from 0 to 99999, since array indices start at 0. You need to swap the statements in the accumulate() method.
Change this
value += preLoaded[index++]; //index validity is not done
if (index >= SIZE)
index = 0;
to
if (index >= SIZE)
index = 0;
value += preLoaded[index++]; // index validity is done and controlled
This way an index of 100000 is reset to 0 before it is used to access the array. Without synchronization, two threads can still both pass the check before either one increments, so this narrows the window rather than closing it completely.
Note: the original code is vulnerable only in a multi-threaded environment; it works fine with a single thread.
Change BaseLine class and AtomicTest class:
class BaseLine extends Accumulator {
{
id = "BaseLine";
}
public void accumulate() {
int early = index++; // early add and assign to a temp.
if(early >= SIZE) {
index = 0;
early = 0;
}
value += preLoaded[early];
}
public long read() {
return value;
}
}
class AtomicTest extends Accumulator {
{
id = "Atomic";
}
private AtomicInteger index = new AtomicInteger(0);
private AtomicLong value = new AtomicLong(0);
public void accumulate() {
int early = index.getAndIncrement();
if(early >= SIZE) {
index.set(0);
early = 0;
}
value.getAndAdd(preLoaded[early]);
}
public long read() {
return value.get();
}
}
I suspect that you're running into concurrent executions of BaseLine.accumulate() near the boundary of the preLoaded array.
You've got 4 threads hammering away at an unsynchronized method, which is potentially leading to index being incremented to 100000 by say, Thread 1, and before Thread 1 can set it back to 0, one of Thread 2, 3 or 4 is coming in and attempting to access preLoaded[index++], which fails as index is still 100000.
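If the index has to stay in range no matter how the threads interleave, the read, increment, and wrap can be folded into one atomic step. The sketch below uses AtomicInteger.getAndUpdate (Java 8 and later) for that. The class WrappingIndex, the smaller SIZE, and the array of ones are assumptions chosen so the result is easy to check; this is not the book's benchmark code.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class WrappingIndex {
    static final int SIZE = 1000;
    static final int[] preLoaded = new int[SIZE];
    static {
        // All ones, so the final sum equals the number of accumulate() calls.
        for (int i = 0; i < SIZE; i++) {
            preLoaded[i] = 1;
        }
    }

    private final AtomicInteger index = new AtomicInteger();
    private final AtomicLong value = new AtomicLong();

    public void accumulate() {
        // One atomic step: return the current index and store its wrapped successor,
        // so no thread can ever observe index == SIZE.
        int i = index.getAndUpdate(n -> (n + 1) % SIZE);
        value.addAndGet(preLoaded[i]);
    }

    public long read() {
        return value.get();
    }

    /** Runs accumulate() from several threads and returns the final value. */
    public static long run(int threads, int cycles) {
        WrappingIndex acc = new WrappingIndex();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < cycles; k++) {
                    acc.accumulate();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acc.read();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 50000));
    }
}
```

The index and the value are still two separate atomics, so a reader may see a value that lags behind the index by a few additions, but the array access itself can no longer go out of bounds.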

Get size of String w/ encoding in bytes without converting to byte[]

I have a situation where I need to know the size of a String/encoding pair, in bytes, but cannot use the getBytes() method because 1) the String is very large and duplicating the String in a byte[] array would use a large amount of memory, but more to the point 2) getBytes() allocates a byte[] array based on the length of the String * the maximum possible bytes per character. So if I have a String with 1.5B characters and UTF-16 encoding, getBytes() will try to allocate a 3GB array and fail, since arrays are limited to 2^31 - X elements (X is Java version specific).
So - is there some way to calculate the byte size of a String/encoding pair directly from the String object?
UPDATE:
Here's a working implementation of jtahlborn's answer:
private class CountingOutputStream extends OutputStream {
int total;
@Override
public void write(int i) {
throw new RuntimeException("don't use");
}
@Override
public void write(byte[] b) {
total += b.length;
}
@Override public void write(byte[] b, int offset, int len) {
total += len;
}
}
Simple, just write it to a dummy output stream:
class CountingOutputStream extends OutputStream {
private int _total;
@Override public void write(int b) {
++_total;
}
@Override public void write(byte[] b) {
_total += b.length;
}
@Override public void write(byte[] b, int offset, int len) {
_total += len;
}
public int getTotalSize(){
return _total;
}
}
CountingOutputStream cos = new CountingOutputStream();
Writer writer = new OutputStreamWriter(cos, "my_encoding");
//writer.write(myString);
// UPDATE: OutputStreamWriter does a simple copy of the _entire_ input string, to avoid that use:
for(int i = 0; i < myString.length(); i+=8096) {
int end = Math.min(myString.length(), i+8096);
writer.write(myString, i, end - i);
}
writer.flush();
System.out.println("Total bytes: " + cos.getTotalSize());
It's not only simple, but probably just as fast as the other "complex" answers.
The same using apache-commons libraries:
public static long stringLength(String string, Charset charset) {
try (NullOutputStream nul = new NullOutputStream();
CountingOutputStream count = new CountingOutputStream(nul)) {
IOUtils.write(string, count, charset.name());
count.flush();
return count.getCount();
} catch (IOException e) {
throw new IllegalStateException("Unexpected I/O.", e);
}
}
Guava has an implementation according to this post:
Utf8.encodedLength()
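For UTF-8 specifically, the length can also be computed in plain Java by walking the chars once and adding 1 to 4 bytes per code point, with no allocation at all. The sketch below is not Guava's implementation, and the class and method names are made up. It assumes that an unpaired surrogate should count as one byte, which matches String.getBytes(UTF_8) replacing it with '?'.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Length {
    /** Returns the number of bytes s would occupy in UTF-8, without encoding it. */
    public static long utf8Length(CharSequence s) {
        long bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch < 0x80) {
                bytes += 1; // ASCII
            } else if (ch < 0x800) {
                bytes += 2;
            } else if (Character.isHighSurrogate(ch)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                bytes += 4; // a valid surrogate pair is one 4-byte code point
                i++;        // skip the low surrogate
            } else if (Character.isSurrogate(ch)) {
                bytes += 1; // unpaired surrogate: counted like the '?' replacement
            } else {
                bytes += 3; // rest of the Basic Multilingual Plane
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        String sample = "{ \"message\": \"value\" }\u00A1\u20AC"
                + new String(Character.toChars(0x1F600));
        System.out.println(utf8Length(sample));
        System.out.println(sample.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

The return type is long so that the count cannot overflow for strings near the maximum String length.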
Here's an apparently working implementation:
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
public class TestUnicode {
private final static int ENCODE_CHUNK = 100;
public static long bytesRequiredToEncode(final String s,
final Charset encoding) {
long count = 0;
for (int i = 0; i < s.length(); ) {
int end = i + ENCODE_CHUNK;
if (end >= s.length()) {
end = s.length();
} else if (Character.isHighSurrogate(s.charAt(end - 1))) {
// the chunk would end on the first half of a surrogate pair: take the second half too
end++;
}
count += encoding.encode(s.substring(i, end)).remaining();
i = end;
}
return count;
}
public static void main(String[] args) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 100; i++) {
sb.appendCodePoint(11614);
sb.appendCodePoint(1061122);
sb.appendCodePoint(2065);
sb.appendCodePoint(1064124);
}
Charset cs = StandardCharsets.UTF_8;
System.out.println(bytesRequiredToEncode(new String(sb), cs));
System.out.println(new String(sb).getBytes(cs).length);
}
}
The output is:
1400
1400
In practice I'd increase ENCODE_CHUNK to 10MChars or so.
Probably slightly less efficient than brettw's answer, but simpler to implement.
Ok, this is extremely gross. I admit that, but this stuff is hidden by the JVM, so we have to dig a little. And sweat a little.
First, we want the actual char[] that backs a String without making a copy. To do this we have to use reflection to get at the 'value' field:
char[] chars = null;
for (Field field : String.class.getDeclaredFields()) {
if ("value".equals(field.getName())) {
field.setAccessible(true);
chars = (char[]) field.get(string); // <--- got it!
break;
}
}
Next you need to implement a subclass of java.nio.ByteBuffer. Something like:
class MyByteBuffer extends ByteBuffer {
int length;
// Your implementation here
};
Ignore all of the getters, implement all of the put methods like put(byte) and putChar(char) etc. Inside something like put(byte), increment length by 1, inside of put(byte[]) increment length by the array length. Get it? Everything that is put, you add the size of whatever it is to length. But you're not storing anything in your ByteBuffer, you're just counting and throwing away, so no space is taken. If you breakpoint the put methods, you can probably figure out which ones you actually need to implement. putFloat(float) is probably not used, for example.
Now for the grand finale, putting it all together:
MyByteBuffer bbuf = new MyByteBuffer(); // your "counting" buffer
CharBuffer cbuf = CharBuffer.wrap(chars); // wrap your char array
Charset charset = Charset.forName("UTF-8"); // your charset goes here
CharsetEncoder encoder = charset.newEncoder(); // make a new encoder
encoder.encode(cbuf, bbuf, true); // do it!
System.out.printf("Length: %d\n", bbuf.length); // pay me US$1,000,000
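A note on feasibility: the constructors of java.nio.ByteBuffer are package-private, so application code cannot subclass it as described. A similar "count and throw away" effect is available without reflection by encoding into a small reusable ByteBuffer and adding up how much each pass fills it. The sketch below shows that variation rather than the approach above. The class name EncodedSize and the 8192-byte scratch size are arbitrary choices for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodedSize {
    /** Counts the bytes s would occupy in the given charset, using a fixed-size scratch buffer. */
    public static long encodedSize(CharSequence s, Charset charset) {
        // REPLACE mirrors what String.getBytes(Charset) does with bad input.
        CharsetEncoder encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        CharBuffer in = CharBuffer.wrap(s);          // a read-only view, not a copy
        ByteBuffer out = ByteBuffer.allocate(8192);  // reused for every pass
        long total = 0;

        CoderResult result;
        do {
            result = encoder.encode(in, out, true);
            total += out.position(); // count what this pass produced...
            out.clear();             // ...then discard it
        } while (result.isOverflow());

        do {
            result = encoder.flush(out); // some encoders emit trailing bytes
            total += out.position();
            out.clear();
        } while (result.isOverflow());

        return total;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5000; i++) {
            sb.append("x\u00A1").appendCodePoint(0x1F600);
        }
        String big = sb.toString();
        System.out.println(encodedSize(big, StandardCharsets.UTF_8));
        System.out.println(big.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

Memory use stays at the size of the scratch buffer regardless of how long the string is.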

How to asynchronously modify an array of Strings

I thought this was an interesting programming problem so I posted it even though I think I have a solution idea that is good enough, see (*). If anyone has an elegant solution I would love to see it!
I am working with a method that calls an external library that makes HTTP requests to a server. I need K strings as input for the call to be efficient, i.e. each invocation of the external resource is an HTTP request, so I need to buffer up some data first. (As an example, let K be 200 and suppose a relevant token occurs in the text with, say, 1% probability; I would then need to process 20,000 tokens before finding the 200 input arguments.)
Effectively what this does is: externalHTTP(commaDelimitedString) -> get info about each string. Example externalHTTP("foo,bar") -> ["information snippet 1","information snippet 2"]. Where "information snippet 1" is about "foo".
I want to replace the "foo" and "bar" in a long text (string) with the information snippets but only after my buffer for the HTTP request is full. I still want to continue reading the original string while waiting for this to happen.
The text is tokenized by splitting (so I am working with an array of strings).
I.e. I would not like to stop execution of my text processing just because I am waiting for K strings to buffer up.
At first I thought that I could store words as individual string objects that I later update but then I realized that strings are immutable so it is call by value.
(*) My second idea was to store indices of the words (foo and bar) and then in order insert the snippets back into the original string array when the http request is finished. Like
class doStuff {
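String text[];
LinkedList<Integer> idxList = new LinkedList<Integer>();
int count = 0; // tokens buffered so far
int buffSize = 200; // K, the number of strings per HTTP request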
public doStuff(String[] t) {
text = t;
int i = 0;
for (String token : text) {
if (shouldReplaceToken(token)) {
replaceToken(i);
}
i++;
//do other work with the tokens
}
}
void replaceToken(int i) {
idxList.add(i);
if (++count > buffSize) {
count = 0;
String commaDelim = "";
ListIterator<Integer> it = idxList.listIterator(0);
while (it.hasNext()) {
commaDelim += text[it.next()]+",";
}
String[] http_response = http_req(commaDelim);
for (String snippet : http_response) {
int idx = idxList.poll(); //this is not elegant, dependent on FIFO order
text[idx] = snippet;
}
}
}
}
To complicate things further, I want to process several longer texts, so I would need a matrix of String arrays, one for each text.
I don't like the class-wide reference
String[] text
or the way I deal with indices in this code...
Hoping to see some suggestions :)
Edit: changed a bit to be more clear. I can't really say what I am looking up (non-disclosure etc.), sorry. Some names might be different from Java (but only small differences).
Ok... Here's an attempt to fully answer your question with example code.
I've never played with threads much, so I figured I'd try to learn something tonight.
This solution uses threads to allow the http request to take place asynchronously.
The asynchronous request is simulated by using Thread.sleep().
My test case is primitive: the main class just sleeps for 40 seconds to wait for everything to finish up.
Like I said, I'm new to Thread programming, so I probably overlooked something.
Hopefully this gets you started in the right direction...
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
/**
* A class that asynchronously replaces text in an
* array of strings by using helper threads.
*/
public class TextReplacer {
private final int bufferSize;
List<String> wordList = new ArrayList<String>();
List<Integer> indexList = new ArrayList<Integer>();
int bufferPosition = 0;
int lastPosition = 0;
public TextReplacer(String[] a, int n) {
bufferSize = n;
if (a != null) {
wordList = Arrays.asList(a);
}
}
public void replaceText() {
int i = 0;
for (String thisWord : getWordListCopy()) {
if (shouldReplaceToken(thisWord)) {
indexList.add(i);
processTextReplacement();
}
i++;
}
}
private void processTextReplacement() {
if (isBufferReady()) {
int currentPos = lastPosition;
replaceStrings(getCsv(), currentPos);
}
}
/** Uses a thread to replace strings in wordList. */
private void replaceStrings(String csv, int pos) {
new ReplacerThread(wordList, indexList, csv, pos, bufferSize).start();
}
private String getCsv() {
StringBuilder csv = new StringBuilder();
for (int i = 0; i < bufferSize; i ++) {
int idx = indexList.get(lastPosition++);
csv.append(wordList.get(idx)).append(",");
}
return csv.toString();
}
private boolean isBufferReady() {
bufferPosition++;
return ( bufferPosition % bufferSize == 0 );
}
private List<String> getWordListCopy() {
List<String> listCopy = new ArrayList<String>();
listCopy.addAll(wordList);
return listCopy;
}
/**
* Simulates a 10% replacement rate by only
* returning true for input that ends with a 3.
*/
private boolean shouldReplaceToken(String s) {
return s.endsWith("3");
}
public List<String> getWordList() {
return wordList;
}
public String[] getWordArray() {
return wordList.toArray(new String[0]);
}
}
/**
* A thread that sleeps for up to 8 seconds, then
* replaces a bunch of words in the list that is
* passed to it in its constructor.
*/
public class ReplacerThread extends Thread {
List<String> originalWords;
List<Integer> indices;
String wordCsv;
String[] replacementWords;
int startPos;
int bufferSize;
int maxSleepMillis = 8000;
int sleepMillis = getSleepMillis();
int threadNum; // for debugging
String prefix = new String(); // for debugging
/** Create a new thread. */
public ReplacerThread(List<String> o, List<Integer> i,
String c, int p, int n) {
originalWords = o;
indices = i;
wordCsv = c;
startPos = p;
bufferSize = n;
threadNum = startPos / bufferSize;
int count = 0;
while (count++ < threadNum) {
prefix += " ";
}
}
@Override
public void run() {
replacementWords = httpReq(wordCsv);
for (int i = 0; i < bufferSize; i ++) {
int pos = startPos + i;
int idx = indices.get(pos);
originalWords.set(idx, replacementWords[i]);
}
print("Thread #" + threadNum + " COMPLETE");
}
/** Simulate an asynchronous http request by using Thread.sleep */
private String[] httpReq(String s) {
try {
printSleepMessage();
sleep(sleepMillis);
}
catch (InterruptedException ex) {}
String[] repText = s.split(",");
for (int i = 0; i < repText.length; i++) {
repText[i] = repText[i].replace("Line", "Funky Line");
}
return repText;
}
private void printSleepMessage() {
int ms = sleepMillis / 1000;
print("Thread #" + threadNum + " SLEEP(" + ms + ")");
}
private int getSleepMillis() {
Double ms = maxSleepMillis * Math.random();
return ms.intValue();
}
public void print(Object o) {
String s = (o == null ? "null" : o.toString());
System.out.println(prefix + s + "\n");
}
}
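import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
/** A class that tests my funky solution. */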
public class Main {
static String inputFile = "test-input.txt";
static int bufferSize = 50;
public static void main(String[] args) {
String[] theInput = readInput();
TextReplacer testItem = new TextReplacer(theInput, bufferSize);
testItem.replaceText();
try {
// wait 40 seconds for everything to happen
Thread.sleep(40000);
}
catch (InterruptedException ex) { }
dumpOutput(testItem.getWordArray());
}
public static String[] readInput() {
File inFile = new File(inputFile);
List<String> lineList = new ArrayList<String>();
// try-with-resources closes the reader even if reading fails
try (BufferedReader buff = new BufferedReader(new FileReader(inFile))) {
String currentLine = buff.readLine();
while (currentLine != null) {
lineList.add(currentLine);
currentLine = buff.readLine();
}
}
catch (IOException ignoreMe) {}
print("Lines read: " + lineList.size());
return lineList.toArray(new String[0]);
}
public static void dumpOutput(String[] txt) {
long ms = System.currentTimeMillis();
String fileName = "output-" + ms + ".txt";
File outFile = new File(fileName);
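// try-with-resources closes (and therefore flushes) the writer;
// without it the buffered lines may never reach the file
try (BufferedWriter buff = new BufferedWriter(new FileWriter(outFile))) {
for (String s : txt) {
buff.write(s);
buff.newLine();
}
}
catch (IOException ignoreMe) {}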
print("Lines written: " + txt.length);
print("File: " + fileName);
}
public static void print(Object o) {
System.out.println(o == null ? "null" : o.toString());
}
}
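The same idea (remember the indices, send a full batch off in the background, write the snippets back by index when the response arrives) can be sketched more compactly with CompletableFuture, which also removes the need to sleep and hope everything has finished. This is an alternative sketch, not a rewrite of the code above. BatchReplacer, lookup, and shouldReplace are hypothetical stand-ins for the real HTTP call and token test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class BatchReplacer {
    /** Stand-in for the external HTTP call (hypothetical): one snippet per input word. */
    static String[] lookup(List<String> words) {
        String[] out = new String[words.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = "<" + words.get(i) + ">";
        }
        return out;
    }

    /** Stand-in for the real token test (hypothetical). */
    static boolean shouldReplace(String token) {
        return token.endsWith("3");
    }

    public static String[] replaceAll(String[] tokens, int batchSize) {
        String[] result = tokens.clone(); // background tasks write here, never to tokens
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (shouldReplace(tokens[i])) {
                indices.add(i);
                if (indices.size() == batchSize) {
                    pending.add(submit(tokens, result, indices));
                    indices = new ArrayList<>(); // each batch owns its own index list
                }
            }
            // other per-token work continues here without waiting for any request
        }
        if (!indices.isEmpty()) {
            pending.add(submit(tokens, result, indices)); // final partial batch
        }
        // Wait for every outstanding request before handing back the result.
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return result;
    }

    private static CompletableFuture<Void> submit(String[] tokens, String[] result,
                                                  List<Integer> batch) {
        List<String> words = new ArrayList<>();
        for (int idx : batch) {
            words.add(tokens[idx]);
        }
        return CompletableFuture.supplyAsync(() -> lookup(words))
                .thenAccept(snippets -> {
                    // Snippet k belongs to the k-th index of this batch.
                    for (int k = 0; k < snippets.length; k++) {
                        result[batch.get(k)] = snippets[k];
                    }
                });
    }

    public static void main(String[] args) {
        String[] out = replaceAll(new String[] {"a3", "b", "c3", "d3", "e"}, 2);
        System.out.println(String.join(" ", out));
    }
}
```

Because every batch carries its own list of indices, several texts can be processed by calling replaceAll once per text, with no shared index bookkeeping between them.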
