I would like to create 5 million CSV files. I have been waiting for almost 3 hours, but the program is still running. Can somebody give me some advice on how to speed up the file generation?
After these 5 million files have been generated, I have to upload them to an S3 bucket.
It would be even better if someone knows how to generate these files through AWS, so that we can move the files to the S3 bucket directly and avoid the network speed issue. (I have just started learning AWS, and there is a lot to learn.)
The following is my code.
public class ParallelCsvGenerate implements Runnable {
private static AtomicLong baseID = new AtomicLong(8160123456L);
private static ThreadLocalRandom random = ThreadLocalRandom.current();
private static ThreadLocalRandom random2 = ThreadLocalRandom.current();
private static String filePath = "C:\\5millionfiles\\";
private static List<String> headList = null;
private static String csvHeader = null;
public ParallelCsvGenerate() {
headList = generateHeadList();
csvHeader = String.join(",", headList);
}
@Override
public void run() {
for(int i = 0; i < 1000000; i++) {
generateCSV();
}
}
private void generateCSV() {
StringBuilder builder = new StringBuilder();
builder.append(csvHeader).append(System.lineSeparator());
for (int i = 0; i < headList.size(); i++) {
if(i < headList.size() - 1) {
builder.append(i % 2 == 0 ? generateRandomInteger() : generateRandomStr()).append(",");
} else {
builder.append(i % 2 == 0 ? generateRandomInteger() : generateRandomStr());
}
}
String fileName = String.valueOf(baseID.addAndGet(1));
File csvFile = new File(filePath + fileName + ".csv");
FileWriter fileWriter = null;
try {
fileWriter = new FileWriter(csvFile);
fileWriter.write(builder.toString());
fileWriter.flush();
} catch (Exception e) {
System.err.println(e);
} finally {
try {
if(fileWriter != null) {
fileWriter.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
private static List<String> generateHeadList() {
List<String> headList = new ArrayList<>(20);
String baseFiledName = "Field";
for(int i = 1; i <=20; i++) {
headList.add(baseFiledName + i);
}
return headList;
}
/**
* generate a number in range of 0-50000
* @return
*/
private Integer generateRandomInteger() {
return random.nextInt(0,50000);
}
/**
* generate a string length is 5 - 8
* @return
*/
private String generateRandomStr() {
int strLength = random2.nextInt(5, 8);
String str="abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
int length = str.length();
StringBuilder builder = new StringBuilder();
for (int i = 0; i < strLength; i++) {
builder.append(str.charAt(random.nextInt(length)));
}
return builder.toString();
}
Main
ParallelCsvGenerate generate = new ParallelCsvGenerate();
Thread a = new Thread(generate, "A");
Thread b = new Thread(generate, "B");
Thread c = new Thread(generate, "C");
Thread d = new Thread(generate, "D");
Thread e = new Thread(generate, "E");
a.run();
b.run();
c.run();
d.run();
e.run();
Thanks for your advice, everyone. I refactored the code and generated 3.8 million files in 2.8 hours, which is much better.
Refactored code:
public class ParallelCsvGenerate implements Callable<Integer> {
private static String filePath = "C:\\5millionfiles\\";
private static String[] header = new String[]{
"FIELD1","FIELD2","FIELD3","FIELD4","FIELD5",
"FIELD6","FIELD7","FIELD8","FIELD9","FIELD10",
"FIELD11","FIELD12","FIELD13","FIELD14","FIELD15",
"FIELD16","FIELD17","FIELD18","FIELD19","FIELD20",
};
private String fileName;
public ParallelCsvGenerate(String fileName) {
this.fileName = fileName;
}
@Override
public Integer call() throws Exception {
try {
generateCSV();
} catch (IOException e) {
e.printStackTrace();
}
return 0;
}
private void generateCSV() throws IOException {
CSVWriter writer = new CSVWriter(new FileWriter(filePath + fileName + ".csv"), CSVWriter.DEFAULT_SEPARATOR, CSVWriter.NO_QUOTE_CHARACTER);
String[] content = new String[]{
RandomGenerator.generateRandomInteger(),
RandomGenerator.generateRandomStr(),
RandomGenerator.generateRandomInteger(),
RandomGenerator.generateRandomStr(),
RandomGenerator.generateRandomInteger(),
RandomGenerator.generateRandomStr(),
RandomGenerator.generateRandomInteger(),
RandomGenerator.generateRandomStr(),
RandomGenerator.generateRandomInteger(),
RandomGenerator.generateRandomStr(),
RandomGenerator.generateRandomInteger(),
RandomGenerator.generateRandomStr(),
RandomGenerator.generateRandomInteger(),
RandomGenerator.generateRandomStr(),
RandomGenerator.generateRandomInteger(),
RandomGenerator.generateRandomStr(),
RandomGenerator.generateRandomInteger(),
RandomGenerator.generateRandomStr(),
RandomGenerator.generateRandomInteger(),
RandomGenerator.generateRandomStr()
};
writer.writeNext(header);
writer.writeNext(content);
writer.close();
}
}
Main
public static void main(String[] args) {
System.out.println("Start generate");
long start = System.currentTimeMillis();
ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(8, 8,
0L, TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<Runnable>());
List<ParallelCsvGenerate> taskList = new ArrayList<>(3800000);
for(int i = 0; i < 3800000; i++) {
taskList.add(new ParallelCsvGenerate(i+""));
}
try {
List<Future<Integer>> futures = threadPoolExecutor.invokeAll(taskList);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("Success");
long end = System.currentTimeMillis();
System.out.println("Using time: " + (end-start));
}
You could write directly into the file (without building the whole file content in one StringBuilder). (I think this is the biggest time and memory bottleneck here: builder.toString())
You could generate each file in parallel.
(Little tweaks:) Omit the ifs inside the loop.
if(i < headList.size() - 1) is not needed if you use a smarter loop plus one extra iteration.
The i % 2 == 0 check can be eliminated by a better iteration (i+=2) with more work inside the loop body (i -> int, i + 1 -> string).
If applicable, prefer append(char) to append(String). (append(',') is better than append(",")!)
...
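The tweaks above could be combined roughly as follows. This is only a sketch: the class name DirectCsvWriter and its method names are made up for illustration, it assumes an even column count (20 in the question), and it picks string lengths of 5 to 8 inclusive. It streams each cell to a buffered writer instead of building one big String, writes the first pair before the loop so that no "last column" check is needed, and steps by two so that no i % 2 test is needed.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper class; the name and method signatures are illustrative only.
class DirectCsvWriter {
    private static final String LETTERS =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Writes the header line and one data row straight to the writer,
    // without assembling the file content in a StringBuilder first.
    // Assumes columns is even and at least 2.
    static void writeCsv(Writer out, int columns) throws IOException {
        // Fetch the generator on the calling thread instead of caching it in a static field.
        ThreadLocalRandom rnd = ThreadLocalRandom.current();

        out.write("Field1");
        for (int i = 2; i <= columns; i++) {
            out.write(',');
            out.write("Field");
            out.write(Integer.toString(i));
        }
        out.write(System.lineSeparator());

        // First pair outside the loop, so the loop body needs no "last column" check.
        writePair(out, rnd);
        for (int i = 2; i < columns; i += 2) {
            out.write(',');
            writePair(out, rnd);
        }
    }

    // One integer cell followed by one random-letter cell.
    private static void writePair(Writer out, ThreadLocalRandom rnd) throws IOException {
        out.write(Integer.toString(rnd.nextInt(0, 50000)));
        out.write(',');
        int length = rnd.nextInt(5, 9); // 5 to 8 inclusive; the upper bound is exclusive
        for (int i = 0; i < length; i++) {
            out.write(LETTERS.charAt(rnd.nextInt(LETTERS.length())));
        }
    }

    // Opens a buffered writer for dir/name.csv and streams the content into it.
    static void writeFile(Path dir, String name) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(
                dir.resolve(name + ".csv"), StandardCharsets.UTF_8)) {
            writeCsv(out, 20);
        }
    }
}
```

Calling DirectCsvWriter.writeFile(Paths.get("C:\\5millionfiles"), "8160123457") from each worker task would then produce one file per call. Whether this helps much depends on how much of the time is spent in the filesystem rather than in string building.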
You can use the Fork/Join framework (Java 7 and above) to run your process in parallel and use multiple cores of your CPU.
Here is an example.
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;
import java.util.stream.LongStream;
public class ForkJoinAdd extends RecursiveTask<Long> {
private final long[] numbers;
private final int start;
private final int end;
public static final long threshold = 10_000;
public ForkJoinAdd(long[] numbers) {
this(numbers, 0, numbers.length);
}
private ForkJoinAdd(long[] numbers, int start, int end) {
this.numbers = numbers;
this.start = start;
this.end = end;
}
@Override
protected Long compute() {
int length = end - start;
if (length <= threshold) {
return add();
}
ForkJoinAdd firstTask = new ForkJoinAdd(numbers, start, start + length / 2);
firstTask.fork(); //start asynchronously
ForkJoinAdd secondTask = new ForkJoinAdd(numbers, start + length / 2, end);
Long secondTaskResult = secondTask.compute();
Long firstTaskResult = firstTask.join();
return firstTaskResult + secondTaskResult;
}
private long add() {
long result = 0;
for (int i = start; i < end; i++) {
result += numbers[i];
}
return result;
}
public static long startForkJoinSum(long n) {
long[] numbers = LongStream.rangeClosed(1, n).toArray();
ForkJoinTask<Long> task = new ForkJoinAdd(numbers);
return new ForkJoinPool().invoke(task);
}
}
Use this example as a starting point.
If you want to read more about it, Guide to the Fork/Join Framework in Java | Baeldung
and Fork/Join (The Java™ Tutorials)
can help you better understand it and better design your app.
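To connect the summing example to the file generation problem, here is a hedged sketch of the same divide-and-conquer idea applied to a range of file IDs. RangeAction and runRange are hypothetical names, and the per-ID work is passed in as a LongConsumer so that the file-writing code stays separate. Note that Fork/Join is designed mainly for CPU-bound work, so for blocking file I/O the gain may be limited.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.function.LongConsumer;

// Hypothetical task: splits the ID range [start, end) until it is small enough,
// then runs the supplied work once per ID.
class RangeAction extends RecursiveAction {
    private static final long THRESHOLD = 1_000; // assumed chunk size, tune as needed

    private final long start;
    private final long end;
    private final LongConsumer work;

    RangeAction(long start, long end, LongConsumer work) {
        this.start = start;
        this.end = end;
        this.work = work;
    }

    @Override
    protected void compute() {
        if (end - start <= THRESHOLD) {
            // Small enough: process the IDs sequentially on this worker thread.
            for (long id = start; id < end; id++) {
                work.accept(id);
            }
            return;
        }
        // Otherwise split in half and let the pool schedule both halves.
        long mid = start + (end - start) / 2;
        invokeAll(new RangeAction(start, mid, work), new RangeAction(mid, end, work));
    }

    // Runs the work for every ID in [start, end) on the common pool and waits for completion.
    static void runRange(long start, long end, LongConsumer work) {
        ForkJoinPool.commonPool().invoke(new RangeAction(start, end, work));
    }
}
```

For the question this could be called as RangeAction.runRange(0, 5_000_000, id -> writeOneFile(id)), where writeOneFile stands for whatever method writes a single CSV file and handles its own IOException.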
Good luck.
Remove the for(int i = 0; i < 1000000; i++) loop from the run method (leave a single generateCSV() call).
Create 5 million ParallelCsvGenerate objects.
Submit them to a ThreadPoolExecutor.
Converted main:
public static void main(String[] args) {
ThreadPoolExecutor ex = (ThreadPoolExecutor) Executors.newFixedThreadPool(8);
for(int i = 0; i < 5000000; i++) {
ParallelCsvGenerate generate = new ParallelCsvGenerate();
ex.submit(generate);
}
ex.shutdown();
}
It takes roughly 5 minutes to complete on my laptop (4 physical cores with hyperthreading, SSD drive).
EDIT:
I've replaced FileWriter with AsynchronousFileChannel using the following code:
Path file = Paths.get(filePath + fileName + ".csv");
try(AsynchronousFileChannel asyncFile = AsynchronousFileChannel.open(file,
StandardOpenOption.WRITE,
StandardOpenOption.CREATE)) {
asyncFile.write(ByteBuffer.wrap(builder.toString().getBytes()), 0);
} catch (IOException e) {
e.printStackTrace();
}
to achieve 30% speedup.
I believe that the main bottleneck is the hard drive and filesystem itself. Not much more can be achieved here.
I am able to extract the tempo from the MIDI file using .getTempoInBPM(), but somehow the .setTempoFactor() function that I'm using doesn't seem to alter the tempo of the music the soft synth is generating. I know the problem definitely lies with the Thread.sleep() call I'm using after the .noteOn(key,velocity) call for every MIDI message.
But I cannot find an alternative to it.
I'm attaching my code below.
Any insight is greatly appreciated. Thanks in advance.
import java.io.*;
import java.util.*;
import javax.sound.midi.*;
public class Test1 {
public static final int NOTE_ON = 0x90;
public static final int NOTE_OFF = 0x80;
public static void main(String[] args) throws Exception {
int instrmt = 0;
int tempo = 120;
String filename = null;
Sequencer sequencer;
sequencer = MidiSystem.getSequencer();
Sequence sequence = MidiSystem.getSequence(new File("STE-003-chord.mid"));
sequencer.setSequence(sequence);
int a = 0;
Scanner input = new Scanner(System.in);
float fBPM;
System.out.println("Enter tempo");
fBPM=input.nextFloat();
float tempoBPM = sequencer.getTempoInBPM();
System.out.println(tempoBPM);
float fFactor= fBPM / tempoBPM;
sequencer.setTempoFactor((float)(fFactor));
String inst[] = new String[131];
try
{
Synthesizer synth = MidiSystem.getSynthesizer();
synth.open();
MidiChannel channels[] = synth.getChannels();
Soundbank bank = synth.getDefaultSoundbank();
synth.loadAllInstruments(bank);
Instrument instrs[] = synth.getLoadedInstruments();
for (int i=0; i < instrs.length; i++)
{
System.out.println((i+1)+". "+instrs [i].getName());
inst[i+1] = instrs[i].getName();
}
String str;
Instrument instrument = null;
System.out.println("Enter an instrument number.");
int s = input.nextInt();
str=inst[s];
for (int i=0; i < instrs.length; i++)
{
if (instrs[i].getName().equals(str))
{
instrument = instrs[i];
break;
}
}
if (instrument == null)
{
System.out.println("Instrument not compatible.");
System.exit(0);
}
Patch instrumentPatch = instrument.getPatch();
channels[0].programChange(instrumentPatch.getBank(),
instrumentPatch.getProgram());
int trackNumber = 0;
for (Track track : sequence.getTracks()) {
trackNumber++;
System.out.println();
for (int i=0; i < track.size(); i++) {
MidiEvent event = track.get(i);
//System.out.print("#" + event.getTick() + " ");
MidiMessage message = event.getMessage();
if (message instanceof ShortMessage) {
ShortMessage sm = (ShortMessage) message;
if (sm.getCommand() == NOTE_ON) {
int key = sm.getData1();
int octave = (key / 12)-1;
int note = key % 12;
// String noteName = NOTE_NAMES[note];
int velocity = sm.getData2();
channels[0].noteOn(key,velocity);
Thread.sleep(1000); //This might create the problem
} else if (sm.getCommand() == NOTE_OFF) {
int key = sm.getData1();
int octave = (key / 12)-1;
int note = key % 12;
// String noteName = NOTE_NAMES[note];
int velocity = sm.getData2();
channels[0].noteOff(key);
}
}
System.out.println();
}
}
}
catch (Exception exc)
{
exc.printStackTrace();
}
}
}
Hey, I am trying to get the WS2801 working. Only the first LED lights up, and I can't control the LEDs.
I am using Java with the pi4j library. I connected the 5V pin of the Raspberry Pi to the 5V of the LED strip and the ground to the Raspberry Pi ground.
GPIO 19 is connected to the DO of the strip and GPIO 23 is connected to the CI of the strip. I hope someone can help me.
Here is the code so far:
package test.test2;
import java.util.Random;
import com.pi4j.wiringpi.Gpio;
import com.pi4j.wiringpi.Spi;
public class App {
public static void main(String[] args) {
System.out.println("Test 2!");
System.out.println("Out:" + Spi.wiringPiSPISetup(0, 500000));// 1mhz=1000000
System.out.println("Out:" + Gpio.wiringPiSetupSys());
int num_leds = 110;
try {
// Using a file vs wiringPi:
//
// FileOutputStream fos = new FileOutputStream(new
// File("/dev/spidev0.0"));
// fos.write(colors);
// fos.flush(); //limited success with this
Random randomGenerator = new Random();
for (;;) {
int colorToUse1 = randomGenerator.nextInt(255);
int colorToUse2 = randomGenerator.nextInt(255);
int colorToUse3 = randomGenerator.nextInt(255);
long startTimeMillis = System.currentTimeMillis();
// fade all lights out...
for (int j = 0; j < 255; j++) {
byte[] colors = new byte[num_leds * 3];
for (int i = 0; i < num_leds * 3; i = i + 3) {
colors[i] = (byte) colorToUse1;
colors[i + 1] = (byte) colorToUse2;
colors[i + 2] = (byte) colorToUse3;
}
if (colorToUse1 != 0)
colorToUse1--;
if (colorToUse2 != 0)
colorToUse2--;
if (colorToUse3 != 0)
colorToUse3--;
Spi.wiringPiSPIDataRW(0, colors, colors.length);
Gpio.delayMicroseconds(800); // need to determine optimal value
}
System.out.println("Elapsed:" + (System.currentTimeMillis() - startTimeMillis));
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
Update: It's working. I had something wired wrong. Thanks anyway.
I've tried to find any information regarding this but I haven't been able to find anything that helps.
I'm trying to make a program that generates a MIDI file consisting of two parts playing at once, each using a different instrument (program). I have been using a sample program:
http://www.cs.cornell.edu/courses/cs211/2008sp/examples/MidiSynth.java.txt
as a template, but when I try to create the MIDI events artificially (as opposed to generating them on the fly with the synth in the sample program), the resulting MIDI file doesn't seem to care that I have switched programs. It uses the last changed-to program for every note in the file, which consists of two MIDI tracks, even though I have saved program-change data to both tracks. I have pasted the code for my program below:
import java.io.File;
import java.io.IOException;
import javax.sound.midi.Sequence;
import javax.sound.midi.Soundbank;
import javax.sound.midi.Sequencer;
import javax.sound.midi.Synthesizer;
import javax.sound.midi.Instrument;
import javax.sound.midi.MidiChannel;
import javax.sound.midi.MidiEvent;
import javax.sound.midi.MidiMessage;
import javax.sound.midi.MidiSystem;
import javax.sound.midi.ShortMessage;
import javax.sound.midi.Track;
import javax.sound.midi.InvalidMidiDataException;
public class MidiTest2
{
/* This velocity is used for all notes.
*/
private static final int VELOCITY = 64;
final int PROGRAM = 192;
final int NOTEON = 144;
final int NOTEOFF = 128;
long startTime;
Sequence sequence;
Synthesizer synthesizer;
Sequencer sequencer;
Instrument instruments[];
ChannelData channels[];
ChannelData cc;
//int instrumentCounter = 0;
Track track;
MidiTest2(){
try{
if(synthesizer == null){
if((synthesizer = MidiSystem.getSynthesizer()) == null){
System.out.println("getSynthesizer() failed");
return;
}
}
synthesizer.open();
sequencer = MidiSystem.getSequencer();
sequence = new Sequence(Sequence.PPQ, 10);
}catch(Exception e){
e.printStackTrace();
return;
}
Soundbank sb = synthesizer.getDefaultSoundbank();
if(sb != null){
instruments = synthesizer.getDefaultSoundbank().getInstruments();
synthesizer.loadInstrument(instruments[0]);
}
MidiChannel midiChannels[] = synthesizer.getChannels();
channels = new ChannelData[midiChannels.length];
for(int i = 0; i < channels.length;++i){
channels[i] = new ChannelData(midiChannels[i], i);
}
cc = channels[0];
}
public void createShortEvent(int type, int num){
ShortMessage message = new ShortMessage();
try{
long millis = System.currentTimeMillis() - startTime;
long tick = millis * sequence.getResolution() / 500;
message.setMessage(type+cc.num, num, cc.velocity);
System.out.println("Type: " + message.getCommand() + ", Data1: " + message.getData1() + ", Data2: " + message.getData2() + ", Tick: " + tick);
MidiEvent event = new MidiEvent(message, tick);
track.add(event);
}catch (Exception e){
e.printStackTrace();
}
}
public void createShortEvent(int type, int num, int eventTime){
ShortMessage message = new ShortMessage();
try{
//long millis = System.currentTimeMillis() - startTime;
long tick = eventTime * sequence.getResolution();
message.setMessage(type+cc.num, num, cc.velocity);
System.out.println("Type: " + message.getCommand() + ", Data1: " + message.getData1() + ", Data2: " + message.getData2() + ", Tick: " + tick);
MidiEvent event = new MidiEvent(message, tick);
track.add(event);
}catch (Exception e){
e.printStackTrace();
}
}
public void saveMidiFile(){
try {
int[] fileTypes = MidiSystem.getMidiFileTypes(sequence);
if (fileTypes.length == 0) {
System.out.println("Can't save sequence");
} else {
if (MidiSystem.write(sequence, fileTypes[0], new File("testmidi.mid")) == -1) {
throw new IOException("Problems writing to file");
}
}
} catch (SecurityException ex) {
} catch (Exception ex) {
ex.printStackTrace();
}
}
void run(){
//System.out.println("sequence: " + sequence.getTracks().length);
createNewTrack(0);
createShortEvent(NOTEON, 60, 2);
createShortEvent(NOTEOFF, 60, 3);
createShortEvent(NOTEON, 61, 3);
createShortEvent(NOTEOFF, 61, 4);
createShortEvent(NOTEON, 62, 4);
createShortEvent(NOTEOFF, 62, 5);
createShortEvent(NOTEON, 63, 5);
createShortEvent(NOTEOFF, 63, 6);
createNewTrack(5);
createShortEvent(NOTEON, 50, 1);
createShortEvent(NOTEOFF, 50, 5);
playMidiFile();
saveMidiFile();
}
void printTrack(int num){
Track tempTrack = sequence.getTracks()[num];
System.out.println(tempTrack.get(0).getTick());
}
void playMidiFile(){
try{
sequencer.open();
sequencer.setSequence(sequence);
}catch (Exception e){
e.printStackTrace();
}
sequencer.start();
}
void createNewTrack(int program){
track = sequence.createTrack();
programChange(program);
}
void programChange(int program){
cc.channel.programChange(program);
System.out.println("program: " + program);
startTime = System.currentTimeMillis();
createShortEvent(PROGRAM, program);
}
public static void main(String[] args)
{
MidiTest2 mt = new MidiTest2();
mt.run();
}
}
The ChannelData class (which doesn't do anything, but I thought I'd post it for completeness' sake):
public class ChannelData {
MidiChannel channel;
boolean solo, mono, mute, sustain;
int velocity, pressure, bend, reverb;
int row, col, num;
public ChannelData(MidiChannel channel, int num) {
this.channel = channel;
this.num = num;
velocity = pressure = bend = reverb = 64;
}
public void setComponentStates() {
}
}
In the program I try to create 5 notes with the acoustic piano sound and one note with an electric piano sound. However, all notes are played back with the electric piano sound, even though I create a new track before I switch instruments.
I have been trying to figure this out for about 5 hours now and I'm all out of ideas.
Tracks can help your own program organize events, but they do not affect the synthesizer in any way.
To be able to have different settings, you must use different channels.
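In the question's code every message is built with type+cc.num, and cc is always channels[0], so both program changes and all notes end up on channel 0. A minimal sketch of the alternative, assuming a hypothetical class TwoChannelSequence and reusing the programs (0 and 5), keys, and beat positions from the question, gives each track its own channel:

```java
import javax.sound.midi.InvalidMidiDataException;
import javax.sound.midi.MidiEvent;
import javax.sound.midi.Sequence;
import javax.sound.midi.ShortMessage;
import javax.sound.midi.Track;

// Hypothetical example class: two tracks, each bound to its own MIDI channel.
class TwoChannelSequence {
    private static final int VELOCITY = 64;

    // Creates a track whose first event selects the program on the given channel.
    static Track addPart(Sequence seq, int channel, int program)
            throws InvalidMidiDataException {
        Track track = seq.createTrack();
        track.add(new MidiEvent(
                new ShortMessage(ShortMessage.PROGRAM_CHANGE, channel, program, 0), 0));
        return track;
    }

    // Adds a note-on/note-off pair on the given channel; times are in beats.
    static void addNote(Track track, Sequence seq, int channel, int key,
                        int onBeat, int offBeat) throws InvalidMidiDataException {
        long res = seq.getResolution(); // ticks per quarter note for a PPQ sequence
        track.add(new MidiEvent(
                new ShortMessage(ShortMessage.NOTE_ON, channel, key, VELOCITY), onBeat * res));
        track.add(new MidiEvent(
                new ShortMessage(ShortMessage.NOTE_OFF, channel, key, 0), offBeat * res));
    }

    static Sequence build() throws InvalidMidiDataException {
        Sequence seq = new Sequence(Sequence.PPQ, 10);

        // Channel 0 carries program 0 (the first part in the question).
        Track first = addPart(seq, 0, 0);
        addNote(first, seq, 0, 60, 2, 3);
        addNote(first, seq, 0, 61, 3, 4);
        addNote(first, seq, 0, 62, 4, 5);
        addNote(first, seq, 0, 63, 5, 6);

        // Channel 1 carries program 5 (the second part), so it no longer
        // overrides the program of the first part.
        Track second = addPart(seq, 1, 5);
        addNote(second, seq, 1, 50, 1, 5);
        return seq;
    }
}
```

The resulting Sequence can be passed to sequencer.setSequence(...) or MidiSystem.write(...) exactly as in the question. The two tracks are kept only for organization; what separates the instruments is the channel argument.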
I have a task to write a program with 1 camera, 1 Kinect, a lot of video processing, and then control of a robot.
This code just shows captured video frames without processing, but I only get approximately 20 frames/s. The same simple frame-display program in Matlab gave me 29 frames/s. I was hoping to gain some speed in Java, but it doesn't look like it. Am I doing something wrong? If not, how can I increase the speed?
public class Video implements Runnable {
//final int INTERVAL=1000;///you may use interval
IplImage image;
CanvasFrame canvas = new CanvasFrame("Web Cam");
public Video() {
canvas.setDefaultCloseOperation(javax.swing.JFrame.EXIT_ON_CLOSE);
}
@Override
public void run() {
FrameGrabber grabber = new VideoInputFrameGrabber(0); // 1 for next camera
int i=0;
try {
grabber.start();
IplImage img;
int g = 0;
long start2 = 0;
long stop = System.nanoTime();
long diff = 0;
start2 = System.nanoTime();
while (true) {
img = grabber.grab();
if (img != null) {
// cvFlip(img, img, 1);// l-r = 90_degrees_steps_anti_clockwise
// cvSaveImage((i++)+"-aa.jpg", img);
// show image on window
canvas.showImage(img);
}
g++;
if(g%200 == 0){
stop = System.nanoTime();
diff = stop - start2;
double d = (float)diff;
double dd = d/1000000000;
double dv = dd/g;
System.out.printf("frames = %.2f\n",1/dv);
}
//Thread.sleep(INTERVAL);
}
} catch (Exception e) {
}
}
public static void main(String[] args) {
Video gs = new Video();
Thread th = new Thread(gs);
th.start();
}
}