Read parquet data from ByteArrayOutputStream instead of file - java

I would like to convert this code:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.column.page.PageReadStore;
import org.apache.parquet.example.data.simple.SimpleGroup;
import org.apache.parquet.example.data.simple.convert.GroupRecordConverter;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.hadoop.util.HadoopInputFile;
import org.apache.parquet.io.ColumnIOFactory;
import org.apache.parquet.io.MessageColumnIO;
import org.apache.parquet.io.RecordReader;
import org.apache.parquet.schema.MessageType;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
public class ParquetReaderUtils {
    public static Parquet getParquetData(String filePath) throws IOException {
        List<SimpleGroup> simpleGroups = new ArrayList<>();
        ParquetFileReader reader = ParquetFileReader.open(HadoopInputFile.fromPath(new Path(filePath), new Configuration()));
        MessageType schema = reader.getFooter().getFileMetaData().getSchema();
        //List<Type> fields = schema.getFields();
        PageReadStore pages;
        while ((pages = reader.readNextRowGroup()) != null) {
            long rows = pages.getRowCount();
            MessageColumnIO columnIO = new ColumnIOFactory().getColumnIO(schema);
            RecordReader recordReader = columnIO.getRecordReader(pages, new GroupRecordConverter(schema));
            for (int i = 0; i < rows; i++) {
                SimpleGroup simpleGroup = (SimpleGroup) recordReader.read();
                simpleGroups.add(simpleGroup);
            }
        }
        reader.close();
        return new Parquet(simpleGroups, schema);
    }
}
(which is from https://www.arm64.ca/post/reading-parquet-files-java/)
to take a ByteArrayOutputStream parameter instead of a filePath.
Is this possible? I don't see a ParquetStreamReader in org.apache.parquet.hadoop.
Any help is appreciated. I am trying to write a test app for Parquet data coming from Kafka, and writing each of the many messages out to a file first is rather slow.

Without deeper testing, I would try this class (provided the content of the output stream is Parquet-compatible). I added a streamId to make identifying the processed byte array easier (ParquetFileReader prints the instance's toString() if something goes wrong).
import org.apache.parquet.io.DelegatingSeekableInputStream;
import org.apache.parquet.io.InputFile;
import org.apache.parquet.io.SeekableInputStream;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class ParquetStream implements InputFile {
    private final String streamId;
    private final byte[] data;

    private static class SeekableByteArrayInputStream extends ByteArrayInputStream {
        public SeekableByteArrayInputStream(byte[] buf) {
            super(buf);
        }

        public void setPos(int pos) {
            this.pos = pos;
        }

        public int getPos() {
            return this.pos;
        }
    }

    public ParquetStream(String streamId, ByteArrayOutputStream stream) {
        this.streamId = streamId;
        this.data = stream.toByteArray();
    }

    @Override
    public long getLength() throws IOException {
        return this.data.length;
    }

    @Override
    public SeekableInputStream newStream() throws IOException {
        return new DelegatingSeekableInputStream(new SeekableByteArrayInputStream(this.data)) {
            @Override
            public void seek(long newPos) throws IOException {
                ((SeekableByteArrayInputStream) this.getStream()).setPos((int) newPos);
            }

            @Override
            public long getPos() throws IOException {
                return ((SeekableByteArrayInputStream) this.getStream()).getPos();
            }
        };
    }

    @Override
    public String toString() {
        return "ParquetStream[" + streamId + "]";
    }
}
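With that class in place, the original method can take a ByteArrayOutputStream instead of a file path by wrapping it in ParquetStream and passing it to ParquetFileReader.open(InputFile). A minimal, untested sketch reusing the Parquet result class from the linked blog post (the "kafka-message" stream id is arbitrary):
import org.apache.parquet.column.page.PageReadStore;
import org.apache.parquet.example.data.simple.SimpleGroup;
import org.apache.parquet.example.data.simple.convert.GroupRecordConverter;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.io.ColumnIOFactory;
import org.apache.parquet.io.MessageColumnIO;
import org.apache.parquet.io.RecordReader;
import org.apache.parquet.schema.MessageType;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ParquetReaderUtils {
    public static Parquet getParquetData(ByteArrayOutputStream stream) throws IOException {
        List<SimpleGroup> simpleGroups = new ArrayList<>();
        // Wrap the in-memory bytes instead of going through HadoopInputFile.fromPath(...)
        try (ParquetFileReader reader = ParquetFileReader.open(new ParquetStream("kafka-message", stream))) {
            MessageType schema = reader.getFooter().getFileMetaData().getSchema();
            PageReadStore pages;
            while ((pages = reader.readNextRowGroup()) != null) {
                long rows = pages.getRowCount();
                MessageColumnIO columnIO = new ColumnIOFactory().getColumnIO(schema);
                RecordReader recordReader = columnIO.getRecordReader(pages, new GroupRecordConverter(schema));
                for (int i = 0; i < rows; i++) {
                    simpleGroups.add((SimpleGroup) recordReader.read());
                }
            }
            return new Parquet(simpleGroups, schema);
        }
    }
}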

Related

Flink ParquetSinkWriter FileAlreadyExistsException

I am trying to use Apache Flink to write a Parquet file on HDFS using BucketingSink and a custom ParquetSinkWriter.
Here is the code. The error below indicates that when checkpointing is enabled (snapshotState() is called in the BucketingSink class), the flush() method below is not quite working. Even though the writer is closed with writer.close(), I still get the error from writer = createWriter(). Any thoughts? Thanks.
Got error like this:
org.apache.hadoop.fs.FileAlreadyExistsException:
/user/hive/flink_parquet_fils_with_checkingpoint/year=20/month=2/day=1/hour=17/_part-4-9.in-progress
for client 192.168.56.202 already exists
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:3003)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2890)
    ...
    at flink.untils.ParquetSinkWriter.flush(ParquetSinkWriterForecast.java:81)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.snapshotState(BucketingSink.java:749)
import org.apache.flink.util.Preconditions;
import org.apache.flink.streaming.connectors.fs.Writer;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;
import java.io.IOException;
/**
* Parquet writer.
*
* @param <T>
*/
public class ParquetSinkWriter<T extends GenericRecord> implements Writer<T> {
private static final long serialVersionUID = -975302556515811398L;
private final CompressionCodecName compressionCodecName = CompressionCodecName.SNAPPY;
private final int pageSize = 64 * 1024;
private final String schemaRepresentation;
private transient Schema schema;
private transient ParquetWriter<GenericRecord> writer;
private transient Path path;
private int position;
public ParquetSinkWriter(String schemaRepresentation) {
this.schemaRepresentation = Preconditions.checkNotNull(schemaRepresentation);
}
@Override
public void open(FileSystem fs, Path path) throws IOException {
this.position = 0;
this.path = path;
if (writer != null) {
writer.close();
}
writer = createWriter();
}
@Override
public long flush() throws IOException {
Preconditions.checkNotNull(writer);
position += writer.getDataSize();
writer.close();
writer = createWriter();
return position;
}
@Override
public long getPos() throws IOException {
Preconditions.checkNotNull(writer);
return position + writer.getDataSize();
}
@Override
public void close() throws IOException {
if (writer != null) {
writer.close();
writer = null;
}
}
@Override
public void write(T element) throws IOException {
Preconditions.checkNotNull(writer);
writer.write(element);
}
@Override
public Writer<T> duplicate() {
return new ParquetSinkWriter<>(schemaRepresentation);
}
private ParquetWriter<GenericRecord> createWriter() throws IOException {
if (schema == null) {
schema = new Schema.Parser().parse(schemaRepresentation);
}
return AvroParquetWriter.<GenericRecord>builder(path)
.withSchema(schema)
.withDataModel(new GenericData())
.withCompressionCodec(compressionCodecName)
.withPageSize(pageSize)
.build();
}
}
It seems that the file you are trying to create already exists. This is because you are using the default write mode, CREATE, which fails when the file exists. What you can try is changing your code to use the OVERWRITE mode. You can change the createWriter() method to return something like the following:
return AvroParquetWriter.<GenericRecord>builder(path)
        .withSchema(schema)
        .withDataModel(new GenericData())
        .withCompressionCodec(compressionCodecName)
        .withPageSize(pageSize)
        .withWriteMode(ParquetFileWriter.Mode.OVERWRITE)
        .build();
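Note that ParquetFileWriter.Mode comes from parquet-hadoop, so the writer class also needs the corresponding import:
import org.apache.parquet.hadoop.ParquetFileWriter;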

org.apache.commons.logging.Log cannot be resolved

When I try to declare a byte array using private byte[] startTag;, Eclipse marks the line as erroneous.
Hovering over it, I get this message:
The type org.apache.commons.logging.Log cannot be resolved. It is indirectly referenced from required .class files
I tried adding a jar file to the classpath after looking at other solutions, but I'm unable to remove the error.
What should I do now?
If any specific jar file needs to be added, please mention it.
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
public class XmlInputFormat extends TextInputFormat {
public static final String START_TAG_KEY = "< student>";
public static final String END_TAG_KEY = "</student>";
@Override
public RecordReader<LongWritable, Text> createRecordReader(
InputSplit split, TaskAttemptContext context) {
return new XmlRecordReader();
}
public static class XmlRecordReader extends
RecordReader<LongWritable, Text> {
private byte[] startTag;
private byte[] endTag;
private long start;
private long end;
private FSDataInputStream fsin;
private DataOutputBuffer buffer = new DataOutputBuffer();
private LongWritable key = new LongWritable();
private Text value = new Text();
@Override
public void initialize(InputSplit is, TaskAttemptContext tac)
throws IOException, InterruptedException {
FileSplit fileSplit = (FileSplit) is;
String START_TAG_KEY = "<employee>";
String END_TAG_KEY = "</employee>";
startTag = START_TAG_KEY.getBytes("utf-8");
endTag = END_TAG_KEY.getBytes("utf-8");
start = fileSplit.getStart();
end = start + fileSplit.getLength();
Path file = fileSplit.getPath();
FileSystem fs =file.getFileSystem(tac.getConfiguration());
fsin = fs.open(fileSplit.getPath());
fsin.seek(start);
}
@Override
public boolean nextKeyValue() throws
IOException,InterruptedException {
if (fsin.getPos() < end) {
if (readUntilMatch(startTag, false)) {
try {
buffer.write(startTag);
if (readUntilMatch(endTag, true)) {
value.set(buffer.getData(), 0,
buffer.getLength());
key.set(fsin.getPos());
return true;
}
} finally {
buffer.reset();
}
}
}
return false;
}
@Override
public LongWritable getCurrentKey() throws IOException,
InterruptedException {
return key;
}
@Override
public Text getCurrentValue() throws IOException,
InterruptedException {
return value;
}
@Override
public float getProgress() throws IOException,
InterruptedException {
return (fsin.getPos() - start) / (float) (end - start);
}
@Override
public void close() throws IOException {
fsin.close();
}
private boolean readUntilMatch(byte[] match, boolean
withinBlock)throws IOException {
int i = 0;
while (true) {
int b = fsin.read();
if (b == -1)
return false;
if (withinBlock)
buffer.write(b);
if (b == match[i]) {
i++;
if (i >= match.length)
return true;
} else
i = 0;
if (!withinBlock && i == 0 && fsin.getPos() >= end)
return false;
}
}
}
}
I have solved the issue by finding the .jar library inside $HADOOP_HOME and adding it to the build path.
I've also answered a similar problem in this thread:
https://stackoverflow.com/a/73427233/6685449

Using Jackcess with JCIFS to manipulate an Access database on an SMB share

I need to work with an MS Access file in Java using Jackcess. The file is located on an SMB share so I assume I would have to use JCIFS.
I tried this
String testdirectory = "smb://" + "file location";
SmbFile testsmbdir = null;
try {
    testsmbdir = new SmbFile(testdirectory, auth);
} catch (Exception e) {
    e.printStackTrace();
}
SmbFileInputStream smbFilestream = new SmbFileInputStream(testsmbdir);
db = DatabaseBuilder.open(testsmbdir);
However, it says SmbFile cannot be converted to File for the
db = DatabaseBuilder.open(testsmbdir);
line. If I try using smbFilestream instead, it says it cannot convert SmbFileInputStream to File either.
Do I have to copy the file to the local machine, or do something completely different? If so, how can I do it?
(I'm a Windows user, by the way. I am just converting my application to Mac, so sorry if my lingo is off.)
In reply to a thread on the Jackcess forums here, James suggested that
it should be relatively straightforward to implement a version of FileChannel which works with a SmbRandomAccessFile
I just tried it in a Maven project named smb4jackcess in Eclipse, and I got it working without having to write too much code. The class I created is named SmbFileChannel:
// FileChannel using jcifs.smb.SmbRandomAccessFile
package smb4jackcess;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.UnknownHostException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import jcifs.smb.SmbException;
import jcifs.smb.SmbFile;
import jcifs.smb.SmbRandomAccessFile;
public class SmbFileChannel extends FileChannel {
private final SmbRandomAccessFile _file;
private long _length;
public SmbFileChannel(String smbURL) throws SmbException, MalformedURLException, UnknownHostException {
_file = new SmbRandomAccessFile(smbURL, "rw", SmbFile.FILE_NO_SHARE);
_length = _file.length();
}
@Override
public void force(boolean metaData) throws SmbException, MalformedURLException, UnknownHostException {
// do nothing
}
@Override
public FileLock lock(long position, long size, boolean shared) {
throw new UnsupportedOperationException();
}
@Override
public MappedByteBuffer map(MapMode mode, long position, long size) {
throw new UnsupportedOperationException();
}
@Override
public long position() throws SmbException {
return _file.getFilePointer();
}
@Override
public FileChannel position(long newPosition) throws SmbException {
_file.seek(newPosition);
return this;
}
@Override
public int read(ByteBuffer dst) {
throw new UnsupportedOperationException();
}
@Override
public int read(ByteBuffer dst, long position) throws SmbException {
byte[] b = new byte[dst.remaining()];
_file.seek(position);
int bytesRead =_file.read(b);
dst.put(b);
return bytesRead;
}
@Override
public long read(ByteBuffer[] dsts, int offset, int length) {
throw new UnsupportedOperationException();
}
@Override
public long size() throws SmbException {
return _length;
}
@Override
public long transferFrom(ReadableByteChannel src, long position, long count) throws IOException {
ByteBuffer bb = ByteBuffer.allocate((int)count);
int bytesWritten = src.read(bb);
bb.rewind();
bb.limit(bytesWritten);
this.write(bb, position);
return bytesWritten;
}
@Override
public long transferTo(long position, long count, WritableByteChannel target) {
throw new UnsupportedOperationException();
}
@Override
public FileChannel truncate(long newSize) throws SmbException {
if (newSize < 0L) {
throw new IllegalArgumentException("negative size");
}
_file.setLength(newSize);
_length = newSize;
return this;
}
@Override
public FileLock tryLock(long position, long size, boolean shared) {
throw new UnsupportedOperationException();
}
@Override
public int write(ByteBuffer src) throws SmbException {
throw new UnsupportedOperationException();
}
@Override
public int write(ByteBuffer src, long position) throws SmbException {
byte[] b = new byte[src.remaining()];
src.get(b);
_file.seek(position);
_file.write(b);
long endPos = position + b.length;
if(endPos > _length) {
_length = endPos;
}
return b.length;
}
@Override
public long write(ByteBuffer[] srcs, int offset, int length) {
throw new UnsupportedOperationException();
}
@Override
protected void implCloseChannel() throws SmbException {
_file.close();
}
}
and the main class I used was
package smb4jackcess;
import java.nio.channels.FileChannel;
import com.healthmarketscience.jackcess.Column;
import com.healthmarketscience.jackcess.ColumnBuilder;
import com.healthmarketscience.jackcess.DataType;
import com.healthmarketscience.jackcess.Database;
import com.healthmarketscience.jackcess.Database.FileFormat;
import com.healthmarketscience.jackcess.DatabaseBuilder;
import com.healthmarketscience.jackcess.IndexBuilder;
import com.healthmarketscience.jackcess.Table;
import com.healthmarketscience.jackcess.TableBuilder;
public class Smb4jackcessMain {
public static void main(String[] args) {
String smbURL = "smb://gord:mypassword@SERVERNAME/sharename/etc/newdb.accdb";
try (SmbFileChannel sfc = new SmbFileChannel(smbURL)) {
// create a brand new database file
Database db = new DatabaseBuilder()
.setChannel(sfc)
.setFileFormat(FileFormat.V2010)
.create();
// add a table to it
Table newTable = new TableBuilder("NewTable")
.addColumn(new ColumnBuilder("ID", DataType.LONG)
.setAutoNumber(true))
.addColumn(new ColumnBuilder("TextField", DataType.TEXT))
.addIndex(new IndexBuilder(IndexBuilder.PRIMARY_KEY_NAME)
.addColumns("ID").setPrimaryKey())
.toTable(db);
// insert a row into the table
newTable.addRow(Column.AUTO_NUMBER, "This is a new row.");
db.close();
} catch (Exception e) {
e.printStackTrace(System.err);
}
}
}
Updated 2016-02-04: Code improvements. Many thanks to James at Dell Boomi for his assistance!

java.io.StreamCorruptedException in appending data

I have written this Java code for appending data to an ObjectOutputStream, but it throws java.io.StreamCorruptedException. Please help; if this code cannot work properly, then please give an alternative for appending data with ObjectOutputStream.
import java.awt.Toolkit;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Vector;
import javax.swing.JOptionPane;
class Data implements Serializable {
private static final long serialVersionUID = 1L;
private String time;
private String note;
public Data(String time, String note) {
this.time=time;
this.note=note;
}
public String getTime() {
return time;
}
public String getNote() {
return note;
}
}
public class S extends ObjectOutputStream {
String t, n;
public S(FileOutputStream w, String time, String note) throws Exception {
super(w);
t=time;
n=note;
writeStreamHeader();
}
protected void writeStreamHeader() throws IOException {
writeObject(new Data(t,n));
reset();
}
public static void rd() {
Vector v = new Vector();
Data d;
try
{
ObjectInputStream r = new ObjectInputStream(new FileInputStream("file.cer"));
for(int i=1; i<=100; i++) {
try { v.add(r.readObject()); }
catch(EOFException exp){
r.close();
break;
}
}
for(int i=0; i<v.size(); i++) {
d = (Data)v.elementAt(i);
System.out.println(d.getNote()+" "+d.getTime());
}
}
catch(Exception exp) {
Toolkit.getDefaultToolkit().beep();
JOptionPane.showMessageDialog(null, String.format("ERROR = %s\nCLASS = S", exp.getClass()));
System.out.println(exp.getClass());
System.exit(0);
}
}
public static void main(String arg[]) throws Exception {
FileOutputStream w = new FileOutputStream("file.cer",true);
new S(w,"99:59:59:99","Maxima");
new S(w,"00:00:00:00","Minima");
rd();
}
}
if this code can not work properly
You are correct. It cannot work properly.
then please give an alternative for appending data in ObjectOutputStream.
writeStreamHeader() must do nothing if the file is being appended to, but it must call super.writeStreamHeader() if the file is newly created. It certainly should not call writeObject() at any time.
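A minimal sketch of that idea, reusing the Data class from the question (the AppendingObjectOutputStream and openForAppend names here are just illustrative):
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;

// Used only when appending to a file that already contains a stream header:
// it skips writing a second header, which is what corrupts the stream.
class AppendingObjectOutputStream extends ObjectOutputStream {
    AppendingObjectOutputStream(OutputStream out) throws IOException {
        super(out);
    }

    @Override
    protected void writeStreamHeader() throws IOException {
        // The header was already written when the file was first created;
        // just reset the handle table instead of writing another header.
        reset();
    }
}

public class AppendDemo {
    static ObjectOutputStream openForAppend(File file) throws IOException {
        boolean appendToExisting = file.exists() && file.length() > 0;
        FileOutputStream fos = new FileOutputStream(file, true);
        return appendToExisting
                ? new AppendingObjectOutputStream(fos) // existing file: no second header
                : new ObjectOutputStream(fos);         // new file: normal header via super
    }

    public static void main(String[] args) throws Exception {
        File file = new File("file.cer");
        try (ObjectOutputStream out = openForAppend(file)) {
            out.writeObject(new Data("99:59:59:99", "Maxima"));
        }
        try (ObjectOutputStream out = openForAppend(file)) {
            out.writeObject(new Data("00:00:00:00", "Minima"));
        }
    }
}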

map.size() not working with static map

Hey, I am trying to get the size of a static map from another class.
I am defining the static map in one class, as follows:
tasklet.class
package com.hcsc.ccsp.nonadj.subrogation.integration;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import org.apache.commons.lang3.StringUtils;
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.core.io.Resource;
import org.springframework.util.Assert;
import com.hcsc.ccsp.nonadj.subrogation.batch.Subrogation;
import com.hcsc.ccsp.nonadj.subrogation.common.SubrogationConstants;
/**
* @author Manan Shah
*
*/
public class SubrogationFileTransferTasklet implements Tasklet,
InitializingBean {
private Logger logger = LogManager
.getLogger(SubrogationFileTransferTasklet.class);
private Resource inputfile;
private Resource outputfile;
public static String fileLastName;
public static String header = null;
public static String trailer = null;
public static List<Subrogation> fileDataListSubro = new ArrayList<Subrogation>();
public List<String> fileDataListS = new ArrayList<String>();
public static TreeMap<String, Subrogation> map = new TreeMap<String, Subrogation>();
public int counter = 0;
public String value;
@Override
public void afterPropertiesSet() throws Exception {
Assert.notNull(inputfile, "inputfile must be set");
}
public void setTrailer(String trailer) {
this.trailer = trailer;
}
public void setHeader(String header) {
this.header = header;
}
public String getTrailer() {
return trailer;
}
public String getHeader() {
return header;
}
public Resource getInputfile() {
return inputfile;
}
public void setInputfile(Resource inputfile) {
this.inputfile = inputfile;
}
public Resource getOutputfile() {
return outputfile;
}
public void setOutputfile(Resource outputfile) {
this.outputfile = outputfile;
}
public static void setFileDataListSubro(List<Subrogation> fileDataListSubro) {
SubrogationFileTransferTasklet.fileDataListSubro = fileDataListSubro;
}
public static List<Subrogation> getFileDataListSubro() {
return fileDataListSubro;
}
public static void setMap(TreeMap<String, Subrogation> map) {
SubrogationFileTransferTasklet.map = map;
}
public static TreeMap<String, Subrogation> getMap() {
return map;
}
@Override
public RepeatStatus execute(StepContribution contribution,
ChunkContext chunkContext) throws Exception {
value = (String) chunkContext.getStepContext().getStepExecution()
.getJobExecution().getExecutionContext().get("outputFile");
readFromFile();
return RepeatStatus.FINISHED;
}
public void readFromFile() {
BufferedReader br = null;
try {
String sCurrentLine;
br = new BufferedReader(new FileReader(inputfile.getFile()));
fileLastName = inputfile.getFile().getName();
while ((sCurrentLine = br.readLine()) != null) {
if (sCurrentLine.indexOf("TRAILER") != -1) {
setTrailer(sCurrentLine);
} else if (sCurrentLine.indexOf("HEADER") != -1) {
setHeader(sCurrentLine);
} else if (sCurrentLine.equalsIgnoreCase("")) {
} else {
fileDataListS.add(sCurrentLine);
}
}
convertListOfStringToListOfSubrogaion(fileDataListS);
writeDataToFile();
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (br != null)
br.close();
} catch (IOException ex) {
ex.printStackTrace();
}
}
}
public void convertListOfStringToListOfSubrogaion(List<String> list) {
Iterator<String> iterator = list.iterator();
while (iterator.hasNext()) {
Subrogation subrogration = new Subrogation();
String s = iterator.next();
subrogration.setGRP_NBR(StringUtils.substring(s, 0, 6));
subrogration.setSECT_NBR(StringUtils.substring(s, 6, 10));
subrogration.setAFP_VAL(StringUtils.substring(s, 10, 13));
subrogration.setDOL_MIN_VAL(StringUtils.substring(s, 13, 20));
subrogration
.setCORP_ENT_CD(StringUtils.substring(s, 20, s.length()));
map.put(subrogration.getGRP_NBR() + subrogration.getSECT_NBR(),
subrogration);
fileDataListSubro.add(subrogration);
}
}
public void writeDataToFile() {
try {
File file = new File(value);
if (!file.exists()) {
logger.info("output file is:-" + file.getAbsolutePath());
file.createNewFile();
}
FileWriter fw = new FileWriter(file.getAbsoluteFile());
BufferedWriter bw = new BufferedWriter(fw);
Iterator it = map.entrySet().iterator();
while (it.hasNext()) {
Map.Entry subrogation = (Map.Entry) it.next();
// System.out.println(subrogation.getKey() + " = " +
// subrogation.getValue());
// it.remove(); // avoids a ConcurrentModificationException
bw.append(subrogation.getValue().toString()
+ SubrogationConstants.filler58);
}
bw.close();
} catch (IOException e) {
e.printStackTrace();
}
logger.info("subrogationFileTransferTasklet Step completes");
}
}
In the processor I want to put the map size into an int.
processor.class
package com.hcsc.ccsp.nonadj.subrogation.processor;
import org.apache.commons.lang3.StringUtils;
import org.springframework.batch.item.ItemProcessor;
import com.hcsc.ccsp.nonadj.subrogation.Utils.SubrogationUtils;
import com.hcsc.ccsp.nonadj.subrogation.batch.Subrogation;
import com.hcsc.ccsp.nonadj.subrogation.common.SubrogationConstants;
import com.hcsc.ccsp.nonadj.subrogation.integration.SubrogationFileTransferTasklet;
public class SubrogationProcessor implements
ItemProcessor<Subrogation, Subrogation> {
public SubrogationFileTransferTasklet fileTransferTasklet = new SubrogationFileTransferTasklet();
SubrogationUtils subrogationUtils = new SubrogationUtils();
public int countFromFile=SubrogationFileTransferTasklet.map.size();
public static int totalRecords = 0;
public static int duplicate = 0;
@Override
public Subrogation process(Subrogation subrogration) throws Exception {
// TODO Auto-generated method stub
if (subrogationUtils.validateData(subrogration)) {
Subrogation newSubro = new Subrogation();
newSubro.setGRP_NBR(StringUtils.leftPad(subrogration.getGRP_NBR()
.trim(), SubrogationConstants.length6, "0"));
if (subrogration.getSECT_NBR().trim().length() < 5) {
newSubro.setSECT_NBR(StringUtils.leftPad(subrogration
.getSECT_NBR().trim(), SubrogationConstants.length4,
"0"));
} else if (subrogration.getSECT_NBR().trim().length() == 5) {
newSubro.setSECT_NBR(StringUtils.substring(subrogration.getSECT_NBR().trim(), 1));
} else {
return null;
}
newSubro.setAFP_VAL(StringUtils.leftPad(subrogration.getAFP_VAL()
.trim(), SubrogationConstants.length3, "0"));
if (subrogration.getDOL_MIN_VAL().trim().contains(".")) {
newSubro.setDOL_MIN_VAL(StringUtils.leftPad(StringUtils.substring(subrogration.getDOL_MIN_VAL(),0,subrogration.getDOL_MIN_VAL().indexOf(".")), SubrogationConstants.length7,
"0"));
} else {
newSubro.setDOL_MIN_VAL(StringUtils.leftPad(subrogration
.getDOL_MIN_VAL().trim(), SubrogationConstants.length7,
"0"));
}
newSubro.setCORP_ENT_CD(StringUtils.substring(
subrogration.getCORP_ENT_CD(), 0, 2));
if (SubrogationFileTransferTasklet.map.containsKey(newSubro
.getGRP_NBR() + newSubro.getSECT_NBR())) {
duplicate++;
return null;
} else {
if(SubrogationFileTransferTasklet.fileLastName.contains("TX")){
if(newSubro.getCORP_ENT_CD().equalsIgnoreCase("TX")){
SubrogationFileTransferTasklet.map.put(newSubro
.getGRP_NBR() + newSubro.getSECT_NBR(), newSubro);
totalRecords++;
return newSubro;
}
}
else{
if(SubrogationFileTransferTasklet.fileLastName.contains("IL")){
if(!newSubro.getCORP_ENT_CD().equalsIgnoreCase("TX"))
{
newSubro.setCORP_ENT_CD("IL");
SubrogationFileTransferTasklet.map.put(newSubro
.getGRP_NBR() + newSubro.getSECT_NBR(), newSubro);
totalRecords++;
return newSubro;
}
}
else{
return null;
}
}
return null;
}
}
else {
return null;
}
}
}
class SubrogrationException extends RuntimeException {
private static final long serialVersionUID = -8971030257905108630L;
public SubrogrationException(String message) {
super(message);
}
}
And at last I want to use that countFromFile in another class.
writer.class
package com.hcsc.ccsp.nonadj.subrogation.writer;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Writer;
import java.util.Date;
import java.util.List;
import org.apache.commons.lang3.StringUtils;
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.springframework.batch.item.ItemStreamException;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileFooterCallback;
import org.springframework.batch.item.file.FlatFileHeaderCallback;
import com.hcsc.ccsp.nonadj.subrogation.Utils.SubrogationUtils;
import com.hcsc.ccsp.nonadj.subrogation.batch.Subrogation;
import com.hcsc.ccsp.nonadj.subrogation.common.SubrogationConstants;
import com.hcsc.ccsp.nonadj.subrogation.integration.SubrogationFileTransferTasklet;
import com.hcsc.ccsp.nonadj.subrogation.processor.SubrogationProcessor;
public class SubrogationHeaderFooterWriter implements FlatFileFooterCallback,FlatFileHeaderCallback{
private Logger logger = LogManager
.getLogger(SubrogationHeaderFooterWriter.class);
SubrogationFileTransferTasklet fileTransferTasklet = new SubrogationFileTransferTasklet();
SubrogationUtils subrogationUtils=new SubrogationUtils();
SubrogationProcessor processor=new SubrogationProcessor();
private ItemWriter<Subrogation> delegate;
public void setDelegate(ItemWriter<Subrogation> delegate) {
this.delegate = delegate;
}
public ItemWriter<Subrogation> getDelegate() {
return delegate;
}
@Override
public void writeHeader(Writer writer) throws IOException {
//writer.write(SubrogationFileTransferTasklet.header);
}
@Override
public void writeFooter(Writer writer) throws IOException {
String trailer = SubrogationFileTransferTasklet.trailer;
String s1 = StringUtils.substring(trailer, 0, 23);
logger.info(" Data from input file size is---- "+new SubrogationProcessor().countFromFile);
int trailerCounter=new SubrogationProcessor().countFromFile+SubrogationProcessor.totalRecords;
logger.info(" Data comming from database is"+SubrogationProcessor.totalRecords);
logger.info(" Duplicate data From DataBase is " +SubrogationProcessor.duplicate);
logger.info(" Traileer is " + s1+ trailerCounter);
writer.write(s1 + trailerCounter);
SubrogationFileTransferTasklet.map.clear();
SubrogationFileTransferTasklet.fileDataListSubro.clear();
SubrogationProcessor.totalRecords=0;
SubrogationProcessor.duplicate=0;
}
public void writeErrorDataToFile(List<String> errorDataList,String errorfile){
File file;
try {
file = new File(errorfile);
logger.info("error file is "+errorfile);
FileWriter fileWriter = new FileWriter(file,true);
BufferedWriter bufferWritter = new BufferedWriter(fileWriter);
for(String data:errorDataList){
bufferWritter.write(new Date()+" "+data);
bufferWritter.write(SubrogationConstants.LINE_SEPARATOR);
}
bufferWritter.close();
}
catch (IOException e) {
throw new ItemStreamException("Could not convert resource to file: [" + errorfile + "]", e);
}
}
/*
public void write(List<? extends Subrogation> subrogation) throws Exception {
System.out.println("inside writer");
delegate.write(subrogation);
}*/
}
So here, in the logger message, the size prints as 0.
I am not able to understand why.
Do it this way to make sure that it is initialized with the current size of the map when the object is constructed:
class SubrogationProcessor {
    public int countFromFile;

    public SubrogationProcessor() {
        countFromFile = SubrogationFileTransferTasklet.map.size();
    }
}
This depends on when the map.put line of code is run. Is it in a static block in the tasklet class?
If the processor instance is initialized before any record has been added to the map, then map.size() will indeed be 0 (see the sketch after the snippet below).
My suggestion would be to populate the map in a static block if at all possible, or to debug the code and see when .put() is called compared to when .size() is called:
public static TreeMap<String, Subrogation> map = new TreeMap<String, Subrogation>();

static {
    map.put(subrogration.getGRP_NBR() + subrogration.getSECT_NBR(), subrogration);
}
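To illustrate the timing problem, here is a self-contained sketch (the Tasklet, Processor, and MapSizeTimingDemo names are made up) showing that a size captured in a field at construction time never reflects entries added later, while reading map.size() at the moment it is needed does:
import java.util.TreeMap;

class Tasklet {
    static final TreeMap<String, String> MAP = new TreeMap<>();
}

class Processor {
    // Captured once, at construction time, just like countFromFile
    final int sizeAtConstruction = Tasklet.MAP.size();
}

public class MapSizeTimingDemo {
    public static void main(String[] args) {
        Processor processor = new Processor(); // constructed before the map is filled
        Tasklet.MAP.put("key", "value");       // the tasklet fills the map later

        System.out.println(processor.sizeAtConstruction); // 0 - snapshot from construction time
        System.out.println(Tasklet.MAP.size());           // 1 - read when actually needed
    }
}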
