Java parsing ADIF file - java

The ADIF format is described here: http://www.adif.org/ I'm trying to write an ADIF parser. Here is a portion of the ADIF file to be parsed:
ADIF 2 Export from eQSL.cc
Received eQSLs for IZ1080SWL
for QSOs between 10-Aug-2015 and 31-Dec-2035
Generated on Sunday, October 18, 2015 at 00:48:50 AM UTC
<PROGRAMID:21>eQSL.cc DownloadInBox
<ADIF_Ver:1>2
<EOH>
<CALL:6>RA1QEA<QSO_DATE:8:D>20150829<TIME_ON:4>0455<BAND:3>30m<MODE:2>CW<RST_SENT:3>SWL<RST_RCVD:0><QSL_SENT:1>Y<QSL_SENT_VIA:1>E<APP_EQSL_AG:1>Y<GRIDSQUARE:6>lo19aq<EOR>
<CALL:5>F6HKA<QSO_DATE:8:D>20150910<TIME_ON:4>0400<BAND:3>80m<MODE:2>CW<RST_SENT:3>swl<RST_RCVD:0><QSL_SENT:1>Y<QSL_SENT_VIA:1>E<QSLMSG:34>Thanks for the SWL report. 73 Bert<APP_EQSL_AG:1>Y<GRIDSQUARE:6>JN05ot<EOR>
<CALL:5>DL5ZL<QSO_DATE:8:D>20150912<TIME_ON:4>2229<BAND:3>30m<MODE:2>CW<RST_SENT:3>599<RST_RCVD:0><QSL_SENT:1>Y<QSL_SENT_VIA:1>E<QSLMSG:28>tks, paper qsl is on the way<APP_EQSL_AG:1>Y<GRIDSQUARE:6>JO51jl<EOR>
<CALL:5>4Z5ML<QSO_DATE:8:D>20150915<TIME_ON:4>0504<BAND:3>20m<MODE:2>CW<RST_SENT:3>599<RST_RCVD:0><QSL_SENT:1>Y<QSL_SENT_VIA:1>E<APP_EQSL_AG:1>Y<GRIDSQUARE:4>km72<EOR>
I tried this parser:
public void read() throws IOException {
BufferedReader br = new BufferedReader(new FileReader(filePath));
int intValue;
boolean createToken = false;
boolean createSize = false;
StringBuffer token = new StringBuffer();
StringBuffer size = new StringBuffer();
Adif2Record record = new Adif2Record();
while ((intValue = br.read()) != -1) {
char cValue = (char)intValue;
if (cValue == '\n') {
continue;
}
if (cValue == '<') {
createToken = true;
continue;
}
if (cValue == ':') {
createToken = false;
createSize = true;
continue;
}
if (cValue == '>') {
if ("eor".equalsIgnoreCase(token.toString())) {
records.add(record);
record = new Adif2Record();
token.setLength(0);
size.setLength(0);
continue;
}
createSize = false;
createData(br, token.toString(), str2int(size.toString()), record);
size.setLength(0);
token.setLength(0);
}
if (createToken) {
token.append(cValue);
}
if (createSize) {
size.append(cValue);
}
}
}
but I end up with only one token, "PROGRAMID", and the rest of the file becomes data for this token. The portion before the EOH token is a header, and I would rather not slice it off completely. I don't understand why createSize stays true after PROGRAMID; as I understand it, it should be reset to false on each pass through the loop. Can someone help?

You're missing logic to handle the header. The header is allowed to contain free text, including : characters (for example the time 00:48:50), so when you read a : you have to check whether a tag is currently being parsed.
Furthermore, you need to handle data types appropriately (such as the D in <QSO_DATE:8:D>), since otherwise the type is simply appended to the size.
Also, you should use StringBuilder instead of StringBuffer: the latter is synchronized, which only costs performance here without providing any benefit.
The following code also replaces some of the ifs with switch statements.
For simplicity it uses another record for the header data...
public static void createData(BufferedReader br, String token, int size, Adif2Record record) throws IOException {
StringBuilder sb = new StringBuilder(size);
for (int i = 0; i < size; i++) {
int c = br.read();
if (c == -1) {
throw new IOException("Unexpected end of input");
}
sb.append((char) c);
}
record.setData(token, sb.toString());
}
private List<Adif2Record> records = new ArrayList<>();
public void read() throws IOException {
BufferedReader br = new BufferedReader(new FileReader(filePath));
int intValue;
boolean createToken = false;
boolean createSize = false;
boolean createType = false;
StringBuilder token = new StringBuilder();
StringBuilder size = new StringBuilder();
Adif2Record record = new Adif2Record();
while ((intValue = br.read()) != -1) {
switch (intValue) {
case '\n':
break;
case '<':
createToken = true;
break;
case ':':
if (createToken) {
// not in header
createToken = false;
createSize = true;
} else if (createSize) {
createType = true;
createSize = false;
}
break;
case '>':
switch (token.toString().toLowerCase()) {
case "eor":
case "eoh":
records.add(record);
record = new Adif2Record();
break;
default:
createSize = false;
createType = false;
createData(br, token.toString(), str2int(size.toString()), record);
}
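token.setLength(0);
size.setLength(0);
createToken = false; // the tag ends at '>', so stop collecting its name (matters after <EOR>/<EOH>)
break;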
default:
char cValue = (char) intValue;
if (createToken) {
token.append(cValue);
}
if (createSize) {
size.append(cValue);
}
if (createType) {
// TODO
}
}
}
}
private static int str2int(String s) {
return s.isEmpty() ? 0 : Integer.parseInt(s);
}
public class Adif2Record {
private final Map<String, String> data = new HashMap<>();
public void setData(String key, String value) {
data.put(key, value);
}
public Map<String, String> getData() {
return data;
}
}
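The same idea (ignore free header text, split each tag into name, length and optional type, then read exactly that many characters) can also be sketched over an in-memory String. This is only a minimal sketch under assumptions: the class name AdifSketch and its parse method are hypothetical, the whole file is assumed to fit in memory, and the type indicator is split off but otherwise ignored.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the original answer.
final class AdifSketch {

    /** Returns one map per <EOH>/<EOR> terminated block; the header, if present, comes first. */
    static List<Map<String, String>> parse(String text) {
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = new LinkedHashMap<>();
        int pos = 0;
        while (true) {
            int open = text.indexOf('<', pos);      // free text before a tag is skipped
            if (open < 0) break;
            int close = text.indexOf('>', open);
            if (close < 0) break;
            // The tag body looks like NAME, NAME:LENGTH or NAME:LENGTH:TYPE.
            String[] parts = text.substring(open + 1, close).split(":", 3);
            String name = parts[0].trim().toUpperCase(Locale.ROOT);
            pos = close + 1;
            if (name.equals("EOH") || name.equals("EOR")) {
                records.add(current);
                current = new LinkedHashMap<>();
                continue;
            }
            String lengthText = parts.length > 1 ? parts[1].trim() : "";
            int length = lengthText.isEmpty() ? 0 : Integer.parseInt(lengthText);
            int end = Math.min(pos + length, text.length());
            current.put(name, text.substring(pos, end)); // parts[2], the type, is ignored here
            pos = end;
        }
        return records;
    }

    public static void main(String[] args) {
        String sample = "Generated at 00:48:50\n<ADIF_Ver:1>2\n<EOH>\n"
                + "<CALL:6>RA1QEA<QSO_DATE:8:D>20150829<RST_RCVD:0><EOR>\n";
        for (Map<String, String> record : parse(sample)) {
            System.out.println(record);
        }
    }
}
```

Because the scan jumps from one < to the next, colons in the header text never reach the tag logic, and the declared length decides how much data belongs to each field.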

Related

How to implement hasNext() method with BufferedReader + StringTokenizer?

I have this fast reader class:
static class FastReader {
BufferedReader br;
StringTokenizer st;
public FastReader() {
br = new BufferedReader(new InputStreamReader(System.in));
}
String next() {
while (st == null || !st.hasMoreElements()) {
try {
st = new StringTokenizer(br.readLine());
}
catch (IOException e) {
e.printStackTrace();
}
}
return st.nextToken();
}
int nextInt() {
return Integer.parseInt(next());
}
long nextLong() {
return Long.parseLong(next());
}
double nextDouble() {
return Double.parseDouble(next());
}
String nextLine() {
String str = "";
try {
str = br.readLine();
}
catch (IOException e) {
e.printStackTrace();
}
return str;
}
int[] readArray(int n) {
int[] a = new int[n];
for(int i=0; i<n; i++) {
a[i] = nextInt();
}
return a;
}
}
I want to stop reading input as soon as I reach the end of the file. I know this can be done with Scanner's hasNext() method. How can I implement the same method for my reader class?
PS. I want to read input for this question:
https://www.spoj.com/problems/COINS/
You can set a mark() and then try to read a line. If that succeeds, reset() to where you set the mark and return true; otherwise return false.
You would also have to check if the current tokenizer has more tokens left. For example:
boolean hasNext() {
if (st != null && st.hasMoreTokens()) {
return true;
}
String tmp;
try {
br.mark(1000);
tmp = br.readLine();
if (tmp == null) {
return false;
}
br.reset();
} catch (IOException e) {
return false;
}
return true;
}
mark() takes a readAheadLimit argument that limits how many characters can be read while the mark stays valid; increase it if you're dealing with long lines.
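A variant that avoids mark()/reset() altogether, and with it the readAheadLimit concern, is to let hasNext() itself load lines into the tokenizer until a token is available. The following is a minimal sketch, not the original FastReader: the class name TokenReader is hypothetical, and it takes a Reader so it can be tried on a string instead of System.in.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

// Hypothetical reader used only to illustrate hasNext() without mark()/reset().
final class TokenReader {
    private final BufferedReader br;
    private StringTokenizer st;

    TokenReader(Reader in) {
        br = new BufferedReader(in);
    }

    /** Reads ahead line by line until a token is buffered or the input ends. */
    boolean hasNext() {
        try {
            while (st == null || !st.hasMoreTokens()) {
                String line = br.readLine();
                if (line == null) {
                    return false;               // end of input
                }
                st = new StringTokenizer(line); // blank lines yield no tokens, so the loop continues
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    String next() {
        if (!hasNext()) {
            throw new NoSuchElementException("end of input");
        }
        return st.nextToken();
    }

    int nextInt() {
        return Integer.parseInt(next());
    }

    public static void main(String[] args) {
        TokenReader in = new TokenReader(new StringReader("12\n\n2 abc\n"));
        while (in.hasNext()) {
            System.out.println(in.next());
        }
    }
}
```

The trade-off is that hasNext() consumes whole lines, so mixing it with a separate nextLine() method would need extra care.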
You can wrap your code in a while loop whose condition is that the line just read is not null.
Example:
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
String key = "";
while((key = br.readLine()) != null)
{
//your code
}

I want to read a text file and split it based on column value

public class FileSplitter2 {
public static void main(String[] args) throws IOException {
String filepath = "D:\\temp\\test.txt";
BufferedReader reader = new BufferedReader(new FileReader(filepath));
String strLine;
boolean isFirst = true;
String strGroupByColumnName = "city";
int positionOgHeader = 0;
FileWriter objFileWriter;
Map<String, FileWriter> groupByMap = new HashMap<String, FileWriter>();
while ((strLine = reader.readLine()) != null) {
String[] splitted = strLine.split(",");
if (isFirst) {
isFirst = false;
for (int i = 0; i < splitted.length; i++) {
if (splitted[i].equalsIgnoreCase(strGroupByColumnName)) {
positionOgHeader = i;
break;
}
}
}
String strKey = splitted[positionOgHeader];
if (!groupByMap.containsKey(strKey)) {
groupByMap.put(strKey, new FileWriter("D:/TestExample/" + strKey + ".txt"));
}
FileWriter fileWriter = groupByMap.get(strKey);
fileWriter.write(strLine);
}
for (Map.Entry<String,FileWriter> entry : groupByMap.entrySet()) {
entry.getKey();
}
}
}
This is my code. I am not getting the proper result. The file contains 10 columns, and the 5th column is 'city'. There are 10 different cities in the file. I need to split the rows for each city into a separate file.
You are not calling close() on the FileWriters, so the data may never get flushed to the files.
See FileWriter is not writing in to a file
At the end of the processing,
groupByMap.values().forEach(fileWriter -> {
try {
fileWriter.close();
} catch (IOException e) {
e.printStackTrace(); //Add appropriate error handling
}
});
There is also a bug in your code: the statements after the if (isFirst) block need to move into an else block. Otherwise the header row is treated as data and a city.txt file is created too.
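Both points (closing every writer, and keeping the header row out of the data) can be combined in one sketch. It is written under assumptions rather than as a drop-in replacement: the class name ColumnSplitter and its split method are hypothetical, paths are passed in instead of hard-coded, a line separator is written after each row, and the naive split(",") is kept, so quoted fields containing commas are not handled.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper illustrating the two fixes described above.
final class ColumnSplitter {

    /** Writes each data row to outDir/<value of the given column>.txt. */
    static void split(Path input, String column, Path outDir) throws IOException {
        Map<String, BufferedWriter> writers = new HashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(input)) {
            String header = reader.readLine();
            if (header == null) {
                return; // empty file
            }
            String[] names = header.split(",");
            int index = -1;
            for (int i = 0; i < names.length; i++) {
                if (names[i].trim().equalsIgnoreCase(column)) {
                    index = i;
                    break;
                }
            }
            if (index < 0) {
                throw new IllegalArgumentException("No column named " + column);
            }
            String line;
            while ((line = reader.readLine()) != null) { // the header was consumed above
                String[] cells = line.split(",");
                if (cells.length <= index) {
                    continue; // skip rows that are too short
                }
                String key = cells[index].trim();
                BufferedWriter writer = writers.get(key);
                if (writer == null) {
                    writer = Files.newBufferedWriter(outDir.resolve(key + ".txt"));
                    writers.put(key, writer);
                }
                writer.write(line);
                writer.newLine();
            }
        } finally {
            for (BufferedWriter writer : writers.values()) {
                writer.close(); // closing flushes the buffered rows to disk
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("split");
        Path input = dir.resolve("test.txt");
        Files.write(input, Arrays.asList("name,city", "Ann,Pune", "Bob,Oslo", "Cy,Pune"));
        split(input, "city", dir);
        System.out.println(Files.readAllLines(dir.resolve("Pune.txt")));
    }
}
```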

Read String with RandomAccessFile from file with different encoding

I have a big file encoded in Windows-1250. Each line is just a single Polish word, one after another:
zając
dzieło
kiepsko
etc
I need to pick 10 random unique lines from this file reasonably quickly. I did this, but when I print the words they have the wrong encoding [zaj?c, dzie?o, kiepsko...]; I need UTF-8. So I changed my code to read bytes from the file instead of reading lines, and ended up with this code:
public List<String> getRandomWordsFromDictionary(int number) {
List<String> randomWords = new ArrayList<String>();
File file = new File("file.txt");
try {
RandomAccessFile raf = new RandomAccessFile(file, "r");
for(int i = 0; i < number; i++) {
Random random = new Random();
int startPosition;
String word;
do {
startPosition = random.nextInt((int)raf.length());
raf.seek(startPosition);
raf.readLine();
word = grabWordFromDictionary(raf);
} while(checkProbability(word));
System.out.println("Word: " + word);
randomWords.add(word);
}
} catch (IOException ioe) {
logger.error(ioe.getMessage(), ioe);
}
return randomWords;
}
private String grabWordFromDictionary(RandomAccessFile raf) throws IOException {
byte[] wordInBytes = new byte[15];
int counter = 0;
byte wordByte;
char wordChar;
String convertedWord;
boolean stop = true;
do {
wordByte = raf.readByte();
wordChar = (char)wordByte;
if(wordChar == '\n' || wordChar == '\r' || wordChar == -1) {
stop = false;
} else {
wordInBytes[counter] = wordByte;
counter++;
}
} while(stop);
if(wordInBytes.length > 0) {
convertedWord = new String(wordInBytes, "UTF8");
return convertedWord;
} else {
return null;
}
}
private boolean checkProbability(String word) {
if(word.length() > MAX_LENGTH_LINE) {
return true;
} else {
double randomDouble = new Random().nextDouble();
double probability = (double) MIN_LENGTH_LINE / word.length();
return probability <= randomDouble;
}
}
But something is wrong. Could you look at this code and help me? Maybe you can see some errors that are obvious to you but not to me. I will appreciate any help.
Your file is in 1250, so you need to decode it in 1250, not UTF-8. You can save it as UTF-8 after the decoding process though.
Charset w1250 = Charset.forName("Windows-1250");
convertedWord = new String(wordInBytes, w1250);
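One more detail worth checking in the question's code: the buffer is always 15 bytes long, so decoding the whole array also turns the unused trailing zero bytes into NUL characters, and a word longer than 15 bytes would overflow it. A minimal sketch of decoding only the bytes that were actually read follows; the class name Cp1250Decode and its decode method are hypothetical, and it assumes the windows-1250 charset is available in your Java runtime.

```java
import java.nio.charset.Charset;

// Hypothetical helper: decodes only the filled part of a fixed-size buffer.
final class Cp1250Decode {
    private static final Charset W1250 = Charset.forName("windows-1250");

    /** Decodes the first 'count' bytes of 'buffer' as Windows-1250. */
    static String decode(byte[] buffer, int count) {
        return new String(buffer, 0, count, W1250);
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[15];                 // same size as in the question
        byte[] word = "zaj\u0105c".getBytes(W1250);   // "zając" as Windows-1250 bytes
        System.arraycopy(word, 0, buffer, 0, word.length);

        String decoded = decode(buffer, word.length);
        System.out.println(decoded.equals("zaj\u0105c")); // compares against the expected word
        System.out.println(decoded.length());             // number of characters, without padding
    }
}
```

In grabWordFromDictionary, the counter variable already holds the number of bytes read, so it can play the role of count here.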

CSVReader and InputStream

I have created a CSVReader and I am trying to read a CSV file from assets, so I need to use an InputStream. But my code below does not have a constructor that takes an InputStream. Could anyone tell me how I could add or change something in the code so that I can use an InputStream?
public class CSVReader {
private BufferedReader br;
private boolean hasNext = true;
private char separator;
private char quotechar;
private int skipLines;
private boolean linesSkiped;
public int linesCount = 0;
public static final char DEFAULT_SEPARATOR = '|';
public static final char DEFAULT_QUOTE_CHARACTER = '"';
public static final int DEFAULT_SKIP_LINES = 0;
public CSVReader(Reader reader) {
this(reader, DEFAULT_SEPARATOR, DEFAULT_QUOTE_CHARACTER,
DEFAULT_SKIP_LINES);
}
public CSVReader(Reader reader, char separator, char quotechar, int line) {
this.br = new BufferedReader(reader);
this.separator = separator;
this.quotechar = quotechar;
this.skipLines = line;
}
public String[] readNext() throws IOException {
String nextLine = getNextLine();
return hasNext ? parseLine(nextLine) : null;
}
public String getNextLine() throws IOException {
if (!this.linesSkiped) {
for (int i = 0; i < skipLines; i++) {
br.readLine();
}
this.linesSkiped = true;
}
String nextLine = br.readLine();
if (nextLine == null) {
hasNext = false;
}
return hasNext ? nextLine : null;
}
public List<String[]> readAll() throws IOException {
List<String[]> allElements = new ArrayList<String[]>();
while (hasNext) {
String[] nextLineAsTokens = readNext();
if (nextLineAsTokens != null)
allElements.add(nextLineAsTokens);
}
return allElements;
}
private String[] parseLine(String nextLine) throws IOException {
if (nextLine == null) {
return null;
}
List<String> tokensOnThisLine = new ArrayList<String>();
StringBuffer sb = new StringBuffer();
boolean inQuotes = false;
do {
if (inQuotes) {
// continuing a quoted section, reappend newline
sb.append("\n");
nextLine = getNextLine();
linesCount++;
if (nextLine == null)
break;
}
for (int i = 0; i < nextLine.length(); i++) {
char c = nextLine.charAt(i);
if (c == quotechar) {
if( inQuotes
&& nextLine.length() > (i+1)
&& nextLine.charAt(i+1) == quotechar ){
sb.append(nextLine.charAt(i+1));
i++;
}else{
inQuotes = !inQuotes;
if(i>2
&& nextLine.charAt(i-1) != this.separator
&& nextLine.length()>(i+1) &&
nextLine.charAt(i+1) != this.separator
){
sb.append(c);
}
}
} else if (c == separator && !inQuotes) {
tokensOnThisLine.add(sb.toString());
sb = new StringBuffer();
} else {
sb.append(c);
}
}
} while (inQuotes);
tokensOnThisLine.add(sb.toString());
return (String[]) tokensOnThisLine.toArray(new String[0]);
}
public void close() throws IOException{
br.close();
}
}
You can construct an InputStreamReader from that InputStream
new InputStreamReader(myInputStream, encoding)
Where myInputStream is your InputStream and encoding is a String that defines the encoding used by your datasource.
You can call your CSVReader like this:
new CSVReader(new InputStreamReader(myInputStream, encoding));
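To see the wrapping in isolation, here is a minimal sketch that turns an InputStream into a Reader and reads one line from it. The class name StreamToReader and its methods are hypothetical, UTF-8 is only an assumed encoding, and a ByteArrayInputStream stands in for whatever stream your asset API returns.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing the InputStream -> Reader adaptation.
final class StreamToReader {

    /** Wraps a byte stream in a character reader, assuming UTF-8 content. */
    static Reader asReader(InputStream in) {
        return new InputStreamReader(in, StandardCharsets.UTF_8);
    }

    /** Reads the first line, closing the stream afterwards. */
    static String firstLine(InputStream in) throws IOException {
        try (BufferedReader br = new BufferedReader(asReader(in))) {
            return br.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] csv = "id|name\n1|Ann\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstLine(new ByteArrayInputStream(csv)));
    }
}
```

The Reader returned by asReader could be passed to the existing CSVReader(Reader) constructor, so the class itself would not need a new constructor.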

How do I skip comments with BufferedReader?

I have written the following program to read from a file and skip comments. It works for single-line comments, but not for multi-line ones. Does anyone know why? I don't need to worry about "//" inside strings, and only Java comments matter, i.e. "//" and "/* */".
code:
import java.io.*;
public class IfCounter2
{
public static boolean lineAComment(String line)
{
if (line.contains("//"))
return true;
return false;
}
public static boolean multiLineCommentStart(String line)
{
if (line.contains("/*"))
return true;
return false;
}
public static boolean multiLineCommentEnd(String line)
{
if (line.contains("*/"))
return true;
return false;
}
public static void main(String[] args) throws IOException
{
String fileName = args[0];
int numArgs = args.length;
int ifCount = 0;
// create a new BufferReader
BufferedReader reader = new BufferedReader(new FileReader(fileName));
String line = null;
StringBuilder stringBuilder = new StringBuilder();
String ls = System.getProperty("line.separator");
line = reader.readLine();
// read from the text file
boolean multiLineComment = false;
while (( line = reader.readLine()) != null)
{
if (!multiLineCommentStart(line))
{
multiLineComment = true;
}
if (multiLineComment) {
if (!multiLineCommentEnd(line))
{
multiLineComment = false;
}
}
if (!lineAComment(line) && !multiLineComment)
{
stringBuilder.append(line);
stringBuilder.append(ls);
}
}
// create a new string with stringBuilder data
String tempString = stringBuilder.toString();
System.out.println(stringBuilder.toString());
}
}
You only set multiLineComment to true when !multiLineCommentStart(line) is true - that is, whenever the line does not contain /*.
Basically, your code should look something like this (untested):
boolean multiLineComment = false;
while (( line = reader.readLine()) != null)
{
if (multiLineCommentStart(line))
{
multiLineComment = true;
}
if (multiLineComment) {
if (multiLineCommentEnd(line))
{
multiLineComment = false;
}
}
if (!lineAComment(line) && (multiLineComment == false))
{
stringBuilder.append(line);
stringBuilder.append(ls);
}
}
In that last if statement, you need to have an expression with your variable and a fixed value.
Andy's answer is right on the money, but the last if needs one more check to make sure you are not counting the line containing */ as a valid line:
boolean multiLineComment = false;
while (( line = reader.readLine()) != null)
{
if (multiLineCommentStart(line))
{
multiLineComment = true;
}
if (multiLineComment) {
if (multiLineCommentEnd(line))
{
multiLineComment = false;
}
}
if (!lineAComment(line) && (multiLineComment == false) &&
!multiLineCommentEnd(line) )
{
stringBuilder.append(line);
stringBuilder.append(ls);
}
}
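The line-based flags above drop a whole line whenever it contains a comment marker, including any code that shares the line with the comment. If that matters, the text can be scanned character by character instead. The following is a minimal sketch of that alternative, not the approach from the answers: the class name CommentStripper is hypothetical, and, as the question allows, comment markers inside string literals are not treated specially.

```java
// Hypothetical helper: removes // and /* */ comments with a small state machine.
final class CommentStripper {

    static String strip(String source) {
        StringBuilder out = new StringBuilder();
        boolean inBlock = false; // inside /* ... */
        boolean inLine = false;  // inside // ... until end of line
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            char next = i + 1 < source.length() ? source.charAt(i + 1) : '\0';
            if (inBlock) {
                if (c == '*' && next == '/') {
                    inBlock = false;
                    i++; // skip the '/'
                }
            } else if (inLine) {
                if (c == '\n') {
                    inLine = false;
                    out.append(c); // keep the line break itself
                }
            } else if (c == '/' && next == '*') {
                inBlock = true;
                i++; // skip the '*'
            } else if (c == '/' && next == '/') {
                inLine = true;
                i++; // skip the second '/'
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String src = "int a = 1; // one\n/* two\n lines */int b = 2;\n";
        System.out.print(strip(src));
    }
}
```

To use it with the program above, the file content would be read into a single String first (for example by appending every line plus ls to the StringBuilder) and passed to strip once.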
