Transform String args[] into an object [closed] - java

I need to parse the command line arguments and transform them into a Java object.
My command line to run the Java .jar:
java -cp "combo.jar" com.ascurra.Main --time=3 --limit=5000 --initDate=2017-01-01.13:00:00
I need to transform the arguments --time=3 --limit=5000 --initDate=2017-01-01.13:00:00 into an object and save it to my database.
How can I do this in an elegant way?

First of all, create a Class with the corresponding fields
class Entry {
private int time;
private int limit;
private Date initDate;
public Entry() {
}
public Date getInitDate() {
return initDate;
}
public void setInitDate(Date initDate) {
this.initDate = initDate;
}
public int getLimit() {
return limit;
}
public void setLimit(int limit) {
this.limit = limit;
}
public int getTime() {
return time;
}
public void setTime(int time) {
this.time = time;
}
}
Then create an object of this class and parse the arguments to set the values
public static void main(String[] args) {
ArrayList<String> options = new ArrayList<>();
for (String arg : args) { // get the options from the arguments
if (arg.startsWith("--")) {
options.add(arg.substring(2)); // strip only the leading "--"
}
}
Entry entry = new Entry();
for (String option : options) {
String[] pair = option.split("=");
if (pair.length == 2) {
if (pair[0].equals("time")) { // parse time option
entry.setTime(Integer.parseInt(pair[1]));
} else if (pair[0].equals("limit")) { // parse limit option
entry.setLimit(Integer.parseInt(pair[1]));
} else if (pair[0].equals("initDate")) { // parse initDate option
SimpleDateFormat sdf = new SimpleDateFormat(
"yyyy-MM-dd.HH:mm:ss");
try {
entry.setInitDate(sdf.parse(pair[1]));
} catch (ParseException e) {
e.printStackTrace();
}
}
}
}
System.out.println(entry.getLimit() + " , " + entry.getTime() + " , "
+ entry.getInitDate());
}

I'd create a stream of the args array, map to retain the part of the string needed i.e:
String[] resultArray = Arrays.stream(args)
.map(s -> s.substring(s.indexOf("=") + 1))
.toArray(String[]::new);
the array should now contain:
[3, 5000, 2017-01-01.13:00:00]
in which case you can index into this array, then convert to any other type needed and populate your custom object.
Alternatively, as there are only 3 arguments, you could skip creating the stream entirely and just index into the array along with the use of substring to retain the parts needed. However, the approach above is more adaptable as if you were to enter more arguments, you need not change anything within your code in terms of retrieving the arguments.
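If you would rather not depend on the position of each argument, a variation is to collect the options into a map keyed by name. This is only a minimal sketch: the class name ArgMap and its two helper methods are made up for illustration, and the date pattern is the one implied by the sample command line.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library.
class ArgMap {
    // Collects "--name=value" arguments into a map, ignoring anything else.
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("--")) {
                continue;
            }
            // Limit of 2 keeps any '=' inside the value intact.
            String[] pair = arg.substring(2).split("=", 2);
            if (pair.length == 2) {
                options.put(pair[0], pair[1]);
            }
        }
        return options;
    }

    // Parses the date format used in the question, e.g. 2017-01-01.13:00:00.
    static LocalDateTime parseDate(String raw) {
        return LocalDateTime.parse(raw, DateTimeFormatter.ofPattern("yyyy-MM-dd.HH:mm:ss"));
    }
}
```

With this, ArgMap.parse(args).get("limit") returns "5000" regardless of the order in which the options were typed.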

how do this in an elegant way?
These three pieces of information (--time=3 --limit=5000 --initDate=2017-01-01.13:00:00) are each passed as a separate element in the String[] args of the main method.
Parsing them is really not a complex task (String.substring() or a regex will do the job).
But a good parser should also not depend on the order of the arguments, and should produce useful diagnostics when a value cannot be converted to its target type, such as a date or a number.
Finally, adding or removing a supported parameter should be easy and safe, and generating command-line help is also desirable.
So as a first piece of advice: if you can use a library, don't reinvent the wheel. Use
Apache Commons CLI or, better, args4j, which is really simple to use and avoids boilerplate code.
If you cannot, at least take inspiration from them.
Apache Commons CLI example
For example to create Options (arguments) :
public static final String TIME_ARG = "time";
public static final String LIMIT_ARG = "limit";
...
Options options = new Options();
options.addOption("t", TIME_ARG, true, "current time");
options.addOption("l", LIMIT_ARG, true, "limit of ...");
...
Then parse Options and retrieve value of it:
public static void main(String[] args) {
...
try{
CommandLineParser parser = new DefaultParser();
CommandLine cmd = parser.parse(options, args);
...
// then retrieve arguments
Integer time = null;
Integer limit = null;
LocalDateTime localDateTime = null;
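String timeRaw = cmd.getOptionValue(TIME_ARG);
// getOptionValue returns null when the option is absent; "\\d+" also rejects an empty value
if (timeRaw != null && timeRaw.matches("\\d+")) {
time = Integer.valueOf(timeRaw);
}
// ...and so on until you create your object to save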
MyObj obj = new MyObj(time, limit, localDateTime);
...
}
catch(ParseException exp ) {
System.out.println( "Unexpected exception:" + exp.getMessage() );
}
}
args4j example
args4j is much more straightforward to use.
Besides, it provides some converters (from String to specific types), but date conversion is not provided out of the box.
So you have to create your own handler for that.
In the example below, LocalDateTimeOptionHandler must therefore extend OptionHandler.
import java.io.IOException;
import java.time.LocalDateTime;
import org.kohsuke.args4j.CmdLineException;
import org.kohsuke.args4j.CmdLineParser;
import org.kohsuke.args4j.Option;
import org.kohsuke.args4j.OptionHandlerFilter;
public class SampleMain {
@Option(name = "--time", usage = "...")
private Integer time;
@Option(name = "--limit", usage = "..")
private Integer limit;
@Option(name="--initDate", handler=LocalDateTimeOptionHandler.class, usage="...")
private LocalDateTime initDate;
public static void main(String[] args) throws IOException {
new SampleMain().doMain(args);
}
public void doMain(String[] args) throws IOException {
CmdLineParser parser = new CmdLineParser(this);
try {
// parse the arguments.
parser.parseArgument(args);
} catch (CmdLineException e) {
System.err.println(e.getMessage());
System.err.println("java SampleMain [options...] arguments...");
parser.printUsage(System.err);
System.err.println(" Example: java SampleMain" + parser.printExample(OptionHandlerFilter.ALL));
return;
}
if (time != null)
System.out.println("-time is set");
if (limit != null)
System.out.println("-limit is set");
if (initDate != null)
System.out.println("-initDate is set");
}
}


Java sanitizing Arraylist records suggestions

I am looking for an idea how to accomplish this task. So I'll start with how my program is working.
My program reads a CSV file. They are key value pairs separated by a comma.
L1234456,ygja-3bcb-iiiv-pppp-a8yr-c3d2-ct7v-giap-24yj-3gie
L6789101,zgna-3mcb-iiiv-pppp-a8yr-c3d2-ct7v-gggg-zz33-33ie
etc
Function takes a file and parses it into an arrayList of String[]. The function returns the ArrayList.
public ArrayList<String[]> parseFile(File csvFile) {
Scanner scan = null;
try {
scan = new Scanner(csvFile);
} catch (FileNotFoundException e) {
}
ArrayList<String[]> records = new ArrayList<String[]>();
String[] record = new String[2];
while (scan.hasNext()) {
record = scan.nextLine().trim().split(",");
records.add(record);
}
return records;
}
Here is the code, where I am calling parse file and passing in the CSVFile.
ArrayList<String[]> Records = parseFile(csvFile);
I then created another ArrayList for files that aren't parsed.
ArrayList<String> NotParsed = new ArrayList<String>();
The program then sanitizes the key-value pairs. We start with the first element of each record, the key, e.g. L1234456. If it cannot be sanitized, the program replaces it with the text "CouldNotBeParsed".
for (int i = 0; i < Records.size(); i++) {
if(!validateRecord(Records.get(i)[0].toString())) {
Logging.info("Records could not be parsed " + Records.get(i)[0]);
NotParsed.add(Records.get(i)[0].toString());
Records.get(i)[0] = "CouldNotBeParsed";
} else {
Logging.info(Records.get(i)[0] + " has been sanitized");
}
}
Next we do the 2nd key in the key value pair e.g ygja-3bcb-iiiv-pppp-a8yr-c3d2-ct7v-giap-24yj-3gie
for (int i = 0; i < Records.size(); i++) {
if(!validateRecordKey(Records.get(i)[1].toString())) {
Logging.info("Record Key could not be parsed " + Records.get(i)[0]);
NotParsed.add(Records.get(i)[1].toString());
Records.get(i)[1] = "CouldNotBeParsed";
} else {
Logging.info(Records.get(i)[1] + " has been sanitized");
}
}
The problem is that I need both keyvalue pairs to be sanitized, make a separate list of the keyValue pairs that could not be sanitized and a list of the ones there were sanitized so they can be inserted into a database. The ones that cannot will be printed out to the user.
I thought about looping through the records and removing those with the "CouldNotBeParsed" text, which would leave only the ones that could be parsed. I also tried removing the records during the for loop with Records.remove(i). However, that messes up the loop: if the first record cannot be sanitized and is removed, the next record is skipped on the following iteration because record 2 is now record 1. That's why I went with replacing the text.
Actually I need two lists, one for the records that were sanitized and another for those that weren't.
So I was thinking there must be a better way to do this. Or a better method of sanitizing both keyValue pairs at the same time or something of that nature. Suggestions?
Start by changing the data structure: rather than using a list of two-element String[] arrays, define a class for your key-value pairs:
class KeyValuePair {
private final String key;
private final String value;
public KeyValuePair(String k, String v) { key = k; value = v; }
public String getKey() { return key; }
public String getValue() { return value; }
}
Note that the class is immutable.
Now make an object with three lists of KeyValuePair objects:
class ParseResult {
// final fields are assigned exactly once, in the constructor
private final List<KeyValuePair> sanitized;
private final List<KeyValuePair> badKey;
private final List<KeyValuePair> badValue;
public ParseResult(List<KeyValuePair> s, List<KeyValuePair> bk, List<KeyValuePair> bv) {
sanitized = s;
badKey = bk;
badValue = bv;
}
public List<KeyValuePair> getSanitized() { return sanitized; }
public List<KeyValuePair> getBadKey() { return badKey; }
public List<KeyValuePair> getBadValue() { return badValue; }
}
Finally, populate these three lists in a single loop that reads from the file:
public static ParseResult parseFile(File csvFile) {
Scanner scan = null;
try {
scan = new Scanner(csvFile);
} catch (FileNotFoundException e) {
???
// Do something about this exception.
// Consider not catching it here, letting the caller deal with it.
}
final List<KeyValuePair> sanitized = new ArrayList<KeyValuePair>();
final List<KeyValuePair> badKey = new ArrayList<KeyValuePair>();
final List<KeyValuePair> badValue = new ArrayList<KeyValuePair>();
while (scan.hasNext()) {
String[] tokens = scan.nextLine().trim().split(",");
if (tokens.length != 2) {
???
// Do something about this - either throw an exception,
// or log a message and continue.
}
KeyValuePair kvp = new KeyValuePair(tokens[0], tokens[1]);
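// Do the validation on the spot. In the question, validateRecord checks
// the first element (the key) and validateRecordKey checks the second.
if (!validateRecord(kvp.getKey())) {
badKey.add(kvp);
} else if (!validateRecordKey(kvp.getValue())) {
badValue.add(kvp);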
} else {
sanitized.add(kvp);
}
}
return new ParseResult(sanitized, badKey, badValue);
}
Now you have a single function that produces a single result with all your records cleanly separated into three buckets - i.e. sanitized records, records with bad keys, and record with good keys but bad values.
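If two buckets are enough (sanitized and rejected), the same single-pass idea can be sketched without the file handling. Everything here is illustrative: RecordPartitioner is a made-up name, and the two Predicate parameters stand in for your validateRecord and validateRecordKey methods, whose rules I don't know.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper: splits raw CSV lines into good records and rejected lines.
class RecordPartitioner {
    final List<String[]> sanitized = new ArrayList<>();
    final List<String> rejected = new ArrayList<>();

    // keyOk and valueOk are placeholders for the asker's own validators.
    void partition(List<String> lines, Predicate<String> keyOk, Predicate<String> valueOk) {
        for (String line : lines) {
            String[] tokens = line.trim().split(",");
            boolean valid = tokens.length == 2
                    && keyOk.test(tokens[0])
                    && valueOk.test(tokens[1]);
            if (valid) {
                sanitized.add(tokens);
            } else {
                // Keep the whole original line so it can be shown to the user.
                rejected.add(line);
            }
        }
    }
}
```

Because nothing is removed from a list while it is being iterated, the index-shifting problem from the question does not arise.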

Fetch all the hyperlinks from a webpage and recursively doing that in java

1. Fetch all contents from a webpage.
2. Fetch the hyperlinks from the webpage.
3. Repeat steps 1 and 2 for each fetched hyperlink.
4. Repeat the process until 200 hyperlinks are registered or there are no more hyperlinks to fetch.
I wrote a sample program, but due to my poor understanding of recursion, it loops forever.
Please suggest how to fix the code so that it meets these expectations.
import java.net.URL;
import java.net.URLConnection;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Content
{
private static final String HTML_A_HREF_TAG_PATTERN =
"\\s*(?i)href\\s*=\\s*(\"([^\"]*\")|'[^']*'|([^'\">\\s]+))";
Pattern pattern;
public Content ()
{
pattern = Pattern.compile(HTML_A_HREF_TAG_PATTERN);
}
private void fetchContentFromURL(String strLink) {
String content = null;
URLConnection connection = null;
try {
connection = new URL(strLink).openConnection();
Scanner scanner = new Scanner(connection.getInputStream());
scanner.useDelimiter("\\Z");
content = scanner.next();
}catch ( Exception ex ) {
ex.printStackTrace();
return;
}
fetchURL(content);
}
private void fetchURL ( String content )
{
Matcher matcher = pattern.matcher( content );
while(matcher.find()) {
String group = matcher.group();
if(group.toLowerCase().contains( "http" ) || group.toLowerCase().contains( "https" )) {
group = group.substring( group.indexOf( "=" )+1 );
group = group.replaceAll( "'", "" );
group = group.replaceAll( "\"", "" );
System.out.println("lINK "+group);
fetchContentFromURL(group);
}
}
System.out.println("DONE");
}
/**
* @param args
*/
public static void main ( String[] args )
{
new Content().fetchContentFromURL( "http://www.google.co.in" );
}
}
I am open to any other solution as well, but I want to stick with the core Java API only, no third-party libraries.
One possible option here is to remember all visited links to avoid cyclic paths. Here's how to achieve it with an additional Set of already visited links:
public class Content {
private static final String HTML_A_HREF_TAG_PATTERN =
"\\s*(?i)href\\s*=\\s*(\"([^\"]*\")|'[^']*'|([^'\">\\s]+))";
private Pattern pattern;
private Set<String> visitedUrls = new HashSet<String>();
public Content() {
pattern = Pattern.compile(HTML_A_HREF_TAG_PATTERN);
}
private void fetchContentFromURL(String strLink) {
String content = null;
URLConnection connection = null;
try {
connection = new URL(strLink).openConnection();
Scanner scanner = new Scanner(connection.getInputStream());
scanner.useDelimiter("\\Z");
if (scanner.hasNext()) {
content = scanner.next();
visitedUrls.add(strLink);
fetchURL(content);
}
} catch (Exception ex) {
ex.printStackTrace();
}
}
private void fetchURL(String content) {
Matcher matcher = pattern.matcher(content);
while (matcher.find()) {
String group = matcher.group();
if (group.toLowerCase().contains("http") || group.toLowerCase().contains("https")) {
group = group.substring(group.indexOf("=") + 1);
group = group.replaceAll("'", "");
group = group.replaceAll("\"", "");
System.out.println("lINK " + group);
if (!visitedUrls.contains(group) && visitedUrls.size() < 200) {
fetchContentFromURL(group);
}
}
}
System.out.println("DONE");
}
/**
* @param args
*/
public static void main(String[] args) {
new Content().fetchContentFromURL("http://www.google.co.in");
}
}
I also fixed some other issues in fetching logic, now it works as expected.
Inside the fetchContentFromURL method you should record which URL you are currently fetching, and skip it if that URL has already been fetched. Otherwise two pages A and B that link to each other will make your code keep fetching forever.
In addition to JK1's answer: to meet requirement 4 of your question, you might want to keep the count of hyperlinks in an instance variable. Rough pseudocode might look like the following (you can adjust the exact count; alternatively, you can use the size of the HashSet to know how many hyperlinks your program has processed so far):
if (!visitedUrls.contains(group) && noOfHyperlinksVisited++ < 200) {
fetchContentFromURL(group);
}
However, I was not sure whether you want a total of 200 hyperlinks or want to traverse to a depth of 200 links from the starting page. If it is the latter, you might wish to explore breadth-first search, which will let you know when you have reached your target depth.
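To illustrate the breadth-first idea without the networking, here is a rough sketch that walks an in-memory link graph using a queue instead of recursion. The class BfsCrawler and the links map are assumptions made for the example; in a real crawler the map lookup would be replaced by the fetch-and-regex step from the question.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical sketch: breadth-first traversal with a cap on visited pages.
class BfsCrawler {
    // links maps a page to the hyperlinks found on it (stand-in for an HTTP fetch).
    static Set<String> crawl(String start, Map<String, List<String>> links, int maxPages) {
        Set<String> visited = new LinkedHashSet<>();
        Queue<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty() && visited.size() < maxPages) {
            String url = queue.remove();
            if (!visited.add(url)) {
                continue; // already processed, so cycles cannot loop forever
            }
            for (String next : links.getOrDefault(url, Collections.emptyList())) {
                if (!visited.contains(next)) {
                    queue.add(next);
                }
            }
        }
        return visited;
    }
}
```

Since there is no recursion, a long chain of links cannot overflow the stack, and the loop ends as soon as either the queue is empty or the page limit is reached.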

Is there a Java API or shortcut for a state name lookup/find? [closed]

I'm writing a program that takes in a string, a state name (for example New York), and outputs the corresponding abbreviation (e.g. NY). My program considers all 50 states, so my first thought was to use a boatload of if/else if statements, but now I'm thinking there's gotta be a better way...a faster way...without so much seemingly redundant code.
Snippet:
if (dirtyState.equalsIgnoreCase("New York")) {
cleanState = "NY";
} else if (dirtyState.equalsIgnoreCase("Maryland")) {
cleanState = "MD";
} else if (dirtyState.equalsIgnoreCase("District of Columbia")) {
cleanState = "DC";
} else if (dirtyState.equalsIgnoreCase("Virginia")) {
cleanState = "VA";
} else if (dirtyState.equalsIgnoreCase("Alabama")) {
cleanState = "AL";
} else if (dirtyState.equalsIgnoreCase("California")) {
cleanState = "CA";
} else if (dirtyState.equalsIgnoreCase("Kentucky")) {
cleanState = "KY";
// and on and on...
Is there an API that could make this process simpler? A shortcut perhaps?
Any feedback is greatly appreciated and thanks in advance =)
You could use a TreeMap which allows you to use a custom comparator that is case insensitive. It would look like this:
Map<String, String> states = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
states.put("New York", "NY");
states.put("Maryland", "MD");
//etc.
And to retrieve an abbreviation:
String abbreviation = states.get("new york");
System.out.println(abbreviation); //prints NY
If you're using Java 7 you can use strings in a switch statement, e.g.:
switch (dirtyState.toLowerCase())
{
case "new york": cleanState = "NY"; break;
case "maryland": cleanState = "MD"; break;
// so on...
}
It would be better to grab a state code list and put it in a properties file. Note that an unescaped space ends the key in the .properties format, so spaces inside state names have to be escaped with a backslash:
New\ York=NY
Maryland=MD
District\ of\ Columbia=DC
Virginia=VA
Then load the content into a Properties object and loop over its entries (Properties extends Hashtable):
Properties cityCodes = new Properties();
cityCodes.load(new FileInputStream(...));
for (Map.Entry<Object, Object> entry : cityCodes.entrySet()) {
if (dirtyState.equalsIgnoreCase((String) entry.getKey())) {
cleanState = (String) entry.getValue();
}
}
Here is a working example :
public static void main(String[] args) throws Exception{
Properties cityCodes = new Properties();
cityCodes.load(new FileInputStream("/path/to/directory/cityCodes.properties"));
System.out.print(getCode("Maryland",cityCodes));
}
public static String getCode(String name, Properties cityCodes){
for(Map.Entry<Object,Object> entry : cityCodes.entrySet()){
String cityName=(String)entry.getKey();
String cityCode=(String)entry.getValue();
if(name.equalsIgnoreCase(cityName)){
return cityCode;
}
}
return null;
}
Output:
MD
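The properties file and the case-insensitive TreeMap from the other answer can also be combined, so the lookup no longer needs a loop. This is just a sketch: StateCodes is an invented name, and the properties text is read from a String here instead of a file so that the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper: loads name=code pairs into a case-insensitive map.
class StateCodes {
    static Map<String, String> load(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> codes = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String name : props.stringPropertyNames()) {
            codes.put(name, props.getProperty(name));
        }
        return codes;
    }
}
```

For example, StateCodes.load("New\\ York=NY\nMaryland=MD\n").get("new york") looks up the key "New York", because the backslash keeps the space as part of the key.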
You could use an enum:
public enum State {
AL("Alabama"), CA("California"), NY("New York");
private State(String name) {
this.name = name;
}
private String name;
static String findByName(String name) {
for ( int i = 0; i != values().length; ++i ) {
if ( name.equalsIgnoreCase(values()[i].name))
return values()[i].toString();
}
throw new IllegalArgumentException();
}
}
public class StateTest {
public static void main(String[] args) {
String name = "New York";
System.out.println(name + ": " + State.findByName(name));
}
}
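A possible refinement of the enum idea is to build a lookup map once, so that each lookup does not scan values() again. Treat this as a sketch: UsState, fullName and BY_NAME are names chosen for the example, and only three states are listed.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical variant of the enum above with a precomputed name index.
enum UsState {
    AL("Alabama"), CA("California"), NY("New York");

    private final String fullName;

    // Built once when the enum class is initialized; keys are lower-cased names.
    private static final Map<String, UsState> BY_NAME = new HashMap<>();
    static {
        for (UsState s : values()) {
            BY_NAME.put(s.fullName.toLowerCase(Locale.ROOT), s);
        }
    }

    UsState(String fullName) {
        this.fullName = fullName;
    }

    static UsState findByName(String name) {
        UsState s = BY_NAME.get(name.trim().toLowerCase(Locale.ROOT));
        if (s == null) {
            throw new IllegalArgumentException("Unknown state: " + name);
        }
        return s;
    }
}
```

UsState.findByName("new york").name() then gives the abbreviation as a String.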

Parsing a .txt file (considering performance measure)

DurationOfRun:5
ThreadSize:10
ExistingRange:1-1000
NewRange:5000-10000
Percentage:55 - AutoRefreshStoreCategories Data:Previous/30,New/70 UserLogged:true/50,false/50 SleepTime:5000 AttributeGet:1,16,10106,10111 AttributeSet:2060/30,10053/27
Percentage:25 - CrossPromoEditItemRule Data:Previous/60,New/40 UserLogged:true/50,false/50 SleepTime:4000 AttributeGet:1,10107 AttributeSet:10108/34,10109/25
Percentage:20 - CrossPromoManageRules Data:Previous/30,New/70 UserLogged:true/50,false/50 SleepTime:2000 AttributeGet:1,10107 AttributeSet:10108/26,10109/21
I am trying to parse the .txt file above. The first four lines are fixed; the Percentage lines can grow in number, so there may be more than three. I wrote the code below and it works, but it looks messy. Is there a better way to parse this file, and if performance matters, which approach would be best?
private static int noOfThreads;
private static List<Command> commands;
public static int startRange;
public static int endRange;
public static int newStartRange;
public static int newEndRange;
private static BufferedReader br = null;
private static String sCurrentLine = null;
private static List<String> values;
private static String commandName;
private static String percentage;
private static List<String> attributeIDGet;
private static List<String> attributeIDSet;
private static LinkedHashMap<String, Double> dataCriteria;
private static LinkedHashMap<Boolean, Double> userLoggingCriteria;
private static long sleepTimeOfCommand;
private static long durationOfRun;
br = new BufferedReader(new FileReader("S:\\Testing\\PDSTest1.txt"));
values = new ArrayList<String>();
while ((sCurrentLine = br.readLine()) != null) {
if(sCurrentLine.startsWith("DurationOfRun")) {
durationOfRun = Long.parseLong(sCurrentLine.split(":")[1]);
} else if(sCurrentLine.startsWith("ThreadSize")) {
noOfThreads = Integer.parseInt(sCurrentLine.split(":")[1]);
} else if(sCurrentLine.startsWith("ExistingRange")) {
startRange = Integer.parseInt(sCurrentLine.split(":")[1].split("-")[0]);
endRange = Integer.parseInt(sCurrentLine.split(":")[1].split("-")[1]);
} else if(sCurrentLine.startsWith("NewRange")) {
newStartRange = Integer.parseInt(sCurrentLine.split(":")[1].split("-")[0]);
newEndRange = Integer.parseInt(sCurrentLine.split(":")[1].split("-")[1]);
} else {
attributeIDGet = new ArrayList<String>();
attributeIDSet = new ArrayList<String>();
dataCriteria = new LinkedHashMap<String, Double>();
userLoggingCriteria = new LinkedHashMap<Boolean, Double>();
percentage = sCurrentLine.split("-")[0].split(":")[1].trim();
values = Arrays.asList(sCurrentLine.split("-")[1].trim().split("\\s+"));
for(String s : values) {
if(s.startsWith("Data")) {
String[] data = s.split(":")[1].split(",");
for (String n : data) {
dataCriteria.put(n.split("/")[0], Double.parseDouble(n.split("/")[1]));
}
//dataCriteria.put(data.split("/")[0], value)
} else if(s.startsWith("UserLogged")) {
String[] userLogged = s.split(":")[1].split(",");
for (String t : userLogged) {
userLoggingCriteria.put(Boolean.parseBoolean(t.split("/")[0]), Double.parseDouble(t.split("/")[1]));
}
//userLogged = Boolean.parseBoolean(s.split(":")[1]);
} else if(s.startsWith("SleepTime")) {
sleepTimeOfCommand = Long.parseLong(s.split(":")[1]);
} else if(s.startsWith("AttributeGet")) {
String[] strGet = s.split(":")[1].split(",");
for(String q : strGet) attributeIDGet.add(q);
} else if(s.startsWith("AttributeSet:")) {
String[] strSet = s.split(":")[1].split(",");
for(String p : strSet) attributeIDSet.add(p);
} else {
commandName = s;
}
}
Command command = new Command();
command.setName(commandName);
command.setExecutionPercentage(Double.parseDouble(percentage));
command.setAttributeIDGet(attributeIDGet);
command.setAttributeIDSet(attributeIDSet);
command.setDataUsageCriteria(dataCriteria);
command.setUserLoggingCriteria(userLoggingCriteria);
command.setSleepTime(sleepTimeOfCommand);
commands.add(command);
Well, parsers usually are messy once you get down to the lower layers of them :-)
However, one possible improvement, at least in terms of code quality, would be to recognize the fact that your grammar is layered.
By that, I mean every line is an identifying token followed by some properties.
In the case of DurationOfRun, ThreadSize, ExistingRange and NewRange, the properties are relatively simple. Percentage is somewhat more complex but still okay.
I would structure the code as (pseudo-code):
def parseFile (fileHandle):
while (currentLine = fileHandle.getNextLine()) != EOF:
if currentLine.beginsWith ("DurationOfRun:"):
processDurationOfRun (currentLine[14:])
elsif currentLine.beginsWith ("ThreadSize:"):
processThreadSize (currentLine[11:])
elsif currentLine.beginsWith ("ExistingRange:"):
processExistingRange (currentLine[14:])
elsif currentLine.beginsWith ("NewRange:"):
processNewRange (currentLine[9:])
elsif currentLine.beginsWith ("Percentage:"):
processPercentage (currentLine[11:])
else
raise error
Then, in each of those processWhatever() functions, you parse the remainder of the line based on the expected format. That keeps your code small and readable and easily changed in future, without having to navigate a morass :-)
For example, processDurationOfRun() simply gets an integer from the remainder of the line:
def processDurationOfRun (line):
this.durationOfRun = line.parseAsInt()
Similarly, the functions for the two ranges split the string on - and get two integers from the resultant values:
def processExistingRange (line):
values[] = line.split("-")
this.existingRangeStart = values[0].parseAsInt()
this.existingRangeEnd = values[1].parseAsInt()
The processPercentage() function is the tricky one but that is also easily doable if you layer it as well. Assuming those things are always in the same order, it consists of:
an integer;
a literal -;
some sort of textual category; and
a series of key:value pairs.
And even these values within the pairs can be parsed by lower levels, splitting first on commas to get subvalues like Previous/30 and New/70, then splitting each of those subvalues on slashes to get individual items. That way, a logical hierarchy can be reflected in your code.
Unless you're expecting to parse this text file many times per second, or unless it's many megabytes in size, I'd be more concerned about the readability and maintainability of your code than the speed of the parsing.
Mostly gone are the days when we need to wring the last ounce of performance from our code but we still have problems in fixing said code in a timely manner when bugs are found or enhancements are desired.
Sometimes it's preferable to optimise for readability.
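In Java, the layered structure described above might look roughly like this. It is a sketch under some assumptions: RunConfig and its field names are invented, the input is passed as a list of lines, and the key:value pairs on a Percentage line are kept as raw strings for a lower layer to split further on commas and slashes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical container for the parsed file; names are illustrative only.
class RunConfig {
    long durationOfRun;
    int threadSize;
    int[] existingRange;
    int[] newRange;
    final List<Map<String, String>> commands = new ArrayList<>();

    // Top layer: decide the line type from the token before the first ':'.
    static RunConfig parse(List<String> lines) {
        RunConfig cfg = new RunConfig();
        for (String line : lines) {
            String[] head = line.split(":", 2);
            if (head.length != 2) {
                throw new IllegalArgumentException("Malformed line: " + line);
            }
            String rest = head[1].trim();
            switch (head[0]) {
                case "DurationOfRun": cfg.durationOfRun = Long.parseLong(rest); break;
                case "ThreadSize":    cfg.threadSize = Integer.parseInt(rest); break;
                case "ExistingRange": cfg.existingRange = parseRange(rest); break;
                case "NewRange":      cfg.newRange = parseRange(rest); break;
                case "Percentage":    cfg.commands.add(parseCommand(rest)); break;
                default: throw new IllegalArgumentException("Unknown line type: " + line);
            }
        }
        return cfg;
    }

    // Middle layer: "1-1000" becomes {1, 1000}.
    static int[] parseRange(String s) {
        String[] parts = s.split("-");
        return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
    }

    // Middle layer: "55 - Name Key:Value Key:Value ..." becomes a map of raw strings.
    static Map<String, String> parseCommand(String s) {
        String[] tokens = s.split("\\s+");
        Map<String, String> cmd = new LinkedHashMap<>();
        cmd.put("Percentage", tokens[0]);
        cmd.put("Name", tokens[2]); // tokens[1] is the literal "-"
        for (int i = 3; i < tokens.length; i++) {
            String[] kv = tokens[i].split(":", 2);
            cmd.put(kv[0], kv[1]);
        }
        return cmd;
    }
}
```

Each method only knows about its own level of the format, so a new line type means one extra case and one extra small method.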
I would not worry about performance until I was sure there was actually a performance issue. Regarding the rest of the code, if you won't be adding any new line types I would not worry about it. If you do worry about it, however, a factory design pattern can help you separate the selection of the type of processing needed from the actual processing. It makes adding new line types easier without introducing as much opportunity for error.
A newer and more convenient class is Scanner. You just need to change the delimiter, and you can read data in the desired format (nextInt, nextLong) in one go, with no need for separate x.parseX calls.
Second: Split your code into small, reusable pieces. They make the program readable, and you can hide details easily.
Don't hesitate to use a struct-like class for a range, for example. Returning multiple values from a method can be done by these, without boilerplate (getter,setter,ctor).
import java.util.*;
import java.io.*;
public class ReadSampleFile
{
// struct like classes:
class PercentageRow {
public int percentage;
public String name;
public int dataPrevious;
public int dataNew;
public int userLoggedTrue;
public int userLoggedFalse;
public List<Integer> attributeGet;
public List<Integer> attributeSet;
}
class Range {
public int from;
public int to;
}
// Checks that the next token is the expected label, then reads its int value.
private int readInt (String name, Scanner sc) {
String s = sc.next ();
if (!s.startsWith (name)) err (name + " expected, found: " + s);
return sc.nextInt ();
}
// Same as readInt, but for long values.
private long readLong (String name, Scanner sc) {
String s = sc.next ();
if (!s.startsWith (name)) err (name + " expected, found: " + s);
return sc.nextLong ();
}
// Reads a label followed by two ints (from, to).
private Range readRange (String name, Scanner sc) {
String s = sc.next ();
if (!s.startsWith (name)) err (name + " expected, found: " + s);
Range r = new Range ();
r.from = sc.nextInt ();
r.to = sc.nextInt ();
return r;
}
private PercentageRow readPercentageLine (Scanner sc) {
// reuse above methods
PercentageRow percentageRow = new PercentageRow ();
percentageRow.percentage = readInt ("Percentage", sc);
// ...
return percentageRow;
}
public ReadSampleFile () throws FileNotFoundException
{
/* I only read from my sourcefile for convenience.
So I could scroll up to see what's the next entry.
Don't do this at home. :) The dummy later ...
*/
Scanner sc = new Scanner (new File ("./ReadSampleFile.java"));
sc.useDelimiter ("[ \n/,:-]+"); // '+' so that a run such as " - " counts as one delimiter
// ... is the comment I had to insert.
String dummy = sc.nextLine ();
List <String> values = new ArrayList<String> ();
if (sc.hasNext ()) {
// see how nice the data structure is reflected
// by this code:
long duration = readLong ("DurationOfRun", sc);
int noOfThreads = readInt ("ThreadSize", sc);
Range eRange = readRange ("ExistingRange", sc);
Range nRange = readRange ("NewRange", sc);
List <PercentageRow> percentageRows = new ArrayList <PercentageRow> ();
// including the repetition ...
while (sc.hasNext ()) {
percentageRows.add (readPercentageLine (sc));
}
}
}
public static void main (String args[]) throws FileNotFoundException
{
new ReadSampleFile ();
}
public static void err (String msg)
{
System.out.println ("Err:\t" + msg);
}
}

Java toString() using reflection?

I was writing a toString() for a class in Java the other day by manually writing out each element of the class to a String and it occurred to me that using reflection it might be possible to create a generic toString() method that could work on ALL classes. I.E. it would figure out the field names and values and send them out to a String.
Getting the field names is fairly simple, here is what a co-worker came up with:
public static List initFieldArray(String className) throws ClassNotFoundException {
Class c = Class.forName(className);
Field field[] = c.getFields();
List<String> classFields = new ArrayList(field.length);
for (int i = 0; i < field.length; i++) {
String cf = field[i].toString();
classFields.add(cf.substring(cf.lastIndexOf(".") + 1));
}
return classFields;
}
Using a factory I could reduce the performance overhead by storing the fields once, the first time the toString() is called. However finding the values could be a lot more expensive.
Due to the performance of reflection this may be more hypothetical than practical. But I am interested in the idea of reflection and how I can use it to improve my everyday programming.
Apache commons-lang ReflectionToStringBuilder does this for you.
import org.apache.commons.lang3.builder.ReflectionToStringBuilder;
// your code goes here
public String toString() {
return ReflectionToStringBuilder.toString(this);
}
Another option, if you are ok with JSON, is Google's GSON library.
public String toString() {
return new GsonBuilder().setPrettyPrinting().create().toJson(this);
}
It's going to do the reflection for you. This produces a nice, easy-to-read JSON string. Easy-to-read being relative; non-technical folks might find the JSON intimidating.
You could keep the GsonBuilder (or the Gson instance it creates) as a member variable too, if you don't want to create a new one every time.
If you have data that can't be printed (like a stream) or data you just don't want to print, you can add @Expose annotations to the fields you want to print and then use the following line.
new GsonBuilder()
.setPrettyPrinting()
.excludeFieldsWithoutExposeAnnotation()
.create()
.toJson(this);
With reflection, as I hadn't been aware of the Apache library:
(be aware that getFields() only returns public fields, and that if you do this you'll probably need to deal with subobjects and make sure they print properly; in particular, arrays won't show you anything useful)
@Override
public String toString()
{
StringBuilder b = new StringBuilder("[");
for (Field f : getClass().getFields())
{
if (!isStaticField(f))
{
try
{
b.append(f.getName() + "=" + f.get(this) + " ");
} catch (IllegalAccessException e)
{
// pass, don't print
}
}
}
b.append(']');
return b.toString();
}
private boolean isStaticField(Field f)
{
return Modifier.isStatic(f.getModifiers());
}
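If the fields are private, a variation using getDeclaredFields() and setAccessible(true) can be sketched as follows. ReflectiveToString and SamplePoint are made-up names for the example; it only looks at the object's own class (not superclasses), and setAccessible may be refused for classes in other modules on newer Java versions.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.StringJoiner;

// Hypothetical helper: prints the declared instance fields of an object.
class ReflectiveToString {
    static String describe(Object obj) {
        Class<?> type = obj.getClass();
        StringJoiner joiner = new StringJoiner(", ", type.getSimpleName() + "[", "]");
        for (Field f : type.getDeclaredFields()) {
            // Skip static fields and compiler-generated ones.
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue;
            }
            f.setAccessible(true); // needed to read private fields
            try {
                joiner.add(f.getName() + "=" + f.get(obj));
            } catch (IllegalAccessException e) {
                joiner.add(f.getName() + "=<inaccessible>");
            }
        }
        return joiner.toString();
    }
}

// Small class used only to demonstrate the helper.
class SamplePoint {
    private final int x;
    private final int y;

    SamplePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public String toString() {
        return ReflectiveToString.describe(this);
    }
}
```

Calling new SamplePoint(1, 2).toString() yields a string that lists both private fields with their values; the order of the fields follows whatever getDeclaredFields() returns, which the API does not guarantee.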
If you're using Eclipse, you may also have a look at JUtils toString generator, which does it statically (generating the method in your source code).
As already mentioned, you can use an existing library such as ReflectionToStringBuilder from Apache commons-lang.
Or write something similar yourself with the reflection API.
Here is an example:
class UniversalAnalyzer {
private ArrayList<Object> visited = new ArrayList<Object>();
/**
* Converts an object to a string representation that lists all fields.
* @param obj an object
* @return a string with the object's class name and all field names and
* values
*/
public String toString(Object obj) {
if (obj == null) return "null";
if (visited.contains(obj)) return "...";
visited.add(obj);
Class cl = obj.getClass();
if (cl == String.class) return (String) obj;
if (cl.isArray()) {
String r = cl.getComponentType() + "[]{";
for (int i = 0; i < Array.getLength(obj); i++) {
if (i > 0) r += ",";
Object val = Array.get(obj, i);
if (cl.getComponentType().isPrimitive()) r += val;
else r += toString(val);
}
return r + "}";
}
String r = cl.getName();
// inspect the fields of this class and all superclasses
do {
r += "[";
Field[] fields = cl.getDeclaredFields();
AccessibleObject.setAccessible(fields, true);
// get the names and values of all fields
for (Field f : fields) {
if (!Modifier.isStatic(f.getModifiers())) {
if (!r.endsWith("[")) r += ",";
r += f.getName() + "=";
try {
Class t = f.getType();
Object val = f.get(obj);
if (t.isPrimitive()) r += val;
else r += toString(val);
} catch (Exception e) {
e.printStackTrace();
}
}
}
r += "]";
cl = cl.getSuperclass();
} while (cl != null);
return r;
}
}
Not reflection, but I had a look at generating the toString method (along with equals/hashCode) as a post-compilation step using bytecode manipulation. Results were mixed.
Here is the Netbeans equivalent to Olivier's answer; smart-codegen plugin for Netbeans.
