How to convert NFA/DFA to java? - java

I have a scenario where I have designed the NFA and using JFLAP I have converted it to DFA.
I need to know, how to code it in Java?
Basically how to implement those state transitions in Java. I have seen some examples which do this using switch and if statements, but I can't see any relation to DFA/NFA design and how to use it to implement in Java.

If you want a more object-oriented design than while(true)switch(state){...}:
public class State{
private Map<Character,State> transitions=new HashMap<Character,State>();
public void addTransition(char ch,State st){
transitions.put(ch,st);
}
public State next(char ch){
return transitions.get(ch);
}
private boolean fin=false;
public boolean isFinal(){return fin;}
public void setFinal(boolean f){fin=f;}
}
and then the loop will be
State currState=startState;
while(currState!=null && input.hasNextChar()){//you can also end directly when final state is reached
char next = input.nextChar();//get next character
currState = currState.next(next);
}
if(currState!=null && currState.isFinal()){
// reached final state
}else{
// too bad, didn't match
}
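To make the link between the JFLAP diagram and the code concrete, here is a minimal, self-contained sketch of the same idea. It assumes a small example DFA that is not from the question: strings over {a, b} that end in "ab". The names DfaState and EndsWithAbDfa are made up for illustration. Each circle in the diagram becomes one object and each arrow becomes one addTransition call.

```java
import java.util.HashMap;
import java.util.Map;

// One object per DFA state; the map holds the outgoing arrows of that state.
class DfaState {
    private final Map<Character, DfaState> transitions = new HashMap<Character, DfaState>();
    private final boolean accepting;

    DfaState(boolean accepting) { this.accepting = accepting; }

    void addTransition(char ch, DfaState target) { transitions.put(ch, target); }

    // Returns null when no arrow is defined for ch, which acts as a dead state.
    DfaState next(char ch) { return transitions.get(ch); }

    boolean isAccepting() { return accepting; }
}

// Example automaton (an assumption for this sketch): accepts strings ending in "ab".
class EndsWithAbDfa {
    private final DfaState start;

    EndsWithAbDfa() {
        DfaState q0 = new DfaState(false); // no useful suffix seen yet
        DfaState q1 = new DfaState(false); // last symbol was 'a'
        DfaState q2 = new DfaState(true);  // last two symbols were "ab"
        q0.addTransition('a', q1);
        q0.addTransition('b', q0);
        q1.addTransition('a', q1);
        q1.addTransition('b', q2);
        q2.addTransition('a', q1);
        q2.addTransition('b', q0);
        start = q0;
    }

    boolean accepts(String input) {
        DfaState current = start;
        // Follow one arrow per input character; stop early on a missing arrow.
        for (int i = 0; i < input.length() && current != null; i++) {
            current = current.next(input.charAt(i));
        }
        return current != null && current.isAccepting();
    }

    public static void main(String[] args) {
        EndsWithAbDfa dfa = new EndsWithAbDfa();
        System.out.println(dfa.accepts("aab")); // true
        System.out.println(dfa.accepts("aba")); // false
    }
}
```

A character outside the alphabet simply has no entry in the map, so the run ends in the implicit dead state and the input is rejected.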

Take a look at dk.brics.automaton:
This Java package contains a DFA/NFA (finite-state automata) implementation with Unicode alphabet (UTF16) and support for the standard regular expression operations (concatenation, union, Kleene star) and a number of non-standard ones (intersection, complement, etc.)

Although you have probably implemented it by now, there is a very good implementation that is easy to digest. It uses a Digraph to maintain the epsilon transitions and a stack to keep track of expressions. Check out this link from RS: NFA.java.
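If you want to run the NFA directly instead of converting it first, the usual technique is to track the set of states the machine could be in and to expand that set along epsilon transitions after every step. The following is only a rough sketch of that simulation, not the linked NFA.java: the class name SimpleNfa, the integer state numbering (state 0 is the start state) and the example language a*b* are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SimpleNfa {
    // moves.get(s).get(ch) is the set of states reachable from s on ch.
    private final List<Map<Character, Set<Integer>>> moves =
            new ArrayList<Map<Character, Set<Integer>>>();
    // epsilon.get(s) is the set of states reachable from s without input.
    private final List<Set<Integer>> epsilon = new ArrayList<Set<Integer>>();
    private final Set<Integer> accepting = new HashSet<Integer>();

    SimpleNfa(int stateCount) {
        for (int i = 0; i < stateCount; i++) {
            moves.add(new HashMap<Character, Set<Integer>>());
            epsilon.add(new HashSet<Integer>());
        }
    }

    void addTransition(int from, char ch, int to) {
        Set<Integer> targets = moves.get(from).get(ch);
        if (targets == null) {
            targets = new HashSet<Integer>();
            moves.get(from).put(ch, targets);
        }
        targets.add(to);
    }

    void addEpsilon(int from, int to) { epsilon.get(from).add(to); }

    void markAccepting(int state) { accepting.add(state); }

    // Adds every state reachable through epsilon transitions alone.
    private Set<Integer> closure(Set<Integer> states) {
        Set<Integer> result = new HashSet<Integer>(states);
        Deque<Integer> pending = new ArrayDeque<Integer>(states);
        while (!pending.isEmpty()) {
            for (int next : epsilon.get(pending.pop())) {
                if (result.add(next)) {
                    pending.push(next);
                }
            }
        }
        return result;
    }

    boolean accepts(String input) {
        Set<Integer> current = new HashSet<Integer>();
        current.add(0); // state 0 is the start state in this sketch
        current = closure(current);
        for (int i = 0; i < input.length(); i++) {
            Set<Integer> reached = new HashSet<Integer>();
            for (int state : current) {
                Set<Integer> targets = moves.get(state).get(input.charAt(i));
                if (targets != null) {
                    reached.addAll(targets);
                }
            }
            current = closure(reached);
        }
        for (int state : current) {
            if (accepting.contains(state)) {
                return true;
            }
        }
        return false;
    }

    // Example NFA for a*b*: state 0 loops on 'a', an epsilon edge leads to
    // state 1, which loops on 'b' and is accepting.
    static SimpleNfa aStarBStar() {
        SimpleNfa nfa = new SimpleNfa(2);
        nfa.addTransition(0, 'a', 0);
        nfa.addEpsilon(0, 1);
        nfa.addTransition(1, 'b', 1);
        nfa.markAccepting(1);
        return nfa;
    }
}
```

The set of current states plays the role that a single DFA state plays after the subset construction, which is why the converted DFA and this simulation accept the same strings.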

Related

java enum string matching

I have an enum as follows:
public enum ServerTask {
HOOK_BEFORE_ALL_TASKS("Execute"),
COPY_MASTER_AND_SNAPSHOT_TO_HISTORY("Copy master db"),
PROCESS_CHECKIN_QUEUE("Process Check-In Queue"),
...
}
I also have a string (let's say string = "Execute") which I would like to turn into an instance of the ServerTask enum, based on which string in the enum it matches. Is there a better way to do this than doing equality checks between the string I want to match and every item in the enum? It seems like this would be a lot of if statements, since my enum is fairly large.
At some level you're going to have to iterate over the entire set of enum constants that you have and compare them for equality, either once to populate a mapping structure or on every call through a rudimentary loop.
It's fairly easy to accomplish with a rudimentary loop, so I don't see any reason why you wouldn't want to go this route. The code snippet below assumes the field is named friendlyTask.
public static ServerTask forTaskName(String friendlyTask) {
for (ServerTask serverTask : ServerTask.values()) {
if(serverTask.friendlyTask.equals(friendlyTask)) {
return serverTask;
}
}
return null;
}
The caveat to this approach is that the data won't be stored internally, and depending on how many enums you actually have and how many times you want to invoke this method, it would perform slightly worse than initializing with a map.
However, this approach is the most straightforward. If you find yourself in a position where you have several hundred enums (even more than 20 is a smell to me), consider what it is those enumerations represent and what one should do to break it out a bit more.
Create a static reverse lookup map.
public enum ServerTask {
HOOK_BEFORE_ALL_TASKS("Execute"),
COPY_MASTER_AND_SNAPSHOT_TO_HISTORY("Copy master db"),
PROCESS_CHECKIN_QUEUE("Process Check-In Queue"),
...
FINAL_ITEM("Final item");
// For static data always prefer to use Guava's Immutable library
// http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/ImmutableMap.html
static ImmutableMap< String, ServerTask > REVERSE_MAP;
static
{
ImmutableMap.Builder< String, ServerTask > reverseMapBuilder =
ImmutableMap.builder( );
// Build the reverse map by iterating all the values of your enum
for ( ServerTask cur : values() )
{
reverseMapBuilder.put( cur.taskName, cur );
}
REVERSE_MAP = reverseMapBuilder.build( );
}
// Now is the lookup method
public static ServerTask fromTaskName( String friendlyName )
{
// Will return ENUM if friendlyName matches what you stored
// with enum
return REVERSE_MAP.get( friendlyName );
}
}
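If pulling in Guava is not an option, the same reverse lookup can be sketched with the JDK alone. This is an assumption-laden variant, not the answer's code: it spells out the taskName field and constructor that the snippets above leave implicit, lists only the three constants shown in the question, and wraps a plain HashMap with Collections.unmodifiableMap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

enum ServerTask {
    HOOK_BEFORE_ALL_TASKS("Execute"),
    COPY_MASTER_AND_SNAPSHOT_TO_HISTORY("Copy master db"),
    PROCESS_CHECKIN_QUEUE("Process Check-In Queue");

    // Built once when the enum class is initialized, after all constants exist.
    private static final Map<String, ServerTask> BY_NAME;
    static {
        Map<String, ServerTask> map = new HashMap<String, ServerTask>();
        for (ServerTask task : values()) {
            map.put(task.taskName, task);
        }
        BY_NAME = Collections.unmodifiableMap(map);
    }

    private final String taskName;

    ServerTask(String taskName) {
        this.taskName = taskName;
    }

    // Returns null when no constant carries the given friendly name.
    static ServerTask fromTaskName(String friendlyName) {
        return BY_NAME.get(friendlyName);
    }
}
```

Usage: ServerTask.fromTaskName("Execute") yields HOOK_BEFORE_ALL_TASKS, and an unknown name yields null, so callers need a null check.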
If you have to get the enum from the String often, then creating a reverse map like Alexander suggests might be worth it.
If you only have to do it once or twice, looping over the values with a single if statement might be your best bet (as Nizil's comment suggests):
for (ServerTask task : ServerTask.values())
{
//Check here if strings match
}
However there is a way to not iterate over the values at all. If you can ensure that the name of the enum instance and its String value are identical, then you can use:
ServerTask.valueOf("EXECUTE")
which will give you ServerTask.EXECUTE.
Refer to this answer for more info.
Having said that, I would not recommend this approach unless you're OK with instances having the same String representations as their identifiers and yours is a performance-critical application, which is most often not the case.
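One detail worth knowing before relying on valueOf: it matches the identifier exactly (case-sensitive) and throws IllegalArgumentException when nothing matches, rather than returning null. A small sketch of a forgiving wrapper follows; the enum TaskKind and the helper parseOrNull are hypothetical names used only for this illustration.

```java
// Hypothetical enum whose identifiers double as the lookup keys.
enum TaskKind { EXECUTE, COPY_MASTER_DB }

class ValueOfDemo {
    // Returns the constant whose identifier equals name exactly,
    // or null when there is no such constant.
    static TaskKind parseOrNull(String name) {
        try {
            return TaskKind.valueOf(name);
        } catch (IllegalArgumentException e) {
            return null; // valueOf found no constant with that identifier
        }
    }
}
```

Note that passing null to valueOf throws NullPointerException, which this wrapper deliberately does not catch.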
You could write a method like this:
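static ServerTask getServerTask(String name)
{
    // Switching on a String requires Java 7 or later
    switch(name)
    {
        case "Execute": return HOOK_BEFORE_ALL_TASKS;
        case "Copy master db": return COPY_MASTER_AND_SNAPSHOT_TO_HISTORY;
        case "Process Check-In Queue": return PROCESS_CHECKIN_QUEUE;
        default: return null; // no matching task; without this the method does not compile
    }
}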
It's smaller, but not automatic like @Alexander_Pogrebnyak's solution. If the enum changes, you would have to update the switch.

How to check Type Ranges with Java development tools (JDT)?

I want to parse a String, which contains a number, using JDT to find out whether the contained number is inside the valid Range of one of the Primitive Types.
Let's say I have a float value as a String, such as "1.7976931348623157e350", and want to see whether it is still inside the allowed range for the primitive type 'double'. (In this case it would not be inside the valid range, because the maximum exponent of double is 308.)
I don't want to use the standard methods like Double.parseDouble("1.7976931348623157e350"), because I'm afraid it might be too slow if I have a large number of primitive values to check.
If you know the Eclipse development environment, you will know that inside a normal Java file Eclipse is able to tell whether a value is out of range, underlining it in red when it is. So basically I want to use this functionality. But as you can guess, it's easier said than done!
I have started experimenting with the ASTParser from this library: org.eclipse.jdt.core.dom
But I must admit I was not very successful here.
First I tried calling methods such as resolveBinding() from within the visitor methods, but they always returned null.
I have found an interesting class called ASTSyntaxErrorPropagator, but I'm not sure how it is used correctly. It seems to propagate parsing problems or something like that, and I assume it gets its information from a class called CodeSnippetParsingUtil. Anyway, these are only speculations.
Does anyone know how to use this ASTParser correctly?
I would be really thankful for some advice.
Here is a basic code snippet which I tried to debug:
public class DatatypesParser {
    public static void main(String[] args) {
        ASTParser parser = ASTParser.newParser(AST.JLS4);
        Map options = JavaCore.getOptions();
        JavaCore.setComplianceOptions(JavaCore.VERSION_1_7, options);
        String statement = "int i = " + Long.MAX_VALUE + ";";
        parser.setSource(statement.toCharArray());
        parser.setKind(ASTParser.K_STATEMENTS);
        parser.setResolveBindings(true);
        parser.setBindingsRecovery(true);
        ASTNode ast = parser.createAST(null);
        ast.accept(new ASTVisitor() {
            @Override
            public boolean visit(VariableDeclarationStatement node) {
                CodeSnippetParsingUtil util = new CodeSnippetParsingUtil();
                return true;
            }
        });
    }
}
From the question: "I don't want to use the standard methods like Double.parseDouble("1.7976931348623157e350"), because I'm afraid it might be too slow if I have a big amount of primitive types which I want to check."
Under the hood JDT is actually using the standard methods of Double to parse the value, and quite a bit more - so you should always use the standard methods if performance is a concern.
Here is how the double gets parsed by JDT.
From org.eclipse.jdt.internal.compiler.ast.DoubleLiteral:
public void computeConstant() {
Double computedValue;
[...]
try {
computedValue = Double.valueOf(String.valueOf(this.source));
} catch (NumberFormatException e) {
[...]
return;
}
final double doubleValue = computedValue.doubleValue();
if (doubleValue > Double.MAX_VALUE) {
// error: the number is too large to represent
return;
}
[...]
}
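The same check is easy to reproduce without JDT. The sketch below is an assumption about how one might package it, with LiteralRangeCheck, fitsInDouble and fitsInInt as made-up names. It relies on two documented behaviours of the standard library: Double.parseDouble returns an infinity (it does not throw) when the magnitude is too large, and Integer.parseInt throws NumberFormatException when the value does not fit in an int.

```java
class LiteralRangeCheck {
    // True when text is a valid double literal whose value is finite.
    // Overflow shows up as an infinite result, not as an exception.
    static boolean fitsInDouble(String text) {
        try {
            return !Double.isInfinite(Double.parseDouble(text));
        } catch (NumberFormatException e) {
            return false; // not a number at all
        }
    }

    // True when text is a decimal integer within the int range.
    // parseInt reports both malformed and out-of-range input the same way.
    static boolean fitsInInt(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsInDouble("1.7976931348623157e350")); // false
        System.out.println(fitsInInt("2147483647"));                // true
    }
}
```

This sketch only covers overflow; a value that is too small and silently rounds to zero would need a separate check.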

Java OOP: Building Object Trees / Object Families

Been a while since I used Java and was wondering if this was a decent or even correct way of setting this up.
FYI, userResults refers to a JDBI variable that isn't present in the code below.
Feel free to suggest a better method, thanks.
public class Stat
{
private int current;
private int max;
public int getCurrent() {return current;}
public void setCurrent(int current) {this.current = current;}
public int getMax() {return max;}
public void setMax(int max) {this.max = max;}
}
public class Character
{
Stat hp = new Stat();
Stat mp = new Stat();
}
Character thisCharacter = new Character();
// Set the value of current & max HP according to db data.
thisCharacter.hp.setCurrent((Integer) userResults.get("hpColumn1"));
thisCharacter.hp.setMax((Integer) userResults.get("hpColumn2"));
// Print test values
System.out.println(thisCharacter.hp.getCurrent());
System.out.println(thisCharacter.hp.getMax());
Correct? Well, does it work? Then it probably is correct.
As for whether it is a decent way to do it, the answer is "maybe". It is hard to tell without knowing the context this code lives in. But there are some things you could keep in mind:
In which class (or rather, object) are the Stat values set? Do you feel it is the responsibility of that class to do this and to know which database values to get them from? If not, consider making a separate class that does this.
Making chained calls such as thisCharacter.hp.setCurrent(...) is a violation of the principle of least knowledge. Sometimes you can't help it, but usually it leads to kludgy code. Consider having something that handles all the logic surrounding the stats. In your code you may need a HealthStatsHandler that has methods such as loadStats(), saveStats(), and mutator actions such as takeDamage(int dmg) and revive(int health).
If you have trouble judging whether your object design is right, then study up on the SOLID principles. They provide nice guidelines that any developer should follow if they want to have code that is extensible and "clean".
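As a rough illustration of the HealthStatsHandler idea, here is a minimal sketch. Everything beyond the method names takeDamage and revive is an assumption: the clamping rules, the isDead helper and the constructor are invented for this example, and loadStats()/saveStats() are left out because they depend on the database layer that is not shown in the question.

```java
class HealthStatsHandler {
    private int current;
    private final int max;

    HealthStatsHandler(int current, int max) {
        if (max < 0 || current < 0 || current > max) {
            throw new IllegalArgumentException("need 0 <= current <= max");
        }
        this.current = current;
        this.max = max;
    }

    int getCurrent() { return current; }

    int getMax() { return max; }

    boolean isDead() { return current == 0; }

    // Health never drops below zero, whatever the damage.
    void takeDamage(int dmg) {
        if (dmg < 0) {
            throw new IllegalArgumentException("dmg must not be negative");
        }
        current = Math.max(0, current - dmg);
    }

    // Only has an effect on a dead character; health is capped at max.
    void revive(int health) {
        if (health <= 0) {
            throw new IllegalArgumentException("health must be positive");
        }
        if (isDead()) {
            current = Math.min(max, health);
        }
    }
}
```

Callers now say handler.takeDamage(10) instead of reaching through the character into a Stat and setting numbers directly, so the rules about valid health values live in one place.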
This is not really a tree. It is not possible to have more than one layer of children.
Usually you define an interface, let's call it Node, which both Stat and Character implement, and the two children of Character would then have the type Node.
I would consider creating the Stat objects separately and passing them into Character, and making the character attributes private as follows:
public class Character
{
private Stat hp;
private Stat mp;
public Stat getHp() {return hp;}
public void setHp(Stat h) {this.hp = h;}
public Stat getMp() {return mp;}
public void setMp(Stat m) {this.mp = m;}
}
// Set the value of current & max HP according to db data.
Stat hp = new Stat();
hp.setCurrent((Integer) userResults.get("hpColumn1"));
hp.setMax((Integer) userResults.get("hpColumn2"));
Character thisCharacter = new Character();
thisCharacter.setHp(hp);
// do the same for mp
One additional simple step would be to create a Character constructor that takes an hp and an mp.
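A sketch of that constructor-based variant might look like the following. The class is named PlayerCharacter here, an assumption made only to avoid shadowing java.lang.Character, and the two-argument Stat constructor is likewise an addition for brevity.

```java
import java.util.Objects;

class Stat {
    private int current;
    private int max;

    Stat(int current, int max) {
        this.current = current;
        this.max = max;
    }

    int getCurrent() { return current; }
    void setCurrent(int current) { this.current = current; }
    int getMax() { return max; }
    void setMax(int max) { this.max = max; }
}

class PlayerCharacter {
    private final Stat hp;
    private final Stat mp;

    // Both stats must be supplied up front, so a character can never
    // exist in a half-initialized state.
    PlayerCharacter(Stat hp, Stat mp) {
        this.hp = Objects.requireNonNull(hp, "hp");
        this.mp = Objects.requireNonNull(mp, "mp");
    }

    Stat getHp() { return hp; }
    Stat getMp() { return mp; }
}
```

With this shape the loading code builds both Stat objects from the database row first and then creates the character in a single expression, for example new PlayerCharacter(new Stat(12, 20), new Stat(5, 8)).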

Using reserved words in an enum / switch statement, best workaround?

I am writing a flat file parser that reads token/value pairs using a Scanner. The files being read contain the token "class". The token is later used in a switch statement, and uses the (pre Java 7) valueOf(token) Java idiom to produce an enum value. (I am using Java6 for compatibility with GWT.) As a workaround, I am using uppercase values in the enum, and valueOf(token.toUpperCase()).
public enum ParseTags {
CODE, CLASS, INSTRUCTOR, HOURS;
}
// . . .
token = scanner.next();
value = scanner.next();
switch (ParseTags.valueOf(token.toUpperCase())) {
case CODE:
entry.setCode(value);
break;
case CLASS:
entry.setClass(value);
break;
Because this is being compiled into javascript, I want to avoid the extra "toUpperCase()" operation on each iteration; not sure what performance will be on target platform. Is there a more graceful way to represent reserved words in an enumeration? This would be handled well by Java7's switch on String, but again, I am confined to Java6sdk.
What you're doing right now is the preferred way to do it. I would be extraordinarily shocked if the toUpperCase were a bottleneck.
That said, I might consider something like
enum ParseTags {
CODE {
public void set(Entry entry, String value) {
entry.setCode(value);
}
},
...;
public abstract void set(Entry entry, String value);
}
so you can do
ParseTags.valueOf(token.toUpperCase()).set(entry, value);
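For completeness, here is a self-contained sketch of the constant-specific method idea, written without Java 7 features. It also adds one thing the answer does not propose: a static map keyed by the exact token text from the file, so the lookup needs neither valueOf nor toUpperCase. CourseEntry, ParseTag and fromToken are hypothetical names, and CourseEntry merely stands in for the question's entry type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the object the parser fills in.
class CourseEntry {
    private String code;
    private String courseClass;

    void setCode(String value) { code = value; }
    void setClass(String value) { courseClass = value; }
    String getCode() { return code; }
    String getCourseClass() { return courseClass; }
}

enum ParseTag {
    CODE("code") {
        @Override
        void set(CourseEntry entry, String value) { entry.setCode(value); }
    },
    // The identifier is upper case, but the file token "class" is kept as data.
    CLASS("class") {
        @Override
        void set(CourseEntry entry, String value) { entry.setClass(value); }
    };

    // Filled once at class initialization: file token -> constant.
    private static final Map<String, ParseTag> BY_TOKEN = new HashMap<String, ParseTag>();
    static {
        for (ParseTag tag : values()) {
            BY_TOKEN.put(tag.token, tag);
        }
    }

    private final String token;

    ParseTag(String token) { this.token = token; }

    // Each constant knows how to store its own value.
    abstract void set(CourseEntry entry, String value);

    // Returns null for tokens that have no matching constant.
    static ParseTag fromToken(String token) { return BY_TOKEN.get(token); }
}
```

The parsing loop then becomes a lookup followed by tag.set(entry, value), with a null check for unknown tokens in place of the IllegalArgumentException that valueOf would throw.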

Parsing field access flags in java

I have an assignment wherein I have to parse the field access flags of a java .class file.
The specification for a .class file can be found here: Class File Format (page 26 & 27 have the access flags and hex vals).
This is fine, I can do this no worries.
My issue is that there is a large number of combinations.
I know the public, private and protected are mutually exclusive, which reduces the combinations somewhat. Final and transient are also mutually exclusive. The rest however are not.
At the moment, I have a large switch statement to do the comparison. I read in the hex value of the access flag and then increment a counter, depending on if it is public, private or protected. This works fine, but it seems quite messy to just have every combination listed in a switch statement. i.e. public static, public final, public static final, etc.
I thought of doing modulo on the access flag and the appropriate hex value for public, private or protected, but public is 0x0001, so that won't work.
Does anyone else have any ideas as to how I could reduce the amount of cases in my switch statement?
What is the problem? The specification says that it's a bit flag, that means that you should look at a value as a binary number, and that you can test if a specific value is set by doing a bitwise AND.
E.g.
/*
ACC_VOLATILE = 0x0040 = 01000000
ACC_PUBLIC = 0x0001 = 00000001
Public and volatile is= 01000001
*/
// The parentheses are required: in Java, comparison binds tighter than &
publicCount += (flag & ACC_PUBLIC) != 0 ? 1 : 0;
volatileCount += (flag & ACC_VOLATILE) != 0 ? 1 : 0;
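Put together as something runnable, the bit tests could look like the sketch below. The flag values are the field access flag values from the class file format; the class name AccessFlags and the helpers isSet and describe are made up for this illustration.

```java
class AccessFlags {
    static final int ACC_PUBLIC    = 0x0001;
    static final int ACC_PRIVATE   = 0x0002;
    static final int ACC_PROTECTED = 0x0004;
    static final int ACC_STATIC    = 0x0008;
    static final int ACC_FINAL     = 0x0010;
    static final int ACC_VOLATILE  = 0x0040;
    static final int ACC_TRANSIENT = 0x0080;

    private static final int[] MASKS = {
        ACC_PUBLIC, ACC_PRIVATE, ACC_PROTECTED, ACC_STATIC,
        ACC_FINAL, ACC_VOLATILE, ACC_TRANSIENT
    };
    private static final String[] NAMES = {
        "public", "private", "protected", "static",
        "final", "volatile", "transient"
    };

    // A flag is set when the bitwise AND with its mask is non-zero.
    static boolean isSet(int flags, int mask) {
        return (flags & mask) != 0;
    }

    // Lists the modifiers contained in flags, one independent test per bit,
    // so no combination ever has to be spelled out.
    static String describe(int flags) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < MASKS.length; i++) {
            if (isSet(flags, MASKS[i])) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(NAMES[i]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(0x0019)); // public static final
    }
}
```

Seven independent tests cover every possible combination, which is what makes the large switch unnecessary.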
If you are trying to avoid a pattern like this one I just stole:
if ((access_flag & ACC_PUBLIC) != 0)
{
publicCount++;
}
if ((access_flag & ACC_FINAL) != 0)
{
finalCount++;
}
...
It's a great instinct. I make it a rule never to write code that looks redundant like that. Not only is it error-prone and more code in your class, but copy & paste code is really boring to write.
So the big trick is to make this access "Generic" and easy to understand from the calling class--pull out all the repeated crap and just leave "meat", push the complexity to the generic routine.
So an easy way to call such a method would be something like this: pass it an array of bitfields (each holding some combination of the bits that need to be counted) and a list of the flags that you are interested in (so that you don't waste time testing flags you don't care about):
int[] counts = sumUpBits(arrayOfFlagBitfields, ACC_PUBLIC | ACC_FINAL | ACC_...);
That's really clean, but then how do you access the return fields? I was originally thinking something like this:
System.out.println("Number of public classes="+counts[findBitPosition(ACC_PUBLIC)]);
System.out.println("Number of final classes="+counts[findBitPosition(ACC_FINAL)]);
Most of the boilerplate here is gone except the need to change the bitfields to their position. I think two changes might make it better--encapsulate it in a class and use a hash to track positions so that you don't have to convert bitPosition all the time (if you prefer not to use the hash, findBitPosition is at the end).
Let's try a full-fledged class. How should this look from the caller's point of view?
BitSummer bitSums=new BitSummer(arrayOfFlagBitfields, ACC_PUBLIC, ACC_FINAL);
System.out.println("Number of public classes="+bitSums.getCount(ACC_PUBLIC));
System.out.println("Number of final classes="+bitSums.getCount(ACC_FINAL));
That's pretty clean and easy--I really love OO! Now you just use the bitSums to store your values until they are needed (It's less boilerplate than storing them in class variables and more clear than using an array or a collection)
So now to code the class. Note that the constructor uses variable arguments now--less surprise/more conventional and makes more sense for the hash implementation.
By the way, I know this seems like it would be slow and inefficient, but it's probably not bad for most uses--if it is, it can be improved, but this should be much shorter and less redundant than the switch statement (which is really the same as this, just unrolled--however this one uses a hash & autoboxing which will incur an additional penalty).
import java.util.HashMap;

public class BitSummer {
    // sums will store the "sum" as <flag, count>
    private final HashMap<Integer, Integer> sums = new HashMap<Integer, Integer>();

    // Constructor does all the work, the rest is just an easy lookup.
    public BitSummer(int[] arrayOfFlagBitfields, int... positionsToCount) {
        // Start every requested flag at zero so the increment below never unboxes null
        for (int flag : positionsToCount) {
            sums.put(flag, 0);
        }
        // Loop over each bitfield we want to count
        for (int bitfield : arrayOfFlagBitfields) {
            // and over each flag to check
            for (int flag : positionsToCount) {
                // Test to see if we actually should count this bitfield as having the flag set
                if ((bitfield & flag) != 0) {
                    sums.put(flag, sums.get(flag) + 1); // Increment value
                }
            }
        }
    }

    // Return the count for a given flag, or 0 if that flag was never requested
    public int getCount(int bit) {
        Integer count = sums.get(bit);
        return count == null ? 0 : count;
    }
}
I didn't test this but I think it's fairly close. I wouldn't use it for processing video packets in realtime or anything, but for most purposes it should be fast enough.
As for maintainability: the code may look long compared to the original example, but if you have more than 5 or 6 fields to check, this will actually be a shorter solution than the chained if statements, and significantly less error-prone and more maintainable. It is also more interesting to write.
If you really feel the need to eliminate the hashtable you could easily replace it with a sparse array with the flag position as the index (for instance the count of a flag 00001000/0x08 would be stored in the fourth array position). This would require a function like this to calculate the bit position for array access (both storing in the array and retrieving)
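private int findBitPosition(int flag) {
    // Counts how many times the flag can be shifted right before it becomes zero,
    // e.g. 0x08 -> 3 (the fourth array position). Expects a single-bit flag.
    int ret = 0;
    while ((flag >>>= 1) != 0)
        ret++;
    return ret;
}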
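For reference, the array-based variant can be sketched with two helpers that do exist in java.lang.Integer, lowestOneBit and numberOfTrailingZeros, instead of a hand-written position loop. The class name BitPositionCounter and its methods are assumptions made for this illustration, not part of the answer above.

```java
class BitPositionCounter {
    // counts[i] holds how many of the bitfields had bit i set.
    static int[] countBits(int[] bitfields) {
        int[] counts = new int[Integer.SIZE];
        for (int bitfield : bitfields) {
            int remaining = bitfield;
            while (remaining != 0) {
                int lowest = Integer.lowestOneBit(remaining);   // isolate one set bit
                counts[Integer.numberOfTrailingZeros(lowest)]++; // its position is the index
                remaining ^= lowest;                              // clear it and continue
            }
        }
        return counts;
    }

    // Looks up the count for a single-bit flag such as 0x0010.
    // A flag of 0 has no bit position and is not supported.
    static int countFor(int[] counts, int flag) {
        return counts[Integer.numberOfTrailingZeros(flag)];
    }
}
```

This version counts every bit in one pass, so there is no need to list the flags of interest in advance; the cost is a fixed array of 32 counters.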
That was fun.
I'm not sure that's what you're looking for, but I would use if-cases with binary AND to check if a flag is set:
if ((access_flag & ACC_PUBLIC) != 0)
{
// class is public
}
if ((access_flag & ACC_FINAL) != 0)
{
// class is final
}
....
