How do I turn a conditional chain into faster less ugly code? - java

I have 9 different grammars. One of these will be loaded depending on what the first line of text is in the file it is parsing.
I was thinking about deriving the lexer/parser spawning into sep. classes and then instantiating them as soon as I get a match -- not sure whether that would slow me down or not though. I guess some benchmarking is in order.
Really, speed is definitely my goal here but I know this is ugly code.
Right now the code looks something like this:
sin.mark(0)
site = findsite(txt)
sin.reset()
if ( site == "site1") {
loadlexer1;
loadparser1;
} else if (site == "site2") {
loadlexer2;
loadparser2;
}
.................
} else if (site == "site8") {
loadlexer8;
loadparser8;
}
findsite(txt) {
...................
if (line.indexOf("site1-identifier") >= 0) {
site = site1;
} else if (line.indexOf("site2-identifier") >= 0) {
site = site2;
} else if (line.indexOf("site3-identifier") >= 0) {
site = site3;
}
.........................
} else if (line.indexOf("site8-identifier") >= 0) {
site = site8;
}
}
Some clarifications:
1) Yes, I truly have 9 different grammars I built with ANTLR, so they will ALL have their own lexer/parser objects.
2) Yes, as of right now we are comparing strings, and obviously that will be replaced with some sort of integer map.
I've also considered sticking the site identifiers into one regex; however, I don't believe that will speed anything up.
3) Yes, this is pseudocode, so I wouldn't get too picky about the semantics here.
4) kdgregory is correct in noting that I am unable to create one instance of the lexer/parser pair.
I like the hash idea to make the code a little bit better looking; however, I don't think it's going to speed me up any.

The standard approach is to use a Map to connect the key strings to the lexers that will handle them:
Map<String,Lexer> lexerMap = new HashMap<String,Lexer>();
lexerMap.put("source1", new Lexer01());
lexerMap.put("source2", new Lexer02());
// and so on
Once you've retrieved the string that identifies the lexer to use, you'd retrieve the lexer from the Map like so:
String grammarId = // read it from a file, whatever
Lexer myLexer = lexerMap.get(grammarId);
Your example code has a few quirks, however. First, the indexOf() calls indicate that you don't have a stand-alone string, and Map won't look inside the string. So you need to have some way to extract the actual key from whatever string you read.
Second, lexers and parsers usually maintain state, so you won't be able to create a single instance and reuse it. That indicates that you need to create a factory class, and store it in the map (this is the Abstract Factory pattern).
If you expect to have lots of different lexers/parsers, then it makes sense to use a map-driven approach. For a small number, an if-else chain is probably your best bet, properly encapsulated (this is the Factory Method pattern).
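A minimal sketch of the factory-in-a-map idea, assuming Java 8 or later so that a Supplier can play the role of the factory. Lexer, Lexer01, Lexer02 and LexerFactories are hypothetical stand-ins here, not the classes ANTLR generates:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class LexerFactories {
    // Each value is a factory, so every lookup yields a fresh, stateful instance.
    private static final Map<String, Supplier<Lexer>> FACTORIES = new HashMap<>();
    static {
        FACTORIES.put("source1", Lexer01::new);
        FACTORIES.put("source2", Lexer02::new);
    }

    static Lexer create(String grammarId) {
        Supplier<Lexer> factory = FACTORIES.get(grammarId);
        if (factory == null) {
            throw new IllegalArgumentException("No lexer registered for " + grammarId);
        }
        return factory.get();
    }

    public static void main(String[] args) {
        System.out.println(create("source1").name());
    }
}

// Hypothetical stand-ins for the generated lexer classes.
interface Lexer {
    String name();
}

class Lexer01 implements Lexer {
    public String name() { return "Lexer01"; }
}

class Lexer02 implements Lexer {
    public String name() { return "Lexer02"; }
}
```

Because the map stores factories rather than instances, two calls to create("source1") return two separate objects, which is what stateful lexers need.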

Using polymorphism is almost guaranteed to be faster than string manipulation, and will be checked for correctness at compile time. Is site really a String? If so, FindSite should be called GetSiteName. I would expect FindSite to return a Site object that knows the appropriate lexer and parser.
Another speed issue is speed of coding. It would definitely be better to have your different lexers and parsers in individual classes (perhaps with shared functionality in another). It'll make your code slightly smaller, and it will be significantly easier for someone to understand.

Something like:
Map<String,LexerParserTuple> lptmap = new HashMap<String,LexerParserTuple>();
LexerParserTuple lpt = lptmap.get(site);
lpt.loadlexer();
lpt.loadparser();
combined with some regex magic rather than string.indexOf() to grab the names of the sites should dramatically clean up your code.

Replace Conditional With Polymorphism
For a half-measure, for findsite(), you could simply set up a HashMap to get you from site identifier to site. An alternative cleanup would be simply to return the site string, thus:
String findsite(txt) {
...................
if (line.indexOf("site1-identifier") >= 0)
return site1;
if (line.indexOf("site2-identifier") >= 0)
return site2;
if (line.indexOf("site3-identifier") >= 0)
return site3;
...
}
Using indexOf() in this way isn't really expressive; I'd use equals() or contains().

Suppose your code is inefficient.
Will it take more time than (say) 1% of the time to actually parse the input?
If not, you've got bigger "fish to fry".

I was thinking about deriving the lexer/parser spawning into sep. classes and then instantiating them as soon as I get a match
It looks like you have the answer already. That would create code that is more flexible, but not necessarily faster.
I guess some benchmarking is in order
Yes, measure both approaches and make an informed decision. My guess is that the way you have it already would be enough.
Perhaps, if what bothers you is having a "kilometric" method, you could refactor it into different functions with Extract Method.
The most important thing is to first have a solution that does the job, even if it is slow; once you have it working, profile it and detect the points where performance could be improved. Remember the "Rules of optimization".

I would change findsite to return a site type (a superclass) and then leverage polymorphism.
That should be faster than string manipulation.
Do you need separate lexers?

Use a Map to configure a site to loadstrategy structure. Then a simple lookup is required based on 'site' and you execute the appropriate strategy. Same can be done for findSite().

You could have a map of identifiers to sites, then just iterate over the map entries.
// define this as a static somewhere ... build from a properties file
Map<String,String> m = new HashMap<String,String>(){{
put("site1-identifier","site1");
put("site2-identifier","site2");
}};
// in your method
for(Map.Entry<String,String> entry : m.entrySet()){
if( line.contains(entry.getKey())){
return entry.getValue();
}
}
cleaner: yes
faster: dunno...should be fast enough

You could possibly use reflection:
char site = line.charAt(4);
Method lexerMethod = this.getClass().getMethod( "loadLexer" + site, *parameters types here*);
Method parserMethod = this.getClass().getMethod( "loadParser" + site, *parameters types here*);
lexerMethod.invoke(this, *parameters here*);
parserMethod.invoke(this, *parameters here*);

I don't know about Java, but some languages allow switch to take strings.
switch(site)
{
case "site1": loadlexer1; loadparser1; break;
case "site2": loadlexer2; loadparser2; break;
...
}
As for the second bit, use a regex to extract the identifier and switch on that. You might be better off using an enum.
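For what it's worth, Java has accepted strings in switch since Java 7. Here is a rough sketch of the regex-plus-enum variant, assuming the identifiers really look like "site1-identifier" as in the question's pseudocode. The Site enum, the SiteDispatch class and the returned descriptions are made up for illustration:

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SiteDispatch {
    // Hypothetical enum; one constant per grammar.
    enum Site { SITE1, SITE2, SITE3 }

    // Assumes identifiers of the form "siteN-identifier" somewhere in the line.
    private static final Pattern ID = Pattern.compile("(site\\d+)-identifier");

    static Site findSite(String line) {
        Matcher m = ID.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("No site identifier in: " + line);
        }
        // valueOf throws IllegalArgumentException for an unknown site name.
        return Site.valueOf(m.group(1).toUpperCase(Locale.ROOT));
    }

    static String describe(Site site) {
        // Stands in for "load lexer N, load parser N".
        switch (site) {
            case SITE1: return "lexer1/parser1";
            case SITE2: return "lexer2/parser2";
            case SITE3: return "lexer3/parser3";
            default: throw new AssertionError(site);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(findSite("header site2-identifier v1")));
    }
}
```

The single regex replaces the chain of indexOf() tests, and the enum gives the compiler something to check, unlike raw strings.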


Using optionals, is it possible to return early on "ifPresent" without adding a separate if-else statement?

public Void traverseQuickestRoute(){ // Void return-type from interface
findShortCutThroughWoods()
.map(WoodsShortCut::getTerrainDifficulty)
.ifPresent(this::walkThroughForestPath); // return in this case
if(isBikePresent()){
return cycleQuickestRoute();
}
....
}
Is there a way to exit the method at the ifPresent?
In case it is not possible, for other people with similar use-cases: I see two alternatives
Optional<MappedRoute> woodsShortCut = findShortCutThroughWoods();
if(woodsShortCut.isPresent()){
TerrainDifficulty terrainDifficulty = woodsShortCut.get().getTerrainDifficulty();
return walkThroughForestPath(terrainDifficulty);
}
This feels more ugly than it needs to be and combines if/else with functional programming.
A chain of orElseGet(...) throughout the method does not look as nice, but is also a possibility.
return is a control statement. Neither lambdas (arrow notation) nor method refs (WoodsShortCut::getTerrainDifficulty) support the idea of control statements that move control to outside of themselves.
Thus, the answer is a rather trivial: Nope.
You have to think of the stream 'pipeline' as the thing you're working on. So, the question could be said differently: Can I instead change this code so that I can modify how this one pipeline operation works (everything starting at findShortCut() to the semicolon at the end of all the method invokes you do on the stream/optional), and then make this one pipeline operation the whole method?
Thus, the answer is: orElseGet is probably it.
Disappointing, perhaps. 'functional' does not strike me as the right answer here. The problem is, there are things for/if/while loops can do that 'functional' cannot do. So, if you are faced with a problem that is simpler to tackle using 'a thing that for/if/while is good at but functional is bad at', then it is probably a better plan to just use for/if/while then.
One of the core limitations of lambdas concerns transparency. Lambdas are non-transparent with regard to these 3:
Checked exception throwing. try { list.forEach(x -> { throw new IOException(); }); } catch (IOException e) {} isn't legal even though your human brain can trivially tell it should be fine.
(Mutable) local variables. int x = 5; list.forEach(y -> x += y); does not work. Often there are ways around this (list.stream().mapToInt(Integer::intValue).sum() in this example), but not always.
Control flow. list.forEach(y -> {if (y < 0) return y;}); does not work.
So, keep in mind, you really have only 2 options:
Continually retrain yourself to not think in terms of such control flow. You find orElseGet 'not as nice'. I concur, but if you really want to blanket apply functional to as many places as you can possibly apply it, the whole notion of control flow out of a lambda must not be your go-to plan, and you definitely can't keep thinking 'this code is not particularly nice because it would be simpler if I could control flow out'; otherwise you're going to be depressed all day programming in this style. The day you never even think about it anymore is the day you have succeeded in retraining yourself to 'think more functional', so to speak.
Stop thinking that 'functional is always better'. Given that there are so many situations where its downsides are so significant, perhaps it is not a good idea to pre-suppose that the lambda/methodref based solution must somehow be superior. Apply what seems correct. That should often be "Actually just a plain old for loop is fine. Better than fine; it's the right, most elegant[1] answer here".
[1] "This code is elegant" is, of course, a non-falsifiable statement. It's like saying "The Mona Lisa is a pretty painting". You can't make a logical argument to prove this and it is insanity to try. "This code is elegant" boils down to saying "I think it is prettier", it cannot boil down to an objective fact. That also means in team situations there's no point in debating such things. Either everybody gets to decide what 'elegant' is (hold a poll, maybe?), or you install a dictator that decrees what elegance is. If you want to fix that and have meaningful debate, the term 'elegant' needs to be defined in terms of objective, falsifiable statements. I would posit that things like:
in face of expectable future change requests, this style is easier to modify
A casual glance at code leaves a first impression. Whichever style has the property that this first impression is accurate - is better (in other words, code that confuses or misleads the casual glancer is bad). Said even more differently: Code that really needs comments to avoid confusion is worse than code that is self-evident.
this code looks familiar to a wide array of java programmers
this code consists of fewer AST nodes (the more accurate form of 'fewer lines = better')
this code has simpler semantic hierarchy (i.e. fewer indents)
Those are the kinds of things that should define 'elegance'. Under almost all of those definitions, 'an if statement' is as good or better in this specific case!
For example:
public Void traverseQuickestRoute() {
return findShortCutThroughWoods()
.map(WoodsShortCut::getTerrainDifficulty)
.map(this::walkThroughForestPath)
.orElseGet(() -> {
if (isBikePresent()) {
return cycleQuickestRoute();
}
.... // the remaining alternatives from the original method; every path must return a value
});
}
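To make the map/orElseGet shape concrete, here is a small self-contained sketch. RoutePlanner, its fields and the returned strings are all invented for illustration (the question's MappedRoute and Void types are replaced by plain strings), so treat it as one possible reading rather than the asker's actual code:

```java
import java.util.Optional;

class RoutePlanner {
    private final String shortCutDifficulty; // null when there is no shortcut
    private final boolean bikePresent;

    RoutePlanner(String shortCutDifficulty, boolean bikePresent) {
        this.shortCutDifficulty = shortCutDifficulty;
        this.bikePresent = bikePresent;
    }

    Optional<String> findShortCutThroughWoods() {
        return Optional.ofNullable(shortCutDifficulty);
    }

    String traverseQuickestRoute() {
        // The whole method is one pipeline: the "early return" becomes the
        // mapped value, and everything else moves into orElseGet.
        return findShortCutThroughWoods()
                .map(difficulty -> "walk through the woods (" + difficulty + ")")
                .orElseGet(() -> bikePresent ? "cycle" : "take the main road");
    }

    public static void main(String[] args) {
        System.out.println(new RoutePlanner("easy", true).traverseQuickestRoute());
        System.out.println(new RoutePlanner(null, true).traverseQuickestRoute());
    }
}
```

Note that the supplier passed to orElseGet only runs when the Optional is empty, so the fallback logic is skipped entirely when the shortcut exists.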
There is Optional#ifPresentOrElse, which takes an extra Runnable for the else case, since Java 9.
public Void traverseQuickestRoute() { // Void return-type from interface
findShortCutThroughWoods()
.map(WoodsShortCut::getTerrainDifficulty)
.ifPresentOrElse(this::walkThroughForestPath,
this::alternative);
return null;
}
private void alternative() {
if (isBikePresent()) {
cycleQuickestRoute();
return;
}
...
}
I would split the method as above. Though for short code () -> { ... } might be readable.

Call void on possible null object, in one line

I use Java 8 in my project, but I cannot solve this issue with a nice implementation.
UIInput textInput = ...;
if (textInput != null)
{
textInput.setValid(false);
}
Is there a way to check whether the object is null and, if not, call the method on it, in one line?
Don't.
What you have is easily readable for anyone with some Java knowledge. Any one-liner misusing a construct intended for something else will take most people way more time to read and understand than this will. And likely some people will misread it and have to read it again later when it does not behave like they expect during a debugging session.
Brevity / Number of lines of code is not an ultimate measure for readability or quality.
What you can do, if this is at the wrong level of detail compared with the rest of your method, is abstract it away with a single speaking method call. Say, create a method 'ensureTextInputIsSet' that just contains this code and returns the potentially modified object.
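A small sketch of that extraction, with a hypothetical Input class standing in for the UIInput component and a made-up helper name:

```java
class NullSafeCall {
    // Hypothetical stand-in for the UIInput component from the question.
    static class Input {
        private boolean valid = true;
        void setValid(boolean valid) { this.valid = valid; }
        boolean isValid() { return valid; }
    }

    // The null check lives here, so the caller needs only one descriptive line.
    static Input markInvalidIfPresent(Input input) {
        if (input != null) {
            input.setValid(false);
        }
        return input;
    }

    public static void main(String[] args) {
        Input textInput = new Input();
        markInvalidIfPresent(textInput);
        System.out.println(textInput.isValid());
        System.out.println(markInvalidIfPresent(null));
    }
}
```

The call site becomes a one-liner without bending Optional or any other construct out of shape.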
Optional.ofNullable(textInput).ifPresent(x -> x.setValid(false));
But this is not what Optional was designed for...
If a variable may be null, that is, the object is optional, then declare it as such:
Optional<UIInput> textInput = ...;
With a more verbose, but always safe, usage:
textInput.ifPresent(ti -> ti.setValid(false));
And nice chaining, for instance going from Optional<UIInput> to Optional<String> by calling a method on the UIInput:
String s = textInput.map(UIInput::getText).orElse("");

Is defining a variable for every condition better than using getters and setters?

I have two code snippets; can anyone tell me which approach is better, and why?
Approach 1 -
if (("Male").equalsIgnoreCase(input.getSex()) || ("Female").equalsIgnoreCase(input.getSex())) {
// do something
} else {
// do something else
}
Approach 2 -
String tempSex = input.getSex();
if (("Male").equalsIgnoreCase(tempSex) || ("Female").equalsIgnoreCase(tempSex)) {
// do something
} else {
// do something else
}
This is one condition; in my code, I have many conditions similar to this one. In some conditions, I have to compare against many more strings.
Is it a good approach to define a variable for every condition, or can I just use the getters directly?
These two approaches are essentially identical in terms of performance assuming the getSex function is a trivial getter (if getSex is complex or involves changing some other state in the class then these two bits of code are NOT equivalent).
I would prefer the first from a style point of view in that the extra local variable is slightly confusing to the flow of the code.
However, if your main purpose in using code of this form is to validate legal input (as it appears from your example), I would try to create a method
boolean input.isSexValid() to encapsulate that functionality, which would make the code less repetitive and more readable.
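Something along these lines, where PersonInput is a hypothetical name for whatever class "input" is an instance of:

```java
class PersonInput {
    private final String sex;

    PersonInput(String sex) {
        this.sex = sex;
    }

    String getSex() {
        return sex;
    }

    // Reads the getter once and keeps the comparison in a single place.
    boolean isSexValid() {
        String value = getSex();
        return "Male".equalsIgnoreCase(value) || "Female".equalsIgnoreCase(value);
    }

    public static void main(String[] args) {
        System.out.println(new PersonInput("female").isSexValid());
        System.out.println(new PersonInput("unknown").isSexValid());
    }
}
```

Putting the literal first ("Male".equalsIgnoreCase(value)) means a null value simply yields false instead of a NullPointerException.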
Strong argument that this is primarily opinion based, but:
I vote Approach 2.
What if the getter is slow (like it has to go to a DB)? You have a redundant round trip to the DB.

Critical loop containing many "if" whose output is constant: how to save on condition tests? For Java ;)

I just read this thread Critical loop containing many "if" whose output is constant : How to save on condition tests?
and this one Constant embedded for loop condition optimization in C++ with gcc which are exactly what I would like to do in Java.
I have some if conditions called many times; the conditions are composed of attributes defined at initialization, which won't change.
Will javac optimize the bytecode by removing the unused branches of the conditions, to avoid spending time testing them?
Do I have to define the attributes as final, or is that useless?
Thanks for your help,
Aurélien
Java compile time optimization is pretty lacking. If you can use a switch statement it can probably do some trivial optimizations. If the number of attributes is very large then a HashMap is going to be your best bet.
I'll close by saying that this sort of thing is very very rarely a bottleneck and trying to prematurely optimize it is counterproductive. If your code is, in fact, called a lot then the JIT optimizer will do its best to make your code run faster. Just say what you want to happen and only worry about the "how" when you find that's actually worth the time to optimize it.
In OO languages, the solution is to use delegation or the command pattern instead of if/else forests.
So your attributes need to implement a common interface like IAttribute which has a method run() (or make all attributes implement Runnable).
Now you can simply call the method without any decisions in the loop:
for(....) {
attr.run();
}
It's a bit more complex if you can't add methods to your attributes. My solution in this case is using enums and an EnumMap which contains the runnables. Access to an EnumMap is almost like an array access (i.e. O(1)).
for(....) {
map.get(attr).run();
}
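A minimal sketch of the EnumMap variant, assuming Java 8 lambdas for the Runnables. The Attr enum and the two actions are invented purely to have something to run:

```java
import java.util.EnumMap;
import java.util.Map;

class AttributeDispatch {
    // Hypothetical attribute kinds, fixed at initialization.
    enum Attr { SCALE, ROTATE }

    static int runAll(Attr attr, int iterations) {
        int[] counter = {0}; // single-element array so the lambdas can mutate it

        // One Runnable per enum constant replaces the if/else forest.
        Map<Attr, Runnable> actions = new EnumMap<>(Attr.class);
        actions.put(Attr.SCALE, () -> counter[0] += 2);
        actions.put(Attr.ROTATE, () -> counter[0] -= 1);

        for (int i = 0; i < iterations; i++) {
            actions.get(attr).run(); // array-like lookup, no condition tests
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(runAll(Attr.SCALE, 3));
        System.out.println(runAll(Attr.ROTATE, 4));
    }
}
```

If the attribute cannot change during the loop, the lookup could also be done once before the loop and the resulting Runnable reused.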
I don't know about Java specifics regarding this, but you might want to look into a technique called Memoization which would allow you to look up results for a function in a table instead of calling the function. Effectively, memoization makes your program "remember" results of a function for a given input.
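In Java, one simple way to memoize is a HashMap with computeIfAbsent (Java 8+). The sketch below uses a made-up slowSquare function and a call counter just to show that the second lookup hits the table. It assumes the memoized function is not recursive, since a mapping function that modifies the same HashMap is not supported by computeIfAbsent:

```java
import java.util.HashMap;
import java.util.Map;

class Memo {
    private final Map<Integer, Long> cache = new HashMap<>();
    int calls = 0; // counts how often the real computation runs

    // Hypothetical stand-in for an expensive function.
    long slowSquare(int n) {
        calls++;
        return (long) n * n;
    }

    // Looks the result up in the table; computes and stores it only on a miss.
    long square(int n) {
        return cache.computeIfAbsent(n, k -> slowSquare(k));
    }

    public static void main(String[] args) {
        Memo memo = new Memo();
        System.out.println(memo.square(12));
        System.out.println(memo.square(12));
        System.out.println(memo.calls);
    }
}
```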
Try replacing the if with runtime polymorphism. No, that's not as strange as you think.
If, for example you have this:
for (int i=0; i < BIG_NUMBER; i++) {
if (calculateSomeCondition()) {
frobnicate(someValue);
} else {
defrobnicate(someValue);
}
}
then replace it with this (Function taken from Guava, but can be replaced with any other fitting interface):
Function<X, Y> f;
if (calculateSomeCondition()) {
f = new Frobnicator();
} else {
f = new Defrobnicator();
}
for (int i=0; i < BIG_NUMBER; i++) {
f.apply(someValue);
}
Method calls are pretty highly optimized on most modern JVMs even (or especially) if there are only a few possible call targets.

Returning from a method with implicit or explicit "else" or with a single "return" statement?

Some people consider multiple return statements as bad programming style. While this is true for larger methods, I'm not sure if it is acceptable for short ones. But there is another question: Should else explicitly be written, if there is a return statement in the previous if?
Implicit else:
private String resolveViewName(Viewable viewable) {
if(viewable.isTemplateNameAbsolute())
return viewable.getTemplateName();
return uriInfo.getMatchedResources().get(0).getClass().toString();
}
Explicit else:
private String resolveViewName(Viewable viewable) {
if(viewable.isTemplateNameAbsolute())
return viewable.getTemplateName();
else
return uriInfo.getMatchedResources().get(0).getClass().toString();
}
Technically else is not necessary here, but it makes the intent more obvious.
And perhaps the cleanest approach with a single return:
private String resolveViewName(Viewable viewable) {
String templateName;
if(viewable.isTemplateNameAbsolute())
templateName = viewable.getTemplateName();
else
templateName = uriInfo.getMatchedResources().get(0).getClass().toString();
return templateName;
}
Which one would you prefer? Other suggestions?
Other obvious suggestion: use the conditional operator.
private String resolveViewName(Viewable viewable) {
return viewable.isTemplateNameAbsolute()
? viewable.getTemplateName()
: uriInfo.getMatchedResources().get(0).getClass().toString();
}
For cases where this isn't viable, I'm almost certainly inconsistent. I wouldn't worry too much about it, to be honest - it's not the kind of thing where the readability is likely to be significantly affected either way, and it's unlikely to introduce bugs.
(On the other hand, I would suggest using braces for all if blocks, even single statement ones.)
I prefer the cleanest approach, with a single return. To me the code is readable, maintainable and not confusing. If you need to add some lines to the if or else block tomorrow, it is easy.
1.) code should never be clever.
The "single point of exit" dogma comes from the days of Structured Programming.
In its day, structured programming was a GOOD THING, especially as an alternative to the GOTO ridden spaghetti code that was prevalent in 1960's and 1970's vintage Fortran and Cobol code. But with the popularity of languages such as Pascal, C and so on with their richer range of control structures, Structured Programming has been assimilated into mainstream programming, and certain dogmatic aspects have fallen out of favor. In particular, most developers are happy to have multiple exits from a loop or method ... provided that it makes the code easier to understand.
My personal feeling is that in this particular case, the symmetry of the second alternative makes it easiest to understand, but the first alternative is almost as readable. The last alternative strikes me as unnecessarily verbose, and the least readable.
But @Jon Skeet pointed out that there is a far more significant stylistic issue with your code; i.e. the absence of { } blocks around the 'then' and 'else' statements. To me the code should really be written like this:
private String resolveViewName(Viewable viewable) {
if (viewable.isTemplateNameAbsolute()) {
return viewable.getTemplateName();
} else {
return uriInfo.getMatchedResources().get(0).getClass().toString();
}
}
This is not just an issue of code prettiness. There is actually a serious point to always using blocks. Consider this:
String result = "Hello";
if (i < 10)
    result = "Goodbye";
    if (j > 10)
        result = "Hello again";
At first glance, it looks like result will be "Hello again" if i is less than 10 and j is greater than 10. In fact, that is a misreading - we've been fooled by incorrect indentation. But if the code had been written with { } 's around the then parts, it would be clear that the indentation was wrong; e.g.
String result = "Hello";
if (i < 10) {
    result = "Goodbye";
    }
    if (j > 10) {
        result = "Hello again";
        }
As you see, the first } stands out like a sore thumb and tells us not to trust the indentation as a visual cue to what the code means.
I usually prefer the first option since it's the shortest.
And I think that any decent programmer should realize how it works without me having to write the else or using a single return at the end.
Plus there are cases in long methods where you might need to do something like
if(!isValid(input)) { return null; }// or 0, or false, or whatever
// a lot of code here working with input
I find it's even clearer done like this for these types of methods.
Depends on the intention. If the first return is a quick bail-out, then I'd go without the else; if OTOH it's more like a "return either this or that" scenario, then I'd use else. Also, I prefer an early return statement over endlessly nested if statements or variables that exist for the sole purpose of remembering a return value. If your logic were slightly complex, or even as it is now, I'd consider putting the two ways of generating the return value into dedicated functions, and use an if / else to call either.
I prefer multiple returns in an if-else structure when the size of both statements is about equal, the code looks more balanced that way. For short expressions I use the ternary operator. If the code for one test is much shorter or is an exceptional case, I might use a single if with the rest of the code remaining in the method body.
I try to avoid modifying variables as much as possible, because I think that makes the code much harder to follow than multiple exits from a method.
Keep the lingo consistent and readable for the lowest-common-denominator programmer who might have to revisit the code in the future.
It's only a few extra letters to type the else, and it makes no difference to anything but legibility.
I prefer the first one.
Or... you can use if return else return for equally important bifurcations, and if return return for special cases.
When you have assertions (if p==null return null) then the first one is the clearest by far. If you have equally weighted options... I find it fine to use the explicit else.
It's completely a matter of personal preference - I've literally gone through phases of doing all 4 of those options (including the one Jon Skeet posted) - none of them are wrong, and I've never experienced any drawbacks as a result of using any of them.
The stuff about only one return statement dates from the 1970s when Dijkstra and Wirth were sorting out structured programming. They applied it with great success to control structures, which have now settled down according to their prescription of one entry and one exit. Fortran used to have multiple entries to a subroutine (or possibly function, sorry, about 35 years since I wrote any), and this is a feature I've never missed, indeed I don't think I ever used it.
I've never actually encountered this 'rule' as applied to methods outside academia, and I really can't see the point. You basically have to obfuscate your code considerably to obey the rule, with extra variables and so on, and there's no way you can convince me that's a good idea. Curiously enough, if you write it the natural way as per your first option the compiler usually generates the code according to the rule anyway ... so you can argue that the rule is being obeyed: just not by you ;-)
Sure, people have a lot to say about programming style, but don't be so concerned about something relatively trivial to your program's purpose.
Personally, I like to go without the else. If anybody is going through your code, chances are high he won't be too confused without the else.
I prefer the second option because to me it is the quickest to read.
I would avoid the third option because it doesn't add clarity or efficiency.
The first option is fine too, but at least I would put a blank line between the first bit (the if and its indented return) and the second return statement.
In the end, it comes down to personal preference (as so many things in programming style).
Considering multiple return statements "bad style" is a long, long discredited fallacy. They can make the code far cleaner and more maintainable than explicit return value variables. Especially in larger methods.
In your example, I'd consider the second option the cleanest because the symmetrical structure of the code reflects its semantics, and it's shorter and avoids the unnecessary variable.
