Good practice for string pool in Ruby

In Java, we usually create a StringPool class to store frequently used Strings. For example, we declare public static final String SPACE = " "; and we call StringPool.SPACE when needed. Is it good practice to do this in Ruby as well? If yes, can you give an example of a StringPool in Ruby?

If your intention is to group a set of constants in Ruby within a specific context, you could do this using a class or a module like so:
class MyConstants
CONST_1 = "Constant1"
CONST_2 = "Constant2"
# ...
end
or
module MyConstants
CONST_1 = "Constant1"
CONST_2 = "Constant2"
end
You could then access those constants in the following way:
MyConstants::CONST_1
Note that constant values can be anything, not just strings; they can even be symbols. As mentioned in other answers, using symbols is a common Ruby idiom. However, this pattern makes sense when you want to make explicit that your constants belong to a certain context (like an enumeration). IMHO this reinforces the semantics of your application.

Your answer was already provided by Ari in comments, but I will duplicate it here to help others find answers fast.
You need a symbol.
What is a symbol?
A symbol is an immutable, string-like identifier that is fast to compare.
How to make a symbol?
You can create one in these ways:
:foo # => :foo <- it's a symbol
"string".to_sym # => :string <- a symbol too, but prefer the first form
# because it avoids creating a String object.
It mustn't be magic!
And it is not. All symbols are stored in a special symbol table, including your :foo and :string. You can look into the symbol table:
Symbol.all_symbols #=> [:execute_script, :array_new_range, :ExtendCommand, ...]
This table can be really big. What if you want to find a symbol in it?
Symbol.all_symbols.index(:foobar42) # => bad: writing :foobar42 creates the
# symbol if it did not exist yet, so
# this lookup always finds it.
Symbol.all_symbols.any? { |s| s.to_s == "foobar42" } # => false if there is no such symbol
Symbols are fast to compare because each symbol name maps to exactly one object, so == only has to check whether both operands are the same object.
When to use symbols?
I'm pretty sure you know: when you want to use the same string a couple of times and are not planning to change it.
That's all.
My first thought after reading your question was: hey, why do they need string pools? I thought strings were immutable in Java. That's why we have to use string builders and other stuff. But then I saw public static final String. Oh yeah, it is Java.

Related

Why is using a trash default value for a string wrong?

tl;dr;
Why is using
string myVariable = "someInitialValueDifferentThanUserValue99999999999";
as a default value wrong?
Explanation of the situation:
I had a discussion with a colleague at my workplace.
He proposed using some trash value as the default in order to differentiate it from a user-supplied value.
A simple example would look like this:
string myVariable = "someInitialValueDifferentThanUserValue99999999999";
...
if(myVariable == "someInitialValueDifferentThanUserValue99999999999")
{
...
}
It seems obvious and intuitive to me that this is wrong.
But I could not give a good argument for this, beyond that:
this is not professional.
there is a slight chance that someone would input the same value.
I once read that if you end up in such a situation, your architecture or programming habits are wrong.
edit:
Thank you for the answers. I found a solution that satisfied me, so I share with the others:
It is good to make a bool guard value that indicates if the initialization of a specific object has been accomplished.
And based on this private bool variable I can deduce if I play with a string that is default empty value "" from my mechanism (that is during initialization) or empty value from the user.
For me, this is a more elegant way.

Optional
Optional can be used.
Returns an empty Optional instance. No value is present for this Optional.
API Note:
Though it may be tempting to do so, avoid testing if an object is empty by comparing with == against instances returned by Optional.empty(). There is no guarantee that it is a singleton. Instead, use isPresent().
Ref: Optional
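As a rough sketch of the Optional idea applied to the question: the holder below keeps "nothing supplied yet" separate from "the user supplied an empty string", so no sentinel string is needed. The class name UserInput and its methods are made up for illustration; they are not from the question.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical holder for a user-supplied value (illustrative names only).
final class UserInput {
    // null means "not provided yet"; never exposed directly to callers.
    private String value;

    // Records what the user typed; an empty string is a legitimate value.
    void set(String userValue) {
        this.value = Objects.requireNonNull(userValue);
    }

    // An empty Optional means "not provided", regardless of string content.
    Optional<String> get() {
        return Optional.ofNullable(value);
    }
}
```

Callers would then check get().isPresent() (or use orElse with a fallback) instead of comparing against a magic string.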
Custom escape sequence shared by server and client
Define default value
When the user enters the default value, escape the user value
Use a marker character
Always define the first character as the marker character
Take decision based on this character and strip this character for any actual comparison
Define clear boundaries for the check as propagating this character across multiple abstractions can lead to code maintenance issues.

Small elaboration on "It's not professional":
It's often a bad idea, because
it wastes memory when not a constant (at least in Java - of course, unless you're working with very limited space that's negligible).
Even as constant it may introduce ambiguity once you have more classes, packages or projects ("Was it NO_INPUT, INPUT_NOT_PROVIDED, INPUT_NONE?")
usually it's a sign that there will be no standardized scope-bound Defined_Marker_Character in the Project Documentation like suggested in the other answers
it introduces ambiguity for how to deal with deciding if an input has been provided or not
In the end you will either have a lot of varying NO_INPUT constants in different classes or end up with a self-made SthUtility class that defines one constant SthUtility.NO_INPUT and a static method boolean SthUtility.isInputEmpty(...) that compares a given input against that constant, which basically is reinventing Optional. And you will be copy-pasting that one class into every one of your projects.

There is really no need for a marker value: String.isEmpty() has been available since Java 6, and String.isBlank() was added in Java 11.
String value = "";
// true only if length == 0
if (value.isEmpty()) {
System.out.println("Value is empty");
}
value = " ";
// true if empty or contains only white space
if (value.isBlank()) {
System.out.println("Value is blank");
}
I also prefer to limit the use of such marker strings: they can be found by searching the class file, which might lead to exploitation of the code.

How to store mathematical formula in MS SQL Server DB and interpret it using JAVA?

I have to give the user the option to enter in a text field a mathematical formula and then save it in the DB as a String. That is easy enough, but I also need to retrieve it and use it to do calculations.
For example, assume I allow someone to specify the formula of employee salary calculation which I must save in String format in the DB.
GROSS_PAY = BASIC_SALARY - NO_PAY + TOTAL_OT + ALLOWANCE_TOTAL
Assume that terms such as GROSS_PAY and BASIC_SALARY are known to us and we can work out what they evaluate to. The real issue is that we can't predict which combinations of such terms and mathematical operators the user may choose to enter (not just +, -, ×, / but also the radical sign, indicating roots, as well as powers and so on). So how do we interpret this formula in string form once we have retrieved it from the DB, so that we can do calculations based on the composition of the formula?

Building an expression evaluator is actually fairly easy.
See my SO answer on how to write a parser. With a BNF for the range of expression operators and operands you exactly want, you can follow this process to build a parser for exactly those expressions, directly in Java.
The answer links to a second answer that discusses how to evaluate the expression as you parse it.
So, you read the string from the database, collect the set of possible variables that can occur in the expression, and then parse/evaluate the string. If you don't know the variables in advance (seems like you must), you can parse the expression twice, the first time just to get the variable names.
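To make the parse-and-evaluate idea concrete, here is a minimal recursive-descent sketch. It is not the code from the linked answers; the class name FormulaEvaluator, the grammar, and the choice of double arithmetic are my own assumptions. It handles only the right-hand side of a formula (the part after "="), supports + - * / ^ and parentheses, and looks variable names up in a map that you would fill from your own data.

```java
import java.util.Map;

// Illustrative evaluator; names and grammar are assumptions, not from the answer.
//   expr    := term (('+' | '-') term)*
//   term    := factor (('*' | '/') factor)*
//   factor  := '-' factor | primary ('^' factor)?   (power is right-associative)
//   primary := NUMBER | IDENTIFIER | '(' expr ')'
final class FormulaEvaluator {
    private final String src;
    private final Map<String, Double> vars;
    private int pos = 0;

    private FormulaEvaluator(String src, Map<String, Double> vars) {
        this.src = src;
        this.vars = vars;
    }

    // Parses and evaluates the whole string; rejects trailing garbage.
    static double evaluate(String formula, Map<String, Double> vars) {
        FormulaEvaluator e = new FormulaEvaluator(formula, vars);
        double result = e.expr();
        e.skipSpaces();
        if (e.pos < e.src.length()) {
            throw new IllegalArgumentException("Unexpected '" + e.src.charAt(e.pos) + "' at position " + e.pos);
        }
        return result;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    // Consumes the given character if it is next (ignoring spaces).
    private boolean accept(char c) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    private double expr() {
        double v = term();
        while (true) {
            if (accept('+')) v += term();
            else if (accept('-')) v -= term();
            else return v;
        }
    }

    private double term() {
        double v = factor();
        while (true) {
            if (accept('*')) v *= factor();
            else if (accept('/')) v /= factor();
            else return v;
        }
    }

    private double factor() {
        if (accept('-')) return -factor();
        double base = primary();
        return accept('^') ? Math.pow(base, factor()) : base;
    }

    private double primary() {
        if (accept('(')) {
            double v = expr();
            if (!accept(')')) throw new IllegalArgumentException("Expected ')' at position " + pos);
            return v;
        }
        int start = pos; // accept() has already skipped leading spaces
        if (pos < src.length() && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) {
            while (pos < src.length() && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) pos++;
            return Double.parseDouble(src.substring(start, pos));
        }
        while (pos < src.length() && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '_')) pos++;
        if (start == pos) throw new IllegalArgumentException("Expected a value at position " + pos);
        String name = src.substring(start, pos);
        Double value = vars.get(name);
        if (value == null) throw new IllegalArgumentException("Unknown variable: " + name);
        return value;
    }
}
```

Usage would look like FormulaEvaluator.evaluate("BASIC_SALARY - NO_PAY + TOTAL_OT + ALLOWANCE_TOTAL", values), where values maps each term to a number. Roots can be written as a fractional power (for example x ^ 0.5). For money you would probably want BigDecimal rather than double; double is used here only to keep the sketch short.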

As described in "Evaluating a math expression given in string form", Java has a JavaScript engine that can evaluate a String containing operators.
Hope this helps.

You could build a string representation of a class that effectively wraps your expression and compile it using the system JavaCompiler (it requires a file system). You can evaluate strings directly using JavaScript or Groovy. In each case, you need to figure out a way to bind variables. One approach would be to use a regex to find and replace known variable names with a call to a binding function:
getValue("BASIC_SALARY") - getValue("NO_PAY") + getValue("TOTAL_OT") + getValue("ALLOWANCE_TOTAL")
or
getBASIC_SALARY() - getNO_PAY() + getTOTAL_OT() + getALLOWANCE_TOTAL()
This approach, however, exposes you to all kinds of injection type security bugs; so, it would not be appropriate if security was required. The approach is also weak when it comes to error diagnostics. How will you tell the user why their expression is broken?
An alternative is to use something like ANTLR to generate a parser in Java. It's not too hard and there are a lot of examples. This approach will provide both security (users can't inject malicious code because it won't parse) and diagnostics.

Recommended way to find a word in a comma-separated string?

I want to find if a utility is in one of the utilities.
I have a JUnit test as follows
@Test
public void testUtilityInUtilities() {
final String utilities = "Pacific Gas & Electric (PG&E),San Diego Gas & Electric (SDG&E), Salt River Project (SRP),Southern California Edison (SCE)";
final String utility = "San Diego Gas & Electric (SDG&E)";
assertTrue(utilities.contains(utility));
}
Is it a good enough test? Or should I do something similar to the following?
String[] splitString = utilities.split(",");
for (String string : splitString) {
if (string.equals(utility)) {return true;}
}
return false;
Which method is recommended: split, contains, or something else?

The contains way is faster, but it is prone to false positives: it will match a sub-string, say, "Gas & Electric", even though the actual string was "Pacific Gas & Electric (PG&E)". You can guard against this by requiring that the points around the match be at an end of the string or at a comma. You could improve upon the first method by constructing a regular expression from the regex-quoted search string framed by end markers (i.e. commas, $ and ^) to require a complete match, too.
The split way is more reliable, but it is wasteful: you end up creating a whole array of substrings, only to check for a presence of a single string, and throw away the rest.
All in all, I would prefer the first method in situations where performance matters, because it is not wasteful. If you run this method once in a while, though, the split-based method is easier to code and to read.
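Both variants can be sketched roughly like this. The class and method names are mine, and trimming each entry is an assumption I added because the sample list has a space after one of its commas.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Illustrative helpers; names are hypothetical.
final class CsvMembership {
    // Split-based check: exact match against each trimmed entry.
    static boolean containsEntry(String csv, String entry) {
        return Arrays.stream(csv.split(","))
                     .map(String::trim)
                     .anyMatch(entry::equals);
    }

    // Regex-based check: the quoted entry must be framed by the start or end
    // of the string or by commas (optional whitespace allowed around it).
    static boolean containsEntryRegex(String csv, String entry) {
        Pattern framed = Pattern.compile("(^|,)\\s*" + Pattern.quote(entry) + "\\s*(,|$)");
        return framed.matcher(csv).find();
    }
}
```

With the list from the question, both methods accept "San Diego Gas & Electric (SDG&E)" and reject the bare sub-string "Gas & Electric", which a plain contains call would accept.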

For the case that you have mentioned, contains should suffice. Split would unnecessarily create an additional array which you are not using for your data processing (at least in the code shown above).
Another point to consider is how many searches you will be performing on the given String. If you are performing multiple searches for a utility in the utilities String, then you should think about using more complex data structures that enable multiple fast searches, e.g. suffix trees.

Accessing specific Grails Params in controller

This is probably a simple one and more Java related than grails but I'm a bit lost and not sure where to even start looking on this, I've googled about but am not really sure what I'm after, so would appreciate a pointer if possible please!
In the grails app I have a form which I save, all well and good. In the controller I can see the list of params it returns via a simple println and when I want to find a specific value currently I do a params.each and then compare the key to a pre defined string to find the one I want, my question is: -
Can I, and how would I, specifically say "get me the value of the parameter with the key "banana", rather than having to loop through the whole list to find it?
Also is there a way of creating a new set of secondary params, or just another plain old dictionary item (is that the right term?) where I use a regular expression to say "give me all the items whose key match the pattern "XYZ"?
It probably doesn't make much difference speed wise as the params are never that big but it'd be nice to make things more efficient where possible.
Any feedback much appreciated!

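For the first question, to get the 'banana' parameter you have to use:
params.banana
For the second, find all matching entries with a regexp:
def matched = params.findAll { it.key =~ /XYZ/ } // keys that contain a match for XYZ
// or, with a precompiled pattern (matches() requires the whole key to match)
def p = ~/XYZ/
def matched = params.findAll { p.matcher(it.key).matches() }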

There's a params object you can use. E.g. with someurl.com?myparam=test you can access the value with "params.myparam".
More information over here: http://grails.org/doc/2.2.x/ref/Controllers/params.html

Is StringBuffer the same as Strings in Ruby and Symbols the same as regular Java strings?

I just started reading this book Eloquent Ruby and I have reached the chapter about Symbols in Ruby.
Strings in Ruby are mutable, which means each string allocates its own memory since the content can change, even when the content is equal. If I need a mutable String in Java I would use StringBuffer. However, since regular Java Strings are immutable, one String object can be shared by multiple references. So if I had two regular Strings with the content "Hello World", both references would point to the same object.
So is the purpose of Symbols in Ruby actually the same as "normal" String objects in Java? Is it a feature given to the programmer to optimize memory?
Is any of what I have written here true? Or have I misunderstood the concept of Symbols?

Symbols are close to strings in Ruby, but they are not the equivalent to regular Java strings, although they, too, do share some commonalities such as immutability. But there is a slight difference - there is more than one way to obtain a reference to a Symbol (more on that later on).
In Ruby, it is entirely possible to convert the two back and forth. There is String#to_sym to convert a String into a Symbol and there is Symbol#to_s to convert a Symbol into a String. So what is the difference?
To quote the RDoc for Symbol:
The same Symbol object will be created for a given name or string for the duration of a program's execution, regardless of the context or meaning of that name.
Symbols are unique identifiers. If the Ruby interpreter stumbles over let's say :mysymbol for the first time, here is what happens: Internally, the symbol gets stored in a table if it doesn't exist yet (much like the "symbol table" used by parsers; this happens using the C function rb_intern in CRuby/MRI), otherwise Ruby will look up the existing value in the table and use that. After the symbol gets created and stored in the table, from then on wherever you refer to the Symbol :mysymbol, you will get the same object, the one that was stored in that table.
Consider this piece of code:
sym1 = :mysymbol
sym2 = "mysymbol".to_sym
puts sym1.equal?(sym2) # => true, one and the same object
str1 = "Test"
str2 = "Test"
puts str1.equal?(str2) # => false, not the same object
This illustrates the major difference between Java Strings and Ruby Symbols. If you want object identity for Strings in Java, you only get it when you compare exactly the same String reference, whereas in Ruby you can obtain a reference to the same Symbol in multiple ways, as the example above shows.
The uniqueness of Symbols makes them perfect keys in hashes: the lookup performance is improved compared to regular Strings since you don't have to hash your key explicitly as it would be required by a String, you can simply use the Symbol's unique identifier for the lookup directly. By writing :somesymbol you tell Ruby to "give me that one thing that you stored under the identifier 'somesymbol'". So symbols are your first choice when you need to uniquely identify things as in:
hash keys
naming or referring to variable, method and constant names (e.g. obj.send :method_name )
But, as Jim Weirich points out in the article below, Symbols are not Strings, not even in the duck-typing sense. You can't concatenate them or retrieve their size or get substrings from them (unless you convert them to Strings first, that is). So the question when to use Strings is easy - as Jim puts it:
Use Strings whenever you need … umm … string-like behavior.
Some articles on the topic:
Ruby Symbols.
Symbols are not immutable strings
13 Ways of looking at a Ruby Symbol

The difference is that Java Strings need not point to the same object if they contain the same text. When you declare constant strings in your code, they normally do, since the compiler will put them in the constant pool.
However, if you create a String dynamically at runtime in Java, two Strings can perfectly well point to different objects and still contain the same text. You can, however, force them to share one object by interning the String objects (calling String.intern(), see the Java API).
A nice example can be found here.
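Since the link is easy to lose, here is a small sketch of the same point (class and method names are just for illustration): two identical literals share one pooled object, a String built at runtime is a separate object, and intern() maps it back to the pooled one.

```java
// Illustrative demo; names are hypothetical.
final class StringIdentityDemo {
    // Returns four flags, in the order described by the comments below.
    static boolean[] compare() {
        String literal1 = "Hello World";
        String literal2 = "Hello World";
        // Built at runtime, so it is a new object outside the constant pool.
        String runtime = new StringBuilder("Hello ").append("World").toString();
        return new boolean[] {
            literal1 == literal2,         // true: both literals come from the constant pool
            literal1 == runtime,          // false: same text, different object
            literal1 == runtime.intern(), // true: intern() returns the pooled instance
            literal1.equals(runtime)      // true: equals() compares content
        };
    }
}
```

This is why content comparison in Java should go through equals(), whereas a Ruby Symbol with a given name is always the same object.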
