Substituting multiple Strings in String - java

I know there are a lot of questions and answers on similar topics, but I couldn't find an answer to my question. This is a small snippet of my code:
private String substitute(String text) {
List<Macro> macros = getMacros();
for (Macro macro : macros) {
text = StringUtils.replace(text, macro.getKey(), macro.getValue());
}
return text;
}
Would this be a good way to substitute multiple macro variables in a text String? This creates a new String object on every loop iteration, so I am wondering if there's a better way to do this. Ideally I would have used the Apache Commons StrSubstitutor class, but I can't because of the format of the tokens/macros (different formats, and not between a fixed prefix/suffix). I also don't want to use regex because of performance issues.
According to some coding rules at work I need to mark the argument as final. I wonder if that's indeed good practice here. I know that Strings are immutable and I know that whenever I call StringUtils.replace() it will return me a new String object. But I am wondering if the String argument here should be marked as final as suggested and in the method do something like this:
String result = text;
for (Macro macro : macros) {
result = StringUtils.replace(result, macro.getKey(), macro.getValue());
}
I just don't like this.
Any help would be appreciated. Thanks.

You can use apache velocity to replace a string with keys with the equivalent string with values.

Your concern seems to be valid. String is immutable, so each replacement creates a new object. You should use either StringBuilder or StringBuffer.
I wrote a sample for you. Build from here
private static String substitute(String text) {
List<Macro> macros = getMacros();
StringBuffer st = new StringBuffer(text);
for (Macro macro : macros) {
int start = st.indexOf(macro.getKey());
if (start != -1) {
st.replace(start, start + macro.getKey().length(), macro.getValue());
}
}
return st.toString();
}
Cheers!!

If you have some concerns about performance you could use a StringBuilder, which allows you to declare the text param as final:
private String substitute(final String text) {
List<Macro> macros = getMacros();
StringBuilder stringBuilder=new StringBuilder(text);
for(Macro macro: macros) {
int index=stringBuilder.indexOf(macro.getKey());
if (index!=-1) {
stringBuilder.replace(index, index+macro.getKey().length(), macro.getValue());
}
}
return stringBuilder.toString();
}
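Note that the two StringBuffer/StringBuilder answers above replace only the first occurrence of each key, whereas StringUtils.replace in the question replaces all of them. A minimal sketch of a version that replaces every occurrence follows. It assumes the macros can be passed as a Map of key to value (the Macro class and getMacros() from the question are not shown, so MacroSubstitution and its parameters are hypothetical names), and it resumes the search after each inserted value so that a value containing its own key cannot cause an endless loop.

```java
import java.util.Map;

final class MacroSubstitution {

    // Replaces every occurrence of every macro key in text, in map iteration order.
    static String substitute(final String text, final Map<String, String> macros) {
        StringBuilder result = new StringBuilder(text);
        for (Map.Entry<String, String> macro : macros.entrySet()) {
            String key = macro.getKey();
            String value = macro.getValue();
            if (key.isEmpty()) {
                continue; // an empty key would match at every position
            }
            int index = result.indexOf(key);
            while (index >= 0) {
                result.replace(index, index + key.length(), value);
                // Continue after the inserted value so it is never rescanned.
                index = result.indexOf(key, index + value.length());
            }
        }
        return result.toString();
    }
}
```

As in the question's loop, later macros still see the output of earlier ones, so the order of the macros matters.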

Related

alternate method for using substring on a String

I have a string which contains an underscore as shown below:
123445_Lisick
I want to remove all the characters from the String after the underscore. I have tried the code below, it's working, but is there any other way to do this, as I need to put this logic inside a for loop to extract elements from an ArrayList.
public class Test {
public static void main(String args[]) throws Exception {
String str = "123445_Lisick";
int a = str.indexOf("_");
String modfiedstr = str.substring(0, a);
System.out.println(modfiedstr);
}
}
Another way is to use the split method.
String str = "123445_Lisick";
String[] parts = str.split("_");
String modfiedstr = parts[0];
I don't think that really buys you anything though. There's really nothing wrong with the method you're using.
Your method is fine. Though not explicitly stated in the API documentation, I feel it's safe to assume that indexOf(char) will run in O(n) time. Since your string is unordered and you don't know the location of the underscore a priori, you cannot avoid this linear search. Once you have completed the search, you still need to extract the substring for further processing. It's generally safe to assume that for simple operations like this, in a reasonably mature language, the library functions will have been optimized.
Note however, that you are making an implicit assumption that
an underscore will exist within the String
if there is more than one underscore in the string, only the text before the first one should be kept
If either of these assumptions will not always hold, you will need to make adjustments to handle those situations. In either case, you should at least defensively check for a -1 returned from indexOf(char), indicating that '_' is not in the string. Assuming in this situation the entire String is desired, you could use something like this:
public static String stringBefore(String source, char delim) {
if(source == null) return null;
int index = source.indexOf(delim);
// Keep the text before the delimiter; fall back to the whole string if it is absent.
return (index >= 0)?source.substring(0, index):source;
}
You could also use something like that:
public class Main {
public static void main(String[] args) {
String str = "123445_Lisick";
Pattern pattern = Pattern.compile("^([^_]*).*");
Matcher matcher = pattern.matcher(str);
String modfiedstr = null;
if (matcher.find()) {
modfiedstr = matcher.group(1);
}
System.out.println(modfiedstr);
}
}
The regex captures everything from the start of the input string up to, but not including, the first _ character.
However, as @Bill the Lizard wrote, I don't think there is anything wrong with the way you do it now. I would do it the same way.
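Since the question mentions running this inside a loop over an ArrayList, here is a small sketch of the indexOf/substring approach applied to a whole list. The class and method names (PrefixExtractor, beforeDelimiter, prefixes) are made up for illustration, and it assumes that an element without the delimiter should be kept unchanged.

```java
import java.util.ArrayList;
import java.util.List;

final class PrefixExtractor {

    // Returns the text before the first delim, or the whole string if delim is absent.
    static String beforeDelimiter(String source, char delim) {
        if (source == null) {
            return null;
        }
        int index = source.indexOf(delim);
        return index >= 0 ? source.substring(0, index) : source;
    }

    // Applies beforeDelimiter to every element and collects the results in order.
    static List<String> prefixes(List<String> items, char delim) {
        List<String> result = new ArrayList<>(items.size());
        for (String item : items) {
            result.add(beforeDelimiter(item, delim));
        }
        return result;
    }
}
```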

How to convert a string[] into string?

I know how to split a string into a string[]. In my project, I use t = time.split("-") to split a time into a string array t, with t[0]=DD and t[1]=MM. Now I need to convert the string array t back into a string time with the format DD-MM. Does Java have a function for that?
Guava does:
String joined = Joiner.on('-').join(parts);
On the other hand, I'd actually suggest not splitting and joining your string to start with. Instead, parse it into an appropriate date/time type (ideally Joda Time), perform any manipulation you need, and then reformat it using a different format pattern.
This will improve your error detection, and basically make your code really reflect the nature of the data you're working with - instead of just talking about splitting and joining text.
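The answer above recommends Joda Time. As a rough illustration of the same idea using java.time from the standard library instead (a swapped-in API, available since Java 8), the sketch below parses a DD-MM string into a MonthDay and formats it with a different pattern. The class name, the method name and the MM/dd output pattern are assumptions chosen for the example.

```java
import java.time.MonthDay;
import java.time.format.DateTimeFormatter;

final class DayMonthReformat {

    private static final DateTimeFormatter INPUT = DateTimeFormatter.ofPattern("dd-MM");
    private static final DateTimeFormatter OUTPUT = DateTimeFormatter.ofPattern("MM/dd");

    // Parses a day-month string such as "25-12" and re-renders it as month/day.
    static String reformat(String dayMonth) {
        MonthDay value = MonthDay.parse(dayMonth, INPUT);
        return value.format(OUTPUT);
    }
}
```

Input that is not a valid day-month combination makes parse throw an exception rather than passing bad text through, which is the error-detection benefit the answer refers to.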
You can use String Utils from Apache Commons like this:
String res = StringUtils.join(myStrings, "-");
If you are not looking to use external frameworks, you can roll your own, like this:
StringBuilder res = new StringBuilder();
boolean isFirst = true;
for (String s : myStrings) {
if (!isFirst) {
res.append('-');
} else {
isFirst = false;
}
res.append(s);
}
public static String join(String[] arr, String separator)
{
StringBuilder b = new StringBuilder();
for(int i = 0; i < arr.length; i++)
{
if(i != 0) b.append(separator);
b.append(arr[i]);
}
return b.toString();
}
For this simple example, what's wrong with this?
String time = t[0]+"-"+t[1];
Yes, use StringBuilder when performance matters, but this is much more concise.
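On Java 8 or later, none of the hand-written loops or external libraries above are strictly needed, because String.join does the same job. A minimal sketch (the wrapper class and method name are only for illustration):

```java
final class JoinExample {

    // Joins the parts with a dash between them, e.g. {"25", "12"} becomes "25-12".
    static String joinWithDash(String[] parts) {
        return String.join("-", parts);
    }
}
```

This also round-trips with the split call from the question for ordinary DD-MM input.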

StringBuilder vs. String considering replace

When concatenating lots of strings, I have been advised to use a StringBuilder, like this:
StringBuilder someString = new StringBuilder("abc");
someString.append("def");
someString.append("123");
someString.append("moreStuff");
as opposed to
String someString = "abc";
someString = someString + "def";
someString = someString + "123";
someString = someString + "moreStuff";
which would result in the creation of quite a few Strings, as opposed to one.
Now, I need to do a similar thing, but instead of using concatenation I use the replace method of String as such:
String someString = SOME_LARGE_STRING_CONSTANT;
someString = someString.replace("$VARIABLE1", "abc");
someString = someString.replace("$VARIABLE2", "def");
someString = someString.replace("$VARIABLE3", "123");
someString = someString.replace("$VARIABLE4", "moreStuff");
To accomplish the same thing using StringBuilder, I have to do this, just for one replace:
someString.replace(someString.indexOf("$VARIABLE1"), someString.indexOf("$VARIABLE1")+10, "abc");
So my question is: "Is it better to use String.replace and have lots of extra Strings created, or to use StringBuilder still, and have lots of long winded lines such as the one above?"
It is true that StringBuilder tends to be better than concatenating or modifying Strings manually, since StringBuilder is mutable, while String is immutable and you need to create a new String for each modification.
Just to note, though, the Java compiler will automatically convert an example like this:
String result = someString + someOtherString + anotherString;
into something like:
String result = new StringBuilder().append(someString).append(someOtherString).append(anotherString).toString();
That said, unless you're replacing a whole lot of Strings, go for whichever is more readable and more maintainable. So if you can keep it cleaner by having a sequence of 'replace' calls, go ahead and do that over the StringBuilder method. The difference will be negligible compared to the stress you save from dealing with the sad tragedy of micro-optimizations.
PS
For your code sample (which, as OscarRyz pointed out, won't work if you have more than one "$VARIABLE1" in someString, in which case you'll need to use a loop), you could cache the result of the indexOf call in:
someString.replace(someString.indexOf("$VARIABLE1"), someString.indexOf("$VARIABLE1")+10, "abc");
With
int index = someString.indexOf("$VARIABLE1");
someString.replace(index, index+10, "abc");
No need to search the String twice :-)
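Guess what? If you are running Java 1.5+, the compiler already handles concatenation for you.
String h = "hello" + "world";
and
String i = new StringBuilder().append("hello").append("world").toString();
produce the same string. (With two literals the compiler goes further and folds the expression into a single constant at compile time; the StringBuilder form is roughly what it generates when the operands are variables.)
So, the compiler did the work for you already.
Of course better would be:
String j = "helloworld"; // ;)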
As for the second, yes, that's preferred, but it shouldn't be that hard with the power of "search and replace" and a bit of regex-fu.
For instance you can define a method like the one in this sample:
public static void replace( String target, String replacement,
StringBuilder builder ) {
int indexOfTarget = -1;
while( ( indexOfTarget = builder.indexOf( target ) ) >= 0 ) {
builder.replace( indexOfTarget, indexOfTarget + target.length() , replacement );
}
}
And your code currently looks like this:
someString = someString.replace("VARIABLE1", "abc");
someString = someString.replace("VARIABLE2", "xyz");
All you have to do is grab a text editor and run something like this vi search and replace:
%s/^.*("\(.*\)".\s"\(.*\)");/replace("\1","\2",builder);
That reads: "take anything in parentheses that looks like a string literal, and put it in this other string".
And your code will go from this:
someString = someString.replace("VARIABLE1", "abc");
someString = someString.replace("VARIABLE2", "xyz");
to this:
replace( "VARIABLE1", "abc", builder );
replace( "VARIABLE2", "xyz", builder );
In no time.
Here's a working demo:
class DoReplace {
public static void main( String ... args ) {
StringBuilder builder = new StringBuilder(
"LONG CONSTANT WITH VARIABLE1 and VARIABLE2 and VARIABLE1 and VARIABLE2");
replace( "VARIABLE1", "abc", builder );
replace( "VARIABLE2", "xyz", builder );
System.out.println( builder.toString() );
}
public static void replace( String target, String replacement,
StringBuilder builder ) {
int indexOfTarget = -1;
while( ( indexOfTarget = builder.indexOf( target ) ) >= 0 ) {
builder.replace( indexOfTarget, indexOfTarget + target.length() ,
replacement );
}
}
}
I would say go for using StringBuilder but simply write a wrapper that facilitates making the code more readable and thus more maintainable, while still maintaining efficiency. =D
import java.lang.StringBuilder;
public class MyStringBuilder
{
StringBuilder sb;
public MyStringBuilder()
{
sb = new StringBuilder();
}
public void replace(String oldStr, String newStr)
{
int start = -1;
while ((start = sb.indexOf(oldStr)) > -1)
{
int end = start + oldStr.length();
sb.replace(start, end, newStr);
}
}
public void append(String str)
{
sb.append(str);
}
public String toString()
{
return sb.toString();
}
//.... other exposed methods
public static void main(String[] args)
{
MyStringBuilder sb = new MyStringBuilder();
sb.append("old old olD dudely dowrite == pwn");
sb.replace("old", "new");
System.out.println(sb);
}
}
OUTPUT:
new new olD dudely dowrite == pwn
Now you can just use the new version, which is an easy one-liner:
MyStringBuilder mySB = new MyStringBuilder();
mySB.append("old dudley dowrite == pwn");
mySB.replace("old", "new");
Instead of having long lines like that, you could just write a method for replacing parts of StringBuilder strings, something along the lines of this:
public StringBuilder replace(StringBuilder someString, String replaceWhat, String replaceWith) {
return someString.replace(someString.indexOf(replaceWhat), someString.indexOf(replaceWhat)+replaceWhat.length(), replaceWith);
}
Maybe the String class internally uses the indexOf method to find the index of the old string and replace it with the new one.
Also, StringBuilder is not synchronized (not thread-safe), so it runs faster than StringBuffer.
If your string really is large and you're worried about performance I would recommend writing a class which takes your template text and a list of variables, then reads over the source string character by character and builds the result using StringBuilder. That should be the most efficient both in terms of CPU and memory usage. Also, if you are reading this template text from a file I wouldn't load it all into memory up front. Process it in chunks as you read it from the file.
If you're just looking for a nice way to build a string that's not quite as efficient as StringBuilder but more efficient than appending strings over and over you can use String.format(). It works like sprintf() in C. MessageFormat.format() is an option too but it uses StringBuffer.
There is another related question here: Inserting a Java string in another string without concatenation?
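The single-pass idea described above (read the template once and build the result in a StringBuilder) can be sketched as follows. This is only an illustration under stated assumptions: variables are taken to start with '$' followed by letters or digits, as in the question's $VARIABLE1, and TemplateFiller and fill are hypothetical names. Unknown variables are copied through unchanged.

```java
import java.util.Map;

final class TemplateFiller {

    // Scans the template once, copying text and substituting known $NAME variables.
    static String fill(String template, Map<String, String> variables) {
        StringBuilder out = new StringBuilder(template.length());
        int i = 0;
        while (i < template.length()) {
            char c = template.charAt(i);
            if (c == '$') {
                // Find the end of the candidate variable name.
                int end = i + 1;
                while (end < template.length()
                        && Character.isLetterOrDigit(template.charAt(end))) {
                    end++;
                }
                String value = variables.get(template.substring(i, end));
                if (value != null) {
                    out.append(value);
                    i = end; // skip past the variable name
                    continue;
                }
            }
            out.append(c); // ordinary character or unknown variable
            i++;
        }
        return out.toString();
    }
}
```

Because the output is never rescanned, a replacement value that contains a variable name is not expanded again, which differs from a chain of replace calls and also rules out the endless-loop case.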
All of these solutions have a bug. Try yourReplace("x","xy"): it will loop infinitely.
Jam Hong is correct - the above solutions all contain the potential to loop infinitely. I guess the lesson to take away here is that micro optimisations can often cause all sorts of horrible issues and don't really save you much. Still, be that as it may - here is a solution that will not infinite loop.
private static void replaceAll(StringBuilder builder, String replaceWhat, String replaceWith){
// An empty search string would match at every position, so there is nothing sensible to do.
if(replaceWhat.isEmpty()) return;
int occurrenceIndex = builder.indexOf(replaceWhat);
while(occurrenceIndex >= 0){
builder.replace(occurrenceIndex, occurrenceIndex+replaceWhat.length(), replaceWith);
// Resume the search after the inserted text, so the replacement itself is never
// rescanned and later occurrences are still found.
occurrenceIndex = builder.indexOf(replaceWhat, occurrenceIndex+replaceWith.length());
}
}
While it's true that micro-optimizations can be problematic, it sometimes depends on the context. For instance, if your replace happens to run inside a loop with 10000 iterations, you will see a significant performance difference from the "useless" optimizations.
In most cases, however, it's best to err on the side of readability.

Convert single char in String to lower case

I like to 'guess' attribute names from getter methods. So 'getSomeAttribute' shall be converted to 'someAttribute'.
Usually I do something like
String attributeName = Character.toLowerCase(methodName.charAt(3))
+ methodName.substring(4);
Pretty ugly, right? I usually hide it in a method, but does anybody know a better solution?
The uncapitalize method of Commons Lang shall help you, but I don't think your solution is so crude.
Have a look at the JavaBeans API:
BeanInfo info = Introspector.getBeanInfo(bean
.getClass(), Object.class);
for (PropertyDescriptor propertyDesc : info
.getPropertyDescriptors()) {
String name = propertyDesc.getName();
}
Also see decapitalize.
uncapitalize from commons lang would do it:
String attributeName = StringUtils.uncapitalize(methodName.substring(3));
I use Commons Lang a lot, but if you don't want that extra jar, you could copy the method. As you can see, it does the same thing you do:
public static String uncapitalize(String str) {
int strLen;
if (str == null || (strLen = str.length()) == 0) {
return str;
}
return new StringBuffer(strLen)
.append(Character.toLowerCase(str.charAt(0)))
.append(str.substring(1))
.toString();
}
It's worth remembering that:
not all getXXX methods are getters e.g. double getSqrt(double x), void getup().
methods which return boolean, start with is and don't take an argument can be a getter, e.g. boolean isActive().
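A name-based guess that takes the caveats above partly into account could look like the sketch below. It uses java.beans.Introspector.decapitalize from the JDK, accepts both get and is prefixes, and requires an upper-case letter after the prefix so that a name like getup is rejected. GetterNames and attributeName are hypothetical names, and since only the name is inspected, a method such as getSqrt(double x) would still be accepted; ruling that out needs the method's parameter list via reflection.

```java
import java.beans.Introspector;

final class GetterNames {

    // Returns the guessed attribute name, or null if the name does not look like a getter.
    static String attributeName(String methodName) {
        String rest = stripPrefix(methodName, "get");
        if (rest == null) {
            rest = stripPrefix(methodName, "is");
        }
        return rest == null ? null : Introspector.decapitalize(rest);
    }

    // Returns the part after prefix if it starts with an upper-case letter, else null.
    private static String stripPrefix(String name, String prefix) {
        if (name.startsWith(prefix)
                && name.length() > prefix.length()
                && Character.isUpperCase(name.charAt(prefix.length()))) {
            return name.substring(prefix.length());
        }
        return null;
    }
}
```

Note that decapitalize follows the JavaBeans convention of leaving names such as URL untouched when the first two characters are both upper case.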
Given a character buffer, you can apply the below code:
int i = 0;
for(char x : buffer) {
buffer[i] = Character.toLowerCase(x);
i++;
}
Tested and functions :)
Looks fine to me. Yes, it looks verbose, but consider what you're trying to do, and what another programmer would think if they were trying to understand what this code is trying to do. If anything, I'd make it longer, by adding what you're doing (guessing attribute names from getter methods) as a comment.

Avoiding multiple If statements in Java

I've coded a method something like this, but I guess it should be refactored.
Can anyone suggest the best approach to avoid these multiple if statements?
private String getMimeType(String fileName){
if(fileName == null) {
return "";
}
if(fileName.endsWith(".pdf")) {
return "application/pdf";
}
if(fileName.endsWith(".doc")) {
return "application/msword";
}
if(fileName.endsWith(".xls")) {
return "application/vnd.ms-excel";
}
if(fileName.endsWith(".xlw")) {
return "application/vnd.ms-excel";
}
if(fileName.endsWith(".ppt")) {
return "application/vnd.ms-powerpoint";
}
if(fileName.endsWith(".mdb")) {
return "application/x-msaccess";
}
if(fileName.endsWith(".rtf")) {
return "application/rtf";
}
if(fileName.endsWith(".txt")) {
return "text/plain";
}
if(fileName.endsWith(".htm") || fileName.endsWith(".html")) {
return "text/html";
}
return "text/plain";
}
I cannot use switch-case here as my 'condition' is a java.lang.String.
You can use a Map to hold your solutions:
Map<String,String> extensionToMimeType = new HashMap<String,String>();
extensionToMimeType.put("pdf", "application/pdf");
extensionToMimeType.put("doc", "application/msword");
// and the rest
int lastDot = fileName.lastIndexOf(".");
String mimeType;
if (lastDot == -1) {
mimeType = NO_EXTENSION_MIME_TYPE;
} else {
String extension = fileName.substring(lastDot+1);
mimeType = extensionToMimeType.getOrDefault(extension,
UNKNOWN_EXTENSION_MIME_TYPE);
}
For this code to work you'll need to define NO_EXTENSION_MIME_TYPE and UNKNOWN_EXTENSION_MIME_TYPE as constants in your class, somewhat like this:
private static final String NO_EXTENSION_MIME_TYPE = "application/octet-stream";
private static final String UNKNOWN_EXTENSION_MIME_TYPE = "text/plain";
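Put together, the map-based fragments above might look like the following self-contained sketch. The class name MimeTypes is made up, only a few of the question's extensions are registered, and lower-casing the extension before the lookup is an added assumption so that names like REPORT.PDF also match.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class MimeTypes {

    private static final String NO_EXTENSION_MIME_TYPE = "application/octet-stream";
    private static final String UNKNOWN_EXTENSION_MIME_TYPE = "text/plain";

    private static final Map<String, String> EXTENSION_TO_MIME_TYPE = new HashMap<>();
    static {
        EXTENSION_TO_MIME_TYPE.put("pdf", "application/pdf");
        EXTENSION_TO_MIME_TYPE.put("doc", "application/msword");
        EXTENSION_TO_MIME_TYPE.put("xls", "application/vnd.ms-excel");
        EXTENSION_TO_MIME_TYPE.put("htm", "text/html");
        EXTENSION_TO_MIME_TYPE.put("html", "text/html");
        // and the rest
    }

    // Looks up the MIME type by the text after the last dot in the file name.
    static String getMimeType(String fileName) {
        if (fileName == null) {
            return ""; // same behaviour as the original method
        }
        int lastDot = fileName.lastIndexOf('.');
        if (lastDot == -1) {
            return NO_EXTENSION_MIME_TYPE;
        }
        // Assumption: extensions are matched case-insensitively.
        String extension = fileName.substring(lastDot + 1).toLowerCase(Locale.ROOT);
        return EXTENSION_TO_MIME_TYPE.getOrDefault(extension, UNKNOWN_EXTENSION_MIME_TYPE);
    }
}
```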
Using a HashMap perhaps?
This way you could do myMap.get(mystr);
The Command pattern is the way to go. Here is one example using Java 8:
1. Define the interface:
public interface ExtensionHandler {
boolean isMatched(String fileName);
String handle(String fileName);
}
2. Implement the interface with each of the extension:
public class PdfHandler implements ExtensionHandler {
@Override
public boolean isMatched(String fileName) {
return fileName.endsWith(".pdf");
}
@Override
public String handle(String fileName) {
return "application/pdf";
}
}
and
public class TxtHandler implements ExtensionHandler {
@Override public boolean isMatched(String fileName) {
return fileName.endsWith(".txt");
}
@Override public String handle(String fileName) {
return "text/plain";
}
}
and so on .....
3. Define the Client:
public class MimeTypeGetter {
private List<ExtensionHandler> extensionHandlers;
private ExtensionHandler plainTextHandler;
public MimeTypeGetter() {
extensionHandlers = new ArrayList<>();
extensionHandlers.add(new PdfHandler());
extensionHandlers.add(new DocHandler());
extensionHandlers.add(new XlsHandler());
// and so on
plainTextHandler = new PlainTextHandler();
extensionHandlers.add(plainTextHandler);
}
public String getMimeType(String fileExtension) {
return extensionHandlers.stream()
.filter(handler -> handler.isMatched(fileExtension))
.findFirst()
.orElse(plainTextHandler)
.handle(fileExtension);
}
}
4. And this is the sample result:
public static void main(String[] args) {
MimeTypeGetter mimeTypeGetter = new MimeTypeGetter();
System.out.println(mimeTypeGetter.getMimeType("test.pdf")); // application/pdf
System.out.println(mimeTypeGetter.getMimeType("hello.txt")); // text/plain
System.out.println(mimeTypeGetter.getMimeType("my presentation.ppt")); // "application/vnd.ms-powerpoint"
}
Personally I don't have problems with the if statements. The code is readable; it took just milliseconds to understand what you're doing. It's a private method anyway, and if the list of mime types is static then there's no urgent need to move the mapping to a properties file and use a lookup table (map). A map would reduce the lines of code, but to understand it you would then have to read both the code and the implementation of the mapping, whether that is a static initializer or an external file.
You could change the code a bit and use an enum:
private enum FileExtension { NONE, DEFAULT, PDF, DOC, XLS /* ... */ }
private String getMimeType(String fileName){
String mimeType = null;
FileExtension fileNameExtension = getFileNameExtension(fileName);
switch(fileNameExtension) {
case NONE:
return "";
case PDF:
return "application/pdf";
// ...
case DEFAULT:
return "text/plain";
}
throw new RuntimeException("Unhandled FileExtension detected");
}
The getFileNameExtension(String fileName) method will just return the fitting enum value for the fileName, FileExtension.NONE if fileName is empty (or null?) and FileExtension.DEFAULT if the file extension is not mapped to a mime type.
What about using a MIME detection library instead?
mime-util
mime4j
JMimeMagic library - Free. Uses file extension and magic headers to determine MIME type.
mime-util - Free. Uses file extension and magic headers to determine MIME type.
DROID (Digital Record Object Identification) - Free. Uses batch automation to detect MIME types.
Aperture Framework - Free. A framework for crawling external sources to identify MIME types.
(feel free to add more, there so many libraries..)
I consider your approach to be the best overall. This comes after having tested with a number of different approaches myself.
I see a number of huge benefits in your current approach, namely:
Easily readable and understandable by anyone (in my experience, medium-level programmers often underestimate this and usually prefer going with fancy-patterns which, in the end are not readable at all for the vast majority of programmers who do not know that specific pattern)
All the information is in one single place. As Andreas_D pointed out, hunting around files or classes is not a good option for someone that needs to fix a bug while you are on holiday!
Easily maintainable: I could "F3" (if you are Eclipse-ing) on the method and add a new content type in seconds without any worries of introducing bugs!
I can suggest a few things anyway:
This method is very general purpose:
Why should it be private?! This is a public method of some utility/helper class!
Moreover it should be a static method!! You don't need anything from the Object itself to perform your job!
You could use indenting to make things prettier and compact. I know that indenting is some kind of religion for the most of us, but I think it should not be a strict rule; it should be properly used to make our code more readable and compact. If this would be a config file you would probably have something like:
pdf=application/pdf
doc=application/msword
You could have a very similar result with:
public static String getMimeType(String fileName){
if(fileName == null) return "";
if(fileName.endsWith(".pdf")) return "application/pdf";
if(fileName.endsWith(".doc")) return "application/msword";
if(fileName.endsWith(".xls")) return "application/vnd.ms-excel";
return "text/plain";
}
This is also what a lot of the Map based implementations look like.
There is no way to evade that in general. In your case - if there is a set of allowed extensions - you could create an Enum, convert the extension to the Enum type via valueOf(), and then you can switch over your enum.
Easiest and shortest way for this particular problem would be using the builtin Java SE or EE methods.
Either in "plain vanilla" client application (which derives this information from the underlying platform):
String mimeType = URLConnection.guessContentTypeFromName(filename);
Or in a JSP/Servlet web application (which derives this information from the web.xml files):
String mimeType = getServletContext().getMimeType(filename);
I would do this by putting the associations in a map, and then using the map for lookup:
Map<String, String> map = new HashMap<String, String>();
map.put(".pdf", "application/pdf");
map.put(".doc", "application/msword");
// ... etc.
// For lookup:
private String getMimeType(String fileName) {
if (fileName == null || fileName.length() < 4) {
return null;
}
return map.get(fileName.substring(fileName.length() - 4));
}
Note that switch statements on strings have been supported since Java 7, so on Java 7 or later the lookup could also be written like this:
switch (fileName.substring(fileName.length() - 4)) {
case ".pdf": return "application/pdf";
case ".doc": return "application/msword";
// ...
default: return null;
}
(edit: My solution assumes the file extension is always 3 letters; you'd have to change it slightly if it can be longer or shorter).
You can always use a Groovy class here as it allows for switch-case on Strings :)
Create an enum called MimeType with 2 String variables: extension and type. Create an appropriate constructor and pass in the ".xxx" and the "application/xxx" values. Create a method to do the lookup. You can use enums in switch.
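The enum described above could be sketched roughly like this. Only a handful of the question's types are listed, the lookup helper and its fallback parameter are hypothetical, and matching is done with endsWith so extensions of any length work.

```java
enum MimeType {
    PDF(".pdf", "application/pdf"),
    DOC(".doc", "application/msword"),
    XLS(".xls", "application/vnd.ms-excel"),
    HTML(".html", "text/html");

    private final String extension;
    private final String type;

    MimeType(String extension, String type) {
        this.extension = extension;
        this.type = type;
    }

    String type() {
        return type;
    }

    // Returns the type of the first constant whose extension ends the file name,
    // or fallback if the name is null or no constant matches.
    static String lookup(String fileName, String fallback) {
        if (fileName == null) {
            return fallback;
        }
        for (MimeType candidate : values()) {
            if (fileName.endsWith(candidate.extension)) {
                return candidate.type;
            }
        }
        return fallback;
    }
}
```

Adding a new type is then a one-line change to the constant list, and the constants can still be used in a switch where that is useful.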
Just to mention it: A direct equivalent to your code would not be using a map for direct lookup (since that would require each extension to have exactly 3 characters) but a for loop:
...
Map<String, String> extmap = GetExtensionMap();
for (Map.Entry<String,String> entry: extmap.entrySet())
if (fileName.endsWith(entry.getKey()))
return entry.getValue();
...
This solution works with extensions of any length but is less performant than the hash lookup of course (and slightly less performant than the original solution)
The Algorithmic-Design-Guy solution
A more performant way would be to implement a tree structure starting with the last character of the extension and storing the appropriate MIME types at the respective nodes.
You could then walk down the tree starting with the last character of the file name. But this is probably overkill ...
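For completeness, the tree idea can be sketched as a small trie keyed on the characters of the extension in reverse order. SuffixTrie, put and lookup are illustrative names, not an existing API, and the sketch returns the longest registered suffix that matches, falling back to a caller-supplied default.

```java
import java.util.HashMap;
import java.util.Map;

final class SuffixTrie {

    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        String mimeType; // non-null only where a registered extension is complete
    }

    private final Node root = new Node();

    // Registers an extension such as ".pdf", inserting its characters last to first.
    void put(String extension, String mimeType) {
        Node node = root;
        for (int i = extension.length() - 1; i >= 0; i--) {
            node = node.children.computeIfAbsent(extension.charAt(i), c -> new Node());
        }
        node.mimeType = mimeType;
    }

    // Walks the file name from its last character and keeps the longest match found.
    String lookup(String fileName, String fallback) {
        Node node = root;
        String best = null;
        for (int i = fileName.length() - 1; i >= 0; i--) {
            node = node.children.get(fileName.charAt(i));
            if (node == null) {
                break; // no registered extension continues with this character
            }
            if (node.mimeType != null) {
                best = node.mimeType;
            }
        }
        return best != null ? best : fallback;
    }
}
```

The lookup cost depends only on the length of the longest extension, not on the number of registered types, but for a dozen entries the map or the plain if chain is likely just as good.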
How about mapping the extensions to MIME types, then using a loop? Something like:
Map<String,String> suffixMappings = new HashMap<String,String>();
suffixMappings.put(".pdf", "application/pdf");
...
private String getMimeType(String fileName){
if (fileName == null) {
return "";
}
String suffix = fileName.substring(fileName.lastIndexOf('.'));
// If fileName might not have extension, check for that above!
String mimeType = suffixMappings.get(suffix);
return mimeType == null ? "text/plain" : mimeType;
}
