I am trying to replace contractions (short forms) in a string with their full forms in Java, but I don't know a good way to do it because there can be many of them ('ve, 're, n't and so on). Is it a good idea to use an ArrayList, and if so, how do I do that?
What I have tried so far:
public class main {
public static void main(String[] args) {
String s = "We've been doing this for ages. I'm having a difficulty doing this. Thats getting confusing.";
s = s.replaceAll("we've", "we have");
s = s.replaceAll("I'm", "I am");
s = s.replaceAll("that's", "that is");
}
}
Thanks!
You can do it more efficiently using regexes.
First, build a map containing your searches and replacements.
Map<String, String> replacements =
Map.of("we've", "we have", "I'm", "I am" /* etc */);
(or some pre-Java 9 equivalent)
Now, build a regex to match the things you want to replace:
Pattern p = Pattern.compile(
replacements.keySet()
.stream()
.map(Pattern::quote)
.collect(Collectors.joining("|")));
Now, create a Matcher and a StringBuffer in which to accumulate your new string (Matcher.appendReplacement only accepts a StringBuilder from Java 9 onwards):
Matcher m = p.matcher(s);
StringBuffer sb = new StringBuffer();
while (m.find()) {
String replacement = replacements.get(m.group(0));
// quoteReplacement keeps any '$' or '\' in the replacement literal
m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
}
m.appendTail(sb);
String newS = sb.toString();
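Putting the steps above together, here is a minimal self-contained sketch. The class name ContractionExpander and the method expand are made up for illustration. The sketch also assumes that matching should be case-insensitive (so that "We've" is found by the key "we've") and that the capital first letter of a match should be kept, which the snippets above do not do.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ContractionExpander {
    // Keys are stored in lower case so a case-insensitive match can be looked up.
    private static final Map<String, String> REPLACEMENTS = new LinkedHashMap<>();
    static {
        REPLACEMENTS.put("we've", "we have");
        REPLACEMENTS.put("i'm", "I am");
        REPLACEMENTS.put("that's", "that is");
    }

    // One alternation of all (quoted) keys, compiled once.
    private static final Pattern PATTERN = Pattern.compile(
            REPLACEMENTS.keySet().stream()
                    .map(Pattern::quote)
                    .collect(Collectors.joining("|")),
            Pattern.CASE_INSENSITIVE);

    static String expand(String input) {
        Matcher m = PATTERN.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String found = m.group();
            String replacement = REPLACEMENTS.get(found.toLowerCase(Locale.ROOT));
            // Assumption: keep the capitalization of the first letter of the match.
            if (Character.isUpperCase(found.charAt(0))) {
                replacement = Character.toUpperCase(replacement.charAt(0)) + replacement.substring(1);
            }
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("We've been doing this for ages. I'm having a difficulty doing this. That's getting confusing."));
    }
}
```

The pattern is compiled once and reused, so calling expand on many strings only pays for the scan of each string.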
You could use a Map<String, String>, for example a HashMap where the keys would be the short form to replace and the value the replacement string. Then you could just iterate Map.entrySet() and call the replace method on the string.
The code could be as follows (please note that I have omitted the initial letter, except for I, which has to be upper case, to avoid the capitalization problem):
String str = s;
HashMap<String, String> replacements = new HashMap<>();
replacements.put("e've", "e have");
replacements.put("I'm", "I am");
replacements.put("hat's", "hat is");
for (Map.Entry<String, String> entry: replacements.entrySet()) {
str = str.replaceAll(entry.getKey(), entry.getValue());
}
It does not really make sense if it is intended to be used only once, but it could be the base of a method that could be re-used on many strings.
You can use a StringBuilder if you don't want to keep creating new strings all the time:
StringBuilder builder = new StringBuilder("We've been doing this for ages. I'm having a difficulty doing this. That's getting confusing.");
HashMap<String, String> replacements = new HashMap<>();
replacements.put("'ve", " have");
replacements.put("'m", " am");
replacements.put("'s", " is");
// others...
for (Map.Entry<String, String> entry: replacements.entrySet()) {
int index;
while ((index = builder.indexOf(entry.getKey())) != -1) {
builder.replace(index, index + entry.getKey().length(), entry.getValue());
}
}
System.out.println(builder);
Do note that if you are trying to replace all contractions like this, you are unlikely to succeed 100%, as some phrases contract to the same contraction, for example:
That has -> That's
That is -> That's
Also note that some 's don't indicate a contraction:
Mary's <-- how are you handling this?
You can kind of tackle the second problem by looking for more specific sequences like That's instead of just 's, but for the first problem, you will need to somehow understand the context.
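As a rough sketch of the "more specific sequences" idea, the example below only expands 's after a small whitelist of words and leaves everything else (such as Mary's) untouched. The class name SpecificContractions and the whitelist are assumptions made for illustration, and the sketch always expands to "is", so it is still wrong for "that has".

```java
import java.util.regex.Pattern;

class SpecificContractions {
    // Hypothetical whitelist: only these whole words followed by 's are expanded.
    private static final Pattern IS_CONTRACTION =
            Pattern.compile("\\b(that|it|he|she)'s\\b", Pattern.CASE_INSENSITIVE);

    static String expand(String input) {
        // $1 re-inserts the matched word with its original capitalization.
        return IS_CONTRACTION.matcher(input).replaceAll("$1 is");
    }

    public static void main(String[] args) {
        System.out.println(expand("That's Mary's book, and it's fine."));
    }
}
```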
Related
I have this mathematical expression:
String exp = "k+(mP/P+mL/L)";
Then I create a new HashMap and put in exactly the same parameters as in the expression above:
Map<String, Integer> mp = new HashMap<>();
mp.put("k", 1);
mp.put("mP", 2);
mp.put("P", 3);
mp.put("mL", 4);
mp.put("L", 5);
Finally, I iterate over the entry set, replace each parameter in my expression with its value, and then print the result:
for(Map.Entry<String,Integer> e: mp.entrySet()){
exp = exp.replace(e.getKey(),e.getValue().toString());
}
System.out.println(exp);
The result of the above is: "1+(m3/3+m5/5)"
But I want this instead: "1+(2/3+4/5)"
Is there any way to do this?
Use regular expression replaceAll with word-boundary \b.
// Pattern.quote keeps keys that contain regex metacharacters literal
exp = exp.replaceAll("\\b" + Pattern.quote(e.getKey()) + "\\b", e.getValue().toString());
You could also look into the scripting API or the java REPL.
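The word-boundary replacement can be sketched as a complete program. The class name ExpressionSubstitution and the method substitute are hypothetical. The sketch assumes that keys are plain identifiers and values are integers, so a substituted value can never be mistaken for another key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExpressionSubstitution {
    static String substitute(String expression, Map<String, Integer> values) {
        String result = expression;
        for (Map.Entry<String, Integer> e : values.entrySet()) {
            // \b on both sides means "P" matches only as a whole word, not inside "mP".
            String regex = "\\b" + Pattern.quote(e.getKey()) + "\\b";
            result = result.replaceAll(regex, Matcher.quoteReplacement(e.getValue().toString()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> mp = new HashMap<>();
        mp.put("k", 1);
        mp.put("mP", 2);
        mp.put("P", 3);
        mp.put("mL", 4);
        mp.put("L", 5);
        System.out.println(substitute("k+(mP/P+mL/L)", mp)); // 1+(2/3+4/5)
    }
}
```

Because each key is matched as a whole word, the iteration order of the map no longer matters here.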
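The problem is caused by HashMap not preserving insertion order, so P may be replaced before mP. Using a LinkedHashMap, which iterates in insertion order, solves the problem as long as the longer keys (mP, mL) are inserted before the keys they contain (P, L), as they are in the question.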
Map<String, Integer> mp = new LinkedHashMap<>();
String.replace() replaces all occurrences of the substring, so if P gets replaced before mP, it can lead to problems like the one you described. To avoid that, you can build the expression with placeholders that have no letters in common. For example:
String exp = "a+(b/c+d/e)";
You can also sort the keys by their length:
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
public class Main {
    public static void main(String[] args) {
        String exp = "k+(mP/P+mL/L)";
        Map<String, Integer> mp = new HashMap<>();
        mp.put("mP", 2);
        mp.put("mL", 4);
        mp.put("k", 1);
        mp.put("P", 3);
        mp.put("L", 5);
        // Copy the keys into an array so they can be sorted
        String[] keys = mp.keySet().toArray(new String[0]);
        // Sort by string length, longest first, so "mP" is replaced before "P"
        Arrays.sort(keys, (a, b) -> Integer.compare(b.length(), a.length()));
        for (String k : keys) {
            exp = exp.replace(k, mp.get(k).toString());
        }
        System.out.println(exp); // 1+(2/3+4/5)
    }
}
Let's say I have a String text = "abc" and I want to apply a map of replacements, e.g.:
a->b
b->c
c->a
How would you go about it?
Because obviously:
map.entrySet().forEach(el -> text = text.replaceAll(el.getKey(), el.getValue()))
won't work, since the second replacement also overwrites the result of the first (so in the end you won't get bca)
So how would you avoid this "replacement of the previous replacement"?
I saw this answer, but I am hoping for a more concise and naive solution (ideally without external Apache packages).
By the way, the strings being replaced can also be longer than one character.
I came up with this solution using Java streams. Note that it splits the text into single characters, so it only works for single-character keys.
String text = "abc";
Map<String, String> replaceMap = new HashMap<>();
replaceMap.put("a", "b");
replaceMap.put("b", "c");
replaceMap.put("c", "a");
System.out.println("Text = " + text);
text = Arrays.stream(text.split("")).map(x -> {
String replacement = replaceMap.get(x);
if (replacement != null) {
return replacement;
} else {
return x;
}
}).collect(Collectors.joining(""));
System.out.println("Processed Text = " + text);
Output
Text = abc
Processed Text = bca
This is a problem I'd normally handle with regex replacement. The code for that in Java is a bit verbose, but this should work:
String text = "abc";
Map<String, String> map = new HashMap<>();
map.put("a", "b");
map.put("b", "c");
map.put("c", "a");
String regex = map.keySet()
.stream()
.map(s -> Pattern.quote(s))
.collect(Collectors.joining("|"));
String output = Pattern.compile(regex)
.matcher(text)
.replaceAll((m) -> {
String s = m.group();
String r = map.get(s);
return r != null ? r : s;
});
System.out.println(output);
// bca
It's relatively straightforward, if a little verbose because Java. First, create a regular expression that will accept any of the keys in the map (using Pattern.quote() to sanitize them), and then use a lambda replacement to pluck the appropriate replacement from the map whenever an instance is found.
The performance-intensive part is just compiling the regex in the first place; the replacement itself should make only one pass through the string.
This should be compatible with Java 9+, since Matcher.replaceAll(Function) was added in Java 9.
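Since the question mentions that keys can be longer than one character, here is a hedged variant of the same single-pass idea. The class name SinglePassReplacer is made up for illustration. Two things differ from the code above: the keys are sorted longest first, so that a longer key wins over a shorter key that is its prefix, and the loop uses appendReplacement, which also works on Java 8.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SinglePassReplacer {
    static String replaceAll(String text, Map<String, String> map) {
        if (map.isEmpty()) {
            return text; // an empty alternation would match the empty string everywhere
        }
        // Longest keys first: regex alternation tries the alternatives left to right.
        String regex = map.keySet().stream()
                .sorted(Comparator.comparingInt(String::length).reversed())
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Matcher m = Pattern.compile(regex).matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Every match is one of the keys, so the lookup cannot return null.
            m.appendReplacement(sb, Matcher.quoteReplacement(map.get(m.group())));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("a", "b");
        map.put("b", "c");
        map.put("c", "a");
        System.out.println(replaceAll("abc", map)); // bca
    }
}
```

Replaced text is never scanned again, which is what avoids the "replacement of the previous replacement".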
Since Java 8, String has a chars() method that returns an IntStream of the string's characters as int values. You can cast each value back to char and look it up in your map. Note that this only handles single-character keys.
If your map is a String-to-String map, you could use:
text = text.chars()
.mapToObj(el -> String.valueOf((char) el))
.map(ch -> map.getOrDefault(ch, ch)) // keep characters that have no mapping
.collect(Collectors.joining());
If your map is Character to Character, look the char up directly and convert the result to a String before joining (Collectors.joining only accepts CharSequence elements):
text = text.chars()
.mapToObj(el -> String.valueOf(map.getOrDefault((char) el, (char) el)))
.collect(Collectors.joining());
I have an ArrayList from recursively crawling through a directory
[project1_john_document1, project1_john_document2, project2_jose_document1, project2_jose_document2, project3_juan_document1, ...... ]
I am trying to count the instances for each project to get the following output
project1 = 3,
project2 = 2,
project3 = 1, ....
What I have done is iterate through the list, but I am stuck on how to treat "project1" as a common project, as there are numerous project names in the directory. I tried splitting the string using split("_"), but since I am a complete beginner, I couldn't work out the logic for classifying the different project names.
I'm new to Java, and sorry for the vague phrasing of my question.
You can use a regex to get all the project names, then count them with a Map, for example:
String str = "[project1_john_document1, project1_john_document2, project2_jose_document1, project2_jose_document2, project3_juan_document1]";
Pattern p = Pattern.compile("project\\d+");
Matcher m = p.matcher(str);
Map<String, Integer> map = new HashMap<>();
String project;
while (m.find()) {
project = m.group();
if (map.containsKey(project)) {
map.put(project, map.get(project) + 1);
} else {
map.put(project, 1);
}
}
for (Map.Entry<String, Integer> entry : map.entrySet()) {
System.out.println(entry.getKey() + "\t" + entry.getValue());
}
Outputs
project2 2
project1 2
project3 1
If the pattern is simple, with the project name always before the first "_", this will do the job:
Map<String, Long> projectNumbers = Arrays.asList("project1_john_document1", "project1_john_document2", "project2_jose_document1", "project2_jose_document2", "project3_juan_document1")
.stream().map(s -> s.split("_")[0])
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
If you want to get the count of how many project1, project2 and so on folders you have, you could achieve it with the following code:
String[] names = {"project1_john_document1", "project1_john_document2", "project2_jose_document1", "project2_jose_document2", "project3_juan_document1"};
Map<String, Integer> counts = new HashMap<>();
for (String entry : names) {
String project = entry.split("_")[0];
int count = counts.containsKey(project) ? counts.get(project) : 0;
counts.put(project, count + 1);
}
System.out.println(counts);
// prints: {project2=2, project1=2, project3=1}
As the other answers mentioned, you could use regex, streams, etc. to do similar things. But the basic logic is the same: for each folder, get the root name, increment the counter in a map. If you're a beginner, I'd probably get my head around the most basic flow first before diving into slightly more complex things (e.g., streams).
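The basic flow described above (get the root name, increment a counter in a map) can also be sketched with Map.merge, which folds the "contains key?" check into a single call. The class name ProjectCounter is hypothetical, and the sketch assumes the project name is everything before the first underscore. A TreeMap is used only so that the output is sorted by project name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ProjectCounter {
    static Map<String, Integer> countByProject(List<String> names) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : names) {
            int underscore = name.indexOf('_');
            // Assumption: a name without an underscore counts as its own project.
            String project = underscore < 0 ? name : name.substring(0, underscore);
            // Insert 1 for a new project, otherwise add 1 to the existing count.
            counts.merge(project, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "project1_john_document1", "project1_john_document2",
                "project2_jose_document1", "project2_jose_document2",
                "project3_juan_document1");
        System.out.println(countByProject(names)); // {project1=2, project2=2, project3=1}
    }
}
```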
Define a custom Function that converts your string names into something simpler to compare, then group and count by that:
List<String> myL = Arrays.asList("project1_john_document1", "project1_john_document2",
"project2_jose_document1", "project2_jose_document2", "project3_juan_document1");
Function<String, String> myFRegex = t -> {
return t.substring(0, t.indexOf("_"));
};
Map<String, Long> primeFactorCount = myL.stream()
.collect(Collectors.groupingBy(myFRegex, Collectors.counting()));
System.out.println(primeFactorCount);
the output can look like:
{project2=2, project1=2, project3=1}
This is the "Longest common substring problem". You can find an algorithm (in pseudo code) on the following article:
https://en.wikipedia.org/wiki/Longest_common_substring_problem
I have a TreeMap:
TreeMap<String, Integer> map = new TreeMap<String, Integer>();
It counts words that are put in, for example if I insert:
"Hi hello hi"
It prints:
{Hi=2, Hello=1}
I want to replace that "," with a "\n", but I did not understand the methods in Java library. Is this possible to do? And is it possible to convert the whole TreeMap to a String?
When you print the map to System.out, it uses the map's toString method to produce the text shown on the console.
You could either string replace the comma with a newline like this:
String stringRepresentation = map.toString().replace(", ", "\n");
This might, however, cause problems when a key in the map contains a comma followed by a space.
Or you could create a function to produce the desired string format:
public String mapToMyString(Map<String, Integer> map) {
StringBuilder builder = new StringBuilder("{");
for (Map.Entry<String, Integer> entry : map.entrySet()) {
builder.append(entry.getKey()).append('=').append(entry.getValue()).append('\n');
}
builder.append('}');
return builder.toString();
}
String stringRepresentation = mapToMyString(map);
Guava has a lot of useful methods. Look at Joiner.MapJoiner
Joiner.MapJoiner joiner = Joiner.on('\n').withKeyValueSeparator("=");
System.out.println(joiner.join(map));
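If adding Guava is not an option, roughly the same join can be sketched with the JDK's own streams. The class name MapPrinter and the method toLines are made up for illustration. Unlike map.toString(), the result has no surrounding braces.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class MapPrinter {
    static String toLines(Map<String, Integer> map) {
        return map.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue()) // one "key=value" per entry
                .collect(Collectors.joining("\n"));        // newline between entries only
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new TreeMap<>();
        map.put("Hi", 2);
        map.put("hello", 1);
        System.out.println(toLines(map));
    }
}
```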
Is there a pre-written method to replace dollar-sign name variables in a string with a predefined constant?
For example, the following code :
Map<String, Object> myVars = new TreeMap<String, Object>();
String str = "The current year is ${currentYear}.";
myVars.put("currentYear", "2014");
System.out.println(Replacer.replaceVars(str, myVars));
... would have this output:
The current year is 2014.
Spring does this too if you need to support more advanced use cases. I was able to utilize code from the following class for my use cases. See the parseStringValue method.
https://github.com/spring-projects/spring-framework/blob/master/spring-core/src/main/java/org/springframework/util/PropertyPlaceholderHelper.java
In your case, you need to pass in a PlaceholderResolver that uses your Map to resolve the placeholder.
Have a look at the MessageFormat class. Your example would look like this:
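// Pass the year as a String: a numeric argument would be formatted as a number (e.g. 2,014 in an English locale)
String currentYear = "2014";
String str = "The current year is {0}.";
str = MessageFormat.format(str, currentYear);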
This is probably the best version, but you could always use a regex as well:
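public String format(String input, Map<String, String> replacement) {
for (String key : replacement.keySet()) {
// Quote both sides so that "${", "}" and any '$' in the value are taken literally
input = input.replaceAll(Pattern.quote("${" + key + "}"), Matcher.quoteReplacement(replacement.get(key)));
}
return input;
}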
I am not aware of any standard Java class that supports this behaviour, but writing your own tool wouldn't be so hard. Here is an example of such a solution:
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class Replacer {
private static final Pattern pattern = Pattern.compile("\\$\\{(?<key>[^}]*)\\}");
public static String replaceVars(String format, Map<String, ?> map) {
StringBuffer sb = new StringBuffer();
Matcher m = pattern.matcher(format);
while (m.find()) {
String key = m.group("key");
// quoteReplacement is needed because '$' and '\' are special in replacement strings
if (map.containsKey(key)) { // replace if the key found exists in the map
m.appendReplacement(sb, Matcher.quoteReplacement(map.get(key).toString()));
} else { // do not replace, or to be precise, replace with the same text
m.appendReplacement(sb, Matcher.quoteReplacement(m.group()));
}
}
m.appendTail(sb);
return sb.toString();
}
}
We could take advantage of the default method getOrDefault, added to the Map interface in Java 8, and replace
while (m.find()) {
String key = m.group("key");
if (map.containsKey(key)) { // replace if the key found exists in the map
m.appendReplacement(sb, Matcher.quoteReplacement(map.get(key).toString()));
} else { // do not replace, or to be precise, replace with the same text
m.appendReplacement(sb, Matcher.quoteReplacement(m.group()));
}
}
m.appendTail(sb);
with
while (m.find())
m.appendReplacement(sb, Matcher.quoteReplacement(map.getOrDefault(m.group("key"), m.group())));
m.appendTail(sb);
But to be able to use this method, we would first need to specify the value type of the map. In other words, we would need to change the accepted parameter from Map<String, ?> map to something like Map<String, String> map.
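A minimal sketch of the whole class after that change, assuming a Map<String, String> parameter. The class name PlaceholderReplacer is made up so that it does not clash with the Replacer above. Matcher.quoteReplacement is used because the fallback text "${...}" would otherwise be read as a group reference by appendReplacement.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderReplacer {
    // Matches ${name} and captures "name" in group 1.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]*)\\}");

    static String replaceVars(String format, Map<String, String> vars) {
        Matcher m = PLACEHOLDER.matcher(format);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Unknown variables fall back to the original "${name}" text.
            String value = vars.getOrDefault(m.group(1), m.group());
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> myVars = new TreeMap<>();
        myVars.put("currentYear", "2014");
        System.out.println(replaceVars("The current year is ${currentYear}.", myVars));
    }
}
```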