Removing Html tag from Java Hashmap value with efficient way
main(String str[]){
HashMap<String, String> hm = new HashMap<>();
hm.put("A", "Apple");
hm.put("B", "<b>Ball</b>");
hm.put("C", "Cat");
hm.put("D", "Dog");
hm.put("E", "<h1>Elephant</h1>");
}
// we have to remove only html tags which have like B = <b>Ball</b> so the B = Ball
// and E = <h1>Elephant</h1> should be E =Elephant
import java.util.HashMap;
import java.util.stream.Collectors;
import java.util.Map;
public class MyClass {
public static void main(String args[]) {
HashMap<String, String> hm = new HashMap<>();
hm.put("A", "Apple");
hm.put("B", "<b>Ball</b>");
hm.put("C", "Cat");
hm.put("D", "Dog");
hm.put("E", "<h1>Elephant</h1>");
Map<String, String> newHm = hm.entrySet().
stream()
.collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue().replaceAll("\\<[^>]*>","")));
System.out.println(newHm);
}
}
There is method Map::replaceAll which accepts a function replacing the values.
In this case the HTML tags may be removed from values using regular expressions and method String::replaceAll:
hm.replaceAll((k, v) -> v.replaceAll("(\\<\\w+\\>)(.*)(\\</\\w+\\>)", "$2"));
System.out.println(hm);
Output showing that Apple and Elephant values are cleared from the HTML tags:
{A=Apple, B=Ball, C=Cat, D=Dog, E=Elephant}
Regular expression: "(\\<\\w+\\>)(.*)(\\</\\w+\\>)" looks for a sequence containing opening (\\<\\w+\\>) and closing (\\</\\w+\\>) tags and any text between them (.*).
#Test
public void test1() {
final Map<String, String> hm = new HashMap<>();
hm.put("A", "Apple");
hm.put("B", "<b>Ball</b>");
hm.put("C", "Cat");
hm.put("D", "Dog");
hm.put("E", "<h1>Elephant</h1>");
hm.entrySet().stream()
.forEach(entry -> entry.setValue(entry.getValue().replaceAll("</.*>", "").replaceAll("<.*>", "")));
assertEquals("Ball", hm.get("B"));
assertEquals("Elephant", hm.get("E"));
}
Be sure to replace the ending tag first.
This will work with multiple tags as well (i.e. <hi><b>Elephant</b></h1>
You can do it in many ways. The two easiest would be to use:
Regex - match html tags and remove them from code
private static String removeHtmlTags(String input) {
return input.replaceAll("<.*?>", "");
}
Use external library like Jsoup to parse String into HTML tag and print content. The cons is that you have to add it to your pom.xml.
<dependency>
<groupId>org.jsoup</groupId>
<artifactId>jsoup</artifactId>
<version>1.14.2</version>
</dependency>
private static String removeHtmlTagsUsingParser(String input) {
Document document = Jsoup.parse(input);
return document.text();
}
I have the following code:
public class NewClass {
public String noTags(String str){
return Jsoup.parse(str).text();
}
public static void main(String args[]) {
String strings="<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN \">" +
"<HTML> <HEAD> <TITLE></TITLE> <style>body{ font-size: 12px;font-family: verdana, arial, helvetica, sans-serif;}</style> </HEAD> <BODY><p><b>hello world</b></p><p><br><b>yo</b> googlez</p></BODY> </HTML> ";
NewClass text = new NewClass();
System.out.println((text.noTags(strings)));
}
And I have the result:
hello world yo googlez
But I want to break the line:
hello world
yo googlez
I have looked at jsoup's TextNode#getWholeText() but I can't figure out how to use it.
If there's a <br> in the markup I parse, how can I get a line break in my resulting output?
The real solution that preserves linebreaks should be like this:
public static String br2nl(String html) {
if(html==null)
return html;
Document document = Jsoup.parse(html);
document.outputSettings(new Document.OutputSettings().prettyPrint(false));//makes html() preserve linebreaks and spacing
document.select("br").append("\\n");
document.select("p").prepend("\\n\\n");
String s = document.html().replaceAll("\\\\n", "\n");
return Jsoup.clean(s, "", Whitelist.none(), new Document.OutputSettings().prettyPrint(false));
}
It satisfies the following requirements:
if the original html contains newline(\n), it gets preserved
if the original html contains br or p tags, they gets translated to newline(\n).
With
Jsoup.parse("A\nB").text();
you have output
"A B"
and not
A
B
For this I'm using:
descrizione = Jsoup.parse(html.replaceAll("(?i)<br[^>]*>", "br2n")).text();
text = descrizione.replaceAll("br2n", "\n");
Jsoup.clean(unsafeString, "", Whitelist.none(), new OutputSettings().prettyPrint(false));
We're using this method here:
public static String clean(String bodyHtml,
String baseUri,
Whitelist whitelist,
Document.OutputSettings outputSettings)
By passing it Whitelist.none() we make sure that all HTML is removed.
By passsing new OutputSettings().prettyPrint(false) we make sure that the output is not reformatted and line breaks are preserved.
On Jsoup v1.11.2, we can now use Element.wholeText().
String cleanString = Jsoup.parse(htmlString).wholeText();
user121196's answer still works. But wholeText() preserves the alignment of texts.
Try this by using jsoup:
public static String cleanPreserveLineBreaks(String bodyHtml) {
// get pretty printed html with preserved br and p tags
String prettyPrintedBodyFragment = Jsoup.clean(bodyHtml, "", Whitelist.none().addTags("br", "p"), new OutputSettings().prettyPrint(true));
// get plain text with preserved line breaks by disabled prettyPrint
return Jsoup.clean(prettyPrintedBodyFragment, "", Whitelist.none(), new OutputSettings().prettyPrint(false));
}
For more complex HTML none of the above solutions worked quite right; I was able to successfully do the conversion while preserving line breaks with:
Document document = Jsoup.parse(myHtml);
String text = new HtmlToPlainText().getPlainText(document);
(version 1.10.3)
You can traverse a given element
public String convertNodeToText(Element element)
{
final StringBuilder buffer = new StringBuilder();
new NodeTraversor(new NodeVisitor() {
boolean isNewline = true;
#Override
public void head(Node node, int depth) {
if (node instanceof TextNode) {
TextNode textNode = (TextNode) node;
String text = textNode.text().replace('\u00A0', ' ').trim();
if(!text.isEmpty())
{
buffer.append(text);
isNewline = false;
}
} else if (node instanceof Element) {
Element element = (Element) node;
if (!isNewline)
{
if((element.isBlock() || element.tagName().equals("br")))
{
buffer.append("\n");
isNewline = true;
}
}
}
}
#Override
public void tail(Node node, int depth) {
}
}).traverse(element);
return buffer.toString();
}
And for your code
String result = convertNodeToText(JSoup.parse(html))
Based on the other answers and the comments on this question it seems that most people coming here are really looking for a general solution that will provide a nicely formatted plain text representation of an HTML document. I know I was.
Fortunately JSoup already provide a pretty comprehensive example of how to achieve this: HtmlToPlainText.java
The example FormattingVisitor can easily be tweaked to your preference and deals with most block elements and line wrapping.
To avoid link rot, here is Jonathan Hedley's solution in full:
package org.jsoup.examples;
import org.jsoup.Jsoup;
import org.jsoup.helper.StringUtil;
import org.jsoup.helper.Validate;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;
import org.jsoup.select.Elements;
import org.jsoup.select.NodeTraversor;
import org.jsoup.select.NodeVisitor;
import java.io.IOException;
/**
* HTML to plain-text. This example program demonstrates the use of jsoup to convert HTML input to lightly-formatted
* plain-text. That is divergent from the general goal of jsoup's .text() methods, which is to get clean data from a
* scrape.
* <p>
* Note that this is a fairly simplistic formatter -- for real world use you'll want to embrace and extend.
* </p>
* <p>
* To invoke from the command line, assuming you've downloaded the jsoup jar to your current directory:</p>
* <p><code>java -cp jsoup.jar org.jsoup.examples.HtmlToPlainText url [selector]</code></p>
* where <i>url</i> is the URL to fetch, and <i>selector</i> is an optional CSS selector.
*
* #author Jonathan Hedley, jonathan#hedley.net
*/
public class HtmlToPlainText {
private static final String userAgent = "Mozilla/5.0 (jsoup)";
private static final int timeout = 5 * 1000;
public static void main(String... args) throws IOException {
Validate.isTrue(args.length == 1 || args.length == 2, "usage: java -cp jsoup.jar org.jsoup.examples.HtmlToPlainText url [selector]");
final String url = args[0];
final String selector = args.length == 2 ? args[1] : null;
// fetch the specified URL and parse to a HTML DOM
Document doc = Jsoup.connect(url).userAgent(userAgent).timeout(timeout).get();
HtmlToPlainText formatter = new HtmlToPlainText();
if (selector != null) {
Elements elements = doc.select(selector); // get each element that matches the CSS selector
for (Element element : elements) {
String plainText = formatter.getPlainText(element); // format that element to plain text
System.out.println(plainText);
}
} else { // format the whole doc
String plainText = formatter.getPlainText(doc);
System.out.println(plainText);
}
}
/**
* Format an Element to plain-text
* #param element the root element to format
* #return formatted text
*/
public String getPlainText(Element element) {
FormattingVisitor formatter = new FormattingVisitor();
NodeTraversor traversor = new NodeTraversor(formatter);
traversor.traverse(element); // walk the DOM, and call .head() and .tail() for each node
return formatter.toString();
}
// the formatting rules, implemented in a breadth-first DOM traverse
private class FormattingVisitor implements NodeVisitor {
private static final int maxWidth = 80;
private int width = 0;
private StringBuilder accum = new StringBuilder(); // holds the accumulated text
// hit when the node is first seen
public void head(Node node, int depth) {
String name = node.nodeName();
if (node instanceof TextNode)
append(((TextNode) node).text()); // TextNodes carry all user-readable text in the DOM.
else if (name.equals("li"))
append("\n * ");
else if (name.equals("dt"))
append(" ");
else if (StringUtil.in(name, "p", "h1", "h2", "h3", "h4", "h5", "tr"))
append("\n");
}
// hit when all of the node's children (if any) have been visited
public void tail(Node node, int depth) {
String name = node.nodeName();
if (StringUtil.in(name, "br", "dd", "dt", "p", "h1", "h2", "h3", "h4", "h5"))
append("\n");
else if (name.equals("a"))
append(String.format(" <%s>", node.absUrl("href")));
}
// appends text to the string builder with a simple word wrap method
private void append(String text) {
if (text.startsWith("\n"))
width = 0; // reset counter if starts with a newline. only from formats above, not in natural text
if (text.equals(" ") &&
(accum.length() == 0 || StringUtil.in(accum.substring(accum.length() - 1), " ", "\n")))
return; // don't accumulate long runs of empty spaces
if (text.length() + width > maxWidth) { // won't fit, needs to wrap
String words[] = text.split("\\s+");
for (int i = 0; i < words.length; i++) {
String word = words[i];
boolean last = i == words.length - 1;
if (!last) // insert a space if not the last word
word = word + " ";
if (word.length() + width > maxWidth) { // wrap and reset counter
accum.append("\n").append(word);
width = word.length();
} else {
accum.append(word);
width += word.length();
}
}
} else { // fits as is, without need to wrap text
accum.append(text);
width += text.length();
}
}
#Override
public String toString() {
return accum.toString();
}
}
}
text = Jsoup.parse(html.replaceAll("(?i)<br[^>]*>", "br2n")).text();
text = descrizione.replaceAll("br2n", "\n");
works if the html itself doesn't contain "br2n"
So,
text = Jsoup.parse(html.replaceAll("(?i)<br[^>]*>", "<pre>\n</pre>")).text();
works more reliable and easier.
Try this:
public String noTags(String str){
Document d = Jsoup.parse(str);
TextNode tn = new TextNode(d.body().html(), "");
return tn.getWholeText();
}
Use textNodes() to get a list of the text nodes. Then concatenate them with \n as separator.
Here's some scala code I use for this, java port should be easy:
val rawTxt = doc.body().getElementsByTag("div").first.textNodes()
.asScala.mkString("<br />\n")
Try this by using jsoup:
doc.outputSettings(new OutputSettings().prettyPrint(false));
//select all <br> tags and append \n after that
doc.select("br").after("\\n");
//select all <p> tags and prepend \n before that
doc.select("p").before("\\n");
//get the HTML from the document, and retaining original new lines
String str = doc.html().replaceAll("\\\\n", "\n");
This is my version of translating html to text (the modified version of user121196 answer, actually).
This doesn't just preserve line breaks, but also formatting text and removing excessive line breaks, HTML escape symbols, and you will get a much better result from your HTML (in my case I'm receiving it from mail).
It's originally written in Scala, but you can change it to Java easily
def html2text( rawHtml : String ) : String = {
val htmlDoc = Jsoup.parseBodyFragment( rawHtml, "/" )
htmlDoc.select("br").append("\\nl")
htmlDoc.select("div").prepend("\\nl").append("\\nl")
htmlDoc.select("p").prepend("\\nl\\nl").append("\\nl\\nl")
org.jsoup.parser.Parser.unescapeEntities(
Jsoup.clean(
htmlDoc.html(),
"",
Whitelist.none(),
new org.jsoup.nodes.Document.OutputSettings().prettyPrint(true)
),false
).
replaceAll("\\\\nl", "\n").
replaceAll("\r","").
replaceAll("\n\\s+\n","\n").
replaceAll("\n\n+","\n\n").
trim()
}
/**
* Recursive method to replace html br with java \n. The recursive method ensures that the linebreaker can never end up pre-existing in the text being replaced.
* #param html
* #param linebreakerString
* #return the html as String with proper java newlines instead of br
*/
public static String replaceBrWithNewLine(String html, String linebreakerString){
String result = "";
if(html.contains(linebreakerString)){
result = replaceBrWithNewLine(html, linebreakerString+"1");
} else {
result = Jsoup.parse(html.replaceAll("(?i)<br[^>]*>", linebreakerString)).text(); // replace and html line breaks with java linebreak.
result = result.replaceAll(linebreakerString, "\n");
}
return result;
}
Used by calling with the html in question, containing the br, along with whatever string you wish to use as the temporary newline placeholder.
For example:
replaceBrWithNewLine(element.html(), "br2n")
The recursion will ensure that the string you use as newline/linebreaker placeholder will never actually be in the source html, as it will keep adding a "1" untill the linkbreaker placeholder string is not found in the html. It wont have the formatting issue that the Jsoup.clean methods seem to encounter with special characters.
Based on user121196's and Green Beret's answer with the selects and <pre>s, the only solution which works for me is:
org.jsoup.nodes.Element elementWithHtml = ....
elementWithHtml.select("br").append("<pre>\n</pre>");
elementWithHtml.select("p").prepend("<pre>\n\n</pre>");
elementWithHtml.text();
In Python, when formatting string, I can fill placeholders by name rather than by position, like that:
print "There's an incorrect value '%(value)s' in column # %(column)d" % \
{ 'value': x, 'column': y }
I wonder if that is possible in Java (hopefully, without external libraries)?
StrSubstitutor of jakarta commons lang is a light weight way of doing this provided your values are already formatted correctly.
http://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/text/StrSubstitutor.html
Map<String, String> values = new HashMap<String, String>();
values.put("value", x);
values.put("column", y);
StrSubstitutor sub = new StrSubstitutor(values, "%(", ")");
String result = sub.replace("There's an incorrect value '%(value)' in column # %(column)");
The above results in:
"There's an incorrect value '1' in column # 2"
When using Maven you can add this dependency to your pom.xml:
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.4</version>
</dependency>
not quite, but you can use MessageFormat to reference one value multiple times:
MessageFormat.format("There's an incorrect value \"{0}\" in column # {1}", x, y);
The above can be done with String.format() as well, but I find messageFormat syntax cleaner if you need to build complex expressions, plus you dont need to care about the type of the object you are putting into the string
Another example of Apache Common StringSubstitutor for simple named placeholder.
String template = "Welcome to {theWorld}. My name is {myName}.";
Map<String, String> values = new HashMap<>();
values.put("theWorld", "Stackoverflow");
values.put("myName", "Thanos");
String message = StringSubstitutor.replace(template, values, "{", "}");
System.out.println(message);
// Welcome to Stackoverflow. My name is Thanos.
You can use StringTemplate library, it offers what you want and much more.
import org.antlr.stringtemplate.*;
final StringTemplate hello = new StringTemplate("Hello, $name$");
hello.setAttribute("name", "World");
System.out.println(hello.toString());
public static String format(String format, Map<String, Object> values) {
StringBuilder formatter = new StringBuilder(format);
List<Object> valueList = new ArrayList<Object>();
Matcher matcher = Pattern.compile("\\$\\{(\\w+)}").matcher(format);
while (matcher.find()) {
String key = matcher.group(1);
String formatKey = String.format("${%s}", key);
int index = formatter.indexOf(formatKey);
if (index != -1) {
formatter.replace(index, index + formatKey.length(), "%s");
valueList.add(values.get(key));
}
}
return String.format(formatter.toString(), valueList.toArray());
}
Example:
String format = "My name is ${1}. ${0} ${1}.";
Map<String, Object> values = new HashMap<String, Object>();
values.put("0", "James");
values.put("1", "Bond");
System.out.println(format(format, values)); // My name is Bond. James Bond.
Thanks for all your help! Using all your clues, I've written routine to do exactly what I want -- python-like string formatting using dictionary. Since I'm Java newbie, any hints are appreciated.
public static String dictFormat(String format, Hashtable<String, Object> values) {
StringBuilder convFormat = new StringBuilder(format);
Enumeration<String> keys = values.keys();
ArrayList valueList = new ArrayList();
int currentPos = 1;
while (keys.hasMoreElements()) {
String key = keys.nextElement(),
formatKey = "%(" + key + ")",
formatPos = "%" + Integer.toString(currentPos) + "$";
int index = -1;
while ((index = convFormat.indexOf(formatKey, index)) != -1) {
convFormat.replace(index, index + formatKey.length(), formatPos);
index += formatPos.length();
}
valueList.add(values.get(key));
++currentPos;
}
return String.format(convFormat.toString(), valueList.toArray());
}
This is an old thread, but just for the record, you could also use Java 8 style, like this:
public static String replaceParams(Map<String, String> hashMap, String template) {
return hashMap.entrySet().stream().reduce(template, (s, e) -> s.replace("%(" + e.getKey() + ")", e.getValue()),
(s, s2) -> s);
}
Usage:
public static void main(String[] args) {
final HashMap<String, String> hashMap = new HashMap<String, String>() {
{
put("foo", "foo1");
put("bar", "bar1");
put("car", "BMW");
put("truck", "MAN");
}
};
String res = replaceParams(hashMap, "This is '%(foo)' and '%(foo)', but also '%(bar)' '%(bar)' indeed.");
System.out.println(res);
System.out.println(replaceParams(hashMap, "This is '%(car)' and '%(foo)', but also '%(bar)' '%(bar)' indeed."));
System.out.println(replaceParams(hashMap, "This is '%(car)' and '%(truck)', but also '%(foo)' '%(bar)' + '%(truck)' indeed."));
}
The output will be:
This is 'foo1' and 'foo1', but also 'bar1' 'bar1' indeed.
This is 'BMW' and 'foo1', but also 'bar1' 'bar1' indeed.
This is 'BMW' and 'MAN', but also 'foo1' 'bar1' + 'MAN' indeed.
Apache Commons StringSubstitutor can be used. Note that StrSubstitutor is deprecated.
import org.apache.commons.text.StringSubstitutor;
// ...
Map<String, String> values = new HashMap<>();
values.put("animal", "quick brown fox");
values.put("target", "lazy dog");
StringSubstitutor sub = new StringSubstitutor(values);
String result = sub.replace("The ${animal} jumped over the ${target}.");
// "The quick brown fox jumped over the lazy dog."
This class supports providing default values for variables.
String result = sub.replace("The number is ${undefined.property:-42}.");
// "The number is 42."
To use recursive variable replacement, call setEnableSubstitutionInVariables(true);.
Map<String, String> values = new HashMap<>();
values.put("b", "c");
values.put("ac", "Test");
StringSubstitutor sub = new StringSubstitutor(values);
sub.setEnableSubstitutionInVariables(true);
String result = sub.replace("${a${b}}");
// "Test"
I am the author of a small library that does exactly what you want:
Student student = new Student("Andrei", 30, "Male");
String studStr = template("#{id}\tName: #{st.getName}, Age: #{st.getAge}, Gender: #{st.getGender}")
.arg("id", 10)
.arg("st", student)
.format();
System.out.println(studStr);
Or you can chain the arguments:
String result = template("#{x} + #{y} = #{z}")
.args("x", 5, "y", 10, "z", 15)
.format();
System.out.println(result);
// Output: "5 + 10 = 15"
There is nothing built into Java at the moment of writing this. I would suggest writing your own implementation. My preference is for a simple fluent builder interface instead of creating a map and passing it to function -- you end up with a nice contiguous chunk of code, for example:
String result = new TemplatedStringBuilder("My name is {{name}} and I from {{town}}")
.replace("name", "John Doe")
.replace("town", "Sydney")
.finish();
Here is a simple implementation:
class TemplatedStringBuilder {
private final static String TEMPLATE_START_TOKEN = "{{";
private final static String TEMPLATE_CLOSE_TOKEN = "}}";
private final String template;
private final Map<String, String> parameters = new HashMap<>();
public TemplatedStringBuilder(String template) {
if (template == null) throw new NullPointerException();
this.template = template;
}
public TemplatedStringBuilder replace(String key, String value){
parameters.put(key, value);
return this;
}
public String finish(){
StringBuilder result = new StringBuilder();
int startIndex = 0;
while (startIndex < template.length()){
int openIndex = template.indexOf(TEMPLATE_START_TOKEN, startIndex);
if (openIndex < 0){
result.append(template.substring(startIndex));
break;
}
int closeIndex = template.indexOf(TEMPLATE_CLOSE_TOKEN, openIndex);
if(closeIndex < 0){
result.append(template.substring(startIndex));
break;
}
String key = template.substring(openIndex + TEMPLATE_START_TOKEN.length(), closeIndex);
if (!parameters.containsKey(key)) throw new RuntimeException("missing value for key: " + key);
result.append(template.substring(startIndex, openIndex));
result.append(parameters.get(key));
startIndex = closeIndex + TEMPLATE_CLOSE_TOKEN.length();
}
return result.toString();
}
}
Apache Commons Lang's replaceEach method may come in handy dependeding on your specific needs. You can easily use it to replace placeholders by name with this single method call:
StringUtils.replaceEach("There's an incorrect value '%(value)' in column # %(column)",
new String[] { "%(value)", "%(column)" }, new String[] { x, y });
Given some input text, this will replace all occurrences of the placeholders in the first string array with the corresponding values in the second one.
You should have a look at the official ICU4J library. It provides a MessageFormat class similar to the one available with the JDK but this former supports named placeholders.
Unlike other solutions provided on this page. ICU4j is part of the ICU project that is maintained by IBM and regularly updated. In addition, it supports advanced use cases such as pluralization and much more.
Here is a code example:
MessageFormat messageFormat =
new MessageFormat("Publication written by {author}.");
Map<String, String> args = Map.of("author", "John Doe");
System.out.println(messageFormat.format(args));
As of 2022 the up-to-date solution is Apache Commons Text StringSubstitutor
From the doc:
// Build map
Map<String, String> valuesMap = new HashMap<>();
valuesMap.put("animal", "quick brown fox");
valuesMap.put("target", "lazy dog");
String templateString = "The ${animal} jumped over the ${target} ${undefined.number:-1234567890} times.";
// Build StringSubstitutor
StringSubstitutor sub = new StringSubstitutor(valuesMap);
// Replace
String resolvedString = sub.replace(templateString)
;
You could have something like this on a string helper class
/**
* An interpreter for strings with named placeholders.
*
* For example given the string "hello %(myName)" and the map <code>
* <p>Map<String, Object> map = new HashMap<String, Object>();</p>
* <p>map.put("myName", "world");</p>
* </code>
*
* the call {#code format("hello %(myName)", map)} returns "hello world"
*
* It replaces every occurrence of a named placeholder with its given value
* in the map. If there is a named place holder which is not found in the
* map then the string will retain that placeholder. Likewise, if there is
* an entry in the map that does not have its respective placeholder, it is
* ignored.
*
* #param str
* string to format
* #param values
* to replace
* #return formatted string
*/
public static String format(String str, Map<String, Object> values) {
StringBuilder builder = new StringBuilder(str);
for (Entry<String, Object> entry : values.entrySet()) {
int start;
String pattern = "%(" + entry.getKey() + ")";
String value = entry.getValue().toString();
// Replace every occurence of %(key) with value
while ((start = builder.indexOf(pattern)) != -1) {
builder.replace(start, start + pattern.length(), value);
}
}
return builder.toString();
}
Based on the answer I created MapBuilder class:
public class MapBuilder {
public static Map<String, Object> build(Object... data) {
Map<String, Object> result = new LinkedHashMap<>();
if (data.length % 2 != 0) {
throw new IllegalArgumentException("Odd number of arguments");
}
String key = null;
Integer step = -1;
for (Object value : data) {
step++;
switch (step % 2) {
case 0:
if (value == null) {
throw new IllegalArgumentException("Null key value");
}
key = (String) value;
continue;
case 1:
result.put(key, value);
break;
}
}
return result;
}
}
then I created class StringFormat for String formatting:
public final class StringFormat {
public static String format(String format, Object... args) {
Map<String, Object> values = MapBuilder.build(args);
for (Map.Entry<String, Object> entry : values.entrySet()) {
String key = entry.getKey();
Object value = entry.getValue();
format = format.replace("$" + key, value.toString());
}
return format;
}
}
which you could use like that:
String bookingDate = StringFormat.format("From $startDate to $endDate"),
"$startDate", formattedStartDate,
"$endDate", formattedEndDate
);
I created also a util/helper class (using jdk 8) which can format a string an replaces occurrences of variables.
For this purpose I used the Matchers "appendReplacement" method which does all the substitution and loops only over the affected parts of a format string.
The helper class isn't currently well javadoc documented. I will changes this in the future ;)
Anyway I commented the most important lines (I hope).
public class FormatHelper {
//Prefix and suffix for the enclosing variable name in the format string.
//Replace the default values with any you need.
public static final String DEFAULT_PREFIX = "${";
public static final String DEFAULT_SUFFIX = "}";
//Define dynamic function what happens if a key is not found.
//Replace the defualt exception with any "unchecked" exception type you need or any other behavior.
public static final BiFunction<String, String, String> DEFAULT_NO_KEY_FUNCTION =
(fullMatch, variableName) -> {
throw new RuntimeException(String.format("Key: %s for variable %s not found.",
variableName,
fullMatch));
};
private final Pattern variablePattern;
private final Map<String, String> values;
private final BiFunction<String, String, String> noKeyFunction;
private final String prefix;
private final String suffix;
public FormatHelper(Map<String, String> values) {
this(DEFAULT_NO_KEY_FUNCTION, values);
}
public FormatHelper(
BiFunction<String, String, String> noKeyFunction, Map<String, String> values) {
this(DEFAULT_PREFIX, DEFAULT_SUFFIX, noKeyFunction, values);
}
public FormatHelper(String prefix, String suffix, Map<String, String> values) {
this(prefix, suffix, DEFAULT_NO_KEY_FUNCTION, values);
}
public FormatHelper(
String prefix,
String suffix,
BiFunction<String, String, String> noKeyFunction,
Map<String, String> values) {
this.prefix = prefix;
this.suffix = suffix;
this.values = values;
this.noKeyFunction = noKeyFunction;
//Create the Pattern and quote the prefix and suffix so that the regex don't interpret special chars.
//The variable name is a "\w+" in an extra capture group.
variablePattern = Pattern.compile(Pattern.quote(prefix) + "(\\w+)" + Pattern.quote(suffix));
}
public static String format(CharSequence format, Map<String, String> values) {
return new FormatHelper(values).format(format);
}
public static String format(
CharSequence format,
BiFunction<String, String, String> noKeyFunction,
Map<String, String> values) {
return new FormatHelper(noKeyFunction, values).format(format);
}
public static String format(
String prefix, String suffix, CharSequence format, Map<String, String> values) {
return new FormatHelper(prefix, suffix, values).format(format);
}
public static String format(
String prefix,
String suffix,
BiFunction<String, String, String> noKeyFunction,
CharSequence format,
Map<String, String> values) {
return new FormatHelper(prefix, suffix, noKeyFunction, values).format(format);
}
public String format(CharSequence format) {
//Create matcher based on the init pattern for variable names.
Matcher matcher = variablePattern.matcher(format);
//This buffer will hold all parts of the formatted finished string.
StringBuffer formatBuffer = new StringBuffer();
//loop while the matcher finds another variable (prefix -> name <- suffix) match
while (matcher.find()) {
//The root capture group with the full match e.g ${variableName}
String fullMatch = matcher.group();
//The capture group for the variable name resulting from "(\w+)" e.g. variableName
String variableName = matcher.group(1);
//Get the value in our Map so the Key is the used variable name in our "format" string. The associated value will replace the variable.
//If key is missing (absent) call the noKeyFunction with parameters "fullMatch" and "variableName" else return the value.
String value = values.computeIfAbsent(variableName, key -> noKeyFunction.apply(fullMatch, key));
//Escape the Map value because the "appendReplacement" method interprets the $ and \ as special chars.
String escapedValue = Matcher.quoteReplacement(value);
//The "appendReplacement" method replaces the current "full" match (e.g. ${variableName}) with the value from the "values" Map.
//The replaced part of the "format" string is appended to the StringBuffer "formatBuffer".
matcher.appendReplacement(formatBuffer, escapedValue);
}
//The "appendTail" method appends the last part of the "format" String which has no regex match.
//That means if e.g. our "format" string has no matches the whole untouched "format" string is appended to the StringBuffer "formatBuffer".
//Further more the method return the buffer.
return matcher.appendTail(formatBuffer)
.toString();
}
public String getPrefix() {
return prefix;
}
public String getSuffix() {
return suffix;
}
public Map<String, String> getValues() {
return values;
}
}
You can create a class instance for a specific Map with values (or suffix prefix or noKeyFunction)
like:
Map<String, String> values = new HashMap<>();
values.put("firstName", "Peter");
values.put("lastName", "Parker");
FormatHelper formatHelper = new FormatHelper(values);
formatHelper.format("${firstName} ${lastName} is Spiderman!");
// Result: "Peter Parker is Spiderman!"
// Next format:
formatHelper.format("Does ${firstName} ${lastName} works as photographer?");
//Result: "Does Peter Parker works as photographer?"
Further more you can define what happens if a key in the values Map is missing (works in both ways e.g. wrong variable name in format string or missing key in Map).
The default behavior is an thrown unchecked exception (unchecked because I use the default jdk8 Function which cant handle checked exceptions) like:
Map<String, String> map = new HashMap<>();
map.put("firstName", "Peter");
map.put("lastName", "Parker");
FormatHelper formatHelper = new FormatHelper(map);
formatHelper.format("${missingName} ${lastName} is Spiderman!");
//Result: RuntimeException: Key: missingName for variable ${missingName} not found.
You can define a custom behavior in the constructor call like:
Map<String, String> values = new HashMap<>();
values.put("firstName", "Peter");
values.put("lastName", "Parker");
FormatHelper formatHelper = new FormatHelper(fullMatch, variableName) -> variableName.equals("missingName") ? "John": "SOMETHING_WRONG", values);
formatHelper.format("${missingName} ${lastName} is Spiderman!");
// Result: "John Parker is Spiderman!"
or delegate it back to the default no key behavior:
...
FormatHelper formatHelper = new FormatHelper((fullMatch, variableName) -> variableName.equals("missingName") ? "John" :
FormatHelper.DEFAULT_NO_KEY_FUNCTION.apply(fullMatch,
variableName), map);
...
For better handling there are also static method representations like:
Map<String, String> values = new HashMap<>();
values.put("firstName", "Peter");
values.put("lastName", "Parker");
FormatHelper.format("${firstName} ${lastName} is Spiderman!", map);
// Result: "Peter Parker is Spiderman!"
There is Java Plugin to use string interpolation in Java (like in Kotlin, JavaScript). Supports Java 8, 9, 10, 11… https://github.com/antkorwin/better-strings
Using variables in string literals:
int a = 3;
int b = 4;
System.out.println("${a} + ${b} = ${a+b}");
Using expressions:
int a = 3;
int b = 4;
System.out.println("pow = ${a * a}");
System.out.println("flag = ${a > b ? true : false}");
Using functions:
#Test
void functionCall() {
System.out.println("fact(5) = ${factorial(5)}");
}
long factorial(int n) {
long fact = 1;
for (int i = 2; i <= n; i++) {
fact = fact * i;
}
return fact;
}
For more info, please read the project README.
The quick answer is no, unfortunately. However, you can come pretty close to a reasonable syntaks:
"""
You are $compliment!
"""
.replace('$compliment', 'awesome');
It's more readable and predictable than String.format, at least!
My answer is to:
a) use StringBuilder when possible
b) keep (in any form: integer is the best, speciall char like dollar macro etc) position of "placeholder" and then use StringBuilder.insert() (few versions of arguments).
Using external libraries seems overkill and I belive degrade performance significant, when StringBuilder is converted to String internally.
Try Freemarker, templating library.
I tried in just a quick way
public static void main(String[] args)
{
String rowString = "replace the value ${var1} with ${var2}";
Map<String,String> mappedValues = new HashMap<>();
mappedValues.put("var1", "Value 1");
mappedValues.put("var2", "Value 2");
System.out.println(replaceOccurence(rowString, mappedValues));
}
private static String replaceOccurence(String baseStr ,Map<String,String> mappedValues)
{
for(String key :mappedValues.keySet())
{
baseStr = baseStr.replace("${"+key+"}", mappedValues.get(key));
}
return baseStr;
}
I ended up with the next solution:
Create class TemplateSubstitutor with method substitute() and use it to format output
Then create a string template and fill it with values
import java.util.*;
public class MyClass {
public static void main(String args[]) {
String template = "WRR = {WRR}, SRR = {SRR}\n" +
"char_F1 = {char_F1}, word_F1 = {word_F1}\n";
Map<String, Object> values = new HashMap<>();
values.put("WRR", 99.9);
values.put("SRR", 99.8);
values.put("char_F1", 80);
values.put("word_F1", 70);
String message = TemplateSubstitutor.substitute(values, template);
System.out.println(message);
}
}
class TemplateSubstitutor {
public static String substitute(Map<String, Object> map, String input_str) {
String output_str = input_str;
for (Map.Entry<String, Object> entry : map.entrySet()) {
String key = entry.getKey();
Object value = entry.getValue();
output_str = output_str.replace("{" + key + "}", String.valueOf(value));
}
return output_str;
}
}
https://dzone.com/articles/java-string-format-examples String.format(inputString, [listOfParams]) would be the easiest way. Placeholders in string can be defined by order. For more details check the provided link.