I'm writing a text to HTML converter.
I'm looking for a simple way to wrap each line of text (which ends with carriage return) with
<p>.....text.....</p>
Can you suggest a String replacement or regular expression that will work in Java?
Thanks
String txtFileContent = ....;
String htmlContent = "<p>" + txtFileContent.replaceAll("\\n", "</p>\n<p>") + "</p>";
Assuming:
The line delimiter is "\n".
One line is one paragraph.
The end of txtFileContent is not "\n".
Hope this helps.
Try using StringEscapeUtils.escapeHtml and then adding the tags you want at the beginning and end.
String escapeHTML = StringEscapeUtils.escapeHtml(inputStr);
String output = "<p>"+escapeHTML+"</p>";
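The two suggestions above can be combined. The following is a minimal sketch, not a definitive implementation: the class name ParagraphWrapper and both helper methods are hypothetical, the small escapeHtml helper only stands in for a library call such as StringEscapeUtils.escapeHtml, and it assumes Java 8 or later so that \R can match "\n", "\r\n" and "\r" alike.

```java
import java.util.StringJoiner;

public class ParagraphWrapper {

    // Hypothetical stand-in for a library escaper: handles the characters
    // that are special in HTML text. "&" must be replaced first.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Wraps every non-empty line in <p>...</p>.
    // \R is the regex linebreak matcher (Java 8+): \n, \r\n, \r and others.
    static String wrapLines(String text) {
        StringJoiner html = new StringJoiner("\n");
        for (String line : text.split("\\R")) {
            if (!line.isEmpty()) {
                html.add("<p>" + escapeHtml(line) + "</p>");
            }
        }
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.println(wrapLines("first line\r\nsecond <b>line</b>\n"));
    }
}
```

Splitting into lines first avoids the assumption that the text does not end with a line break, and escaping each line keeps characters such as < from being interpreted as markup.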
I'm importing Excel data into a Java program. One column should have its whitespace removed in case the user mistyped the value.
Example value: "12341 "
I've used:
replaceAll("\\s+", "");
replaceAll(" ", "");
StringUtils.trim(stringValue);
However, it still returns "12341 " with length 6. The unnecessary whitespace is not removed.
EDIT
Complete code, with the return values assigned back:
stringArray[x] = stringArray[x].replaceAll("\\s+", "");
stringArray[x] = stringArray[x].replaceAll(" ", "");
stringArray[x] = StringUtils.trim(stringArray[x]);
This should work:
stringValue = stringValue.replaceAll(" ", "");
You need to use the returned value.
I faced the same problem and resolved it by using
text = text.replaceAll("[^\\x00-\\x7F]", "");
Be aware that this removes every non-ASCII character, so any special characters in your text will be removed as well.
Ref link :
https://howtodoinjava.com/regex/java-clean-ascii-text-non-printable-chars/
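One possible explanation for the behaviour in the question is that the trailing character is not an ordinary space. The sketch below assumes (this is not confirmed by the question) that the cell ends with a non-breaking space, U+00A0, which Java's default \s does not match. The class and method names are hypothetical.

```java
public class WhitespaceCleaner {

    // Removes ASCII whitespace (\s) as well as Unicode separator
    // characters (\p{Z}), which include the non-breaking space U+00A0.
    static String stripAllWhitespace(String value) {
        return value.replaceAll("[\\s\\p{Z}]+", "");
    }

    public static void main(String[] args) {
        // Assumed input: digits followed by a non-breaking space.
        String cell = "12341\u00A0";

        // Default \s does not cover U+00A0, so nothing is removed here.
        System.out.println(cell.replaceAll("\\s+", "").length()); // prints 6

        System.out.println(stripAllWhitespace(cell).length());    // prints 5
    }
}
```

Unlike the ASCII-only filter above, this keeps letters with accents and other non-ASCII text intact and only removes separators.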
I got the following regex working to search for video links in a page
(http(s?):/)(/[^/]+)\\S+.\\.(?:avi|flv|mp4)
Unfortunately, it does not stop at the end of the link if another match follows right behind it. For example, this video link
<a href="http://somevideo.flv">somevideoname.avi</a>
would, after applying the regex, return this:
http://somevideo.flv">somevideoname.avi
How can I adjust the regex to avoid this? I would like to learn more about regex; it's fascinating but so complex!
Here is how you can do something similar with JSoup parser.
Scanner scanner = new Scanner(new File("input.txt"));
scanner.useDelimiter("\\Z");
String htmlString = scanner.next();
scanner.close();
Document doc = Jsoup.parse(htmlString);
// or, to get the content of some page, use
// Document doc = Jsoup.connect("http://example.com/").get();
Elements elements = doc.select("a[href]"); // find all anchors with an href attribute
for (Element el : elements) {
    URL url = new URL(el.attr("href"));
    if (url.getPath().matches(".*\\.(?:avi|flv|mp4)")) {
        System.out.println("url: " + url);
        // System.out.println("file: " + url.getPath());
        System.out.println("file name: "
                + new File(url.getPath()).getName());
        System.out.println("------");
    }
}
I'm not sure I understand the groupings in your regexp. At any rate, this one should work:
\\bhttps?://[^\"]+?\\.(?:avi|flv|mp4)\\b
If you only want to extract href attribute values then you're better off matching against the following pattern:
href=("|')(.*?)\.(avi|flv|mp4)\1
This should match "href=" followed by either a double-quote or single-quote character, lazily capture the path up to the file extension, and then require the same quote character that opened the attribute (the \1 backreference). The href attribute value can then be rebuilt with
matcher.group(2) + "." + matcher.group(3)
which concatenates the captured path and the file extension with a period.
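To show how the groups fit together, here is a minimal sketch that applies the pattern above with java.util.regex. The class name VideoLinkExtractor and the extract method are hypothetical, and the sample HTML is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VideoLinkExtractor {

    // Group 1: the opening quote, group 2: the lazily matched path,
    // group 3: the extension. \1 requires the same quote to close the value.
    private static final Pattern VIDEO_HREF =
            Pattern.compile("href=(\"|')(.*?)\\.(avi|flv|mp4)\\1");

    static List<String> extract(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = VIDEO_HREF.matcher(html);
        while (matcher.find()) {
            links.add(matcher.group(2) + "." + matcher.group(3));
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://somevideo.flv\">somevideoname.avi</a>";
        System.out.println(extract(html)); // prints [http://somevideo.flv]
    }
}
```

Because the closing quote is part of the match, the link text after the tag (somevideoname.avi) is not swallowed into the result.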
Your regex is greedy.
Limit its greediness by making the quantifiers lazy:
(http(s?):/)(/[^/]+?)\\S+?.\\.(?:avi|flv|mp4)
I have the following string in Java:
"this is text1\r\nthis is text2\r\nthis is text3"
I am replacing \r\n with <br/> as follows:
String temp = "this is text1\r\nthis is text2\r\nthis is text3";
temp = temp.replaceAll("[\r\n]+", "<br/>");
which produces the following string: "this is text1<br/>this is text2<br/>this is text3"
Now I want to pass it to a JavaScript element as follows:
var desc_str = "<%=temp%>";
document.getElementById('proc_desc').value = desc_str;
The output from Java is fine, but after passing it to the HTML element I get the JavaScript error "unterminated string literal". I can't find the cause; please help.
It sounds like you haven't gotten all of the line breaks out of your String.
You might find this post helpful:
How do I replace all line breaks in a string with <br /> tags?
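As a rough illustration of that idea, the sketch below prepares a Java string for embedding inside a double-quoted JavaScript literal. It is only a sketch under assumptions: the class JsStringEscaper and its method are hypothetical, it assumes Java 8 or later for \R, and it covers only backslashes, double quotes and line breaks. \R also matches U+2028 and U+2029, which older JavaScript engines treat as line terminators inside string literals.

```java
public class JsStringEscaper {

    // Hypothetical helper: returns text that can be placed between
    // double quotes in a JavaScript string literal.
    static String toJsLiteralBody(String s) {
        return s.replace("\\", "\\\\")        // escape backslashes first
                .replace("\"", "\\\"")        // then escape double quotes
                .replaceAll("\\R", "<br/>");  // every kind of line break becomes <br/>
    }

    public static void main(String[] args) {
        String temp = "this is text1\r\nthis is text2\r\nthis is text3";
        System.out.println(toJsLiteralBody(temp));
        // prints this is text1<br/>this is text2<br/>this is text3
    }
}
```

A usage sketch in the JSP would be var desc_str = "<%=JsStringEscaper.toJsLiteralBody(temp)%>"; with the caveat that a general-purpose escaping library is usually the safer choice for untrusted input.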
I use this simple code to rename a file when an event happens:
String newFileName = oldFileName + "_" + new Date().getTime();
If the event happens repeatedly, I end up with a string like:
myfile_1372933712717_1372933715279_1372933716234
whereas I would like to keep only the last timestamp.
Of course I could use substring to remove everything after "_" and replace it with the new timestamp, but suppose I have a file like myfile_mycomment: mycomment would be replaced, which is not what I want.
So how can I recognize whether there is already a timestamp in the file name?
You can approach this with a regular expression, as the timestamps always have the same pattern. This lets you distinguish between comments and timestamps and remove only the timestamps.
This code
String test = "Hallo_Comment_1372933712717_1372933712717";
test = test.replaceAll("_1[0-9]{12}", "");
System.out.println(test);
generated this output
Hallo_Comment
Assuming your original file name doesn't contain "_":
Before appending, split the file name on "_", always take the 0th element of the resulting array, and append the timestamp to it.
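The regex approach can be extended so that a comment containing "_" survives and only timestamps at the end of the name are replaced. This is a minimal sketch under the assumption that a timestamp is always exactly 13 digits (epoch milliseconds, as produced by new Date().getTime() for current dates); the class and method names are hypothetical.

```java
public class TimestampRenamer {

    // Removes one or more trailing "_<13 digits>" groups, then appends
    // the given timestamp. Anything else in the name is left untouched.
    static String withFreshTimestamp(String fileName, long now) {
        String base = fileName.replaceAll("(_\\d{13})+$", "");
        return base + "_" + now;
    }

    public static void main(String[] args) {
        String old = "myfile_mycomment_1372933712717_1372933715279";
        System.out.println(withFreshTimestamp(old, 1372933716234L));
        // prints myfile_mycomment_1372933716234
    }
}
```

Anchoring the pattern with $ means a 13-digit number in the middle of the name, followed by other text, is not treated as a timestamp.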
I have a requirement in my project.
I generate a comment string in javascript.
Coping Option: Delete all codes and replace
Source Proj Num: R21AR058864-02
Source PI Last Name: SZALAI
Appl ID: 7924675; File Year: 7924675
I send this to the server, where I store it as a string in the database, and later I retrieve it and show it in a textarea.
I generate it in JavaScript as:
codingHistoryComment += 'Source Proj Num: <%=mDefault.getFullProjectNumber()%>'+'\n';
codingHistoryComment += 'Source PI Last Name: <%=mDefault.getPILastName()%>'+'\n';
codingHistoryComment += 'Appl ID: <%=mDefault.getApplId()%>; File Year: <%=mDefault.getApplId()%>'+'\n';
In Java I am trying to replace the \n with <br>:
String str = soChild2.getChild("codingHistoryComment").getValue().trim();
if(str.contains("\\n")){
str = str.replaceAll("(\r\n|\n)", "<br>");
}
However, the textarea still gets populated with the "\n" characters:
Coping Option: Delete all codes and replace\nSource Proj Num: R21AR058864-02\nSource PI Last Name: SZALAI\nAppl ID: 7924675; File Year: 7924675\n
Thanks.
In java I am trying to replace the \n to
Don't replace the "\n". A JTextArea will parse that as a new line string.
Trying to convert it to a "br" tag won't help either since a JTextArea does not support html.
I always just use code like the following to populate a text area with text:
JTextArea textArea = new JTextArea(5, 20);
textArea.setText("1\n2\n3\n4\n5\n6\n7\n8\n9\n0");
// automatically wrap lines
textArea.setLineWrap( true );
// break lines on word, rather than character boundaries.
textArea.setWrapStyleWord( true );
Here is a test that works, try it out:
String str = "This is a test\r\n test.";
if(str.contains("\r\n")) {
System.out.println(str);
}
Assuming JavaScript (since you are trying to replace with an HTML line break):
In an HTML textarea, a newline should be the newline character \n and not the HTML line break <br>. Try the code below to remove the extra backslashes, in place of your current if statement and replace. Don't forget to assign the value to the textarea after the replacement.
Try:
str = str.replaceAll("\\n", "\n");
I think your problem is here:
if(str.contains("\\n")){
Instead of "\\n" you just need "\n"
Then instead of "\n" you need "\\n" here:
str = str.replaceAll("(\r\n|\n)", "<br>");
By the way, the if(str.contains(...)) check is not really needed, because it won't hurt to run replaceAll if there are no "\n" characters.
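The answers in this thread turn on the difference between a real newline character and the literal two-character text backslash-n. The following minimal sketch makes that difference concrete. It assumes (this is a guess based on the textarea output in the question) that the stored comment may contain the literal sequence; the class and method names are hypothetical.

```java
public class NewlineKinds {

    // Converts real line breaks and the literal text \n (backslash + n) to <br>.
    static String toHtmlBreaks(String s) {
        return s.replaceAll("\\r?\\n", "<br>") // regex \r?\n: real line-break characters
                .replace("\\n", "<br>");        // plain text match: backslash followed by n
    }

    public static void main(String[] args) {
        String real = "a\nb";      // 3 characters: a, newline, b
        String literal = "a\\nb";  // 4 characters: a, backslash, n, b

        System.out.println(real.contains("\\n"));    // prints false
        System.out.println(literal.contains("\\n")); // prints true

        System.out.println(toHtmlBreaks(literal));   // prints a<br>b
    }
}
```

In a regex passed to replaceAll, both "\n" and "\\n" match a real newline, so neither spelling matches the literal backslash-n text; String.replace works on plain text and does.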