Markdown algorithm: string difficulties

I started writing this algorithm:
public static String convert(String str) {
    if (str.equals("# "))
        return " ";
    if (str.matches("#+.+")) {
        // n = number of leading '#' characters, i.e. the heading level
        int n = str.length() - str.replaceFirst("#+", "").length();
        return "<h" + n + ">" + str.substring(n) + "</h" + n + ">";
    }
    return str;
}
So when I type ####title, it returns <h4>title</h4>.
My problem is that when I write ####title###title, I would like it to return <h4>title</h4> <h3>title</h3>, but it only returns <h4>title</h4>. What am I doing wrong?

That's because you are using the pattern #+.+.
Since . matches any character in a regex, this pattern matches everything after the initial run of #'s.
So, for your input ####title###title, your pattern will match as follows:
#+ will match ####
.+ will match title###title
You need to change your regex to (#+[^#]+), and you will probably need the Pattern and Matcher classes to get the desired output, because you want to match every part of your string against the pattern.
#+[^#]+ matches the first run of #'s and then everything after it except #, so it stops where the next run of #'s starts.
Here's how you can use it:
String str = "####title###title"; // str is the method parameter
if (str.equals("# "))
System.out.println(" ");
Pattern pattern = Pattern.compile("(#+[^#]+)");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
String str1 = matcher.group(1);
int n = str1.length() - str1.replaceFirst("#+", "").length();
System.out.println("<h" + n + ">" + str1.substring(n) + "</h" + n + ">");
}
Output:
<h4>title</h4>
<h3>title</h3>
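The loop above prints its results; as a minimal sketch, the same idea can be packaged as a method that returns the converted string. The class name HeadingConverter and the choice to join headings with a single space are assumptions made for illustration (the space mirrors the output the question asks for). Text before the first # is dropped and heading levels above 6 are not capped in this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HeadingConverter {
    // Group 1: a run of '#'; group 2: everything up to the next '#'.
    private static final Pattern HEADING = Pattern.compile("(#+)([^#]+)");

    static String convert(String str) {
        StringBuilder out = new StringBuilder();
        Matcher m = HEADING.matcher(str);
        while (m.find()) {
            int level = m.group(1).length(); // number of '#' characters
            if (out.length() > 0) {
                out.append(' '); // separator between headings (an assumption)
            }
            out.append("<h").append(level).append('>')
               .append(m.group(2))
               .append("</h").append(level).append('>');
        }
        // No heading found: return the input unchanged, as the original method does.
        return out.length() == 0 ? str : out.toString();
    }

    public static void main(String[] args) {
        System.out.println(convert("####title###title"));
    }
}
```

Calling convert("####title###title") should yield <h4>title</h4> <h3>title</h3>, while a string without any # is returned as is.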

You are matching the wrong string. Try this pattern:
#+[^#]+
And of course you need to apply it recursively or in a loop.

You are replacing only the first occurrence of #+. Try replacing the if with a while and, instead of returning inside the if, append the result to a StringBuilder.
Something like:
String str = "####title###title2";
StringBuilder sb = new StringBuilder();
while (str.matches("#+.+")) {
    // heading level = number of leading '#' characters
    int n = str.length() - str.replaceFirst("#+", "").length();
    str = str.replaceFirst("#+", "");
    int y = str.length();
    if (str.matches(".+#+.+")) {
        // another heading follows: cut the text at the next '#'
        y = str.indexOf("#");
        sb.append("<h" + n + ">" + str.substring(0, y) + "</h" + n + ">");
        str = str.substring(y, str.length());
    } else {
        sb.append("<h" + n + ">" + str.substring(0, y) + "</h" + n + ">");
    }
}
System.out.println(sb.toString());

Related

How to cut word in java before and after space

Can you help me with this, please?
I would like to get only a specified WORD from the string below.
String test1 = "This is WORD test";
I did this:
String regex = "\\s*\\bWORD\\b\\s*";
Text = test1.replaceAll(regex, " ");
and I get this: This is test
But what I want is the opposite: I want only the part matching the regex.
Sometimes my string could be:
String test2 = "WORD it is the text";
String test3 = "Text WORD";
But in every case I would like to cut out only the specified word and put it into another string. Thanks

A simple solution using a regular expression: I only check for the word being surrounded by spaces, at the beginning of the line with a space after it, or at the end of the line with a space before it.
String regex = "( WORD )|(^WORD )|( WORD$)";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(test1);
if (m.find()) {
System.out.println("[" + m.group(0).trim() + "]");
}

EDIT
A possible way to solve this without a regular expression (the loop keeps the index of the last occurrence):
String test1 = "This is WORD test";
String wordToFind = "WORD";
String message = "";
int k = 0;
for (int i = -1; (i = test1.indexOf(wordToFind, i + 1)) != -1; i++) {
k = i;
}
String s = test1.substring(k, k+ (wordToFind.length()));
if(s.equals(wordToFind)){
message = s;
} else {
message = "The word \"" + wordToFind + "\" was not found in \"" + test1 + "\"";
}
System.out.print(message);
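As an alternative sketch, the word-boundary pattern from the question can be used directly with Matcher.find() instead of replaceAll. The class and method names below (WordExtractor, extractWord) are hypothetical, and Pattern.quote is added on the assumption that the word to find may contain regex metacharacters.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WordExtractor {
    // Returns the word if it occurs as a whole word in text, otherwise an empty Optional.
    static Optional<String> extractWord(String text, String word) {
        // \b on both sides rejects partial hits such as "PASSWORDS".
        Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractWord("This is WORD test", "WORD").orElse("not found"));
    }
}
```

Because \b also matches at the start and end of the input, the same call covers all three example strings from the question without separate alternatives.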

How to tokenize the string with and without delimiter in single split

Assume I have a single string with the following content:
Input:
FTX+AAA+++201707141009UTC'
FTX+BBB+++201707141009UTC'
FTX+CCC+++201707141009UTC?:??'
PISCO US LTS;?:V.D??'
SOUZA?:GB?:GB'
FTX+ZZZ+++201707141009UTC'
Expected Output:
Number of segments: 4
Input:
FTX+AAA+++201707141009UTC'
FTX+CCC+++201707141009UTC?:??'
PISCO US LTS;?:V.D??'
FTX+ZZZ+++201707141009UTC'
Expected Output:
Number of segments: 3
Basically, I want a line to be treated as part of the same segment when the delimiter ' is preceded by a question mark. The line delimiter is '.
How do I tokenize the string and count the segments in Java?
Thanks in advance.

You can use a negative lookbehind in a regex:
String input = "FTX+AAA+++201707141009UTC'\n"
+ " FTX+BBB+++201707141009UTC'\n"
+ " FTX+CCC+++201707141009UTC?:??'\n"
+ " PISCO US LTS;?:V.D??' \n"
+ " SOUZA?:GB?:GB'\n"
+ " FTX+ZZZ+++201707141009UTC'";
String[] tokens = input.split("(?<!\\?)'\\s*");
System.out.println(tokens.length);
This prints 4.
However, for the second example I would expect two segments, not three.

Another alternative to the above, again demonstrating that the second example you posted may be wrong, because its third line ends with ?', which by your definition should not be a break.
public void test() {
test("FTX+AAA+++201707141009UTC'" +
"FTX+BBB+++201707141009UTC'" +
"FTX+CCC+++201707141009UTC?:??'" +
"PISCO US LTS;?:V.D??'" +
"SOUZA?:GB?:GB'" +
"FTX+ZZZ+++201707141009UTC'");
test("FTX+AAA+++201707141009UTC'" +
"FTX+CCC+++201707141009UTC?:??'" +
"PISCO US LTS;?:V.D??'" +
"FTX+ZZZ+++201707141009UTC'");
}
private void test(String s) {
String[] split = s.split("(?<!\\?)'");
System.out.println(split.length+"->"+Arrays.toString(split));
}
prints
4->[FTX+AAA+++201707141009UTC, FTX+BBB+++201707141009UTC, FTX+CCC+++201707141009UTC?:??'PISCO US LTS;?:V.D??'SOUZA?:GB?:GB, FTX+ZZZ+++201707141009UTC]
2->[FTX+AAA+++201707141009UTC, FTX+CCC+++201707141009UTC?:??'PISCO US LTS;?:V.D??'FTX+ZZZ+++201707141009UTC]

I think what he/she wants is this:
String a = "FTX+AAA+++201707141009UTC'"
+ "FTX+BBB+++201707141009UTC'"
+ "FTX+CCC+++201707141009UTC?:??'"
+ "PISCO US LTS;?:V.D??' "
+ "SOUZA?:GB?:GB'"
+ "FTX+ZZZ+++201707141009UTC'";
String result[] = a.split("'");
List<String> stringList = new ArrayList<String>(Arrays.asList(result));
for (int i = 0; i < stringList.size(); i++) {
if (!stringList.get(i).startsWith("FTX") && i != 0) {
stringList.set(i-1, stringList.get(i-1) + stringList.get(i));
stringList.remove(i);
i--;
}
}
for (int j = 0; j < stringList.size(); j++) {
System.out.println(stringList.get(j));
}
FTX+AAA+++201707141009UTC
FTX+BBB+++201707141009UTC
FTX+CCC+++201707141009UTC?:??PISCO US LTS;?:V.D?? SOUZA?:GB?:GB
FTX+ZZZ+++201707141009UTC
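The merge loop above can also be sketched as a single split, which is what the question title asks for. This rests on the same assumption as the loop, namely that every real segment starts with the tag FTX; that is a guess about the data, not something the question states. The class name SegmentCounter is hypothetical.

```java
final class SegmentCounter {
    // Splits on an apostrophe (plus trailing whitespace) only when the next
    // segment starts with FTX or the input ends there.
    static String[] splitSegments(String input) {
        return input.trim().split("'\\s*(?=FTX|\\z)");
    }

    public static void main(String[] args) {
        String input = "FTX+AAA+++201707141009UTC'\n"
                + "FTX+CCC+++201707141009UTC?:??'\n"
                + "PISCO US LTS;?:V.D??'\n"
                + "FTX+ZZZ+++201707141009UTC'";
        System.out.println("Number of segments: " + splitSegments(input).length);
    }
}
```

With this rule the two inputs from the question give 4 and 3 segments, matching the expected output there. Note that an empty input still yields one (empty) element, so callers may want to special-case it.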

how to delete empty line and rest of the character in java

I want to delete the empty line and the rest of the characters from my string; I would like to parse one particular value from the string.
I want only the value 23243232 from my string. After the product price there is an empty line followed by more characters, so I'm using that empty line as a delimiter to try to get the product price alone. But I'm getting other values along with 23243232. Can someone help me get only 23243232 from this string?
String actualResponse = "--sGEFoZV85Qnkco_QAU5b6B3Tt1OrOOFkArwzoF_yDmmW5DfupJDtuHlh20LL2SAbWZb8a3exzoF_yDmmW5DfupJDtuHlh20LL2SAbWZb8a3exsGEFoZV85Qnkco_QAU5b6B3Tt1OrOOFkArw\r\n"
+ "Product-Discription: form-name; productName=\"iPhone\"\r\n" + "Product-Type: Mobile\r\n"
+ "Product-Price: 23243232\r\n" + "\r\n" + "%dsafdfw32.323efaeed\r\n" + "#$#####";
String productPrice = actualResponse.substring(actualResponse.lastIndexOf("Product-Price:") + 15);
System.out.println("Printing product price ..." + productPrice);
String finalString = productPrice.replaceAll(" .*", "");
This is the output I'm getting:
Printing product price ...23243232
%dsafdfw32.323efaeed
#$#####
But I want only 23243232 - this value alone.

Use a regular expression for more flexibility.
String content = "--sGEFoZV85Qnkco_QAU5b6B3Tt1OrOOFkArwzoF_yDmmW5DfupJDtuHlh20LL2SAbWZb8a3exzoF_yDmmW5DfupJDtuHlh20LL2SAbWZb8a3exsGEFoZV85Qnkco_QAU5b6B3Tt1OrOOFkArw\r\n"
+ "Product-Discription: form-name; productName=\"iPhone\"\r\n" + "Product-Type: Mobile\r\n"
+ "Product-Price: 23243232\r\n" + "\r\n" + "%dsafdfw32.323efaeed\r\n" + "#$#####";
String re1 = "\\bProduct-Price:\\s"; // Word 1
String re2 = "(\\d+)"; // Integer Number 1
Pattern p = Pattern.compile(re1 + re2, Pattern.DOTALL);
Matcher m = p.matcher(content);
while (m.find()) {
for (int i = 0; i <= m.groupCount(); i++) {
System.out.println(String.format("Group=%d | Value=%s",i, m.group(i)));
}
}
It will print out:
Group=0 | Value=Product-Price: 23243232
Group=1 | Value=23243232

This is the first solution that came to mind. It's not the best, but it will solve your problem.
StringBuilder finalString =new StringBuilder();
for (Character c : productPrice.toCharArray()) {
if(Character.isDigit(c)){
finalString.append(c);
}else{
break;
}
}

This happens because you are printing the entire substring from index actualResponse.lastIndexOf("Product-Price:") + 15 to the end of the string.
You need to pass the end index as the second argument to substring:
int start = actualResponse.lastIndexOf("Product-Price:") + 15;
int end = actualResponse.indexOf("\r\n", start); // The first "\r\n" from the index `start`
String productPrice = actualResponse.substring(start, end);
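The snippet above assumes the header is present and followed by "\r\n"; if either is missing, indexOf or lastIndexOf returns -1 and the result is wrong or substring throws. A slightly more defensive sketch follows. The names HeaderReader and headerValue are hypothetical, and using the header's own length plus trim() in place of the hard-coded 15 is a design choice made here for illustration.

```java
import java.util.Optional;

final class HeaderReader {
    // Returns the text between the last occurrence of header and the end of
    // that line, or an empty Optional when the header is absent.
    static Optional<String> headerValue(String response, String header) {
        int at = response.lastIndexOf(header);
        if (at < 0) {
            return Optional.empty(); // header not present
        }
        int start = at + header.length();
        int end = response.indexOf("\r\n", start);
        if (end < 0) {
            end = response.length(); // header is on the last line
        }
        return Optional.of(response.substring(start, end).trim());
    }

    public static void main(String[] args) {
        String response = "Product-Type: Mobile\r\n"
                + "Product-Price: 23243232\r\n" + "\r\n" + "%dsafdfw32.323efaeed\r\n" + "#$#####";
        System.out.println(headerValue(response, "Product-Price:").orElse("not found"));
    }
}
```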

This will give you the final answer:
String actualResponse ="--sGEFoZV85Qnkco_QAU5b6B3Tt1OrOOFkArwzoF_yDmmW5DfupJDtuHlh20LL2SAbWZb8a3exzoF_y DmmW5DfupJDtuHlh20LL2SAbWZb8a3exsGEFoZV85Qnkco_QAU5b6B3Tt1OrOOFkArw\r\n"
+ "Product-Discription: form-name; productName=\"iPhone\"\r\n" + "Product-Type: Mobile\r\n"
+ "Product-Price: 23243232\r\n" + "\r\n" + "%dsafdfw32.323efaeed\r\n" + "#$#####";
String productPrice = actualResponse.substring(actualResponse.lastIndexOf("Product-Price:") + 15);
System.out.println("Printing content lenght..." + productPrice.split("\r\n")[0]);

find the particular string that not being trapped between double quotes java regex

I want to find a string (say x) that satisfies two conditions:
matches the pattern \b(x)\b
does not match the pattern ".*?(x).*?(?<!\\)"
In other words, I am looking for a value of x that is a complete word (condition 1) and is not inside double quotes (condition 2).
" x \" m" : x is not acceptable.
" x \" " + x + " except" : only the second x is acceptable.
What Java code will find x?

The first condition is straightforward. To check the second condition, you have to count the valid (unescaped) double quotes that precede the match. If that count is even, the match found by the first condition lies outside any quoted string and is valid.
String text = "basdf + \" asdf \\\" b\" + b + \"wer \\\"\"";
String toCapture = "b";
Pattern pattern1 = Pattern.compile("\\b" + toCapture + "\\b");
Pattern pattern2 = Pattern.compile("(?<!\\\\)\"");
Matcher m1 = pattern1.matcher(text);
Matcher m2;
while(m1.find()){ // if any <toCapture> found (first condition fulfilled)
int start = m1.start();
m2 = pattern2.matcher(text);
int count = 0;
while(m2.find() && m2.start() < start){ // count number of valid double quotes "
count++;
}
if(count % 2 == 0) { // if number of valid quotes is even
char[] tcar = new char[text.length()];
Arrays.fill(tcar, '-');
tcar[start] = '^';
System.out.println(start);
System.out.println(text);
System.out.println(new String(tcar));
}
}
Output:
23
basdf + " asdf \" b" + b + "wer \""
-----------------------^-----------
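The code above rescans the text from the beginning for every candidate match. As a sketch of the same parity idea in a single forward pass, the quote state can be carried along between matches. The names UnquotedWordFinder and findOutsideQuotes are hypothetical. Like the lookbehind in the answer, this treats any quote preceded by a backslash as escaped, so a quote after an escaped backslash (\\") would be misread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnquotedWordFinder {
    // Returns the start indices of whole-word matches that lie outside double quotes.
    static List<Integer> findOutsideQuotes(String text, String word) {
        List<Integer> hits = new ArrayList<>();
        Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text);
        boolean inQuotes = false;
        int pos = 0; // characters before pos have already been scanned for quotes
        while (m.find()) {
            for (; pos < m.start(); pos++) {
                boolean escaped = pos > 0 && text.charAt(pos - 1) == '\\';
                if (text.charAt(pos) == '"' && !escaped) {
                    inQuotes = !inQuotes; // an unescaped quote flips the state
                }
            }
            if (!inQuotes) {
                hits.add(m.start());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "basdf + \" asdf \\\" b\" + b + \"wer \\\"\"";
        System.out.println(findOutsideQuotes(text, "b"));
    }
}
```

For the sample text used in the answer, this should report the single index 23, the same position the original code marks.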

Split with multiple special characters: √(A&B)

I have the equation √(A&B)=|C|.
After splitting it, I get this value:
[√, (, A&B, ),=,|, C,|]
How can I get a value like this instead?
[√(, A&B, ),=,|, C,|]
This is my code:
return teks.split(""
+ "((?<=\\ )|(?=\\ ))|"
+ "((?<=\\!)|(?=\\!))|"
+ "((?<=\\√\\()|(?=\\√\\())|" //this is my problem
+ "((?<=\\√)|(?=\\√))|" //and this
+ "((?<=\\∛)|(?=\\∛))|"
+ "((?<=\\/)|(?=\\|))"
+ "((?<=\\&)|(?=\\&))"
+ "");
}

Try Matcher.find() with the following regexp:
String s = "√(A&B)=|C|";
Matcher m = Pattern.compile("("
+ "(√\\()"
+ "|(\\))"
+ "|(\\w(\\&\\w)*)"
+ "|(=)"
+ "|(\\|)"
+ ")").matcher(s);
ArrayList<String> r = new ArrayList<>();
while(m.find())
r.add(m.group(1));
System.out.printf("%s", r.toString());
Result:
[√(, A&B, ), =, |, C, |]
Update:
Or, if any symbol before an opening parenthesis (except "=") should be treated as a single token together with that "(":
String s = "√(A&(B&C))=(|C| & (! D))";
Matcher m = Pattern.compile("("
+ "[^\\s=]?\\(" // capture opening bracket with modifier (if any)
// you can replace it with "[√]?\\(", if only
// "√" symbol should go in conjunction with brace
+ "|\\)" // capture closing bracket
+ "|\\w" // capture identifiers
+ "|[=!\\&\\|]" // capture symbols "=", "!", "&" and "|"
+ ")").matcher(s.replaceAll("\\s", ""));
ArrayList<String> r = new ArrayList<>();
while(m.find())
r.add(m.group(1));
System.out.printf("%s -> %s\n", s, r.toString().replaceAll(", ", ",")); // ArrayList joins its elements with ", ", so remove the extra space
Result:
√(A&(B&C))=(|C| & (! D)) -> [√(,A,&(,B,&,C,),),=,(,|,C,|,&(,!,D,),)]
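For reuse, the first pattern above can be wrapped in a small helper; this is only a sketch, and the names EquationTokenizer and tokenize are hypothetical. The root sign is written as a Unicode escape on the assumption that the source file encoding may not be UTF-8, and a non-capturing group replaces the nested capture groups since only the whole match is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EquationTokenizer {
    // The square-root sign is U+221A, written as an escape to stay encoding-independent.
    private static final Pattern TOKEN = Pattern.compile(
            "\u221A\\("          // root sign glued to its opening parenthesis
            + "|\\)"             // closing parenthesis
            + "|\\w(?:&\\w)*"    // operand such as C or A&B
            + "|="               // equals sign
            + "|\\|");           // absolute-value bar

    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("\u221A(A&B)=|C|"));
    }
}
```

Characters that match none of the alternatives (spaces or "!", for example) are skipped silently by find(), so this helper is only suitable for the first, simpler equation form.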
