How to remove special characters except the comma - Java

I have output in my Android EditText like this:
["HOT","SMALL"]
I want the output to look like this:
HOT,SMALL
I want to remove the [] and "" characters but keep the comma. I have read this, but it does not work. I tried this, but it removes all special characters. Can anybody help with my problem? Any suggestion will be helpful. Thanks in advance.

There are a couple of ways I'd do this.
The first, quick and straightforward, is to replace all the special characters with "", using a regex and String.replaceAll. Strings are immutable, so assign the result back:
myString = myString.replaceAll("[\\\"\\[\\]]", "");
(Btw, I used http://rubular.com/ as a quick way to check my regex. Remember that the regex needs to be escaped for java - I used this tool to do that.)
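As a minimal sketch of the replaceAll approach, assuming the text has already been read from the EditText into a String (the class and method names here are made up for illustration):

```java
class BracketStripper {
    // Removes every '[', ']' and '"' character but leaves commas untouched.
    static String stripBracketsAndQuotes(String input) {
        // replaceAll does not modify the input; the cleaned text is the return value.
        return input.replaceAll("[\\[\\]\"]", "");
    }

    public static void main(String[] args) {
        // prints HOT,SMALL
        System.out.println(stripBracketsAndQuotes("[\"HOT\",\"SMALL\"]"));
    }
}
```

The character class lists only the three characters to drop, so anything else, including the comma, passes through unchanged.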
The alternative is to recognize that you're actually looking at the String representation of a JSON array here, so convert the JSON string into a Java array of Strings using something like org.json, and then join the strings with a , delimiter.

Related

Java regex: Replacing dynamic substrings

Suppose I have a String containing static tags that looks like this:
mystring = "[tag]some text[/tag] untagged text [tag]some more text[/tag]"
I want to remove everything between each tag pair. I've figured out how to do so by using the following regex:
mystring = mystring.replaceAll("(?<=\\[tag])(.*?)(?=\\[/tag])", "");
The result of which will be:
mystring = "[tag][/tag] untagged text [tag][/tag]"
However, I'm unsure how to accomplish the same goal if the opening tag is dynamic. Example:
mystring = "[tag parameter="123"]some text[/tag] untagged text [tag parameter="456"]some more text[/tag]"
The "value" of the parameter portion of the tag is dynamic. Somehow, I have to introduce a wildcard to my current regex, but I am unsure how to do this.
Essentially, replace the contents of all pairings of "[tag*]" and "[/tag]" with empty string.
An obvious solution would be to do something like this:
mystring = mystring.replaceAll("(?<=\\[tag)(.*?)(?=\\[/tag])", "");
However, I feel like that would be hacking around the problem because I'm not really capturing a full tag.
Could anyone provide me with a solution to this problem? Thanks!
I guess I've got it.
I thought long and hard about what @AshishMathew said, and yes, lookbehinds can't have unbounded length. But instead of replacing the match with nothing, we can replace it with a ], like so:
mystring = mystring.replaceAll("(?<=\\[tag)(.*?)(?=\\[/tag])", "]");
(?<=\\[tag) is the look-behind which matches [tag
(.*?) is all the code between [tag and [/tag], which may even be the parameters of the tag, all of which is replaced by a ]
When I tried this code by replacing the match with "", I got [tag[/tag] untagged text [tag[/tag] as the output. Hence, by replacing the match with a ] instead of nothing, you get the (hopefully) desired output.
So this is my lazy solution (pardon the regex pun) to the problem.
I suggest matching the whole tag, including its content, and replacing it with the opening and closing tags without content:
mystring.replaceAll("\\[tag[^\\]]*\\][^\\[]*\\[/tag]", "[tag][/tag]")
Ideone test.
Note that I didn't bother preserving the tag attributes since you mentioned in another answer's comments that you didn't need them, but they could be kept by using a capturing group.
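If the attributes do need to survive, the capturing-group idea could look roughly like the sketch below. The class and method names are hypothetical, and like the regex above it assumes the text between the tags contains no [ character:

```java
class TagContentStripper {
    // Group 1 captures the opening tag with any attributes, group 2 the closing tag.
    // Only the text between the two groups is dropped.
    static String stripTagContents(String input) {
        return input.replaceAll("(\\[tag[^\\]]*\\])[^\\[]*(\\[/tag\\])", "$1$2");
    }

    public static void main(String[] args) {
        String s = "[tag parameter=\"123\"]some text[/tag] untagged text "
                 + "[tag parameter=\"456\"]some more text[/tag]";
        System.out.println(stripTagContents(s));
    }
}
```

In the replacement string, $1 and $2 refer back to the captured opening and closing tags, so whatever parameters the opening tag carried are written back unchanged.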

regex to replace the value of a key in a json file

I want to write a regex so I can do a "Search/Replace" over a JSON file with many objects. Every object has a key named "resource" containing a URL.
Take a look at these examples:
"resource":"/designs/123/image.jpg"
"resource":"/designs/221/elephant.gif"
"resource":"/designs/icon.png"
I want to make a regex to replace the whole url with a string like this: localhost:8080/filepath.
This way, the result would be:
"resource":"localhost:8080/designs/123/image.jpg"
"resource":"localhost:8080/designs/221/elephant.gif"
"resource":"localhost:8080/designs/icon.png"
I'm just starting with regular expressions and I'm completely lost. I was thinking that one valid idea would be to write something starting with this pattern "resource":"
How could I write the regular expression?
The easiest method is probably just to replace "resource":"/ with "resource":"localhost:8080/. You don't even need a regex for this (but if you do you just have to escape some stuff).
With vim this would be
:%s/"resource":"\(.*\)"/"resource":"localhost:8080\1"
This should be easily transferable to Java.
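A possible Java version of both suggestions, as a sketch only. It assumes the whole file has been read into a String and that every URL starts with /; the class and method names are invented for the example:

```java
class ResourcePrefixer {
    // Plain-text variant: no regex needed, just a literal search and replace.
    static String prefixLiteral(String json) {
        return json.replace("\"resource\":\"/", "\"resource\":\"localhost:8080/");
    }

    // Regex variant: group 1 captures the path, which is written back after the host.
    // [^\"]* stops at the closing quote, so several keys on one line are handled separately.
    static String prefixRegex(String json) {
        return json.replaceAll("\"resource\":\"(/[^\"]*)\"",
                               "\"resource\":\"localhost:8080$1\"");
    }

    public static void main(String[] args) {
        String line = "\"resource\":\"/designs/123/image.jpg\"";
        System.out.println(prefixLiteral(line));
        System.out.println(prefixRegex(line));
    }
}
```

String.replace treats both arguments literally, while replaceAll interprets the first as a regex and $1 in the second as a group reference.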

Java Splitting a String

I have this string
G234101,Non-Essential,ATPases,Respiration chain complexes,"Auxotrophies, carbon and",PS00017,2,IONIC HOMEOSTASIS,mitochondria.
that I have been trying to split in Java. The file is comma-delimited, but some of the strings have commas within them and I don't want them to get split up. Currently, in the above example,
"Auxotrophies, carbon and"
is getting split into two strings.
Any suggestions on how best to split this up by commas? Not all of the strings have the " ", for example the following string:
G234103,Essential,Protein Kinases,?,Cell cycle defects,PS00479,2,CELLULAR COMMUNICATION/SIGNAL TRANSDUCTION,cytoplasm.
Use an existing CSV library such as opencsv: http://opencsv.sourceforge.net/
But if you really do need to reinvent the wheel (homework), you need to use a more complicated regular expression than just "what,ever".split(","). It's not simple though. And you might be better off creating your own custom Lexer. http://en.wikipedia.org/wiki/Lexical_analysis
This isn't too hard in your case. As you process your text character by character you just need to keep track of opening and closing quotes to decide when to ignore commas and when to act on them.
Also see StreamTokenizer for a built-in configurable Lexer - you should be able to use this to meet your requirements.
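The character-by-character idea can be sketched as follows. This is an illustration rather than a full CSV parser: the class name is made up, and it assumes quotes are always balanced and never escaped inside a field:

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareSplitter {
    // Splits on commas that are outside double quotes; the quotes themselves are dropped.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;            // entering or leaving a quoted field
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());  // an unquoted comma ends the field
                current.setLength(0);
            } else {
                current.append(c);               // ordinary character, or a quoted comma
            }
        }
        fields.add(current.toString());          // last field has no trailing comma
        return fields;
    }
}
```

For the first sample line this yields nine fields, with Auxotrophies, carbon and kept together as the fifth.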
I would think of this as a multi-step process. First, find all the commas inside quotes in your original string and replace each one with a placeholder such as {comma}. You can do this with a regex. Then split the new string on the comma symbol (,). Finally, go through the resulting list and replace each {comma} with a real comma (,).
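One way the placeholder steps might be written, as a sketch under two assumptions: the data never contains the literal text {comma}, and quoted fields contain no embedded quotes. The class name is hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderSplitter {
    private static final String PLACEHOLDER = "{comma}";
    // A quoted field: an opening quote, any non-quote characters, a closing quote.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    static String[] split(String line) {
        // Step 1: hide the commas that sit inside quoted fields.
        Matcher m = QUOTED.matcher(line);
        StringBuffer masked = new StringBuffer();
        while (m.find()) {
            String hidden = m.group().replace(",", PLACEHOLDER);
            m.appendReplacement(masked, Matcher.quoteReplacement(hidden));
        }
        m.appendTail(masked);

        // Step 2: split on the remaining commas (-1 keeps trailing empty fields).
        String[] fields = masked.toString().split(",", -1);

        // Step 3: put the real commas back.
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].replace(PLACEHOLDER, ",");
        }
        return fields;
    }
}
```

Unlike a quote-tracking loop, this version leaves the surrounding quotes in the field, so strip them afterwards if they are not wanted.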

Input Sanitizing to not break JSON syntax

So, in a nutshell I'm trying to create a regex that I can use in a java program that is about to submit a JSON object to my php server.
myString.replaceAll(myRegexString,"");
My problem is that I am absolutely no good with regex, and on top of that I need to escape the characters properly for the Java string literal and then escape them properly again inside the regex. Good lordy.
What I came up with was this:
String myRegexString = "[\"',{}[]:;]"
The first backslash was to escape outer quotes to get a " in there. And then it struck me that {} and [] are also regex commands. Would I escape those as well? Like:
String myRegexString = "[\"',\{\}\[\]:;]"
Thanks in advance. In case it wasn't clear from the examples above, the only characters I really care about at this moment are:
" { } [ ] , and also ; : ' for general SQL injection protection.
UPDATE:
This is the final regex:
[\\Q\"',{}[\]:;\\E] for anyone else curious. Thanks Amit!
Why don't you use an actual JSON encoding API/framework? What you're doing is not sanitizing. What you're doing is corrupting the data. If my name is O'Reilly, I want it to be spelled O'Reilly, not OReilly. If I send a message containing [ or {, I want these to be in the messages. Use a framework or API that escapes those characters when needed rather than removing them blindly.
Googling for JSON Java will lead you to many APIs and frameworks.
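To make the difference between escaping and removing concrete, here is a small hand-rolled escaper. It is only an illustration of what a JSON library does for string values, not a replacement for one, and the class name is made up:

```java
class JsonStringEscaper {
    // Wraps a value in double quotes and escapes the characters JSON requires,
    // instead of deleting them.
    static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length() + 2);
        out.append('"');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // remaining control characters become four-digit hex escapes
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        out.append('"');
        return out.toString();
    }
}
```

With this approach O'Reilly stays O'Reilly, and characters such as [ or { need no treatment at all inside a JSON string. SQL injection is a separate concern that is normally handled on the server with parameterized queries.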
Try something like
String myRegexString = "[\\Q\"',{}[]:;\\E]";
Now the characters between \Q and \E are treated as literal characters.
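A quick way to check the quoting is to compare it with a character class whose brackets are escaped by hand. The sketch below assumes nothing beyond java.util.regex as used by String.replaceAll; the class and constant names are invented:

```java
class CharClassQuoting {
    // Character class written with \Q...\E quoting, as suggested above.
    static final String QUOTED = "[\\Q\"',{}[]:;\\E]";
    // The same set of characters with the two square brackets escaped by hand.
    static final String ESCAPED = "[\"',{}\\[\\]:;]";

    static String strip(String input, String regex) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String sample = "{\"name\":\"O'Reilly\"}";
        System.out.println(strip(sample, QUOTED));
        System.out.println(strip(sample, ESCAPED));
    }
}
```

Inside a character class only the square brackets need escaping here; braces, quotes, colons and semicolons are already literal, which is why the hand-escaped version is barely longer.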

Java Inner Text (getTextContent()) Problem

I'm trying to do some parsing in Java. I'm using Cobra HTML Parser to get the HTML into a DOM, then XPath to get the nodes I want. When I get down to the desired level I call node.getTextContent(), but this gives me a string like
"\n\n\nValue\n-\nValue\n\n\n"
Is there a built in way to get rid of the line breaks? I would like to do a RegEx like
(?:\s*([^-]+)\s*-\s*([^-]+)\s*)
on the inner text and would really prefer not to have to deal with the possible different white space symbols in between the text.
Example Input:
Value
-
Value
Thanks
You can use String.replaceAll().
String trimmed = original_string.replaceAll("\n", "");
The first argument is a regular expression: you could replace all contiguous blocks of whitespace in the original string with replaceAll("\\s+", "") for instance.
I'm not totally sure I understood the question correctly, but the simplest way to remove all the whitespace would be:
String s = node.getTextContent().replaceAll("\\s","");
If you just want to get rid of the leading/trailing whitespace, use trim().
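Putting these options side by side, as a sketch that works on the raw string rather than on a Cobra node (the class and method names are hypothetical). Since \s already matches line breaks, the pair can also be extracted without cleaning the text first:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InnerTextCleaner {
    // Collapses each run of whitespace to one space and trims both ends.
    static String collapseWhitespace(String raw) {
        return raw.replaceAll("\\s+", " ").trim();
    }

    // Two values separated by a dash, with optional whitespace (including newlines) around them.
    private static final Pattern PAIR =
            Pattern.compile("\\s*([^-]+?)\\s*-\\s*([^-]+?)\\s*");

    // Returns the two values, or null if the text does not have the expected shape.
    static String[] splitPair(String raw) {
        Matcher m = PAIR.matcher(raw);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }
}
```

The groups are reluctant ([^-]+?) so that the surrounding \s* can absorb the line breaks instead of leaving them inside the captured values.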
