If the code we have looks like
for(...){
}
after reformatting I'd like it to look like
for(...)
{
}
and likewise for all functions, methods, classes, etc.
I found something similar in another Stack Overflow question, but it was a regular expression that has to be typed into the Vim command line every time. I am looking for something I can put in my vimrc (if possible) so that it is available every time I open Vim.
Well this is the one I've found:
:%s/^(\s*).*\zs{\s*$/\r\1{/
in http://stackoverflow.com/questions/4463211/is-there-a-way-to-reformat-braces-automatically-with-vim but it adds a new line even when the brace is already in the right place, and I still don't know how to map it to a key combination.
(edited with a more accurate pattern)
This should do the trick:
nnoremap <F9> :%s/^\(\s*\).\+\zs{\ze\s*$/\r\1{<cr>
But it doesn't really sound "safe" to me.
Instead, you could do:
nnoremap <F9> :%s/^\(\s*\).\+\zs{\ze\s*$/\r\1{/c<cr>
which will ask for a confirmation for each match.
Or record a macro and play it back using :global.
edit
Your pattern, :%s/^(\s*).*\zs{\s*$/\r\1{/, is wrong because:
the capture parentheses are not properly escaped, (\s*) instead of \(\s*\)
.* matches any number of any character, including zero, which is why the substitution also fires on lines that contain only a {.
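For readers who want to try the logic of the corrected pattern outside Vim, here is a rough Java translation. It is only a sketch: Java has no \zs or \ze, so the text before the brace is captured and re-emitted instead, and the names BraceMover and moveBraces are made up for illustration. Like the Vim command, it acts blindly on any line that ends in {, so it is no safer.

```java
import java.util.regex.Pattern;

class BraceMover {
    // (?m)             : ^ and $ match at every line boundary
    // ([ \t]*)         : group 1, the indentation of the line
    // (\S.*?)          : group 2, at least one non-blank character before the
    //                    brace, so a line holding only "{" is left alone
    // [ \t]*\{[ \t]*$  : the trailing brace with optional blanks around it
    private static final Pattern TRAILING_BRACE =
            Pattern.compile("(?m)^([ \\t]*)(\\S.*?)[ \\t]*\\{[ \\t]*$");

    static String moveBraces(String source) {
        // Re-emit indentation and statement, then put the brace on its own
        // line with the same indentation.
        return TRAILING_BRACE.matcher(source).replaceAll("$1$2\n$1{");
    }

    public static void main(String[] args) {
        System.out.println(moveBraces("for(...){\n}\n"));
    }
}
```

Requiring a non-blank character in group 2 plays the same role as replacing .* with .\+ in the Vim pattern.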
I am trying to create a regex in Java to match and get the name, version, channel and owner for each dependency, but I haven't been able to come up with one that covers all the possible scenarios.
The structure is something like name/version#owner/channel, where the version might follow semver, and the owner and channel are optional.
Currently, I have :
^(?<name>[\d\w][\d\w\+\.-]+)\/(?<version>[\d\w][\d\w\.-]+)(#(?<owner>\w+))?(\/(?<channel>.+))?$
but it's failing for boost_atomic/1.59.0+4#owner/release, since the +4 is not matched and I need the value before that -> 1.59.0
Some other scenarios that need to be valid and are valid for the regex above are:
Poco/1.9.0#pocoproject/stable
zlib/1.2.11#conan/stable
freetype/2.10.1/stable
openssl/1.0.2g/stable
openssl/1.0.2g
openssl/1.0.2g#owner
Also, there might be some dependencies with comments :
zlib/1.2.11#conan/stable # comment
In that case I would need to get rid of the comment and only get the relevant information with the regex.
I am not sure if my current regex is good, but from what I've tested only some scenarios are missing
You can simplify your regex: instead of listing many characters in a character class and escaping them, use something like [^\/] to capture anything except /, since you want to capture everything preceding a slash.
I've made some modifications; the updated regex that should work for you is the following:
^(?<name>[^\/]+)\/(?<version>[^\/#\s]+)(#(?<owner>\w+))?(\/(?<channel>\S+))?(?:\s*#\s*(?<comment>.+))?$
I've added another named group for comment as you mentioned that can also be present. Let me know if this works for you.
Edit: If the channel contains text like release:132434, and the colon and anything after it should not be part of the channel, you can use the updated regex below:
^(?<name>[^\/]+)\/(?<version>[^\/#\s]+)(?:#(?<owner>\w+))?(?:\/(?<channel>[^:\s]+)\S*)?(?:\s*#\s*(?<comment>.+))?\s*$
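Here is a minimal sketch of how the first regex might be used from Java with named groups. The class name DependencyParser and the parse helper are hypothetical. The slashes are left unescaped because Java does not need them escaped. One piece is an addition that is not in the regex above: the version class excludes +, and an optional (?:\+...) part swallows a suffix such as +4, on the assumption that the question wants 1.59.0 for boost_atomic/1.59.0+4#owner/release.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DependencyParser {
    private static final Pattern DEP = Pattern.compile(
            "^(?<name>[^/]+)"
            + "/(?<version>[^/#\\s+]+)"      // version stops at '+', '#', '/' or space
            + "(?:\\+[^/#\\s]*)?"            // assumption: drop a "+4" style suffix
            + "(?:#(?<owner>\\w+))?"         // optional owner
            + "(?:/(?<channel>\\S+))?"       // optional channel
            + "(?:\\s*#\\s*(?<comment>.+))?$"); // optional trailing comment

    /** Returns {name, version, owner, channel}, or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = DEP.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        // Groups that did not take part in the match come back as null.
        return new String[] {
            m.group("name"), m.group("version"), m.group("owner"), m.group("channel")
        };
    }

    public static void main(String[] args) {
        String[] samples = {
            "boost_atomic/1.59.0+4#owner/release",
            "zlib/1.2.11#conan/stable # comment",
            "freetype/2.10.1/stable",
            "openssl/1.0.2g"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + Arrays.toString(parse(s)));
        }
    }
}
```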
I want to find all if conditions in a .java file.
I am using a BufferedReader to read the file and a Pattern to search for the conditions.
My program finds every if, but when my file looks like this:
// if{}
I get a wrong result.
I want to get only valid if conditions (both if{} and if {}, with a space between if and {), excluding those in comments.
What should the regex look like?
Full code: http://pastebin.com/55RMfwg2
^(?!\\/\\/) *if *\\{(.|\n)*}
This regex looks for an if that is not preceded by // and allows optional spaces after it.
It also captures the closing bracket } and allows newline characters between the brackets.
Moreover, it accepts spaces before the if.
If multi-line comments /* */ should be skipped too, then, as other people wrote, I think it will be easier to strip the comments from the file first.
There are many websites that can help you find the exact regex; I recommend RegExr.
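As a rough illustration of the "clean the input first" idea, the sketch below strips the // part of each line and only then looks for an if. It is not the asker's code: the class IfFinder and its method are hypothetical, it assumes // never occurs inside a string literal, and it does not handle /* */ block comments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class IfFinder {
    // The word "if", optional whitespace, then "(" or "{".
    private static final Pattern IF_START = Pattern.compile("\\bif\\s*[({]");

    /** Returns the lines that contain an if outside of a // comment. */
    static List<String> findIfLines(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            // Assumption: "//" always starts a comment (never part of a string).
            int comment = line.indexOf("//");
            String code = comment >= 0 ? line.substring(0, comment) : line;
            if (IF_START.matcher(code).find()) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> sample = List.of("if (a) {", "// if{}", "  if{", "} else if (b) {");
        findIfLines(sample).forEach(System.out::println);
    }
}
```

Working line by line fits naturally with BufferedReader.readLine() and avoids the very greedy (.|\n)* part of a whole-file pattern.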
I have a small problem.
I have a text that I have to read in the browser several times.
Every time I open this text, a replaceAll that I wrote runs automatically.
It's very simple, but the problem is that each time I read the text the replacement is applied again, so I end up with a replaceAll of a replaceAll.
For example, I have in the text:
XIII
I want to replace it with
<b>XIII</b>
using:
txt.replaceAll("XIII","<b>XIII</b>")
The first time everything is fine, but when I read the text again, it becomes:
<b><b>XIII</b></b>
It's a silly problem, but I am just starting with Java.
I read that it is possible to use a regex. Could someone post a small example?
Thanks, and excuse my poor English.
You need negative lookbehind to prevent a match on an already marked-up string:
txt = txt.replaceAll("(?<!>)XIII", "<b>XIII</b>");
This expression looks a bit convoluted, but this is how it decomposes:
(?<! ... ) is the template for the negative lookbehind;
> is the specific character we want to make sure doesn't occur in front of your string.
I should also warn you that fixing up HTML with regex's usually turns into a diabolic cycle of upgrading the regex to handle yet another special case, only to see it fail on the next one. It ends up with a monster that nobody can read, let alone improve.
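To make the effect of the lookbehind concrete, here is a small sketch (the class BoldOnce and the method mark are hypothetical names) that applies the replacement twice and shows that the second pass changes nothing. It keeps the simple assumption from the answer: any XIII directly preceded by > is treated as already marked up, which would also skip something like <i>XIII.

```java
class BoldOnce {
    /** Wraps XIII in <b> tags unless it is directly preceded by '>'. */
    static String mark(String txt) {
        // Strings are immutable, so the result must be returned or reassigned.
        return txt.replaceAll("(?<!>)XIII", "<b>XIII</b>");
    }

    public static void main(String[] args) {
        String once = mark("Chapter XIII");
        String twice = mark(once);
        System.out.println(once);
        System.out.println(twice.equals(once)); // the second pass is a no-op
    }
}
```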
There's a really quick solution: do the opposite replacement before doing your own.
Let me show:
txt = txt.replaceAll("<b>XIII</b>","XIII").replaceAll("XIII","<b>XIII</b>");
So you first turn every <b>XIII</b> back into plain XIII and then wrap it in <b> again, which gives the same result without adding a new level of <b>.
What about this:
txt = txt.replaceAll("XIII", "<b>XIII</b>")
    .replaceAll("<b><b>", "<b>").replaceAll("</b></b>", "</b>");
I think <b><b> and </b></b> do not make much sense in HTML, so it is fine to collapse such duplicates even in other places.
I have data coming in a txt file delimited by pipes. Unfortunately, two fields can have multiple values. To separate these values, the sender used pipes again but put quotes around the whole field. My regex worked for months until a certain rare situation...
Regex currently:
([^\|]*)\|"?([^"]*)"?\|([^\|]*)\|"?([^"]*)"?
And it worked for the following situation which happens most of the time:
abc|"part1|part2"|abc|"tool1|tool2"
But this case is where the ([^"]*) jumps ahead and takes all from the blank to the end of the quotes:
abc||abc|"tool1|tool2"
So I realize I must account for the case where a pipe comes next instead of a quote.
I'm just not sure how.
P.S. For those PIG people that might be looking at this, I removed a backslash from each escape, to make it look more like Java, but in PIG you need 2, fyi.
In your expression you need to specify that the part between |s can be either quoted or not quoted. You can do it as follows:
(("[^"]*")|((?!")[^|]*))
Now you can repeat this part several times with |s in between, as you need.
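As a sketch of what "repeat this part several times" could look like in Java, the example below joins four copies of the field fragment with escaped pipes. The class PipeFields and the choice of exactly four fields are assumptions taken from the sample rows in the question. Surrounding quotes are stripped after matching.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PipeFields {
    // One field: either a quoted run (which may contain pipes),
    // or an unquoted run without pipes that does not start with a quote.
    private static final String FIELD = "(\"[^\"]*\"|(?!\")[^|]*)";

    // Four fields separated by literal pipes (count assumed from the samples).
    private static final Pattern ROW =
            Pattern.compile(String.join("\\|", FIELD, FIELD, FIELD, FIELD));

    /** Returns the four fields without surrounding quotes, or null if no match. */
    static String[] parse(String line) {
        Matcher m = ROW.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[4];
        for (int i = 0; i < 4; i++) {
            String f = m.group(i + 1);
            // A field matched by the quoted branch always has both quotes.
            fields[i] = f.startsWith("\"") ? f.substring(1, f.length() - 1) : f;
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("abc|\"part1|part2\"|abc|\"tool1|tool2\"")));
        System.out.println(Arrays.toString(parse("abc||abc|\"tool1|tool2\"")));
    }
}
```

Because the unquoted branch can match the empty string, the blank second field in the problem row no longer lets the pattern run ahead to the next quote.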
Sorry I couldn't think of a better title, but thanks for reading!
My ultimate goal is to read a .java file, parse it, and pull out every identifier. Then store them all in a list. Two preconditions are there are no comments in the file, and all identifiers are composed of letters only.
Right now I can read the file, split it on spaces, and store everything in a list. Anything in the list that is a Java reserved word is removed. I also remove any loose symbols that are not attached to anything (brackets and arithmetic symbols).
Now I am left with a bunch of weird strings, but at least they have no spaces in them. I know I am going to have to re-parse everything with a . delimiter in order to pull out identifiers like System.out.print, but what about strings like this example:
Logger.getLogger(MyHash.class.getName()).log(Level.SEVERE,
After re-parsing by . I will be left with more crazy strings like:
getLogger(MyHash
getName())
log(Level
SEVERE,
How am I going to be able to pull out all the identifiers while leaving out all the trash? Just keep re-parsing by every symbol that could exist in java code? That seems rather lame and time consuming. I am not even sure if it would work completely. So, can you suggest a better way of doing this?
There are several solutions you can use other than hacking together your own parser:
Use an existing parser, such as this one.
Use BCEL to read bytecode, which includes all fields and variables.
Hack into the compiler or run-time, using annotation processing or mirrors - I'm not sure you can find all identifiers this way, but fields and parameters for sure.
I wouldn't separate the entire file at once according to whitespace. Instead, I would scan the file letter-by-letter, saving every character in a buffer until I'm sure an identifier has been reached.
In pseudo-code:
clean buffer
for each letter l in file:
    if l is '
        toggle "character mode"
    if l is "
        toggle "string mode"
    if l is a letter AND "character mode" is off AND "string mode" is off
        add l to end of buffer
    else
        if buffer is NOT a keyword or a literal
            add buffer to list of identifiers
        clean buffer
Notice that some lines here hide further complexity. For example, to check whether the buffer is a literal you need to check for all of true, false, and null.
In addition, there are more bugs in the pseudo-code: it will also identify things like the e and L parts of literals (e in floating-point literals, L in long literals) as identifiers. I suggest adding additional "modes" to take care of them, but it's a bit tricky.
There are also a few more things to handle if you want it to be accurate. For example, you have to make sure you work with Unicode. I would strongly recommend studying the lexical structure of the language so you won't miss anything.
EDIT:
This solution can easily be extended to deal with identifiers with numbers, as well as with comments.
Small bug above - you need to handle \" differently than ", same with \' and '.
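The pseudo-code, together with the escape handling mentioned in the edit, might be sketched in Java roughly as follows. This is an illustration under the question's preconditions (no comments, identifiers made of letters only), not a complete lexer. The class name IdentifierScanner is hypothetical, and the SKIP set is deliberately partial: a real version would list every Java keyword.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class IdentifierScanner {
    // Partial, illustrative list of keywords and literals to ignore.
    private static final Set<String> SKIP = Set.of(
            "public", "class", "static", "void", "int", "new", "return",
            "if", "else", "for", "while", "import", "package",
            "true", "false", "null");

    static List<String> scan(String source) {
        List<String> identifiers = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        boolean inChar = false;
        boolean inString = false;
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if ((inChar || inString) && c == '\\') {
                i++; // skip the escaped character, e.g. \" or \'
                continue;
            }
            if (c == '\'' && !inString) {
                inChar = !inChar;       // toggle "character mode"
            } else if (c == '"' && !inChar) {
                inString = !inString;   // toggle "string mode"
            }
            if (Character.isLetter(c) && !inChar && !inString) {
                buffer.append(c);
            } else {
                flush(buffer, identifiers);
            }
        }
        flush(buffer, identifiers); // the file may end in the middle of a word
        return identifiers;
    }

    // Stores the buffered word unless it is empty or in SKIP, then clears it.
    private static void flush(StringBuilder buffer, List<String> identifiers) {
        if (buffer.length() > 0) {
            String word = buffer.toString();
            if (!SKIP.contains(word)) {
                identifiers.add(word);
            }
            buffer.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(scan("Logger.getLogger(MyHash.class.getName()).log(Level.SEVERE, \"a b\");"));
    }
}
```

It still has the weaknesses listed above, such as treating the L in 10L as an identifier.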
Wow, OK. Parsing is hard, really hard, to do right, and rolling your own Java parser is going to be incredibly difficult. You'll find there are a lot of edge cases you're just not prepared for. To handle them all, you'll need to write a real parser, which is composed of a number of things:
A lexical analyzer to break the input up into logical chunks
A grammar to determine how to interpret the aforementioned chunks
The actual "parser" which is generated from the grammar using a tool like ANTLR
A symbol table to store identifiers in
An abstract syntax tree to represent the code you've parsed
Once you have all that, you have a real parser. Of course you could skip the abstract syntax tree, but you need pretty much everything else. That leaves you with writing about 1/3 of a compiler. If you truly want to complete this project yourself, you should see if you can find an example for ANTLR that contains a preexisting Java grammar definition. That'll get you most of the way there, and then you'll need to use ANTLR to fill in your symbol table.
Alternately, you could go with the clever solutions suggested by Little Bobby Tables (awesome name, btw Bobby).