How should I get this specific text from str? - java

I have many strings in a database like this: "\\LDDESKTOP\news\1455Bloomberg Document # 180784.txt". I want to get the file name after the last backslash.
I tried the usual approach:
str.substring(str.lastIndexOf("\\")+1)
But it doesn't work, because the backslash is an escape character. Is there a way in Java, like Python's raw strings (str = r'.......'), to tell the compiler to treat it as a plain string?
Or how can I change the string to "\\\\LDDESKTOP\\news\\1455Bloomberg Document # 180784.txt" so that I can pass it to a File object to read the file?
How should I do this? Other ways to solve it are welcome.
Thanks.
The column named path (varchar(150)) in the news table holds values like "\LDDESKTOP\news\1362Bloomberg Document # 180691.txt".
I do a plain select on that column.
The code:
public List<String> getNewsFileName(String startTime,String endTime) {
List<String> newsFileNames = new ArrayList<String>();
String tableName = ConfigFile.getConfig("configuration.txt","SQLServerTable");
String sql = "select Path from [" + tableName + "] where localtime >= '" + startTime + "' and localtime <= '" + endTime + "'";
try {
if(connection==null) {
InvertedIndex.logger.log(Level.SEVERE, "Database connection has not been initialized");
System.exit(-1);
}
stmt=connection.createStatement();
ResultSet rs = stmt.executeQuery(sql);
while(rs.next()) {
String path=rs.getString(1);
newsFileNames.add(path);
}
} catch (SQLException e) {
InvertedIndex.logger.log(Level.SEVERE,"Fail to store news");
}
return newsFileNames;
}

Escape sequences let you write characters that would otherwise have a special meaning inside a Java string literal.
To put a single backslash character into a string literal, write two backslashes: \\.
String string = new String("\\\\LDDESKTOP\\news\\1455Bloomberg Document # 180784.txt");
String str = string.substring(string.lastIndexOf("\\")+1);
System.out.println(str);
This prints
1455Bloomberg Document # 180784.txt
Edit 1:
Once you have the file name, you can rebuild the full path using the same escaping.
String string = "\\\\LDDESKTOP\\news\\" + str;
This outputs the original
\\LDDESKTOP\news\1455Bloomberg Document # 180784.txt
Edit 2:
Based on what you asked: to turn every single backslash into a double backslash, combine the escape sequence with the String.replace method.
If you have this string:
String string = new String("\\\\LDDESKTOP\\news\\1455Bloomberg Document # 180784.txt");
You need to call this code to "double" every backslash:
String newString = string.replace("\\", "\\\\");
Written as a Java string literal, with every backslash escaped, the result is:
\\\\\\\\LDDESKTOP\\\\news\\\\1455Bloomberg Document # 180784.txt
The actual string, as printed, looks like this:
\\\\LDDESKTOP\\news\\1455Bloomberg Document # 180784.txt
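The escaping rules above apply only to string literals in Java source. A value read from a ResultSet already holds plain single backslashes, so lastIndexOf can be used on it directly. A minimal sketch, assuming the path arrives as shown in the question; the class and method names (FileNameDemo, fileNameOf) are made up for illustration:

```java
class FileNameDemo {
    // Returns the text after the last backslash, or the whole string if there is none
    // (lastIndexOf returns -1 in that case, so substring starts at 0).
    static String fileNameOf(String path) {
        return path.substring(path.lastIndexOf('\\') + 1);
    }

    public static void main(String[] args) {
        // In source code each backslash is written twice; at run time the string is
        // \\LDDESKTOP\news\1455Bloomberg Document # 180784.txt
        String fromDatabase = "\\\\LDDESKTOP\\news\\1455Bloomberg Document # 180784.txt";
        System.out.println(fileNameOf(fromDatabase));
    }
}
```

Strings returned by rs.getString(1) need no extra escaping before they are passed to fileNameOf or to a File constructor.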

Try this code:
String st = "\\LDDESKTOP\news\1455Bloomberg Document # 180784.txt";
st = st.replace("\n", "\\n");
st = st.replace("\\", "\\\\");
String str = st.substring(st.lastIndexOf("\\")+1);
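Test it.
"\n" in a Java string literal is a line break, which is why it is replaced first. Note also that "\145" in the same literal is read as an octal escape for the letter e, so the file name is not recovered exactly this way.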

Thanks for all your efforts. I think I have finally found the answer.
Instead of dealing with the string in the Java program, I process it directly with SQL functions.
This is what I do:
SELECT substring(path,len(path)-charindex('\',reverse(path))+2,charindex('\',reverse(path)))
FROM News
This does the job well.

Related

Regular expression to replace all substrings enclosed in curly braces and followed by an equals sign and digits

In the following String:
String toBeFormatted = "[[LngLatAlt{longitude=-7.125924901999952, latitude=33.831783175000055, altitude=NaN}, "
+ "LngLatAlt{longitude=-5.401396163999948, latitude=35.92213140900003, altitude=NaN}]]";
1- I need to replace every "LngLatAlt{longitude=" with an opening bracket "["
2- I also need to replace each remaining part such as ", latitude=33.831783175000055, altitude=NaN}" with ",33.831783175000055]"
so that the resulting string is:
"[[[-7.125924901999952,33.831783175000055],[-5.401396163999948,35.92213140900003]]]"
I tried the following regular expressions:
String regexTarget = "(\\[\\[LngLatAlt\\{longitude=)";
toBeFormatted.replaceAll(regexTarget, "\\[\\[\\[");
String regexTarget0 = "(, altitude=NaN\\}, LngLatAlt\\{longitude=)";
toBeFormatted.replaceAll(regexTarget0, "],\\[");
String regexTarget1 = "(, latitude=)";
toBeFormatted.replaceAll(regexTarget1, " ,");
String regexTarget2 = "(, altitude=NaN\\})";
toBeFormatted.replaceAll(regexTarget2, "]");
But it does not seem to work.
Thank you for your help.
Try something like:
String result = toBeFormatted.replaceAll("LngLatAlt\\{longitude=([^,]+), latitude=([^,]+), ([^}]+)\\}", "[$1, $2]");
System.out.println(result);
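A self-contained sketch of the capture-group approach. The result is returned rather than discarded, because String.replaceAll produces a new string and leaves the receiver unchanged. The class and method names are hypothetical, the pattern assumes every entry has exactly the longitude, latitude, altitude layout from the question, and the leading \s* is an addition that absorbs the space between entries so the output has no gap after the comma:

```java
class LngLatFormatDemo {
    static String format(String s) {
        // $1 = longitude, $2 = latitude; the altitude part is matched and dropped.
        return s.replaceAll(
                "\\s*LngLatAlt\\{longitude=([^,]+), latitude=([^,]+), altitude=[^}]+\\}",
                "[$1,$2]");
    }

    public static void main(String[] args) {
        String toBeFormatted =
                "[[LngLatAlt{longitude=-7.125924901999952, latitude=33.831783175000055, altitude=NaN}, "
                + "LngLatAlt{longitude=-5.401396163999948, latitude=35.92213140900003, altitude=NaN}]]";
        System.out.println(format(toBeFormatted));
    }
}
```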

Need Regular Expression to parse multi-line environmental variables

I want to parse a file that is a list of environmental variables similar to this example:
TPS_LIB_DIR = "$DEF_VERSION_DIR\lib\ver215";
TPS_PH_DIR = "$DEF_VERSION_DIR";
TPS_SCHEMA_DIR = "~TPS_DIR\Supersedes\code;" +
"~TPR_DIR\..\Supersedes\code;" +
"~TPN_DIR\..\..\Supersedes\code;" +
"$TPS_VERSION_DIR";
TPS_LIB_DIR = "C:\prog\lib";
BASE_DIR = "C:\prog\base";
SPARS_DIR = "C:\prog\spars";
SIGNALFILE_DIR = "E:\SIGNAL_FILES";
SIGNALFILE2_DIR = "E:\SIGNAL_FILES2";
SIGNALFILE3_DIR = "E:\SIGNAL_FILES2";
I came up with this regular expression that matches the single line definitions fine, but it will not match the multi-line definitions.
(\w+)\s*=\s*(.*);[\r\n]+
Does anyone know of a regular expression which will parse all lines in this file where the environmental variable name is in group 1 and the value (on right side of =) is in group 2? Even better would be if the multiple paths were in separate groups, but I can handle that part manually.
UPDATE:
Here is what I ended up implementing. The first pattern "Pattern p" matches the individual environmental variable blocks. The second pattern, "Pattern valpattern" parses the one or more values for each environmental variable. Hope someone finds this useful.
private static void parse(File filename) {
Pattern p = Pattern.compile("(\\w+)\\s*=\\s*([\\s\\S]+?\";)");
Pattern valpattern = Pattern.compile("\\s*\"(.+)\"\\s*");
try {
String str = readFile(filename, StandardCharsets.UTF_8);
Matcher matcher = p.matcher(str);
while(matcher.find()) {
String key = matcher.group(1);
Matcher valmatcher = valpattern.matcher(matcher.group(2));
System.out.println(key);
while(valmatcher.find()) {
System.out.println("\t" + valmatcher.group(1).replaceAll(System.getProperty("line.separator"), ""));
}
}
} catch (IOException e) {
System.out.println("Error: ProcessENV.parse -- problem parsing file: " + filename + System.lineSeparator());
e.printStackTrace();
}
}
static String readFile(File file, Charset encoding) throws IOException {
byte[] encoded = Files.readAllBytes(file.toPath());
return new String(encoded, encoding);
}
It is simpler to split on '=' and '";'.
[ c.strip().split(' = ') for c in s.split('";') ]
Or with double comprehension to get the individual paths:
[[p[0].strip(), *[x.strip(' "+\r\n') for x in p[1].split(';')]] for c in s.split('";') for p in [c.split(' = ')] if len(p) == 2]
The split can also be done with re, adding \s* to absorb the surrounding whitespace:
r = re.split(r'\s*=\s*|";\s*', text)
The even elements r[::2] are then the variable names and the odd elements r[1::2] the values.
After that, remove the extra whitespace left in the values.
You can use the following regex:
(\w+)\s*=\s*([\s\S]+?)";
It matches a run of word characters (group 1), zero or more whitespace characters, an equals sign, zero or more whitespace characters, then one or more of any character, non-greedy (group 2), and finally a closing double quote and a semicolon.
That matches every definition, including the multi-line ones.
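As a rough illustration of that regex in Java, the sketch below collects each variable with its list of values. The class name, the parse method and the second split pattern (which breaks group 2 at the `" +` line joins) are assumptions for this example, not part of the answers above:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EnvBlockDemo {
    // Group 1: variable name. Group 2: everything up to (not including) the closing ";
    private static final Pattern BLOCK = Pattern.compile("(\\w+)\\s*=\\s*([\\s\\S]+?)\";");

    static Map<String, List<String>> parse(String text) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        Matcher m = BLOCK.matcher(text);
        while (m.find()) {
            List<String> values = new ArrayList<>();
            // Split the raw value at each quote-plus-sign join between lines.
            for (String part : m.group(2).split("\"\\s*\\+\\s*")) {
                String v = part.replace("\"", "").trim();
                if (v.endsWith(";")) {
                    v = v.substring(0, v.length() - 1); // drop the path separator
                }
                values.add(v);
            }
            result.put(m.group(1), values); // a repeated name overwrites the earlier entry
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "TPS_PH_DIR = \"$DEF_VERSION_DIR\";\n"
                + "TPS_SCHEMA_DIR = \"~TPS_DIR\\Supersedes\\code;\" +\n"
                + "\"$TPS_VERSION_DIR\";\n";
        parse(text).forEach((name, values) -> System.out.println(name + " -> " + values));
    }
}
```

A map keeps only the last definition of a repeated name such as TPS_LIB_DIR in the sample file; a list of entries would be needed if duplicates matter.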

Parse a plain text into a Java Object

I'm parsing plain text and trying to convert it into an object.
The text looks like this (and I can't change the format):
"N001";"2014-08-12-07.11.37.352000";" ";"some#email.com ";4847 ;"street";"NAME SURNAME ";26 ;"CALIFORNIA ";21
and the object to convert it to:
String index;
String timestamp;
String mail;
Integer zipCode;
...
I've tried this:
StringTokenizer st1 = new StringTokenizer("\"N001\";\"2014-08-12-07.11.37.352000\";\" \";\"some#email.com \";4847 ;\"street\";\"NAME SURNAME \";26 ;\"CALIFORNIA \";21");
while(st1.hasMoreTokens()) {
System.out.println(st1.nextToken(";").replaceAll("\"",""));
}
The output is correct. I've been thinking of keeping a counter and hard-coding a switch in the loop that sets each field depending on the counter, but the problem is that I have 40 fields...
Any ideas?
Thanks a lot!
String line = "\"N001\";\"2014-08-12-07.11.37.352000\";\" \";\"some#email.com \";4847 ;\"street\";\"NAME SURNAME \";26 ;\"CALIFORNIA \";21";
StringTokenizer st1 = new StringTokenizer(line, ";");
while(st1.hasMoreTokens()) {
System.out.println(st1.nextToken().replaceAll("\"",""));
}
Or you can use the split method and get an array of values directly, using ; as the delimiter:
String[] values = line.split(";");
Then iterate through the array and convert the values the way you want.
Regardless of the way you are parsing the file, you somehow need to define the mapping of column-to-field (and how to parse the text).
If this is a CSV file, you could use a library like super-csv. All you need to do is write a mapping definition.
I would first split your input String based on the semi-colon separator, then clean up the values.
For instance:
String input = "\"N001\";\"2014-08-12-07.11.37.352000\";\" " +
"\";\"some#email.com " +
"\";4847 ;\"street\";\"NAME " +
"SURNAME \";26 ;\"CALIFORNIA " +
"\";21 ";
// raw split
String[] split = input.split(";");
System.out.printf("Raw: %n%s%n", Arrays.toString(split));
// cleaning up whitespace and double quotes
ArrayList<String> cleanValues = new ArrayList<String>();
for (String s: split) {
String clean = s.replaceAll("[\\s\"]", "");
if (!clean.isEmpty()) {
cleanValues.add(clean);
}
}
System.out.printf("Clean: %n%s%n", cleanValues);
Output
Raw:
["N001", "2014-08-12-07.11.37.352000", " ", "some#email.com ", 4847 , "street", "NAME SURNAME ", 26 , "CALIFORNIA ", 21 ]
Clean:
[N001, 2014-08-12-07.11.37.352000, some#email.com, 4847, street, NAMESURNAME, 26, CALIFORNIA, 21]
Note
In order to map the values to your variables you will need to know their index in advance, and it will have to be consistent.
Then you can use the get(int i) method to retrieve them from your List - e.g. cleanValues.get(2) will get you the e-mail, etc.
Note (2)
If you do not know the indices in advance or they may vary, then you are in trouble.
You can of course try to get those indices by using regular expressions but I suspect you might end up complicating your life quite a bit.
You can use Java reflection to automate the process.
Iterate over the fields (note that getFields() returns only public fields; use getDeclaredFields() for non-public ones):
Field[] fields = dummyRow.getClass().getFields();
and set your values:
SomeClass object = construct.newInstance();
field.set(object , value);
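A minimal sketch of how the reflection idea might be combined with the split approach. Everything named here (RowMapperDemo, Row, COLUMNS, parse) is hypothetical, only the four fields listed in the question are modelled, and mapping zipCode to the fifth column is an assumption. The column order is spelled out as field names because the order returned by getDeclaredFields() is not guaranteed:

```java
import java.lang.reflect.Field;

class RowMapperDemo {
    // Hypothetical target class with the fields shown in the question.
    static class Row {
        String index;
        String timestamp;
        String mail;
        Integer zipCode;
    }

    // One entry per column, in file order; null means the column is skipped.
    static final String[] COLUMNS = {"index", "timestamp", null, "mail", "zipCode"};

    static Row parse(String line) {
        String[] raw = line.split(";"); // assumes no ';' inside a quoted value
        Row row = new Row();
        try {
            for (int i = 0; i < COLUMNS.length && i < raw.length; i++) {
                if (COLUMNS[i] == null) {
                    continue;
                }
                String value = raw[i].replace("\"", "").trim();
                Field field = Row.class.getDeclaredField(COLUMNS[i]);
                field.setAccessible(true);
                if (field.getType() == Integer.class) {
                    field.set(row, Integer.valueOf(value)); // throws on non-numeric text
                } else {
                    field.set(row, value);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot map line to Row", e);
        }
        return row;
    }

    public static void main(String[] args) {
        Row row = parse("\"N001\";\"2014-08-12-07.11.37.352000\";\" \";\"some#email.com \";4847 ;\"street\"");
        System.out.println(row.index + " | " + row.timestamp + " | " + row.mail + " | " + row.zipCode);
    }
}
```

With 40 fields, only the COLUMNS array grows; the loop stays the same, and further types can be handled with extra branches on field.getType().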

Replace String in Java with regex and replaceAll

Is there a simple solution to parse a String by using regex in Java?
I have to adapt an HTML page. Therefore I have to rewrite several strings, e.g.:
href="/browse/PJBUGS-911"
=>
href="PJBUGS-911.html"
The strings differ only in the ID (e.g. 911). My first idea looks like this:
String input = "";
String output = input.replaceAll("href=\"/browse/PJBUGS\\-[0-9]*\"", "href=\"PJBUGS-???.html\"");
I want to replace everything except the ID. How can I do this?
Would be nice if someone can help me :)
You can capture substrings that were matched by your pattern, using parentheses. And then you can use the captured things in the replacement with $n where n is the number of the set of parentheses (counting opening parentheses from left to right). For your example:
String output = input.replaceAll("href=\"/browse/PJBUGS-([0-9]*)\"", "href=\"PJBUGS-$1.html\"");
Or if you want:
String output = input.replaceAll("href=\"/browse/(PJBUGS-[0-9]*)\"", "href=\"$1.html\"");
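For a quick check of the capture-group replacement, here is a small runnable sketch. The class and method names are invented for the example, and it uses [0-9]+ instead of [0-9]* so that at least one digit is required:

```java
class HrefRewriteDemo {
    static String rewrite(String html) {
        // $1 inserts whatever the parentheses captured, e.g. PJBUGS-911.
        return html.replaceAll("href=\"/browse/(PJBUGS-[0-9]+)\"", "href=\"$1.html\"");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("<a href=\"/browse/PJBUGS-911\">PJBUGS-911</a>"));
    }
}
```

Because replaceAll handles every match, the same call rewrites all such links in a page at once.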
This does not use a regexp, but it may still solve your problem:
output = "href=\"" + input.substring(input.lastIndexOf("/") + 1, input.lastIndexOf("\"")) + ".html\"";
This is how I would do it:
public static void main(String[] args)
{
String text = "href=\"/browse/PJBUGS-911\" blahblah href=\"/browse/PJBUGS-111\" " +
"blahblah href=\"/browse/PJBUGS-34234\"";
Pattern ptrn = Pattern.compile("href=\"/browse/(PJBUGS-[0-9]+?)\"");
Matcher mtchr = ptrn.matcher(text);
while(mtchr.find())
{
String match = mtchr.group(0);
String insMatch = mtchr.group(1);
String repl = match.replaceFirst(match, "href=\"" + insMatch + ".html\"");
System.out.println("orig = <" + match + "> repl = <" + repl + ">");
}
}
This just shows the regex and replacements, not the final formatted text, which you can get by using Matcher.replaceAll:
String allRepl = mtchr.replaceAll("href=\"$1.html\"");
If just interested in replacing all, you don't need the loop -- I used it just for debugging/showing how regex does business.

String.replaceAll() with regexp problem

I have Java code that selects a record from the database using a Spring/Hibernate native query and tries to strip the HTML tags from a text.
String sql = " SELECT * FROM posts LIMIT 1 ";
SQLQuery query = getSession().createSQLQuery(sql);
query.setResultTransformer(Transformers.ALIAS_TO_ENTITY_MAP);
Map each = (Map)query.uniqueResult();
String message = (String)each.get("Message");
String content = message.replaceAll("\\<.*?\\>", "");
But why does replaceAll not work here?
It does work for this code:
String message = "<a>blablasdddfdf</a>";
String content = message.replaceAll("\\<.*?\\>", "");
Thanks.
Neither of your cases should work. In the second case:
String message = "<a>blablasdddfdf</a>";
String content = content.replaceAll("\\<.*?\\>", "");
what would replaceAll replace in content when content has not been assigned an initial value?
Your last line should be:
String content = message.replaceAll("\\<.*?\\>", "");
for both cases to work properly.
In the first case, just make sure that message has a value before invoking replaceAll on it.
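The thread does not establish why the database case failed. One thing that may be worth ruling out, offered purely as an assumption about the stored data: in Java regexes the dot does not match line terminators unless DOTALL is enabled, so a tag that spans two lines in the message would survive "<.*?>". A sketch with a hypothetical helper:

```java
class StripTagsDemo {
    static String stripTags(String html) {
        // (?s) enables DOTALL, so '.' also matches line breaks inside a tag.
        return html.replaceAll("(?s)<.*?>", "");
    }

    public static void main(String[] args) {
        String message = "<a\nhref=\"x\">text</a>";
        System.out.println(message.replaceAll("<.*?>", "")); // opening tag is left in place
        System.out.println(stripTags(message));               // both tags are removed
    }
}
```

If the stored text has no line breaks inside tags, this makes no difference and the cause lies elsewhere.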
