How to select a file path using regex - java

I would like to create a Java regular expression that selects everything from file: to the last forward slash (/) in the file path, so that I can replace it with a different path.
<!DOCTYPE "file:C:/Documentum/XML%20Applications/joesdev/goodnews/book.dtd"/>
<myBook>cool book</myBook>
Does anyone have any ideas? Thanks!!

You just want to go to the last slash before the end-quote, right? If so:
file:[^"]+/
(the string "file:", then anything but ", ending with a /)
Properly escaped:
String regex = "file:[^\"]+/";
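For reference, here is a minimal sketch of how that regex could be used for the actual replacement. The class and method names (DtdPathRewriter, replaceDirectory) are made up for illustration, and it assumes you want to keep the "file:" prefix and the file name and swap only the directory part:

```java
import java.util.regex.Matcher;

class DtdPathRewriter {
    // Replaces everything from "file:" up to the last '/' before the closing quote.
    static String replaceDirectory(String input, String newDir) {
        // quoteReplacement keeps any '$' or '\' in the new path from being
        // interpreted as a group reference or escape by replaceFirst.
        String replacement = Matcher.quoteReplacement("file:" + newDir + "/");
        return input.replaceFirst("file:[^\"]+/", replacement);
    }

    public static void main(String[] args) {
        String original =
            "<!DOCTYPE \"file:C:/Documentum/XML%20Applications/joesdev/goodnews/book.dtd\"/>";
        System.out.println(replaceDirectory(original, "C:/Documentum/badnews"));
        // prints: <!DOCTYPE "file:C:/Documentum/badnews/book.dtd"/>
    }
}
```

Because [^"]+ is greedy and cannot cross the closing quote, the match backtracks to the last / inside the quotes, which is why book.dtd survives.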

You could try to process this yourself, but a better scheme would be to just pick out the part between the quotes and use java.io.File to separate the directory name from the filename. That way you don't have to worry about / vs \ or various escape characters.

String newPath = "C:/Documentum/badnews";
String originalPath = "<!DOCTYPE \"file:C:/Documentum/XML%20Applications/joesdev/goodnews/book.dtd\"/>";
// Match up to the last slash so the file name (book.dtd) is kept intact
System.out.println(originalPath.replaceFirst("file:C:[/\\w%]+/", "file:" + newPath + "/"));

Try this:
"file:.*/[^/]*"/>

Related

Not able to read Properties File using Backward Slash

I am not able to read a properties file correctly using Java: backslashes in the properties file are not working. The value is shown like this: destination :C:Usersxxx.a
String filename="D://Desktop//xxx.properties";
is = new FileInputStream(filename);
Properties prop=new Properties();
prop.load(is);
System.out.println("destination :"+prop.getProperty("destination"));
The properties file is:
destination=C:\Users\xxx.a\
Result is showing
destination :C:Usersxxx.a
But I want it to show destination :C:\Users\xxx.a\
Can you please suggest a solution?
\ is an escape character.
The forward slash / is the path separator in Unix environments.
The backslash \ is the path separator in Windows environments.
So you need to use \\ or / as the path separator. You cannot use a single \ directly, because the properties file format (like Java string literals) treats it as an escape character.
So you need to change your properties file to make your program work.
Use either / or \\ as the path separator in your properties file.
In your case you want to show C:\Users\xxx.a\.
So use C:\\Users\\xxx.a\\ in your properties file to get the output C:\Users\xxx.a\
The \ character is used as an "escape character" in many programming languages. It gives a special meaning to the next character in the text. For example, \n encodes the special character "new-line".
Use \\ instead of \. This indicates to the parser that you mean the actual symbol, not an escape character. For example, your property value would be:
destination=C:\\Users\\xxx.a\\
You need to double the backslashes in your properties file like this:
destination=C:\\Users\\xxx.a\\
The other way is to use forward slashes in the properties file:
destination=C:/Users/xxx.a/
A \ is an escape character, so it is removed. Writing two backslashes escapes the second one, so exactly one is left.
You can store it in D:/Desktop/xxx.properties as
destination=C:/Users/xxx.a/
and show it with a single backslash
String fileName = prop.getProperty("destination");
System.out.println("destination: " + fileName); // shows: C:/Users/xxx.a/
System.out.println("destination: " + Paths.get(fileName)); // on Windows shows: C:\Users\xxx.a
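If you want to see the escaping rules from the answers above in action without touching a real file, something like the following sketch should do. It loads the properties text from a String instead of D://Desktop//xxx.properties; the class and method names are made up for illustration:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertiesBackslashDemo {
    // Parses properties-file text held in memory and returns the "destination" value.
    static String readDestination(String fileContent) {
        Properties prop = new Properties();
        try {
            prop.load(new StringReader(fileContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return prop.getProperty("destination");
    }

    public static void main(String[] args) {
        // Java string literals need their own escaping, so "\\" below stands
        // for ONE backslash in the properties text and "\\\\" for TWO.
        String single  = "destination=C:\\Users\\xxx.a\\\n";
        String doubled = "destination=C:\\\\Users\\\\xxx.a\\\\\n";
        String forward = "destination=C:/Users/xxx.a/\n";

        System.out.println(readDestination(single));  // C:Usersxxx.a
        System.out.println(readDestination(doubled)); // C:\Users\xxx.a\
        System.out.println(readDestination(forward)); // C:/Users/xxx.a/
    }
}
```

The first case reproduces the output from the question: every single backslash is consumed as an escape, and the trailing one is treated as a line continuation.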

Java how to handle a \ on input

I am currently trying to split a String folder. I get the value from a file system and it usually looks something like EAM\Testing.
String folder = "EAM\Testing";
String[] parts = folder.split("\\");
I know \ has special rules to it in java.
String folder = "EAM\\Testing";
String[] parts = folder.split("\\\\");
(I know the code above would work if I could control what the input looked like)
My problem is that I can not control what string folder is as input from a location of a file.
Is there a way to get this to work where folder only has one \ in it?
This is for a recycle bin component I am writing for Documentum, an enterprise content management system. When a document is deleted and its folder doesn't exist anymore, I want to recreate the folder. In order to recreate it, the folder names must be separated, because I have to create the folders one at a time.
Here is how I get the name of the folder.
File f = new File(relationRecord.getRepeatingString(
"dp_original_folder_paths",
i));
(This gives an input such as \EAM\testing)
String folder1 = f.toString();
I then get rid of the first \ by
String folder = folder1.substring(1);
Which gives me EAM\testing
Well if this is literally a file path, you should consider using the Path class, it'll make your life easier.
Path path = Paths.get("C:\\home\\joe\\foo");
System.out.format("toString: %s%n", path.toString());
System.out.format("getFileName: %s%n", path.getFileName());
System.out.format("getName(0): %s%n", path.getName(0));
System.out.format("getNameCount: %d%n", path.getNameCount());
System.out.format("subpath(0,2): %s%n", path.subpath(0,2));
System.out.format("getParent: %s%n", path.getParent());
System.out.format("getRoot: %s%n", path.getRoot());
Your second option
String[] parts = folder.split("\\\\");
Should work fine for your input string. When you write a string literal like "EAM\\Testing", the resulting string contains only one backslash. You can read the details under escape sequences in the Java documentation.
The reason you need four backslashes in split is that \ is an escape character both for string literals and for regular expressions (String#split accepts a regular expression as its argument).
You should be doing something like this -
String s = "EAM\\testing";
String a[] = s.split("\\\\");
Here you duplicate the backslash once for the String (since \ is an escape character for String) and again for the regex for the same reason.
Your question seems to be "how can I remove a leading \ from a string?":
folder = folder.replaceAll("^\\\\", "");
This searches for a backslash at the start of the string and, if found, replaces it with nothing (i.e. deletes it).
Regarding backslash vs forward slash characters in paths, Java handles both.
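Putting the answers above together, here is a small sketch that splits a value like \EAM\testing into folder names regardless of a leading backslash. It uses Pattern.quote so the backslash only has to be escaped once (for the string literal) instead of twice. FolderSplitter and splitFolders are names I made up:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class FolderSplitter {
    // Splits a backslash-separated folder path into its non-empty segments.
    static List<String> splitFolders(String path) {
        List<String> parts = new ArrayList<>();
        // Pattern.quote("\\") yields a regex that matches one literal backslash.
        for (String part : path.split(Pattern.quote("\\"))) {
            if (!part.isEmpty()) {   // skips the empty segment caused by a leading backslash
                parts.add(part);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // In source code the backslashes are doubled; the runtime value is \EAM\testing
        System.out.println(splitFolders("\\EAM\\testing")); // [EAM, testing]
    }
}
```

Note that the doubling is only needed in source code. A string that arrives at runtime from getRepeatingString already contains single backslashes and needs no extra treatment.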

How to replace double slash with single slash for an url

For a given URL like "http://google.com//view/All/builds", I want to replace the double slash with a single slash. For example, the above URL should display as "http://google.com/view/All/builds".
I don't know regular expressions. Can anyone help me achieve this using regular expressions?
To avoid replacing the first // in http:// use the following regex :
String to = from.replaceAll("(?<!http:)//", "/");
PS: if you want to handle https use (?<!(http:|https:))// instead.
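As a quick check of the lookbehind approach, here is a self-contained sketch. It is a slight variation on the regex above: it uses //+ instead of // so that runs of three or more slashes also collapse to one, and it only knows about http and https. The class and method names are made up:

```java
class DoubleSlashFixer {
    // Collapses runs of two or more slashes, except the "//" directly after "http:" or "https:".
    static String collapseSlashes(String url) {
        return url.replaceAll("(?<!(http:|https:))//+", "/");
    }

    public static void main(String[] args) {
        System.out.println(collapseSlashes("http://google.com//view/All/builds"));
        // prints: http://google.com/view/All/builds
    }
}
```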
Is Regex the right approach?
In case you wanted this solution as part of an exercise to improve your regex skills, then fine. But what is it that you're really trying to achieve? You're probably trying to normalize a URL. Replacing // with / is one aspect of normalizing a URL. But what about other aspects, like removing redundant ./ and collapsing ../ with their parent directories? What about different protocols? What about ///? What about the // at the start? What about /// at the start in case of file:///?
If you want to write a generic, reusable piece of code, using a regular expression is probably not the best approach, and it's reinventing the wheel. Instead, consider java.net.URI.normalize().
java.net.URI.normalize()
java.lang.String
String inputUrl = "http://localhost:1234//foo//bar//buzz";
String normalizedUrl = new URI(inputUrl).normalize().toString();
java.net.URL
URL inputUrl = new URL("http://localhost:1234//foo//bar//buzz");
URL normalizedUrl = inputUrl.toURI().normalize().toURL();
java.net.URI
URI inputUri = new URI("http://localhost:1234//foo//bar//buzz");
URI normalizedUri = inputUri.normalize();
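The snippets above leave out the checked-exception handling. A runnable sketch could look like the following; it uses URI.create, which throws an unchecked IllegalArgumentException for malformed input instead of URISyntaxException. UrlNormalizer is a made-up name. Also note that collapsing duplicate slashes is what the JDK implementation does in practice; the Javadoc for normalize() itself only talks about "." and ".." segments:

```java
import java.net.URI;

class UrlNormalizer {
    // Normalizes the path of a URL given as a string.
    static String normalize(String url) {
        return URI.create(url).normalize().toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("http://localhost:1234//foo//bar//buzz"));
        System.out.println(normalize("http://google.com/a/./b/../c"));
    }
}
```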
Regex
In case you do want to use a regular expression, think of all possibilities. What if, in future, this should also process other protocols, like https, file, ftp, fish, and so on? So, think again, and probably use URI.normalize(). But if you insist on a regular expression, maybe use this one:
String normalizedUri = uri.replaceAll("(?<!\\w+:/?)//+", "/");
Compared to other solutions, this works with all URLs that look similar to HTTP URLs but use a different protocol instead of http, like https, file, ftp and so on, and it will keep the triple slash /// in the case of file:///. But, unlike java.net.URI.normalize(), this does not remove redundant ./, it does not collapse ../ with their parent directories, it does not handle other aspects of URL normalization that you and I might have forgotten about, and it will not be updated automatically with newer RFCs about URLs, URIs, and such.
String to = from.replaceAll("(?<!(http:|https:))//+", "/");
will match two or more slashes.
Here is the regexp:
/(?<=[^:\s])(\/+\/)/g
It finds multiple slashes in url preserving ones after protocol regardless of it.
Handles also protocol relative urls which start from //.
@Test
public void shouldReplaceMultipleSlashes() {
assertEquals("http://google.com/?q=hi", replaceMultipleSlashes("http://google.com///?q=hi"));
assertEquals("https://google.com/?q=hi", replaceMultipleSlashes("https:////google.com//?q=hi"));
assertEquals("//somecdn.com/foo/", replaceMultipleSlashes("//somecdn.com/foo///"));
}
private static String replaceMultipleSlashes(String url) {
return url.replaceAll("(?<=[^:\\s])(\\/+\\/)", "/");
}
Literally means:
(\/+\/) - find group: /+ one or more slashes followed by / slash
(?<=[^:\s]) - the group must be preceded (positive lookbehind) by a character from the negated set [^:\s], i.e. anything except a : colon or \s whitespace
g - global search flag
I suggest you simply use String.replace, whose documentation is at http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#replace(java.lang.CharSequence, java.lang.CharSequence)
Something like
myString.replace("//", "/");
If you want to keep the first occurrence (the one after the protocol):
String[] parts = str.split("//", 2);
str = parts[0] + "//" + parts[1].replaceAll("//", "/");
Which is the simplest way (without a complicated regular expression). I don't know the corresponding regular expression, if there is an expert looking at the thread.... ;)

Search and replace "/" at end of url's using regular expressions in java

Below is my regular expression:
\\bhttps?://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]\\b
When the request URL is of the form http://www.example.com/ , the last character is not replaced in my shortened URL, and a / is appended at the end.
The regex is not able to find the last /.
Please help with this.
I think that / would be a word boundary, so maybe it works better if you add a ? to the end, so it reads:
\\bhttps?://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]\\b?
what about:
if(url.endsWith("/"))
url = url.substring(0,url.length()-1);
or if you need to use regular expressions you can do something like this:
url = url.replaceAll("(\\bhttps?://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*)/(\\b?)","$1$2");
If all you want is to replace the trailing / (which is what your question directly asks), you can simply do:
url = url.substring(0, url.lastIndexOf('/'));
Remember to KISS often.
You could simply use:
url = url.replaceAll("/+$","");
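To compare the suggestions above, here is a small sketch of the two simplest variants side by side. The class and method names are made up. The endsWith version removes exactly one trailing slash, while the regex version removes any number of them, and both leave a URL without a trailing slash untouched:

```java
class TrailingSlash {
    // Removes a single trailing '/' if there is one.
    static String stripOne(String url) {
        if (url.endsWith("/")) {
            return url.substring(0, url.length() - 1);
        }
        return url;
    }

    // Removes all trailing '/' characters; '$' anchors the match at the end of the input.
    static String stripAll(String url) {
        return url.replaceAll("/+$", "");
    }

    public static void main(String[] args) {
        System.out.println(stripOne("http://www.example.com/"));   // http://www.example.com
        System.out.println(stripAll("http://www.example.com///")); // http://www.example.com
    }
}
```

This is also where the substring(0, lastIndexOf('/')) suggestion differs: on a URL like http://www.example.com/page it would cut off /page, so it is only safe when you already know the URL ends with a slash.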

Java property file

I have a properties file, and in it I have defined a property called:
config.folder = C:\myfolder\configfolder
Now the problem is that when loading the properties, this property returns a value like this:
C:myfolderconfigfolder
I want the single backslashes to be kept so that it returns the correct directory path. I know a single backslash does not comply with Java string escaping. If the user uses double backslashes I am able to convert them, but how can I handle a single backslash?
A better approach is to change the slash from backslash to forward slash, like so:
config.folder = C:/myfolder/configfolder
Java knows how to interpret this structure.
Change it to: config.folder = C:\\myfolder\\configfolder
I will suggest that you start using System Properties for this i.e. file.separator
String fileSeparator = System.getProperty("file.separator");
Now say you got the path as :
String str = "C:/myfolder/configfolder";
String fileSeparator = System.getProperty("file.separator");
str= str.replace("/", fileSeparator);
System.out.println(str);
OUTPUT is :
C:\myfolder\configfolder
This approach should help your program work on any OS: for example UNIX, which uses "/" as the file separator between the components of a file path, and Windows, which uses "\".
Hope this might help in some way.
Regards
The best way to work with a file path literal is to use the system properties, i.e. String fileSeparator = System.getProperty("file.separator"); then you can replace your slashes with it to get the file path. Regards
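Here is a compact sketch of the separator replacement described in the answers above. SeparatorDemo and withSeparator are made-up names. The separator is passed in as a parameter so the behaviour can be tried for both conventions on any machine; in real code you would most likely pass File.separator, which holds the same value as the file.separator system property:

```java
import java.io.File;

class SeparatorDemo {
    // Rewrites both '/' and '\' in the path to the given separator.
    static String withSeparator(String path, String separator) {
        // String.replace works on literal text, so no regex escaping is needed.
        return path.replace("/", separator).replace("\\", separator);
    }

    public static void main(String[] args) {
        String configured = "C:/myfolder/configfolder";
        // Output depends on the OS the program runs on.
        System.out.println(withSeparator(configured, File.separator));
        // Forcing the Windows convention:
        System.out.println(withSeparator(configured, "\\")); // C:\myfolder\configfolder
    }
}
```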
