Arabic characters in String.Format("%d",1,Locale.US) - java

I have the following code:
private static final String PATTERN = "file_%d.txt";
int no; // 1-3
String filename = String.format(PATTERN, no ,Locale.US);
and later on I get an exception saying that
java.io.FileNotFoundException: file_٣.txt
which indicates that %d got replaced with an Arabic numeral. How can that be if I explicitly specify Locale.US?

The locale needs to be the first parameter:
String.format(Locale.US,PATTERN, no);
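A minimal sketch of the difference, assuming a hypothetical class name LocaleFormatDemo and helper fileName. When the Locale is passed after the format string, it is just a surplus argument for a pattern with a single %d, so it is ignored and the JVM's default locale decides how the digits are rendered.

```java
import java.util.Locale;

class LocaleFormatDemo {
    static final String PATTERN = "file_%d.txt";

    // Formats the file name with an explicit locale, so the digits do not
    // depend on the JVM's default locale.
    static String fileName(int no) {
        return String.format(Locale.US, PATTERN, no);
    }

    public static void main(String[] args) {
        // Locale.US is a surplus argument here and is silently ignored;
        // the result depends on the default locale of the JVM.
        System.out.println(String.format(PATTERN, 3, Locale.US));
        // Locale comes first: formatted with US conventions.
        System.out.println(fileName(3));
    }
}
```

Passing the locale explicitly is usually the safer choice for anything that ends up in a file name or a protocol message.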

Related

Extracting a substring before one of two characters using regex

So I have an initial file set:
file1.txt
file2.txt
When I make a change to these files and save them, I append a time stamp to them, so they'd become:
fileN_DD-Mon-YYYY_HHMMSS.txt
But if I was to make any additional saves, the timestamps would begin stacking:
fileN_DD-Mon-YYYY_HHMMSS_DD-Mon-YYYY_HHMMSS.txt
I need a way to get the substring that occurs before the first occurrence of either "." or "_" to get the string that is before them (i.e., actual file name ("fileN")).
I've gotten to this point with
int lastDot = fileName.getName().lastIndexOf('.');
String renamed = fileName.getName().substring(0,lastDot) + getDateTime() + fileName.getName().substring(lastDot);
I've tried using Scanner::useDelimiter to get the first occurrence of a "." or "_" using regexes, but no luck.
String renamed = savedFileName(fileName);
public static String savedFileName(String fileName) {
final String TXT = ".txt";
Scanner s = new Scanner(fileName);
s.useDelimiter(<regex>);
String trueFileName = s.next();
s.close();
return trueFileName + getDateTime() + TXT;
}
for the regex, I've tried "\\W", but that returns just the latest timestamp:
_DD-Mon-YYYY_HHMMSS.txt
, and ".|_" but that returns this monstrosity:
fileN.txt_DD-Mon-YYYY.txt_(more timestamps).txt.
You can use String's split method with regex pattern \.|_:
String longFile = "fileN_DD-Mon-YYYY_HHMMSS.txt";
String shortFile = "file1.txt ";
String pattern = "\\.|_"; // need to escape backslash
System.out.println(longFile.split(pattern)[0]);
System.out.println(shortFile.split(pattern)[0]);
Or, equivalently, regex [._].
Output:
fileN
file1
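As a sketch of how the split approach could be wrapped in a helper (the class and method names are made up for illustration), a limit of 2 stops splitting after the first delimiter, since only the first piece is needed:

```java
class BaseNameDemo {
    // Returns the part of the name before the first '.' or '_'.
    // If the name contains neither character, the whole name is returned.
    static String baseName(String fileName) {
        return fileName.split("[._]", 2)[0];
    }

    public static void main(String[] args) {
        System.out.println(baseName("fileN_DD-Mon-YYYY_HHMMSS_DD-Mon-YYYY_HHMMSS.txt"));
        System.out.println(baseName("file1.txt"));
    }
}
```

The new name can then be built as baseName(name) + getDateTime() + ".txt", so timestamps no longer stack.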

Error while replace string with symbol in Java

I'm solving this problem:
problem
And what I did is this:
import java.io.*;
import static java.lang.System.exit;
import java.util.*;
//Driver for Abbreviations
public class AbbreviationsDriver {
//string of message
private static String message = "";
//List of Abbreviations
private static String[] AbbreviationsList;
//Abbreviations list file
private static File AbbreviationsListFile = new File("abbreviations.txt");
//message file
private static File inputMessageFile = new File("sample_msg.txt");
//output message file
private static File outputMessageFile = new File("sample_output.txt");
//main method
public static void main(String[] args) throws FileNotFoundException {
setAbbreviations(readFileList(AbbreviationsListFile));
System.out.println("list of abbriviations:\n" + Arrays.toString(AbbreviationsList));
setMessage(readFile(inputMessageFile));
System.out.println("\nMessage in input file:\n" + message);
writeFile(outputMessageFile,addTags(message, AbbreviationsList));
System.out.println("\nMessage with tag in output file:\n" + addTags(message, AbbreviationsList));
}
//method to add tags
public static String addTags(String toTag, String[] abbreviations){
for(String abbreviation:abbreviations)
if(toTag.contains(abbreviation)){
toTag = toTag.replaceAll(abbreviation, "<" + abbreviation + ">");
}
return toTag;
}
//method to read the file list
public static String[] readFileList(File fileInput){
String input = "";
try{
Scanner inputStream = new Scanner(fileInput);
while(inputStream.hasNextLine()){
input = input + inputStream.nextLine()+ "<String>";
}
inputStream.close();
// System.out.println("list in string: " + input);
return input.split("<String>");
}
catch(Exception exception){
System.out.println("error in getting string array from file:\t" + exception.getMessage());
exit(0);
return new String[] {""};
}
}
//method to read the file
public static String readFile(File fileInput){
String inputFile = "";
try{
Scanner inputStatement = new Scanner(fileInput);
while(inputStatement.hasNextLine()){
inputFile = inputFile + inputStatement.nextLine();
}
inputStatement.close();
return inputFile;
}
catch(Exception exception){
System.out.println("error in getting message from file:\t" + exception.getMessage());
exit(0);
return "";
}
}
//method to write the output file
public static void writeFile(File fileName, String outString){
try{
PrintWriter outputStatement = new PrintWriter(fileName);
outputStatement.print(outString);
outputStatement.close();
}
catch(Exception exception){
System.out.println("error in setting message of file:\t" + exception.getMessage());
exit(0);
}
}
//method to set abbreviations
public static void setAbbreviations(String[] newAbbreviationsList){
AbbreviationsList = newAbbreviationsList;
}
//setter to set message
public static void setMessage(String newMessage){
message = newMessage;
}
//input string
public static String inputString(){
return new Scanner(System.in).nextLine();
}
}
abbreviations.txt is here:
lol
:)
iirc
4
u
ttfn
and sample_msg.txt is here:
How are u today? Iirc, this is your first free day. Hope you are having fun! :)
but when I compile and run, the error message comes out:
list of abbriviations:
[lol, :), iirc, 4, u, ttfn]
Message in input file:
How are u today? Iirc, this is your first free day. Hope you are having fun! :)
Exception in thread "main" java.util.regex.PatternSyntaxException: Unmatched closing ')' near index 0
:)
^
at java.util.regex.Pattern.error(Pattern.java:1969)
at java.util.regex.Pattern.compile(Pattern.java:1706)
at java.util.regex.Pattern.<init>(Pattern.java:1352)
at java.util.regex.Pattern.compile(Pattern.java:1028)
at java.lang.String.replaceAll(String.java:2223)
at AbbreviationsDriver.addTags(AbbreviationsDriver.java:44)
at AbbreviationsDriver.main(AbbreviationsDriver.java:36)
Process finished with exit code 1
I don't know how to solve this error because I've never seen this error before.
Please help me!
You are passing the wrong kind of argument to replaceAll(): its first parameter must be a regex. For your purpose a regex is not needed, so use the replace() method instead.
You got the error because ) is a metacharacter in regex, so it must either be escaped or be paired with an opening (.
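A small sketch of the difference between the two methods (the class name and the tagLiteral helper are hypothetical): replace() treats its first argument as literal text, while replaceAll() compiles it as a regex and fails on the unmatched parenthesis.

```java
import java.util.regex.PatternSyntaxException;

class ReplaceVsReplaceAll {
    // Literal replacement: no regex is involved, so ":)" is safe.
    static String tagLiteral(String text, String abbreviation) {
        return text.replace(abbreviation, "<" + abbreviation + ">");
    }

    public static void main(String[] args) {
        System.out.println(tagLiteral("Hope you are having fun! :)", ":)"));
        try {
            // ":)" is not a valid regex, so this call throws.
            "Hope you are having fun! :)".replaceAll(":)", "<:)>");
        } catch (PatternSyntaxException e) {
            System.out.println("Invalid regex: " + e.getDescription());
        }
    }
}
```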
Solution
You need to treat abbreviations with metacharacters and strings without metacharacters differently. For strings with metacharacters (e.g. :) where ) is a metacharacter), you should use String#replace while for the strings without metacharacter you should use String#replaceAll.
When you use String#replaceAll, you should create a capturing group which includes word boundaries e.g. (\bu\b) so that only those u will be processed which appear as a word. Finally, you should replace the capturing group with <$1> where $1 refers to the first (in the code given below, there is only one capturing group) capturing group e.g. (\bu\b) will be replaced by <u>.
Demo:
public class Main {
public static void main(String[] args) {
String[] abbrWithoutMetaChars = { "lol", "iirc", "4", "u", "ttfn" };
String[] abbrWithMetaChars = { ":)" };
// Test string
String str = "How are u today? iirc, this is your first free day. Hope you are having fun! :)";
// Replace all abbr. without meta chars
for (String abbreviation : abbrWithoutMetaChars) {
str = str.replaceAll("(\\b" + abbreviation + "\\b)", "<$1>");
}
// Replace all abbr. with meta chars
for (String abbreviation : abbrWithMetaChars) {
str = str.replace(abbreviation, "<" + abbreviation + ">");
}
System.out.println(str);
}
}
Output:
How are <u> today? <iirc>, this is your first free day. Hope you are having fun! <:)>
The problem is actually tricky. For example, in the list of abbreviations, u should be interpreted as a word and not a letter, since in your expected output you don't surround the letter u in the word your with angle brackets but only the u that appears by itself. Hence your code needs to locate the abbreviation as a single word in the input.
Also, iirc appears in the abbreviations list but in the input you have Iirc (with a capital I) and in the expected output it should appear as <Iirc> and not as <iirc>. In other words you should ignore case when locating the abbreviation but you need to keep the case after surrounding the abbreviation with angle brackets.
Then you have :) in the abbreviations list but ) has special meaning in regular expression syntax so your code also needs to handle that situation.
All the above implies that you need to analyze the contents of the abbreviations list file in order to turn a raw abbreviation into a valid regular expression that you can then use to locate the abbreviation in the input text.
If you assume that the abbreviations list may contain every possible abbreviation, you would probably need a large amount of code to handle each one properly. Rather than do that, I just concentrated on your sample list which divides easily into two groups:
simple words
punctuation only
Note that the second group is also known as emoticons and some emoticons contain both letters and punctuation which my code, below, does not handle. As I said, my solution only pertains to your sample list of abbreviations.
Here is the code and below the code are some notes regarding it. Please note that I took the liberty of not just fixing your code, but refactoring it as well.
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
//Driver for Abbreviations
public class AbbreviationsDriver {
//Abbreviations list file
private static Path abbreviationsListPath = Paths.get("abbreviations.txt");
//message file
private static Path inputPath = Paths.get("sample_msg.txt");
//output message file
private static File outputMessageFile = new File("sample_output.txt");
//main method
public static void main(String[] args) throws FileNotFoundException {
List<String> abbreviationsList = readFileList(abbreviationsListPath);
System.out.println("List of abbreviations: " + abbreviationsList);
String message = readFile(inputPath);
System.out.println("\nMessage in input file:\n" + message);
String result = addTags(message, abbreviationsList);
writeFile(outputMessageFile, result);
System.out.println("\nMessage with tag in output file:\n" + result);
}
//method to add tags
public static String addTags(String toTag, List<String> abbreviations) {
for (String abbreviation : abbreviations) {
String regex;
if (abbreviation.contains(")")) {
regex = "(\\Q" + abbreviation + "\\E)";
}
else {
regex = "(?i)(\\b" + abbreviation + "\\b)";
}
toTag = toTag.replaceAll(regex, "<$1>");
}
return toTag;
}
//method to read the file list
public static List<String> readFileList(Path path) {
List<String> list;
try {
list = Files.readAllLines(path);
}
catch (IOException exception) {
list = List.of();
System.out.println("Failed to load: " + path);
exception.printStackTrace();
}
return list;
}
//method to read the file
public static String readFile(Path path) {
String inputFile;
try {
inputFile = Files.readString(path);
}
catch (IOException exception) {
System.out.println("Failed to read: " + path);
exception.printStackTrace();
inputFile = "";
}
return inputFile;
}
//method to write the output file
public static void writeFile(File fileName, String outString) {
try {
PrintWriter outputStatement = new PrintWriter(fileName);
outputStatement.print(outString);
outputStatement.close();
}
catch (Exception exception) {
System.out.println("Failed to write file: " + fileName);
exception.printStackTrace();
}
}
}
I use interface Path rather than class File so that I can use methods of class Files to read the text files that contain the abbreviations list and the input. Hence my code works with interface List rather than with an array of String.
Passing class members to methods as method parameters defeats the purpose of having a class member in the first place. Hence I removed the members message and AbbreviationsList.
The actual work of locating the abbreviations in the input and surrounding them with angle brackets, all occurs in method addTags. Here I handle each separate group of abbreviations. If the abbreviation contains the character ), I quote it by surrounding it with quote markers \Q and \E. (Refer to javadoc of class Pattern). Otherwise the abbreviation is a regular word, so I surround it with the word boundary marker \b. I also enclose each regular expression in parentheses so as to make it a capturing group. Note that the second regular expression begins with (?i) which means to ignore case. Hence iirc will match Iirc.
The replacement string is <$1>. The $1 is replaced with the string that was actually matched so any abbreviation found in the input will be replaced by the matched string surrounded with angle brackets.
Finally, here is the output when running the above code and using your sample data.
List of abbreviations: [lol, :), iirc, 4, u, ttfn]
Message in input file:
How are u today? Iirc, this is your first free day. Hope you are having fun! :)
Message with tag in output file:
How are <u> today? <Iirc>, this is your first free day. Hope you are having fun! <:)>
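The check for ) only covers the sample list. A possible generalization, offered as a sketch rather than as part of the answer above: quote every abbreviation with Pattern.quote and add a word boundary only on a side where the abbreviation starts or ends with a word character. The class name is made up, and the sequential replacement is kept from the original code.

```java
import java.util.List;
import java.util.regex.Pattern;

class QuotedAbbreviationTagger {
    static String addTags(String text, List<String> abbreviations) {
        for (String abbreviation : abbreviations) {
            if (abbreviation.isEmpty()) {
                continue; // an empty line in the list would match everywhere
            }
            // Quote the abbreviation so metacharacters such as ')' are literals.
            String regex = Pattern.quote(abbreviation);
            // \b only makes sense next to a word character.
            if (abbreviation.matches("\\w.*")) {
                regex = "\\b" + regex;
            }
            if (abbreviation.matches(".*\\w")) {
                regex = regex + "\\b";
            }
            // (?i) ignores case; $1 keeps the case found in the input.
            text = text.replaceAll("(?i)(" + regex + ")", "<$1>");
        }
        return text;
    }

    public static void main(String[] args) {
        List<String> abbreviations = List.of("lol", ":)", "iirc", "4", "u", "ttfn");
        System.out.println(addTags(
            "How are u today? Iirc, this is your first free day. Hope you are having fun! :)",
            abbreviations));
    }
}
```

Because each abbreviation is applied in its own pass, an abbreviation that occurs inside another one would be tagged twice; a single combined pattern avoids that.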
There are several ways to do this. Either you use regular expressions, or you do things the old-fashioned way by parsing word-by-word. Others have pointed out problems with your current code, due to using strings that contain regular expression metacharacters. In particular,
String doesNotWork = "I am :)".replaceAll(":)", "happy"); // invalid regex
This can be solved by quoting the string with Pattern.quote, so that metacharacters are treated as literals. For ":)" it returns the string that would be written as "\\Q:)\\E": \Q and \E delimit a whole quoted substring, as opposed to \, which quotes only the next character, and only if that character is non-alphabetic (a backslash before a letter is otherwise used for a host of regex classes):
String worksAsExpected = "I am :)".replaceAll(Pattern.quote(":)"), "happy");
An efficient way to process the text is to do a single pass. This can be achieved by combining the quoted literals with |:
String regex = Stream.of("lol iirc 4".split(" "))
.map(s -> Pattern.quote(s)) // quotes each emoticon
.collect(Collectors.joining("|")); // joins with |
Matcher m = Pattern.compile(regex).matcher(input);
This yields surprisingly compact code, with nothing hardcoded. Finished code:
import java.util.regex.*;
import java.util.stream.*;
public class T {
public static String mark(
String[] needles, String startMark, String endMark, String input) {
String regex = Stream.of(needles)
.map(s -> s.matches("\\p{Alpha}+") ? // quotes each
"\\b" + Pattern.quote(s) + "\\b" : // to avoid yo<u>r
Pattern.quote(s)) // to handle emoticons
.collect(Collectors.joining("|")); // joins with |
Matcher m = Pattern.compile(regex).matcher(input);
StringBuffer output = new StringBuffer();
while (m.find()) {
m.appendReplacement(output, Matcher.quoteReplacement(startMark + m.group() + endMark)); // keep $ and \ literal
}
m.appendTail(output);
return output.toString();
}
public static void main(String ... args) {
System.out.println(mark(
"lol iirc 4 u ttfn :)".split(" "), // abbreviations
"<", ">", // markers to mark them with
"How are u today? iirc, this is your first free day. "
+ "Hope you are having fun! :)"));
}
}
I used @Arvind's trick of placing word-boundary metacharacters (\\b) only around alphabetical needles. This prevents the u inside other words from being marked, but may yield strange results for 4: any number containing a 4 will have that digit marked. Ultimately, natural language processing is hard. Regular expressions are great for very regular inputs.

replacing the carriage return with white space in java

I am having the below string in a string variable in java.
rule "6"
no-loop true
when
then
String prefix = null;
prefix = "900";
String style = null;
style = "490";
String grade = null;
grade = "GL";
double basePrice = 0.0;
basePrice = 837.00;
String ruleName = null;
ruleName = "SIVM_BASE_PRICE_006
Rahul Kumar Singh";
ProductConfigurationCreator.createFact(drools, prefix, style,grade,baseprice,rulename);
end
rule "5"
no-loop true
when
then
String prefix = null;
prefix = "800";
String style = null;
style = "481";
String grade = null;
grade = "FL";
double basePrice = 0.0;
basePrice = 882.00;
String ruleName = null;
ruleName = "SIVM_BASE_PRICE_005";
ProductConfigurationCreator.createFact(drools, prefix, style,grade,baseprice,rulename);
end
I need to replace the carriage return between the "THEN" and "END" keywords with white space so that it becomes like the code below:
rule "6"
no-loop true
when
then
String prefix = null;
prefix = "900";
String style = null;
style = "490";
String grade = null;
grade = "GL";
double basePrice = 0.0;
basePrice = 837.00;
String ruleName = null;
ruleName = "SIVM_BASE_PRICE_006 Rahul Kumar Singh";
ProductConfigurationCreator.createFact(drools, prefix, style,grade,baseprice,rulename);
end
rule "5"
no-loop true
when
then
String prefix = null;
prefix = "800";
String style = null;
style = "481";
String grade = null;
grade = "FL";
double basePrice = 0.0;
basePrice = 882.00;
String ruleName = null;
ruleName = "SIVM_BASE_PRICE_005";
ProductConfigurationCreator.createFact(drools, prefix, style,grade,baseprice,rulename);
end
Of the two strings above, the second is in the correct format that I need. However, in the first one, I am getting this:
ruleName = "SIVM_BASE_PRICE_006
Rahul Kumar Singh";
In particular, it needs to look like this:
ruleName = "SIVM_BASE_PRICE_006 Rahul Kumar Singh";
and I also need to ensure that this doesn't affect anything else in the string.
Thus I need to replace this "carriage return" with a white space and put it all on one line. This is my requirement. I tried the replace and replaceAll methods of String, but they did not work properly.
Problem:
I need to look between the strings "then" and "end", and whenever
there is a carriage return between two double quotes "" "", I
need to replace this carriage return with white space and put it on
one line.
Thanks
EDIT:
DRT:
template header
Prefix
Style
Product
package com.xx
import com.xx.drools.ProductConfigurationCreator;
template "ProductSetUp"
rule "Product_#{row.rowNumber}"
no-loop true
when
then
String prefix = null;
prefix = "#{Prefix}";
String style = null;
prefix = "#{Style}";
String product = null;
product = "#{Product}";
ProductConfigurationCreator.createProductFact(drools,prefix,style,product);
end
end template
The Excel sheet and DRT are for demonstration purposes only.
In the image, in the Product column, there is "SOFAS \rkumar shorav". This is what creates the problem. It generates the following:
product = "SOFAS
kumar shorav";
I need this like below:
product = "SOFAS kumar shorav";
Then Excel data :
attached image.
Instead of regex I would probably write my own formatter which will
check if cursor is inside quote
replace each \r with space
replace each \n with space, unless it was placed right after \r which means that space was already placed for that \r
write rest of characters without change.
The only possible problem is that this formatter does not care where in the text a string is placed, so if you want to format only a specific part of the text you will need to pass in only that part.
Code implementing such formatter can look like:
public static String format(String text){
StringBuilder sb = new StringBuilder();
boolean insideQuote = false;
char previous = '\0';//to track `\r\n`
for (char ch : text.toCharArray()) {
if (insideQuote && (ch == '\r' || ch == '\n')) {
if (ch == '\r' || previous != '\r') {
sb.append(" ");//replace `\r` or a lone `\n` with space
}
//a `\n` right after `\r` is dropped: its space was already written
}else {
if (ch == '"') {
insideQuote = !insideQuote;
}
sb.append(ch); //write other characters without change
}
previous = ch;
}
return sb.toString();
}
helper utility method
public static String format(File file, String encoding) throws IOException {
String text = new String(Files.readAllBytes(file.toPath()), encoding);
return format(text);
}
Usage:
String formatted = format(new File("input.txt"), "utf-8");
System.out.println(formatted);
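The rules listed above can be tried on a string directly, without a file. The following is a self-contained sketch with a made-up class name; in this version a \n that directly follows a \r inside quotes is dropped, so a Windows line break produces exactly one space.

```java
class QuotedLineBreakDemo {
    // Replaces line breaks inside double quotes with a single space and
    // copies everything else unchanged. Escaped quotes are not handled.
    static String format(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        boolean insideQuote = false;
        char previous = '\0'; // used to recognise the pair \r\n
        for (char ch : text.toCharArray()) {
            if (insideQuote && (ch == '\r' || ch == '\n')) {
                // \r or a lone \n becomes a space; \n right after \r is skipped
                if (ch == '\r' || previous != '\r') {
                    sb.append(' ');
                }
            } else {
                if (ch == '"') {
                    insideQuote = !insideQuote;
                }
                sb.append(ch);
            }
            previous = ch;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "ruleName = \"SIVM_BASE_PRICE_006\r\nRahul Kumar Singh\";\r\nend";
        System.out.println(format(input));
    }
}
```

Line breaks outside quotes, such as the one before end, are left as they are.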
You might say that there is a bug in org.drools.template.parser.StringCell, method
public void addValue(Map<String, Object> vars) {
vars.put(column.getName(), value);
}
Here, the value is added to the Map as a String but this does not take into account that string values are usually expanded into string literals. Therefore, an embedded newline should be converted to the escape sequence \n. You might try this patch:
public void addValue(Map<String, Object> vars) {
String h = value.replaceAll( "\n", "\\\\n" );
vars.put(column.getName(), h);
}
Take the source file, put it into a suitable subdirectory, compile it to a class file and make sure that the root directory precedes drools-templates-6.2.0.Final-sources.jar in the class path. You should then see
ruleName = "SIVM_BASE_PRICE_006\nRahul Kumar Singh";
in the generated DRL file. Obviously, this is not a space, but it is what is written in the spreadsheet cell!
I suggest (urgently) that you do not follow this approach. The reason is simply that strings are not always expanded between quotes, and then the replacement would almost certainly result in invalid code. There is simply no remedy, as the template compiler is "dumb" and does not really "know" what it is expanding.
If a String in a spreadsheet contains a line break, template expansion must render this faithfully, and break the line just there. If this produces invalid (Java) code: why was the line break entered in the first place? There is absolutely no reason not to have a space in that cell if that's what you want.
s = s.replaceAll("(?m)^([^\"]*(\"[^\"]*\")*[^\"]*\"[^\"]*)\r?\n\\s*", "$1 ");
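This replaces the line break that follows an unpaired (still open) quote, together with any whitespace after it, with a single space.
(?m)^ means starting at the beginning of a line
[^\"] means any character that is not a quote
\r?\n matches both CR+LF (Windows) and LF (other systems) line endings
The captured group $1 consists of:
not-quotes,
any number of repetitions of " not-quotes ",
not-quotes, quote, not-quotes,
and it must be followed by the line break.
Note that this does not handle escaped quotes (backslash + quote) or escaped backslashes. Also, because matching resumes inside the literal that was just joined, where the closing quote looks unpaired, one further line break after that literal can be replaced as well.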
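A different way to use a regex here, shown as a sketch under the same assumption that quotes are never escaped: match each complete quoted literal and replace line breaks only inside the match, so text outside the quotes is never touched. The class and method names are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedLiteralJoiner {
    // A double-quoted run without embedded quotes (escaped quotes not handled).
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    static String joinQuotedLines(String text) {
        Matcher m = QUOTED.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Replace CR+LF, a lone CR, or a lone LF inside the literal with one space.
            String joined = m.group().replaceAll("\\r\\n|\\r|\\n", " ");
            // quoteReplacement keeps any $ or \ in the literal as plain text.
            m.appendReplacement(out, Matcher.quoteReplacement(joined));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "ruleName = \"SIVM_BASE_PRICE_006\nRahul Kumar Singh\";\n"
            + "ProductConfigurationCreator.createFact(drools, prefix, style,grade,baseprice,rulename);\nend";
        System.out.println(joinQuotedLines(input));
    }
}
```

If the text contains an odd number of quotes, the pairing is off from that point on, so this only suits input where every literal is closed.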
Use the "multi line" flag:
str = str.replaceAll("(?m)^\\s+", "");
The multi-line flag (?m) makes ^ and $ match start/end of each line (rather than start/end of input). \s+ means "one or more whitespace characters".
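A short sketch of what this call does (hypothetical class name). Since \s also matches line breaks, a blank line followed by an indented line is removed along with the indentation.

```java
class LeadingWhitespaceDemo {
    // Removes whitespace at the start of every line.
    static String stripLeadingWhitespace(String str) {
        return str.replaceAll("(?m)^\\s+", "");
    }

    public static void main(String[] args) {
        // Indentation on both lines is removed.
        System.out.println(stripLeadingWhitespace("  when\n\tthen"));
        // The blank line disappears too, because \s+ consumes the line break.
        System.out.println(stripLeadingWhitespace("when\n\n  then"));
    }
}
```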

Number format issue, newbie project

I am trying to write a program that loads a movie data base file, and then splits up that information into the movie title, year, and all of the associated actors. I split up all of the info, but I am having issues converting the year, which is in a string, to an int. The format of the year string is (****) with the * being a year, such as 1999. When I try to use parse I get a number format exception. I have tried replacing the parentheses, but it just gave me more errors! Any ideas?
public class MovieDatabase {
ArrayList<Movie> allMovie = new ArrayList<Movie>();
//Loading the text file and breaking it apart into sections
public void loadDataFromFile( String aFileName) throws FileNotFoundException{
Scanner theScanner = new Scanner(aFileName);
theScanner = new Scanner(new FileInputStream("cast-mpaa.txt"));
while(theScanner.hasNextLine()){
String line = theScanner.nextLine();
String[] splitting = line.split("/" );
String movieTitleAndYear = splitting[0];
int movieYearIndex = movieTitleAndYear.indexOf("(");
String movieYear = movieTitleAndYear.substring(movieYearIndex);
System.out.println(movieYear);
//this is where I have issues
int theYear = Integer.parseInt(movieYear);
String movieTitle = movieTitleAndYear.substring(0, movieYearIndex);
ArrayList<Actor> allActors = new ArrayList<Actor>();
for ( int i = 1; i < splitting.length; i++){
String[] names = splitting[i].split(",");
String firstName = names[0];
Actor theActor = new Actor(firstName);
ArrayList<Actor> allActor = new ArrayList<Actor>();
allActor.add(theActor);
}
Movie theMovie = new Movie(movieTitle, theYear, allActors);
allMovie.add(theMovie);
}
theScanner.close();
}
output:
(1967)
Here are the errors I am getting:
Exception in thread "main" java.lang.NumberFormatException: For input string: "(1967)"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:481)
at java.lang.Integer.parseInt(Integer.java:527)
at MovieDatabase.loadDataFromFile(MovieDatabase.java:27)
You have brackets around the numbers. You could either correct your file or you could remove brackets using:
String str = "(1967)";
System.out.println(str.substring(1, str.length()-1));
Output:
1967
In your code, you used:
int movieYearIndex = movieTitleAndYear.indexOf("(");
String movieYear = movieTitleAndYear.substring(movieYearIndex);
So if my movieTitleAndYear string is "hi (1947)", indexOf will give me index of "(" as 3 and substring will start reading string from index 3 which includes "(". One way you could avoid opening bracket is to change your substring line to:
String movieYear = movieTitleAndYear.substring(movieYearIndex + 1);//but still you have closing bracket.
If you are sure it's always going to be four digits, then you could do something like:
String movieYear = movieTitleAndYear.substring(movieYearIndex + 1, movieYearIndex + 5);
You need to add indexOf for ")".
Code snippet:
int movieYearOpenBracesIndex = movieTitleAndYear.indexOf("(");
int movieYearCloseBracesIndex = movieTitleAndYear.indexOf(")");
String movieYear = movieTitleAndYear.substring(movieYearOpenBracesIndex + 1, movieYearCloseBracesIndex);
System.out.println(movieYear);
This will give the exact year. e.g. 1967
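Put together as a helper, this could look like the sketch below. The class and method names are made up, and using lastIndexOf for the opening bracket is an assumption that the year is the last parenthesized part, in case a title itself contains brackets.

```java
class MovieYearParser {
    // Extracts the number between the last "(" and the following ")".
    static int parseYear(String titleAndYear) {
        int open = titleAndYear.lastIndexOf('(');
        int close = titleAndYear.indexOf(')', open + 1);
        if (open < 0 || close < 0) {
            throw new IllegalArgumentException("No (year) found in: " + titleAndYear);
        }
        // trim() guards against spaces inside the brackets
        return Integer.parseInt(titleAndYear.substring(open + 1, close).trim());
    }

    public static void main(String[] args) {
        System.out.println(parseYear("hi (1947)"));
    }
}
```

Integer.parseInt still throws NumberFormatException if the brackets contain something other than digits, which may be worth catching per line when reading a large file.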
Your substring call currently gets a year enclosed by brackets, e.g., (1967). You can avoid this by calling the substring variant that accepts an endIndex, and just get the year's four digits:
String movieYear =
movieTitleAndYear.substring(movieYearIndex + 1, // to get rid of "("
movieYearIndex + 5 // to get rid of ")"
);

How to obtain the last path segment of a URI

I have as input a string that is a URI. How can I get the last path segment (which in my case is an id)?
This is my input URL:
String uri = "http://base_path/some_segment/id";
and I have to obtain the id. I have tried this:
String strId = "http://base_path/some_segment/id";
strId = strId.replace(path);
strId = strId.replaceAll("/", "");
Integer id = new Integer(strId);
return id.intValue();
but it doesn't work, and surely there must be a better way to do it.
Is this what you are looking for?
URI uri = new URI("http://example.com/foo/bar/42?param=true");
String path = uri.getPath();
String idStr = path.substring(path.lastIndexOf('/') + 1);
int id = Integer.parseInt(idStr);
alternatively
URI uri = new URI("http://example.com/foo/bar/42?param=true");
String[] segments = uri.getPath().split("/");
String idStr = segments[segments.length-1];
int id = Integer.parseInt(idStr);
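Both variants return an empty string, or fail, when the path ends with a slash. A sketch of a helper that tolerates trailing slashes (the class and method names are illustrative, not part of any library):

```java
import java.net.URI;

class LastPathSegment {
    // Returns the last non-empty path segment, or "" if the path has none.
    static String lastPathSegment(String uri) {
        String path = URI.create(uri).getPath(); // the query string is not part of the path
        if (path == null || path.isEmpty()) {
            return "";
        }
        int end = path.length();
        while (end > 0 && path.charAt(end - 1) == '/') {
            end--; // skip trailing slashes
        }
        int start = path.lastIndexOf('/', end - 1) + 1;
        return path.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(lastPathSegment("http://example.com/foo/bar/42?param=true"));
        System.out.println(lastPathSegment("http://example.com/bar/"));
    }
}
```

The result can be passed to Integer.parseInt when the segment is known to be a numeric id.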
import android.net.Uri;
Uri uri = Uri.parse("http://example.com/foo/bar/42?param=true");
String token = uri.getLastPathSegment();
Here's a short method to do it:
public static String getLastBitFromUrl(final String url){
// return url.replaceFirst("[^?]*/(.*?)(?:\\?.*)","$1);" <-- incorrect
return url.replaceFirst(".*/([^/?]+).*", "$1");
}
Test Code:
public static void main(final String[] args){
System.out.println(getLastBitFromUrl(
"http://example.com/foo/bar/42?param=true"));
System.out.println(getLastBitFromUrl("http://example.com/foo"));
System.out.println(getLastBitFromUrl("http://example.com/bar/"));
}
Output:
42
foo
bar
Explanation:
.*/ // find anything up to the last / character
([^/?]+) // find (and capture) all following characters up to the next / or ?
// the + makes sure that at least 1 character is matched
.* // find all following characters
$1 // this variable references the first (and only) capturing group from above
// I.e. the entire string is replaced with just the portion
// captured by the parentheses above
I know this is old, but the solutions here seem rather verbose. Just an easily readable one-liner if you have a URL or URI:
String filename = new File(url.getPath()).getName();
Or if you have a String:
String filename = new File(new URL(url).getPath()).getName();
If you are using Java 8 and you want the last segment in a file path you can do.
Path path = Paths.get("example/path/to/file");
String lastSegment = path.getFileName().toString();
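If you have a url such as http://base_path/some_segment/id you can do the following. Note that this treats the URL as a plain file path, which works on Unix-like systems but throws an InvalidPathException on Windows because of the colon.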
final Path urlPath = Paths.get("http://base_path/some_segment/id");
final Path lastSegment = urlPath.getName(urlPath.getNameCount() - 1);
In Android
Android has a built in class for managing URIs.
Uri uri = Uri.parse("http://base_path/some_segment/id");
String lastPathSegment = uri.getLastPathSegment();
If you have commons-io included in your project, you can do it without creating unnecessary objects with org.apache.commons.io.FilenameUtils
String uri = "http://base_path/some_segment/id";
String fileName = FilenameUtils.getName(uri);
System.out.println(fileName);
Will give you the last part of the path, which is the id
In Java 7+ a few of the previous answers can be combined to allow retrieval of any path segment from a URI, rather than just the last segment. We can convert the URI to a java.nio.file.Path object, to take advantage of its getName(int) method.
Unfortunately, the static factory Paths.get(uri) is not built to handle the http scheme, so we first need to separate the scheme from the URI's path.
URI uri = URI.create("http://base_path/some_segment/id");
Path path = Paths.get(uri.getPath());
String last = path.getFileName().toString();
String secondToLast = path.getName(path.getNameCount() - 2).toString();
To get the last segment in one line of code, simply nest the lines above.
Paths.get(URI.create("http://base_path/some_segment/id").getPath()).getFileName().toString()
To get the second-to-last segment while avoiding index numbers and the potential for off-by-one errors, use the getParent() method.
String secondToLast = path.getParent().getFileName().toString();
Note the getParent() method can be called repeatedly to retrieve segments in reverse order. In this example, the path only contains two segments, otherwise calling getParent().getParent() would retrieve the third-to-last segment.
You can also use replaceAll:
String uri = "http://base_path/some_segment/id";
String lastSegment = uri.replaceAll(".*/", "");
System.out.println(lastSegment);
result:
id
You can use getPathSegments() function. (Android Documentation)
Consider your example URI:
Uri uri = Uri.parse("http://base_path/some_segment/id");
You can get the last segment using:
List<String> pathSegments = uri.getPathSegments();
String lastSegment = pathSegments.get(pathSegments.size() - 1);
lastSegment will be id.
I'm using the following in a utility class:
public static String lastNUriPathPartsOf(final String uri, final int n, final String... ellipsis)
throws URISyntaxException {
return lastNUriPathPartsOf(new URI(uri), n, ellipsis);
}
public static String lastNUriPathPartsOf(final URI uri, final int n, final String... ellipsis) {
return uri.toString().contains("/")
? (ellipsis.length == 0 ? "..." : ellipsis[0])
+ uri.toString().substring(StringUtils.lastOrdinalIndexOf(uri.toString(), "/", n))
: uri.toString();
}
In Dart, you can get the list of path segments from the Uri class:
String id = Uri.tryParse("http://base_path/some_segment/id")?.pathSegments.last ?? "InValid URL";
It returns id if the URL is valid; if it is invalid, it returns "InValid URL".
Convert the URI to a URL and call getFile() if you would rather not extract the file name with substring.
