split an already split array string in Java - java
I have a CSV file that has been read and split into 3 different CSV files. The original file was pipe-separated, and the split value is saved in a string variable. I want to split that string again on commas, but as soon as I do that, it throws an exception.
try(BufferedReader br1 = new BufferedReader(new FileReader(newcsvCategory))){
String line;
while ((line = br1.readLine()) != null) {
String[] value1 = line.split("\\|,",-1);
String Id = value1[0];
String CatId=value1[1];
["Active Catalog Detail (Network Id "|" Category Ids "]
["209"|"4900,10368,11093,11581,10082,10206,10431,10119,11622,10358,11094,2,10342,5193,10738,11744,10039,10840,5132,10011,11132,5233,10792"]
["174"|"4900,10082,10119,10358,10039,5132,10011"]
["200"|"4900,10368,11093,11581,10082,10206,10431,10119,11622,10358,11094,2,5193,10738,11623,10039,10840,5132,10011,11132,5233,10792"]
["181"|"4900,10358,10011"]
["240"|"4900,10368,11093,11581,10082,10206,10431,10119,11622,10358,11094,2,10342,5193,10738,11744,10039,10840,5132,10011,11132,5233,10792"]
["206"|"4900,10368,11093,11581,10082,10206,10431,10119,11622,10358,11094,2,5193,10738,11623,10039,10840,5132,10011,11132,5233,10792"]
["255"|"4900,10368,11093,11581,10082,10206,11621,10431,10119,11622,10358,11094,2,10342,5193,10738,11744,10039,10840,5132,10011,11132,5233,10792"]
["251"|"4900,10368,11093,11581,10082,10206,11621,10431,10119,11622,10358,11094,2,10342,5193,10738,11744,10039,10840,5132,10011,11132,5233,10792"]
["231"|"4900,10368,11093,11581,10082,10206,10431,10119,11622,10358,11094,2,10342,5193,10738,11744,10039,10840,5132,10011,11132,5233,10792"]
["179"|"4900,10368,11618,11093,11581,10082,10206,10431,10119,11622,10358,11094,2,5193,10738,11623,10039,10840,5132,10011,11132,5233,10792"]
["184"|"4900,10368,11093,11581,10082,10206,10431,10119,11622,10358,11094,2,5193,10738,11623,10039,10840,5132,10011,11132,5233,10792"]
["187"|"4900,10368,11093,11581,10082,10206,10431,10119,11622,10358,11094,2,5193,10738,11623,10039,10840,5132,10011,11132,5233,10792"]
["247"|"4900,10368,11093,11581,10082,10206,11621,10431,10119,11622,10358,11094,2,10342,5193,10738,11744,10039,10840,5132,10011,11132,5233,10792"]
["215"|"10358"]
["216"|"4900,10368,11093,11581,10082,10206,10431,10119,11622,10358,11094,2,10342,5193,10738,11744,10039,10840,5132,10011,11132,5233,10792"]
["238"|"4900,10368,11093,11581,10082,10206,10431,10119,11622,10358,11094,2,10342,5193,10738,11744,10039,10840,5132,10011,11132,5233,10792"]
["224"|"4900,10368,11093,11581,10082,10206,10431,10119,11622,10358,11094,2,10342,5193,10738,11744,10039,10840,5132,10011,11132,5233,10792"]
I want to split each line into the first and second columns on the pipe, and then further split the second column on commas.
I'd appreciate any help as I'm a newbie.
Added: the code that splits CatId:
String[] temp = CatId.split(",",-1);
System.out.println(temp[1]);
I don't fully understand the question, but here are some notes.
// the source string: several columns with different separators
String str = "209|4900,10368,11093,11581";
Judging by your code, you are trying to put all the separate numbers into a string array in two steps:
String[] arr = str.split("\\|"); // not line.split("\\|,",-1)
// arr[0] = 209
// arr[1] = 4900,10368,11093,11581
String[] tmp = arr[1].split(",");
// tmp[0] = 4900
// tmp[1] = 10368
// tmp[2] = 11093
// tmp[3] = 11581
If so, you can do it with one step:
String[] arr = str.split("[\\|,]");
// arr[0] = 209
// arr[1] = 4900
// arr[2] = 10368
// arr[3] = 11093
// arr[4] = 11581
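For reference, the exception in the question most likely comes from the pattern itself: "\\|," matches a pipe immediately followed by a comma, which never occurs in the sample lines, so split returns a one-element array and value1[1] throws ArrayIndexOutOfBoundsException. A minimal sketch comparing the three patterns (the class and method names are made up for illustration):

```java
import java.util.Arrays;

class SplitPatternDemo {
    // Pattern from the question: the literal two-character sequence "|,".
    static String[] splitOnPipeThenComma(String s) {
        return s.split("\\|,", -1);
    }

    // Splits on a single pipe only.
    static String[] splitOnPipe(String s) {
        return s.split("\\|", -1);
    }

    // Character class: splits on either a pipe or a comma.
    static String[] splitOnPipeOrComma(String s) {
        return s.split("[|,]", -1);
    }

    public static void main(String[] args) {
        String str = "209|4900,10368,11093,11581";
        System.out.println(Arrays.toString(splitOnPipeThenComma(str)));
        System.out.println(Arrays.toString(splitOnPipe(str)));
        System.out.println(Arrays.toString(splitOnPipeOrComma(str)));
    }
}
```

Note that the one-step version loses the distinction between the id and its category ids, so the two-step version is probably the better fit when the first column has to stay separate.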
Set the limit argument of .split(..) to 2, so that the line is split only at the first pipe.
while ((line = br1.readLine()) != null) {
String[] value1 = line.split("\\|",2);
String Id = value1[0];
String CatId=value1[1];
}
To further split the content of "CatId", use:
// if you need to replace unwanted chars first, you could just use the simple .replace:
CatId = CatId.replace("\"", "").replace("[", "").replace("]", "");
// Then, split the array just by ,
String[] catIdArray = CatId.split("\\,");
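Putting the steps above together, a line from the sample data could be parsed roughly as follows. This is only a sketch: it assumes every data line looks like ["209"|"4900,10368"], and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CatalogLineParser {
    // Removes the [ ] and " characters that wrap the values in the sample data.
    static String stripDecorations(String line) {
        return line.replace("[", "").replace("]", "").replace("\"", "").trim();
    }

    // Returns the network id, i.e. the text before the first pipe.
    static String parseId(String line) {
        return stripDecorations(line).split("\\|", 2)[0].trim();
    }

    // Returns the category ids; an empty list if the line has no pipe.
    static List<String> parseCategoryIds(String line) {
        String[] parts = stripDecorations(line).split("\\|", 2);
        List<String> ids = new ArrayList<>();
        if (parts.length < 2) {
            return ids; // guard: avoids ArrayIndexOutOfBoundsException on parts[1]
        }
        for (String id : parts[1].split(",")) {
            String trimmed = id.trim();
            if (!trimmed.isEmpty()) {
                ids.add(trimmed);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        String line = "[\"181\"|\"4900,10358,10011\"]";
        System.out.println(parseId(line) + " -> " + parseCategoryIds(line));
    }
}
```

The header line of the file would come out as text rather than numbers, so it should be skipped before converting any of the ids to integers.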
Related
Java Split String and Combine
I would like to split a string and combine them. String value = "1,A 2,B 3,C" outputs [1,2 A,B] [1,3 A,C] [2,3 B,C] If I do String[] tokens = value.split("[,\\s]+"); tokens[0] = "1" tokens[1] = "A" tokens[2] = "2" tokens[3] = "B" and so on. But then how can I combine it that becomes the output? Thank you.
You can split and combine it by doing this:
String a = value.charAt(0)+","+value.charAt(4)+" "+value.charAt(2)+","+value.charAt(6);
String b = value.charAt(0)+","+value.charAt(8)+" "+value.charAt(2)+","+value.charAt(10);
String c = value.charAt(4)+","+value.charAt(8)+" "+value.charAt(6)+","+value.charAt(10);
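The charAt approach only works for single-character tokens at fixed positions. A more general sketch, assuming every whitespace-separated item has the form number,letter with exactly one comma (the class and method names are invented for this example), pairs each item with every later item:

```java
import java.util.ArrayList;
import java.util.List;

class PairCombiner {
    // Builds "n1,n2 l1,l2" for every pair of items, in input order.
    static List<String> combinePairs(String value) {
        String[] items = value.trim().split("\\s+"); // e.g. "1,A", "2,B", "3,C"
        List<String> result = new ArrayList<>();
        for (int i = 0; i < items.length; i++) {
            String[] first = items[i].split(",", 2);
            for (int j = i + 1; j < items.length; j++) {
                String[] second = items[j].split(",", 2);
                // Assumes both items contain a comma; otherwise [1] is out of bounds.
                result.add(first[0] + "," + second[0] + " " + first[1] + "," + second[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String combined : combinePairs("1,A 2,B 3,C")) {
            System.out.println("[" + combined + "]");
        }
    }
}
```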
Getting the last word of a line passed to a mapper in hadoop
If I have a dataset with lines like this:
199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245
and I am running a map reduce job with hadoop, how can I get the last element in each line? I have tried all the obvious answers, such as
String lastWord = test.substring(test.lastIndexOf(" ")+1);
but this gives me the - character. I have tried splitting it on a space and getting the last element, but the last character is still a -. Can I not expect that the data will be delivered to me line by line? In other words, can I not expect a file in the form a b c d \n e f g h\n to be delivered line by line? And does anyone have any tips on how to get the last word in this line? This is a snippet from my map function, where I try to get the data:
public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
    String test = value.toString();
    StringTokenizer tokenizer = new StringTokenizer(test);
    //String lastWord = test.substring(test.lastIndexOf(" ")+1); <--first try
    //String [] array = test.split(" ");//<--second try
    //one.set(Integer.valueOf(array[8]));
    int i = 0;
    String candidate = null;
    while (tokenizer.hasMoreTokens()) {
        candidate = tokenizer.nextToken();
        if (i == 3) { //this works to get the date field
            String wholeDate = candidate;
            String[] dateArray = wholeDate.split(":");
            String date = dateArray[0].substring(1); // get rid of '['
            String hour = dateArray[1];
            word.set(date + " " + hour);
        } else if (i == 7) { // <-- third try
            String replySizeString = candidate;
            one.set(Integer.valueOf(replySizeString));
        }
    }
    i++;
Instead of using a StringTokenizer you could just use the String[] String.split(String regex) method to return an array of Strings for each line. Then, assuming that each line of your data has the same number of fields, separated by spaces, you can just look at that array element.
String line = value.toString();
String[] lineArray = line.split(" ");
String lastWord = lineArray[9];
Or, if you know that you always want the last token, you could see how long the array is and then just grab the last element.
String lastWord = lineArray[lineArray.length - 1];
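A slightly more defensive sketch of the same idea, in plain Java without any Hadoop types. It trims the line and splits on runs of whitespace, so trailing or repeated spaces do not produce empty tokens. One possible explanation for the - the asker saw (this is a guess, not something the question confirms) is that the size field of some log lines is itself -, so the hypothetical parseSize helper below maps that to 0 rather than letting Integer.parseInt throw.

```java
class LastField {
    // Returns the last whitespace-separated token of the line ("" for a blank line).
    static String lastField(String line) {
        String[] fields = line.trim().split("\\s+");
        return fields[fields.length - 1];
    }

    // Assumption: a "-" size means "no bytes" and is treated as 0.
    static int parseSize(String field) {
        return field.equals("-") ? 0 : Integer.parseInt(field);
    }

    public static void main(String[] args) {
        String line = "199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "
                + "\"GET /history/apollo/ HTTP/1.0\" 200 6245";
        System.out.println(parseSize(lastField(line)));
    }
}
```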
Android: split a string considering 2 separating characters
I have a string containing messages. The string looks like this:
bill:hello;tom:hi;bill:how are you?;tommy:hello!; ...
I need to split the string into several strings, on the characters : and ;. For now, I have split the string on ; and I can add the results to list elements.
List<Message> listMessages = new ArrayList<Message>();
StringTokenizer tokenizer = new StringTokenizer(messages, ";");
String result = null;
String uname = "";
String umess = "";
while (tokenizer.hasMoreTokens()) {
    result = tokenizer.nextToken();
    listMessages.add(new Message(result, ""));
}
I still have to do this on the : to have the two resulting strings in my list element, and I tried something like this:
List<Message> listMessages = new ArrayList<Message>();
StringTokenizer tokenizer = new StringTokenizer(messages, ";");
String result = null;
String uname = "";
String umess = "";
while (tokenizer.hasMoreTokens()) {
    result = tokenizer.nextToken().split(":");
    uname = result[0];
    umess = result[1];
    listMessages.add(new Message(result[0], result[1]));
}
But I got this error, which I don't understand:
01-23 17:12:19.168: E/AndroidRuntime(711): java.lang.RuntimeException: Unable to start activity ComponentInfo{com.example.appandroid/com.example.appandroid.ListActivity}: java.lang.ArrayIndexOutOfBoundsException: length=1; index=1
Thanks in advance for looking at my problem.
Instead of using StringTokenizer, you can use String.split(regex) to split on two delimiters, like below:
String test="this: bill:hello;tom:hi;bill:how are you?;tommy:hello!;";
String[] arr = test.split("[:;]");
for(String s: arr){
    System.out.println(s);
}
Output:
this
 bill
hello
tom
hi
bill
how are you?
tommy
hello!
EDIT: from @njzk2's comments: if you just want to use StringTokenizer, you can use its overloaded constructor that takes 2 args.
StringTokenizer str = new StringTokenizer(test, ":;");
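Splitting on both delimiters at once flattens names and messages into one array. If the name/message pairing matters, a two-level split may be easier to work with. The sketch below is an assumption-laden illustration: it returns plain String[] pairs instead of the asker's Message class, and it skips any entry without a colon, which is the kind of entry that would produce the length=1; index=1 error from the question.

```java
import java.util.ArrayList;
import java.util.List;

class MessageSplitter {
    // Returns {name, text} pairs; entries without a ':' are skipped.
    static List<String[]> parse(String messages) {
        List<String[]> result = new ArrayList<>();
        for (String entry : messages.split(";")) {
            // Limit 2 keeps any further ':' inside the message text.
            String[] parts = entry.split(":", 2);
            if (parts.length < 2) {
                continue; // no ':' in this entry, so parts[1] does not exist
            }
            result.add(new String[] { parts[0].trim(), parts[1].trim() });
        }
        return result;
    }

    public static void main(String[] args) {
        String messages = "bill:hello;tom:hi;bill:how are you?;tommy:hello!;";
        for (String[] pair : parse(messages)) {
            System.out.println(pair[0] + " says: " + pair[1]);
        }
    }
}
```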
Reading Strings from lines in Java
I have a txt file formatted like:
Name 'Paul' 9-years old
How can I get from a "readline":
String the_name="Paul"
and
int the_age=9
in Java, discarding all the rest? I have:
...
BufferedReader bufferedReader = new BufferedReader(fileReader);
StringBuffer stringBuffer = new StringBuffer();
String line;
while ((line = bufferedReader.readLine()) != null) {
    //put the name value in the_name
    //put age value in the_age
}
...
Please suggest, thanks.
As you're using BufferedReader and everything is on one line, you would have to split it to extract the data. Some additional formatting is then required to remove the quotes and extract the year part of the age. No need for any fancy regex:
String[] strings = line.split(" ");
if (strings.length >= 3) {
    String the_name= strings[1].replace("'", "");
    String the_age = strings[2].substring(0, strings[2].indexOf("-"));
}
I notice you have this functionality in a while loop. For this to work, make sure that every line keeps the format:
text 'Name' digit-any other text
Important chars are:
Spaces: min of 3 tokens needed for split array
Single quotes
- Hyphen character
Use java.util.regex.Pattern:
Pattern pattern = Pattern.compile("Name '(.*)' (\\d*)-years old");
for (String line : lines) {
    Matcher matcher = pattern.matcher(line);
    if (matcher.matches()) {
        String theName = matcher.group(1);
        int theAge = Integer.parseInt(matcher.group(2));
    }
}
You can use the String.substring, String.indexOf, String.lastIndexOf, and Integer.parseInt methods as follows:
String line = "Name 'Paul' 9-years old";
String theName = line.substring(line.indexOf("'") + 1, line.lastIndexOf("'"));
String ageStr = line.substring(line.lastIndexOf("' ") + 2, line.indexOf("-years"));
int theAge = Integer.parseInt(ageStr);
System.out.println(theName + " " + theAge);
Output:
Paul 9
Extracting values from file
I've got around 10 lines of data in a text file containing the following:
X-Value = -0.525108, Y-Value = 7.746691, Z-Value = 5.863008, Timestamp(milliseconds) = 23001
X-Value = -0.755030, Y-Value = 7.861651, Z-Value = 6.016289, Timestamp(milliseconds) = 23208
The code I have right now uses a BufferedReader to read every line of the file, but what I really want to do is extract the X-Value, Y-Value, Z-Value and Timestamp(milliseconds) values from each line. Could this be done using simple String methods such as substring, or would regular expressions suit this better?
You can first split the strings by ,s, then split each part by =, then trim leading/trailing spaces as necessary. You can use String.split() or java.util.StringTokenizer for this.
You can use String.split to split your string on , and =
String str = "X-Value = -0.525108, Y-Value = 7.746691, Z-Value = 5.863008, Timestamp(milliseconds) = 23001";
ArrayList<String> final_data = new ArrayList<String>();
String[] data = str.split(",");
for(String S : data)
    final_data.add(S.trim().split("=")[1]);
for(String s : final_data)
    System.out.println(s.trim());
Output:
-0.525108
7.746691
5.863008
23001
You can use Scanner like this to extract your values:
String str = "X-Value = -0.525108, Y-Value = 7.746691, Z-Value = 5.863008, Timestamp(milliseconds) = 23001";
Scanner scanner = new Scanner(str);
if (scanner.findInLine("^X-Value\\s*=\\s*([^,]*),\\s*Y-Value\\s*=\\s*([^,]*),\\s*Z-Value\\s*=\\s*([^,]*),\\s*Timestamp\\(milliseconds\\)\\s+=\\s+([^,]*)\\s*$") != null) {
    MatchResult result = scanner.match();
    System.out.printf("x=[%s]; y=[%s]; z=[%s]; ts=[%s]%n", result.group(1), result.group(2), result.group(3), result.group(4));
}
scanner.close();
OUTPUT:
x=[-0.525108]; y=[7.746691]; z=[5.863008]; ts=[23001]
String s = "X-Value = -0.525108, Y-Value = 7.746691, Z-Value = 5.863008, Timestamp(milliseconds) = 23001";
s = s.replaceAll(" ", "");
String[] split = s.split("=|,");
BigDecimal x = new BigDecimal(split[1]);
BigDecimal y = new BigDecimal(split[3]);
BigDecimal z = new BigDecimal(split[5]);
String ts = split[7];
Why play around with split()? Just go for a regex:
X-Value\s*=\s*([\d.+-]*).*Y-Value\s*=\s*([\d.+-]*).*Z-Value\s*=\s*([\d.+-]*).*Timestamp\(milliseconds\)\s*=\s*(\d*)
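As a shorter regex-based alternative, the numbers can also be collected without spelling out every label. The sketch below rests on an assumption about the data: each value directly follows an = sign and is a plain decimal number with an optional sign. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SensorLineParser {
    // An '=' followed by optional spaces and a signed decimal number.
    private static final Pattern VALUE =
            Pattern.compile("=\\s*([+-]?\\d+(?:\\.\\d+)?)");

    // Returns the numeric strings in the order they appear on the line.
    static List<String> extractValues(String line) {
        List<String> values = new ArrayList<>();
        Matcher matcher = VALUE.matcher(line);
        while (matcher.find()) {
            values.add(matcher.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String line = "X-Value = -0.525108, Y-Value = 7.746691, "
                + "Z-Value = 5.863008, Timestamp(milliseconds) = 23001";
        List<String> values = extractValues(line);
        double x = Double.parseDouble(values.get(0));
        long timestamp = Long.parseLong(values.get(3));
        System.out.println(x + " " + timestamp);
    }
}
```

Because the labels are ignored, this relies on the fields always appearing in the same order; the fully spelled-out pattern is the safer choice if that is not guaranteed.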