I'm wondering how I could grab every nth line from a String, say every 100th, with the lines in the String being separated by a '\n'.
This is probably a simple thing to do but I really can't think of how to do it, so does anybody have a solution?
Thanks much,
Alex.
UPDATE:
Sorry I didn't explain my question very well.
Basically, imagine there's a 350 line file. I want to grab the start and end of each 100 line chunk. Pretending each line is 10 characters long, I'd finish with 2 separate arrays (containing start and end indexes) like this:
(Lines 0-100) 0-1000
(Lines 100-200) 1000-2000
(Lines 200-300) 2000-3000
(Lines 300-350) 3000-3500
So then if I wanted to mess around with say the second set of 100 lines (100-200) I have the regions for them.
You can split the string into an array using split() and then just get the indexes you want, like so:
String[] strings = myString.split("\n");
int nth = 100;
for(int i = nth; i < strings.length; i += nth) {
System.out.println(strings[i]);
}
String newLine = System.getProperty("line.separator");
String lines[] = text.split(newLine);
Where text is string with your whole text.
Now to get nth line, do e.g.:
System.out.println(lines[nth - 1]); // Minus one, because arrays in Java are zero-indexed
One approach is to create a StringReader from the string, wrap it in a BufferedReader and use that to read lines. Alternatively, you could just split on \n to get the lines, of course...
String[] allLines = text.split("\n");
List<String> selectedLines = new ArrayList<String>();
for (int i = 0; i < allLines.length; i += 100)
{
selectedLines.add(allLines[i]);
}
This is simpler code than using a BufferedReader, but it does mean having the complete split string in memory (as well as the original, at least temporarily, of course). It's also less flexible in terms of being adapted to reading lines from other sources such as a file. But if it's all you need, it's pretty straightforward :)
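For comparison, the StringReader/BufferedReader approach mentioned above could look roughly like the sketch below. The class and method names (NthLineReader, everyNthLine) are made up for illustration, and the sketch assumes zero-based line numbering, so it returns lines 0, n, 2n and so on.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class NthLineReader {
    // Returns lines 0, n, 2n, ... of text (zero-based), reading one line at a time
    // instead of splitting the whole string up front.
    static List<String> everyNthLine(String text, int n) {
        List<String> selected = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            int lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                if (lineNumber % n == 0) {
                    selected.add(line);
                }
                lineNumber++;
            }
        } catch (IOException e) {
            // A StringReader should not fail in practice; rethrow unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return selected;
    }
}
```

Swapping the StringReader for a FileReader would let the same loop read from a file, which is the flexibility referred to above.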
EDIT: If the start indexes are needed too, it becomes slightly more complicated... but not too bad. You probably want to encapsulate the "start and line" in a single class, but for the sake of brevity:
String[] allLines = text.split("\n");
List<String> selectedLines = new ArrayList<String>();
List<Integer> selectedIndexes = new ArrayList<Integer>();
int index = 0;
for (int i = 0; i < allLines.length; i++)
{
if (i % 100 == 0)
{
selectedLines.add(allLines[i]);
selectedIndexes.add(index);
}
index += allLines[i].length() + 1; // Add 1 for the trailing "\n"
}
Of course given the start index and the line, you can get the end index just by adding the line length :)
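If what is wanted is the start and end offset of each 100-line chunk, as in the updated question, one possible sketch is shown below. The names (ChunkRegions, regions) are hypothetical, and it assumes lines are separated by '\n' only and that a line's trailing '\n' counts as part of that line, which matches the 0-1000, 1000-2000 layout in the question.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkRegions {
    // Returns {start, end} character offsets (end exclusive) for each block of
    // chunkSize lines. The last block may contain fewer lines.
    static List<int[]> regions(String text, int chunkSize) {
        List<int[]> result = new ArrayList<>();
        int chunkStart = 0;
        int linesInChunk = 0;
        int pos = 0;
        while (pos < text.length()) {
            int newline = text.indexOf('\n', pos);
            // Move past the end of the current line (or to the end of the text).
            pos = (newline == -1) ? text.length() : newline + 1;
            linesInChunk++;
            if (linesInChunk == chunkSize) {
                result.add(new int[] {chunkStart, pos});
                chunkStart = pos;
                linesInChunk = 0;
            }
        }
        if (linesInChunk > 0) {
            result.add(new int[] {chunkStart, pos}); // partial final chunk
        }
        return result;
    }
}
```

With a region in hand, text.substring(region[0], region[1]) gives back that chunk without splitting the string into an array of lines first.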
I've got some text files I need to extract data from. The file itself contains around a hundred lines and the interesting part for me is:
AA====== test==== ====================================================/
AA normal low max max2 max3 /
AD .45000E+01 .22490E+01 .77550E+01 .90000E+01 .47330E+00 /
Say I need to extract the double values under "normal", "low" and "max". Is there any efficient and not-too-error-prone solution other than regexing the hell out of the text file?
If you really want to avoid regexes, and assuming you'll always have this same basic format, you could do something like:
HashMap<String, Double> map = new HashMap<>();
Scanner scan = new Scanner(filePath); //filePath must be a File or Path (a String would be scanned as literal text); or use your preferred input mechanism
String topLine = scan.nextLine(); //read the top line outside the assert so it is consumed even when assertions are disabled
assert topLine.startsWith("AA======"); //ensure it is the top line
while (scan.hasNextLine()){
String[] headings = scan.nextLine().split("\\s+"); //("\t") can be used if you're sure the delimiters will always be tabs
String[] vals = scan.nextLine().split("\\s+");
assert headings[0].equals("AA"); //ensure the line tags are the expected ones
assert vals[0].equals("AD");
for (int i = 1; i < headings.length - 1; i++){ //start with 1 to skip the tag, stop before the trailing "/"
map.put(headings[i], Double.parseDouble(vals[i]));
}
}
//to make sure a certain value is contained in the map:
assert map.containsKey("normal");
//use it:
double normalValue = map.get("normal");
Code is untested as I don't have access to an IDE at the moment. Also, I obviously don't know what's variable and what will remain constant here (read: the "AD", "AA", etc.), but hopefully you get the gist and can modify as needed.
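A compact variant of the same idea, working directly on the heading line and the value line, might look like the sketch below. The names (HeaderValueParser, parse) are invented for illustration, and it assumes both lines are whitespace-separated, start with a tag such as "AA" or "AD", and end with a lone "/", as in the sample from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HeaderValueParser {
    // Pairs each heading with the value in the same column position.
    static Map<String, Double> parse(String headerLine, String valueLine) {
        String[] headings = headerLine.trim().split("\\s+");
        String[] values = valueLine.trim().split("\\s+");
        Map<String, Double> result = new LinkedHashMap<>();
        // Index 0 holds the "AA"/"AD" tag, so start at 1.
        for (int i = 1; i < headings.length && i < values.length; i++) {
            if (headings[i].equals("/") || values[i].equals("/")) {
                break; // trailing terminator, not a data column
            }
            // Double.parseDouble accepts the ".45000E+01" notation directly.
            result.put(headings[i], Double.parseDouble(values[i]));
        }
        return result;
    }
}
```

Calling parse with the two sample lines should give a map whose "normal" entry is 4.5. A malformed value would still raise NumberFormatException, which is probably what you want for a file that is supposed to have a fixed layout.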
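If each line will always have this exact form you can use String.split():
String line; // Fill with one line from the file
String[] cols = line.split("\\."); // split() takes a regex, so the dot must be escaped
String normal = "." + cols[1].trim(); // cols[0] is the "AD " tag before the first dot
String low = "." + cols[2].trim();
String max = "." + cols[3].trim();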
If you know what index each value will start, you can just do substrings of the row. (The split method technically does a regex).
i.e.
String normal = line.substring(x, y).trim();
String low = line.substring(z, w).trim();
etc.
I am trying to split up my info first into a String[] using "\n" as the delimiter, and then split each line again into a String[], this time using ";" as the delimiter.
I however fail at getting info out of the second split.
public static void initHashMap(){
String[] lijnen = readDefinitioncsv(definitioncsv).split("\n");
for (int i =2;i<lijnen.length;i++){
String[] detaillijn = lijnen[i].split(";");
// on the line below I get
//java.lang.ArrayIndexOutOfBoundsException: 1
int rood = Integer.parseInt(detaillijn[1]);
int groen = Integer.parseInt(detaillijn[2]);
int blauw = Integer.parseInt(detaillijn[3]);
String provincieNaam = detaillijn[4];
RGBProvince.put(new Color(rood,groen,blauw), provincieNaam);
}
}
Thank you for your time
String[] lijnen = readDefinitioncsv(definitioncsv).split("\r?\n");
for (int i =2;i<lijnen.length;i++){
String[] detaillijn = lijnen[i].split("[,;\t]");
if (detaillijn.length < 5) {
throw new IllegalArgumentException("Fewer than 5 elements: "
+ lijnen[i]);
}
This handles Windows line endings (\r\n aka CR+LF) and also other forms of CSV - as ; did not seem to function.
Maybe the file ends with an empty line, in which case you need to skip it with a continue.
For good order, indices start at 0; you seem to be skipping 2 header lines, and the first column.
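Putting those points together (optional "\r", blank lines skipped, a length check before indexing), a defensive version of the loop could be sketched as follows. The names (ProvinceColors, parse) are hypothetical, the colour is packed into a single int rather than a Color object to keep the example free of AWT, and the layout of two header lines followed by id;red;green;blue;name rows is an assumption based on the question's code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ProvinceColors {
    // Maps a packed 0xRRGGBB value to the province name.
    static Map<Integer, String> parse(String csv) {
        Map<Integer, String> result = new LinkedHashMap<>();
        String[] lines = csv.split("\r?\n");          // accept both \n and \r\n
        for (int i = 2; i < lines.length; i++) {      // skip the two header lines
            String line = lines[i].trim();
            if (line.isEmpty()) {
                continue;                             // tolerate blank lines
            }
            String[] fields = line.split(";");
            if (fields.length < 5) {
                throw new IllegalArgumentException(
                        "Fewer than 5 fields on line " + (i + 1) + ": " + line);
            }
            int red = Integer.parseInt(fields[1].trim());
            int green = Integer.parseInt(fields[2].trim());
            int blue = Integer.parseInt(fields[3].trim());
            result.put((red << 16) | (green << 8) | blue, fields[4].trim());
        }
        return result;
    }
}
```

The exception message includes the offending line, which usually makes it obvious whether the delimiter or the line endings are the real problem.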
Why is your for loop initializing at the third position?
for (int i =2;i<lijnen.length;i++){
presumably you want i=0?
Also, your array might not have 2 elements:
int rood = Integer.parseInt(detaillijn[1]);
This is the second position, but there might be nothing there because of the loop above.
You should probably check you have at least those many lines / columns before you hop straight to one in an array.
You're initializing i to 2 in your for loop. If your array has less than three items, you will get an index out of bounds trying to access a location that does not exist.
Why not start with int i = 0?
I want to extract the pattern of a bit stream held in a String variable. Assume the bit stream is something like bitStream="111000001010000100001111". I am looking for Java code that saves this bit stream in an array (say bitArray) so that each run of continuous "0"s or "1"s is stored in one array element. In this example the output would be something like this:
bitArray[0]="111"
bitArray[1]="00000"
bitArray[2]="1"
bitArray[3]="0"
bitArray[4]="1"
bitArray[5]="0000"
bitArray[6]="1"
bitArray[7]="0000"
bitArray[8]="1111"
I want to use bitArray to calculate the number of bits stored in each continuous run. In this case the final output would be "3,5,1,1,1,4,1,4,4". I figured that the split method would probably solve this, but I don't know which pattern to split on: bitStream.split("1+") splits on runs of continuous "1"s and bitStream.split("0+") splits on runs of continuous "0"s, but how can it split on both?
Mathew suggested this solution and it works:
String wholeString = "111000001010000100001111";
wholeString = wholeString.replace("10", "1,0");
wholeString = wholeString.replace("01", "0,1");
String[] stringSplit = wholeString.split(",");
My question is "Is this solution the most efficient one?"
Try replacing any occurrence of "01" and "10" with "0,1" and "1,0" respectively. Then once you've injected the commas, split the string using the comma as the delimiting character.
String wholeString = "111000001010000100001111";
wholeString = wholeString.replace("10", "1,0");
wholeString = wholeString.replace("01", "0,1");
String stringSplit[] = wholeString.split(",");
You can do this with a simple regular expression. It matches 1s and 0s and will return each in the order they occur in the stream. How you store or manipulate the results is up to you. Here is some example code.
String testString = "111000001010000100001111";
Pattern pattern = Pattern.compile("1+|0+");
Matcher matcher = pattern.matcher(testString);
while (matcher.find())
{
System.out.print(matcher.group().length());
System.out.print(" ");
}
This will result in the following output:
3 5 1 1 1 4 1 4 4
One option for storing the results is to put them in an ArrayList<Integer>
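As a rough sketch of that option, the loop above can collect the run lengths into a list instead of printing them. The class and method names (RunLengths, of) are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RunLengths {
    // Compiled once: a run is one or more 1s, or one or more 0s.
    private static final Pattern RUN = Pattern.compile("1+|0+");

    // Returns the length of each consecutive run, in order of appearance.
    static List<Integer> of(String bits) {
        List<Integer> lengths = new ArrayList<>();
        Matcher matcher = RUN.matcher(bits);
        while (matcher.find()) {
            // end() - start() is the run length, without building a substring.
            lengths.add(matcher.end() - matcher.start());
        }
        return lengths;
    }
}
```

For the test string above, RunLengths.of(testString) yields [3, 5, 1, 1, 1, 4, 1, 4, 4].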
Since the OP wanted the most efficient solution, I ran some tests to see how long each answer takes to iterate over a large stream 10000 times and came up with the following results. The times differed from test to test, but the order from fastest to slowest remained the same. I know this kind of wall-clock timing has its issues, like not accounting for system load, but I just wanted a quick test.
My answer completed in 1145 ms
Alessio's answer completed in 1202 ms
Matthew Lee Keith's answer completed in 2002 ms
Evgeniy Dorofeev's answer completed in 2556 ms
Hope this helps
I won't give you a code, but I'll guide you to a possible solution:
Construct an ArrayList<Integer> and iterate over the bits. As long as the current bit equals the previous one, increment a counter; as soon as the bit changes, add the counter to the ArrayList and reset it. After this procedure you'll have an ArrayList of numbers, e.g. [1,2,2,3,4], representing the lengths of the alternating runs of 1's and 0's.
Then you construct an array of the size of the ArrayList, and fill it accordingly.
The time complexity is O(n) because you need to iterate on the array only once.
This code works for any String and patterns, not only 1s and 0s. Iterate char by char, and if the current char is equal to the previous one, append the last char to the last element of the List, otherwise create a new element in the list.
public List<String> getArray(String input){
List<String> output = new ArrayList<String>();
if(input==null || input.length()==0) return output;
int count = 0;
char [] inputA = input.toCharArray();
output.add(inputA[0]+"");
for(int i = 1; i <inputA.length;i++){
if(inputA[i]==inputA[i-1]){
String current = output.get(count)+inputA[i];
output.remove(count);
output.add(current);
}
else{
output.add(inputA[i]+"");
count++;
}
}
return output;
}
try this
String[] a = s.replaceAll("(.)(?!\\1)", "$1,").split(",");
I tried to implement @Maroun Maroun's solution.
public static void main(String args[]){
long start = System.currentTimeMillis();
String bitStream ="0111000001010000100001111";
int length = bitStream.length();
char base = bitStream.charAt(0);
ArrayList<Integer> counts = new ArrayList<Integer>();
int count = -1;
char currChar = ' ';
for (int i=0;i<length;i++){
currChar = bitStream.charAt(i);
if (currChar == base){
count++;
}else {
base = currChar;
counts.add(count+1);
count = 0;
}
}
counts.add(count+1);
System.out.println("Time taken :" + (System.currentTimeMillis()-start ) +"ms");
System.out.println(counts.toString());
}
I believe this is the more efficient way: as he said, it is O(n), because you iterate only once. Since the goal is only to get the counts, not to store the runs in an array, I would recommend this. Even a regular expression has to iterate over the string internally anyway.
The resulting output is
Time taken :0ms
[1, 3, 5, 1, 1, 1, 4, 1, 4, 4]
Try this one:
String[] parts = input.split("(?<=1)(?=0)|(?<=0)(?=1)");
See in action here: http://rubular.com/r/qyyfHNAo0T
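The pattern matches the empty position between a 1 and a following 0, or between a 0 and a following 1, so split() cuts the string at every change of digit without consuming any characters. A small sketch of how it could be used to get the counts is below; the names (BitRuns, split, lengths) are invented for the example, and the input is assumed to be non-empty.

```java
class BitRuns {
    // Splits at every zero-width boundary where the digit changes.
    static String[] split(String bits) {
        return bits.split("(?<=1)(?=0)|(?<=0)(?=1)");
    }

    // Joins the run lengths with commas, e.g. "3,5,1".
    static String lengths(String bits) {
        StringBuilder sb = new StringBuilder();
        for (String run : split(bits)) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(run.length());
        }
        return sb.toString();
    }
}
```

For the bit stream in the question, BitRuns.lengths returns "3,5,1,1,1,4,1,4,4".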
I have a String which always looks like this:
data
data
data
data
non-data
non-data
And I need to delete the last 2 lines from it. The length of these lines can vary. How can I do that fast (the String has ~1000 lines)?
I'd say something along the lines of:
String[] lines = input.split("\n");
String[] dataLines = Arrays.copyOfRange(lines, 0, lines.length - 2);
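int lastNewLineAt = string.lastIndexOf("\n");
// Search backwards from just before the last newline to find the one before it
string = string.substring(0, string.lastIndexOf("\n", lastNewLineAt - 1));
You can use a constant for the newline character, read from the line.separator system property.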
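Generalised to any number of trailing lines, the lastIndexOf idea could be sketched like this. The names (TrimLines, dropLastLines) are hypothetical, and the sketch assumes '\n' separators and no trailing newline at the end of the text.

```java
class TrimLines {
    // Removes the last `count` lines from text by walking backwards over newlines.
    static String dropLastLines(String text, int count) {
        int end = text.length();
        for (int i = 0; i < count; i++) {
            // Find the newline that precedes the current cut position.
            end = text.lastIndexOf('\n', end - 1);
            if (end == -1) {
                return ""; // the text has no more than `count` lines, so nothing is left
            }
        }
        return text.substring(0, end);
    }
}
```

This avoids creating an array of ~1000 strings: only the final substring is allocated.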
This code splits your text on "\n", which puts your lines into a String array.
Then you take that array's length.
In a for loop you then append the lines back to your text, stopping before the last element.
This may be a long approach, but I searched for this and couldn't find anything.
This was the easiest way for me.
String[] lines = YourTextView.getText().toString().split("\n");
YourTextView.setText(""); // clear your TextView
int Arraylength = lines.length-1; // drops the last line; use "-2" to drop the last two lines
for(int i=0;i<Arraylength;i++){
YourTextView.append(lines[i]+"\n");
}
How would I remove the chars from the data in this file so I could sum up the numbers?
Alice Jones,80,90,100,95,75,85,90,100,90,92
Bob Manfred,98,89,87,89,9,98,7,89,98,78
I want to do this so for every line it will remove all the chars but not ints.
The following code might be useful to you, try running it once,
public static void main(String ar[])
{
String s = "kasdkasd,1,2,3,4,5,6,7,8,9,10";
int sum=0;
String[] spl = s.split(",");
for(int i=0;i<spl.length;i++)
{
try{
int x = Integer.parseInt(spl[i]);
sum = sum + x;
}
catch(NumberFormatException e)
{
System.out.println("error parsing "+spl[i]);
System.out.println("\n the stack of the exception");
e.printStackTrace();
System.out.println("\n");
}
}
System.out.println("The sum of the numbers in the string : "+ sum);
}
Even a String of the form "abcd,1,2,3,asdas,12,34,asd" would give you the sum of the numbers.
You need to split each line into a String array and parse the numbers starting from index 1
String[] arr = line.split(",");
for(int i = 1; i < arr.length; i++) {
int n = Integer.parseInt(arr[i]);
...
try this:
String input = "Name,2,1,3,4,5,10,100";
String[] strings = input.split(",");
int result=0;
for (int i = 1; i < strings.length; i++)
{
result += Integer.parseInt(strings[i]);
}
You can make use of the split method of course, supplying "," as the parameter, but that's not all.
The trick is to put each line of the text file into an ArrayList. Once you have that, follow this pseudocode:
1) Put each line of the text file inside an ArrayList
2) For each line, Split to an array by using ","
3) If the Array's size is bigger than 1, it means there are numbers to be summed up, else only the name lies on the array and you should continue to the next line
4) So the size is bigger than 1, iterate thru the strings inside this String[] array generated by the Split function, from 1 to < Size (this will exclude the name string itself)
5) use Integer.parseInt( iterated number as String ) and sum it up
There you go
Number Format Exception would occur if the string is not a number but you are putting each line into an ArrayList and excluding the name so there should be no problem :)
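The steps above can be sketched as follows, assuming the lines have already been read into a List<String>. The names (ScoreSums, sumPerLine) are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ScoreSums {
    // Returns one sum per line that contains at least one number after the name.
    static List<Integer> sumPerLine(List<String> lines) {
        List<Integer> sums = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.split(",");
            if (fields.length <= 1) {
                continue; // only a name (or nothing) on this line
            }
            int sum = 0;
            for (int i = 1; i < fields.length; i++) { // index 0 is the name
                sum += Integer.parseInt(fields[i].trim());
            }
            sums.add(sum);
        }
        return sums;
    }
}
```

For the two sample lines in the question this gives [897, 742].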
Well, if you know that it's a CSV file, in this exact format, you could read the line, execute string.split(",") and then disregard the first returned string in the array of results. See Evgenly's answer.
Edit: here's the complete program:
class Foo {
static String input = "Name,2,1,3,4,5,10,100";
public static void main(String[] args) {
String[] strings = input.split(",");
int result=0;
for (int i = 1; i < strings.length; i++)
{
result += Integer.parseInt(strings[i]);
}
System.out.println(result);
}
}
(wow, I never wrote a program before that didn't import anything.)
And here's the output:
125
If you're not interested in parsing the file, but just want to remove the first field, then split it, disregard the first field, and rejoin the remaining fields.
String[] fields = line.split(",");
StringBuilder sb = new StringBuilder(fields[1]);
for (int i=2; i < fields.length; ++i)
sb.append(',').append(fields[i]);
line = sb.toString();
You could also use a Pattern (regular expression):
line = line.replaceFirst("[^,]*,", "");
Of course, this assumes that the first field contains no commas. If it does, things get more complicated. I assume the commas are escaped somehow.
There are a couple of CsvReader/Writers that might be helpful to you for handling CSV data. Apart from that:
I'm not sure if you are summing up rows? columns? both? in any case create an array of the target sum counters int[] sums(or just one int sum)
Read one row, then process it either using split(a bit heavy, but clear) or by parsing the line into numbers yourself (likely to generate less garbage and work faster).
Add numbers to counters
Continue until end of file
Loading the whole file before starting to process it is not a good idea, as you are doing 2 bad things:
Stuffing the file into memory, if it's a large file you'll run out of memory (very bad)
Iterating over the data 2 times instead of one (probably not the end of the world)
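A minimal sketch of the row-by-row approach, summing columns as each line is read, might look like this. The names (ColumnSums, sumColumns) are hypothetical, and it assumes the first field of every row is a name and that the caller knows how many numeric columns to expect.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class ColumnSums {
    // Reads one row at a time and adds each number to its column's counter,
    // so only the current line is held in memory.
    static int[] sumColumns(Reader source, int columns) {
        int[] sums = new int[columns];
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(",");
                // fields[0] is the name; numeric column k lives at fields[k + 1].
                for (int i = 1; i < fields.length && i <= columns; i++) {
                    sums[i - 1] += Integer.parseInt(fields[i].trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sums;
    }
}
```

Passing a FileReader as the source would process a file in a single pass. For row sums instead of column sums, a single int reset per line would replace the array.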
Suppose, format of the string is fixed.
String s = "Alice Jones,80,90,100,95,75,85,90,100,90,92";
At first, I would get rid of characters
Matcher matcher = Pattern.compile("(\\d+,)+\\d+").matcher(s);
int sum = 0;
After getting string of integers, separated by a comma, I would split them into array of Strings, parse it into integer value and sum ints:
if (matcher.find()){
for (String ele: matcher.group(0).split(",")){
sum+= Integer.parseInt(ele);
}
}
System.out.println(sum);