Java .split() out of bounds - java

I have a problem with my code.
I'm trying to extract the channel names from a .txt file.
I can't understand why line.split() gives me back an array that is too short, so that indexing it goes out of bounds.
Can someone help me?
This is the .txt file:
------------[channels.txt]---------------------
...
#CH id="" tvg-name="Example1" tvg-logo="http...
#CH id="" tvg-name="Example2" tvg-logo="http...
#CH id="" tvg-name="Example3" tvg-logo="http...
#CH id="" tvg-name="Example4" tvg-logo="http...
...
This is my code:
try {
FileInputStream VOD = new FileInputStream("channels.txt");
BufferedReader buffer_r = new BufferedReader(new InputStreamReader(VOD));
String line;
ArrayList<String> name_channels = new ArrayList<String>();
while ((line = buffer_r.readLine()) != null ) {
if (line.startsWith("#")) {
String[] first_scan = line.split(" tvg-name=\" ", 2);
String first = first_scan[1]; // <--- out of bounds
String[] second_scan = first.split(" \"tvg-logo= ", 2);
String second = second_scan[0];
name_channels.add(second);
} else {
//...
}
}
for (int i = 0; i < name_channels.size(); i++) {
System.out.println("Channel: " + name_channels.get(i));
}
} catch(Exception e) {
System.out.println(e);
}

So you have examples like this
#CH id="" tvg-name="Example1" tvg-logo="http...
And are trying to split on these strings
" tvg-name=\" "
" \"tvg-logo= "
Neither of those strings is in the example. Each has a spurious trailing space, and the space at the start of the second one is in the wrong place (it belongs after the double quote, not before it).
Fix the strings, and here's a concise but complete program to demonstrate:
interface Split {
static void main(String[] args) {
String line = "#CH id=\"\" tvg-name=\"Example1\" tvg-logo=\"http...";
String[] first_scan = line.split(" tvg-name=\"", 2);
String first = first_scan[1]; // index 1 exists now that the delimiter matches
String[] second_scan = first.split("\" tvg-logo=", 2);
String second = second_scan[0];
System.err.println(second);
}
}
Of course, if you have any lines that start with '#' but don't match, you'll have a similar problem.
This sort of thing is probably better done with regexes and capturing groups.
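The caveat about lines that start with '#' but don't match can be handled by checking the array length before indexing. What follows is only a minimal sketch, assuming the line layout from the question; the class and method names (ChannelNames, extractName) are made up for illustration.

```java
import java.util.Optional;

class ChannelNames {
    // Returns the tvg-name value, or empty if the line lacks the expected markers.
    static Optional<String> extractName(String line) {
        String[] firstScan = line.split(" tvg-name=\"", 2);
        if (firstScan.length < 2) {
            return Optional.empty(); // opening marker missing: there is no index 1
        }
        String[] secondScan = firstScan[1].split("\" tvg-logo=", 2);
        if (secondScan.length < 2) {
            return Optional.empty(); // closing marker missing
        }
        return Optional.of(secondScan[0]);
    }

    public static void main(String[] args) {
        // Sample line taken from the question; the second line is a made-up non-matching one.
        System.out.println(extractName("#CH id=\"\" tvg-name=\"Example1\" tvg-logo=\"http..."));
        System.out.println(extractName("#CH without the expected markers"));
    }
}
```

Returning an Optional pushes the "no match" case onto the caller, instead of letting an ArrayIndexOutOfBoundsException surface from inside the loop.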

There is a whitespace after the last double quote in " tvg-name=\" ", which does not match the data in your example.
When you use line.split(" tvg-name=\"", 2), the first item in the returned array will be #CH id="" and the second will be Example1" tvg-logo="http...
If you want the value of tvg-name, you might use a regex with a capturing group that captures anything except a double quote, using a negated character class [^"]+:
tvg-name="([^"]+)"
try {
FileInputStream VOD = new FileInputStream("channels.txt");
BufferedReader buffer_r = new BufferedReader(new InputStreamReader(VOD));
String line;
ArrayList<String> name_channels = new ArrayList<String>();
while((line = buffer_r.readLine()) != null ){
if(line.startsWith("#")){
String regex = "tvg-name=\"([^\"]+)\"";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(line);
while (matcher.find()) {
name_channels.add(matcher.group(1));
}
} else {
// ...
}
}
for(int i = 0; i < name_channels.size(); i++){
System.out.println("Channel: " + name_channels.get(i));
}
}catch(Exception e){
System.out.println(e);
}

Related

How can I scope three different conditions using the same loop in Java?

I would like to count countSign, countX and countI using the same loop instead of creating three different loops. Is there an easy way to approach that?
public class Absence {
private static File file = new File("/Users/naplo.txt");
private static File file_out = new File("/Users/naplo_out.txt");
private static BufferedReader br = null;
private static BufferedWriter bw = null;
public static void main(String[] args) throws IOException {
int countSign = 0;
int countX = 0;
int countI = 0;
String sign = "#";
String absenceX = "X";
String absenceI = "I";
try {
br = new BufferedReader(new FileReader(file));
bw = new BufferedWriter(new FileWriter(file_out));
String st;
while ((st = br.readLine()) != null) {
for (String element : st.split(" ")) {
if (element.matches(sign)) {
countSign++;
continue;
}
if (element.matches(absenceX)) {
countX++;
continue;
}
if (element.matches(absenceI)) {
countI++;
}
}
}
System.out.println("2. exerc.: There are " + countSign + " rows in the file with that sign.");
System.out.println("3. exerc.: There are " + countX + " with sick note, and " + countI + " without sick note!");
} catch (FileNotFoundException ex) {
Logger.getLogger(Absence.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
text file example:
# 03 26
Jujuba Ibolya IXXXXXX
Maracuja Kolos XXXXXXX
I think you meant using fewer than 3 if statements. You can actually do it with no ifs.
In your for loop write this:
countSign += (element.matches(sign)) ? 1 : 0;
countX += (element.matches(absenceX)) ? 1 : 0;
countI += (element.matches(absenceI)) ? 1 : 0;
Both answers check if the word (element) matches all regular expressions while this can (and should, if you ask me) be avoided since a word can match only one regex. I am referring to the continue part your original code has, which is good since you do not have to do any further checks.
So, I am leaving here one way to do it with Java 8 Streams in "one liner".
But let's assume the following regular expressions:
String absenceX = "X*";
String absenceI = "I.*";
and one more (for the sake of the example):
String onlyNumbers = "[0-9]*";
In order to have some matches on them.
The text is as you gave it.
public class Test {
public static void main(String[] args) throws IOException {
File desktop = new File(System.getProperty("user.home"), "Desktop");
File txtFile = new File(desktop, "test.txt");
String sign = "#";
String absenceX = "X*";
String absenceI = "I.*";
String onlyNumbers = "[0-9]*";
List<String> regexes = Arrays.asList(sign, absenceX, absenceI, onlyNumbers);
List<String> lines = Files.readAllLines(txtFile.toPath());
//@formatter:off
Map<String, Long> result = lines.stream()
.flatMap(line-> Stream.of(line.split(" "))) //map these lines to words
.map(word -> regexes.stream().filter(word::matches).findFirst()) //find the first regex this word matches
.filter(Optional::isPresent) //If it matches no regex, it will be ignored
.collect(Collectors.groupingBy(Optional::get, Collectors.counting())); //collect
System.out.println(result);
}
}
The result:
{X*=1, #=1, I.*=2, [0-9]*=2}
X*=1 came from word: XXXXXXX
#=1 came from word: #
I.*=2 came from words: IXXXXXX and Ibolya
[0-9]*=2 came from words: 03 and 26
Ignore the fact I load all lines in memory.
So I got it working with the following lines. It had escaped my attention that the characters need to be separated from each other. Your ternary suggestion is also nice, so I will use it.
String myString;
while ((myString = br.readLine()) != null) {
// Insert a space between all characters so split(" ") yields single characters
String newString = myString.replaceAll("", " ").trim();
for (String element : newString.split(" ")) {
countSign += (element.matches(sign)) ? 1 : 0;
countX += (element.matches(absenceX)) ? 1 : 0;
countI += (element.matches(absenceI)) ? 1 : 0;
}
}
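Counting every single character also counts letters in the names (Ibolya starts with an I, for example). An alternative is to look only at the marks themselves. This is a rough sketch, assuming the marks are always the last space-separated token on a line and that lines starting with # are date lines, as in the sample file; AbsenceCounter and count are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

class AbsenceCounter {
    // Returns {number of '#' lines, number of X marks, number of I marks}.
    static int[] count(List<String> lines) {
        int signLines = 0;
        int xCount = 0;
        int iCount = 0;
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            if (trimmed.startsWith("#")) {
                signLines++;
                continue;
            }
            // Assumption: the absence marks are the last token on the line.
            String marks = trimmed.substring(trimmed.lastIndexOf(' ') + 1);
            for (char c : marks.toCharArray()) {
                if (c == 'X') {
                    xCount++;
                } else if (c == 'I') {
                    iCount++;
                }
            }
        }
        return new int[] {signLines, xCount, iCount};
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "# 03 26",
                "Jujuba Ibolya IXXXXXX",
                "Maracuja Kolos XXXXXXX");
        int[] result = count(sample);
        System.out.println(result[0] + " date lines, " + result[1] + " X, " + result[2] + " I");
    }
}
```

All three counters are still updated in a single pass over the lines, which is what the question asked for.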

how to delete up extra line breakers in string

I have text like this in my String s (which I have already read from a .txt file)
trump;Donald Trump;trump@yahoo.eu
obama;Barack Obama;obama@google.com
bush;George Bush;bush@inbox.com
clinton,Bill Clinton;clinton@mail.com
Then I'm trying to cut off everything except the e-mail address and print it to the console
String f1[] = null;
f1=s.split("(.*?);");
for (int i=0;i<f1.length;i++) {
System.out.print(f1[i]);
}
and I get output like this:
trump@yahoo.eu
obama@google.com
bush@inbox.com
clinton@mail.com
How can I avoid this? I mean, how can I get the output text without line breaks?
Try the approach below. I read your file with both Scanner and BufferedReader, and in neither case do I get any line breaks. file.txt is the file that contains the text, and the splitting logic is the same as yours.
public class CC {
public static void main(String[] args) throws IOException {
Scanner scan = new Scanner(new File("file.txt"));
while (scan.hasNextLine()) {
String f1[] = null;
f1 = scan.nextLine().split("(.*?);");
for (int i = 0; i < f1.length; i++) {
System.out.print(f1[i]);
}
}
scan.close();
BufferedReader br = new BufferedReader(new FileReader(new File("file.txt")));
String str = null;
while ((str = br.readLine()) != null) {
String f1[] = null;
f1 = str.split("(.*?);");
for (int i = 0; i < f1.length; i++) {
System.out.print(f1[i]);
}
}
br.close();
}
}
You may just remove all line breaks, as shown in the code below:
String f1[] = null;
f1=s.split("(.*?);");
for (int i=0;i<f1.length;i++) {
System.out.print(f1[i].replaceAll("\r", "").replaceAll("\n", ""));
}
This replaces each of them with an empty string.
Instead of split, you might match an email-like format: one or more characters that are neither a semicolon nor whitespace, using a negated character class [^\\s;]+, followed by an @, and then again one or more characters that are neither a semicolon nor whitespace.
final String regex = "[^\\s;]+@[^\\s;]+";
final String string = "trump;Donald Trump;trump@yahoo.eu \n"
+ " obama;Barack Obama;obama@google.com \n"
+ " bush;George Bush;bush@inbox.com \n"
+ " clinton,Bill Clinton;clinton@mail.com";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
final List<String> matches = new ArrayList<String>();
while (matcher.find()) {
matches.add(matcher.group());
}
System.out.println(String.join("", matches));
[^\\s;]+@[^\\s;]+
package com.test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String s = "trump;Donald Trump;trump@yahoo.eu "
+ "obama;Barack Obama;obama@google.com "
+ "bush;George Bush;bush@inbox.com "
+ "clinton;Bill Clinton;clinton@mail.com";
String spaceStrings[] = s.split("[\\s,;]+");
String output="";
for(String word:spaceStrings){
if(validate(word)){
output+=word;
}
}
System.out.println(output);
}
public static final Pattern VALID_EMAIL_ADDRESS_REGEX = Pattern.compile(
"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,6}$",
Pattern.CASE_INSENSITIVE);
public static boolean validate(String emailStr) {
Matcher matcher = VALID_EMAIL_ADDRESS_REGEX.matcher(emailStr);
return matcher.find();
}
}
Just remove any '\n' that may appear at the start or end of each piece.
Write it this way:
String f1[] = null;
f1=s.split("(.*?);");
for (int i=0;i<f1.length;i++) {
f1[i] = f1[i].replace("\n", ""); // replace needs both a target and a replacement
System.out.print(f1[i]);
}
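If only the last field of each record is needed, it may be simpler to split the text into lines first and keep whatever follows the final semicolon. The following is a minimal sketch under that assumption; LastField and lastFields are hypothetical names, and the sample data is modeled on the question.

```java
import java.util.ArrayList;
import java.util.List;

class LastField {
    // Returns the text after the last ';' of every non-blank line.
    static List<String> lastFields(String text) {
        List<String> result = new ArrayList<>();
        // \R matches any line break (\n, \r\n, \r), so no stray breaks survive.
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            result.add(trimmed.substring(trimmed.lastIndexOf(';') + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "trump;Donald Trump;trump@yahoo.eu\n"
                + "obama;Barack Obama;obama@google.com\n"
                + "bush;George Bush;bush@inbox.com\n"
                + "clinton,Bill Clinton;clinton@mail.com";
        // Joined with an empty separator, mirroring System.out.print in the question.
        System.out.println(String.join("", lastFields(s)));
    }
}
```

Because the line breaks are consumed by the first split, nothing has to be stripped from the pieces afterwards.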

write to separate columns in csv

I am trying to write 2 different arrays to a csv. The first one I want in the first column, and second array in the second column, like so:
array1val1 array2val1
array1val2 array2val2
I am using the following code:
String userHomeFolder2 = System.getProperty("user.home") + "/Desktop";
String csvFile = (userHomeFolder2 + "/" + fileName.getText() + ".csv");
FileWriter writer = new FileWriter(csvFile);
final String NEW_LINE_SEPARATOR = "\n";
FileWriter fileWriter;
CSVPrinter csvFilePrinter;
CSVFormat csvFileFormat = CSVFormat.DEFAULT.withRecordSeparator(NEW_LINE_SEPARATOR);
fileWriter = new FileWriter(fileName.getText());
csvFilePrinter = new CSVPrinter(fileWriter, csvFileFormat);
try (PrintWriter pw = new PrintWriter(csvFile)) {
pw.printf("%s\n", FILE_HEADER);
for(int z = 0; z < compSource.size(); z+=1) {
//below forces the result to get stored in below variable as a String type
String newStr=compSource.get(z);
String newStr2 = compSource2.get(z);
newStr.replaceAll(" ", "");
newStr2.replaceAll(" ", "");
String[] explode = newStr.split(",");
String[] explode2 = newStr2.split(",");
pw.printf("%s\n", explode, explode2);
}
}
catch (Exception e) {
System.out.println("Error in csvFileWriter");
e.printStackTrace();
} finally {
try {
fileWriter.flush();
fileWriter.close();
csvFilePrinter.close();
} catch (IOException e ) {
System.out.println("Error while flushing/closing");
}
}
However I am getting a strange output into the csv file:
[Ljava.lang.String;@17183ab4
I can run
pw.printf("%s\n", explode);
pw.printf("%s\n", explode2);
Instead of : pw.printf("%s\n", explode, explode2);
and it prints the actual strings but all in one same column.
Does anyone know how to solve this?
1. Your explode and explode2 are actually String arrays. You are printing the arrays themselves and not their values, so what ends up in the file is the array's default toString() (its type and hash code).
You should go through the arrays with a loop and print the elements:
for (int i = 0; i < explode.length; ++i) {
pw.printf("%s,%s\n", explode[i], explode2[i]);
}
2. Also, the format string needs one %s per argument, something like
pw.printf("%s,%s\n", explode[i], explode2[i]);
because you are passing two arguments, but ("%s\n", explode, explode2) prints only one of them.
Try it out and let me know if it worked.
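To illustrate the point about printing an array versus printing its values: the small sketch below uses Arrays.toString and String.join to show the contents, then pairs the elements up into comma-separated rows. The class name ArrayPrinting, the helper toCsvRows and the sample values are made up, and no CSV escaping is attempted for values that themselves contain commas or quotes.

```java
import java.util.Arrays;

class ArrayPrinting {
    // Pairs the i-th elements of both arrays into comma-separated rows.
    static String toCsvRows(String[] left, String[] right) {
        StringBuilder sb = new StringBuilder();
        int rows = Math.min(left.length, right.length); // ignore unpaired extras
        for (int i = 0; i < rows; i++) {
            sb.append(left[i]).append(',').append(right[i]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] explode = {"array1val1", "array1val2"};
        String[] explode2 = {"array2val1", "array2val2"};
        System.out.println(Arrays.toString(explode));   // the elements, not type@hash
        System.out.println(String.join(",", explode2)); // elements joined by commas
        System.out.print(toCsvRows(explode, explode2));
    }
}
```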
After these lines:
newStr.replaceAll(" ", "");
newStr2.replaceAll(" ", "");
String[] explode = newStr.split(",");
String[] explode2 = newStr2.split(",");
Use this code:
int maxLength = Math.max(explode.length, explode2.length);
for (int i = 0; i < maxLength; i++) {
String token1 = (i < explode.length) ? explode[i] : "";
String token2 = (i < explode2.length) ? explode2[i] : "";
pw.printf("%s %s\n", token1, token2);
}
This also covers the case where the arrays are of different lengths.
I have removed all unused variables and made some assumptions about content of compSource.
Moreover, don't forget String is immutable. If you just do "newStr.replaceAll(" ", "");", the replacement will be lost.
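import java.io.IOException;
import java.io.PrintWriter;
import java.util.Arrays;
import java.util.List;
import org.junit.Test; // assuming JUnit 4
public class Tester {
@Test
public void test() throws IOException {
// I assumed compSource and compSource2 are like below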
List<String> compSource = Arrays.asList("array1val1,array1val2");
List<String> compSource2 = Arrays.asList("array2val1,array2val2");
String userHomeFolder2 = System.getProperty("user.home") + "/Desktop";
String csvFile = (userHomeFolder2 + "/test.csv");
try (PrintWriter pw = new PrintWriter(csvFile)) {
pw.printf("%s\n", "val1,val2");
for (int z = 0; z < compSource.size(); z++) {
String newStr = compSource.get(z);
String newStr2 = compSource2.get(z);
// String is immutable --> store the result otherwise it will be lost
newStr = newStr.replaceAll(" ", "");
newStr2 = newStr2.replaceAll(" ", "");
String[] explode = newStr.split(",");
String[] explode2 = newStr2.split(",");
for (int k = 0; k < explode.length; k++) {
pw.println(explode[k] + "\t" + explode2[k]);
}
}
}
}
}

How can i split String in java with custom pattern

I am trying to get the location data from this string using String.split("[,\\:]");
String location = "$,lat:27.980194,lng:46.090199,speed:0.48,fix:1,sats:6,";
String[] str = location.split("[,\\:]");
How can I get the data like this?
str[0] = 27.980194
str[1] = 46.090199
str[2] = 0.48
str[3] = 1
str[4] = 6
Thank you for any help!
If you just want to keep the numbers (including dot separator), you can use:
String[] str = location.split("[^\\d\\.]+");
You will need to ignore the first element in the array which is an empty string.
That will only work if the data names don't contain numbers or dots.
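If the field names might contain digits or dots, one alternative is to split on commas first and then on the first colon of each piece. This is only a sketch, assuming every field has the form name:value as in the question's sample; LocationParser and parseFields are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LocationParser {
    // Maps each field name to its value, in the order the fields appear.
    static Map<String, String> parseFields(String location) {
        Map<String, String> fields = new LinkedHashMap<>(); // preserves insertion order
        for (String part : location.split(",")) {
            int colon = part.indexOf(':');
            if (colon < 0) {
                continue; // skips pieces without a colon, such as the leading "$"
            }
            fields.put(part.substring(0, colon), part.substring(colon + 1));
        }
        return fields;
    }

    public static void main(String[] args) {
        String location = "$,lat:27.980194,lng:46.090199,speed:0.48,fix:1,sats:6,";
        Map<String, String> fields = parseFields(location);
        String[] str = fields.values().toArray(new String[0]);
        System.out.println(fields);
        System.out.println(str[0]); // the lat value
    }
}
```

Keeping the names alongside the values also means a value can be looked up by name (for example fields.get("speed")) instead of by position.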
String location = "$,lat:27.980194,lng:46.090199,speed:0.48,fix:1,sats:6,";
Matcher m = Pattern.compile( "\\d+\\.*\\d*" ).matcher(location);
List<String> allMatches = new ArrayList<>();
while (m.find( )) {
allMatches.add(m.group());
}
System.out.println(allMatches);
Quick and Dirty:
String location = "$,lat:27.980194,lng:46.090199,speed:0.48,fix:1,sats:6,";
List<String> strList = Arrays.asList(location.split("[,\\:]"));
String[] str = new String[5];
int count=0;
for(String s : strList){
try {
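Double.parseDouble(s); // throws NumberFormatException for non-numeric tokens
str[count] = s; // keep the original text; Double.toString would turn "1" into "1.0"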
System.out.println("In String Array:"+str[count]);
count++;
} catch (NumberFormatException e) {
System.out.println("s:"+s);
}
}

Unable to find string in Java File

I have a Java program that searches for most strings without issue, but for some reason it cannot find the one below in a file, even though I know it appears there. I am trying to locate a certain element that has the value 999, but I am unable to do so. Again, this works for other strings, just not the one below.
for(int i=0;i< inputFile.length;i++)
try {
br = new BufferedReader(new FileReader(inputFile[i]));
try {
while((line = br.readLine()) != null)
{
countLine++;
//System.out.println(line);
String[] words = line.split(" ");
for (String word : words) {
if (word.equals(inputSearch)) {
count++;
countBuffer++;
}
}
if(countBuffer > 0)
{
countBuffer = 0;
lineNumber += countLine + ",";
}
}
br.close();
If I understand your question, you could use a Pattern and Matcher, something like:
String toMatch = "<element>999</element>";
Pattern pattern = Pattern.compile(">\\s*999\\s*<");
Matcher match = pattern.matcher(toMatch);
int count = 0;
while (match.find()) {
// Pattern found; each find() call resumes after the previous match.
count++;
}
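One likely reason the original loop misses such a value: line.split(" ") only yields whole space-separated words, so a value embedded in markup such as <element>999</element> never equals the search string on its own. As an alternative to a regex, here is a minimal sketch that counts substring occurrences with indexOf; SubstringCount and countOccurrences are hypothetical names.

```java
class SubstringCount {
    // Counts non-overlapping occurrences of needle inside line.
    static int countOccurrences(String line, String needle) {
        if (needle.isEmpty()) {
            return 0; // avoid looping forever on an empty search string
        }
        int count = 0;
        int from = 0;
        while ((from = line.indexOf(needle, from)) >= 0) {
            count++;
            from += needle.length(); // continue searching after this match
        }
        return count;
    }

    public static void main(String[] args) {
        String line = "<element>999</element>";
        System.out.println(countOccurrences(line, "999"));
    }
}
```

Note that a plain substring search also matches 999 inside longer numbers such as 19990, which the regex with the surrounding > and < avoids.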
