Split command on a nextElement - java

I am writing a Java servlet that should display a preview of three different articles. I want it to show the first sentence of each article, but I can't get split to work properly because I am reading the articles in with a StringTokenizer. I have something like:
while ((s = br.readLine()) != null) {
out.println("<tr>");
StringTokenizer s2 = new StringTokenizer(s, "|");
while (s2.hasMoreElements()) {
if (index == 0) {
out.println("<td class='first'>" + s2.nextElement() + "</td>");
}
out.println("</tr>");
}
index = 0;
}
How do I make s2.nextElement print out only the first sentence instead of the whole article? I imagine I could do split with a delimiter of ".", but can't get the code to work right. Thanks.

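Try
s2.nextToken().split("\\.")[0];
to get the first sentence in the paragraph. Use nextToken() rather than nextElement() here: nextElement() returns an Object, which has no split method.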

It would be better to use a Scanner:
Scanner scanner = new Scanner(new File("articles.txt"));
while (scanner.hasNextLine()) {
String article = scanner.nextLine();
String[] parts = article.split("\\s*\\|\\s*");
String title = parts[0];
String text = parts[1];
String date = parts[2];
String image = parts[3];
String firstSentence = text.replaceAll("\\..*", ".");
// Output what you like about the article using the extracted parts
}
Scanner.nextLine() reads in the whole line. (Scanner.next() would return only one whitespace-delimited token, because the default delimiter is whitespace.)
split("\\s*\\|\\s*") splits the line on pipe chars (which have to be escaped because the pipe char has special regex meaning) and the \s* consumes any whitespace that may surround the pipe chars.
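As a minimal sketch of the same idea, not the poster's exact code: the class and method names below are hypothetical, and the field order (title, text, date, image) is assumed from the snippet above. It splits one record on pipes and returns the text up to the first period, falling back to the whole text when there is no period.

```java
import java.util.regex.Pattern;

final class ArticlePreview {
    // Pattern.quote makes the pipe literal; \s* swallows whitespace around it.
    private static final Pattern PIPE =
            Pattern.compile("\\s*" + Pattern.quote("|") + "\\s*");

    /** Splits one pipe-delimited record into its fields. */
    static String[] fields(String line) {
        return PIPE.split(line.trim());
    }

    /** Returns the text up to and including the first period, or all of it if there is none. */
    static String firstSentence(String text) {
        int dot = text.indexOf('.');
        return dot < 0 ? text : text.substring(0, dot + 1);
    }

    public static void main(String[] args) {
        String line = "Title | First sentence. Second sentence. | 2014-01-01 | pic.png";
        String[] parts = fields(line);
        System.out.println(firstSentence(parts[1])); // prints "First sentence."
    }
}
```

Note that treating the first period as the end of the sentence will cut early on abbreviations such as "Dr." or on decimal numbers.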

What I did was change hasMoreElements() to hasMoreTokens(). I then found the index of the first "." in the token, stored it in an int, and printed the substring up to that index. Here is what my code looked like:
while((s = br.readLine()) != null){
out.println("<tr>");
StringTokenizer s2 = new StringTokenizer(s, "|");
while (s2.hasMoreTokens()){
if (index == 0){
String one = s2.nextToken();
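int i = one.indexOf(".");
// indexOf returns -1 when there is no period; fall back to the whole text in that case
String firstSentence = (i >= 0) ? one.substring(0, i + 1) : one;
out.println("<td>" + firstSentence + "</td>");
}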

String split gives array out of bounds

I am required to split a string which has been read from an external file. I have managed to split the string using this code:
String[] parts = line.split("\\.");
String part1 = parts[0];
String part2 = parts[1];
When I access the data, parts[0] works fine, but accessing parts[1] throws an index out of bounds exception. The data I'm trying to split looks like this:
886.0452586206898 27115740907.871643
888.0387931034484 26218442896.246094
890.032327586207 25301777157.154663
892.0258620689656 24365534070.686035
894.0193965517242 23409502709.11487
Am I meant to remove white space before doing the string split?
I highly doubt that the index is getting lost. You might want to try this code to find out whether the data is completely valid. If the error still occurs, its cause may be somewhere else, and you should post your actual stack trace.
while ((line = br.readLine()) != null) {
if(line.contains(".")) {
String[] parts = line.split("\\.");
String part1 = parts[0];
String part2 = parts[1];
} else {
System.out.println("Corrupted data as: " + line);
}
}
I guess you are reading the file line by line. In that case you should split on the space first and then on the dot; otherwise you will get corrupted data.
886.0452586206898 27115740907.871643
As you can see, each line contains two elements that can be split on the dot.
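String[] parts = null;
String part1 = null;
String part2 = null;
while ((line = br.readLine()) != null) {
System.out.println(line);
parts = line.split("\\.");
part1 = parts[0];
part2 = parts[1];
}
// parts was declared before the loop, so it is still accessible here;
// it holds the pieces of the last line read (or null if the file was empty)
if (parts != null && parts.length > 2) {
System.out.println(parts[2]);
}
It is not working because you are trying to access a variable that you declared inside the while loop. Declare it outside the loop instead.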
Do you want to extract each number from the input, then split on the decimal?
In that case, use Scanner to read in each number first, then split:
Scanner s = new Scanner(System.in);
while (s.hasNext()) {
String num = s.next();
String[] parts = num.split("\\.");
...
}
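To make the "split on whitespace first, then on the dot" suggestion concrete, here is a small sketch. The class and method names are made up for illustration. It checks for blank lines and uses a split limit of 2, so each number yields at most two pieces and a number without a dot yields one piece instead of causing an out-of-bounds access.

```java
final class NumberSplitter {
    /**
     * Splits a line into whitespace-separated numbers, then splits each
     * number on its first dot. Each inner array has one or two elements.
     */
    static String[][] splitNumbers(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return new String[0][]; // blank line: nothing to split
        }
        String[] numbers = trimmed.split("\\s+");
        String[][] result = new String[numbers.length][];
        for (int i = 0; i < numbers.length; i++) {
            // The dot is escaped because split takes a regex; limit 2 means
            // the number is cut at most once.
            result[i] = numbers[i].split("\\.", 2);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String[] pieces : splitNumbers("886.0452586206898 27115740907.871643")) {
            String fraction = pieces.length > 1 ? pieces[1] : "(none)";
            System.out.println(pieces[0] + " / " + fraction);
        }
    }
}
```

Checking pieces.length before reading pieces[1] is what avoids the exception from the question.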

How to replace all special characters with another character in java?

I want to replace all 'special characters' with a single replacement character in Java.
For example, 'cash&carry' should become 'cash+carry', and 'cash$carry' should also become 'cash+carry'.
I have the following sample CSV file.
Here the CSV headers are 'What' and 'Where'
What,Where
salon,new+york+metro
pizza,los+angeles+metro
crate&barrel,los+angeles+metro
restaurants,los+angeles+metro
gas+station,los+angeles+metro
persian+restaurant,los+angeles+metro
car+wash,los+angeles+metro
book store,los+angeles+metro
garment,los+angeles+metro
"cash,carry",los+angeles+metro
cash&carry,los+angeles+metro
cash carry,los+angeles+metro
The expected output
What,Where
salon,new+york+metro
pizza,los+angeles+metro
crate+barrel,los+angeles+metro
restaurants,los+angeles+metro
gas+station,los+angeles+metro
persian+restaurant,los+angeles+metro
car+wash,los+angeles+metro
book+store,los+angeles+metro
garment,los+angeles+metro
cash+carry,los+angeles+metro
cash+carry,los+angeles+metro
cash+carry,los+angeles+metro
The sample code is as follows
String csvfile="BidAPI.csv";
try{
// create the 'Array List'
ArrayList<String> What=new ArrayList<String>();
ArrayList<String> Where=new ArrayList<String>();
BufferedReader br=new BufferedReader(new FileReader(csvfile));
StringTokenizer st=null;
String line="";
int linenumber=0;
int columnnumber;
int free=0;
int free1=0;
while((line=br.readLine())!=null){
linenumber++;
columnnumber=0;
st=new StringTokenizer(line,",");
while(st.hasMoreTokens()){
columnnumber++;
String token=st.nextToken();
if("What".equals(token)){
free=columnnumber;
System.out.println("the value of free :"+free);
} else if("Where".equals(token)){
free1=columnnumber;
System.out.println("the value of free1 :"+free1);
}
if(linenumber>1){
if (columnnumber==free){
What.add(token);
} else if(columnnumber==free1){
Where.add(token);
}
}
}
}
// converting the 'What' Array List to array
String[] what=What.toArray(new String[What.size()]);
// converting the 'Where' Array List to array
String[] where = Where.toArray(new String[Where.size()]);
for(int i=0;i<what.length;i++){
String data = what[i].replaceAll("[^A-Za-z0-9\",]| (?!([^\"]*\"){2}[^\"]*$)", "+").replace("\"", "");
System.out.println(data);
System.out.println(where[i]);
String finaldata = data+where[i];
String json = readUrl(desturl);
} // closes the for loop
br.close();
}catch(Exception e){
System.out.println("There is an error :"+e);
}
All special characters and spaces should be replaced with '+', and the double quotes should be removed, as in the expected output.
I am using value.replaceAll("[^A-Za-z0-9 ]", "+"), but it is not working.
Error
cash
carry"
Any help is appreciated; I am new to regex.
You need to:
replace all commas within quotes with +
replace all non-whitelisted characters with + (and you need to add commas to your whitelist)
remove double quotes
Try this:
line = line.replaceAll("[^A-Za-z0-9\",]|,(?!(([^\"]*\"){2})*[^\"]*$)", "+").replace("\"", "");
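For illustration, here is that one-liner wrapped in a small helper and run against a few of the sample rows. The class and method names are hypothetical. The first alternative matches any character outside the whitelist; the second matches a comma followed by an odd number of double quotes, i.e. a comma inside a quoted field.

```java
final class CsvNormalizer {
    // Alternative 1: any character that is not alphanumeric, a quote, or a comma.
    // Alternative 2: a comma NOT followed by an even number of quotes up to the
    // end of the line, which means the comma sits inside a quoted field.
    private static final String SPECIAL_OR_QUOTED_COMMA =
            "[^A-Za-z0-9\",]|,(?!(([^\"]*\"){2})*[^\"]*$)";

    static String normalize(String line) {
        return line.replaceAll(SPECIAL_OR_QUOTED_COMMA, "+").replace("\"", "");
    }

    public static void main(String[] args) {
        System.out.println(normalize("\"cash,carry\",los+angeles+metro")); // cash+carry,los+angeles+metro
        System.out.println(normalize("crate&barrel,los+angeles+metro"));   // crate+barrel,los+angeles+metro
        System.out.println(normalize("book store,los+angeles+metro"));     // book+store,los+angeles+metro
    }
}
```

This relies on quotes being balanced within each line; a field containing an escaped quote would need a real CSV parser.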
I think your regex is pretty close. Add an exception for commas as well, remove the space from the whitelist, and you are good.
BufferedReader r = new BufferedReader(new InputStreamReader(System.in));
String line;
while ((line = r.readLine()) != null)
{
String replaced = line.replace("\"", "");
replaced = replaced.replaceAll("[^A-Za-z0-9,]", "+");
System.out.println(replaced);
}
Of course, Strings are immutable in Java. Keep that in mind. replaceAll() returns a new String and does not modify the original instance.
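A tiny sketch of what that means in practice (the class and method names are illustrative only): calling replaceAll without using its return value changes nothing, so the result has to be assigned or returned.

```java
final class ImmutabilityDemo {
    /** Returns a new string; the argument itself is never modified. */
    static String replaceSpecials(String value) {
        return value.replaceAll("[^A-Za-z0-9,]", "+");
    }

    public static void main(String[] args) {
        String value = "cash&carry";
        value.replaceAll("[^A-Za-z0-9,]", "+"); // result discarded
        System.out.println(value);              // prints "cash&carry"

        String replaced = replaceSpecials(value);
        System.out.println(replaced);           // prints "cash+carry"
    }
}
```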
You first need to find each quoted section and replace every , inside it with +. After that you can use replaceAll("[^A-Za-z0-9,]", "+") to replace every character that is neither alphanumeric nor a comma with +. To locate the quoted sections, your code can use the
Pattern p = Pattern.compile("\"([^\"]*)\"");
pattern, together with appendReplacement and appendTail from the Matcher class to replace each quoted section it finds with its new version.
So in short your code can look something like
Scanner scanner = new Scanner(new File(csvfile));
Pattern p = Pattern.compile("\"([^\"]*)\"");
StringBuffer sb = new StringBuffer();
while(scanner.hasNextLine()){
String line = scanner.nextLine();
Matcher m = p.matcher(line);
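while (m.find()){ // find quoted sections
// replace each quoted section with its content, with every `,` changed to `+`;
// group(1) holds the quoted text without the `"` marks, and quoteReplacement
// stops any `$` or `\` in it from being read as a group reference
m.appendReplacement(sb, Matcher.quoteReplacement(m.group(1).replace(',', '+')));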
}
m.appendTail(sb);//we need to also add rest of unmatched data to buffer
// now we can replace the remaining special characters with +
String result = sb.toString().replaceAll("[^A-Za-z0-9,]", "+");
// the job is done, so let's print the result
System.out.println(result);
// let's not forget to reset the buffer for the next line
sb.delete(0, sb.length());
}
Answer to the question
String csvfile="BidAPI.csv";
try{
// create the 'Array List'
ArrayList<String> What=new ArrayList<String>();
ArrayList<String> Where=new ArrayList<String>();
BufferedReader br=new BufferedReader(new FileReader(csvfile));
StringTokenizer st=null;
String line="";
int linenumber=0;
int columnnumber;
int free=0;
int free1=0;
while((line=br.readLine())!=null){
line =line.replaceAll("[^A-Za-z0-9\",]|,(?!(([^\"]*\"){2})*[^\"]*$)", "+").replace("\"", "");
linenumber++;
columnnumber=0;
st=new StringTokenizer(line,",");
while(st.hasMoreTokens()){
columnnumber++;
String token=st.nextToken();
if("What".equals(token)){
free=columnnumber;
System.out.println("the value of free :"+free);
} else if("Where".equals(token)){
free1=columnnumber;
System.out.println("the value of free1 :"+free1);
}
if(linenumber>1){
if (columnnumber==free){
What.add(token);
} else if(columnnumber==free1){
Where.add(token);
}
}
}
}
// converting the 'What' Array List to array
String[] what=What.toArray(new String[What.size()]);
// converting the 'Where' Array List to array
String[] where = Where.toArray(new String[Where.size()]);
for(int i=0;i<what.length;i++){
String data = what[i].replaceAll("[^A-Za-z0-9\",]| (?!([^\"]*\"){2}[^\"]*$)", "+").replace("\"", "");
System.out.println(data);
System.out.println(where[i]);
String finaldata = data+where[i];
String json = readUrl(desturl);
} // closes the for loop
br.close();
}catch(Exception e){
System.out.println("There is an error :"+e);
}

Java Read How to read first word then the rest of an input

I'm having trouble figuring out how to read the rest of an input line. I need to take the first word as one token, then possibly treat the rest of the input line as a single token.
public Command getCommand()
{
String inputLine; // will hold the full input line
String word1 = null;
String word2 = null;
System.out.print("> "); // print prompt
inputLine = reader.nextLine();
// Find up to two words on the line.
Scanner tokenizer = new Scanner(inputLine);
if(tokenizer.hasNext()) {
word1 = tokenizer.next(); // get first word
if(tokenizer.hasNext()) {
word2 = tokenizer.next(); // get second word
// note: just ignores the rest of the input line.
}
}
// Now check whether this word is known. If so, create a command
// with it. If not, create a "null" command (for unknown command).
if(commands.isCommand(word1)) {
return new Command(word1, word2);
}
else {
return new Command(null, word2);
}
}
The input:
take spinning wheel
Output:
spinning
Desired Output:
spinning wheel
Use split()
String[] line = scan.nextLine().split(" ");
String firstWord = line[0];
String secondWord = line[1];
This splits the line at each space and converts it into an array. Using the index, you can then get any word you want.
OR -
String inputLine = reader.nextLine(); // the line you read
String desiredOutput = inputLine.substring(inputLine.indexOf(" ") + 1);
You can try like this also...
String s = "This is Testing Result";
System.out.println(s.split(" ")[0]);
System.out.println(s.substring(s.split(" ")[0].length() + 1)); // everything after the first word
Use split(String regex, int limit)
String[] line = scan.nextLine().split(" ", 2);
String firstWord = line[0];
String rest= line[1];
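A slightly more defensive sketch of the split-with-limit idea, in case the line has only one word or extra spaces. The class and method names are hypothetical and are not part of the question's Command class.

```java
final class CommandSplitter {
    /**
     * Returns a two-element array {firstWord, rest}. firstWord is null for a
     * blank line, and rest is null when nothing follows the first word.
     */
    static String[] splitCommand(String inputLine) {
        // Limit 2: split at the first run of whitespace only.
        String[] parts = inputLine.trim().split("\\s+", 2);
        String first = parts[0].isEmpty() ? null : parts[0];
        String rest = parts.length > 1 ? parts[1] : null;
        return new String[] { first, rest };
    }

    public static void main(String[] args) {
        String[] words = splitCommand("take spinning wheel");
        System.out.println(words[0]); // prints "take"
        System.out.println(words[1]); // prints "spinning wheel"
    }
}
```

In getCommand() from the question, the two results could then be passed on as word1 and word2.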

How to add delimiters from the StringTokenizers to a separate string?

I am inputting a string and I want to add the delimiters found in that string to a different string. How would you do that? This is the code I have at the moment.
StringTokenizer tokenizer = new StringTokenizer(input, "'.,><-=[]{}+!##$%^&*()~`;/?");
while (tokenizer.hasMoreTokens()){
//add delimeters to string here
}
Any help would be greatly appreciated(:
If you want StringTokenizer to return the delimiters it parses, you need to pass true as the third constructor argument (returnDelims):
StringTokenizer tokenizer = new StringTokenizer(input, "'.,><-=[]{}+!##$%^&*()~`;/?", true);
But if you are searching only for delimiters, I don't think this is the right approach.
I don't think StringTokenizer is good for this task, try
StringBuilder sb = new StringBuilder();
for(char c : input.toCharArray()) {
if ("'.,><-=[]{}+!##$%^&*()~`;/?".indexOf(c) >= 0) {
sb.append(c);
}
}
I'm guessing you want to extract all the delimiters from the string and process them
String allTokens = "'.,><-=[]{}+!##$%^&*()~`;/?";
StringTokenizer tokenizer = new StringTokenizer(input, allTokens, true);
while(tokenizer.hasMoreTokens()) {
String nextToken = tokenizer.nextToken();
if(nextToken.length()==1 && allTokens.contains(nextToken)) {
//this token is a delimiter
//append to string or whatever you want to do with the delimiter
processDelimiter(nextToken);
}
}
Create a processDelimiter method in which you add the delimiter to a different string or perform any action you want.
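One possible shape for this, as a sketch only: instead of a separate processDelimiter method, the hypothetical helper below appends each delimiter token to a StringBuilder and returns the collected string. The names and the short delimiter set in main are made up for the example.

```java
import java.util.StringTokenizer;

final class DelimiterCollector {
    /** Returns every delimiter occurring in input, in order of appearance. */
    static String collectDelimiters(String input, String delims) {
        StringBuilder found = new StringBuilder();
        // returnDelims = true: each delimiter comes back as its own one-character token.
        StringTokenizer tokenizer = new StringTokenizer(input, delims, true);
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            if (token.length() == 1 && delims.contains(token)) {
                found.append(token);
            }
        }
        return found.toString();
    }

    public static void main(String[] args) {
        System.out.println(collectDelimiters("a.b,c;;d", ".,;")); // prints ".,;;"
    }
}
```

Repeated delimiters are kept, so the result records each occurrence rather than just the distinct characters.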
This also takes care of delimiters that are used repeatedly:
String input = "adfhkla.asijdf.';.akjsdhfkjsda";
String compDelims = "'.,><-=[]{}+!##$%^&*()~`;/?";
String delimsUsed = "";
for (char a : compDelims.toCharArray()) {
if (input.indexOf(a) >= 0 && delimsUsed.indexOf(a) == -1) { // >= 0 so a delimiter at index 0 is also found
delimsUsed += a;
}
}
System.out.println("The delims used are " + delimsUsed);

I want to search for a string using StringTokenizer but the string I'm looking for has a delimiter in it - Java

I have an external file named quotes.txt and I'll show you some contents of the file:
1 Everybody's always telling me one thing and out the other.
2 I love criticism just so long as it's unqualified praise.
3 The difference between 'involvement' and 'commitment' is like an eggs-and-ham
breakfast: the chicken was 'involved' - the pig was 'committed'.
I used this: StringTokenizer str = new StringTokenizer(line, " .'");
This is the code for the searching:
String line = "";
boolean wordFound = false;
while((line = bufRead.readLine()) != null) {
while(str.hasMoreTokens()) {
String next = str.nextToken();
if(next.equalsIgnoreCase(targetWord)) {
wordFound = true;
output = line;
break;
}
}
if(wordFound) break;
else output = "Quote not found";
}
Now, I want to search for strings "Everybody's" and "it's" in line 1 and 2 but it won't work since the apostrophe is one of the delimiters. If I remove that delimiter, then I won't be able to search for "involvement", "commitment", "involved" and "committed" in line 3.
What code would handle this problem? Please help, and thanks.
I would suggest using regular expressions (the Pattern class) rather than StringTokenizer for this. For example:
final Pattern targetWordPattern =
Pattern.compile("\\b" + Pattern.quote(targetWord) + "\\b",
Pattern.CASE_INSENSITIVE);
String line = "";
boolean wordFound = false;
while((line = bufRead.readLine()) != null) {
if(targetWordPattern.matcher(line).find()) {
wordFound = true;
output = line; // keep the matching quote, as the loop in the question does
break;
}
else
output = "Quote not found";
}
Tokenize by whitespace, then trim by the ' character.
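A rough sketch of that suggestion, with hypothetical names: split the line on whitespace, strip punctuation from both ends of each token, and compare what is left. An apostrophe inside a word ("it's") survives, while surrounding quote marks ('involved') are removed.

```java
import java.util.regex.Pattern;

final class QuoteSearch {
    // Matches runs of non-alphanumeric characters at the start or end of a token.
    private static final Pattern EDGE_PUNCTUATION =
            Pattern.compile("^[^\\p{Alnum}]+|[^\\p{Alnum}]+$");

    static boolean containsWord(String line, String targetWord) {
        for (String token : line.trim().split("\\s+")) {
            String word = EDGE_PUNCTUATION.matcher(token).replaceAll("");
            if (word.equalsIgnoreCase(targetWord)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String quote = "breakfast: the chicken was 'involved' - the pig was 'committed'.";
        System.out.println(containsWord(quote, "involved"));  // prints "true"
        System.out.println(containsWord(quote, "committed")); // prints "true"
    }
}
```

Hyphenated tokens such as "eggs-and-ham" are kept whole here, so searching for "ham" alone would not match; splitting on hyphens as well would be a design choice.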
