How to split a string by tabs and newlines? - java

I have a tab-delimited file that I want to split by tabs and by newlines, where a tab separates the fields and a newline marks a new object that should be created. The file can look like this:
Peter\tpeter#example.com\tpeterpassword\nBob\tbob#bobby.com\tbobbypassword\n...
where \t is a tab and \n is a newline.
I want to upload this file to my program, which should create a new user from the fields on each line of the file. But how can I use two delimiters, both tab and newline? My code would look something like the following:
String everything = "";
BufferedReader br = null;
try {
br = new BufferedReader(new InputStreamReader(file.getInputStream()));
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
//now create object according to the string
StringTokenizer st = new StringTokenizer(line , "\t");
String name = st.nextToken();
String email = st.nextToken();
String password = st.nextToken();
User.createNewUser(name, email, password);
sb.append(line);
sb.append('\n');
line = br.readLine();
}
everything = sb.toString();
br.close();
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("Everything: " + everything);
Would code like the above work?

I would call String.split("\\n") on the whole content to get one string per line; each line then holds all the information you need for one user. Call String.split("\\t") on each line and construct your object from the resulting array.
From the Javadoc:
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
http://docs.oracle.com/javase/7/docs/api/java/util/StringTokenizer.html
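The two-step split described in this answer could be sketched as follows. This is only an illustration: the class name TabSeparatedUsers and the parse method are made up for the example, and the println stands in for the question's User.createNewUser call, whose class is not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the question's code base.
final class TabSeparatedUsers {

    /** Splits the text into rows on line breaks, then each row into fields on tabs. */
    static List<String[]> parse(String content) {
        List<String[]> rows = new ArrayList<>();
        // "\\R" matches any line break (\n, \r\n, \r); it requires Java 8 or later.
        for (String line : content.split("\\R")) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            // A limit of -1 keeps trailing empty fields instead of dropping them.
            rows.add(line.split("\t", -1));
        }
        return rows;
    }

    public static void main(String[] args) {
        String content = "Peter\tpeter#example.com\tpeterpassword\n"
                       + "Bob\tbob#bobby.com\tbobbypassword\n";
        for (String[] fields : parse(content)) {
            if (fields.length != 3) {
                System.err.println("Skipping line with " + fields.length + " fields");
                continue;
            }
            // The question's program would call User.createNewUser(name, email, password) here.
            System.out.println(fields[0] + " / " + fields[1] + " / " + fields[2]);
        }
    }
}
```

Checking the field count before indexing avoids the NoSuchElementException or ArrayIndexOutOfBoundsException that a short line would otherwise cause.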

Related

String Tokenizer Not Registering String on Second Line as a Token When Reading File Despite Using .nextToken()

I am trying to read a file. The file in question has two strings, each on its own line, like this:
COMETQ
HVNGAT
I am trying to assign each string to its own String variable. However, when I run my code (below), I get a NoSuchElementException for the second .nextToken().
BufferedReader f = new BufferedReader(new FileReader("ride.in"));
PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter("ride.out")));
StringTokenizer st = new StringTokenizer(f.readLine());
String comet = st.nextToken();
String group = st.nextToken();
Can someone help me figure out what's wrong? Thank you!
Note: this is a USACO training page problem. I am just trying to seek help to debug the file reading, not solve the problem.
You only gave it one line:
new StringTokenizer(f.readLine());
You'll have to read all the lines from the file first, then pass the resulting string to the constructor.
Note: in this case, you don't even have to use the StringTokenizer. Just call readLine() on the BufferedReader once for each line.
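A minimal sketch of the BufferedReader-only approach, assuming the input always has at least two lines. The class name TwoLineReader and the method readTwoLines are invented for the example, and a StringReader stands in for new FileReader("ride.in") so the snippet runs without the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper for illustration only.
final class TwoLineReader {

    /** Reads the first two lines; each readLine() call consumes exactly one line. */
    static String[] readTwoLines(Reader source) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            String first = reader.readLine();
            String second = reader.readLine();
            if (first == null || second == null) {
                throw new IOException("Expected at least two lines of input");
            }
            return new String[] { first.trim(), second.trim() };
        }
    }

    public static void main(String[] args) throws IOException {
        // Replace the StringReader with new FileReader("ride.in") to read the real file.
        String[] names = readTwoLines(new StringReader("COMETQ\nHVNGAT\n"));
        String comet = names[0];
        String group = names[1];
        System.out.println(comet + " " + group);
    }
}
```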
StringTokenizer should be used when the text contains delimiters and you want to split it. You can also use the split() method.
Syntax:
StringTokenizer stringTokenizer = new StringTokenizer(text, delimiter);
For example:
StringTokenizer stringTokenizer = new StringTokenizer("abc, def", ",");
But in your file, no such delimiter is present within a line, so StringTokenizer is of no use here.
I have tested with this:
BufferedReader bufferedReader = new BufferedReader(new FileReader(new File("F:/test.txt")));
String line;
String extracted = "";
while ((line = bufferedReader.readLine()) != null) {
StringTokenizer stringTokenizer = new StringTokenizer(line);
while (stringTokenizer.hasMoreElements()) {
extracted = extracted + stringTokenizer.nextElement().toString() +",";
}
}
bufferedReader.close();
String[] splits = extracted.split(",");
String comet = splits[0];
String group = splits[1];
System.out.println(comet + " " + group);
Output:
COMETQ HVNGAT
Hope this helps you :)

Combined Xml String Split Java

I am trying to split a combined text file. The combined text file has multiple xml files inside. I want to split on <?xml version='1.0'?> which is the start of every new xml inside the combined text file. Not sure what is the best way to do this. Currently this is what I have which does not split correctly.
Updated, working code (fixed the quotes-within-quotes problem and added Pattern.quote):
Scanner scanner = new Scanner( new File("src/main/resources/Flume_Sample"), "UTF-8" );
String combinedText = scanner.useDelimiter("\\A").next();
scanner.close(); // Put this call in a finally block
String delimiter = "<?xml version=\"1.0\"?>";
String[] xmlFiles = combinedText.split("(?="+Pattern.quote(delimiter)+")");
for (int i = 0; i < xmlFiles.length; i++){
File file = new File("src/main/resources/output_"+i);
FileWriter writer = new FileWriter(file);
writer.write(xmlFiles[i]);
System.out.println(xmlFiles[i]);
writer.close();
}
The split method takes a regular expression string, so you may want to escape your delimiter string so that it is matched literally:
String[] xmlFiles = combinedText.split(Pattern.quote(delimiter));
See the Pattern.quote method.
Also be aware that you will load the entire input file into memory if you proceed this way.
A streamed approach would perform better if the input file is large...
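One way a streamed version might look is sketched below. It assumes every embedded document starts with the declaration at the beginning of a line, and it reuses the output_N naming from the question; the class name XmlStreamSplitter and the split method are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
final class XmlStreamSplitter {

    // Assumption: each embedded document begins with this text at the start of a line.
    static final String DECLARATION = "<?xml version='1.0'?>";

    /** Copies each embedded document to outputDir/output_N and returns how many were written. */
    static int split(Reader combined, Path outputDir) throws IOException {
        int count = 0;
        BufferedWriter writer = null;
        try (BufferedReader reader = new BufferedReader(combined)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(DECLARATION)) {
                    if (writer != null) {
                        writer.close(); // finish the previous document
                    }
                    writer = Files.newBufferedWriter(
                            outputDir.resolve("output_" + count), StandardCharsets.UTF_8);
                    count++;
                }
                if (writer != null) { // text before the first declaration is ignored
                    writer.write(line);
                    writer.newLine();
                }
            }
        } finally {
            if (writer != null) {
                writer.close();
            }
        }
        return count;
    }
}
```

Only one line is held in memory at a time, so the size of the combined file no longer matters. A call such as XmlStreamSplitter.split(new FileReader("src/main/resources/Flume_Sample"), Paths.get("src/main/resources")) would mirror the paths used in the question.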
I would use something like this if you want to parse the data manually.
public static void parseFile(File file) throws AttributeException, LineException{
BufferedReader br = null;
String s = "";
int counter = 0;
if(file != null){
try{
br = new BufferedReader(new FileReader(file));
while((s = br.readLine()) != null){
if(s.contains("<?xml version='1.0'?>")){
// Write to a new file with a StringBuffer and a FileWriter.
}
}
br.close();
}catch (IOException e){
System.out.println(e);
}
}
}

Adding single quotes to character after a particular string

I have a text file that contains a single string, which I parse for further processing. While parsing, I want to put single quotes around the character that follows a particular string. How can I do that?
Text file data:
{Name:{ID:12342,age:32},type:s},{Name:{ID:12345,age:42},type:t},{Name:{ID:12348,age:35},type:s},{Name:{ID:12349,age:55},type:t}
Here I want to put single quotes around the character that follows type:
Expected output:
{Name:{ID:12342,age:32},type:'s'},{Name:{ID:12345,age:42},type:'t'},{Name:{ID:12348,age:35},type:'s'},{Name:{ID:12349,age:55},type:'t'}
My java code:
BufferedReader br = new BufferedReader(new FileReader("D:/Workspace/JAVA/Sample/EMP.txt"));
try {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null)
{
sb.append(line);
sb.append(System.lineSeparator());
line = br.readLine();
}
String value = sb.toString();
You could use the string.replaceAll function below.
string.replaceAll("(?<=:)([a-zA-Z]+)", "'$1'");
This adds single quotes around any word (letters only) that immediately follows a colon.
(?<=type:)([^,}]*)
Try this. Replace with '$1'. See the demo:
https://regex101.com/r/sJ9gM7/89
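The lookbehind pattern from the second answer can be tried in a small program like the one below. The class name TypeQuoter and the method quoteTypeValues are invented for the example, and the pattern uses + rather than * so that an empty value is left alone, which is a small deviation from the answer's regex.

```java
// Hypothetical helper for illustration only.
final class TypeQuoter {

    /** Wraps the value that follows "type:" in single quotes; everything else is untouched. */
    static String quoteTypeValues(String input) {
        // (?<=type:) is a lookbehind: the match must come right after "type:",
        // but "type:" itself is not part of the match and is not replaced.
        // ([^,}]+) captures the value up to the next comma or closing brace.
        return input.replaceAll("(?<=type:)([^,}]+)", "'$1'");
    }

    public static void main(String[] args) {
        String data = "{Name:{ID:12342,age:32},type:s},{Name:{ID:12345,age:42},type:t}";
        System.out.println(quoteTypeValues(data));
        // prints {Name:{ID:12342,age:32},type:'s'},{Name:{ID:12345,age:42},type:'t'}
    }
}
```

Anchoring on type: rather than on any colon keeps the numeric ID and age values, and any future letter-valued fields, from being quoted by accident.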

Cannot read first line of a file

I want to read the content of /etc/passwd file and get some data:
public void getLinuxUsers()
{
try
{
// !!! first line of the file is not read
BufferedReader in = new BufferedReader(new FileReader("/etc/passwd"));
String str;
str = in.readLine();
while ((str = in.readLine()) != null)
{
String[] ar = str.split(":");
String username = ar[0];
String userID = ar[2];
String groupID = ar[3];
String userComment = ar[4];
String homedir = ar[5];
System.out.println("Username " + username +
" user ID " + userID);
}
in.close();
}
catch (IOException e)
{
System.out.println("File Read Error");
}
}
I noticed two problems:
The first line of the file, which holds the root account information, is not read. The file starts this way:
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
And how can I modify the code to use Java 8 NIO? I want to first check that the file exists and then read its content.
The problem is that the first readLine() call sits outside the loop where the string is processed. You should delete this:
str = in.readLine();
… because the next line (the one with the while) reassigns the str variable, so the first line is lost: the loop's body starts processing from the second line. Finally, to use Java NIO, do something like this:
if (new File("/etc/passwd").exists()) {
Path path = Paths.get("/etc/passwd");
List<String> lines = Files.readAllLines(path, Charset.defaultCharset());
for (String line : lines) {
// loop body, same as yours
}
}
With NIO:
Path filePath = Paths.get("/etc/passwd");
List<String> fileLines = Files.readAllLines(filePath);
Note that Files.readAllLines without a second parameter decodes the file as UTF-8 rather than using the system encoding (property "file.encoding").
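Putting the existence check and the NIO read together might look like the sketch below. It uses Files.exists and Files.lines from java.nio.file; the class name PasswdReader, the method readUsers, and the choice to return "username:userID" strings are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper for illustration only.
final class PasswdReader {

    /** Returns "username:userID" entries, or an empty list when the file does not exist. */
    static List<String> readUsers(Path passwd) throws IOException {
        if (!Files.exists(passwd)) {
            return Collections.emptyList();
        }
        List<String> users = new ArrayList<>();
        // Files.lines reads lazily; try-with-resources closes the underlying file.
        try (Stream<String> lines = Files.lines(passwd, StandardCharsets.UTF_8)) {
            lines.forEach(line -> {
                String[] fields = line.split(":", -1);
                if (fields.length >= 6) { // skip blank or malformed entries
                    users.add(fields[0] + ":" + fields[2]);
                }
            });
        }
        return users;
    }

    public static void main(String[] args) throws IOException {
        // Prints nothing on systems that have no /etc/passwd.
        for (String user : readUsers(Paths.get("/etc/passwd"))) {
            System.out.println(user);
        }
    }
}
```

Because every line goes through the same loop, the first line (the root entry) is processed like any other.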

Read XML, Replace Text and Write to same XML file via Java

Currently I am trying something very simple: I look through an XML document for a certain phrase and try to replace it. The problem is that I read the file line by line and store each line in a StringBuffer, and when I write it back to the document everything ends up on a single line.
Here is my code:
File xmlFile = new File("abc.xml");
BufferedReader br = new BufferedReader(new FileReader(xmlFile));
String line = null;
while((line = br.readLine())!= null)
{
if(line.indexOf("abc") != -1)
{
line = line.replaceAll("abc","xyz");
}
sb.append(line);
}
br.close();
BufferedWriter bw = new BufferedWriter(new FileWriter(xmlFile));
bw.write(sb.toString());
bw.close();
I assume I need a newline character when I call sb.append, but unfortunately I don't know which character to use, as "\n" does not work.
Thanks in advance!
P.S. I figured there must be a way to use Xalan to format the XML file after I write to it or something. Not sure how to do that though.
readLine() returns the text between the newline characters without the newline characters themselves, so when you write the lines back out, they are missing. These characters depend on the OS: Windows uses two characters for a newline, Unix uses one, for example. To be OS-agnostic, retrieve the system property "line.separator":
String newline = System.getProperty("line.separator");
and append it to your StringBuffer:
sb.append(line).append(newline);
Modified as suggested by Brel, your text-substituting approach should work, and it will work well enough for simple applications.
If things start to get a little hairier, and you end up wanting to select elements based on their position in the XML structure, and if you need to be sure to change element text but not tag text (think <abc>abc</abc>), then you'll want to call in the cavalry and process the XML with an XML parser.
Essentially you read in a Document using a DocumentBuilder, you hop around the document's nodes doing whatever you need to, and then ask the Document to write itself back to file. Or do you ask the parser? Anyway, most XML parsers have a handful of options that let you format the XML output: you can specify indentation (or not) and maybe newlines for every opening tag, that kind of thing, to make your XML look pretty.
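As a rough sketch of the parser route with the JDK's built-in JAXP classes: a DocumentBuilder parses the XML, the text nodes are edited, and a Transformer (rather than the Document itself) serializes the result. The class name XmlTextReplacer and its methods are made up for the example, and it works on strings instead of files to stay short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
final class XmlTextReplacer {

    /** Replaces oldText with newText in text nodes only, leaving tag names untouched. */
    static String replaceText(String xml, String oldText, String newText) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));
        replaceInTextNodes(document.getDocumentElement(), oldText, newText);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Setting OutputKeys.INDENT to "yes" here would ask for indented output instead.
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    /** Walks the tree recursively and edits every text node it finds. */
    private static void replaceInTextNodes(Node node, String oldText, String newText) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            node.setNodeValue(node.getNodeValue().replace(oldText, newText));
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            replaceInTextNodes(children.item(i), oldText, newText);
        }
    }

    public static void main(String[] args) throws Exception {
        // The element name abc stays as it is; only the text inside it changes.
        System.out.println(replaceText("<abc>abc</abc>", "abc", "xyz"));
    }
}
```

To work on a file, builder.parse(xmlFile) and new StreamResult(xmlFile) accept a java.io.File directly.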
sb would be the StringBuffer object, which has not been instantiated in this example. This can be added before the while loop:
StringBuffer sb = new StringBuffer();
Scanner scan = new Scanner(System.in);
String filePath = scan.next();
String oldString = "old_string";
String newString = "new_string";
String oldContent = "";
BufferedReader br = null;
FileWriter writer = null;
File xmlFile = new File(filePath);
try {
br = new BufferedReader(new FileReader(xmlFile));
String line = br.readLine();
while (line != null) {
oldContent = oldContent + line + System.lineSeparator();
line = br.readLine();
}
String newContent = oldContent.replaceAll(oldString, newString);
writer = new FileWriter(xmlFile);
writer.write(newContent);
} catch (IOException e) {
e.printStackTrace();
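} finally {
scan.close();
try {
// Either stream is still null if opening the file failed.
if (br != null) {
br.close();
}
if (writer != null) {
writer.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}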
