I'm trying to read a large text file in the form of:
datadfqsjmqfqs+dataqfsdqjsdgjheqf+qsdfklmhvqziolkdsfnqsdfmqdsnfqsdf+qsjfqsdfmsqdjkgfqdsfqdfsqdfqdfssdqdsfqdfsqdsfqdfsqdfs+qsfddkmgqjshfdfhsqdflmlkqsdfqdqdf+
I want to read the contents of the text file as one big Java String. Is this possible? I know how to use the split method.
Reading it line by line worked, but what I really need is to split this long string at the '+' sign. Afterwards I want to store the pieces in an array, ArrayList, List, etc.
Can anyone help me with this? All the information I can find online is about reading a file line by line.
Thanks in advance!
String inpStr = "datadfqsjmqfqs+dataqfsdqjsdgjheqf+qsdfklmhvqziolkdsfnqsdfmqdsnfqsdf+qsjfqsdfmsqdjkgfqdsfqdfsqdfqdfssdqdsfqdfsqdsfqdfsqdfs+qsfddkmgqjshfdfhsqdflmlkqsdfqdqdf+";
String[] inpStrArr = inpStr.split("\\+"); // split takes a regex, so the '+' must be escaped
Hope this is what you need.
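Keep in mind that split takes a regular expression, and '+' is a quantifier there, so it has to be escaped ("\\+") or quoted. A minimal sketch of the quoted variant; the class name SplitOnPlus and the shortened sample string are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SplitOnPlus {
    // Splits on a literal '+'; Pattern.quote turns the delimiter into a literal regex.
    static List<String> splitOnPlus(String input) {
        return Arrays.asList(input.split(Pattern.quote("+")));
    }

    public static void main(String[] args) {
        // split drops trailing empty strings, so the final '+' adds no empty element
        System.out.println(splitOnPlus("data1+data2+data3+"));
    }
}
```

Pattern.quote is handy when the delimiter comes from a variable and you don't want to escape it by hand.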
You can read the file using BufferedReader or any other IO class. Suppose you have that String in a testing.txt file: read each line from the file, split it on the separator (+), then iterate over the array and print each part.
BufferedReader br = null;
try {
String sCurrentLine;
br = new BufferedReader(new FileReader("C:\\testing.txt"));//file name with path
while ((sCurrentLine = br.readLine()) != null) {
String[] strArr = sCurrentLine.split("\\+");
for(String str:strArr){
System.out.println(str);
}
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (br != null)br.close();
} catch (IOException ex) {
ex.printStackTrace();
}
}
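If the file fits in memory, you could also read it in one go and split afterwards, which matches the "one big String" the question asks for. This is only a sketch under assumptions: the file is UTF-8, "testing.txt" is a placeholder path, and the class and method names are invented here:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WholeFileSplit {
    // Reads the entire file into one String, then splits it on literal '+' signs.
    static List<String> readAndSplit(Path file) throws IOException {
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        // trim() removes a trailing line break so it does not end up in the last piece
        return new ArrayList<>(Arrays.asList(content.trim().split("\\+")));
    }

    public static void main(String[] args) throws IOException {
        // The path is a placeholder; pass your own file as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "testing.txt");
        for (String part : readAndSplit(file)) {
            System.out.println(part);
        }
    }
}
```

Files.readAllBytes needs Java 7 or later and holds the whole file in memory, so for very large files a streaming approach is the safer choice.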
It seems to me like your problem is that you don't want to read the file line by line. So instead, try reading it in chunks (say, 20 characters at a time) and building your strings as you go:
char[] c = new char[20]; //best to save 20 as a final static somewhere
ArrayList<String> strings = new ArrayList<String>();
StringBuilder sb = new StringBuilder();
BufferedReader br = new BufferedReader(new FileReader(filename));
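int n;
while ((n = br.read(c)) != -1) { // also handles the final, shorter chunk
String[] parts = new String(c, 0, n).split("\\+", -1); // -1 keeps trailing empty parts
for (int i = 0; i < parts.length - 1; i++) {
sb.append(parts[i]);
strings.add(sb.toString()); // a '+' ended this piece
//init new StringBuilder:
sb = new StringBuilder();
}
sb.append(parts[parts.length - 1]); // carry the unfinished piece over to the next chunk
}
if (sb.length() > 0) strings.add(sb.toString()); // text after the last '+'
br.close();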
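Another option that avoids hand-written chunking is java.util.Scanner with '+' as its delimiter, which hands out one piece at a time. A rough sketch; the class name PlusScanner and the sample input are invented for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class PlusScanner {
    // Pulls '+'-separated tokens from any Readable source without loading it all at once.
    static List<String> tokens(Readable source) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(source)) {
            scanner.useDelimiter("\\+"); // the delimiter is a regex, so '+' is escaped
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A StringReader stands in for the file here.
        System.out.println(tokens(new StringReader("data1+data2+data3+")));
    }
}
```

For a real file you would pass something like new FileReader(filename) instead of the StringReader. Note that a line break after the last '+' would show up as a final token of its own.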
You should be able to get a String of length Integer.MAX_VALUE (always 2147483647 (2^31 - 1) by the Java specification, the maximum size of an array, which the String class uses for internal storage) or half your maximum heap size (since each character is two bytes), whichever is smaller.
How many characters can a Java String have?
Try this one:
private static void readLongString(File file){
ArrayList<String> list = new ArrayList<String>();
StringBuilder builder = new StringBuilder();
int r;
try{
InputStream in = new FileInputStream(file);
Reader reader = new InputStreamReader(in);
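while ((r = reader.read()) != -1) {
if(r=='+'){
list.add(builder.toString());
builder = new StringBuilder();
} else {
builder.append((char) r); // cast, otherwise the numeric character code is appended
}
}
if (builder.length() > 0) list.add(builder.toString()); // text after the last '+'
reader.close();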
}catch (IOException ex){
ex.printStackTrace();
}
for(String a: list){
System.out.println(a);
}
}
Here is one way, the caveat being that you can't load a file larger than the maximum int value (roughly 2 GB):
FileReader fr=null;
try {
File f=new File("your_file_path");
fr=new FileReader(f);
char[] chars=new char[(int)f.length()];
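int off = 0, n;
while (off < chars.length && (n = fr.read(chars, off, chars.length - off)) != -1) off += n; // a single read() may not fill the array
String s=new String(chars, 0, off);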
//parse your string here
} catch (Exception e) {
e.printStackTrace();
}finally {
if(fr!=null){
try {
fr.close();
} catch (IOException e) {
}
}
}
I have an XML-based .tbx file containing code like this:
<descripGrp>
<descrip type="subjectField">406001</descrip>
</descripGrp>
<langSet xml:lang="en">
<tig>
<term>competence of the Member States</term>
<termNote type="termType">fullForm</termNote>
<descrip type="reliabilityCode">3</descrip>
</tig>
</langSet>
<langSet xml:lang="pl">
<tig>
<term>kompetencje państw członkowskich</term>
<termNote type="termType">fullForm</termNote>
<descrip type="reliabilityCode">3</descrip>
</tig>
</langSet>
</termEntry>
<termEntry id="IATE-290">
<descripGrp>
<descrip type="subjectField">406001</descrip>
</descripGrp>
I want to search the entire file (almost 50 MiB) for the codes in the "subjectField" field and replace them with the proper text, e.g.
406001 is for Political ideology, 406002 for Political institution.
I have a table with codes and corresponding names:
406001 Political ideology
406002 Political institution
406003 Political philosophy
There are about five hundred such codes, so doing it by hand would take forever.
I'm not a programmer (I'm learning), but I know a little Java, so I wrote a small app which I thought would help me. However, the result is discouraging (luckily I'm not discouraged :))
Below is what I wrote. It runs extremely slowly and doesn't replace the codes at all: it processed 1/5 of the file in 15 minutes (!). Additionally, there are no newline characters in the output file, so the entire XML ends up on one line.
Any tips on which way I should go?
File log= new File("D:\\IATE\\export_EN_PL_2017-03-07_All_Langs.tbx"); // TBX file to be processed
File newe = new File("D:\\IATE\\now.txt"); // output file
String search = "D:\\IATE\\org.txt"; // file containing codes "40600" etc
String replace = "D:\\IATE\\rplc.txt"; // file containing names
try {
FileReader fr = new FileReader(log);
String s;
String s1;
String s2;
String totalStr = "";
String tot1 = "";
String tot2 = "";
FileReader fr1 = new FileReader(search);
FileReader fr2 = new FileReader(replace);
try (BufferedReader br = new BufferedReader(fr)) {
try (BufferedReader br1 = new BufferedReader(fr1)) {
try (BufferedReader br2 = new BufferedReader(fr2)) {
while ((s = br.readLine()) != null) {
totalStr += s;
while((s1 = br1.readLine()) != null){
tot1 += s1;
while ((s2 = br2.readLine()) != null){
tot2 += s2;
}
}
totalStr = totalStr.replaceAll(tot1, tot2);
FileWriter fw = new FileWriter(newe);
fw.write(totalStr);
fw.write("\n");
fw.close();
}
} catch (Exception e) {
e.printStackTrace();
}
} catch (Exception e) {
e.printStackTrace();
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
It's going to take a lot of redundant work to traverse two files to get matching values. Before you replace values in the .tbx file, you should load the code-to-name pairs into a Properties object to look them up from. Here's a function that would do that:
public static Properties getProps(String pathToNames, String pathToNumbers){
Properties prop = new Properties();
try{
File names = new File(pathToNames);
BufferedReader theNames = new BufferedReader( new InputStreamReader (new FileInputStream(names)));
File numbers = new File(pathToNumbers);
BufferedReader theNumbers = new BufferedReader( new InputStreamReader (new FileInputStream(numbers)));
String name;
String number;
while(((name = theNames.readLine())!= null)&&((number = theNumbers.readLine())!= null)){
prop.put(number, name);
}
theNames.close();
theNumbers.close();
}catch(Exception e){
e.printStackTrace();
}
return prop;
}
Assuming you are using Java 8, you can check that the function is working with this:
thePropertiesFile.forEach((Object key, Object value) ->{
System.out.println(key+ " " +value);
});
Now you can write a function that will convert properly. Use a PrintStream to achieve the output functionality you want.
static String workingDir = System.getProperty("user.dir");
public static void main(String[] args){
Properties p = getProps(workingDir+"/path/to/names.txt",workingDir+"/path/to/numbers.txt"); // user.dir has no trailing separator
File output = new File(workingDir+"/path/to/output.txt");
try {
PrintStream ps = new PrintStream(output);
BufferedReader tbx = new BufferedReader(new InputStreamReader (new FileInputStream(new File(workingDir+"/path/to/the.tbx"))));
String currentLine;
String theNum;
String theName;
int c; //temp index
int start;
int end;
while((currentLine = tbx.readLine()) != null){
if(currentLine.contains("subjectField")){
c = currentLine.indexOf("subjectField");
start = currentLine.indexOf(">", c)+1;
end = currentLine.indexOf("<", c);
theNum = currentLine.substring(start, end);
theName = p.getProperty(theNum);
currentLine = currentLine.substring(0,start)+theName+currentLine.substring(end);
}
ps.println(currentLine);
}
ps.close();
tbx.close();
} catch (IOException e) {
e.printStackTrace();
}
}
For codes that don't exist in the table, getProperty returns null, so the literal text "null" is written in their place. You can update that for your specific use.
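One way to handle missing codes is the two-argument getProperty, which returns a default when the key is absent. A small sketch that falls back to the code itself; the class and method names are invented for illustration:

```java
import java.util.Properties;

public class CodeLookup {
    // Returns the name for a code, or the code itself when no mapping exists.
    static String nameFor(Properties table, String code) {
        return table.getProperty(code, code);
    }

    public static void main(String[] args) {
        Properties table = new Properties();
        table.put("406001", "Political ideology");
        System.out.println(nameFor(table, "406001")); // known code: mapped name
        System.out.println(nameFor(table, "999999")); // unknown code: left as-is
    }
}
```

Leaving unknown codes untouched makes it easy to search the output afterwards for anything that was not translated.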
If theNum contains multiple values, split into an array:
theName = "";
if(theNum.contains(",")){
String[] theNums = theNum.split(","); // split returns String[], not int[]
for (String num : theNums) {
theName += p.getProperty(num.trim());
theName += ",";
}
theName = theName.replaceAll(",$", ""); //get rid of trailing comma
}
else
theName = p.getProperty(theNum);
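The same idea can also be written with a regular expression, which handles single codes and comma-separated lists in one place. This is only a sketch: it assumes the element always looks exactly like <descrip type="subjectField">...</descrip>, and the class name, the Map-based table and the method name are invented here:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubjectFieldReplacer {
    // Matches the code list inside <descrip type="subjectField">...</descrip>.
    private static final Pattern SUBJECT =
            Pattern.compile("(<descrip type=\"subjectField\">)([^<]*)(</descrip>)");

    // Replaces every known code in a line; unknown codes are left unchanged.
    static String replaceCodes(String line, Map<String, String> names) {
        Matcher m = SUBJECT.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            StringBuilder mapped = new StringBuilder();
            for (String code : m.group(2).split(",")) {
                String key = code.trim();
                if (mapped.length() > 0) mapped.append(", ");
                mapped.append(names.getOrDefault(key, key));
            }
            // quoteReplacement keeps '$' and '\' in names from being read as group references
            String replacement = m.group(1) + mapped + m.group(3);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> names = new HashMap<>();
        names.put("406001", "Political ideology");
        names.put("406002", "Political institution");
        System.out.println(replaceCodes(
                "<descrip type=\"subjectField\">406001, 406002</descrip>", names));
    }
}
```

You would call replaceCodes on each line read from the .tbx file and print the result, so the file is still processed one line at a time.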
I have a text file with state-city values.
These are the contents of my file:
Madhya Pradesh-Bhopal
Goa-Bicholim
Andhra Pradesh-Guntur
I want to split the state and the city... Here is my code
FileInputStream fis= new FileInputStream("StateCityDetails.txt");
BufferedInputStream bis = new BufferedInputStream(fis);
int h=0;
String s;
String[] str=null;
byte[] b= new byte[1024];
while((h=bis.read(b))!=-1){
s= new String(b,0,h);
str= s.split("-");
}
for(int i=0; i<str.length;i++){
System.out.println(str[1]); // ------> the value at 1 is Bhopal Goa
}
}
Also, there is a space inside "Madhya Pradesh".
So I want to remove the spaces inside the state names, split the state and the city, and obtain this result:
str[0]----> MadhyaPradesh
str[1]----> Bhopal
str[2]-----> Goa
str[3]----->Bicholim
Please help. Thank you in advance :)
I would use a BufferedReader here, rather than the way you are doing it. The code snippet below reads each line, splits it on the hyphen (-), and removes all spaces from each part. Each component is entered into a list, in left-to-right (and top-to-bottom) order. The list is converted to an array at the end in case you need this.
List<String> names = new ArrayList<String>();
BufferedReader br = null;
try {
String currLine;
br = new BufferedReader(new FileReader("StateCityDetails.txt"));
while ((currLine = br.readLine()) != null) {
String[] parts = currLine.split("-");
for (int i=0; i < parts.length; ++i) {
names.add(parts[i].replaceAll(" ", ""));
}
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (br != null) br.close();
} catch (IOException ex) {
ex.printStackTrace();
}
}
// convert the List to an array of String (if you require it)
String[] nameArr = new String[names.size()];
nameArr = names.toArray(nameArr);
// print out result
for (String val : nameArr) {
System.out.println(val);
}
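If a city name could itself contain a hyphen, a variant of the same split that only cuts at the first hyphen may be safer. A sketch that works on lines already read into memory; the class and method names are made up, and the limit of 2 is an assumption about the data, not something the question states:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StateCitySplit {
    // Flattens "State-City" lines into [state, city, state, city, ...] without whitespace.
    static List<String> flatten(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) continue; // skip blank lines
            // Limit 2: only the first hyphen separates state from city.
            for (String part : line.split("-", 2)) {
                out.add(part.replaceAll("\\s+", "")); // removes spaces, tabs and stray \r
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Madhya Pradesh-Bhopal", "Goa-Bicholim", "Andhra Pradesh-Guntur");
        System.out.println(flatten(lines));
    }
}
```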
I don't know the value in advance, so lastIndexOf cannot be used (lastIndexOf(WHAT)? The WHAT is unknown).
I want to compare the last value saved in the file with a search value, so I need the index of the last value in the file and the value stored at that index.
public static void searchInFile(double search) throws IOException, FileNotFoundException{
try{
double fl = Math.floor(search/10000);
int floor = (int)(fl);
int key = (int)(search);
String searchValue = String.valueOf(key);
String s = null;
String fname = "TextFile"+(floor+1)+".txt";
File f = new File(fname);
do{
if(f.exists()){
FileReader fr = new FileReader(fname);
BufferedReader br = new BufferedReader(fr);
while((s=br.readLine())!=null){
if(s.contains(searchValue)){
p(""+s.contains(","+searchValue+","));
p(search+" found in file "+fname);
}
else if(s.contains(","+searchValue+",")==false){
int last = s.lastIndexOf(s);
}
}
}
else{
write(key);
}
}while(true);
}
catch(Exception e){
e.printStackTrace();
}
}
I'm not sure whether you are looking for the index of the last word in the .txt file. If you are, then this might help:
public static void checkIndex() {
File file = new File("path");
BufferedReader reader = null;
StringBuffer buffer = new StringBuffer();
String strReading = "";
try {
reader = new BufferedReader(new FileReader(file));
while ((strReading = reader.readLine()) != null) {
buffer.append(strReading);
}
int index = buffer.toString().lastIndexOf(" ") + 1;
System.out.println(buffer.substring(index));
} catch (IOException e) {
e.printStackTrace();
}
}
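If the values in the file are separated by commas instead (the question's code searches for ","+searchValue+",", so that is my guess), the last value can be taken with lastIndexOf(',') without knowing it in advance. A sketch under that assumption, with invented names:

```java
public class LastValue {
    // Returns the text after the last comma, ignoring trailing commas and whitespace.
    static String lastValue(String content) {
        String trimmed = content.trim();
        while (trimmed.endsWith(",")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        // lastIndexOf returns -1 when there is no comma, so the whole text is returned
        return trimmed.substring(trimmed.lastIndexOf(',') + 1).trim();
    }

    public static void main(String[] args) {
        System.out.println(lastValue("10,20,30,"));
    }
}
```

The returned String can then be parsed with Integer.parseInt and compared to the search value.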
All I could understand is that you have multiple index files in which you have to search for a value. But what do you mean by "last index of"? The problem is not very clear, and the logic of your program doesn't explain enough to make out what you are asking.
Could you please update the problem statement and share your snippet in more detail so that people can understand it and answer?
I was working a little bit with config files and file reader classes in java.
I always read/wrote in the files with arrays because I was working with objects.
This looked a little bit like this:
public void loadUserData(ArrayList<User> arraylist) {
try {
List<String> lines = Files.readAllLines(path, Charset.defaultCharset());
for(String line : lines) {
String[] userParams = line.split(";");
String name = userParams[0];
String number= userParams[1];
String mail = userParams[2];
arraylist.add(new User(name, number, mail));
}
} catch (IOException e) {
e.printStackTrace();
}
}
This works fine, but how can I save the content of a file as only one single string?
When I read a file, the string I use should be the exact same as the content of the file (without the use of arrays or line splits).
How can I do that?
Edit:
I am trying to read a SQL statement from a file to use it with JDBC later on. That's why I need the content of the file as a single String.
This method will work
public static void readFromFile() throws Exception{
FileReader fIn = new FileReader("D:\\Test.txt");
BufferedReader br = new BufferedReader(fIn);
String line = null;
StringBuilder sb = new StringBuilder();
while ((line = br.readLine()) != null) {
sb.append(line);
sb.append("\n");
}
String text = sb.toString();
System.out.println(text);
}
I hope this is what you need:
public void loadUserData(ArrayList<User> arraylist) {
StringBuilder sb = new StringBuilder();
try {
List<String> lines = Files.readAllLines(path, Charset.defaultCharset());
for(String line : lines) {
// String[] userParams = line.split(";");
//String name = userParams[0];
//String number= userParams[1];
//String mail = userParams[2];
sb.append(line).append("\n"); // readAllLines strips the line breaks, so add them back
}
String jdbcString = sb.toString();
System.out.println("JDBC statements read from file: " + jdbcString );
} catch (IOException e) {
e.printStackTrace();
}
}
or maybe this:
String content = new Scanner(new File("filename")).useDelimiter("\\Z").next();
System.out.println(content);
Just do that:
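final String theFullStuff;
try (
final FileChannel fc = FileChannel.open(path, StandardOpenOptions.READ); // the resource must be declared inside try (...)
) {
final ByteBuffer buf = ByteBuffer.allocate((int) fc.size()); // allocate takes an int, so files under 2 GiB only
while (buf.hasRemaining() && fc.read(buf) != -1) { } // a single read may return fewer bytes than requested
theFullStuff = new String(buf.array(), 0, buf.position(), theCharset);
}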
nio for the win! :p
You could always create a BufferedReader, e.g.:
File anInputFile = new File(/*input path*/);
FileReader aFileReader = new FileReader(anInputFile);
BufferedReader reader = new BufferedReader(aFileReader);
String yourSingleString = "";
String aLine = reader.readLine();
while(aLine != null)
{
yourSingleString += aLine + " ";
aLine = reader.readLine();
}
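Note that readLine() drops the line terminators, so any readLine-based loop gives you something slightly different from the file content. If the String has to match the file exactly, copying characters through a buffer keeps every line break as it is. A minimal sketch; the class name, the buffer size and the sample SQL text are arbitrary choices of mine:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadFully {
    // Copies everything from the reader into one String, line breaks included.
    static String readFully(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[8192];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            sb.append(buffer, 0, n); // append only the characters actually read
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the file; the mixed \r\n and \n are kept as-is.
        String sql = "SELECT *\r\nFROM users;\n";
        System.out.println(sql.equals(readFully(new StringReader(sql)))); // prints "true"
    }
}
```

For a real file you would pass a FileReader (or an InputStreamReader with an explicit charset) and close it afterwards.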
I am getting a really long string as the response from a web service. I am collecting it using a StringBuilder, but I am unable to obtain the full value. I also tried StringBuffer, with no success.
Here is the code I am using:
private static String read(InputStream in ) throws IOException {
//StringBuilder sb = new StringBuilder(1000);
StringBuffer sb = new StringBuffer();
String s = "";
BufferedReader r = new BufferedReader(new InputStreamReader( in ), 1000);
for (String line = r.readLine(); line != null; line = r.readLine()) {
sb.append(line);
s += line;
} in .close();
System.out.println("Response from Input Stream Reader >>>" + sb.toString());
System.out.println("Response from Input S >>>>>>>>>>>>" + s);
return sb.toString();
}
Any help is appreciated.
You can also split the string into an array of strings in order to see all of them:
String delimiter = "put a delimiter here e.g.: \n";
String[] datas=sb.toString().split(delimiter);
for(String string : datas){
System.out.println("Response from Input S >>>>>>>>>>>>" + string);
}
The String may not print entirely to the console, but it is actually there. Save it to a file in order to see it.
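If you only want to check the content on the console, printing it in fixed-size pieces can help as well. A small sketch; the class name and the chunk size of 1000 are arbitrary, since the actual limit depends on your console:

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkedPrint {
    // Cuts the text into pieces of at most `size` characters, in order.
    static List<String> chunks(String text, int size) {
        List<String> parts = new ArrayList<>();
        for (int start = 0; start < text.length(); start += size) {
            parts.add(text.substring(start, Math.min(text.length(), start + size)));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Build a long dummy response just for the demonstration.
        StringBuilder response = new StringBuilder();
        for (int i = 0; i < 5000; i++) {
            response.append('x');
        }
        for (String part : chunks(response.toString(), 1000)) {
            System.out.println(part);
        }
    }
}
```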
I do not think your input is too big for a String; it is probably just not shown in full because the console doesn't accept lines that long. Anyway, here is a solution for a really huge character input:
private static String[] readHugeStream(InputStream in) throws IOException {
LinkedList<String> dataList = new LinkedList<>();
//
BufferedReader r = new BufferedReader(new InputStreamReader(in), 0xFFFFFF);
String line = r.readLine(); // the first line is processed too, not thrown away
while (line != null) {
StringBuilder sb = new StringBuilder();
// fill one chunk, stopping before it would exceed the maximum String length
while (line != null && (long) sb.length() + line.length() <= Integer.MAX_VALUE) {
sb.append(line);
line = r.readLine();
}
if (sb.length() != 0) {
dataList.add(sb.toString());
}
}
in.close();
String[] data = dataList.toArray(new String[]{});
///
return data;
}
public static void main(String[] args) {
try {
String[] data = readHugeStream(new FileInputStream("<big file>"));
} catch (IOException ex) {
Logger.getLogger(StackoverflowStringLong.class.getName()).log(Level.SEVERE, null, ex);
} catch (OutOfMemoryError ex) {
System.out.println("out of memory...");
}
}
System.out.println() does not show all the characters; the console can display only a limited number of characters. You can create a file on the SD card and write the string there as a text document to check the exact response.
try
{
File root = new File(Environment.getExternalStorageDirectory(), "Responsefromserver");
if (!root.exists())
{
root.mkdirs();
}
File gpxfile = new File(root, "response.txt");
FileWriter writer = new FileWriter(gpxfile);
writer.append(totalResponse);
writer.flush();
writer.close();
}
catch(IOException e)
{
System.out.println("Error:::::::::::::"+e.getMessage());
throw e;
}