In my application I have a result set containing more than 20,000 rows. I want to save it to an ArrayList. I am using the code below for this.
Class.forName("com.mysql.jdbc.Driver") ;
Connection conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/Data", "root", "root") ;
Statement stmt = conn.createStatement() ;
String query = "select * from temp ;" ;
ResultSet rs = stmt.executeQuery(query) ;
ArrayList<String> varList = new ArrayList<String>();
while(rs.next()){
varList.add(rs.getString(1));
}
With this query, fetching the data from the ResultSet takes a long time, and it becomes too slow once the table contains more than 20,000 entries. How can this be solved? Any suggestions would be greatly appreciated.
I strongly suspect that you're trying to solve the wrong problem. Whatever you're doing with those 20,000 rows, you should do it at the database and only return the results of the operation (which, hopefully, will take much less space) to the client.
If you told us exactly what you're trying to do with the data, we might be able to offer more specific suggestions on how to do that.
(Alternatively, if you really do need all the 20,000 rows at the client, you might want to skip the database entirely and just store them at the client.)
20000 is not a big number in terms of database rows.
Your problem could be in the ArrayList.
An ArrayList starts with a small default capacity (10, according to the Javadoc for the no-argument constructor). When more items are inserted than the backing array can hold, Java allocates a larger array and copies the existing contents into it. Only the references are copied, not the strings themselves, but with 20,000 rows this regrowth happens repeatedly. Doing the following may solve your problem.
int rsSize = getResultSetSize(connection,query); //return the size of the result set first
ArrayList<String> varList = new ArrayList<String>(rsSize);
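To show what pre-sizing looks like in isolation, here is a minimal sketch. The class and method names are made up for illustration, and expectedRows stands in for a count you would obtain yourself (for example from a separate COUNT query). For 20,000 references the saving is probably small, so it is worth measuring before assuming this is the bottleneck.

```java
import java.util.ArrayList;
import java.util.List;

final class PresizedListDemo {

    // Fills a list whose backing array is allocated once, up front.
    // expectedRows is an assumption: a row count obtained elsewhere,
    // for example from a SELECT COUNT(*) query.
    static List<String> fillPresized(int expectedRows) {
        List<String> varList = new ArrayList<>(expectedRows);
        for (int i = 0; i < expectedRows; i++) {
            varList.add("row-" + i); // placeholder for rs.getString(1)
        }
        return varList;
    }
}
```

If the count is only an estimate, the list still grows on demand, so an undersized guess is harmless.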
I suspect that your issue isn't the database; it's more likely the appending of items to the list. You should give Java more memory. If it is list.add, you could speed that up by allocating an array the size of the result and then inserting the data into the array by index.
What do you want to do with those 20,000 records after fetching them? Whatever it is, see if you can do it at the DB level using PL/SQL code. Many programmers don't use the power of PL/SQL. Most of what you can do in other high-level programming languages can be done in PL/SQL too.
Related
How do I get data from a ResultSet into a list in Java, from Oracle, without a loop?
You can't do that. The commonly used practice is to loop like this:
while (rs.next()) {
    int value = rs.getInt(1); // ResultSet has getInt, not getInteger
}
etc. The only practical way not to loop is to accept you only have one row in your result set.
Frameworks such as Spring JDBC can call your class back for each result, which means you don't explicitly loop, but instead the framework is doing that for you.
Because formation of resultset object didn't take time but the iteration took huge amount of time and the memory usage also took a giant leap.
You're misunderstanding the mechanism. When you execute the query the statement is parsed, a cursor is opened, and the ResultSet object is returned; but that doesn't mean all of the data is returned immediately. (The default behaviour varies by database vendor, I believe, but I'm only talking about Oracle). That's why the executeQuery part can be very fast for what will ultimately be a huge result set.
The rows in the result set will be sent from the database to the driver in batches, based on the configured fetch size, which by default is 10. Those 10 rows are buffered by the driver, and each time you call next() a row is returned from the buffer; if the buffer is empty it will fetch the next 10 rows into the buffer and then return one to you.
Your loop isn't intrinsically slow, inefficient or memory-hungry; it's the batched fetches from the database to the buffer that are relatively slow, and you can improve the performance of that by adjusting the fetch size. That is a trade-off between network and memory utilisation - the higher the fetch size the more memory has to be allocated for that buffer, and setting it too big can also be detrimental, especially if you are already memory-constrained.
What you don't have is all one million rows sitting in your ResultSet object, with the loop duplicating them one-by-one into your List, which seems to be what your question is implying. Only a small number of rows exist in the ResultSet buffer at any time, so that does not keep consuming more and more memory as you loop.
The memory usage growth you see is from all the items you're adding to your List. If you really do need all million items in your List, and at the same time, then you have to have enough memory to accommodate them. That's in the Java side, not in the database, query or JDBC. If you have spare memory on your server you can change the JVM memory allocation, and if not you can add more physical memory first.
But you might want to reconsider whether you actually do have to hold all of them at once, or whether you could - for example - process the items in batches, so you only have to hold a smaller subset at any time. It entirely depends on what you're doing with them though.
You can use Spring JDBC, e.g. JdbcTemplate.queryForList(String sql, Object... args), which returns List<Map<String, Object>>.
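As a rough sketch of the two ideas above (a larger fetch size and processing in batches), something like the following could work. The class, the Batcher helper and the parameter names are all made up for illustration; Statement.setFetchSize is standard JDBC, but it is only a hint and the value that works best depends on your driver, row width and memory.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchedFetch {

    // Collects items and hands them to a handler in chunks of batchSize,
    // so only one chunk has to be held in memory at a time.
    static final class Batcher<T> {
        private final int batchSize;
        private final Consumer<List<T>> handler;
        private List<T> current;

        Batcher(int batchSize, Consumer<List<T>> handler) {
            this.batchSize = batchSize;
            this.handler = handler;
            this.current = new ArrayList<>(batchSize);
        }

        void add(T item) {
            current.add(item);
            if (current.size() == batchSize) {
                flush();
            }
        }

        // Passes any buffered items to the handler and starts a fresh chunk.
        void flush() {
            if (!current.isEmpty()) {
                handler.accept(current);
                current = new ArrayList<>(batchSize);
            }
        }
    }

    // Reads column 1 of the query result and processes it batch by batch.
    static void processFirstColumn(Connection conn, String sql, int fetchSize,
                                   int batchSize, Consumer<List<String>> handler)
            throws SQLException {
        Batcher<String> batcher = new Batcher<>(batchSize, handler);
        try (Statement stmt = conn.createStatement()) {
            stmt.setFetchSize(fetchSize); // hint: rows per round trip to the database
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    batcher.add(rs.getString(1));
                }
            }
        }
        batcher.flush(); // hand over the final, possibly smaller, batch
    }
}
```

A call such as processFirstColumn(conn, query, 500, 1000, batch -> handle(batch)) would then never hold more than 1,000 values in the application at once, where handle is whatever processing you need.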
// Assumes stmt is an open Statement and query is your SQL string.
ResultSet resultset = stmt.executeQuery(query);
ArrayList<String> arrayList = new ArrayList<String>();
while (resultset.next()) {
    arrayList.add(resultset.getString("Col 1"));
    System.out.println(resultset.getString("Col 1"));
    System.out.println(resultset.getString("Col 2"));
    System.out.println(resultset.getString("Col n"));
}
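Recursion like this?
// Note: this uses one stack frame per row, so a large result set can overflow the stack.
private void getListFromRS(ResultSet rs, List<Object> res) throws SQLException {
    if (rs.next()) {
        res.add(rs.getObject(1));
        getListFromRS(rs, res);
    }
}
and call
List<Object> list = new ArrayList<Object>();
getListFromRS(rs, list);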
I have a table with group and permission columns. I want to find the maximum permission across a list of groups. I am using Java and an Oracle database, and I thought of two ways to do this:
Way 1:
in java loop through the group list
result = select permission from table where group = currentgroup
if result > max, max = result
Way 2:
max = select max(permission) from table where group in (group list)
I thought way 2 would be faster, but the group list can be very long, and I don't know if it is a good idea to have a long list in a single SQL query.
From the information you've given, the second approach is by far the best. Databases are optimised for exactly these kinds of tasks, so, within reason, it's always best to narrow the data down in the database. The first approach means the database has to return all the values anyway, which increases processing time and bandwidth and uses up memory within your Java application.
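A minimal sketch of the second approach with a PreparedStatement follows. The table and column names (group_permissions, group_name, permission) are placeholders I made up, since "group" on its own is a reserved word, and the column types are guesses. Also keep in mind that Oracle rejects an IN list with more than 1,000 expressions, so a very long group list would have to be split into chunks or loaded into a temporary table.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

final class MaxPermissionQuery {

    // Builds "... IN (?, ?, ?)" with one placeholder per group.
    // Table and column names are hypothetical.
    static String buildSql(int groupCount) {
        if (groupCount < 1) {
            throw new IllegalArgumentException("need at least one group");
        }
        String placeholders = String.join(", ", Collections.nCopies(groupCount, "?"));
        return "SELECT MAX(permission) FROM group_permissions WHERE group_name IN ("
                + placeholders + ")";
    }

    // Returns the highest permission, or null when no row matched.
    static Integer maxPermission(Connection conn, List<String> groups) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(buildSql(groups.size()))) {
            for (int i = 0; i < groups.size(); i++) {
                ps.setString(i + 1, groups.get(i)); // JDBC parameters are 1-based
            }
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return null;
                }
                int max = rs.getInt(1);
                if (rs.wasNull()) { // MAX over zero rows yields SQL NULL
                    return null;
                }
                return max;
            }
        }
    }
}
```

Binding the groups as parameters rather than concatenating them into the SQL string also avoids quoting problems.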
I'm new to Java, but my experience with Matlab and C trained me to ALWAYS pre-allocate memory for an array variable before filling that variable inside a loop (e.g. a for loop, a while loop, etc.).
I find myself retrieving data from a database in Java using ResultSet. A quick search shows there's no way to obtain the number of rows in the ResultSet without stepping through it. Thus the basic question: how do I pre-allocate the length of a Java array intended to store the results of the query, without knowing the number of rows in that ResultSet?
What's the conventional wisdom? For example, if the ResultSet contains two columns of data, how do I place each column into a separate Java array variable?
UPDATE 1: Some background -- I need to place everything returned by the ResultSet into an object so that I may pass that object to a non-Java (e.g. ActionScript) program that communicates with the Java program via this object's contents.
UPDATE 2: Here's the documentation on the conversion rules from Java to non-Java (e.g. ActionScript). Perhaps
http://help.adobe.com/en_US/LiveCycleDataServicesES/3.1/Developing/WSc3ff6d0ea77859461172e0811f00f6eab8-7ffdUpdate.html
Why are you adding it to arrays? You can easily iterate through the ResultSet, transform the results to the appropriate Objects, and add them to an ArrayList... gives you much more flexibility than adding to an array.
But if you really need the number of rows, I think you'll need to run a count query before running your original one.
EDIT: From the documentation you linked, it would seem that if you use a Java ArrayList you'd end up with an ActionScript mx.collections.ArrayCollection object instead of the ActionScript Array object you'd get if you used a Java array. Your choice which one to use, just convert List -> array if you can't change your ActionScript code...:
List<MyObject> myList = new ArrayList<MyObject>();
// ... populate myList from ResultSet ...
MyObject[] array = myList.toArray(new MyObject[myList.size()]);
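As Marcelo said, an ArrayList is a better option for this problem. But instead of executing a COUNT query to find out how many rows are returned, you can use this trick:
rs.last();
rs.getRow();
Then go back to the first row afterwards, for example with rs.beforeFirst(). Note that this only works on a scrollable ResultSet; the default forward-only type throws an SQLException on rs.last().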
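Here is a rough sketch of that trick end to end, including the scrollable statement it needs. The class and method names are my own; the JDBC calls (createStatement with a result set type, last, getRow, beforeFirst) are standard, though whether scrolling is efficient depends on the driver, which may have to cache the whole result.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class ScrollableCount {

    // Copies column 1 of a scrollable ResultSet into an exactly sized array.
    static String[] firstColumnToArray(ResultSet rs) throws SQLException {
        // last() returns false when the result set is empty.
        int rowCount = rs.last() ? rs.getRow() : 0;
        String[] values = new String[rowCount];
        rs.beforeFirst(); // rewind so that next() starts from the first row
        int i = 0;
        while (rs.next()) {
            values[i++] = rs.getString(1);
        }
        return values;
    }

    // Runs the query on a scrollable, read-only statement.
    static String[] query(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement(
                     ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
             ResultSet rs = stmt.executeQuery(sql)) {
            return firstColumnToArray(rs);
        }
    }
}
```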
I have a resultSet that I am pulling from my database:
String selStmt = "SELECT code_id, internal_code, representation, pos, decode(substr(internal_code, 5, 1), 'Q', 2, 1) as sorter, "
+ " to_char(term_start, 'MM-DD-YYYY') as sDate "
+ " FROM table1, terms WHERE code_id = 'SEARCH.TERMS' AND internal_code = terms.terms_id (+) ORDER BY 5, 4 ";
stmt = conn.prepareStatement(selStmt);
result = stmt.executeQuery();
What I want to do now is put these results into an array. I know that I can use a while loop to get the results, but then I can only use them once, and I need to keep using them throughout the rest of the page. I figured this could be done using an array, but if there is an easier way, let me know. If more information is needed, please let me know as well.
Since you do not know the number of rows beforehand, I would actually opt for an ArrayList. You can then insert custom objects as you like (built from your ResultSet).
There is no DataSet (.NET) equivalent for Java that I know of. Hence, consider something like the following:
ArrayList<Integer> data = new ArrayList<Integer>();
ResultSet rs = ....;
// For each record in the result...
while (rs.next()) {
    // Want more values? Create a custom type representing the row!
    // The following is just an example taking the first column (as an int).
    // This would be done manually for each column... and possibly loaded
    // into a custom object which is then added to the list... blah blah.
    // (Alternatively each row could be represented as Object[] or List<Object>
    // at the expense of losing static typing.)
    int id = rs.getInt(1); // JDBC column indexes start at 1, not 0
    data.add(id);
}
// Then later, if you *really* want an array...
// (Java is such a backwards language and lacks a trivial way
// to go to int[] from Integer[] but I digress...)
Integer[] array = data.toArray(new Integer[0]);
Happy coding
You could just store the data in a 2D array. But a more OO friendly way would be to create an object to represent what a row of data represents and create a List of these objects from the ResultSet.
To pull the data, just loop through the ResultSet like you have been; then, on each iteration, create a new instance of your custom object and add it to the list.
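To make the row-object idea concrete, here is a rough sketch using a few of the column names from the question's query. The TermRow class is made up, and the column types (pos as an int, the rest as strings) are guesses that you would adjust to your schema.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// One instance per row of the query; field types are assumptions.
final class TermRow {
    final String internalCode;
    final String representation;
    final int pos;
    final String startDate;

    TermRow(String internalCode, String representation, int pos, String startDate) {
        this.internalCode = internalCode;
        this.representation = representation;
        this.pos = pos;
        this.startDate = startDate;
    }

    // Reads every remaining row of rs into TermRow objects.
    // The caller stays responsible for closing the ResultSet.
    static List<TermRow> readAll(ResultSet rs) throws SQLException {
        List<TermRow> rows = new ArrayList<>();
        while (rs.next()) {
            rows.add(new TermRow(
                    rs.getString("internal_code"),
                    rs.getString("representation"),
                    rs.getInt("pos"),
                    rs.getString("sDate")));
        }
        return rows;
    }
}
```

Once the list is built, the ResultSet can be closed and the list reused as often as the page needs it.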
I once used the below code to solve a similar challenge:
Vector rows = new Vector();
Vector nrow;
int cnt = 0;
while(result.next())
{
nrow=new Vector();
cnt+=1;
nrow.addElement(String.valueOf(cnt));
for(int i=1;i<=3;i++) //replace 3 with the length of the columns
{
nrow.addElement(result.getObject(i));
}
rows.addElement(nrow);
}
Now you can loop through rows and make use of the data as you like. Each object returned on each iteration contains an entire row data.
The standard way is to loop over the ResultSet once and (possibly) store each record in an array, which you can later loop over as often as you like. That's what most answers here suggest, and that's the correct way.
The relevant point is this: you'd (apparently) like to keep the "raw" records that a ResultSet returns on each next() call (so that you can store each record in an ArrayList and treat it as a sort of scrollable ResultSet). I think that can't be done, and I strongly believe that, in most cases, it's a bad idea. The ResultSet belongs to the JDBC layer, which is a rather low-level layer with its own problems and pitfalls. In general, the ResultSet should be consumed and closed as quickly as you can, and should not leak to upper layers. By "consumed" I mean looping over it and building, for each cursor position, your own Java objects (using the rs.getXXX() methods) that are not related to the JDBC API. That's what most people and frameworks (e.g. iBatis) do, and, in most cases, doing otherwise is bad practice.
For example, I've once seen some pseudo-dao implemented as (pseudo code) :
public ResultSet getUsers() {
conn = openConnection();
stmt = conn.prepareStatement(...);
result = stmt.executeQuery();
stmt.close();
conn.close();
return result;
}
and the caller would then loop over the ResultSet. This is horrid and can fail, because the ResultSet may try to fetch more data from the database after the connection has been closed.
If you really want to do something like this, you should look into the RowSet interface (for example CachedRowSet). But I don't think it's much used.
An array is probably the way to do it. Just initialize it before your while-loop, fill it during your while-loop, and use it afterwards :)
Another possibility is to use a HashMap to store your rows if you need to constantly reference them throughout the page. For the index use the primary key from the db.
This way you can quickly reference specific rows as needed. If you just need to loop over the result set every time, then an ArrayList, as has been said, should work pretty well here.
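A small sketch of the HashMap idea, assuming the rows have already been read into some list of row objects. The RowIndex class and the key-extractor parameter are my own invention; any row type and key type would do.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class RowIndex {

    // Builds a lookup table keyed by whatever keyOf extracts from a row,
    // typically the primary key. Later rows replace earlier ones with the same key.
    static <K, R> Map<K, R> indexBy(List<R> rows, Function<? super R, ? extends K> keyOf) {
        Map<K, R> byKey = new LinkedHashMap<>(); // also preserves the original row order
        for (R row : rows) {
            byKey.put(keyOf.apply(row), row);
        }
        return byKey;
    }
}
```

Using a LinkedHashMap here means you can still iterate in query order via values() while getting fast lookups by key.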
Currently, I have a table being read from like so:
ps = con.prepareStatement("SELECT * FROM testrow;");
rs = ps.executeQuery();
rs.next();
String[] skills = rs.getString("skills").split(";");
String[] skillInfo;
for (int i = 0; i < skills.length; i++) {
skillInfo = skills[i].split(",");
newid.add(Integer.parseInt(skillInfo[0]));
newspamt.add(Byte.parseByte(skillInfo[1]));
mastery.add(Byte.parseByte(skillInfo[2]));
}
rs.close();
ps.close();
The information is saved to the database by using StringBuilder to form a string of all the numbers that need to be stored, which would be in the format of number1,number2,number3;
I had written a test project to see if that method would be faster than using MySQL's batch method, and it beat MySQL by roughly 3 seconds. The only problem I'm facing now is reading the information: MySQL completes the job in a few milliseconds, whereas using String[] to split the data on the ";" character, and then splitting each piece on the "," character inside a loop, takes about 3 to 5 seconds.
Is there any way I can reduce the amount of time it takes to load the data using the String[], or possibly another method?
Do not store serialized arrays in database fields. Use 3NF?
Do you read the information more often than you write it? If so (most likely), then optimising the write seems to be emphasising the wrong end of the operation. Why not store the info in separate columns and thus avoid splitting (i.e. normalise your data)?
If you can't do that, can you load the data in one thread and hand it off to another thread for splitting and storing? That is, you read the data in one thread and, for each line, pass it through (say) a BlockingQueue to another thread that splits and stores it.
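The hand-off could look roughly like this. It is only a sketch: the class name, the queue size of 64 and the end-of-input marker are arbitrary choices of mine, the input list stands in for the rows read from the database, and error handling (for example a malformed number stopping the worker) is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class SplitHandoff {

    // Unique marker object that tells the worker there is no more input;
    // it is compared by identity, never by content.
    private static final String END = new String("<end>");

    // The calling thread queues raw "id,amount,mastery;..." strings while
    // a worker thread splits them and collects the ids.
    static List<Integer> parseIds(List<String> rawRows) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(64);
        List<Integer> ids = new ArrayList<>();

        Thread worker = new Thread(() -> {
            try {
                for (String raw = queue.take(); raw != END; raw = queue.take()) {
                    for (String skill : raw.split(";")) {
                        ids.add(Integer.parseInt(skill.split(",")[0]));
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        for (String raw : rawRows) {
            queue.put(raw); // in real code this would be rs.getString("skills")
        }
        queue.put(END);
        worker.join(); // after join() the worker's additions to ids are visible here
        return ids;
    }
}
```

This only helps if reading and splitting genuinely overlap; for a single row it adds overhead rather than removing it.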
in the format of number1,number2,number3
Consider normalising the table, giving one number per row.
String.split uses a regular expression for its algorithm. I'm not sure how it's implemented, but chances are that it is quite CPU-heavy. Try implementing your own split method, using a char value instead of a regular expression.
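A hand-rolled split on a single character might look like the sketch below (the class and method names are mine). As far as I know, recent JDKs already take a fast path in String.split for simple one-character delimiters, so it is worth timing both before switching.

```java
import java.util.ArrayList;
import java.util.List;

final class CharSplit {

    // Splits on a single delimiter character using indexOf, without any regex.
    // Unlike String.split, empty trailing fields are kept.
    static List<String> split(String input, char delimiter) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = input.indexOf(delimiter, start)) >= 0) {
            parts.add(input.substring(start, end));
            start = end + 1;
        }
        parts.add(input.substring(start)); // last field, which may be empty
        return parts;
    }
}
```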
Drop the index while inserting, that'll make it faster.
Of course this is only an option for a batch load, not for 500-per-second transactions.
The most obvious alternative method is to have a separate skills table with rows and columns instead of a single field of concatenated values. I know it looks like a big change if you've already got data to migrate but it's worth the effort for so many reasons.
I recommend that instead of using the split method, you use a precompiled regular expression, especially in the loop.
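For illustration, a precompiled-pattern version of the parsing loop could look like this. The SkillParser class is hypothetical, and it assumes every skill has exactly three integer fields in the "number1,number2,number3;" format described in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class SkillParser {

    // Compiled once and reused for every row, instead of on each split call.
    private static final Pattern SKILL_SEPARATOR = Pattern.compile(";");
    private static final Pattern FIELD_SEPARATOR = Pattern.compile(",");

    // Parses "id,amount,mastery;id,amount,mastery;..." into one int[3] per skill.
    static List<int[]> parse(String skills) {
        List<int[]> parsed = new ArrayList<>();
        for (String skill : SKILL_SEPARATOR.split(skills)) {
            if (skill.isEmpty()) {
                continue; // skip empty pieces, e.g. when the input itself is empty
            }
            String[] fields = FIELD_SEPARATOR.split(skill);
            parsed.add(new int[] {
                    Integer.parseInt(fields[0]),
                    Integer.parseInt(fields[1]),
                    Integer.parseInt(fields[2])});
        }
        return parsed;
    }
}
```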