I am currently trying to create a timeline in phpMyAdmin from a MySQL database, using Java and JDBC. In the database I store IPs that I get from a pcap file, and I get the time I want from the packet (using packet.getCaptureHeader().nanos()). Every time an IP occurs I increment a counter. What I want is to create a timeline showing the progress of the sum of the counter of each IP. I tried something, but I think I am on the wrong track. Any suggestions?
long timer = packet.getCaptureHeader().nanos();
Class.forName("com.mysql.jdbc.Driver");
Connection connect = DriverManager.getConnection(
        "jdbc:mysql://localhost/thesis?user=sqluser&password=sqluserpw");
PreparedStatement preparedStatement = connect.prepareStatement(
        "INSERT INTO thesis.ICMP VALUES (?, ?, ?, ?) ON DUPLICATE KEY UPDATE counter = counter + 1");
preparedStatement.setString(1, xIP);
preparedStatement.setString(2, "ICMP");
preparedStatement.setInt(3, 1);
preparedStatement.setLong(4, timer);
preparedStatement.executeUpdate();
I noticed that if I use a DATE column I can create timelines easily, but DATE doesn't support that kind of accuracy. Please feel free to think out of the box, even suggest a new approach, I won't mind.
MySQL doesn't support nanosecond resolution. Since MySQL 5.6.4, there is support for fractional seconds with microsecond precision, but for further precision (or if you have an old MySQL version), you'll have to come up with something on your own.
Probably what I'd do in this particular case is store the value as a date with second resolution, then store the fractional part of the second, converted to nanoseconds, as an unsigned INT (a signed integer would work, but you'd never want a negative value anyway). The downside is that searching and sorting become more difficult. This person is discussing storing the nanoseconds as a decimal, which I don't understand, but has some good thoughts on the issue aside from that. Another possibility is to use a BIGINT to store the number of nanoseconds since an epoch, but a signed 64-bit value only covers about 292 years of nanoseconds (roughly 584 if unsigned). For your use that may be fine, but it is a limitation to keep in mind.
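For instance, here is a minimal sketch of that split, reusing your insert and assuming timer really holds nanoseconds since the epoch; the packet_event table and its ts_sec/ts_nanos columns are made-up names, not your schema:

long wholeSeconds = timer / 1000000000L;            // seconds since the epoch
int nanoRemainder = (int) (timer % 1000000000L);    // 0..999999999, fits an unsigned INT
PreparedStatement ps = connect.prepareStatement(
        "INSERT INTO packet_event (ip, proto, counter, ts_sec, ts_nanos) VALUES (?, ?, ?, ?, ?) "
      + "ON DUPLICATE KEY UPDATE counter = counter + 1");
ps.setString(1, xIP);
ps.setString(2, "ICMP");
ps.setInt(3, 1);
ps.setTimestamp(4, new java.sql.Timestamp(wholeSeconds * 1000L));  // second resolution only
ps.setInt(5, nanoRemainder);                                       // fractional part in nanoseconds
ps.executeUpdate();

Sorting and range queries then go against ts_sec first and ts_nanos second, which is the awkward part mentioned above.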
I have a Java application which sets up a JDBC connection to an Oracle database. I am attempting to insert data into the database but am confused when it comes to the Oracle NUMBER type. I have three columns in my table which are of these types respectively.
NUMBER(38,0)
NUMBER(20,0)
NUMBER(16,0)
My first question is which Java type I should use for the data in order to bind it in a prepared statement.
My second question is which setter method I can use on the prepared statement to insert the data.
Let's just assume we are working with NUMBER(38,0). Would I set the Java type to a BigInteger? If I had the integer 1, would it be
BigInteger one = new BigInteger(1);
Then in my preparedStatement it would be
PreparedStatement pstmt = conn.prepareStatement("INSERT INTO TABLE(bigInt) VALUES(?)");
pstmt.setLong(1, one);
This seems to not work, so I assume that this is not correct. Any help would be appreciated.
setLong() cannot take a BigInteger. If you truly have values exceeding the range of long in your database, then you may need to use setBigDecimal(), as there is no setBigInteger(), and your variable would have to be of type BigDecimal. If long encompasses the range of values in your database, then just use long and setLong().
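A minimal sketch along those lines, assuming the value genuinely needs BigInteger range; the table and column names here are placeholders, not from your schema:

BigInteger one = BigInteger.valueOf(1);                       // java.math.BigInteger
PreparedStatement pstmt = conn.prepareStatement(
        "INSERT INTO mytable (big_int_col) VALUES (?)");      // big_int_col is NUMBER(38,0)
pstmt.setBigDecimal(1, new BigDecimal(one));                  // java.math.BigDecimal wraps the BigInteger
pstmt.executeUpdate();

If the values always fit in 64 bits, skip the wrapping entirely and use long with setLong().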
You can try this way:
oracle.sql.NUMBER numberValue = new oracle.sql.NUMBER(bigIntegerValue);
cs.setObject(id, numberValue, OracleTypes.NUMBER);
where bigIntegerValue is an instance of java.math.BigInteger; it works for me.
During execution of a program that relies on the oracle.sql package, there is a large performance hit for persisting more than 200 million Timestamps compared to persisting the same number of longs.
Basic Schema
Java to persist:
Collection<ARRAY> longs = new ArrayList<ARRAY>(SIZE);
Collection<ARRAY> timeStamps = new ArrayList<ARRAY>(SIZE);
for(int i = 0; i < SIZE;i++)
{
longs.add(new ARRAY(description, connection, i));
timeStamps.add(new ARRAY(description, connection, new Timestamp(new Long(i))));
}
PreparedStatement timeStatement = conn.prepareStatement(insertSql); // insert statement text omitted in the question
timeStatement.setObject(1, timeStamps);
timeStatement.execute(); // 5 minutes
PreparedStatement longStatement = conn.prepareStatement(insertSql);
longStatement.setObject(1, longs);
longStatement.execute(); // 1 minute 15 seconds
My question is: what does Oracle do to Timestamps that makes them so awful to insert in bulk?
Configuration:
64 bit RHEL 5
jre 6u16
ojdbc14.jar
64 GB dedicated to the JVM
UPDATE
java.sql.Timestamp is being used
Number takes 4 bytes, Timestamp takes 11 bytes. In addition, Timestamp has metadata associated with it. For each Timestamp, Oracle seems to compute the metadata and store it with the field.
Oracle timestamps are not stored as an absolute value since the epoch, the way a java.sql.Timestamp holds them internally. They are a packed structure containing values for the various "human" fields: centuries, months, and so on.
So each one of your nanosecond-since-epoch timestamps is getting parsed into a "human" date before storage.
Adding to Srini's post, for documentation on memory use by data type:
Oracle Doc on Data Types: http://docs.oracle.com/cd/E11882_01/timesten.112/e21642/types.htm#autoId31 (includes memory size for Number and Timestamp)
The docs state that Number takes 5-22 bytes, Timestamp takes 11 bytes, Integer takes 4 bytes.
Also - to your point on querying against a date range - could you insert the dates as long values instead of timestamps and then use a stored procedure to convert when you are querying the data? This will obviously impact the speed of the queries, so it could be kicking the problem down the road, but.... :)
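A rough sketch of that idea; the event_log table and event_time_ms column are invented names, and the conversion back to a date is shown as a plain SQL expression rather than a stored procedure:

// Invented names: event_log with an event_time_ms NUMBER(19) column.
long[] times = { System.currentTimeMillis() };   // stand-in for your ~200 million values
PreparedStatement ps = conn.prepareStatement(
        "INSERT INTO event_log (event_time_ms) VALUES (?)");
for (long t : times) {
    ps.setLong(1, t);        // epoch milliseconds as a plain long
    ps.addBatch();
}
ps.executeBatch();
// When reporting, convert back in SQL, e.g. (86400000 ms per day):
// SELECT DATE '1970-01-01' + event_time_ms / 86400000 AS event_date FROM event_log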
I have a performance issue with calling getInt inside a ResultSetExtractor. getInt is called 20,000 times. One call costs 0.15 ms; the overall cost is 24 seconds while running inside the profiler. The execution of the SQL statements takes around 8 seconds (access over the primary key). I use the MySQL driver version 5.1.13, MySQL server 5.1.44, and spring-jdbc 3.1.1.
Do you have any ideas for improving the performance?
mut.getMutEffect()[0]=(rs.getInt("leffect_a") != 0);
mut.getMutEffect()[1]=(rs.getInt("leffect_c") != 0);
...
mut.getMutEffect()[19]=(rs.getInt("leffect_y") != 0);
mut.getMutReliability()[0]=rs.getInt("lreliability_a");
...
mut.getMutReliability()[19]=rs.getInt("lreliability_y");
My schema looks like this:
CREATE TABLE mutation (
...
leffect_a BIT NOT NULL,
lreliability_a TINYINT UNSIGNED NOT NULL,
...
leffect_y BIT NOT NULL,
lreliability_y TINYINT UNSIGNED NOT NULL,
...
) ENGINE=MyISAM;
Edit: Within getInt the method getIntWithOverflowCheck is called, which seems to be expensive. Is it possible to turn off these checks?
Here are some suggestions:
Set the fetch size to a fairly large number with Statement.setFetchSize(). This should reduce the round trips to the database server while processing the result set (see the sketch after this list).
Ensure the select statement is optimal by profiling
General table optimization, e.g. are you using the correct datatypes? It looks like you could change the leffect_a column to BOOLEAN.
Make sure you aren't returning any unnecessary columns in your SELECT statement.
Use PreparedStatement
Avoid scrollable and updatable resultsets (neither are the default)
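A minimal sketch of the fetch-size suggestion. One caveat, stated as a general Connector/J behaviour rather than something specific to your setup: a positive fetch size is only honoured as a chunked cursor fetch when useCursorFetch=true is set on the connection URL; otherwise the driver reads the whole result set at once (or streams row by row with Integer.MIN_VALUE).

// Connection URL assumption: jdbc:mysql://localhost/db?useCursorFetch=true
PreparedStatement ps = conn.prepareStatement(
        "SELECT leffect_a, lreliability_a, leffect_y, lreliability_y FROM mutation");
ps.setFetchSize(1000);             // ask the driver to pull rows in chunks of ~1000
ResultSet rs = ps.executeQuery();
while (rs.next()) {
    // extract columns as before
}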
Two suggestions:
Store the results of getMutEffect() and getMutReliability() in local variables, as they are used repeatedly. The HotSpot JIT might inline and remove the duplicate expressions, but I think it's clearer not to rely on this.
It might be faster to retrieve the values of the ResultSet using their indexes instead of the column names. You could even create a local map of names to indexes; strangely, for some JDBC drivers this is faster than letting ResultSet do the mapping.
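A small sketch combining both suggestions, assuming the same rs and mut objects from the question; the index lookup happens once, outside the per-column assignments:

// Resolve column indexes once, then use the int overload of getInt().
int idxEffectA = rs.findColumn("leffect_a");
int idxReliabilityA = rs.findColumn("lreliability_a");
// ... same for _c through _y

boolean[] effects = mut.getMutEffect();          // cache the arrays locally (suggestion 1)
int[] reliabilities = mut.getMutReliability();
effects[0] = rs.getInt(idxEffectA) != 0;
reliabilities[0] = rs.getInt(idxReliabilityA);
// ... repeat for the remaining columns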
My site needs to store the IP and timestamp of every visit in MySQL. I am concerned that very quickly I will have 1e6 rows in my database.
What is the best way to compress a date in MySQL or Java? Does MySQL already compress dates? Ideally, I would like to un-compress the date values rather quickly to generate reports.
Update: sorry, I meant a million per day. But I guess that is still minuscule.
Mate, one million rows is a tiny database. I wouldn't be worrying too much about that. In any case, MySQL uses a pretty compressed format (3 bytes) anyway as per this page:
DATE: A three-byte integer packed as DD + MM×32 + YYYY×16×32
In other words, at bit level (based on the 1000-01-01 thru 9999-12-31 range):
0yyyyyyy yyyyyyym mmmddddd
Use the built in MySQL datetime type. A million rows isn't that many.
A MySQL TIMESTAMP would be only 4 bytes. An integer representing the timestamp would be the same. It would be efficient to save it as a MySQL type, since you'd be able to index and/or query on that column efficiently.
Any "compressed" form that is not a MySQL type would be inefficient to query.
I have a table containing 15+ million records in Oracle. It's sort of a log table which has a created_ts column of type "date". I have a simple "non-unique" index on the created_ts column.
I have a simple range query:
select * from table1 where created_ts >= ? and created_ts <= ?;
when i run this query from SQLPlus or SQL Developer etc like this :
select * from table1
where created_ts >= TO_DATE( '2009-11-10 00:00:00', 'YYYY-MM-DD HH24:MI:SS')
and created_ts <= TO_DATE( '2009-11-10 23:59:59', 'YYYY-MM-DD HH24:MI:SS');
The query returns within 1-2 seconds max.
But when I run the exact same query in Java over JDBC and set the corresponding "?" params using java.sql.Timestamp objects, the query takes a long time. Analyzing the Oracle process, it goes for a full table scan and doesn't use the index.
The JDBC driver I am using is ojdbc5 11.1.0.7.0.
Can anyone please help with how to create the index correctly so that the query actually uses it?
My problem was resolved when I used "oracle.sql.DATE" objects to set the bind variables instead of "java.sql.Timestamp". The query used the index and executed almost within 1-2 seconds.
Thanks to all who replied and helped.
But it's problematic for me, as this solution is DB dependent and my app receives the DB connection and query as parameters and loads and processes data in a generic way. The DB connection can be to any RDBMS, like Oracle, MySQL, etc.
This is classic behaviour for an implicit datatype conversion. Because the database is having to convert the datatype of the column it cannot use any index on that column.
In your case I suspect this is due to your use of java.sql.Timestamp. Would it be possible to use the equivalent type from the Oracle datatypes package, oracle.sql.TIMESTAMP? Obviously that may have some knock-on effects, but I think you should at least test it to see whether it solves your problem.
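A minimal sketch of that idea, using the oracle.sql.DATE binding that ultimately worked for the asker; whether DATE or TIMESTAMP is the right bind type depends on the column's actual datatype, and the query text is the one from the question:

// Bind oracle.sql.DATE so the bind type matches the DATE column and no
// implicit conversion of created_ts defeats the index.
PreparedStatement ps = conn.prepareStatement(
        "select * from table1 where created_ts >= ? and created_ts <= ?");
ps.setObject(1, new oracle.sql.DATE(java.sql.Timestamp.valueOf("2009-11-10 00:00:00")));
ps.setObject(2, new oracle.sql.DATE(java.sql.Timestamp.valueOf("2009-11-10 23:59:59")));
ResultSet rs = ps.executeQuery();

The obvious downside, as noted above, is that this ties the code to the Oracle driver classes.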
The difference may be because of bind variables vs. literal values. You are not comparing the same things.
Try this in SQL*Plus:-
explain plan for
select * from table1 where created_ts >= :1 and created_ts <= :2;
set markup html preformat on
set linesize 100
set pagesize 0
select plan_table_output
from table(dbms_xplan.display('plan_table',null,'serial'));
This will show you the plan Oracle will pick when using bind variables. In this scenario, Oracle has to make up a plan before you have provided values for your date range. It does not know whether you are selecting only a small fraction of the data or all of it. If this has the same plan (full scan?) as your plan from Java, at least you know what is happening.
Then, you could consider:-
Enabling bind peeking (but only after testing this does not cause anything else to go bad)
Carefully binding literal values from java in a way that does not allow SQL injection
Putting a hint in the statement to indicate it should use the index you want it to.
You should try a hint of the form /*+ INDEX(table_name index_name) */
My guess is that the optimizer is choosing a full table scan because it sees that as the best option in the absence of knowing the bind values.