I have a String of the form 2013-10-20 15:18:39.954. I am trying to figure out the best way to store this data so that it uses the least possible memory. Currently I am storing it as a Date object in Java. From this link I found out that the object uses about 32 bytes of memory. Is there some way to use less memory to store this data? I am trying to use as little memory as possible, so even saving 1 byte would help.
I was thinking that I could use a String but this link says Strings also use a lot of memory. Any help would be appreciated!
You can convert the Date to a long with the getTime() method (milliseconds since epoch) and back, and a long is 8 bytes.
A java.util.Date is just a long under the hood. You could represent the same information with a long and build Dates as necessary.
Just beware that storing information this way, as either a java.util.Date or a long, is not fully portable across platforms if you convert it back to a human-readable representation.
The number of milliseconds since epoch that, for example, translates to 'Midnight June 18th, 2019' in a given time zone can change if a platform upgrade changes daylight saving or time zone rules or, on the rare platforms that count them, introduces new leap seconds.
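As a minimal sketch of the long-based approach described above, the following converts the question's example string to epoch milliseconds and back. It uses java.time rather than java.util.Date, and it assumes the string is meant as UTC, since the text itself carries no zone; the class and method names are made up for illustration.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class EpochMillisDemo {
    // Pattern matching the question's example: 2013-10-20 15:18:39.954
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    // Assumption: the text has no zone information, so it is read as UTC.
    static long toEpochMillis(String text) {
        return LocalDateTime.parse(text, FORMAT)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

    // Rebuilds the same textual form from the stored 8-byte value.
    static String fromEpochMillis(long millis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC)
                .format(FORMAT);
    }

    public static void main(String[] args) {
        long stored = toEpochMillis("2013-10-20 15:18:39.954");
        System.out.println(stored);
        System.out.println(fromEpochMillis(stored));
    }
}
```

Only the primitive long needs to be kept; the formatter is shared, so the per-timestamp cost is 8 bytes when the values live in a long field or a long[] array.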
Related
I am using Joda-Time and I have noticed that DateTime is stored as a java.util.Date in the Google App Engine Datastore for Java, while LocalDateTime is stored as an ISO 8601 compliant String.
http://code.google.com/p/objectify-appengine/source/browse/src/com/googlecode/objectify/impl/translate/opt/joda/?r=2d48a85eae3a679c0dc0d01631de99f3b4775b29
I know that java.util.Date is a native type of the Datastore.
Are there any particular advantages to storing date/times as a java.util.Date as compared to an ISO 8601 compliant String, or is it all the same? When I say advantage, I mean differences with regard to ...
Inequality queries
Storage size
Read/write cost
etc.
The accepted answer is not wrong, but I would like to add extra details which give a more balanced view.
a) Stable queries: ISO-8601 is stable as long as you assert that
you only use one date format for storage (ISO defines three: calendar date, ordinal date and week date)
and that you always use one precision degree for the time part (for example always in milliseconds)
and that you always use UTC with respect to global timestamps (that is zero offset with symbol Z).
Confirming this kind of stability can be application-dependent while java.util.Date does not require the same care.
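The conditions in (a) can be illustrated with a small sketch. It assumes java.time (not Joda-Time) and a hypothetical helper named toSortable that always writes UTC with exactly three fractional digits. With that fixed form, plain string comparison agrees with chronological order; with a variable-precision form such as Instant.toString(), it may not.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class IsoSortableDemo {
    // One format, one precision (milliseconds), always UTC with a literal Z.
    private static final DateTimeFormatter FIXED =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")
                    .withZone(ZoneOffset.UTC);

    static String toSortable(Instant instant) {
        return FIXED.format(instant);
    }

    public static void main(String[] args) {
        Instant earlier = Instant.ofEpochMilli(1_000);
        Instant later = Instant.ofEpochMilli(1_500);

        // Fixed precision: string order matches time order.
        System.out.println(toSortable(earlier).compareTo(toSortable(later)) < 0);

        // Instant.toString() drops a zero fraction, so the two strings have
        // different widths and compare in the wrong order.
        System.out.println(earlier.toString().compareTo(later.toString()) < 0);
    }
}
```

The second comparison goes wrong because "1970-01-01T00:00:01Z" and "1970-01-01T00:00:01.500Z" differ at the character after the seconds, where 'Z' sorts after '.'.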
b) Precision: ISO-8601 can express precision beyond milliseconds, while java.util.Date and Joda-Time are limited to milliseconds. This matters particularly if you later consider newer time libraries like JSR-310 in Java 8 or my own, which provide nanosecond precision. Then you will have precision issues with all JDBC types, java.util.Date and the database columns unless they are CHAR or VARCHAR.
A striking example is the JDBC-type java.sql.Time whose precision is limited to seconds, not better. This is pretty much in contrast to the new Java8-type java.time.LocalTime which offers nanoseconds. And worse: This aspect is also relevant for Joda-Time and java.util.Date in your application layer.
For rather academic purposes: Leap seconds can only be stored in ISO-8601 format, not with java.util.Date or similar.
c) Storage size: Of course, java.util.Date has a more compact representation, but disk space is cheap nowadays, so this is not an item to worry about so much.
d) Read/write costs: This item is in favor of compact data types like java.util.Date. But you also have to consider that even in this case you have to represent it in a human-readable format in some other layer sooner or later (mostly in logging or in the presentation layer). So for data exchange with other proprietary applications which expect java.util.Date this native type is okay, but for logging purposes or XML data exchange ISO-8601 is probably the better format.
And if you really care so much about performance costs, you might even consider a number type (long - 64 bit) to avoid garbage burdens by unnecessary object creation (in extreme edge cases). Remember: java.util.Date is just a wrapper around a long.
The advantage of java.util.Date: Stable queries (both inequality and equality), storage size, and interoperability with other GAE languages that have native date representations.
I'm creating a custom data type that needs to store a set of exact timestamps (to millisecond accuracy) in a way that is both efficient and correct. I'm not particularly familiar with the intricacies of timestamp handling, so I thought I would ask here for some wise advice.
I can see many options:
Store a Joda Instant for every timestamp.
Store a Joda DateTime for every timestamp.
Store a single Joda DateTime object once for the data type, and have a long offset for all the other timestamps relative to the main DateTime
Express every timestamp as a long offset to a fixed point (e.g. the Unix epoch of 1970-01-01T00:00:00Z )
.....other combinations.....
Questions:
What is the best way to store a sequence of timestamps?
What are the key tradeoffs?
Any pitfalls to watch out for?
Each of your storage options makes sense, and it's nice to see all of your options are with actual instants and never local datetimes (i.e., datetimes without time zones).
Your custom class will really be defined by its interface, so if you choose to store longs (epoch offsets) you can always provide interface methods to get the values from the sequence (and I assume other things like "deltas" -- or intervals, durations, or periods in Joda-speak) in human readable datetimes and periods if you like.
As you asked a number of questions, involving trade-offs, here's what I can offer:
Storing a sequence of longs is the most space-efficient.
Longs are not really as bad as you might think, since if your interface methods want to return datetimes, you just pass the long to the DateTime constructor.
Instants are thin wrappers over longs and provide convenience methods if you need to add durations to them or compute durations from instants; your code might look a little nicer than if you do your own math on longs and then construct a DateTime or Period or Duration around them.
DateTimes are great if you don't have excessive storage requirements and the actual date and time-of-day matter to the clients of your custom data type. Will your users care that a timestamp is on October 10th at 16:22 in the America/Los_Angeles time zone? Or is the duration between the timestamps all that matters?
Storing a datetime or instant plus an array of offsets looks like a messy implementation since there are two concepts in play. I would think storing a single sequence of instants/datetimes only, and not mixing in durations, makes a lot more sense. If you need to work with durations, just compute them in your interface methods.
I would say the only pitfalls to watch out for involve dealing with time zones if you are storing longs and your clients need to be thinking in datetimes.
In terms of tradeoffs, I only really see that longs, being primitive, save space, and I would guess a minuscule amount of time since DateTimes are objects and there is all that heap allocation and deallocation that takes place. But again, unless you are severely memory-constrained, I would say the best thing is to store DateTimes. Joda-Time can do all the time zone management for you. The parsing and formatting routines are easy and thread-safe. There are tons of convenience methods in place. And you won't have to do any conversion of your own from datetimes to longs. Storing longs only feels like a premature optimization to me. Strangely enough, FWIW, I would probably do that in Python, since Python's datetime objects are naive, rather than timezone-aware, by default! Joda-Time makes IMHO a very nice and easy to understand distinction between Instants and Local DateTimes, so sticking with DateTimes everywhere will be, I believe, your best bet.
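For comparison, here is a rough sketch of the long-based alternative discussed above: the class stores epoch offsets internally and exposes richer types only through its interface methods. It is written against java.time instead of Joda-Time so that it runs on a plain JDK, and the class name TimestampSequence and its methods are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Arrays;

final class TimestampSequence {
    // Epoch milliseconds, stored as primitives to avoid one object per entry.
    private long[] millis = new long[16];
    private int size = 0;

    void add(Instant timestamp) {
        if (size == millis.length) {
            millis = Arrays.copyOf(millis, size * 2); // grow when full
        }
        millis[size++] = timestamp.toEpochMilli();
    }

    int size() {
        return size;
    }

    // Wraps the stored long in an Instant only when a caller asks for it.
    Instant get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return Instant.ofEpochMilli(millis[index]);
    }

    // Time zones enter only at the interface, never in storage.
    ZonedDateTime get(int index, ZoneId zone) {
        return get(index).atZone(zone);
    }

    // Durations are computed on demand instead of being stored.
    Duration between(int fromIndex, int toIndex) {
        return Duration.between(get(fromIndex), get(toIndex));
    }
}
```

Whether this is worth it over a list of datetime objects depends on how many timestamps are held; for small collections the simpler object-based design recommended above is likely easier to maintain.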
I have seen many programs in java (android programs in particular) that convert string dates to longs and also save the date to an sqlite database. Why is this necessary? Is it doing a conversion? Or is there something in particular about sqlite that requires this? When coding mobile applications CPU work should be kept to a minimum.
Databases generally do not store dates as strings internally. Think about this for a second: the text "July 20, 2013" takes 13 bytes. How many bits is that? 13 * 8 = 104. Instead, the internal representation is some variation of an offset from a known time. In C this reference point is the epoch. The number of days since 1970 can be stored in a 16-bit number, for example. The reason for the conversion is to change the representation into a compact form so that the database can operate efficiently on the data.
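The size comparison above can be checked with a short sketch. It assumes ASCII text and uses java.time.LocalDate to count days since 1970-01-01; the class and method names are illustrative only.

```java
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;

final class CompactDateDemo {
    // Number of bytes the date occupies when stored as ASCII text.
    static int textBytes(String text) {
        return text.getBytes(StandardCharsets.US_ASCII).length;
    }

    // Number of days between 1970-01-01 and the given calendar date.
    static long daysSinceEpoch(int year, int month, int day) {
        return LocalDate.of(year, month, day).toEpochDay();
    }

    public static void main(String[] args) {
        int bytes = textBytes("July 20, 2013");
        long days = daysSinceEpoch(2013, 7, 20);
        System.out.println(bytes + " bytes = " + (bytes * 8) + " bits as text");
        System.out.println(days + " days since 1970, fits in 16 bits: " + (days < 65536));
    }
}
```

A day count also compares and sorts as a plain integer, which is part of why the compact form is convenient for the database.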
I want to use a time stamp as part of my application (I am using Java). I was asked to use Network Time Protocol (NTP). I have searched on Google and I was able to find a package "org.apache.commons.net" where there is a TimeStamp class. I have gone through this link to know more about the class.
What should I pass to the constructors of this class (what is the significance of each constructor)? Actually, the TimeStamp class should return us the time stamp; instead it is asking us to input a time stamp. I am confused by that.
You can use the following overload of the constructor to create the TimeStamp Object.
public TimeStamp(Date d)
pass an object of java.util.Date as argument.
This will give you a timestamp value which is represented as a
64-bit unsigned fixed-point number in seconds relative to 0-hour on 1-January-1900.
The main significance is that it is a protocol, a standard followed by different systems. Different systems present in a network may not have their clocks synchronized, may not understand how others are measuring time, and may follow different time zones. Using NTP, they synchronize their clocks to UTC.
You can use the static getCurrentTime() to get a timestamp that represents the current time measured to the nearest millisecond:
TimeStamp myTs = TimeStamp.getCurrentTime();
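To make the 64-bit fixed-point layout concrete, here is a simplified sketch that packs Java epoch milliseconds into that form using only the JDK. It is not the commons-net implementation: the names are made up, and it assumes a non-negative input before the NTP era rollover in February 2036, which this sketch does not handle.

```java
final class NtpTimestampSketch {
    // Seconds between 1900-01-01T00:00:00Z (NTP epoch) and 1970-01-01T00:00:00Z.
    static final long SECONDS_1900_TO_1970 = 2_208_988_800L;

    // Upper 32 bits: whole seconds since 1900. Lower 32 bits: fraction of a second.
    // Assumption: epochMillis >= 0 and earlier than the 2036 rollover.
    static long toNtp(long epochMillis) {
        long seconds = epochMillis / 1000 + SECONDS_1900_TO_1970;
        long fraction = ((epochMillis % 1000) * 0x1_0000_0000L) / 1000;
        return (seconds << 32) | fraction;
    }

    static long ntpSeconds(long ntp) {
        return ntp >>> 32; // unsigned shift, since the top bit is set
    }

    static long ntpFraction(long ntp) {
        return ntp & 0xFFFF_FFFFL;
    }

    public static void main(String[] args) {
        long ntp = toNtp(System.currentTimeMillis());
        System.out.println(ntpSeconds(ntp) + " s since 1900, fraction bits " + ntpFraction(ntp));
    }
}
```

Half a second, for example, becomes a fraction field of 0x80000000, that is, one half of 2^32.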
I'm setting the standards for our application.
I've been wondering, what default date format should I choose to use ?
It should be:
Internationalization & timezone aware, the format should be able to represent user local time
Can be efficiently parsed by SimpleDateFormat (or alike, jdk classes only)
Programming Language agnostic (can parse in java, python, god forbid C++ :) and co.)
Preferably ISO based or other accepted standard
Easy to communicate over HTTP (Should such need arises, JSON or YAML or something in this nature)
Can represent time down to seconds resolution (the more precise the better, micro seconds if possible).
Human readable is a plus but not required
Compact is a plus but not required
Thank you,
Maxim.
yyyy-MM-dd'T'HH:mmZ (see ISO 8601). As a SimpleDateFormat pattern, the T has to be quoted and HH selects the 24-hour clock. You can add seconds, etc.
You can read it easily, it will not be a problem for SimpleDateFormat.
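A minimal sketch of doing this with SimpleDateFormat follows. It assumes Java 7 or later, where the pattern letter X produces ISO 8601 style offsets (the letter Z would instead print RFC 822 offsets such as +0000), and it adds seconds to the pattern. The class and method names are illustrative.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class IsoSimpleDateFormatDemo {
    // SimpleDateFormat is not thread-safe, so a fresh instance is built per call.
    private static SimpleDateFormat newFormat() {
        SimpleDateFormat format =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssXXX", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC")); // always write UTC
        return format;
    }

    static String format(Date date) {
        return newFormat().format(date);
    }

    static Date parse(String text) {
        try {
            return newFormat().parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not an ISO 8601 date-time: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(format(new Date(0L)));
        System.out.println(parse("2013-10-20T17:18:39+02:00").getTime());
    }
}
```

Input with an explicit offset is accepted as well and converted to the same instant, so the stored value does not depend on the sender's zone.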
The most canonical and standard form is probably "Unix Time": The number of seconds elapsed since midnight Coordinated Universal Time (UTC) of January 1, 1970.
If you set that as the default time-format you can easily parse it, store it in memory, write it to disk, easily communicate it over HTTP and so on. It is also definitely an accepted standard, and in a sense it is "time-zone aware", since it is well-defined regardless of time-zones.
(This is the format in which I always store all my time stamps; in databases, in memory, on disk, ...)
The "right" default format really depends on what you're doing with it. The formats for parsing, storing, and displaying can all be different.
For storing the date you're (almost) always going to want to use UTC as aioobe says, even when you want to display it in user local time. I say "(almost)" but I really can't think of a case where I would not want UTC for a saved date. You may want to store the TZ information for where the date originated also, so you can report it in that local time, but more often you want to display the local time for whoever is currently looking at the date. That means having a way to determine the current user's local time regardless of what the original local time was.
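A small sketch of that split between storage and display, assuming java.time and a stored value in Unix seconds; the zone IDs stand in for whatever the application knows about the current viewer, and the names are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

final class StoreUtcDisplayLocalDemo {
    // The stored value is zone-independent; the zone is applied only for display.
    static String render(long epochSeconds, String viewerZoneId) {
        return Instant.ofEpochSecond(epochSeconds)
                .atZone(ZoneId.of(viewerZoneId))
                .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    public static void main(String[] args) {
        long stored = 1382282319L; // 2013-10-20T15:18:39Z
        System.out.println(render(stored, "UTC"));
        System.out.println(render(stored, "America/Los_Angeles"));
        System.out.println(render(stored, "Europe/Paris"));
    }
}
```

The same stored number yields three different wall-clock renderings, which is why only the UTC value needs to be persisted.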
For displaying it, the "default format" should usually be determined by the viewer's locale. 08/09/10 usually means 2010-Aug-9 in the U.S. ("Middle endian") but normally means 2010-Sep-8 in most of the rest of the world ("Little endian"). The ISO-8601 format "2010-09-10" is safe and unambiguous but often not what people expect to see. You can also look over RFC-3339 for Date and Time on the internet and RFC-2822 for message format (transmitting the date).
For parsing a date, you'll want to parse it and convert it to UTC, but you should be fairly flexible about what you accept. Again, the end user's locale and timezone, if discoverable, can help you determine what format(s) of string to accept as input. This is assuming user-typed strings. If you're generating a date/time stamp you can control the form and parsing will be no problem.
I also second BalusC's link, which I hadn't seen before and have now favorited.