Time handling with timestamp and offset - java

I'm creating a custom data type that needs to store a set of exact timestamps (to millisecond accuracy) in a way that is both efficient and correct. I'm not particularly familiar with the intricacies of timestamp handling, so I thought I would ask here for some wise advice.
I can see many options:
Store a Joda Instant for every timestamp.
Store a Joda DateTime for every timestamp.
Store a single Joda DateTime object once for the data type, and have a long offset for all the other timestamps relative to the main DateTime
Express every timestamp as a long offset to a fixed point (e.g. the Unix epoch of 1970-01-01T00:00:00Z )
.....other combinations.....
Questions:
What is the best way to store a sequence of timestamps?
What are the key tradeoffs?
Any pitfalls to watch out for?

Each of your storage options makes sense, and it's nice to see all of your options are with actual instants and never local datetimes (e.g., without time zones).
Your custom class will really be defined by its interface, so if you choose to store longs (epoch offsets) you can always provide interface methods to get the values from the sequence (and I assume other things like "deltas" -- or intervals, durations, or periods in Joda-speak) in human readable datetimes and periods if you like.
As you asked a number of questions, involving trade-offs, here's what I can offer:
Storing a sequence of longs is the most space-efficient.
Longs are not really as bad as you might think, since if your interface methods want to return datetimes, you just pass the long to the DateTime constructor.
Instants are thin wrappers over longs and provide convenience methods if you need to add durations to them or compute durations from instants; your code might look a little nicer than if you do your own math on longs and then construct a DateTime or Period or Duration around them.
DateTimes are great if you don't have excessive storage requirements and the actual date and time-of-day matter to the clients of your custom data type. Will your users care that a timestamp is on October 10th at 16:22 in the America/Los_Angeles time zone? Or is the duration between the timestamps all that matters?
Storing a datetime or instant plus an array of offsets looks like a messy implementation since there are two concepts in play. I would think storing a single sequence of instants/datetimes only, and not mixing in durations, makes a lot more sense. If you need to work with durations, just compute them in your interface methods.
I would say the only pitfalls to watch out for involve dealing with time zones if you are storing longs and your clients need to be thinking in datetimes.
In terms of tradeoffs, I only really see that longs, being primitive, save space, and I would guess a minuscule amount of time since DateTimes are objects and there is all that heap allocation and deallocation that takes place. But again, unless you are severely memory-constrained, I would say the best thing is to store DateTimes. Joda-Time can do all the time zone management for you. The parsing and formatting routines are easy and thread-safe. There are tons of convenience methods in place. And you won't have to do any conversion of your own from datetimes to longs. Storing longs only feels like a premature optimization to me. Strangely enough, FWIW, I would probably do that in Python, since Python's datetime objects are naive, rather than timezone-aware, by default! Joda-Time makes IMHO a very nice and easy to understand distinction between Instants and Local DateTimes, so sticking with DateTimes everywhere will be, I believe, your best bet.
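To make the trade-off concrete, here is a minimal sketch of the "store longs, expose richer types through the interface" option. It is only an illustration under stated assumptions: the class name TimestampSequence and its methods are hypothetical, and it uses the JDK's java.time types instead of Joda-Time so that it runs without extra dependencies (with Joda-Time the accessors would build a DateTime or Duration from the same long in the same way).

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Arrays;

/** Hypothetical container: stores epoch milliseconds, builds richer objects on demand. */
final class TimestampSequence {
    private long[] millis = new long[16];
    private int size = 0;

    /** Appends a timestamp given as milliseconds since 1970-01-01T00:00:00Z. */
    void add(long epochMillis) {
        if (size == millis.length) {
            millis = Arrays.copyOf(millis, size * 2); // grow the backing array
        }
        millis[size++] = epochMillis;
    }

    int size() {
        return size;
    }

    /** Returns the raw stored value. */
    long millisAt(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return millis[index];
    }

    /** Wraps the stored long in an Instant; no time zone is involved. */
    Instant instantAt(int index) {
        return Instant.ofEpochMilli(millisAt(index));
    }

    /** A time zone is needed only when a caller wants calendar fields. */
    ZonedDateTime dateTimeAt(int index, ZoneId zone) {
        return instantAt(index).atZone(zone);
    }

    /** Durations are computed from the stored values rather than stored themselves. */
    Duration between(int from, int to) {
        return Duration.ofMillis(millisAt(to) - millisAt(from));
    }

    public static void main(String[] args) {
        TimestampSequence seq = new TimestampSequence();
        seq.add(0L);
        seq.add(90_000L);
        System.out.println(seq.instantAt(1));
        System.out.println(seq.between(0, 1));
        System.out.println(seq.dateTimeAt(0, ZoneId.of("America/Los_Angeles")));
    }
}
```

The storage cost is one long per timestamp, and callers never see the longs unless they ask for them.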

Related

Google Datastore - Storing dates as ISO 8601 Strings vs java.util.Date

I am using Joda-Time and I have noticed that DateTime is stored as a java.util.Date in the Google App Engine Datastore for Java, while LocalDateTime is stored as a ISO 8601 compliant String.
http://code.google.com/p/objectify-appengine/source/browse/src/com/googlecode/objectify/impl/translate/opt/joda/?r=2d48a85eae3a679c0dc0d01631de99f3b4775b29
I know that java.util.Date is a native type of the Datastore.
Are there any particular advantages to storing date/times as a java.util.Date compared to an ISO 8601 compliant String, or is it all the same? When I say advantage, I am thinking of differences with regard to ...
Inequality queries
Storage size
Read/write cost
etc.
The accepted answer is not wrong, but I would like to add extra details which give a more balanced view.
a) Stable queries: ISO-8601 is stable as long as you assert that
you only use one date format for storage (ISO defines three: calendar date, ordinal date and week date)
and that you always use one precision degree for the time part (for example always in milliseconds)
and that you always use UTC with respect to global timestamps (that is zero offset with symbol Z).
Confirming this kind of stability can be application-dependent while java.util.Date does not require the same care.
b) Precision: ISO-8601 can express more precision beyond milliseconds while java.util.Date and Joda-Time are limited here. This is particularly true if you might later think of other new time libraries like JSR-310 in Java 8 or my own one which provide nanosecond precision. Then you will have precision issues with all JDBC types, java.util.Date and the database columns as far as they are not CHAR or VARCHAR.
A striking example is the JDBC-type java.sql.Time whose precision is limited to seconds, not better. This is pretty much in contrast to the new Java8-type java.time.LocalTime which offers nanoseconds. And worse: This aspect is also relevant for Joda-Time and java.util.Date in your application layer.
For rather academic purposes: Leapseconds can only be stored in ISO-8601-format, not with java.util.Date or similar.
c) Storage size: Of course, java.util.Date has a more compact representation, but disk space is cheap nowadays, so this is not an item to worry about so much.
d) Read-Write-costs: This item is in favor of compact data types like java.util.Date. But you have also to consider that even in this case you have to represent it in a human-readable format in any other layer sooner or later (most in logging or in representation layer). So for data exchange with other proprietary applications which expect java.util.Date this native type is okay, but for logging purposes or XML-data exchange ISO-8601 is probably the better format.
And if you really care so much about performance costs, you might even consider a number type (long - 64 bit) to avoid garbage burdens by unnecessary object creation (in extreme edge cases). Remember: java.util.Date is just a wrapper around a long.
The advantage of java.util.Date: Stable queries (both inequality and equality), storage size, and interoperability with other GAE languages that have native date representations.
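The "one precision degree" condition in point a) can be illustrated with a small sketch. It assumes the strings are compared lexicographically, as a string-typed inequality query would do; the class name and the fixed-width pattern are illustrative choices, not something prescribed by the Datastore or Objectify.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class IsoOrderingSketch {
    // Fixed-width UTC pattern: always three fraction digits and a literal Z.
    private static final DateTimeFormatter FIXED =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSS'Z'")
                    .withZone(ZoneOffset.UTC);

    static String fixedWidth(Instant instant) {
        return FIXED.format(instant);
    }

    public static void main(String[] args) {
        Instant earlier = Instant.parse("2014-05-13T20:05:08Z");
        Instant later = Instant.parse("2014-05-13T20:05:08.500Z");

        // Instant.toString() drops a zero fraction, so the two strings have
        // different widths and 'Z' is compared against '.'.
        boolean mixedWidthSortsCorrectly =
                earlier.toString().compareTo(later.toString()) < 0;

        // With a fixed width, string order matches chronological order.
        boolean fixedWidthSortsCorrectly =
                fixedWidth(earlier).compareTo(fixedWidth(later)) < 0;

        System.out.println("mixed precision sorts correctly: " + mixedWidthSortsCorrectly);
        System.out.println("fixed precision sorts correctly: " + fixedWidthSortsCorrectly);
    }
}
```

The first comparison comes out wrong because the shorter string ends in Z where the longer one has a decimal point, which is exactly the kind of instability a native date type avoids.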

Java SE 8 TemporalAccessor.from issues when used with a java.time.Instant object

java.time has an Instant class which encapsulates a position (or 'moment') on the timeline. While I understand that this is a seconds/nanoseconds value so not directly related to time zones or times offsets, its toString returns a date and time formatted as a UTC date/time, eg 2014-05-13T20:05:08.556Z. Also anInstant.atZone(zone) and anInstant.atOffset(offset) both produce a value that is consistent with treating the Instant as having an implied UTC time-zone/'zero' offset.
I would have expected therefore:
ZoneOffset.from(anInstant) to produce a 'zero' ZoneOffset
OffsetDateTime.from(anInstant) to produce a date/time with a 'zero' offset
ZoneId.from(anInstant) (probably) to produce a UTC ZoneId
ZonedDateTime.from(anInstant) (probably) to produce a ZonedDateTime with a UTC ZoneId
The documentation for ZonedDateTime.from, as I read it, appears to endorse this.
In fact ZoneOffset.from(anInstant) fails with DateTimeException, and I suppose for that reason OffsetDateTime.from(anInstant) also fails, as do the other two.
Is this the expected behaviour?
Short answer:
The JSR-310-designers don't want people to do conversions between machine time and human time via static from()-methods in types like ZoneId, ZoneOffset, OffsetDateTime, ZonedDateTime etc. This is explicitly specified if you carefully study the javadoc. Instead use:
OffsetDateTime#toInstant():Instant
ZonedDateTime#toInstant():Instant
Instant#atOffset(ZoneOffset):OffsetDateTime
Instant#atZone(ZoneId):ZonedDateTime
The problem with the static from()-methods is that otherwise people are able to do conversions between an Instant and for example a LocalDateTime without thinking about the timezone.
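A short sketch of the difference, using only java.time classes. The class and helper names are made up for the illustration, and Europe/Berlin is an arbitrary example zone.

```java
import java.time.DateTimeException;
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

final class InstantConversionSketch {
    /** Returns true if ZoneOffset.from rejects the given instant. */
    static boolean offsetFromFails(Instant instant) {
        try {
            ZoneOffset.from(instant);
            return false;
        } catch (DateTimeException e) {
            return true; // an Instant exposes no offset to query
        }
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2014-05-13T20:05:08.556Z");

        System.out.println("ZoneOffset.from fails: " + offsetFromFails(instant));

        // Machine time to human time: the offset or zone is named explicitly.
        OffsetDateTime atUtc = instant.atOffset(ZoneOffset.UTC);
        ZonedDateTime inBerlin = instant.atZone(ZoneId.of("Europe/Berlin"));
        System.out.println(atUtc);
        System.out.println(inBerlin);

        // Human time back to machine time: no zone argument is needed.
        System.out.println(atUtc.toInstant().equals(instant));
        System.out.println(inBerlin.toInstant().equals(instant));
    }
}
```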
Long answer:
Whether to consider an Instant as counter or as field tuple, the answer given by JSR-310-team was a strict separation between so-called machine time and human time. Indeed they intend to have a strict separation - see their guidelines. So finally they want Instant to be only interpreted as a machine time counter. So they intentionally have a design where you cannot ask an Instant for fields like year, hour etc.
But indeed, the JSR-310-team is not consistent at all. They have implemented the method Instant.toString() as a field tuple view including year, ..., hour, ... and offset-symbol Z (for UTC-timezone) (footnote: Outside of JSR-310 this is quite common to have a field-based look on such machine times - see for example in Wikipedia or on other sites about TAI and UTC). Once the spec lead S. Colebourne said in a comment on a threeten-github-issue:
"If we were really hard line, the toString of an Instant would simply be the number of seconds from 1970-01-01Z. We chose not to do that, and output a more friendly toString to aid developers, But it doesn't change the basic fact that an Instant is just a count of seconds, and cannot be converted to a year/month/day without a time-zone of some kind."
People can like this design decision or not (like me), but the consequence is that you cannot ask an Instant for year, ..., hour, ... and offset. See also the documentation of supported fields:
NANO_OF_SECOND
MICRO_OF_SECOND
MILLI_OF_SECOND
INSTANT_SECONDS
Here it is interesting what is missing, above all a zone-related field is missing. As a reason, we often hear the statement that objects like Instant or java.util.Date have no timezone. In my opinion this is a too simplistic view. While it is true that these objects have no timezone state internally (and there is also no need for having such an internal value), those objects MUST be related to UTC timezone because this is the basis of every timezone offset calculation and conversion to local types. So the correct answer would be: An Instant is a machine counter counting the seconds and nanoseconds since UNIX epoch in timezone UTC (per spec). The last part - relationship to UTC zone - is not well specified by JSR-310-team but they cannot deny it. The designers want to abolish the timezone aspect from Instant because it looks human-time-related. However, they can't completely abolish it because that is a fundamental part of any internal offset calculation. So your observation regarding
"Also an Instant.atZone(zone) and an Instant.atOffset(offset) both produce a value that is consistent with treating the Instant as having an implied UTC time-zone/'zero' offset."
is right.
While it might be very intuitive that ZoneOffset.from(anInstant) might produce ZoneOffset.UTC, it throws an exception because its from()-method searches for a non-existent OFFSET_SECONDS-field. The designers of JSR-310 have decided to do that in the specification for the same reason, namely to make people think that an Instant has officially nothing to do with UTC timezone i.e. "has no timezone" (but internally they must accept this basic fact in all internal calculations!).
For the same reason, OffsetDateTime.from(anInstant) and ZoneId.from(anInstant) fail, too.
About ZonedDateTime.from(anInstant) we read:
"The conversion will first obtain a ZoneId from the temporal object, falling back to a ZoneOffset if necessary. It will then try to obtain an Instant, falling back to a LocalDateTime if necessary. The result will be either the combination of ZoneId or ZoneOffset with Instant or LocalDateTime."
So this conversion will fail again due to the same reasons because neither ZoneId nor ZoneOffset can be obtained from an Instant. The exception message reads as:
"Unable to obtain ZoneId from TemporalAccessor: 1970-01-01T00:00:00Z of type java.time.Instant"
Finally we see that all static from()-methods suffer from being unable to do a conversion between human time and machine time even if this looks intuitive. In some cases a conversion between let's say LocalDate and Instant is questionable. This behaviour is specified, but I predict that your question is not the last question of this kind and many users will continue to be confused.
The real design problem in my opinion is that:
a) There should not be a sharp separation between human time and machine time. Temporal objects like Instant should better behave like both. An analogy in quantum mechanics: You can view an electron both as a particle and a wave.
b) All static from()-methods are too public. That is, they are too easily accessible in my opinion and should better have been removed from the public API or use more specific arguments than TemporalAccessor. The weakness of these methods is that people can forget to think about related timezones in such conversions because they start the query with a local type. Consider for example: LocalDate.from(anInstant) (in which timezone???). However, if you directly ask an Instant for its date like instant.getDate(), personally I would consider the date in UTC-timezone as a valid answer because here the query starts from a UTC perspective.
c) In conclusion: I absolutely share with the JSR-310-team the good idea to avoid conversions between local types and global types like Instant without specifying a timezone. I just differ when it comes to the API-design to prevent users from doing such a timezone-less conversion. My preferred way would have been to restrict the from()-methods rather than saying that global types should not have any relation to human-time-formats like calendar-date or wall-time or UTC-timezone-offset.
Anyway, this (inconsistent) design of separation between machine time and human time is now set in stone due to preserving backward compatibility, and everyone who wants to use the new java.time-API has to live with it.
Sorry for a long answer, but it is pretty tough to explain the chosen design of JSR-310.
The Instant class does not have a time zone. It gets printed as if it were in the UTC zone because it has to be printed in some zone (you wouldn't want it to be printed in ticks, would you?), but that's not its time zone, as demonstrated in the following example:
Instant instant = Instant.now();
System.out.println(instant); // prints 2014-05-14T06:18:48.649Z
System.out.println(instant.atZone(ZoneId.of("UTC"))); // prints 2014-05-14T06:18:48.649Z[UTC]
The signatures of ZoneOffset.from and OffsetDateTime.from accept any temporal object, but they fail for some types. java.time does not seem to have an interface for temporals that have a timezone or offset. Such an interface could have declared getOffset and getZone. Since we don't have such an interface, these methods are declared separately in multiple places.
If you had a ZonedDateTime you could call its getOffset. If you had an OffsetDateTime you could also call its getOffset, but these are two different getOffset methods, as the two classes don't get that method from a common interface. That means that if you have a Temporal object that could be either, you would have to test whether it is an instanceof each of them to get the offset.
OffsetDateTime.from simplifies the process by doing that for you - but since it also can't rely on a common interface, it has to accept any Temporal object and throw an exception for those that don't have an offset.
Just an example w.r.t. conversions: I believe some folks will get the exception below
(java.time.DateTimeException: Unable to obtain LocalDateTime from
TemporalAccessor: 2014-10-24T18:22:09.800Z of type java.time.Instant)
if they try -
LocalDateTime localDateTime = LocalDateTime.from(new Date().toInstant());
To resolve the issue, pass in a zone -
LocalDateTime localDateTime = LocalDateTime.from(new Date().toInstant().atZone(ZoneId.of("UTC")));
The key to getting a LocalDateTime from an Instant is to provide the system's time zone:
LocalDateTime.ofInstant(myDate.toInstant(), ZoneId.systemDefault())
As a side note, beware of multi-zone cloud servers as the timezone will surely change.

Memory usage of Date object

I have a String which is of the form: 2013-10-20 15:18:39.954. I am trying to figure out the best way to store this data so that it uses the least possible memory. Currently I am storing it as a Date object in Java. From this link I found out that the object uses about 32 bytes of memory. Is there some way to use less memory to store this data? I am trying to use as little memory as possible, so even 1 byte less would be fine.
I was thinking that I could use a String but this link says Strings also use a lot of memory. Any help would be appreciated!
You can convert the Date to a long with the getTime() method (milliseconds since epoch) and back, and a long is 8 bytes.
A java.util.Date is just a long under the hood. You could represent the same information with a long and build Dates as necessary.
Just beware that storing information this way, either java.util.Date or long, is not fully portable across platforms if you convert it back to human readable representation.
The number of milliseconds since epoch that, for example, translates to 'Midnight June 18th, 2019' can change if a platform upgrade introduces new leap seconds, or daylight savings rules change.
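As a rough sketch of the long-based approach for the string format in the question: the code below assumes the text denotes a UTC wall-clock time, which the question does not state, so substitute whatever zone the data really uses. The class and method names are illustrative only.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class CompactTimestampSketch {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    /** Parses the text, interpreted as UTC (an assumption), into epoch milliseconds. */
    static long toEpochMillis(String text) {
        return LocalDateTime.parse(text, FORMAT)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

    /** Rebuilds the original text from the stored long, again using UTC. */
    static String toText(long epochMillis) {
        LocalDateTime utc =
                LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
        return FORMAT.format(utc);
    }

    public static void main(String[] args) {
        long stored = toEpochMillis("2013-10-20 15:18:39.954"); // 8 bytes as a primitive
        System.out.println(stored);
        System.out.println(toText(stored));
    }
}
```

Keeping many such values in a long[] avoids the per-object overhead entirely; a Date or String is created only when one is actually needed.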

Time stamp class in org.apache.commons.net package

I want to use timestamps as part of my application (I am using Java). I was asked to use the Network Time Protocol (NTP). I searched on Google and found the package "org.apache.commons.net", which contains a TimeStamp class. I have gone through this link to learn more about the class.
What should I pass to the constructors of this class (what is the significance of each constructor)? I expected the TimeStamp class to return a timestamp, but instead it asks me to supply one. I am confused by that.
You can use the following constructor overload to create the TimeStamp object:
public TimeStamp(Date d)
Pass a java.util.Date object as the argument.
This will give you a timestamp value which is represented as a 64-bit unsigned fixed-point number in seconds relative to 0-hour on 1-January-1900.
The main significance is that NTP is a protocol, a standard followed by different systems. Different systems on a network may not have their clocks synchronized, may not understand how the others measure time, and may follow different time zones. Using NTP, they synchronize their clocks to UTC.
You can use the static getCurrentTime() to get a timestamp that represents the current time measured to the nearest millisecond:
TimeStamp myTs = TimeStamp.getCurrentTime();
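To show what "64-bit unsigned fixed-point number in seconds relative to 1900" means in practice, here is a dependency-free sketch of the conversion between Java epoch milliseconds and that layout. It is not the Commons Net implementation: the class and method names are hypothetical, and it deliberately ignores the NTP era rollover in 2036, so it is only meant for dates between 1970 and early 2036.

```java
final class NtpTimestampSketch {
    // Seconds between 1900-01-01T00:00:00Z (NTP epoch) and 1970-01-01T00:00:00Z (Java epoch).
    static final long SECONDS_1900_TO_1970 = 2_208_988_800L;

    /** Packs epoch milliseconds into 32 bits of seconds and 32 bits of binary fraction. */
    static long toNtp(long epochMillis) {
        long seconds = Math.floorDiv(epochMillis, 1000L) + SECONDS_1900_TO_1970;
        long millis = Math.floorMod(epochMillis, 1000L);
        long fraction = (millis << 32) / 1000L; // millis / 1000 expressed in units of 2^-32 s
        return (seconds << 32) | fraction;
    }

    /** Unpacks the 64-bit value back into epoch milliseconds. */
    static long toEpochMillis(long ntp) {
        long seconds = (ntp >>> 32) - SECONDS_1900_TO_1970; // upper 32 bits, unsigned
        long fraction = ntp & 0xFFFFFFFFL;                  // lower 32 bits
        long millis = Math.round(fraction * 1000.0 / 4294967296.0);
        return seconds * 1000L + millis;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long ntp = toNtp(now);
        System.out.println(Long.toHexString(ntp));
        System.out.println(toEpochMillis(ntp) == now);
    }
}
```

This is also why the constructors take a value rather than produce one: the class is a representation of a point in time in NTP's format, and getCurrentTime() is the convenience for "now".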

What system default date format to use?

I'm setting the standards for our application.
I've been wondering, what default date format should I choose to use ?
It should be:
Internationalization & timezone aware, the format should be able to represent user local time
Can be efficiently parsed by SimpleDateFormat (or alike, JDK classes only)
Programming Language agnostic (can parse in java, python, god forbid C++ :) and co.)
Preferably ISO based or other accepted standard
Easy to communicate over HTTP (Should such need arises, JSON or YAML or something in this nature)
Can represent time down to seconds resolution (the more precise the better, micro seconds if possible).
Human readable is a plus but not required
Compact is a plus but not required
Thank you,
Maxim.
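yyyy-MM-ddTHH:mmZ (see ISO 8601). You can add seconds, etc.
You can read it easily, and it will not be a problem for SimpleDateFormat. Note that in a SimpleDateFormat pattern the literal T must be quoted and HH is the 24-hour field (hh is the 12-hour one): yyyy-MM-dd'T'HH:mmZ. The pattern letter Z writes an RFC 822 offset such as +0000; since Java 7 the letter X writes ISO 8601 offsets.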
The most canonical and standard form is probably "Unix Time": The number of seconds elapsed since midnight Coordinated Universal Time (UTC) of January 1, 1970.
If you set that as the default time-format you can easily parse it, store it in memory, write it to disk, easily communicate it over HTTP and so on. It is also definitely an accepted standard, and in a sense it is "time-zone aware", since it is well-defined regardless of time-zones.
(This is the format in which I always store all my time stamps; in databases, in memory, on disk, ...)
The "right" default format really depends on what you're doing with it. The formats for parsing, storing, and displaying can all be different.
For storing the date you're (almost) always going to want to use UTC as aioobe says, even when you want to display it in user local time. I say "(almost)" but I really can't think of a case where I would not want UTC for a saved date. You may want to store the TZ information for where the date originated also, so you can report it in that local time, but more often you want to display the local time for whoever is currently looking at the date. That means having a way to determine the current user's local time regardless of what the original local time was.
For displaying it, the "default format" should usually be determined by the viewer's locale. 08/09/10 usually means 2010-Aug-9 in the U.S. ("Middle endian") but normally means 2010-Sep-8 in most of the rest of the world ("Little endian"). The ISO-8601 format "2010-09-10" is safe and unambiguous but often not what people expect to see. You can also look over RFC-3339 for Date and Time on the internet and RFC-2822 for message format (transmitting the date).
For parsing a date, you'll want to parse it and convert it to UTC, but you should be fairly flexible on what you accept. Again, the end user's locale and timezone, if discoverable, can help you determine what format(s) of string to accept as input. This is assuming user-typed strings. If you're generating a date/time stamp you can control the form and parsing will be no problem.
I also second BalusC link which I hadn't seen before and have now favorited.
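The store/transmit/display split described above might look like the following sketch. It assumes Java 8's java.time is available (it ships with the JDK, so it still satisfies "JDK classes only"); the class name, method names, and the example zone are illustrative.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

final class DateFormatPolicySketch {
    /** Transport and storage form: ISO 8601 in UTC, language agnostic. */
    static String toWire(Instant stored) {
        return DateTimeFormatter.ISO_INSTANT.format(stored);
    }

    /** Parses the transport form back into a point on the timeline. */
    static Instant fromWire(String text) {
        return Instant.parse(text);
    }

    /** Display form: depends on the viewer's zone and locale, never stored. */
    static String toDisplay(Instant stored, ZoneId viewerZone, Locale viewerLocale) {
        return DateTimeFormatter.ofLocalizedDateTime(FormatStyle.SHORT)
                .withLocale(viewerLocale)
                .format(stored.atZone(viewerZone));
    }

    public static void main(String[] args) {
        Instant stored = fromWire("2010-09-08T12:30:00Z");
        ZoneId zone = ZoneId.of("America/Los_Angeles");

        System.out.println(toWire(stored));
        // The same instant rendered for two different viewers.
        System.out.println(toDisplay(stored, zone, Locale.US));
        System.out.println(toDisplay(stored, zone, Locale.UK));
    }
}
```

The exact text produced by the localized formatter varies between JDK versions, which is one more reason to keep it out of storage and wire formats.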
