Timezones are a well-documented quagmire in programming. At Paloma, I write a lot of code that has to do with how long ago things happened, and in what order, which means I do a lot of timestamp comparison. Python's approach to timezones is designed to make this process as painless as possible, and it (mostly) succeeds. Python achieves this by splitting datetimes into two different types:
- One type of datetime is the naive type. Naive datetimes have no concept of what timezone they are in, and they only have meaning relative to each other. This works really well if all of your times come from the same timezone (for instance, if you are only using your local machine time for a CLI).
- The other type is the aware type. The Python documentation describes an aware datetime as having:
"sufficient knowledge of applicable algorithmic and political time adjustments, such as time zone and daylight saving time information, to locate itself relative to other aware objects."
The catch is that you can't mix aware and naive datetimes. An equality check between the two always evaluates to False, and an ordering comparison such as < raises the following error:
TypeError: can't compare offset-naive and offset-aware datetimes
So you should pick one of these strategies and stick with it, wherever possible.
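To make the failure mode concrete, here is a minimal sketch. It uses the standard library's timezone.utc in place of pytz.utc so that it runs without extra dependencies, and the variable names are purely illustrative. Ordering comparisons and subtraction raise, while == quietly evaluates to False, which is arguably the more dangerous of the two behaviours.

```python
from datetime import datetime, timezone

naive = datetime(2018, 11, 15, 3, 24, 18)                       # no tzinfo
aware = datetime(2018, 11, 15, 3, 24, 18, tzinfo=timezone.utc)  # tzinfo set

# Equality between a naive and an aware datetime never raises; it is just False.
print(naive == aware)

# Ordering comparisons raise TypeError.
try:
    naive < aware
    ordering_error = None
except TypeError as exc:
    ordering_error = str(exc)
print(ordering_error)

# Subtraction raises TypeError as well, with a slightly different message.
try:
    naive - aware
    subtraction_error = None
except TypeError as exc:
    subtraction_error = str(exc)
print(subtraction_error)
```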
At Paloma, we almost never use aware datetimes on the server side. We store all datetimes in UTC and only do timezone logic when we need to display something to the user. This approach generally dodges all of the possible timezone gotchas, but I recently hit a bug that made me second-guess my sanity.
We often get events with timestamps in Unix time, so we have to convert between Unix timestamps and Python datetimes. As an example, consider this code, which converts a datetime into a Unix timestamp:
>>> from datetime import datetime
>>> now = datetime.utcnow()
>>> now
datetime.datetime(2018, 11, 15, 3, 24, 18, 997071)
>>> now.timestamp()
1542223458.997071
Now, I've just gotten a timestamp from a datetime that was created by calling Python's datetime.utcnow(). I would expect that if we treat this as a UTC timestamp, we would get back the same datetime we started with. Let's try it:
>>> import pytz
>>> datetime.fromtimestamp(now.timestamp(), pytz.utc)
datetime.datetime(2018, 11, 14, 19, 24, 18, 997071, tzinfo=<UTC>)
Uh oh, this time is different! What's going on?
The tricky part is this: utcnow() returns the current UTC time as a naive object, but timestamp() needs to know whether the naive object is in UTC or your local timezone in order to work. For a naive datetime, it assumes your local timezone, which is a major problem if your machine isn't set to UTC and you try to round-trip your timestamp like above.
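The size of that discrepancy can be measured directly. The sketch below is only an illustration: local_skew_seconds is a hypothetical helper name, and it uses the standard library's timezone.utc rather than pytz.utc. It reads the same naive fields once as local time and once as UTC and returns the difference.

```python
from datetime import datetime, timezone


def local_skew_seconds(naive_dt):
    """Seconds by which naive_dt.timestamp() differs from the UTC reading."""
    as_local = naive_dt.timestamp()  # naive fields interpreted as local time
    as_utc = naive_dt.replace(tzinfo=timezone.utc).timestamp()  # same fields read as UTC
    return as_local - as_utc


naive = datetime(2018, 11, 15, 3, 24, 18)
# Zero on a machine set to UTC; otherwise the negated local UTC offset in seconds.
print(local_skew_seconds(naive))
```

On a server configured for UTC the result is zero, which is exactly why this kind of bug can stay hidden until the code runs on a developer laptop.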
I solved this by adding an additional step any time we need to convert between naive datetimes and timestamps: converting to an aware datetime positioned in UTC. Basically the process now looks like:
Naive Datetime -> Aware Datetime (UTC) -> Timestamp
And the reverse when going the other way:
Timestamp -> Aware Datetime (UTC) -> Naive Datetime
And, to put it into Python code:
>>> from datetime import datetime
>>> import pytz
>>> now = datetime.now(pytz.utc)
>>> now
datetime.datetime(2018, 11, 15, 11, 57, 33, 320996, tzinfo=<UTC>)
>>> now.timestamp()
1542283053.320996
>>> datetime.fromtimestamp(now.timestamp(), pytz.utc)
datetime.datetime(2018, 11, 15, 11, 57, 33, 320996, tzinfo=<UTC>)
Of course, it would be setting our code base up for failure if we just had to remember to do this process every time, so I added a couple of helper functions:
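import pytz
from datetime import datetime


def datetime_from_timestamp(ts):
    # Read the Unix timestamp as UTC, then drop tzinfo to get a naive UTC datetime.
    return datetime.fromtimestamp(ts, pytz.utc).replace(tzinfo=None)


def timestamp_from_datetime(dt):
    # Label the naive datetime as UTC so timestamp() doesn't assume local time.
    return dt.replace(tzinfo=pytz.utc).timestamp()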
Now we can easily round-trip things between formats without changing the moment in time we are referencing:
>>> from datetime import datetime
>>> now = datetime(2018, 11, 15, 11, 57, 33, 320996)  # a naive datetime in UTC
>>> timestamp_from_datetime(now)
1542283053.320996
>>> datetime_from_timestamp(timestamp_from_datetime(now))
datetime.datetime(2018, 11, 15, 11, 57, 33, 320996)
>>> assert datetime_from_timestamp(timestamp_from_datetime(now)) == now
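One caveat with helpers like these: replace(tzinfo=...) relabels a datetime rather than converting it, so an aware datetime in some other timezone passed to timestamp_from_datetime would be shifted silently. A possible hardening is to reject aware inputs outright. The sketch below is an assumption about how one might do that, not part of the helpers above; the strict_ names are hypothetical, and it uses the standard library's timezone.utc instead of pytz.utc.

```python
from datetime import datetime, timezone


def strict_timestamp_from_datetime(dt):
    """Convert a naive datetime, assumed to be in UTC, into a Unix timestamp."""
    if dt.tzinfo is not None:
        # replace() would overwrite the existing zone without converting the time.
        raise ValueError("expected a naive UTC datetime, got an aware one")
    return dt.replace(tzinfo=timezone.utc).timestamp()


def strict_datetime_from_timestamp(ts):
    """Convert a Unix timestamp into a naive datetime expressed in UTC."""
    return datetime.fromtimestamp(ts, timezone.utc).replace(tzinfo=None)


naive_utc = datetime(2018, 11, 15, 11, 57, 33, 320996)
ts = strict_timestamp_from_datetime(naive_utc)
round_tripped = strict_datetime_from_timestamp(ts)
print(round_tripped == naive_utc)
```

Failing loudly on aware input keeps the "naive means UTC" convention explicit at the one place where it matters.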
So, to recap: most of the time, it is possible to use naive datetimes everywhere, and everything is grand. However, you should be careful when converting between naive datetimes and other representations of time like timestamps, because the conversion might not behave the way you expect! It's much safer to make an intermediate form that is timezone-aware when doing these kinds of conversions, to make sure that no timezone information is being assumed.