DOC: timezone warning for DST beyond 2038-01-18 #33863

Merged
merged 5 commits into from
May 10, 2020
Changes from 3 commits
14 changes: 14 additions & 0 deletions doc/source/user_guide/timeseries.rst
@@ -2265,6 +2265,20 @@ you can use the ``tz_convert`` method.
Instead, the datetime needs to be localized using the ``localize`` method
on the ``pytz`` time zone object.

.. warning::

   If you are using dates beyond 18 Jan 2038, note that pandas does not apply daylight saving time adjustments to timezone aware dates. This is partly because the underlying libraries do not currently address the Year 2038 Problem, and partly because there is some discussion on how reliable any DST settings that far into the future will be.
Contributor

The second part, "and partly because there is some discussion on how reliable any DST settings that far into the future will be", sounds vague. There are not just discussions. It seems arbitrary to start raising this concern in 2038. With so many jurisdictions in the world, some will change their time zones earlier. For instance, the EU is on track to abolish DST in the 2020s. I think this sentence should be left out because it's a separate issue.

Or that half-sentence could be put in a separate warning but I think that would be overkill since this issue is not Pandas-specific. Our docs will be very cluttered if we warn about every generic issue that might occur with date and time handling.

Contributor Author

IIUC it is only DST transitions that are affected, and only after 2038. So any changes up to 2038 will be reflected correctly in construction of a tz aware time ... have I understood you OK? The exact sentence you quoted has been removed.

Contributor

No, changes will not be reflected correctly until 2038 because you forgot to import crystalball (and even if you had imported it you'd find it can't be used to read the future). However, I think the warning should only address the 2038 issue. One could make an extra warning for the general issue of real-world timezones being unpredictable in the future but I don't think that's necessary because that's what people expect.

The new sentence, "It should be noted though, that time zone data for far future time zones are likely to be inaccurate, as they are simple extrapolations of the current set of (regularly revised) rules.", is misleading and confusing. One can't even predict timezone switches a day in advance, as we have seen in 2018 in Morocco, let alone 18 years.

Also, the example you give is only about the 2038 problem. Especially in Britain it's likely that they abolish daylight saving together with the EU before 2038. The quoted sentence just makes it a little bit harder for people to follow that example.

Contributor Author

yes, it is quite possible that UK will stick to permanent DST some time in the next 5 years, but if that happens, the underlying libraries as they are will support that - the changes will be made in pre-2038 dates. I can't see what the problem is!

Contributor

> yes, it is quite possible that UK will stick to permanent DST some time in the next 5 years, but if that happens, the underlying libraries as they are will support that - the changes will be made in pre-2038 dates.

The same argument can be made for post-2038 dates. Certainly the underlying libraries will be fixed before 2038, so the only time this would ever come up is if you're trying to convert a local time in the far future into a timestamp (either in UTC or find out what offset applies), which is basically something that cannot be known to be accurate.

This is my problem with the whole idea of adding this warning: it's saying, "If you are trying to do this thing you shouldn't do, the answer might be different from what you expect." It may be that the zone you're in has eliminated DST by then, in which case the answer is right and the "correct" rule is the one that's wrong. It's not a problem particular to pandas, and it's not easy to convey to end users what the problem is and what should be done about it, so a warning in the documentation doesn't seem like a good fit.

I still don't like the focus on 2038 as the cut-off point, because it makes it seem like adding version 2 support to the underlying libraries will fix the problem, but in reality the problem is that the users are trying to do this at all, and they just notice something out of the ordinary after 2038.
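
For context on the 2038 cut-off discussed here: the Year 2038 Problem usually refers to signed 32-bit Unix timestamps running out. Below is a minimal standard-library sketch of where that limit falls. It assumes (the thread does not spell this out) that the transition data in the underlying libraries is bounded by 32-bit values; `INT32_MAX` and `last_32bit_instant` are illustrative names, not pandas or pytz identifiers.

```python
from datetime import datetime, timezone

# Largest value a signed 32-bit integer can hold, read as seconds since
# the Unix epoch (1970-01-01T00:00:00Z). Assumed to be the relevant limit.
INT32_MAX = 2**31 - 1

# Convert that second count to an aware UTC datetime.
last_32bit_instant = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)

print(last_32bit_instant.isoformat())  # 2038-01-19T03:14:07+00:00
```

Under that assumption, any DST transition later than this instant cannot be stored as a 32-bit transition time, which would explain why the behaviour changes in early 2038 rather than at some other date.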

Contributor

@pganssle you're essentially saying that people shouldn't use timezones on any far future dates that they deal with. (side note: I disagree with the word far, the future is unpredictable, far or near.) This is unrealistic. While many applications could rely on UT, UTC, or system time, I'm sure you can come up with many use cases where developers need to deal with time zones in future dates (e.g. when opening hours, TV schedules or flight schedules are involved).

Obviously, nobody can guarantee that future time zones will be correct. However, developers expect a predictable behaviour of the library. Now, the 2038 issue breaks that predictability. It's common sense that the future cannot be predicted, so there is no need to warn about potential future political changes. However, developers should be warned if code yields unexpected results.

That's why we should warn developers about the 2038 issue (as in the PR title). And the following sentence should be removed because it has nothing to do with the 2038 issue: "It should be noted though, that time zone data for far future time zones are likely to be inaccurate, as they are simple extrapolations of the current set of (regularly revised) rules."

@pganssle if you think we need to warn developers about using timezones when dealing with future dates, this should go in a separate pull request.

Contributor

> @pganssle you're essentially saying that people shouldn't use timezones on any far future dates that they deal with. (side note: I disagree with the word far, the future is unpredictable, far or near.) This is unrealistic. While many applications could rely on UT, UTC, or system time, I'm sure you can come up with many use cases where developers need to deal with time zones in future dates (e.g. when opening hours, TV schedules or flight schedules are involved).

No, I'm saying that you should only use "time zones" in future dates and that conversion to UTC is increasingly unreliable the further into the future you go. 18 years is a long way into the future, so this is like saying, "Don't forget to bring a bathing suit if you jump into shark-infested water!"

It's somewhat misleading to include a warning like this without explaining that yes it's not what you might expect but it's basically not a problem at the moment, because if you are relying on accuracy in this situation you have bigger problems. It also implies that dates in December 2037 can be used somewhat accurately.

Contributor

Okay. I'm convinced now. We should warn about converting (any) future time. I still think this warning shouldn't be an afterthought in the 2038 warning text. It should be its own warning box. That gives it the prominence it deserves and ensures the warning remains if the 2038 problem is resolved in the downstream libraries and that box is removed.

Something along the lines of:

> Be aware that for times in the future, correct conversion between time zones (and UTC) cannot be guaranteed by any time zone library. Sometimes the rules governing a timezone's offset from UTC are changed. Authorities usually announce such changes many months in advance, but there have been examples of much shorter lead times, such as when Morocco announced just two days before the planned switch from summer time to winter time in 2018 that the country would stay on summer time permanently. Furthermore, the databases that Pandas relies on may need some time to record planned changes to timezone offsets.

Contributor

I made a PR to add this text: #34100


For example, for two dates that are in British Summer Time and so would normally be GMT+1, both the following asserts evaluate as true:

.. ipython:: python

   d_2037 = '2037-03-31T010101'
   d_2038 = '2038-03-31T010101'
   DST = 'Europe/London'
Contributor

Why is this called DST instead of LON or some other thing?

Contributor Author

because I am focusing on DST transitions - just happen to have picked London as that's local

Contributor

People will think it means "this is the Daylight Saving Time zone". You don't have to call it LON, but you should at least call it ZONE or something.

   assert pd.Timestamp(d_2037, tz=DST) != pd.Timestamp(d_2037, tz='GMT')
Contributor

This does not seem right to me.

If you're going to keep the example (and I'm not convinced you should, especially since it's right around when a DST transition happens - if anything you should move it deep into summer to guarantee that the fluctuations aren't just due to the transition moving around), it would be better to make assertions about the thing you care about.

assert pd.Timestamp(d_2037, tz=LON).tzname() != "GMT"
assert pd.Timestamp(d_2038, tz=LON).tzname() != "GMT"

Even better, though, would be a repr:

>>> pd.Timestamp(d_2037, tz=LON).tzname()
'BST'
>>> pd.Timestamp(d_2038, tz=LON).tzname()
'GMT'

Contributor Author (@telferm57, May 5, 2020)

I agree summer date would be clearer, but not so sure whether your examples are clearer, or whether using BST (local zone name for DST) is clearer for a global audience ... hmmm

Contributor

It doesn't really matter what offsets you use there, pick anything. The important thing is that it's obvious that they are different on the same date in different years, and that it's obvious that that's not due to fluctuations one way or the other in the date of the DST change.
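
The check described above (same mid-summer date, different years, well away from any transition) could be sketched as follows. This is only an illustration: it uses the standard-library `zoneinfo` module rather than pandas or pytz, so it shows the shape of the comparison and not the pandas behaviour under review, and what it prints depends on the time zone data installed. The helper name `midsummer_tznames` and the choice of 15 July at noon are assumptions made for this sketch.

```python
from datetime import datetime
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def midsummer_tznames(zone_name, years):
    """Return {year: tzname} for noon on 15 July of each year.

    Returns an empty dict when the zone cannot be loaded, for example
    when no time zone database is installed on the system.
    """
    try:
        zone = ZoneInfo(zone_name)
    except ZoneInfoNotFoundError:
        return {}
    # 15 July is far from the spring and autumn transitions, so a change
    # in the abbreviation cannot come from the transition date shifting.
    return {
        year: datetime(year, 7, 15, 12, 0, tzinfo=zone).tzname()
        for year in years
    }


names = midsummer_tznames("Europe/London", [2037, 2038])
print(names)
```

If the two years report different abbreviations, the difference comes from how the library handles dates past its last stored transition, not from the date of the DST change itself.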

Contributor Author

Is it OK to submit further changes once the PR has been approved?

Member

Yes you can submit further changes. I think @pganssle's suggestion about using the repr would be clearer.

   assert pd.Timestamp(d_2038, tz=DST) == pd.Timestamp(d_2038, tz='GMT')

Under the hood, all timestamps are stored in UTC. Values from a time zone aware
:class:`DatetimeIndex` or :class:`Timestamp` will have their fields (day, hour, minute, etc.)
localized to the time zone. However, timestamps with the same UTC value are