@khwilliamson
Contributor

tl;dr:

Fixes #23878

I botched this in Perl 5.42. These conditional compilation statements in locale.c were just plain wrong, causing code to be skipped that should have been compiled. It only affected the few hours of the year when daylight savings time is removed, so that the hour value is repeated. We didn't have a good test for that.

gory details:

libc uses 'struct tm' to hold information about a given instant in time, containing fields for things like the year, month, hour, etc. The libc function mktime() is used to normalize the structure, adjusting, say, an input Nov 31 to be Dec 01.
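The normalization described above can be sketched in a few lines of C. This is only an illustration of the libc behavior, not code from locale.c; the helper name normalize_date() and its 1-based month argument are hypothetical.

```c
#include <assert.h>
#include <time.h>

/* Normalize a possibly out-of-range calendar date with libc mktime().
 * 'month' is 1-based here (hypothetical convenience; struct tm uses 0-11).
 * Returns 0 on success, -1 if mktime() could not represent the time. */
int normalize_date(int year, int month, int mday, struct tm *out)
{
    assert(out != NULL);

    struct tm tm = {0};
    tm.tm_year  = year - 1900;  /* years since 1900 */
    tm.tm_mon   = month - 1;    /* 0 = January */
    tm.tm_mday  = mday;         /* may be out of range, e.g. Nov 31 */
    tm.tm_hour  = 12;           /* noon, well away from any DST change */
    tm.tm_isdst = -1;           /* let libc work out whether DST applies */

    if (mktime(&tm) == (time_t) -1)
        return -1;

    *out = tm;                  /* fields are now normalized */
    return 0;
}
```

Calling normalize_date(2025, 11, 31, &tm) should leave tm.tm_mon == 11 and tm.tm_mday == 1, that is, Dec 01.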

One of the fields in the structure, 'is_dst' (spelled tm_isdst in C), indicates if daylight savings is in effect, or whether that fact is unknown (a negative value). If unknown, mktime() is supposed to calculate the answer and to change 'is_dst' accordingly. Some implementations appear to always do this calculation even when the input value says the result is known. Others appear to honor it.
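A minimal sketch of the "unknown" case, assuming a POSIX system with setenv() and tzset(). The helper name dst_flag_for() is hypothetical, and the explicit rule string used in the usage note is the one suggested later in this thread, not anything from locale.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Ask libc whether DST applies to a local wall-clock time in timezone 'tz'.
 * The DST flag is passed in as "unknown" (-1), so mktime() must compute it.
 * Returns the tm_isdst value mktime() settled on, or -1 on error.
 * Note: this changes the process-wide TZ setting. */
int dst_flag_for(const char *tz, int year, int month, int mday, int hour)
{
    assert(tz != NULL);

    if (setenv("TZ", tz, 1) != 0)
        return -1;
    tzset();                    /* make libc re-read TZ */

    struct tm tm = {0};
    tm.tm_year  = year - 1900;
    tm.tm_mon   = month - 1;    /* caller passes 1-12 */
    tm.tm_mday  = mday;
    tm.tm_hour  = hour;
    tm.tm_isdst = -1;           /* unknown: libc decides */

    if (mktime(&tm) == (time_t) -1)
        return -1;

    return tm.tm_isdst;         /* positive if DST, 0 if not */
}
```

With TZ="PST8PDT,M3.2.0,M11.1.0", a mid-July noon should come back positive and a mid-January noon should come back 0.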

Some libc implementations have extra fields in 'struct tm' (tm_gmtoff and tm_zone, for example).

Perl has a stripped down version of mktime(), called mini_mktime(), written by Larry Wall a long time ago. I don't know why. This crippled version ignores locale and daylight time. It also doesn't know about the extra fields in 'struct tm' that some implementations have. Nor can it be extended to know about those fields, as they are dependent on timezone and daylight time, which it deliberately doesn't consider.

The botched #ifdef's were supposed to compensate for both the extra fields in the struct and that some libc implementations always recalculate 'is_dst'.

On systems with these fields, the botched #if's caused only mini_mktime() to be called. This meant that these extra fields didn't get populated, and daylight time was never considered to be in effect. And 'is_dst' did not get changed from the input.

On systems without these fields, the regular libc mktime() would be called appropriately.

The bottom line is that for the portion of the year when daylight savings is not in effect, things mostly worked properly. The two extra fields would not be populated, so if some code were to read them, it would only get the proper values by chance. We got no failure reports of this. I attribute that to the fact that the use of these is not portable, so code wouldn't tend to use them. There are portable ways to access the information they contain.
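One portable route to that information is strftime() with %z and %Z, instead of reading the extra fields directly. The sketch below uses it to show the repeated hour mentioned in the tl;dr: two instants an hour apart that render with the same wall-clock time but different offsets. The helper name renders_as() is hypothetical, the rule string is the explicit POSIX form suggested later in this thread, and the two time_t values are my own arithmetic for the 2025 fall-back, not values from the test suite.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Format 't' as local time in timezone 'tz' and compare with 'expected'.
 * Returns 1 on a match, 0 on a mismatch, -1 on error.
 * Note: this changes the process-wide TZ setting. */
int renders_as(const char *tz, time_t t, const char *expected)
{
    assert(tz != NULL && expected != NULL);

    if (setenv("TZ", tz, 1) != 0)
        return -1;
    tzset();

    struct tm tm;
    if (localtime_r(&t, &tm) == NULL)
        return -1;

    char buf[64];
    /* %z is the numeric UTC offset, %Z the zone abbreviation */
    if (strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S%z %Z", &tm) == 0)
        return -1;

    return strcmp(buf, expected) == 0;
}
```

Under TZ="PST8PDT,M3.2.0,M11.1.0", 1762070400 should render as "2025-11-02 01:00:00-0700 PDT" and 1762074000, one hour later, as "2025-11-02 01:00:00-0800 PST".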

Tests were failing for the portions of the year when daylight savings is in effect; see GH #22351. The code looked correct just reading it (not seeing the flaw in the #ifdef's), so I assumed that it was an issue in the libc implementations and instituted a workaround. (I can't now think of a platform where there hasn't been a problem with a libc with something regarding locales, so that was a reasonable assumption.)

Among other things, that workaround overrode the 'is_dst' field after the call to mini_mktime(), so that the value actually passed to libc strftime() indicated that daylight time was in effect.

What happens next depends on the libc strftime() implementation. It could conceivably itself call mktime() which might choose to override is_dst to be the correct value, and everything would always work. The more likely possibility is that it just takes the values in the struct as-is. Remember that those values on systems with the extra fields were calculated as if daylight savings wasn't in effect, but now we're telling strftime() to use those values as if it were in effect. This is a discrepancy. I'd have to trace through some libc implementations to understand why this discrepancy seems to not matter except at the transition time.

But the bottom line is this p.r. removes that discrepancy, and causes mktime() to be called appropriately on systems where it wasn't, so strftime() should now function properly. And the workarounds are also removed.

This regression fix should go into a maintenance release.

  • This set of changes requires a perldelta entry, and it is included.

@khwilliamson khwilliamson force-pushed the strftime branch 2 times, most recently from 61bab2c to 338b313 Compare November 16, 2025 18:34
@jkeenan
Contributor

jkeenan commented Nov 18, 2025

@khwilliamson, this p.r. had 2 failures in CI, one on Windows, one on Linux. Could you make a smoke-me branch out of this so that we can test it on some other platforms? Thanks.

@khwilliamson
Contributor Author

After a lot of effort (facilitated by @bram-perl), I have concluded that the i386 failure is due to a bug in its glibc.

< Return from localtime for 1741510799= tm_sec=59, tm_min=59, tm_hour=0, tm_mday=9, tm_mon=2, tm_year=125, isdst=0 gmtoff=-28800, tm_zone=PST
> Return from localtime for 1741510799= tm_sec=59, tm_min=59, tm_hour=1, tm_mday=9, tm_mon=2, tm_year=125, isdst=1 gmtoff=-25200, tm_zone=PDT

The call marked < is my 64-bit Linux box compiled with g++. The one marked > is i386 32-bit compiled with gcc. (The same results are obtained if I use gcc on my box.) Notice that the calls are passed identical inputs, but have different outputs. The i386 one improperly thinks daylight savings time is in effect, and adjusts the hour and offset from GMT accordingly.

Further, I note the man page for localtime on my box says something belied by the result, and not supported by either the C or POSIX standards. It says that daylight is set to 1 if the locale has dst in effect for any time of the year.

I'll investigate using gmtime instead to work around the libc issue here (and maybe on other platforms).

@jkeenan
Contributor

jkeenan commented Nov 19, 2025

After a lot of effort (facilitated by @bram-perl), I have concluded that the i386 failure is due to a bug in its glibc.

< Return from localtime for 1741510799= tm_sec=59, tm_min=59, tm_hour=0, tm_mday=9, tm_mon=2, tm_year=125, isdst=0 gmtoff=-28800, tm_zone=PST
> Return from localtime for 1741510799= tm_sec=59, tm_min=59, tm_hour=1, tm_mday=9, tm_mon=2, tm_year=125, isdst=1 gmtoff=-25200, tm_zone=PDT

The call marked < is my 64-bit Linux box compiled with g++. The one marked > is i386 32-bit compiled with gcc. (The same results are obtained if I use gcc on my box.) Notice that the calls are passed identical inputs, but have different outputs. The i386 one improperly thinks daylight savings time is in effect, and adjusts the hour and offset from GMT accordingly.

Further, I note the man page for localtime on my box says something belied by the result, and not supported by either the C or POSIX standards. It says that daylight is set to 1 if the locale has dst in effect for any time of the year.

I'll investigate using gmtime instead to work around the libc issue here (and maybe on other platforms).

Would we need to investigate this on other platforms (e.g., FreeBSD)? Or with other C compilers (e.g., clang)? If so, can you provide specific instructions?

@thesamesam

What version of glibc is this? Can you get a C testcase please?

@khwilliamson
Contributor Author

khwilliamson commented Nov 20, 2025

perl -V gives gnulibc_version='2.41' and libc=/lib/i386-linux-gnu/libc.so.6

@khwilliamson
Contributor Author

@jkeenan All major platforms have bugs with their locale handling; Darwin being the worst. I have written lots of workarounds for our code. And lots of skips for our tests for particular platforms.

This one just happened to show up. There's no telling what others are out there. This bug is just for a moment in time, and could easily be an entry in their database that is wrong, and we wouldn't know what other moments in time are also problematic. I'll see if using gmtime instead of localtime fixes this; I have my doubts. In any case, this came up due to a new test case I just added. I'm doing a smoke-me now.

@khwilliamson khwilliamson force-pushed the strftime branch 2 times, most recently from 2cc2507 to 0272c12 Compare November 21, 2025 02:56
@khwilliamson
Contributor Author

I earlier uploaded a .c file to reproduce the problem. I later found a bug in it. Here is a corrected version:
localtime_bug.c

Using it, I found that this libc thinks daylight time starts precisely 5 hours earlier than it actually does.
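For readers without the attachment, one way to locate where a libc believes daylight time starts is a binary search over time_t values. The sketch below is not the attached localtime_bug.c; the function name find_dst_start() is hypothetical, and it assumes exactly one off-to-on change between the two bounds.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Return the first time_t in (lo, hi] at which localtime reports DST in
 * timezone 'tz'.  Requires DST to be off at 'lo' and on at 'hi', with a
 * single change in between; returns (time_t) -1 otherwise or on error.
 * Note: this changes the process-wide TZ setting. */
time_t find_dst_start(const char *tz, time_t lo, time_t hi)
{
    assert(tz != NULL);

    if (setenv("TZ", tz, 1) != 0)
        return (time_t) -1;
    tzset();

    struct tm tm;
    if (localtime_r(&lo, &tm) == NULL || tm.tm_isdst > 0)
        return (time_t) -1;     /* DST must be off at lo */
    if (localtime_r(&hi, &tm) == NULL || tm.tm_isdst <= 0)
        return (time_t) -1;     /* DST must be on at hi */

    /* Invariant: DST is off at lo and on at hi. */
    while (hi - lo > 1) {
        time_t mid = lo + (hi - lo) / 2;
        if (localtime_r(&mid, &tm) == NULL)
            return (time_t) -1;
        if (tm.tm_isdst > 0)
            hi = mid;
        else
            lo = mid;
    }
    return hi;
}
```

Searching March 2025 (1740787200 to 1743465600) under the explicit rule TZ="PST8PDT,M3.2.0,M11.1.0" should return 1741514400, which is 02:00 PST on 2025-03-09. A libc that starts daylight time early would return a smaller value.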

@thesamesam

thesamesam commented Nov 22, 2025

Thanks. I can't reproduce w/ glibc-2.42 (tip of release branch). I also can't spot any commits that would definitely be related.

The container failure at https://github.com/Perl/perl5/actions/runs/19575295110/job/56059927984?pr=23921 is using Debian trixie (i386) and has libc6-dev i386 2.41-12.

@eggert By chance, do you remember anything that might've fixed the testcase from #23921 (comment), giving wrong isdst output like in #23921 (comment)?

@khwilliamson
Contributor Author

Could this be an error with the container configuration somehow?

If so, is it feasible for someone to try this by not using the container? If it still fails, could glibc be bisected?

@khwilliamson
Contributor Author

@thesamesam do you have any recommendations about how to proceed?

@thesamesam

Let me have a look.

@thesamesam

thesamesam commented Dec 21, 2025

I can reproduce with debootstrap --arch i386 sid sid http://deb.debian.org/debian ; systemd-nspawn -D sid --personality x86 (ditto trixie, forky).

A statically linked binary that works in a Gentoo x86 container fails on Debian. I don't see patches that Debian apply that could be relevant to glibc or tzdata.

stracing the statically linked binary from Gentoo in sid where it fails, I see:

openat(AT_FDCWD, "/usr/share/zoneinfo/PST8PDT", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/zoneinfo/posixrules", O_RDONLY|O_CLOEXEC) = 3

.. and that file exists on the Gentoo system where it works.

In eggert/tz@a0b09c0, PST8PDT was marked as obsolescent. I see https://salsa.debian.org/glibc-team/tzdata/-/commit/85487c1ac6f30ce537b21a0b3cefd7cbf1b906b3 (before that upstream tz change) splits PST8PDT into Debian's tzdata-legacy package.

Installing tzdata-legacy fixes it for me. It also failed for me in a 64-bit sid debootstrap. It looks like there isn't a 64-bit Debian trixie container in CI which explains that (i.e. bitness was just a red herring).

TL;DR: Try adding tzdata-legacy to the CI container, or we can switch to a timezone that is more reliably available (America/Los_Angeles based on that tz commit; not sure if it has the behaviour you need though, maybe skip the test if /usr/share/zoneinfo/PST8PDT isn't available instead)?

@khwilliamson
Contributor Author

The reason I chose PST8PDT is to get something that doesn't need to be skipped on MSWin, as it doesn't understand the America/Los_Angeles format.

@khwilliamson khwilliamson force-pushed the strftime branch 2 times, most recently from 4f078b8 to 60a5638 Compare December 21, 2025 20:47
@khwilliamson
Contributor Author

@thesamesam thanks; that clarified things a lot. I had no idea about that debian package.

But things are more subtle. Your comment gave me an idea for portably testing if a locale is present. What I did is to choose a time_t in the middle of the locale's summer where DST is going to be in effect if it ever is for that locale. Then I check if strftime for the locale returns the expected hour. If so, libc must know the locale and that it has DST.
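In C terms, the probe amounts to something like the sketch below. It is not the Perl test code; the helper name tz_shows_hour() is hypothetical, and the probe instant 1752606000 (2025-07-15 19:00:00 UTC, which is noon PDT) is my own choice for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Probe whether libc renders 'probe' with the hour expected when DST is in
 * effect for timezone 'tz'.  Returns 1 if the hour matches, 0 if it does
 * not, -1 on error.  Note: this changes the process-wide TZ setting. */
int tz_shows_hour(const char *tz, time_t probe, int expected_hour)
{
    assert(tz != NULL);

    if (setenv("TZ", tz, 1) != 0)
        return -1;
    tzset();

    struct tm tm;
    if (localtime_r(&probe, &tm) == NULL)
        return -1;

    char buf[8];
    if (strftime(buf, sizeof buf, "%H", &tm) == 0)
        return -1;

    return atoi(buf) == expected_hour;
}
```

A system that does not understand the zone name typically falls back to UTC and reports hour 19 rather than 12 for this instant, so the probe returns 0 and the dependent tests can be skipped.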

This works for the Windows CIs, allowing me to remove skips based on those OSes in favor of this results-based one. But the Debian CI is still failing, indicating it knows about PST8PDT, but not the rules. But it only fails half the tests. This is for the case in March where we spring forward. It thinks it is in daylight time before it actually should be. So the tests start succeeding when we expect to be in daylight time.

2025-12-21T19:54:20.2257403Z #   Failed test 'Chg -1 hr,-1 sec'
2025-12-21T19:54:20.2257806Z #   at t/time.t line 288.
2025-12-21T19:54:20.2258107Z #          got: '2025-03-09 01:59:59-0700'
2025-12-21T19:54:20.2258512Z #     expected: '2025-03-09 00:59:59-0800'
2025-12-21T19:54:20.2262020Z #   Failed test 'Chg -1 hr, 0 sec'
2025-12-21T19:54:20.2262486Z #   at t/time.t line 288.
2025-12-21T19:54:20.2263056Z #          got: '2025-03-09 02:00:00-0700'
2025-12-21T19:54:20.2263650Z #     expected: '2025-03-09 01:00:00-0800'
2025-12-21T19:54:20.2267231Z #   Failed test 'Chg -59 min,-59 sec'
2025-12-21T19:54:20.2267688Z #   at t/time.t line 288.
2025-12-21T19:54:20.2268944Z #          got: '2025-03-09 02:00:01-0700'
2025-12-21T19:54:20.2269441Z #     expected: '2025-03-09 01:00:01-0800'
2025-12-21T19:54:20.2272740Z #   Failed test 'Chg -1 sec'
2025-12-21T19:54:20.2273235Z #   at t/time.t line 288.
2025-12-21T19:54:20.2273699Z #          got: '2025-03-09 02:59:59-0700'
2025-12-21T19:54:20.2274126Z #     expected: '2025-03-09 01:59:59-0800'
2025-12-21T19:54:20.2288733Z # Looks like you failed 4 tests of 49.
2025-12-21T19:54:20.2353877Z ext/POSIX/t/time ................................................. FAILED at test 41

I looked up the description of the package that fixes things for you, and it talks about a 10 second offset. But that can't be what is happening here, because it is wrong for more than an hour.

@eggert

eggert commented Dec 22, 2025

What I did is to choose a time_t in the middle of the locale's summer where DST is going to be in effect if it ever is for that locale.

That does not work in places like Morocco and Ireland where DST is not in effect during summer months, but is in effect at other times. (For example, in Morocco it was in effect from February 23 to April 6 this year.) If the tests are not meant to be run in places like that you should be OK, I guess.

the debian CI is still failing, indicating it knows about PST8PDT, but not the rules

POSIX does not specify the DST rules for TZ="PST8PDT". Although many implementations use current US DST rules, that's not required by the standard and I know of software where other rules are used. glibc's behavior differs from tzcode's, for example.

@thesamesam

thesamesam commented Dec 22, 2025

I looked up the description of the package that fixes things for you, and it talks about a 10 second offset. But that can't be what is happening here, because it is wrong for more than an hour.

root@trixie:~$ TZ=PST8PDT ./a
For PST8PDT, return from localtime for 1741510799 is:
tm_sec=59
tm_min=59
tm_hour=1
tm_mday=9
tm_mon=2
tm_year=125
isdst=1
gmtoff=-25200
tm_zone=PDT

root@trixie:~$ apt install tzdata-legacy
Installing:
  tzdata-legacy

Summary:
  Upgrading: 0, Installing: 1, Removing: 0, Not Upgrading: 0
  Download size: 179 kB
  Space needed: 1276 kB / 913 GB available

Get:1 http://deb.debian.org/debian trixie/main i386 tzdata-legacy all 2025b-4+deb13u1 [179 kB]
Fetched 179 kB in 2s (87.1 kB/s)
Selecting previously unselected package tzdata-legacy.
(Reading database ... 18750 files and directories currently installed.)
Preparing to unpack .../tzdata-legacy_2025b-4+deb13u1_all.deb ...
Unpacking tzdata-legacy (2025b-4+deb13u1) ...
Setting up tzdata-legacy (2025b-4+deb13u1) ...

root@trixie:~$ TZ=PST8PDT ./a
For PST8PDT, return from localtime for 1741510799 is:
tm_sec=59
tm_min=59
tm_hour=0
tm_mday=9
tm_mon=2
tm_year=125
isdst=0
gmtoff=-28800
tm_zone=PST

Isn't that final output what you were expecting?

@khwilliamson
Contributor Author

@eggert, @thesamesam thanks for your responses

Perl endeavors to work on both POSIX systems and non-POSIX ones to the extent possible, notably Microsoft Windows. What is failing here isn't the code in general, but a test that attempts to get as much code coverage as possible by testing both kinds. And the test is checking that the code correctly handles the edges when DST starts and ends.

If the system doesn't handle the timezone syntax and semantics, the tests that use that syntax should be skipped. I'm looking for a way to determine that, and I thought I had one. That way was to choose a date in the past, away from the edges, where the daylight rules are known for two particular timezones and/or locations, and then see if the system correctly knows the DST status. So concerns about other locations, like Morocco, aren't applicable.

One location, Paris, was chosen because there was a report from the field about it not working properly there. The other, PST8PDT for making sure MSWin works, was chosen because that is the default timezone on Windows boxes. (That is a poor choice for the default in my opinion, but nonetheless that's what it is.)

My scheme works on Windows builds, but not on Debian without the legacy package installed. @thesamesam I didn't doubt that the package fixes the issue, but what my results showed was that without the package Debian does understand PST8PDT to some extent, but doesn't get the edge right. I had expected without the package it wouldn't understand that syntax at all.

Does anyone have a suggestion about how to detect if the package is installed? It should not refer to particular file paths. This code will run on many operating systems and configurations.

@eggert

eggert commented Dec 22, 2025

Does anyone have a suggestion about how to detect if the package is installed? It should not refer to particular file paths. This code will run on many operating systems and configurations

I suggest not worrying about that Debian package, as Debian does things differently from non-Debian distros and you don't want to fall down that rabbit hole.

When you say you're checking Paris, I assume you mean you're checking TZ="Europe/Paris", and that you're checking only historical timestamps, i.e., timestamps before today (and by "today" I mean really today, December 22, 2025, and not the time of the test - important since Paris might well change rules in the future). That should be relatively safe. I also assume that you somehow know whether TZ="Europe/Paris" actually works (as opposed to being Microsoft Windows, or a system without TZDB installed).

When you say you're checking PST8PDT I assume you mean you're checking TZ="PST8PDT", which unfortunately as you've discovered has different meaning on different platforms. I suggest that you instead check TZ="PST8PDT,M3.2.0,M11.1.0" which should have the same meaning on all POSIX platforms. Although the longer setting might not work on Microsoft Windows, you can special-case that by testing TZ="PST8PDT" only on Microsoft Windows and only with historical timestamps.
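A minimal sketch of checking that suggestion against the four expectations in the failing CI log quoted earlier. The function name count_mismatches() is hypothetical. Only the first time_t (1741510799) appears in this thread; the other three are derived from it by adding 1, 2 and 3600 seconds to match the expected strings.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* One instant and the local rendering the failing tests expected for it. */
struct expectation {
    time_t      when;
    const char *text;
};

/* The "expected" strings come from the CI log; see the note above about
 * how the time_t values were derived. */
static const struct expectation spring_2025[] = {
    { 1741510799, "2025-03-09 00:59:59-0800" },
    { 1741510800, "2025-03-09 01:00:00-0800" },
    { 1741510801, "2025-03-09 01:00:01-0800" },
    { 1741514399, "2025-03-09 01:59:59-0800" },
};

/* Count how many expectations libc fails to reproduce under timezone 'tz'.
 * Returns -1 if TZ could not be set.
 * Note: this changes the process-wide TZ setting. */
int count_mismatches(const char *tz)
{
    assert(tz != NULL);

    if (setenv("TZ", tz, 1) != 0)
        return -1;
    tzset();

    int bad = 0;
    size_t n = sizeof spring_2025 / sizeof spring_2025[0];
    for (size_t i = 0; i < n; i++) {
        struct tm tm;
        char buf[64];
        if (localtime_r(&spring_2025[i].when, &tm) == NULL
            || strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S%z", &tm) == 0
            || strcmp(buf, spring_2025[i].text) != 0)
            bad++;
    }
    return bad;
}
```

With TZ="PST8PDT,M3.2.0,M11.1.0" the count should be 0 on a POSIX platform, since the rule is spelled out in the string itself rather than looked up in a tzdata file.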

As an aside: given your comments I strongly suspect there are other bugs in the Perl code. One way to address them would be to switch to TZDB's strftime and mktime implementations (they're public-domain). But that's a bigger project and I lack time to investigate it.

The next commits that fix some bugs showed these were not properly
getting initialized.
On some systems this was unused.  Now that we have C99, we can move the
declaration and some #ifdef's and not declare it unless it is going to
be used.
tl;dr:

Fixes GH Perl#23878

I botched this in Perl 5.42.  These conditional compilation statements
were just plain wrong, causing code to be skipped that should have been
compiled.  It only affected the few hours of the year when daylight
savings time is removed, so that the hour value is repeated.  We didn't
have a good test for that.

gory details:

libc uses 'struct tm' to hold information about a given instant in
time, containing fields for things like the year, month, hour, etc.  The
libc function mktime() is used to normalize the structure, adjusting,
say, an input Nov 31 to be Dec 01.

One of the fields in the structure, 'is_dst', indicates if daylight
savings is in effect, or whether that fact is unknown.  If unknown,
mktime() is supposed to calculate the answer and to change 'is_dst'
accordingly.  Some implementations appear to always do this calculation
even when the input value says the result is known.  Others appear to
honor it.

Some libc implementations have extra fields in 'struct tm'.

Perl has a stripped down version of mktime(), called mini_mktime(),
written by Larry Wall a long time ago.  I don't know why.  This crippled
version ignores locale and daylight time.  It also doesn't know about
the extra fields in 'struct tm' that some implementations have.  Nor can
it be extended to know about those fields, as they are dependent on
timezone and daylight time, which it deliberately doesn't consider.

The botched #ifdef's were supposed to compensate for both the extra
fields in the struct and that some libc implementations always
recalculate 'is_dst'.

On systems with these fields, the botched #if's caused only
mini_mktime() to be called.  This meant that these extra fields didn't
get populated, and daylight time was never considered to be in effect.
And 'is_dst' did not get changed from the input.

On systems without these fields, the regular libc mktime() would be
called appropriately.

The bottom line is that for the portion of the year when daylight
savings is not in effect, things mostly worked properly.  The two extra
fields would not be populated, so if some code were to read them, it
would only get the proper values by chance.  We got no reports of this.
I attribute that to the fact that the use of these is not portable, so
code wouldn't tend to use them.  There are portable ways to access the
information they contain.

Tests were failing for the portions of the year when daylight savings is
in effect; see GH Perl#22351.  The code looked correct just reading it (not
seeing the flaw in the #ifdef's), so I assumed that it was an issue in
the libc implementations and instituted a workaround.  (I can't now
think of a platform where there hasn't been a problem with a libc with
something regarding locales, so that was a reasonable assumption.)

Among other things (fixed in the next commit), that workaround overrode
the 'is_dst' field after the call to mini_mktime(), so that the value
actually passed to libc strftime() indicated that daylight is in effect.

What happens next depends on the libc strftime() implementation.  It
could conceivably itself call mktime() which might choose to override
is_dst to be the correct value, and everything would always work.  The
more likely possibility is that it just takes the values in the struct
as-is.  Remember that those values on systems with the extra fields were
calculated as if daylight savings wasn't in effect, but now we're
telling strftime() to use those values as if it were in effect.  This
is a discrepancy.  I'd have to trace through some libc implementations
to understand why this discrepancy seems to not matter except at the
transition time.

But the bottom line is this commit removes that discrepancy, and causes
mktime() to be called appropriately on systems where it wasn't, so
strftime() should now function properly.
Because of the bug fixed in the previous commit, this function was
changed in 5.42 to have a work around, which is no longer needed.
Because of the bug fixed two commits ago, this function was changed in
5.42 to have a work around, which is no longer needed.
I ran some experiments, and found that tzset works on Windows, and is
required after changing the TZ environment variable from within perl.

But it did not work on MinGW.  Maybe there is something else needed in
the POSIX module that would get it to work; I didn't investigate.

The only way I could figure out how to distinguish in Perl space between
MSVC and MinGW was looking at the make command.  Maybe there is a better
way.
Due to the differences in various systems' implementations, I think it
is a good idea to more fully document the vagaries I have discovered,
and how perl resolves them.

Development

Successfully merging this pull request may close these issues.

Change in output of strftime from v5.40 to v5.42
