Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib/getdate.y: Ignore time-zone information and use UTC #1214

Merged
merged 1 commit into from
Mar 2, 2025

Conversation

alejandro-colomar
Copy link
Collaborator

@alejandro-colomar alejandro-colomar commented Feb 17, 2025

There is exactly one caller of this function, and it wants a date, not a time. It is useless to be able to parse local dates, because we ultimately store a UTC date. To avoid confusion, unconditionally use UTC. Since this code had important bugs regarding offset, we can safely assume that no existing users rely on being able to use their local date (this never worked correctly).

Also, the code parsing time zones was quite bad.

Link: #1202
Link: #1209
Reported-by: @zeha
Reported-by: @timparenti
Reported-by: @leegarrett
Cc: @kenion
Cc: @jubalh
Cc: @eggert
Cc: @ikerexxe
Cc: @hallyn
Cc: @BrianInglis


Revisions:

v2
  • Remove dead code. difftm() was only used in code that we're removing.
$ git range-diff master gh/gd gd
1:  1ba2fb51 ! 1:  5888f06c lib/getdate.y: Ignore time-zone information and use UTC
    @@ lib/getdate.y: static int LookupWord (char *buff)
        return tID;
      }
      
    +@@ lib/getdate.y: yylex (void)
    + 
    + #define TM_YEAR_ORIGIN 1900
    + 
    +-/* Yield A - B, measured in seconds.  */
    +-static long difftm (struct tm *a, struct tm *b)
    +-{
    +-  int ay = a->tm_year + (TM_YEAR_ORIGIN - 1);
    +-  int by = b->tm_year + (TM_YEAR_ORIGIN - 1);
    +-  long days = (
    +-  /* difference in day of year */
    +-          a->tm_yday - b->tm_yday
    +-  /* + intervening leap days */
    +-          + ((ay >> 2) - (by >> 2))
    +-          - (ay / 100 - by / 100)
    +-          + ((ay / 100 >> 2) - (by / 100 >> 2))
    +-  /* + difference in years * 365 */
    +-          + (long) (ay - by) * 365
    +-  );
    +-  return (60 * (60 * (24 * days + (a->tm_hour - b->tm_hour))
    +-          + (a->tm_min - b->tm_min))
    +-    + (a->tm_sec - b->tm_sec));
    +-}
    +-
    + time_t get_date (const char *p, const time_t *now)
    + {
    +   struct tm tm, tm0, *tmp;
     @@ lib/getdate.y: time_t get_date (const char *p, const time_t *now)
      
        yyInput = p;
v2b
  • Remove unused variables. [@zeha]
$ git range-diff master gh/gd gd
1:  5888f06c ! 1:  340ffd8f lib/getdate.y: Ignore time-zone information and use UTC
    @@ lib/getdate.y: static TABLE const OtherTable[] = {
      ^L
      
      
    +@@ lib/getdate.y: static int ToYear (int Year)
    + static int LookupWord (char *buff)
    + {
    +   register char *p;
    +-  register char *q;
    +   register const TABLE *tp;
    +   int i;
    +   bool abbrev;
     @@ lib/getdate.y: static int LookupWord (char *buff)
        }
          }
    @@ lib/getdate.y: yylex (void)
     -
      time_t get_date (const char *p, const time_t *now)
      {
    -   struct tm tm, tm0, *tmp;
    -@@ lib/getdate.y: time_t get_date (const char *p, const time_t *now)
    +-  struct tm tm, tm0, *tmp;
    ++  struct tm  tm, *tmp;
    +   time_t Start;
      
        yyInput = p;
        Start = now ? *now : time(NULL);
v2c
  • Remove now-unused macros.
$ git range-diff master gh/gd gd
1:  340ffd8f ! 1:  a32fa0bc lib/getdate.y: Ignore time-zone information and use UTC
    @@ Commit message
         Signed-off-by: Alejandro Colomar <[email protected]>
     
      ## lib/getdate.y ##
    +@@
    + static int yylex (void);
    + static int yyerror (const char *s);
    + 
    +-#define EPOCH             1970
    +-#define HOUR(x)           ((x) * 60)
    + 
    + #define MAX_BUFF_LEN    128   /* size of buffer to read the date into */
    + 
     @@ lib/getdate.y: static int       yyHaveDate;
      static int        yyHaveDay;
      static int        yyHaveRel;
v2d
  • Revert accidental removal from v2b.
$ git range-diff master gh/gd gd
1:  a32fa0bc ! 1:  0b0cc2ae lib/getdate.y: Ignore time-zone information and use UTC
    @@ lib/getdate.y: yylex (void)
     -
      time_t get_date (const char *p, const time_t *now)
      {
    --  struct tm tm, tm0, *tmp;
    -+  struct tm  tm, *tmp;
    -   time_t Start;
    +   struct tm tm, tm0, *tmp;
    +@@ lib/getdate.y: time_t get_date (const char *p, const time_t *now)
      
        yyInput = p;
        Start = now ? *now : time(NULL);
v3
  • Remove unused variable. [@zeha]
$ git range-diff master gh/gd gd
1:  0b0cc2ae ! 1:  a2718233 lib/getdate.y: Ignore time-zone information and use UTC
    @@ lib/getdate.y: yylex (void)
     -
      time_t get_date (const char *p, const time_t *now)
      {
    -   struct tm tm, tm0, *tmp;
    -@@ lib/getdate.y: time_t get_date (const char *p, const time_t *now)
    +-  struct tm tm, tm0, *tmp;
    ++  struct tm  tm, *tmp;
    +   time_t Start;
      
        yyInput = p;
        Start = now ? *now : time(NULL);
    @@ lib/getdate.y: time_t get_date (const char *p, const time_t *now)
        tm.tm_min += yyRelMinutes;
        tm.tm_sec += yyRelSeconds;
     -  tm.tm_isdst = -1;
    +-  tm0 = tm;
     +  tm.tm_isdst = 0;
    -   tm0 = tm;
      
        Start = mktime (&tm);
      
v4
  • Use timegm(3) instead of mktime(3).
$ git range-diff master gh/gd gd
1:  a2718233 ! 1:  e43879f1 lib/getdate.y: Ignore time-zone information and use UTC
    @@ lib/getdate.y: time_t get_date (const char *p, const time_t *now)
     -  tm0 = tm;
     +  tm.tm_isdst = 0;
      
    -   Start = mktime (&tm);
    +-  Start = mktime (&tm);
    ++  Start = timegm(&tm);
      
        if (Start == (time_t) -1)
          {
    @@ lib/getdate.y: time_t get_date (const char *p, const time_t *now)
          }
      
     @@ lib/getdate.y: time_t get_date (const char *p, const time_t *now)
    +     {
    +       tm.tm_mday += ((yyDayNumber - tm.tm_wday + 7) % 7
    +                + 7 * (yyDayOrdinal - (0 < yyDayOrdinal)));
    +-      Start = mktime (&tm);
    ++      Start = timegm(&tm);
    +       if (Start == (time_t) -1)
        return Start;
          }
      
v4b
  • Remove stuff from the syntax, as reported by @hallyn .
  • Reword commit to be nicer.
$ git range-diff master gh/gd gd
1:  e43879f1 ! 1:  6bad604a lib/getdate.y: Ignore time-zone information and use UTC
    @@ Commit message
         assume that no existing users rely on being able to use their local
         date (this never worked correctly).
     
    -    Also, the code parsing time zones was quite bad.
    +    Also, the code parsing time zones is quite bad, for today's standards.
     
         Link: <https://github.com/shadow-maint/shadow/issues/1202>
         Link: <https://github.com/shadow-maint/shadow/issues/1209>
    @@ lib/getdate.y: spec      : /* NULL */
            yyHaveDate++;
        }
     @@ lib/getdate.y: time     : tUNUMBER tMERIDIAN {
    +       yySeconds = 0;
    +       yyMeridian = $4;
    +   }
    +-  | tUNUMBER ':' tUNUMBER tSNUMBER {
    ++  | tUNUMBER ':' tUNUMBER {
            yyHour = $1;
            yyMinutes = $3;
            yyMeridian = MER24;
    @@ lib/getdate.y: time      : tUNUMBER tMERIDIAN {
        | tUNUMBER ':' tUNUMBER ':' tUNUMBER o_merid {
            yyHour = $1;
     @@ lib/getdate.y: time     : tUNUMBER tMERIDIAN {
    +       yySeconds = $5;
    +       yyMeridian = $6;
    +   }
    +-  | tUNUMBER ':' tUNUMBER ':' tUNUMBER tSNUMBER {
    ++  | tUNUMBER ':' tUNUMBER ':' tUNUMBER {
    +       yyHour = $1;
            yyMinutes = $3;
            yySeconds = $5;
            yyMeridian = MER24;

@alejandro-colomar
Copy link
Collaborator Author

alejandro-colomar commented Feb 17, 2025

@zeha, @leegarrett

I think this should partially close your issue (the other part being printing also UTC, which is the other PR open about this at the moment). Please test.

@timparenti

I think this should partially address your concern regarding parsing timezones. :-)
The next step will be to remove all parsing of time, which will leave us with code parsing just a UTC date.

@alejandro-colomar alejandro-colomar force-pushed the gd branch 2 times, most recently from 50b8330 to 1ba2fb5 Compare February 17, 2025 23:54
@alejandro-colomar alejandro-colomar changed the title lib/getdate.y: Ignore time zone information and use UTC lib/getdate.y: Ignore time-zone information and use UTC Feb 17, 2025
@alejandro-colomar alejandro-colomar marked this pull request as ready for review February 18, 2025 00:20
@zeha
Copy link
Contributor

zeha commented Feb 18, 2025

I'll note that usermod(8) documents exactly two acceptable formats: YYYY-MM-DD and integers (incl. -1). That makes me think a lot of the remaining code is not actually necessary.

@zeha
Copy link
Contributor

zeha commented Feb 18, 2025

Please test.

Quick test looks good to me.

@zeha
Copy link
Contributor

zeha commented Feb 18, 2025

I'll note that usermod(8) documents exactly two acceptable formats: YYYY-MM-DD and integers (incl. -1). That makes me think a lot of the remaining code is not actually necessary.

Also I just discovered it accepts 2024-18-18 which seems like not a valid date in YYYY-MM-DD format.

@alejandro-colomar
Copy link
Collaborator Author

I'll note that usermod(8) documents exactly two acceptable formats: YYYY-MM-DD and integers (incl. -1). That makes me think a lot of the remaining code is not actually necessary.

Also I just discovered it accepts 2024-18-18 which seems like not a valid date in YYYY-MM-DD format.

Yeah, that code is just brain damaged. If you don't specify a valid date, it goes dadada and doesn't report an error at all.
I would like to get rid of most of it.

@alejandro-colomar
Copy link
Collaborator Author

I'll note that usermod(8) documents exactly two acceptable formats: YYYY-MM-DD and integers (incl. -1). That makes me think a lot of the remaining code is not actually necessary.

I definitely want to get rid of most of that code. I don't know how much I'll be able to get rid of, but ideally, I'd end up with a 10-liner date parser written in pure C. :-)

@alejandro-colomar
Copy link
Collaborator Author

alejandro-colomar commented Feb 18, 2025

Actually, I think this fix is not correct/enough.

@alejandro-colomar
Copy link
Collaborator Author

alejandro-colomar commented Feb 18, 2025

Actually, I think this fix is not correct/enough.

After chopping off most of lib/getdate.y code in #1217 , here's what remains in get_date():

time_t get_date (const char *p, const time_t *now)
{
  struct tm  tm;

  yyInput = p;
  yyHaveDate = 0;

  if (yyparse ()
      || yyHaveDate > 1)
    return -1;

  tm.tm_year = ToYear (yyYear) - TM_YEAR_ORIGIN;
  tm.tm_mon = yyMonth - 1;
  tm.tm_mday = yyDay;
  tm.tm_hour = tm.tm_min = tm.tm_sec = 0;
  tm.tm_isdst = 0;

  return mktime(&tm);
}

See that mktime(3) call? That's all that matters. It's a local date. We need a UTC-equivalent of mktime(3). @timparenti, what do you recommend?

@alejandro-colomar
Copy link
Collaborator Author

Ahhh, what I was looking for is timegm(3). I didn't remember about it. :-)

@alejandro-colomar
Copy link
Collaborator Author

I have added a mention in the mktime(3) manual page. It was weird that there was no mention about it, except for a reference in SEE ALSO.

commit 4da273dfd6880399f0ea87aa1a452fe07a60ed3a (HEAD -> contrib)
Author: Alejandro Colomar <[email protected]>
Date:   Tue Feb 18 13:25:01 2025 +0100

    man/man3/ctime.3: Mention that timegm(3) is a UTC equivalent of mktime(3)
    
    Signed-off-by: Alejandro Colomar <[email protected]>

diff --git a/man/man3/ctime.3 b/man/man3/ctime.3
index acd6f1565..c2d14269c 100644
--- a/man/man3/ctime.3
+++ b/man/man3/ctime.3
@@ -177,6 +177,9 @@ .SH DESCRIPTION
 .BR mktime ()
 should (use timezone information and system databases to)
 attempt to determine whether DST is in effect at the specified time.
+See
+.BR timegm (3)
+for a UTC equivalent of this function.
 .P
 The
 .BR mktime ()

Copy link
Collaborator

@ikerexxe ikerexxe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we would say in Spain, this code makes the baby Jesus cry (and me too, for that matter). With this I want to say that I am unable to review this code, but if you tell me that your tests work I see with good eyes to press the merge button.

@alejandro-colomar
Copy link
Collaborator Author

alejandro-colomar commented Feb 25, 2025

As we would say in Spain, this code makes the baby Jesus cry (and me too, for that matter). With this I want to say that I am unable to review this code, but if you tell me that your tests work I see with good eyes to press the merge button.

Yep, tests looked good. (You forgot to actually press approve? I can't merge otherwise. Or you can merge.) Thanks!

@ikerexxe
Copy link
Collaborator

Yep, tests looked good. (You forgot to actually press approve? I can't merge otherwise. Or you can merge.) Thanks!

No, unfortunately I can't review this code 😓

@alejandro-colomar
Copy link
Collaborator Author

Yep, tests looked good. (You forgot to actually press approve? I can't merge otherwise. Or you can merge.) Thanks!

No, unfortunately I can't review this code 😓

Ok, so we should leave it for @hallyn to review?

@hallyn
Copy link
Member

hallyn commented Feb 26, 2025

I'll take a look.

There is exactly one caller of this function, and it wants a date, not a
time.  It is useless to be able to parse local dates, because we
ultimately store a UTC date.  To avoid confusion, unconditionally use
UTC.  Since this code had important bugs regarding offset, we can safely
assume that no existing users rely on being able to use their local
date (this never worked correctly).

Also, the code parsing time zones is quite bad, for today's standards.

Link: <shadow-maint#1202>
Link: <shadow-maint#1209>
Reported-by: Chris Hofstaedtler <[email protected]>
Reported-by: Tim Parenti <[email protected]>
Reported-by: Lee Garrett <[email protected]>
Cc: Gus Kenion <https://github.com/kenion>
Cc: Michael Vetter <[email protected]>
Cc: Paul Eggert <[email protected]>
Cc: Iker Pedrosa <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Brian Inglis <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
@hallyn hallyn merged commit 00c78bf into shadow-maint:master Mar 2, 2025
10 checks passed
@alejandro-colomar alejandro-colomar deleted the gd branch March 2, 2025 06:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants