Florida State University
 

Situation 1: When the DATE value is separated by a hyphen, slash, dot, comma, or space


Such as:

1001 7-11-1995
1002 1/21/1962
1003 11.2.1952
1004 Sept 18, 1995
1005 Jun20,1997
1006 January 1 2000
1007 4/12.1990
1008 3/22/68

This file contains IDs and birth dates in various formats. The way to get dates into Stata is to read them as strings and then
convert the strings to Stata elapsed dates:

First, read them into Stata as strings:

. infix id 1-4 str bday 6-20 using http://cdph.fsu.edu/people/minxing/date1.raw

Second, convert the strings to Stata elapsed dates by generating a new variable "edate":

. gen edate=date(bday,"mdy")
(1 missing value generated)

. list

     +-------------------------------+
     |   id             bday   edate |
     |-------------------------------|
  1. | 1001        7-11-1995   12975 |
  2. | 1002        1/21/1962     751 |
  3. | 1003        11.2.1952   -2616 |
  4. | 1004    Sept 18, 1995   13044 |
  5. | 1005       Jun20,1997   13685 |
     |-------------------------------|
  6. | 1006   January 1 2000   14610 |
  7. | 1007        4/12.1990   11059 |
  8. | 1008          3/22/68       . |
     +-------------------------------+
As you can see, Stata handled almost all of these irregular date formats, as long as delimiters separate the month,
day, and year. It even handled Jun20,1997, which has no delimiter between the month and the day (Stata could
tell them apart because the month is alphabetic and the day is numeric). The only date that did not work was 3/22/68.
It is treated as missing because the year has only two digits. It converts correctly once you tell Stata the century
in the mask: for example, "md19y" reads the year as 1968 and "md20y" reads it as 2068.
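
The two-digit-year fix can be sketched as follows. This is a minimal illustration, not the FAQ author's own code: the one-row dataset is typed in by hand, the variable names edate19 and edate_top are made up for the example, and the uppercase mask and the optional third argument of date() assume Stata 10 or later (the lowercase masks shown above are the older spelling).

```stata
* Sketch only: rebuild observation 8 by hand instead of reading date1.raw
clear
input int id str15 bday
1008 "3/22/68"
end

* Option 1: name the century inside the mask
generate edate19 = date(bday, "MD19Y")

* Option 2: keep the plain mask and supply a top year; a two-digit year is
* resolved to the latest matching year that does not exceed 1999
generate edate_top = date(bday, "MDY", 1999)

format edate19 edate_top %td
list
```

Both new variables should hold the same elapsed date, 22mar1968.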

Situation 2: What if Month, Day, and Year run together in one numeric variable?


Such as:


   . infile id long bday using http://cdph.fsu.edu/people/minxing/date2.raw
   . list
     +-----------------+
     |   id       bday |
     |-----------------|
  1. | 1001    7111995 |
  2. | 1002    1211962 |
  3. | 1003   11021952 |
     +-----------------+

Your program could be 
        generate month = int(bday/1000000)
        generate day = int((bday - month*1000000)/10000)
        generate year = bday - month*1000000 - day*10000
        generate elapdate = mdy(month, day, year)
        list  

The output from this program is 

     +-------------------------------------------------+
     |   id       bday   month   day   year   elapdate |
     |-------------------------------------------------|
  1. | 1001    7111995       7    11   1995      12975 |
  2. | 1002    1211962       1    21   1962        751 |
  3. | 1003   11021952      11     2   1952      -2616 |
     +-------------------------------------------------+
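
The same split can also be written with mod(), which peels the fields off from the right. This is an alternative sketch rather than the method above; the three observations are typed in by hand so that the example does not depend on date2.raw.

```stata
* Sketch only: the three observations from date2.raw, entered by hand
clear
input int id long bday
1001 7111995
1002 1211962
1003 11021952
end

* Last four digits are the year
generate int year = mod(bday, 10000)
* Drop the year, then keep the last two remaining digits as the day
generate byte day = mod(floor(bday/10000), 100)
* Whatever is left in front is the month (one or two digits)
generate byte month = floor(bday/1000000)

generate long elapdate = mdy(month, day, year)
format elapdate %td
list
```

Because each field is taken by position from the right, the one-digit and two-digit months are handled by the same three lines.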

Situation 3: What if Month, Day, and Year run together in one STRING variable?



   . infile id str10 bday using http://cdph.fsu.edu/people/minxing/date3.raw
   (3 observations read)
   . list
     +------------------+
     |   id        bday |
     |------------------|
  1. | 1001   Jul111995 |
  2. | 1002   Jan211962 |
  3. | 1003   Nov021952 |
     +------------------+
Now that we have a string variable “bday”, we can use the same approach to create month, day, and year from it:

   . gen month = substr(bday,1,3)
   . gen day = real(substr(bday,4,2))
   . gen year = real(substr(bday,6,4))
   . list

     +----------------------------------------+
     |   id        bday   month    day   year |
     |----------------------------------------|
  1. | 1001   Jul111995     Jul     11   1995 |
  2. | 1002   Jan211962     Jan     21   1962 |
  3. | 1003   Nov021952     Nov      2   1952 |
     +----------------------------------------+

Now we have three variables for month, day, and year. However, “month” is still a character variable,
so we should convert it to a numeric variable by using the encode command:

   . encode month,gen(month2)
   . list,nolabel

     +-------------------------------------------------+
     |   id        bday   month    day   year   month2 |
     |-------------------------------------------------|
  1. | 1001   Jul111995     Jul     11   1995        2 |
  2. | 1002   Jan211962     Jan     21   1962        1 |
  3. | 1003   Nov021952     Nov      2   1952        3 |
     +-------------------------------------------------+

We need the nolabel option to see that month2 is really numeric. Note that encode numbers the months in
alphabetical order (Jan=1, Jul=2, Nov=3), not calendar order, so we recode month2 by using the replace command:

   . replace month2=7 if month2==2
   . replace month2=1 if month2==1
   . replace month2=11 if month2==3
   . list,nolabel

     +-------------------------------------------------+
     |   id        bday   month    day   year   month2 |
     |-------------------------------------------------|
  1. | 1001   Jul111995     Jul     11   1995        7 |
  2. | 1002   Jan211962     Jan     21   1962        1 |
  3. | 1003   Nov021952     Nov      2   1952       11 |
     +-------------------------------------------------+

Finally we can create elapdate:

   . gen elapdate=mdy(month2,day,year)
   . list

     +------------------------------------------------------------+
     |   id        bday   month    day   year   month2   elapdate |
     |------------------------------------------------------------|
  1. | 1001   Jul111995     Jul     11   1995        7      12975 |
  2. | 1002   Jan211962     Jan     21   1962        1        751 |
  3. | 1003   Nov021952     Nov      2   1952       11      -2616 |
     +------------------------------------------------------------+

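
With more than a few distinct months, recoding each encode value by hand becomes tedious. One possible shortcut, sketched here as an alternative and not as the method used above, is to look up the position of the three-letter abbreviation in a string of all twelve abbreviations. The sketch assumes the abbreviations are capitalized exactly as in the data (Jan, Feb, ...) and that strpos() is available (older Stata releases call it index()); the data are typed in by hand.

```stata
* Sketch only: the three observations from date3.raw, entered by hand
clear
input int id str10 bday
1001 "Jul111995"
1002 "Jan211962"
1003 "Nov021952"
end

generate str3 month = substr(bday, 1, 3)
generate day  = real(substr(bday, 4, 2))
generate year = real(substr(bday, 6, 4))

* Each abbreviation starts at position 1, 4, 7, ..., 34 of the lookup string,
* so (position + 2)/3 gives the calendar month 1, 2, 3, ..., 12
generate month2 = (strpos("JanFebMarAprMayJunJulAugSepOctNovDec", month) + 2)/3

generate elapdate = mdy(month2, day, year)
list
```

For the three observations this gives month2 = 7, 1, and 11, with no encode or replace step.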
If you are not comfortable reading dates as elapsed-day numbers, you can format elapdate so that it displays as an ordinary date:

. format elapdate %d
. list elapdate

     +-----------+
     |  elapdate |
     |-----------|
  1. | 11jul1995 |
  2. | 21jan1962 |
  3. | 02nov1952 |
     +-----------+

Some applications of Stata elapsed dates

Elapsed dates are the format that Stata uses to manipulate date information. An elapsed date is
the number of days since January 1, 1960. This format is useful for adding or subtracting dates and for changing
the display format of date variables.
For example, if elapdate=0, the real date is January 1, 1960; if elapdate=10, the real date is January 11, 1960.
To display the elapsed value for a date (e.g. Dec 03, 1999), you may use:


. display d(03dec1999)
14581
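
The conversion also works in the other direction. As a small sketch, display with a date format turns an elapsed value back into a calendar date; %td is the format name in Stata 10 and later, and it is assumed here in place of the older %d spelling.

```stata
* Elapsed value -> calendar date
display %td 0
display %td 10
display %td 14581

* Calendar date -> elapsed value, using mdy() instead of d()
display mdy(12, 3, 1999)
```

The four commands print 01jan1960, 11jan1960, 03dec1999, and 14581.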
 
Thinking about dates in this way has a big advantage: if you subtract two dates, you obtain the number of days
between them. For example:

1) Suppose we have a birthday variable and want to know each person's age on Jan 1, 2000:

   . gen age = (mdy(1,1,2000)-birthday)/365.25

2) Mary was admitted to the hospital on 27mar1995 (12,869 by Stata's way of thinking) and released on 3apr1995 (12,876).
Mary was in the hospital 12,876-12,869 = 7 days.

3) Sam was born on 14jun1952 (-2,757 by Stata's count), and we want to know his age as of 18sep1995 (13,044).
Sam is 13,044-(-2,757) = 15,801 days old or, if you prefer, 15,801/365.25 = 43.26 years old.
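
The arithmetic in the Mary and Sam examples can be checked directly with display. This is only a quick sketch that rebuilds the dates with mdy(), so that the elapsed values do not have to be typed in.

```stata
* Mary: days between admission (27mar1995) and release (3apr1995)
display mdy(4, 3, 1995) - mdy(3, 27, 1995)

* Sam: age in years on 18sep1995, born 14jun1952, shown with two decimals
display %5.2f (mdy(9, 18, 1995) - mdy(6, 14, 1952))/365.25
```

The first command prints 7 and the second prints 43.26.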
