Home Forums Main Forums SAS Forum How to “impute” missing values with previous values?

  • How to “impute” missing values with previous values?

     Justin updated 7 months, 1 week ago 2 Members · 3 Posts
  • Justin

    Administrator
    February 7, 2021 at 3:48 pm

    Method A: Retain + Merge

    filename Alice 'file path\alice.xls';
    proc import datafile=Alice out=alice(rename=(Name_ = Name)) dbms=Excel replace;
    run;
    *************** Condition: By default, the blank in Name means it has same value  as the previous obs. ************;

    data B(drop=pre);
    length Pre $20 New_Name $20 ;
    retain Pre ' ';
    set Alice;

    if Name =" " then New_Name=Pre;
    else New_Name=Name;
    Pre=New_Name;
    run;

    data First_Name ;
    set B;
    if New_Name = Name;
    run;

    proc sort data=First_Name; by name; run;
    proc sort data=B(keep=New_Name Q5) out=Full ;
    by New_Name;
    run;

    data final(drop=New_Name);
    length name $20;
    merge First_Name full(rename=(new_name=name)) ;
    by name;
    run;

  • Justin

    Administrator
    February 7, 2021 at 3:53 pm

    Method B:

    Use SUM statement to create a counter variable CNT, then split the dataset into 2 parts based on CNT=0 or not. Then merge them back together. Drop and rename vars if needed. See below:

    data W;
    length CNT 3 New $20 ;
    set Alice;
    retain New ' ' ;
    if Name =' ' then CNT +1;
    else CNT=0;
    if CNT=0 then New=Name;
    run;

    data Old(drop=CNT);
    set W;
    where CNT=0;
    run;

    proc sort data=Old; by New; run;
    proc sort data=W(keep= New Name Q5) out= New;
    by New;
    run;
    data final;
    merge old new;
    by New;
    run;

    Method 3: Use By—NotSorted option.

     data B;
    length New $10 Pre $10;
    set Alice;
    by Name notsorted;
    retain Pre 'AAA';
    F=first.Name;
    L=last.Name;
    Flag=F+L;
    if flag=2 and not missing(Name) then New=Name;
    else New=Pre;
    Pre=New;
    run;

     data Old;
    set B;
    if F=1 and not missing(Name) then output Old;
    run;

    Then merge the OLD with B by New.

    Reach-out:

    Some people may not have SAS in their company. If so, can you only use SQL to do it? It will be much more challenging in coding. But no pain, no gain!

    • This reply was modified 7 months, 1 week ago by  Justin.
  • Datura

    Member
    February 7, 2021 at 3:39 pm

    Q: Given the below dataset, for the Name variable, we want to impute the missing values with previous value. By default, they are same as the previous names.

    name state date q1 q2 q3 q4
    Jason NY 04-15-2010 very poor not satisfied yes no
    Mike NY 6/12/2010 very good satisfied no yes
    9/21/2010 very good satisfied no yes
    3/8/2011 very good satisfied no yes
    Mary CA 5-26-2010 good satisfied no yes
    8/12/2009 good satisfied no yes
    6/12/2012 very good not satisfied yes no
    7/30/2003 bad satisfied no no
    3/12/2008 good satisfied yes yes
    Peter CA 12/3/2010 poor not satisfied yes no
    12/11/2003 good satisfied yes no
    10/23/2007 bad not satisfied no no

Log in to reply.

Original Post
0 of 0 posts June 2018
Now