Friday, May 11, 2007

SAS - 1 : Data Step

Objectives

  1. Learn to use the DATA statement

  2. Learn to use the INPUT statement

  3. Learn to use the CARDS statement

  4. Learn how to use the semicolon (;)

  5. Learn how to include TITLES on your output

  6. Learn how to run a SAS program

DATA statement

    Use:

    Names the SAS data set

    Syntax:

    DATA SOMENAME;

    Result:

    A temporary SAS data set named SOMENAME is created

The DATA statement signals the beginning of a DATA step. The general form of the SAS DATA statement is:

    DATA SOMENAME;

The DATA statement names the data set you are creating. The name should be 1 to 8 characters long (SAS Version 7 and later accept up to 32) and must begin with a letter or underscore.
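
As a small illustration of the naming rules, the sketch below shows two names that satisfy them. The name _TEMP1 is hypothetical and chosen only to show a leading underscore; the RUN statements are added here so that each step stands alone.

```sas
/* Begins with a letter, 6 characters long */
DATA SURVEY;
RUN;

/* Begins with an underscore (hypothetical name, for illustration only) */
DATA _TEMP1;
RUN;
```

A name such as 1SURVEY would be rejected because it begins with a digit.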

INPUT statement

    Use:

    Defines names and order of variables to SAS

    Syntax:

    INPUT variable variable_type column(s);

    Result:

    Input data are defined for SAS

The INPUT statement specifies the names and order of the variables in your data. Although SAS offers several styles of input that can be mixed within one INPUT statement, the beginning SAS user need only learn the column input style.

The INPUT statement should indicate in which columns each variable may be found. In addition, the INPUT statement should indicate how many lines it takes to record all of the information for each case or observation. The general form of the SAS INPUT statement is:

    INPUT NAME $ 1-14 AGE 16-17 SEX 19 Q1 21 Q2 23;

The variable NAME is a character variable as is indicated by the dollar sign ($) after the variable name. All of the other variables are numeric.
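
To see how the column ranges map onto a data line, it can help to lay a ruler over one record. The sketch below assumes a single record for JANE, borrowed from the exercise at the end of this lesson, and uses the CARDS statement described in the next section. The ruler sits in a comment and is only a reading aid, not part of the syntax.

```sas
/* Column ruler over one record:
   ----+----1----+----2---
   JANE           20 2 2 5
   NAME = columns 1-14, AGE = 16-17, SEX = 19, Q1 = 21, Q2 = 23 */
DATA SURVEY;
INPUT NAME $ 1-14 AGE 16-17 SEX 19 Q1 21 Q2 23;
CARDS;
JANE           20 2 2 5
;
```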

If there are multiple lines of data for each observation, use a forward slash ('/') in the INPUT statement to indicate where a new data line begins.

The general form of the SAS INPUT statement with multiple lines of data per observation is:

    INPUT NAME $ 1-14 AGE 16-17 / SEX 1 Q1 3 Q2 5;

Note: When describing the second line of input data, you begin with column one again. Each piece of data, or variable, will be read from the same columns for each of your observations. Only one INPUT statement is necessary to describe the data for all of your cases.
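
A sketch of what two-line records might look like, assuming a hypothetical data set name SURVEY2 and two respondents borrowed from the exercise data. The first line of each pair holds NAME and AGE; the second line holds SEX, Q1, and Q2, starting again at column one.

```sas
DATA SURVEY2;
INPUT NAME $ 1-14 AGE 16-17 / SEX 1 Q1 3 Q2 5;
CARDS;
JANE           20
2 2 5
MICHAEL        18
1 5 2
;
```

Four data lines produce two observations here, because the slash tells SAS to move to the next line before reading SEX.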

CARDS statement

    Use:

    Signals that input data will follow

    Syntax:

    CARDS;

    Result:

    Data can be processed for the SAS data set

The CARDS statement signals that the data will follow next and immediately precedes your input data lines. The general form of the CARDS statement is:

    DATA SURVEY;
    INPUT NAME $ 1-14 AGE 16-17 SEX 19 Q1 21 Q2 23;
    CARDS;

    (the data goes here)

Note: If the data are contained in an external file, use an INFILE statement instead of the CARDS statement to specify where that file resides (example: INFILE 'survey.dat';).
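
With an external file, the step might be written as follows. This sketch assumes that survey.dat sits in the directory from which SAS is run and that it uses the same column layout as above. The INFILE statement comes before the INPUT statement that reads from it, and no CARDS statement or data lines are needed.

```sas
DATA SURVEY;
INFILE 'survey.dat';   /* assumed file name and location */
INPUT NAME $ 1-14 AGE 16-17 SEX 19 Q1 21 Q2 23;
RUN;
```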

Semicolon

    Use:

    Signals the end of any SAS statement

    Syntax:

    A DATA Step or PROCedure statement;

    Result:

    SAS is signaled that the statement is complete

The semicolon (;) is used as a delimiter to indicate the end of SAS statements.

    DATA SURVEY;
    INPUT NAME $ 1-14 AGE 16-17 SEX 19 Q1 21 Q2 23;
    CARDS;
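
Because the semicolon, not the line break, ends a statement, the layout of a statement on the page is flexible. In the sketch below, the INPUT statement is spread over several lines yet still ends with a single semicolon, and a line holding only a semicolon marks the end of the data lines. The one data record is included only to make the step complete.

```sas
DATA SURVEY;
INPUT NAME $ 1-14
      AGE 16-17
      SEX 19
      Q1 21
      Q2 23;          /* one statement, five lines, one semicolon */
CARDS;
JANE           20 2 2 5
;
```

A forgotten semicolon is a common source of errors, since SAS then reads the next statement as a continuation of the current one.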

TITLE statement

    Use:

    Puts TITLES on your output

    Syntax:

    TITLE 'some title';

    Result:

    A TITLE is added at the top of each page

The TITLE statement assigns a title which appears at the top of the output page. The general form of the TITLE statement is:

    TITLE 'some title';
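
A title only shows up when a procedure writes output. As a minimal sketch, the example below pairs the TITLE statement with PROC PRINT, a procedure that is not covered in this lesson and is used here only as an assumption to produce a page of output. It also assumes the SURVEY data set already exists.

```sas
TITLE 'Sample SAS Program';

/* PROC PRINT lists the observations; the title heads each page */
PROC PRINT DATA=SURVEY;
RUN;
```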

How to run a SAS program

Once you have typed a SAS program into your editor and saved it, you are ready to run it. At the Linux ($) prompt, type:

    sas programname.sas

When SAS finishes running the program, it writes two files to the current directory: (1) programname.log, which contains a log of the job execution, including errors, warnings, and notes, and (2) programname.lst, which contains the output from the procedures in the program.

For example, let's say you have used the Pico editor to enter a SAS program named survey.sas and have saved it in your home directory. To run that program from the Linux ($) prompt, type:

    sas survey.sas

You will know that SAS has finished running the program when the $ prompt reappears. You will then have two new files in your directory, survey.log and survey.lst. Open the .log file to check for errors, warnings, and notes, and the .lst file to look at the output from the procedures.

Session 2 Exercise

In the following exercise you will enter the first five SAS statements in your SAS program and you will enter the data from the sample survey.

  1. Invoke the Pico editor to create a new file. At the $ prompt, type:

      pico survey.sas

  2. Enter the first five lines of a SAS program. In order for the exercises in this tutorial to work successfully, you must type the program statements exactly as they are presented here.

      Type:

      TITLE 'Sample SAS Program';

      Result:

      A title is added to the program

      Type:

      DATA SURVEY;

      Result:

      The SAS data set named SURVEY is created

      Type:

      INPUT NAME $ 1-14 AGE 16-17 SEX 19 Q1 21 Q2 23;

      Result:

      The format for your data is described to SAS

      Type:

      CARDS;

      Result:

      SAS is given the signal that data will follow directly

      Type:


      JANE           20 2 2 5
      MICHAEL        18 1 5 2
      MARIA          21 2 2 4
      JUAN           26 1 4 3
      MILDRED        28 2 3 4
      GUNTHER        30 1 5 2
      JOSEPH         25 1 4 4
      JULIA          19 2 2 2
      CODY           27 1 1 1
      AARON          29 1 2 2

      Result:

      Your raw data are entered to be read by your INPUT statement.

    Note: Remember to enter the data in the exact columns you have specified in your INPUT statement. For example, AGE must be in columns 16 and 17.

  3. Now, save this program under the name survey.sas. (In Pico, you will type Ctrl-X, answer yes when asked if you want to save changes, and press Enter when asked if you want to call the program survey.sas.)

    Your SAS program should look like this:
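
Assembled from the statements and data typed in step 2, the file would read roughly as follows. The closing line with a lone semicolon is an assumption added here to mark the end of the data; it is not one of the lines listed in the exercise.

```sas
TITLE 'Sample SAS Program';
DATA SURVEY;
INPUT NAME $ 1-14 AGE 16-17 SEX 19 Q1 21 Q2 23;
CARDS;
JANE           20 2 2 5
MICHAEL        18 1 5 2
MARIA          21 2 2 4
JUAN           26 1 4 3
MILDRED        28 2 3 4
GUNTHER        30 1 5 2
JOSEPH         25 1 4 4
JULIA          19 2 2 2
CODY           27 1 1 1
AARON          29 1 2 2
;
```

In the file itself, each data line starts in column one, so that NAME falls in columns 1-14 and AGE in columns 16-17.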
