SAS sample statistic functions
Sample statistics for a single variable across all observations are simple to obtain with procedures such as PROC MEANS or PROC UNIVARIATE. The simplest way to obtain similar statistics across several variables within a single observation is with a 'sample statistics function'.
For example:
sum_wt=sum(of weight1 weight2 weight3 weight4 weight5);
Note that this is equivalent to
sum_wt=sum(of weight1-weight5);
but is not equivalent to
sum_wt=weight1 + weight2 + weight3 + weight4 + weight5;
since the SUM function returns the sum of its non-missing arguments, whereas the '+' operator returns a missing value if any of its operands is missing.
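To make the difference concrete, here is a minimal sketch. The values are made up for illustration, and weight2 is deliberately set to missing.

```sas
data _null_;
  /* Hypothetical values; weight2 is deliberately missing */
  weight1 = 10; weight2 = .; weight3 = 30; weight4 = 40; weight5 = 50;

  /* SUM function: the missing weight2 is ignored */
  sum_fn = sum(of weight1-weight5);

  /* '+' operator: the missing weight2 propagates to the result */
  sum_op = weight1 + weight2 + weight3 + weight4 + weight5;

  put sum_fn= sum_op=;   /* expect sum_fn=130 sum_op=. */
run;
```

The log should also carry a note that missing values were generated by an operation on missing values, which is a useful hint that a '+' expression has met a missing operand.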
The following are all valid arguments for the SUM function:
sum(of variable1-variablen) where n is an integer greater than 1
sum(of x y z)
sum(of array-name{*})
sum(of _numeric_)
sum(of x--a) where x precedes a in the PDV order
A comma-delimited list is also a valid argument, for example:
sum(x, y, z)
However, I recommend always using an argument preceded by OF, since this minimises the chance that you write something like
sum_wt=sum(weight1-weight5);
which is a valid SAS expression, but evaluates to the difference between weight1 and weight5.
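The pitfall is easy to reproduce. The sketch below uses made-up values and assumes a SAS 9 release in which SUM accepts a single argument.

```sas
data _null_;
  /* Hypothetical values for illustration */
  weight1 = 10; weight2 = 20; weight3 = 30; weight4 = 40; weight5 = 50;

  /* With OF: weight1-weight5 is a variable list, 10+20+30+40+50 */
  with_of = sum(of weight1-weight5);

  /* Without OF: weight1-weight5 is a subtraction, 10 - 50 */
  without_of = sum(weight1-weight5);

  put with_of= without_of=;   /* expect with_of=150 without_of=-40 */
run;
```

No error or warning is issued for the second form, so the mistake is only visible in the results.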
Other useful sample statistic functions are:
MAX(argument,...) returns the largest value
MIN(argument,...) returns the smallest value
MEAN(argument,...) returns the arithmetic mean (average)
N(argument,...) returns the number of nonmissing arguments
NMISS(argument,...) returns the number of missing values
STD(argument,...) returns the standard deviation
STDERR(argument,...) returns the standard error of the mean
VAR(argument,...) returns the variance
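As a rough illustration, the functions listed above can all be applied to the same set of variables. The variable names s1-s4 and their values are invented for this sketch, with s3 set to missing to show how each function treats it.

```sas
data _null_;
  /* Hypothetical values; s3 is missing */
  s1 = 2; s2 = 4; s3 = .; s4 = 6;

  largest  = max(of s1-s4);     /* 6 */
  smallest = min(of s1-s4);     /* 2, the missing value is ignored */
  average  = mean(of s1-s4);    /* 4, mean of the 3 nonmissing values */
  n_obs    = n(of s1-s4);       /* 3 */
  n_miss   = nmiss(of s1-s4);   /* 1 */
  std_dev  = std(of s1-s4);     /* 2, uses the n-1 denominator */
  std_err  = stderr(of s1-s4);  /* 2/sqrt(3), roughly 1.155 */
  variance = var(of s1-s4);     /* 4 */

  put largest= smallest= average= n_obs= n_miss= std_dev= std_err= variance=;
run;
```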
Example usage
You may, for example, have collected weekly test scores over a 20-week period and wish to calculate the average score for each observation, with the proviso that at most 2 of the 20 scores may be missing. If more than 2 are missing, the average is set to missing.
if nmiss(of test1-test20) le 2 then
testmean=mean(of test1-test20);
else testmean=.;
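The same logic can be tried end to end on a small data set. This sketch shrinks the example to five tests per person purely to keep the data lines short. The data set name scores, the id variable and the values are all invented.

```sas
data scores;
  input id test1-test5;
  /* Keep the mean only when at most 2 scores are missing */
  if nmiss(of test1-test5) le 2 then
    testmean = mean(of test1-test5);
  else testmean = .;
  datalines;
1 70 80 90 60 50
2 70 . 90 . 50
3 . . 90 . 50
;
run;

proc print data=scores noobs;
run;
```

Here id 1 has no missing scores and id 2 has two, so both get a testmean of 70. Id 3 has three missing scores, so its testmean is left missing.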