Fundamentals of Programming in SAS. James Blum

Output 2.7.3: Using a Format to Control Categories for a Variable in the TABLE Statement

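First mortgage monthly payment

MortgagePayment | Frequency | Percent | Cumulative Frequency | Cumulative Percent
----------------|-----------|---------|----------------------|-------------------
None            |    603691 |   52.08 |               603691 |              52.08
$350 and Below  |     59856 |    5.16 |               663547 |              57.25
$351 to $1000   |    283111 |   24.43 |               946658 |              81.67
$1001 to $1600  |    128801 |   11.11 |              1075459 |              92.79
Over $1600      |     83603 |    7.21 |              1159062 |             100.00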

      The FREQ procedure is not limited to one-way frequencies—special operators between variables in the TABLE statement allow for construction of multi-way tables.

      The * operator constructs a cross-tabular summary for two categorical variables, which includes the following statistics:

       cross-tabular and marginal frequencies

       cross-tabular and marginal percentages

       conditional percentages within each row and column

      Program 2.7.4 summarizes all combinations of Metro and MortgagePayment, with Metro formatted to add detail and MortgagePayment formatted into the bins used in the previous example.

      Program 2.7.4: Using the * Operator to Create a Cross-Tabular Summary with PROC FREQ

      proc format;
        /* Display labels for the numeric Metro codes */
        value METRO
          0 = "Not Identifiable"
          1 = "Not in Metro Area"
          2 = "Metro, Inside City"
          3 = "Metro, Outside City"
          4 = "Metro, City Status Unknown"
        ;
        /* Bins for the monthly mortgage payment */
        value Mort
          0 = 'None'
          1-350 = "$350 and Below"
          351-1000 = "$351 to $1000"
          1001-1600 = "$1001 to $1600"
          1601-high = "Over $1600"
        ;
      run;

      proc freq data=BookData.IPUMS2005Basic;
        table Metro*MortgagePayment;   /* Metro forms the rows, MortgagePayment the columns */
        format Metro Metro. MortgagePayment Mort.;
      run;

       The first variable listed in any request of the form A*B is placed on the rows in the table. Requesting MortgagePayment*Metro transposes the table and the included summary statistics.

       The format applied to the Metro variable is merely a change in display and has no effect on the structure of the table—it is five rows with or without the format. The format on MortgagePayment is essential to the column structure—allowing each unique value of MortgagePayment to form a column does not produce a useful summary table.

      Output 2.7.4: Using the * Operator to Create a Cross-Tabular Summary with PROC FREQ

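Table of METRO by MortgagePayment
Rows: METRO (Metropolitan status)
Columns: MortgagePayment (First mortgage monthly payment)
Each cell lists Frequency / Percent / Row Pct / Col Pct; the Total row and column list Frequency / Percent.

METRO                      | None                           | $350 and Below               | $351 to $1000                 | $1001 to $1600                | Over $1600                    | Total
---------------------------|--------------------------------|------------------------------|-------------------------------|-------------------------------|-------------------------------|-----------------
Not Identifiable           | 49379 / 4.26 / 53.66 / 8.18    | 6979 / 0.60 / 7.58 / 11.66   | 25488 / 2.20 / 27.70 / 9.00   | 7307 / 0.63 / 7.94 / 5.67     | 2875 / 0.25 / 3.12 / 3.44     | 92028 / 7.94
Not in Metro Area          | 134314 / 11.59 / 58.20 / 22.25 | 21698 / 1.87 / 9.40 / 36.25  | 60948 / 5.26 / 26.41 / 21.53  | 10464 / 0.90 / 4.53 / 8.12    | 3351 / 0.29 / 1.45 / 4.01     | 230775 / 19.91
Metro, Inside City         | 96487 / 8.32 / 62.50 / 15.98   | 4410 / 0.38 / 2.86 / 7.37    | 28866 / 2.49 / 18.70 / 10.20  | 14049 / 1.21 / 9.10 / 10.91   | 10556 / 0.91 / 6.84 / 12.63   | 154368 / 13.32
Metro, Outside City        | 149961 / 12.94 / 43.98 / 24.84 | 12148 / 1.05 / 3.56 / 20.30  | 79388 / 6.85 / 23.28 / 28.04  | 56330 / 4.86 / 16.52 / 43.73  | 43155 / 3.72 / 12.66 / 51.62  | 340982 / 29.42
Metro, City Status Unknown | 173550 / 14.97 / 50.91 / 28.75 | 14621 / 1.26 / 4.29 / 24.43  | 88421 / 7.63 / 25.94 / 31.23  | 40651 / 3.51 / 11.92 / 31.56  | 23666 / 2.04 / 6.94 / 28.31   | 340909 / 29.41
Total                      | 603691 / 52.08                 | 59856 / 5.16                 | 283111 / 24.43                | 128801 / 11.11                | 83603 / 7.21                  | 1159062 / 100.00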
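
      The effect of variable order in an A*B request can be checked by reversing the request from Program 2.7.4. The following is a minimal sketch, assuming the BookData library is assigned and the Metro. and Mort. formats from Program 2.7.4 have already been created in the current session.

```sas
/* Sketch only: relies on the BookData library and on the Metro. and Mort.
   formats defined in Program 2.7.4 (assumed to exist in this session). */
proc freq data=BookData.IPUMS2005Basic;
  table MortgagePayment*Metro;   /* MortgagePayment now forms the rows */
  format Metro Metro. MortgagePayment Mort.;
run;
```

      The cell frequencies and overall percentages should be unchanged, while the row and column percentages trade places. For instance, the 8.18 shown as a column percentage for Not Identifiable and None in Output 2.7.4 (49379 of the 603691 records with no payment) would appear as a row percentage in the transposed table.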

      Various options are available to control the displayed statistics. Program 2.7.5 illustrates some of these with the result shown in Output 2.7.5.

      Program 2.7.5: Using Options in the TABLE Statement

      proc freq data=BookData.IPUMS2005Basic;
        table Metro*MortgagePayment / nocol nopercent format=comma10.;
        format Metro Metro. MortgagePayment Mort.;
      run;

       NOCOL and NOPERCENT suppress the column and overall percentages, respectively, with NOPERCENT also applying to the marginal totals. NOROW and NOFREQ are also available, with NOFREQ also applying to the marginal totals.

       A format can be applied to the frequency statistic; however, this only applies to cross-tabular frequency tables and has no effect in one-way tables.

      Output 2.7.5: Using Options in the TABLE Statement

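Table of METRO by MortgagePayment
Rows: METRO (Metropolitan status)
Columns: MortgagePayment (First mortgage monthly payment)
Each cell lists Frequency / Row Pct; the Total row and column list Frequency only.

METRO                      | None            | $350 and Below | $351 to $1000  | $1001 to $1600 | Over $1600     | Total
---------------------------|-----------------|----------------|----------------|----------------|----------------|----------
Not Identifiable           | 49,379 / 53.66  | 6,979 / 7.58   | 25,488 / 27.70 | 7,307 / 7.94   | 2,875 / 3.12   | 92,028
Not in Metro Area          | 134,314 / 58.20 | 21,698 / 9.40  | 60,948 / 26.41 | 10,464 / 4.53  | 3,351 / 1.45   | 230,775
Metro, Inside City         | 96,487 / 62.50  | 4,410 / 2.86   | 28,866 / 18.70 | 14,049 / 9.10  | 10,556 / 6.84  | 154,368
Metro, Outside City        | 149,961 / 43.98 | 12,148 / 3.56  | 79,388 / 23.28 | 56,330 / 16.52 | 43,155 / 12.66 | 340,982
Metro, City Status Unknown | 173,550 / 50.91 | 14,621 / 4.29  | 88,421 / 25.94 | 40,651 / 11.92 | 23,666 / 6.94  | 340,909
Total                      | 603,691         | 59,856         | 283,111        | 128,801        | 83,603         | 1,159,062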
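
      The NOROW and NOFREQ options mentioned with Program 2.7.5 can be tried in the same way. This is a minimal sketch rather than a program from the text; it assumes the BookData library is assigned and the Metro. and Mort. formats from Program 2.7.4 are available.

```sas
/* Sketch only: same data and formats as Program 2.7.5 (assumed available). */
proc freq data=BookData.IPUMS2005Basic;
  /* NOROW drops the row percentages; NOFREQ drops the cell frequencies */
  table Metro*MortgagePayment / norow nofreq;
  format Metro Metro. MortgagePayment Mort.;
run;
```

      With these two options, each cell should be left with only the overall percentage and the column percentage.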

      Higher-dimensional requests can be made; however, they are constructed as a series of two-dimensional tables. Therefore, a request of A*B*C in the TABLE statement creates the B*C table for each level of A, while a request of A*B*C*D makes the C*D table for each combination of A and B, and so forth. Program 2.7.6 generates a three-way table, where a cross-tabulation of Metro and HomeValue is built for each level of Mortgage Status as shown in Output 2.7.6. The VALUE statement that defines the character format $MortStatus takes advantage of the fact that value ranges are legal for character variables. Character values are compared in a case-sensitive way, so be sure to understand how uppercase and lowercase letters are ordered: in ASCII, every uppercase letter sorts before every lowercase letter, so a value that begins with a lowercase 'no' falls outside the range 'No'-'Nz'.
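
      The behavior of a character range can be explored on a handful of test values before it is applied to real data. The sketch below reuses the $MortStatus definition from Program 2.7.6; the data set name work.StatusDemo, the variables Status and Grouped, and the four status strings are invented for illustration and are not taken from the IPUMS data.

```sas
/* Same character format as in Program 2.7.6 */
proc format;
  value $MortStatus
    'No'-'Nz'  = 'No'
    'Yes'-'Yz' = 'Yes'
  ;
run;

/* Hypothetical test values (not from BookData.IPUMS2005Basic) */
data work.StatusDemo;
  length Status $ 30;
  infile datalines truncover;
  input Status $30.;
  /* Apply the format to see which range, if any, each value falls in */
  Grouped = put(Status, $MortStatus.);
  datalines;
No, owned free and clear
Yes, mortgaged
yes, contract to purchase
N/A
;
run;

proc print data=work.StatusDemo;
run;
```

      In an ASCII session, the first two values fall inside the 'No'-'Nz' and 'Yes'-'Yz' ranges, respectively. The value starting with a lowercase y and the value N/A fall outside both ranges, so they are not mapped to either label.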

      Program 2.7.6: A Three-Way Table in PROC FREQ

      proc format;
        value MetroB
          0 = "Not Identifiable"
          1 = "Not in Metro Area"
          other = "In a Metro Area"
        ;
        value $MortStatus
          'No'-'Nz' = 'No'
          'Yes'-'Yz' = 'Yes'
        ;
        value Hvalue
          0-65000 = '$65,000 and Below'
          65000<-110000 = '$65,001 to $110,000'
          110000<-225000 = '$110,001 to $225,000'
          225000<-500000 = '$225,001 to $500,000'
          500000-high = 'Above $500,000'

