__Introduction__

These practice questions are only a sample of what appears on the SAS Base exam and do not represent every type of question asked. Verify each answer yourself and cross-check it against other resources.

#### Question 1

Given the following data set WORK.CLASS:

The following program is submitted:

```
data WORK.MALES_OVER25;
set WORK.CLASS;
where Gender="M";
where age>25;
run;
```

**Question:** How many observations are in the data set WORK.MALES_OVER25?

**Answer:** 5 observations

**Explanation:**

The key here is the pair of WHERE statements. When a step contains more than one WHERE statement for the same data set, each new statement replaces the previous one, so only the LAST WHERE statement subsets the observations. The line *where Gender="M";* is therefore ignored and only *where age>25;* is applied. Every observation with an age greater than 25 is selected, regardless of gender, and 5 observations satisfy this criterion.
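If the intent were to keep only males older than 25, both conditions would need to sit in a single WHERE expression. The following is a minimal sketch that uses a small made-up data set in place of WORK.CLASS; its rows and the Name variable are assumptions for illustration only, not the data from the exam question.

```sas
/* Hypothetical stand-in for WORK.CLASS (not the exam data) */
data work.class;
  input Name $ Gender $ Age;
  datalines;
Anna F 31
Ben M 24
Carl M 40
Dina F 22
Evan M 28
;
run;

/* One WHERE statement with both conditions joined by AND */
data work.males_over25;
  set work.class;
  where Gender = "M" and Age > 25;
run;
```

With these made-up rows, only Carl and Evan satisfy both conditions, so WORK.MALES_OVER25 contains 2 observations.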

#### Question 2

Given the SAS data set WORK.ONE:

The following SAS program is submitted:

```
data WORK.TWO;
set WORK.ONE;
Total=sum(of Rev:);
run;
```

**Question:** What value will SAS assign to Total?

**Answer:** 4.8

**Explanation:** The argument *of Rev:* is a name prefix list: it refers to every variable in WORK.ONE whose name starts with the characters 'Rev', followed by any number of valid characters. The columns 'Revenue2008', 'Revenue2009' and 'Revenue2010' all start with 'Rev', so their values are added together to give Total (1.2 + 1.6 + 2.0 = 4.8).
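The prefix list can be tried out with a short sketch. It assumes WORK.ONE holds a single observation with the three revenue values quoted in the explanation; the actual table from the question is not reproduced here.

```sas
/* Assumed contents of WORK.ONE, based on the values in the explanation */
data work.one;
  Revenue2008 = 1.2;
  Revenue2009 = 1.6;
  Revenue2010 = 2.0;
run;

data work.two;
  set work.one;
  /* "of Rev:" expands to every variable whose name begins with Rev */
  Total = sum(of Rev:);
  put Total=;
run;
```

The PUT statement writes the value of Total to the log, which makes it easy to confirm the sum without printing the data set.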

#### Question 3

Given the SAS data set WORK.ONE:

And the SAS data set WORK.TWO:

The following program is submitted:

```
data WORK.BOTH;
merge WORK.ONE WORK.TWO;
by Id;
run;
```

**Question:** What is the first observation in the SAS data set WORK.BOTH?

**Answer:** Id: 182

Char1: M

Char2: Q

**Explanation:** Because the MERGE statement is followed by a BY statement, this is a match-merge: observations from the two data sets are combined where their values of the common variable Id are equal. The first observation in WORK.BOTH is therefore the one with Id = 182, which is the first Id value in both data sets.

#### Question 4

Given the SAS data set WORK.INPUT:

The following SAS program is submitted:

```
data WORK.ONE WORK.TWO;
set WORK.INPUT;
if Var1='A' then output WORK.ONE;
output;
run;
```

**Question:** How many observations will be in data set WORK.ONE?

**Answer:** 8 observations will be in WORK.ONE

**Explanation:**

The focus of this question is the OUTPUT statements. The IF statement writes an observation to WORK.ONE when Var1 is 'A'. Three rows in WORK.INPUT meet that condition, so this statement contributes 3 observations to WORK.ONE.

The OUTPUT statement on line 4 of the program is outside the IF statement and names no data set, so it writes every observation to every data set listed in the DATA statement. All 5 input observations are therefore written to WORK.ONE as well, which means the observations with Var1 = 'A' appear twice. WORK.ONE ends up with 3 + 5 = 8 observations.
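The behavior of the two OUTPUT statements can be reproduced with a small sketch. The rows below are made up for illustration (5 observations, 3 of them with Var1 = 'A') and are not the table from the exam question.

```sas
/* Hypothetical stand-in for WORK.INPUT: 5 rows, 3 with Var1 = 'A' */
data work.input;
  input Var1 $;
  datalines;
A
B
A
C
A
;
run;

data work.one work.two;
  set work.input;
  /* Explicit target: writes only to WORK.ONE */
  if Var1 = 'A' then output work.one;
  /* No target: writes to every data set named in the DATA statement */
  output;
run;
```

With these rows, WORK.ONE receives 3 + 5 = 8 observations and WORK.TWO receives 5.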

#### Question 5

The following SAS program is submitted:

```
data WORK.LOOP;
X = 0;
do Index = 1 to 5 by 2;
X = Index;
end;
run;
```

**Question:** Upon completion of execution, what are the values of the variables X and Index in the SAS data set named WORK.LOOP?

**Answer:** X = 5 and Index = 7

**Explanation:**

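Trace the DO loop step by step. Each line shows the values at the moment the loop condition is checked:

1. X = 0, Index = 1 (before the first pass)

2. X = 1, Index = 3 (after the first pass)

3. X = 3, Index = 5 (after the second pass)

4. X = 5, Index = 7 (after the third pass)

At step 4, Index exceeds the stop value of 5, so the loop ends. The implicit OUTPUT at the end of the DATA step then writes X = 5 and Index = 7 to WORK.LOOP.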
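One way to check the trace is to write the values to the log as the loop runs. This sketch is the same program with two PUT statements added; the PUT statements are the only additions.

```sas
data work.loop;
  X = 0;
  do Index = 1 to 5 by 2;
    X = Index;
    /* Values inside the loop body on each pass */
    put 'Inside loop: ' X= Index=;
  end;
  /* Index has already been incremented past the stop value here */
  put 'After loop: ' X= Index=;
run;
```

The log shows three passes (Index = 1, 3 and 5), and the final line reports X=5 and Index=7, which are the values stored in WORK.LOOP.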

#### Question 6

The following SAS program is submitted:

```
proc format;
value score 1 - 50 = 'Fail'
51 - 100 = 'Pass';
run;
```

**Question:** Which one of the following PRINT procedure steps correctly applies the format?

**a)**

```
proc print data = SASUSER.CLASS;
var test;
format test score;
run;
```

**b)**

```
proc print data = SASUSER.CLASS;
var test;
format test score.;
run;
```

**c)**

```
proc print data = SASUSER.CLASS
format = score;
var test;
run;
```

**d)**

```
proc print data = SASUSER.CLASS
format = score.;
var test;
run;
```

**Answer:** B

**Explanation:**

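a) Incorrect. In a FORMAT statement, the format name must end with a period (*score.*); without it, SAS reads *score* as a variable name.

b) Correct. The FORMAT statement associates the variable test with the user-defined format *score.*, and the period marks it as a format name.

c) Incorrect. Because there is no semicolon after SASUSER.CLASS, *format = score* is read as an option on the PROC PRINT statement, and PROC PRINT has no FORMAT= option.

d) Incorrect for the same reason as c); adding the period does not make FORMAT= a valid PROC PRINT option.

Thus, B is the correct answer.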

#### Question 7

This item will ask you to provide a line of missing code.

The SAS data set WORK.INPUT contains 10 observations, and includes the numeric variable Cost.

The following SAS program is submitted to accumulate the total value of Cost for the 10 observations:

```
data WORK.TOTAL;
set WORK.INPUT;
/* missing statement goes here */
Total=Total+Cost;
run;
```

**Question:** Which statement correctly completes the program?

**Answer:** *retain Total 0;*

**Explanation:** The RETAIN statement initializes Total to 0 and keeps its value from one iteration of the DATA step to the next. Without it, Total would be reset to missing at the start of every iteration, and Total+Cost would evaluate to missing. With it, Total accumulates the value of Cost across all 10 observations.
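The running total can be seen with a small sketch. The three Cost values are made up for illustration (the question's data set has 10 observations that are not shown), and WORK.TOTAL_SUM is a hypothetical name used only to show the sum statement as an alternative.

```sas
/* Hypothetical stand-in for WORK.INPUT with three Cost values */
data work.input;
  input Cost;
  datalines;
10
20
30
;
run;

/* RETAIN keeps Total across iterations and starts it at 0 */
data work.total;
  set work.input;
  retain Total 0;
  Total = Total + Cost;
run;

/* Alternative: the sum statement retains Total and starts it at 0 implicitly */
data work.total_sum;
  set work.input;
  Total + Cost;
run;
```

With these values, Total is 10, 30 and 60 on the three observations in both data sets. The sum statement also skips missing values of Cost, whereas the assignment with + would turn Total into missing.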

#### Question 8

Given the following SAS error log:

```
44 data WORK.OUTPUT;
45 set SASHELP.CLASS;
46 BMI=(Weight*703)/Height**2;
47 where bmi ge 20;
ERROR: Variable bmi is not on file SASHELP.CLASS.
48 run;
```

**Question:** What change to the program will correct the error?

**Answer:** Replace the WHERE statement with an IF statement

**Explanation:**

A WHERE statement is evaluated as observations are read from the input data set, so it can reference only variables that already exist in that data set. BMI is not a variable in SASHELP.CLASS, so the WHERE statement fails.

A subsetting IF statement evaluates its condition against the program data vector. BMI is created by the assignment statement in the DATA step, so it is in the program data vector by the time the IF statement runs and can be used in the condition.
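Applied to the program from the log, the fix is a one-word change. A minimal sketch of the corrected step:

```sas
data work.output;
  set sashelp.class;
  BMI = (Weight * 703) / Height**2;
  /* Subsetting IF: evaluated after BMI has been computed */
  if BMI ge 20;
run;
```

The subsetting IF keeps only the observations for which the condition is true, so the result matches what the WHERE statement was meant to do.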

#### Question 9

The following SAS program is submitted:

```
data WORK.TEMP;
Char1='0123456789';
Char2=substr(Char1,3,4);
run;
```

**Question:** What is the value of Char2?

**Answer:** 2345

**Explanation:** Here, Char2 is assigned the substring of Char1 that starts at the 3rd character and is 4 characters long, including the 3rd one. The 3rd character is 2, the 4th is 3, the 5th is 4 and the 6th is 5. Thus, Char2='2345'.

#### Question 10

This project will use data set **sashelp.shoes**.

Write a SAS program that will:

Run the program and answer the following questions:

```
/* Create work.sortedshoes, sorted by product in ascending order and, within each product, by sales in descending order (DESCENDING applies to the variable that follows it) */
proc sort data=sashelp.shoes out=work.sortedshoes;
by product descending sales;
run;
/* Print observations 130 through 148 with all columns, so that Region is visible */
proc print data=work.sortedshoes (firstobs=130 obs=148);
run;
```

Output:

**Question 1:** What is the value of the **Product** variable in observation 148?

**Answer:** Sandal

**Question 2:** What is the value of the **Region** variable in observation 130?

**Answer:** Eastern Europe

#### Question 11

This project will use the data set **sashelp.shoes**.

Write a SAS program that will:

Run the program, then use additional SAS procedures to answer the following questions:

Program code:

```
data work.shoerange;
/* Read the shoes dataset from the sashelp library as input */
set sashelp.shoes;
/* Set length of the SalesRange variable to the longest assigned value (which is 'Middle') */
length SalesRange $ 6;
/* Categorize SalesRange into three groups based on the value of Sales */
if Sales < 100000 then
SalesRange = 'Lower';
else if 100000 <= Sales <= 200000 then
SalesRange = 'Middle';
else SalesRange = 'Upper';
run;
proc print data=work.shoerange;
run;
```

Partial print output:

**Question 1:** How many observations are classified into the “Lower” group?

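**Answer:** 288

**Explanation:** Run PROC PRINT with a WHERE statement that keeps only the observations where SalesRange = 'Lower', then read the count from the note in the log that reports how many observations were read from WORK.SHOERANGE. Do not rely on the Obs column of the last printed row: it shows the observation's position in the original data set (394 here), not a count of the selected rows.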

```
proc print data=work.shoerange;
where SalesRange = 'Lower';
run;
```
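A frequency table is another way to get the group counts, since it reports the number of observations for each value of SalesRange in one step. This is a minimal sketch that assumes work.shoerange has already been created by the program above.

```sas
/* Count the observations in each SalesRange group */
proc freq data=work.shoerange;
  tables SalesRange / nocum nopercent;
run;
```

The NOCUM and NOPERCENT options only trim the table down to the Frequency column; they can be dropped if the percentages are of interest.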

**Question 2:** What is the mean value of observations in the “Middle” group? Round your answer to the nearest whole number.

**Answer:** 135127 (Rounded from 135126.88)

**Explanation:** Use PROC MEANS with the MEAN option to get the mean of the Sales column, and add a WHERE statement inside the procedure to restrict it to the 'Middle' group.

```
proc means data=work.shoerange mean;
var Sales;
where SalesRange = 'Middle';
run;
```

Proc means output:

#### Question 12

This project will work with the following program:

```
data work.lowchol work.highchol;
set sashelp.heart;
if cholesterol lt 200 output work.lowchol;
if cholesterol ge 200 output work.highchol;
if cholesterol is missing output work.misschol;
run;
```

This program is intended to:

Fix the errors in the above program. There may be multiple errors in the program. Errors may be syntax errors, program structure errors, or logic errors. In the case of logic errors, the program may not produce an error in the log.

Corrected program code:

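```
data work.lowchol work.highchol work.misschol;
set sashelp.heart;
/* Test for missing first: a missing value compares as less than any number */
if missing(cholesterol) then output work.misschol;
else if cholesterol lt 200 then output work.lowchol;
else output work.highchol;
run;
```

The original program has four problems: work.misschol is not listed in the DATA statement, the IF statements lack THEN, IS MISSING is valid only in a WHERE expression, and, as a logic error, *cholesterol lt 200* is also true for missing values, so observations with a missing cholesterol value would be written to work.lowchol.

After fixing all of the errors in the program, answer the following questions:

**Question 1:** How many observations are in the **work.highchol** data set?

**Answer:** 3652

**Explanation:** Use PROC CONTENTS to view the number of observations in the work.highchol data set.

```
proc contents data=work.highchol;
run;
```

**Question 2:** How many observations are in the **work.lowchol** data set?

**Answer:** 1405

**Explanation:** Same as above, except that work.lowchol replaces work.highchol in the PROC CONTENTS step. If the missing values are not handled first, the count comes out as 1557, because the 152 observations with a missing cholesterol value are included as well.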

#### Question 13

Given the following SAS data sets ONE and TWO:

The following SAS program is submitted:

```
proc sql;
select one.*, sales
from one right join two
on one.year = two.year;
quit;
```

Which one of the following reports is generated?

**Answer:** D

**Explanation:** The select list asks for every column of ONE, as indicated by 'one.*', plus the sales column, which exists only in TWO. The tables are joined on the 'year' column, and because this is a right join, every row of TWO is kept whether or not ONE has a matching year. The year 2001 appears twice in ONE but only once in TWO, so the single 2001 row in TWO matches both rows in ONE and its sales value is repeated on each of the two 2001 rows in the result.
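The effect of the right join can be tried on two small made-up tables. The column Budget and all of the values below are assumptions for illustration only; they are not the tables from the exam question.

```sas
/* Hypothetical ONE: 2001 appears twice */
data one;
  input Year Budget;
  datalines;
2001 500
2001 400
2002 700
;
run;

/* Hypothetical TWO: 2001 appears once, 2003 has no match in ONE */
data two;
  input Year Sales;
  datalines;
2001 300
2003 900
;
run;

proc sql;
  /* Right join: every row of TWO is kept */
  select one.*, sales
    from one right join two
      on one.year = two.year;
quit;
```

With these rows, the report has two rows for 2001, both with Sales = 300, and one row for the unmatched 2003 row of TWO, where the columns from ONE (including Year) are missing and Sales = 900. The 2002 row of ONE does not appear, because a right join does not keep unmatched rows from the left table.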
