1.
Which program will combine Brothers.One and Brothers.Two to produce Brothers.Three?
Correct Answer
A. Data brothers.three;
set brothers.one;
set brothers.two;
run;
Explanation
This is a case of one-to-one reading, which requires multiple SET statements. Notice that where same-named variables occur, the values that are read in from the second data set replace those that are read in from the first one. Also, the number of observations in the new data set is the number of observations in the smallest original data set.
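The behavior described above can be tried with a minimal sketch. The data sets work.one and work.two and their values are hypothetical, not the Brothers data from the question.

```sas
/* Hypothetical input: work.one has 3 observations, work.two has 2 */
data work.one;
   input id name $;
   datalines;
1 Ann
2 Bob
3 Cal
;
run;

data work.two;
   input id city $;
   datalines;
10 Rome
20 Oslo
;
run;

/* One-to-one reading: each SET statement reads the next
   observation from its own data set on every iteration */
data work.three;
   set work.one;
   set work.two;
run;

proc print data=work.three;
run;
```

Under these assumptions work.three has two observations, because the step stops when the second SET statement runs out of input. ID holds 10 and 20, the values read from work.two.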
2.
Which program will combine Actors.Props1 and Actors.Props2 to produce Actors.Props3?
Correct Answer
C. Data actors.props3;
set actors.props1 actors.props2;
by actor;
run;
Explanation
This is a case of interleaving, which requires a list of data set names in the SET statement and one or more BY variables in the BY statement. Notice that observations in each BY group are read sequentially, in the order in which the data sets and BY variables are listed. The new data set contains all the variables from all the input data sets, as well as the total number of records from all input data sets.
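As a rough illustration of interleaving, the sketch below uses two hypothetical data sets, work.props_a and work.props_b, that are already in Actor order. The names and values are assumptions made for the example.

```sas
/* Hypothetical input, already sorted by actor */
data work.props_a;
   input actor $ prop $;
   datalines;
Ann anvil
Bob poker
;
run;

data work.props_b;
   input actor $ prop $;
   datalines;
Ann ladder
Bob pliers
;
run;

/* Interleave: within each BY group, observations are read
   from props_a first, then from props_b */
data work.props_all;
   set work.props_a work.props_b;
   by actor;
run;

proc print data=work.props_all;
run;
```

With this input the result has four observations in the order Ann/anvil, Ann/ladder, Bob/poker, Bob/pliers.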
3.
If you submit the following program, which new data set is created?
Correct Answer
A.
Explanation
Concatenating appends the observations from one data set to another data set. The new data set contains the total number of records from all input data sets, so b is incorrect. All the variables from all the input data sets appear in the new data set, so c is incorrect.
4.
If you concatenate the data sets below in the order shown, what is the value of Sale in observation 2 of the new data set?
Correct Answer
A. Missing
Explanation
The concatenated data sets are read sequentially, in the order in which they are listed in the SET statement. The second observation in Sales.Reps does not contain a value for Sale, so a missing value appears for this variable. (Note that if you merge the data sets, the value of Sale for the second observation is $30,000.)
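The effect can be reproduced with a small sketch. The data sets work.reps and work.close below are hypothetical stand-ins, not the Sales data from the question.

```sas
/* Hypothetical input: work.reps has no Sale variable */
data work.reps;
   input id name $;
   datalines;
1 Ann
2 Bob
;
run;

data work.close;
   input id sale;
   datalines;
1 100
2 200
;
run;

/* Concatenate: all of work.reps is read first, then all of work.close */
data work.all;
   set work.reps work.close;
run;

proc print data=work.all;
run;
```

Here work.all has four observations. Sale is missing in observations 1 and 2, which come from work.reps, and Name is blank in observations 3 and 4. Replacing the SET statement with a MERGE statement and adding BY id would instead give two observations, with Sale equal to 200 in the second.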
5.
What happens if you merge the following data sets by the variable SSN?
Correct Answer
B. The values of Age in the 2nd data set overwrite the values of Age in the 1st data set.
Explanation
If you have variables with the same name in more than one input data set, values of the same-named variable in the first data set in which it appears are overwritten by values of the same-named variable in subsequent data sets.
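A minimal sketch of the overwriting, assuming two hypothetical data sets, work.first and work.second, that both contain SSN and Age and are already in SSN order:

```sas
/* Hypothetical input: both data sets carry an Age variable */
data work.first;
   input ssn age;
   datalines;
101 30
102 41
;
run;

data work.second;
   input ssn age;
   datalines;
101 31
102 42
;
run;

/* Match-merge: for each SSN, Age from work.second is read last */
data work.both;
   merge work.first work.second;
   by ssn;
run;

proc print data=work.both;
run;
```

With this input work.both contains Age values 31 and 42, the values from work.second.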
6.
Suppose you merge data sets Health.Set1 and Health.Set2 below:
Which output does the following program create?
data work.merged;
merge health.set1(in=in1) health.set2(in=in2);
by id;
if in1 and in2;
run;
proc print data=work.merged;
run;
Correct Answer
C.
Explanation
The DATA step uses the IN= data set option and the subsetting IF statement to exclude unmatched observations from the output data set. So a and b, which contain unmatched observations, are incorrect.
7.
The data sets Ensemble.Spring and Ensemble.Summer both contain a variable named Blue. How do you prevent the values of the variable Blue from being overwritten when you merge the two data sets?
Correct Answer
D. Data ensemble.merged;
merge ensemble.spring(rename=(blue=navy))
ensemble.summer;
by fabric;
run;
Explanation
Match-merging overwrites same-named variables in the first data set with same-named variables in subsequent data sets. To prevent overwriting, rename variables by using the RENAME= data set option in the MERGE statement.
8.
What happens if you submit the following program to merge Blood.Donors1 and Blood.Donors2, shown below?
Correct Answer
C. The DATA step produces errors.
Explanation
The two input data sets are not sorted by values of the BY variable, so the DATA step produces errors and stops processing.
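One common way to avoid the error is to sort each input data set by the BY variable before merging. The sketch below assumes hypothetical data sets work.donors_a and work.donors_b and a BY variable named id; the actual Blood.Donors data and BY variable are not shown here.

```sas
/* Hypothetical input, deliberately out of ID order */
data work.donors_a;
   input id type $;
   datalines;
3 A
1 O
2 B
;
run;

data work.donors_b;
   input id units;
   datalines;
2 5
3 1
1 4
;
run;

/* Sort both inputs by the BY variable before the merge */
proc sort data=work.donors_a;
   by id;
run;

proc sort data=work.donors_b;
   by id;
run;

data work.merged;
   merge work.donors_a work.donors_b;
   by id;
run;
```

Without the two PROC SORT steps, this merge would fail because the observations are not in ID order.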
9.
If you merge Company.Staff1 and Company.Staff2 below by ID, how many observations does the new data set contain?
Correct Answer
C. 6
Explanation
In this example the new data set contains one observation for each unique value of ID. The merged data set is shown below.
10.
If you merge data sets Sales.Reps, Sales.Close, and Sales.Bonus by ID, what is the value of Bonus in the third observation in the new data set?
Correct Answer
A. $4,000
Explanation
In the new data set, the third observation is the second observation for ID number 2 (Kelly Windsor). The value for Bonus is retained from the previous observation because the value of the BY variable didn't change. The new data set is shown below.
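The retention can be seen in a small one-to-many merge. The data sets work.close and work.bonus below are hypothetical and do not reproduce the Sales data from the question.

```sas
/* Hypothetical input: ID 2 appears twice in work.close, once in work.bonus */
data work.close;
   input id sale;
   datalines;
1 100
2 200
2 300
;
run;

data work.bonus;
   input id bonus;
   datalines;
1 10
2 20
;
run;

/* Within a BY group, values read from work.bonus stay in the
   program data vector until the value of ID changes */
data work.result;
   merge work.close work.bonus;
   by id;
run;

proc print data=work.result;
run;
```

With this input work.result has three observations, and Bonus is 20 in both observations for ID 2 even though work.bonus supplies that value only once.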