Intermediate Level SAS Interview Questions
31. How can we remove duplicates using PROC SQL?
We can remove duplicate rows by selecting only the distinct rows into a new table:
Proc SQL noprint;
Create Table inter.Merged1 as
Select distinct * from inter.readin ;
Quit;
32. Define PDV.
The Program Data Vector (PDV) is a logical area of memory in which SAS builds a dataset one observation at a time. When a DATA step reads raw data, SAS creates an input buffer at compile time to hold a record from the external file, and then creates the PDV.
33. List some of the most common programming errors that occur in SAS.
The most common programming errors are:
- Missing semicolon
- Not checking the log after submitting the program
- Unmatched quotation marks
- Invalid dataset option
- Invalid statement option
- Not using the FSVIEW procedure to browse the data
- Not using debugging techniques
34. What is the difference between using the DROP= data set option in the DATA statement and in the SET statement?
Use the DROP= data set option in the SET statement when you do not need to process a variable at all. The variable is never read into the program data vector, so it cannot be used in the step and does not appear in the new data set.
Use the DROP= data set option in the DATA statement when you need a variable during processing but do not want it to appear in the new data set.
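The difference is easier to see side by side. The following is a minimal sketch; the data set names (work.scores, work.totals, work.no_test2) and variables are made up for illustration.

```sas
/* Hypothetical input data, created only for this illustration */
data work.scores;
    input name $ test1 test2;
    datalines;
Ann 80 90
Bob 70 60
;
run;

/* DROP= on the DATA statement: test1 and test2 are read and can be used
   in the step, but they are not written to the output data set */
data work.totals (drop=test1 test2);
    set work.scores;
    total = test1 + test2;
run;

/* DROP= on the SET statement: test2 is never read into the PDV,
   so it is unavailable in this step and absent from the output */
data work.no_test2;
    set work.scores (drop=test2);
run;
```

If the second step tried to compute test1 + test2, SAS would treat test2 as a new, uninitialized variable, which is why the placement of DROP= matters.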
35. How can you Interleave SAS datasets?
Interleaving combines two or more sorted data sets into a single sorted data set. It uses a SET statement, which lists the data sets to interleave, and a BY statement, which names the variable(s) by which the final data set is ordered. Each input data set must already be sorted by the BY variable(s).
We can interleave as many data sets as we want. The number of observations in the new data set is the sum of the observations in the original data sets.
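As a small sketch of interleaving, assume two made-up data sets, work.east and work.west, that are each already in order by id:

```sas
/* Two hypothetical data sets, each already sorted by id */
data work.east;
    input id region $;
    datalines;
1 East
4 East
;
run;

data work.west;
    input id region $;
    datalines;
2 West
3 West
;
run;

/* SET lists the inputs; BY tells SAS to interleave them in id order */
data work.all_regions;
    set work.east work.west;
    by id;
run;
```

The result has four observations in id order (1, 2, 3, 4). Without the BY statement, the same SET statement would simply concatenate the data sets, placing all of work.east before work.west.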
36. What are _N_ and _ERROR_ in SAS?
_N_ and _ERROR_ are automatic variables that SAS creates in every DATA step. They are available during the step but are not written to the output data set.
- _N_: This variable counts how many times the DATA step has iterated. The value is initially set to 1 and increases by 1 each time the DATA step begins a new iteration
- _ERROR_: This variable signals errors during execution, such as invalid input data, illegal math operations, or failed conversions. The value is 0 by default and is set to 1 when an error occurs
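To watch both automatic variables change, we can print them to the log on every iteration. This is a minimal sketch with made-up inline data; the second record is deliberately invalid for a numeric variable.

```sas
data _null_;
    input amount;
    /* Write the iteration counter, the error flag, and the value to the log */
    put _n_= _error_= amount=;
    datalines;
10
abc
30
;
run;
```

On the second iteration SAS cannot read "abc" as a number, so amount is set to missing and _ERROR_ is 1 for that iteration only. It is back to 0 on the third iteration.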
37. What is the purpose of the SUBSTR function in SAS programming?
In SAS programming, the SUBSTR function is used whenever the program needs to extract a substring from a character value.
The function returns the characters that begin at a given start position and run for a given length. If the length is omitted, it returns the rest of the string.
Syntax:
SUBSTR(char_var, start, length)
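A short sketch of SUBSTR with and without the length argument, using a made-up value:

```sas
data _null_;
    code  = 'NY-10001';
    state = substr(code, 1, 2);   /* 2 characters starting at position 1: NY */
    zip   = substr(code, 4);      /* position 4 to the end of the string: 10001 */
    put state= zip=;
run;
```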
38. What are the ways in which Macro variables can be created in SAS?
There are many methods by which Macro variables can be created. Some of them are listed below:
- %Let Statements
- Macro parameters
- %Do Statements
- INTO in PROC SQL
- CALL SYMPUTX routine
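The sketch below shows four of these methods together. It assumes the SASHELP.CLASS sample table is available; the macro variable names (cutoff, n_rows, run_date) and the macro name show are invented for the example.

```sas
/* 1. %LET statement */
%let cutoff = 13;

/* 2. INTO clause in PROC SQL (TRIMMED removes leading and trailing blanks) */
proc sql noprint;
    select count(*) into :n_rows trimmed
    from sashelp.class
    where age >= &cutoff;
quit;

/* 3. CALL SYMPUTX in a DATA step */
data _null_;
    call symputx('run_date', put(today(), date9.));
run;

/* 4. Macro parameter: label exists only while the macro runs */
%macro show(label);
    %put &label: &n_rows rows, run on &run_date;
%mend show;

%show(Teenagers)
```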
39. Name some key concepts of SAS.
Some key concepts of SAS are:
- SORT procedure
- Missing values
- KEEP=, DROP= dataset options
- Data step logic
- Reset to missing, or the RETAIN statement
- Log
- FORMAT procedure for creating value formats
- Data types
- IN= dataset option
40. Tell the difference between INPUT and INFILE.
The INFILE statement identifies the external file that contains the data, while the INPUT statement describes how the values in each record are read into variables.
Syntax of INPUT is
INPUT varname1 varname2;
Syntax of INFILE is
INFILE 'filename';
41. How can you filter data in SAS?
Data can be filtered in SAS with a WHERE statement in a DATA step or PROC step:
PROC PRINT DATA=dataset_name;
WHERE variable = 'value';
RUN;
42. How do you create a temporary dataset in SAS?
To create a temporary dataset, use a single-level name in the DATA statement. By default, SAS stores it in the WORK library, whose contents are automatically deleted at the end of the SAS session.
43. How can you rename variables in SAS?
Variables in a SAS dataset can be renamed using the RENAME statement in a DATA step:
DATA new_dataset;
SET old_dataset;
RENAME old_variable = new_variable;
RUN;
44. What is the difference between PUT and INPUT functions?
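The PUT function applies a format to a value and always returns a character value, so it is commonly used to convert numeric values to character values. The INPUT function reads a character value with an informat and is commonly used to convert character values to numeric values.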
45. How do you read raw data in SAS?
Raw data can be read into SAS using the INFILE statement in a DATA step. Example:
DATA dataset_name;
INFILE 'file_path';
INPUT var1 var2 var3;
RUN;
46. What is the purpose of PROC TABULATE?
PROC TABULATE is used to create complex, customizable tabular reports from SAS datasets.
47. How do you create a SAS macro variable?
You can create a macro variable using the %LET statement or through CALL SYMPUT in a DATA step.
48. What is a BY statement, and how is it used?
A BY statement is used in SAS to process observations by groups. It names the key variable(s) in PROC SORT, in DATA steps that use SET, MERGE, or UPDATE, and in many other procedures that can run separately for each group.
49. What is the difference between a DATA step and a PROC step?
A DATA step is used to create or modify SAS datasets, while a PROC step is used to process or analyze data using pre-written procedures.
50. What is the purpose of PROC TRANSPOSE?
PROC TRANSPOSE is used to convert rows into columns and vice versa. It’s particularly useful for restructuring data.
PROC TRANSPOSE DATA=dataset OUT=transposed_data;
BY variable;
VAR columns;
RUN;
51. How do you create a custom format in SAS?
You can create a custom format using PROC FORMAT with the VALUE statement to define the mapping of values to labels.
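As a minimal sketch, the format below groups ages into labels and is then applied in PROC FREQ. The format name agegrp and its cut-offs are arbitrary choices for illustration, and the SASHELP.CLASS sample table is assumed to be available.

```sas
proc format;
    value agegrp
        low -< 13 = 'Child'    /* anything below 13 */
        13 - 19   = 'Teen'
        20 - high = 'Adult';
run;

/* The FORMAT statement makes PROC FREQ count by label instead of raw age */
proc freq data=sashelp.class;
    tables age;
    format age agegrp.;
run;
```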
52. What is the difference between KEEP and DROP statements in SAS?
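- KEEP: Specifies which variables to retain in the output dataset.
- DROP: Specifies which variables to exclude from the output dataset.
Both exist as DATA step statements and as data set options (KEEP=, DROP=). Example using the KEEP= data set option: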
DATA new_data;
SET old_data (KEEP=var1 var2);
RUN;
53. How do you perform a many-to-many merge in SAS?
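When both datasets contain repeated key values, a DATA step MERGE with a BY statement pairs observations in sequence within each BY group and does not produce every combination. To get every combination, use a PROC SQL join on the key (inner or full outer), or a DATA step that loops over the second dataset with a SET statement and the POINT= option.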
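Here is a minimal sketch of the PROC SQL route. The tables work.tbl_a and work.tbl_b and their columns are made up; both contain the key value 1 twice.

```sas
data work.tbl_a;
    input id a_val;
    datalines;
1 10
1 20
2 30
;
run;

data work.tbl_b;
    input id b_val;
    datalines;
1 100
1 200
;
run;

/* Every row of tbl_a with id 1 is paired with every row of tbl_b with id 1 */
proc sql;
    create table work.all_pairs as
    select a.id, a.a_val, b.b_val
    from work.tbl_a as a
         inner join work.tbl_b as b
         on a.id = b.id;
quit;
```

The join returns four rows for id 1 (two rows times two rows). The row with id 2 is dropped because this sketch uses an inner join; a full outer join would keep it with a missing b_val.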
54. What is the CALL statement in SAS?
The CALL statement is used to execute SAS routines within a DATA step. For example, CALL SYMPUT assigns a value to a macro variable.
55. What is the difference between FIRST. and LAST. variables?
FIRST. and LAST. are temporary variables created by SAS when using a BY statement. FIRST. is true for the first observation of each BY group, while LAST. is true for the last observation.
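A common use is counting the observations in each BY group. This sketch assumes the SASHELP.CLASS sample table; the output names are illustrative.

```sas
/* BY-group processing needs the data sorted (or indexed) by the BY variable */
proc sort data=sashelp.class out=work.class_sorted;
    by sex;
run;

data work.sex_counts;
    set work.class_sorted;
    by sex;
    if first.sex then n = 0;   /* reset the counter at the start of each group */
    n + 1;                     /* sum statement: n is retained across iterations */
    if last.sex then output;   /* write one row per group, at its last observation */
    keep sex n;
run;
```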
56. How do you use arrays in SAS?
Arrays in SAS are used to process a series of variables in the same manner. Example:
ARRAY nums[3] var1 var2 var3;
DO i = 1 TO 3;
nums[i] = nums[i] * 2;
END;
57. How do you create a cumulative sum in SAS?
You can create a cumulative sum using the SUM statement or the SUM function in a DATA step.
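Both approaches can be sketched as follows, assuming the SASHELP.CLASS sample table; the output data set and variable names are illustrative.

```sas
/* Sum statement: cum_weight starts at 0, is retained automatically,
   and missing values of weight are ignored */
data work.running1;
    set sashelp.class;
    cum_weight + weight;
run;

/* SUM function: needs an explicit RETAIN so the total survives each iteration */
data work.running2;
    set sashelp.class;
    retain cum_weight 0;
    cum_weight = sum(cum_weight, weight);
run;
```

The two steps produce the same running total. The sum statement is shorter, while the RETAIN version makes the initial value explicit.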
58. How can you merge datasets with non-matching keys in SAS?
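You can merge datasets with non-matching keys using a MERGE statement with the IN= data set option, which creates a flag showing whether each dataset contributed to the current observation. Example:
DATA merged;
MERGE dataset1(IN=a) dataset2(IN=b);
BY key_var;
IF a AND b; /* keeps keys found in both; use IF a AND NOT b for keys found only in dataset1 */
RUN;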
59. How do you handle unbalanced data in PROC MIXED?
PROC MIXED can handle unbalanced data naturally. You can use the METHOD= option to specify different estimation methods, such as REML or ML, depending on your needs.
60. What is the RETAIN statement in SAS?
The RETAIN statement keeps the value of a variable across iterations of the DATA step. Without it, variables created by INPUT or assignment statements are reset to missing at the start of each iteration.
61. What is the difference between PROC SQL and traditional SAS DATA step programming?
PROC SQL uses SQL syntax and can perform operations like joins, subqueries, and creating tables in a single step, while traditional DATA step programming may require multiple steps for complex operations.
62. How do you perform conditional processing in SAS?
You can perform conditional processing using IF-THEN-ELSE statements within a DATA step. Example:
DATA new_data;
SET old_data;
IF var1 = 'A' THEN new_var = 1;
ELSE new_var = 0;
RUN;
63. How do you create a SAS macro function?
You can create a SAS macro function using the %MACRO statement to define the function and its parameters, followed by the macro code and ending with %MEND.
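As a minimal sketch, the macro below takes one positional parameter and one keyword parameter with a default value. The macro name print_top and its parameters are invented for the example, and the SASHELP.CLASS sample table is assumed to be available.

```sas
%macro print_top(ds, n=5);
    /* Print the first &n observations of the data set passed in &ds */
    proc print data=&ds (obs=&n);
    run;
%mend print_top;

/* Call it with the positional argument and an override for n */
%print_top(sashelp.class, n=3)
```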
64. What is PROC UNIVARIATE?
PROC UNIVARIATE provides detailed descriptive statistics, including measures of central tendency and distribution shape, along with hypothesis testing.
65. What is the purpose of the ODS (Output Delivery System) in SAS?
ODS is used to format and generate output in various formats (e.g., HTML, PDF, RTF) and to control the appearance of SAS output.
SAS Interview Questions For Experienced
66. How can all the numerical variables be recoded using arrays?
We can define an array with the _NUMERIC_ variable list and use the DIM function to loop over every numeric variable and recode it. The example below sets every value of 6 to missing:
data readin;
set readin; /* read the existing data so the numeric variables exist */
array Q(*) _numeric_;
do i=1 to dim(Q);
if Q(i)=6 then Q(i)=.;
end;
drop i;
run;
67. How can we identify the number of iterations and specific conditions within a single 'do' loop?
An iterative DO statement can combine a fixed range with an UNTIL condition, so the loop ends as soon as either limit is reached:
data work;
do i=1 to 20 until(Sum>=20000);
Year+1;
Sum+2000;
Sum+Sum*.10; /* add 10% to the running total */
end;
run;
In this code, the DO loop executes until Sum is greater than or equal to 20,000 or until it has run 20 times, whichever comes first. With these values, the condition is met on the seventh iteration.
68. Which SAS program command is used to achieve sorting?
Sorting can be done on single or multiple variables using PROC SORT. When the OUT= option is specified, the sorted result is written to a new data set and the original data set is left unchanged. Without OUT=, the original data set is replaced by the sorted version.
Syntax:
PROC SORT DATA=original OUT=sorted;
BY variable;
RUN;
Here, 'original' is the input dataset, 'sorted' is the resulting sorted dataset, and 'variable' is the column on which sorting is done.
Sorting is ascending by default. To sort in descending order, place the DESCENDING keyword in the BY statement before the name of the column being sorted.
Syntax:
PROC SORT DATA=original OUT=sorted;
BY DESCENDING variable;
RUN;
69. How is a character variable converted into a numeric variable and vice versa?
In SAS programming, there are many tasks where a character value has to be converted into a numeric value and vice versa.
We can use the PUT function to convert a numeric value to a character value.
Example:
char_var = PUT(num_var, 7.);
And we can use the INPUT function to convert a character value to a numeric value.
Example:
num_var = INPUT(char_var, 3.);
70. Describe the purpose of the RETAIN statement.
As its name implies, the RETAIN statement keeps a variable's value after it has been assigned.
When a DATA step moves from one iteration to the next, RETAIN tells SAS to carry the value forward instead of resetting it to missing.
Example:
data abc;
set xyz;
RETAIN z 0;
z = z + 1;
run;
Here, RETAIN initializes z to 0 and carries its value across iterations, so z is 1 for the first observation and increases by 1 for each observation read.
71. Describe the function of the output statement in a SAS program.
In a DATA step, the OUTPUT statement writes the current observation to a data set. In procedures such as MEANS, UNIVARIATE, and CAPABILITY, the OUTPUT statement saves summary statistics in a SAS data set so that custom reports can be created or past details about a process can be kept.
In a procedure, we can use the OUTPUT statement in the following ways:
- Stating the name of the output data set
- Stating the statistics to be saved in the output data set
- Computing and storing percentiles that are not calculated automatically by the CAPABILITY procedure
72. Why and when can we use PROC SQL?
Compared to a DATA step merge, PROC SQL is often more practical for table joins because it does not require the key columns to be sorted before the join. A DATA step is more appropriate for processing that occurs sequentially, observation by observation.
PROC SQL can also save a great deal of time when you want to select, modify, or format variables, subset data, or create macro variables in the same step. It also offers a lot of flexibility in how tables are joined.
73. Describe the common mistakes when programming in SAS.
Following are some common mistakes in SAS.
- Quotation marks do not match
- No debugging techniques are used
- The data set option is invalid, or the statement option is invalid
- The log is not checked after a program is submitted
- The data is not sorted before using an instruction that requires sorting
74. How will you add a number to a Macro variable?
Use the %EVAL function for integer arithmetic, or the %SYSEVALF function if the number is a floating-point value.
The macro variable itself can be created with:
- CALL SYMPUT
- PROC SQL (INTO clause)
- %LET statement
- Macro parameters
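A short sketch of macro arithmetic with %EVAL and %SYSEVALF. The macro variable names count and rate are made up for the example.

```sas
%let count = 5;
%let count = %eval(&count + 1);         /* integer arithmetic: count becomes 6 */

%let rate = 2.5;
%let rate = %sysevalf(&rate + 0.75);    /* floating-point arithmetic: rate becomes 3.25 */

%put count=&count rate=&rate;
```

Without one of these functions, %LET stores the text as written, so %let count = &count + 1; would leave the literal text 5 + 1 in the macro variable.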
75. Describe Normal Distribution.
Data can be distributed in many ways: skewed to the left, skewed to the right, or spread irregularly.
In a normal distribution, the data are spread symmetrically around a central value with no skew to the left or right, so the distribution of the random variable takes the shape of a symmetrical bell curve.
76. How do you export data from SAS to Excel?
You can export data to Excel using PROC EXPORT:
PROC EXPORT DATA=dataset
OUTFILE='file_path'
DBMS=xlsx REPLACE;
RUN;
77. How do you perform error handling in SAS?
Error handling in SAS can be done using options like ERRORABEND, ERRORS=, and the %ABORT statement. You can also use conditional logic to check for specific error conditions.
78. How do you calculate the frequency of a variable in SAS?
The frequency of a variable can be calculated using PROC FREQ:
PROC FREQ DATA=dataset;
TABLES variable;
RUN;
79. What is the difference between a hash object and a traditional SAS array?
A hash object is a dynamic data structure that allows fast data lookup and storage, while a traditional SAS array is a fixed-size data structure with predetermined dimensions.
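The lookup use case can be sketched as follows. The lookup table work.sex_lookup and its variables code and descr are made up for the example, and the SASHELP.CLASS sample table is assumed to be available.

```sas
/* Hypothetical lookup table */
data work.sex_lookup;
    input code $ descr $;
    datalines;
F Female
M Male
;
run;

data work.class_labeled;
    /* Never executes; it only defines code and descr in the PDV at compile time */
    if 0 then set work.sex_lookup;

    /* Load the lookup table into memory once, on the first iteration */
    if _n_ = 1 then do;
        declare hash h(dataset: 'work.sex_lookup');
        h.defineKey('code');
        h.defineData('descr');
        h.defineDone();
    end;

    set sashelp.class;
    code = sex;
    /* find() returns 0 on a match and copies descr from the hash into the PDV */
    if h.find() ne 0 then descr = ' ';
    drop code;
run;
```

Unlike an array, the hash object is sized at run time from the data it loads, and it is looked up by key value rather than by position.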
80. How do you subset data in SAS?
You can subset data in SAS using a WHERE statement or a subsetting IF statement in a DATA step:
DATA subset_data;
SET original_data;
WHERE condition;
RUN;
81. How do you optimize SAS code for better performance?
To optimize SAS code, you can:
- Use indexes for large datasets
- Avoid unnecessary sorting
- Use WHERE instead of IF when possible
- Minimize I/O operations
- Use PROC SQL for complex data manipulations
82. What is PROC CORR used for in SAS?
PROC CORR is used to calculate correlation coefficients between variables. Example:
PROC CORR DATA=dataset;
VAR var1 var2;
RUN;
83. What is the purpose of PROC FCMP?
PROC FCMP (Function Compiler) is used to create, test, and store user-defined functions and CALL routines for use in DATA step and SQL processing.
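A minimal sketch of defining and calling a user-defined function. The function name c_to_f and the storage location work.funcs.demo are arbitrary choices for illustration.

```sas
/* Compile the function and store it in a package inside work.funcs */
proc fcmp outlib=work.funcs.demo;
    function c_to_f(celsius);
        return (celsius * 9 / 5 + 32);
    endsub;
run;

/* Tell SAS where to look for compiled functions */
options cmplib=work.funcs;

data _null_;
    f = c_to_f(100);   /* 100 degrees Celsius is 212 degrees Fahrenheit */
    put f=;
run;
```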
84. What is PROC GLM in SAS?
PROC GLM (General Linear Model) is used for regression analysis and analysis of variance (ANOVA).
85. How do you handle date and time calculations in SAS?
SAS provides various functions for date and time calculations, such as INTCK, INTNX, DATDIF, and TIME. You can also use arithmetic operations on SAS date values.
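The sketch below applies several of these functions to two made-up dates. SAS date values are counts of days, which is why plain subtraction works.

```sas
data _null_;
    start  = '15JAN2024'd;
    finish = '10MAR2024'd;

    days_between   = finish - start;                     /* 55 days */
    months_crossed = intck('month', start, finish);      /* 2 month boundaries crossed */
    next_month_1st = intnx('month', start, 1);           /* first day of the next month */
    actual_days    = datdif(start, finish, 'act/act');   /* 55 actual days */

    put days_between= months_crossed= next_month_1st= date9. actual_days=;
run;
```

Note that INTCK counts interval boundaries rather than full elapsed months, so the answer is 2 even though fewer than two full months separate the dates.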
86. How do you run a macro in SAS?
A macro in SAS is run using the %macro_name syntax. Example:
%macro macro_name;
/* macro code */
%mend;
%macro_name;
87. How do you use SAS/GRAPH to create custom visualizations?
SAS/GRAPH provides procedures like GPLOT, GCHART, and GMAP for creating various types of graphs. You can customize these using options, statements, and annotation facilities.
88. How do you transpose data from wide to long format in SAS?
You can transpose wide data to long format using PROC TRANSPOSE:
PROC TRANSPOSE DATA=wide_data OUT=long_data;
BY id_var;
VAR columns_to_transpose;
RUN;
89. What is the purpose of PROC SURVEYSELECT?
PROC SURVEYSELECT is used to select probability-based random samples from a dataset, supporting various sampling designs such as simple random sampling, stratified sampling, and cluster sampling.
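As a minimal sketch, the step below draws a simple random sample of five observations. It assumes the SASHELP.CLASS sample table; the output name and seed value are arbitrary.

```sas
proc surveyselect data=sashelp.class
                  out=work.class_sample
                  method=srs      /* simple random sampling without replacement */
                  n=5             /* sample size */
                  seed=12345;     /* fixed seed so the sample is reproducible */
run;
```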
90. What is PROC LIFETEST, and how is it used?
PROC LIFETEST is used for survival analysis in SAS. It estimates survival probabilities and performs non-parametric analysis of time-to-event data using methods like Kaplan-Meier estimation.
PROC LIFETEST DATA=dataset;
TIME time_var*censoring_var(0);
STRATA group_var;
RUN;
91. How do you perform text mining in SAS?
Text mining in SAS can be performed using SAS Text Miner, which includes procedures like PROC TMFILTER, PROC TMPPARSE, and PROC TEXTCLUST for text preprocessing, parsing, and clustering.
92. How do you debug a SAS program?
Debugging in SAS can be done by:
- Using the PUT statement to print intermediate values.
- Applying the OPTIONS MPRINT, MLOGIC, and SYMBOLGEN to trace macro execution.
- Checking the SAS log for warnings, errors, and notes.
- Running PROC COMPARE to compare datasets and validate results.
93. What is the difference between explicit and implicit output in a DATA step?
Explicit output uses the OUTPUT statement to write observations to a dataset, while implicit output automatically writes an observation at the end of each DATA step iteration unless suppressed.
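A small sketch of explicit output that routes observations to two data sets. The data set names are illustrative.

```sas
data work.evens work.odds;
    do i = 1 to 6;
        /* Explicit OUTPUT chooses the target data set for each value of i */
        if mod(i, 2) = 0 then output work.evens;
        else output work.odds;
    end;
run;
```

Because the step contains an OUTPUT statement, SAS does not perform the implicit output at the end of the iteration. Each data set receives only the three observations written explicitly.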
94. What is the use of PROC COMPARE?
PROC COMPARE is used to compare two datasets in SAS. It provides a detailed comparison of variables, data values, and dataset attributes.
PROC COMPARE BASE=dataset1 COMPARE=dataset2;
RUN;
95. How do you use PROC TABULATE for complex reporting?
PROC TABULATE allows you to create complex, multi-dimensional tables using CLASS, VAR, and TABLE statements. You can customize the output using options like FORMAT=, MISSTEXT=, and style-related options.
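A minimal sketch that combines these statements and options, assuming the SASHELP.CLASS sample table:

```sas
proc tabulate data=sashelp.class format=8.1;
    class sex;                 /* classification variable: defines the rows */
    var height weight;         /* analysis variables */
    /* Rows: each sex plus a total row; columns: N and mean for each variable */
    table sex all='Total',
          (height weight) * (n*f=4. mean)
          / misstext='-';
run;
```

The comma in the TABLE statement separates the row dimension from the column dimension, and the asterisk nests statistics under each analysis variable.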
96. How can you create an index on a dataset in SAS?
An index can be created in SAS using the INDEX option in the DATA step or PROC DATASETS:
DATA indexed_data (INDEX=(var1 var2));
SET original_data;
RUN;
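The PROC DATASETS route can be sketched as follows. It works on a copy of the SASHELP.CLASS sample table; the names class_copy and sex_age are made up.

```sas
/* Make a copy in WORK so the sample table itself is not modified */
data work.class_copy;
    set sashelp.class;
run;

proc datasets library=work nolist;
    modify class_copy;
    index create name;                  /* simple index on one variable */
    index create sex_age = (sex age);   /* composite index on two variables */
quit;
```

PROC DATASETS adds the index to an existing data set without rewriting its observations, whereas the INDEX= data set option builds the index while the data set is being created.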
97. What is the difference between a data step and a procedure step in SAS?
- DATA step: Used for reading, manipulating, and transforming data. It processes data row by row.
- PROC step: Uses predefined procedures to perform analysis, summary statistics, and data manipulation. It operates on entire datasets at once.
98. How do you implement parallel processing in SAS?
Parallel processing in SAS can be implemented using:
- The THREADS system option
- PROC SORT with the THREADS option
- SAS Grid Manager for distributed processing
- PROC DS2 for threaded DATA step processing
99. How do you write an efficient PROC SQL query in SAS?
To write an efficient PROC SQL query:
- Use INDEX on key columns for faster lookups.
- Avoid selecting unnecessary columns with SELECT *.
- Use INNER JOIN instead of OUTER JOIN where possible.
- Leverage SAS SQL pass-through for database queries.
100. How do you use PROC LOGISTIC?
PROC LOGISTIC performs logistic regression in SAS, typically for binary or multinomial outcomes. Example:
PROC LOGISTIC DATA=dataset;
MODEL target_var(event='1') = var1 var2 var3;
RUN;
Frequently Asked Questions
How do I prepare for a SAS interview?
Learn about the different SAS procedures and functions and practice the above SAS interview questions. Make sure to practice writing SAS code.
What is SAS functionality?
SAS functionality mainly includes data management, statistical analysis, data visualization such as charts and graphs, and business intelligence reporting such as dashboards.
What are the capabilities of the SAS framework?
The SAS framework offers scalability (it can be scaled to handle large data sets), interoperability (it can be integrated with other applications), extensibility (it can be extended with new functionality), and security features for protecting sensitive data.
Conclusion
We have discussed SAS interview questions at several levels of difficulty, along with answers and code examples.
We hope this blog has helped you enhance your knowledge of SAS.
Happy Learning!