PROC CONTENTS is a fundamental SAS procedure that displays descriptive information about a SAS dataset or an entire SAS library. It reports the structure and metadata of your data: the variables, their types, lengths, formats, and other attributes.
Basic Syntax and Purpose
The primary function of PROC CONTENTS is to reveal the metadata associated with your SAS data. The basic syntax is straightforward:
PROC CONTENTS DATA=sample;
RUN;
In this example, DATA=sample specifies the SAS dataset you wish to examine. The DATA= option is optional, but specifying it is highly recommended for clarity and precision. If you omit it, SAS uses the most recently created dataset in your current session (the automatic _LAST_ dataset).
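The default behavior can be seen in a minimal sketch like the one below. The dataset name work.demo and its two variables are hypothetical and exist only for illustration; the sketch also assumes the _LAST_ system option has not been changed elsewhere in the session.

```sas
/* Hypothetical dataset, created only so that it is the most recent one */
DATA work.demo;
    id = 1;
    name = "Ann";
RUN;

/* No DATA= option: PROC CONTENTS describes the most recently created dataset */
PROC CONTENTS;
RUN;

/* Equivalent explicit form using the automatic _LAST_ reference */
PROC CONTENTS DATA=_LAST_;
RUN;
```

Both steps should describe work.demo. Naming the dataset explicitly avoids surprises when an earlier step in a long program has created a different dataset.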
The output typically includes:
- Dataset Properties: Information like the dataset name, creation date, number of observations, and number of variables.
- Engine/Host Details: Specifics about the SAS engine used and the operating system where the dataset resides.
- Variable Attributes: For each variable, it lists its name, type (character or numeric), length, format, informat, and label.
Why is PROC CONTENTS Essential?
Understanding your data's structure is crucial for accurate analysis and reporting. PROC CONTENTS helps you:
- Verify Data Integrity: Quickly check if variables have the expected types (e.g., numeric for calculations, character for text).
- Troubleshoot Issues: Identify unexpected variable lengths or formats that might cause errors in subsequent data steps or procedures.
- Document Datasets: Provide a clear summary of a dataset's structure, which is vital for collaboration and long-term data management.
- Explore Unknown Data: Get a quick overview when working with unfamiliar datasets.
Practical Applications and Examples
Let's explore various ways to leverage PROC CONTENTS.
1. Displaying Information for a Single Dataset
To view the contents of a specific dataset, use the DATA= option.
/* Assuming 'sashelp.class' is a built-in SAS dataset */
PROC CONTENTS DATA=sashelp.class;
RUN;
This will output detailed information about the sashelp.class dataset, including its properties and a list of all variables with their attributes.
2. Listing All Datasets in a Library
You can list all the datasets (or members) within a specified SAS library using the _ALL_ keyword with the DATA= option. The output contains the library directory followed by the contents of each member.
/* List all datasets in the 'work' library */
PROC CONTENTS DATA=work._ALL_;
RUN;
This is particularly useful when you need an inventory of all datasets available in a particular library, such as your temporary work library or a permanent library like mylib.
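When only the inventory is wanted, the per-dataset listings can be suppressed. The sketch below uses the NODS option, which is valid only together with _ALL_; it assumes the work library contains at least one dataset, otherwise SAS reports that there are no matching members.

```sas
/* Print only the directory of the WORK library, */
/* without the variable listing for each dataset */
PROC CONTENTS DATA=work._ALL_ NODS;
RUN;
```

This keeps the output short for libraries with many members, since the full _ALL_ listing grows with every dataset and every variable.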
3. Outputting Contents to a New Dataset
For programmatic use or further analysis, PROC CONTENTS can write its output to a new SAS dataset using the OUT= option. This dataset contains one observation for each variable in the input dataset, storing that variable's metadata.
PROC CONTENTS DATA=sashelp.class OUT=work.class_contents;
RUN;
/* You can then view the created dataset */
PROC PRINT DATA=work.class_contents;
RUN;
The work.class_contents dataset will contain columns like MEMNAME (dataset name), NAME (variable name), TYPE (1 = numeric, 2 = character), LENGTH, FORMAT, LABEL, etc.
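Because the metadata is an ordinary dataset, it can be filtered like any other. The following is a sketch rather than a prescribed workflow: it adds the NOPRINT option to skip the printed report and then lists only the character variables, using the TYPE coding described above.

```sas
/* NOPRINT suppresses the printed report; only the OUT= dataset is created */
PROC CONTENTS DATA=sashelp.class OUT=work.class_contents NOPRINT;
RUN;

/* Show the character variables (TYPE = 2) with a few metadata columns */
PROC PRINT DATA=work.class_contents NOOBS;
    WHERE type = 2;
    VAR memname name type length varnum;
RUN;
```

For sashelp.class this should list the Name and Sex variables. The same WHERE approach works for checking that a variable expected to be numeric has TYPE = 1.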
4. Displaying Abbreviated or Detailed Output
PROC CONTENTS offers options to control the level of detail in its output.
- SHORT: Provides a concise output that contains only the list of variable names, plus any index and sort information, without the per-variable attributes.
PROC CONTENTS DATA=sashelp.class SHORT; RUN;
- DETAILS: Adds the number of observations, number of variables, number of indexes, and dataset labels to the library directory listing, so it is most useful when listing a library with _ALL_. NODETAILS turns this extra information off.
PROC CONTENTS DATA=sashelp._ALL_ DETAILS NODS; RUN;
5. Displaying Variable Position
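The POSITION option lists variables in the order of their logical position in the dataset (the order in which they were created), which can be different from the default alphabetical order. Current SAS documentation lists this option as VARNUM; POSITION is an older spelling.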
PROC CONTENTS DATA=sashelp.class POSITION;
RUN;
This is useful for seeing the column order of your data, which is the order that procedures such as PROC PRINT use by default.
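The same ordering can also be obtained programmatically. The sketch below assumes the VARNUM column of the OUT= dataset, which holds each variable's position number; the dataset name work.class_vars is hypothetical.

```sas
/* Keep only the variable name and its position number */
PROC CONTENTS DATA=sashelp.class
              OUT=work.class_vars(KEEP=name varnum) NOPRINT;
RUN;

/* Sort by position so the rows follow the dataset's column order */
PROC SORT DATA=work.class_vars;
    BY varnum;
RUN;

PROC PRINT DATA=work.class_vars NOOBS;
RUN;
```

Sorting explicitly by VARNUM avoids relying on whatever order the OUT= dataset happens to be written in.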
Common PROC CONTENTS Options
Here's a quick reference table for some frequently used options:
Option | Description | Example Usage |
---|---|---|
DATA=libref.dataset | Specifies the input dataset. Use libref._ALL_ for all datasets in a library. | DATA=mydata.customers |
OUT=output_dataset | Creates an output dataset containing the metadata. | OUT=work.var_info |
SHORT | Prints only the list of variable names, plus index and sort information. | PROC CONTENTS DATA=mydata SHORT; |
NODS | Suppresses the contents of individual datasets so that only the library directory is printed. Valid only with libref._ALL_. | PROC CONTENTS DATA=mylib._ALL_ NODS; |
MEMBERS | Not listed among the PROC CONTENTS options in current SAS documentation. To list library members, use libref._ALL_, optionally with MEMTYPE= to restrict the member types. | PROC CONTENTS DATA=mylib._ALL_ MEMTYPE=DATA; |
POSITION | Lists variables by their logical position within the dataset. Older spelling of VARNUM. | PROC CONTENTS DATA=mydata POSITION; |
HISTORY | Not listed among the PROC CONTENTS options in current SAS documentation; confirm that your release supports it before relying on it. | PROC CONTENTS DATA=mydata HISTORY; |
VARNUM | Lists variables by their variable number (logical position) rather than alphabetically. | PROC CONTENTS DATA=mydata VARNUM; |
For a comprehensive list of all available options and their detailed explanations, always refer to the official SAS documentation for PROC CONTENTS.
Conclusion
PROC CONTENTS is a cornerstone SAS procedure for data exploration and understanding. By mastering its basic syntax and various options, you gain powerful insights into your datasets, enabling more efficient and error-free data management and analysis.