How to use PROC CONTENTS in SAS?

Published in SAS Data Management 5 mins read

PROC CONTENTS in SAS is a fundamental and highly versatile procedure used to display descriptive information about a SAS dataset or an entire SAS library. It acts as an invaluable tool for understanding the structure and metadata of your data, providing details about variables, their types, lengths, formats, and other crucial attributes.

Basic Syntax and Purpose

The primary function of PROC CONTENTS is to reveal the metadata associated with your SAS data. The basic syntax is straightforward:

PROC CONTENTS DATA=sample;
RUN;

In this example, DATA=sample specifies the SAS dataset you wish to examine; because it is a one-level name, SAS looks for it in the WORK library. The DATA= option is optional, but specifying it is recommended for clarity and precision. If you omit it, SAS uses the most recently created dataset in your current session.
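The default behavior can be seen in a minimal sketch like the following. The dataset name work.demo and its two variables are hypothetical, chosen only for illustration.

```sas
/* Hypothetical dataset, created only so there is a "most recent" dataset */
DATA work.demo;
    id = 1;
    name = 'Ann';
RUN;

/* No DATA= option: SAS uses the most recently created dataset (work.demo here) */
PROC CONTENTS;
RUN;

/* The same request stated explicitly, which is easier to read and maintain */
PROC CONTENTS DATA=work.demo;
RUN;
```

Relying on the default is fragile in longer programs, because any step that creates a dataset in between changes which one is reported.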

The output typically includes:

  • Dataset Properties: Information like the dataset name, creation date, number of observations, and number of variables.
  • Engine/Host Details: Specifics about the SAS engine used and the operating system where the dataset resides.
  • Variable Attributes: For each variable, it lists its name, type (character or numeric), length, format, informat, and label.

Why is PROC CONTENTS Essential?

Understanding your data's structure is crucial for accurate analysis and reporting. PROC CONTENTS helps you:

  • Verify Data Integrity: Quickly check if variables have the expected types (e.g., numeric for calculations, character for text).
  • Troubleshoot Issues: Identify unexpected variable lengths or formats that might cause errors in subsequent data steps or procedures.
  • Document Datasets: Provide a clear summary of a dataset's structure, which is vital for collaboration and long-term data management.
  • Explore Unknown Data: Get a quick overview when working with unfamiliar datasets.
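The type check mentioned above can also be done programmatically. The sketch below assumes the OUT= and NOPRINT options of PROC CONTENTS and a hypothetical rule that Age, Height, and Weight must be numeric; the dataset names work.class_meta and work.unexpected_char are illustrative.

```sas
/* Write variable metadata to a dataset and suppress the printed report */
PROC CONTENTS DATA=sashelp.class NOPRINT
              OUT=work.class_meta (KEEP=name type length format label);
RUN;

/* In the OUT= dataset, TYPE is 1 for numeric and 2 for character */
DATA work.unexpected_char;
    SET work.class_meta;
    /* Hypothetical rule: keep only variables that should be numeric but are character */
    IF UPCASE(name) IN ('AGE', 'HEIGHT', 'WEIGHT') AND type = 2;
RUN;

PROC PRINT DATA=work.unexpected_char;
RUN;
```

For sashelp.class the result should be empty, since those three variables are numeric; any rows that do appear point to variables worth investigating before analysis.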

Practical Applications and Examples

Let's explore various ways to leverage PROC CONTENTS.

1. Displaying Information for a Single Dataset

To view the contents of a specific dataset, name it in the DATA= option.

/* Assuming 'sashelp.class' is a built-in SAS dataset */
PROC CONTENTS DATA=sashelp.class;
RUN;

This will output detailed information about the sashelp.class dataset, including its properties and a list of all variables with their attributes.

2. Listing All Datasets in a Library

You can list all the datasets (or members) within a specified SAS library using the _ALL_ keyword with the DATA option.

/* List all datasets in the 'work' library */
PROC CONTENTS DATA=work._ALL_;
RUN;

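By default, this prints the library directory followed by the full contents report for every dataset in the library, which can be long. If you only need an inventory of the members in a library, such as your temporary work library or a permanent library like mylib, add the NODS option to print the directory alone.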
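A minimal sketch of a directory-only listing, assuming the NODS and MEMTYPE= options of PROC CONTENTS. NODS is only valid together with _ALL_.

```sas
/* Directory only: NODS suppresses the per-dataset contents reports */
PROC CONTENTS DATA=work._ALL_ NODS;
RUN;

/* Restrict the directory to data sets, leaving out views, catalogs, and other member types */
PROC CONTENTS DATA=sashelp._ALL_ NODS MEMTYPE=DATA;
RUN;
```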

3. Outputting Contents to a New Dataset

For programmatic use or further analysis, PROC CONTENTS can write its output to a new SAS dataset using the OUT= option. This dataset will contain one observation for each variable in the input dataset, storing all its metadata.

PROC CONTENTS DATA=sashelp.class OUT=work.class_contents;
RUN;

/* You can then view the created dataset */
PROC PRINT DATA=work.class_contents;
RUN;

The work.class_contents dataset will contain columns like MEMNAME (dataset name), NAME (variable name), TYPE (numeric=1, character=2), LENGTH, FORMAT, LABEL, etc.
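As one possible follow-up, the sketch below keeps a handful of the metadata columns, restores dataset order using the VARNUM column, and turns the numeric TYPE code into readable text. The names work.class_contents_labeled and type_label are hypothetical.

```sas
/* NOPRINT suppresses the report; KEEP= limits the metadata columns that are stored */
PROC CONTENTS DATA=sashelp.class NOPRINT
              OUT=work.class_contents (KEEP=memname name type length varnum label);
RUN;

/* VARNUM holds each variable's position in the source dataset */
PROC SORT DATA=work.class_contents;
    BY varnum;
RUN;

/* Translate the TYPE code (1 = numeric, 2 = character) into text */
DATA work.class_contents_labeled;
    SET work.class_contents;
    LENGTH type_label $9;
    IF type = 1 THEN type_label = 'numeric';
    ELSE IF type = 2 THEN type_label = 'character';
RUN;

PROC PRINT DATA=work.class_contents_labeled NOOBS;
    VAR varnum name type_label length label;
RUN;
```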

4. Displaying Abbreviated or Detailed Output

PROC CONTENTS offers options to control the level of detail in its output.

  • SHORT: Provides a concise output that lists only the variable names, plus index and sort information where present, without the full attribute table.

    PROC CONTENTS DATA=sashelp.class SHORT;
    RUN;
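  • DETAILS: Adds columns to the library directory listing: the number of observations, variables, and indexes, and the dataset label for each member. It only has a visible effect when a directory is printed, for example with _ALL_ or the DIRECTORY option. Whether it is on by default depends on the DETAILS system option.

    PROC CONTENTS DATA=sashelp._ALL_ NODS DETAILS;
    RUN;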
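One practical use of SHORT is to get a paste-ready list of variable names, for instance to build a KEEP= list. A minimal sketch, assuming the VARNUM option is added so that the names come out in dataset order rather than alphabetically:

```sas
/* Names only, in the order the variables are stored in the dataset */
PROC CONTENTS DATA=sashelp.class SHORT VARNUM;
RUN;
```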

5. Displaying Variable Position

The POSITION option lists variables in the order they appear in the dataset (by variable number) rather than in the default alphabetical order. Current SAS documentation lists this option under the name VARNUM.

PROC CONTENTS DATA=sashelp.class POSITION;
RUN;

This is useful for seeing the order in which the variables were created, which is also the order in which PROC PRINT shows them by default.
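The difference between the two orderings is easiest to see on a dataset whose variables were not created alphabetically. The sketch below uses a hypothetical dataset work.order_demo and the VARNUM option.

```sas
/* Hypothetical dataset: variables are created in the order zebra, apple, mango */
DATA work.order_demo;
    zebra = 1;
    apple = 2;
    mango = 3;
RUN;

/* Default report: variables listed alphabetically (apple, mango, zebra) */
PROC CONTENTS DATA=work.order_demo;
RUN;

/* VARNUM report: variables listed in creation order (zebra, apple, mango) */
PROC CONTENTS DATA=work.order_demo VARNUM;
RUN;
```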

Common PROC CONTENTS Options

Here's a quick reference table for some frequently used options:

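| Option | Description | Example Usage |
|---|---|---|
| DATA=libref.dataset | Specifies the input dataset. Can also use libref._ALL_ for all datasets in a library. | DATA=mydata.customers |
| OUT=output_dataset | Creates an output dataset containing the metadata, one observation per variable. | OUT=work.var_info |
| SHORT | Prints only the list of variable names, plus index and sort information where present. | PROC CONTENTS DATA=mydata.customers SHORT; |
| NODS | Suppresses the contents of individual datasets and prints only the library directory. Valid only with _ALL_. | PROC CONTENTS DATA=mydata._ALL_ NODS; |
| DIRECTORY | Prints the list of all SAS files in the library of the specified dataset. (PROC CONTENTS has no MEMBERS option.) | PROC CONTENTS DATA=mydata.customers DIRECTORY; |
| POSITION | Lists variables in dataset order; an older name for VARNUM. | PROC CONTENTS DATA=mydata.customers POSITION; |
| HISTORY | Displays the dataset's history, if available. This option is not listed in current PROC CONTENTS documentation, so verify that your release supports it. | PROC CONTENTS DATA=mydata.customers HISTORY; |
| VARNUM | Lists variables by their variable number (position in the dataset) instead of alphabetically. | PROC CONTENTS DATA=mydata.customers VARNUM; |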

For a comprehensive list of all available options and their detailed explanations, always refer to the official SAS documentation for PROC CONTENTS.

Conclusion

PROC CONTENTS is a cornerstone SAS procedure for data exploration and understanding. By mastering its basic syntax and various options, you gain powerful insights into your datasets, enabling more efficient and error-free data management and analysis.