Ora

What is the maximum string length in Teradata?

Published in Teradata Data Types 4 mins read

The maximum string length in Teradata varies depending on the specific data type used for storing character data, ranging from tens of thousands of characters for standard character types to gigabytes for large objects.

Understanding Teradata String Data Types and Their Limits

Teradata offers several data types for storing character strings, each with distinct maximum length limitations tailored to different storage needs and performance characteristics.

CHAR (Fixed-Length Character String)

The CHAR data type stores fixed-length character string data. When defining a CHAR column, you specify its exact length n. For this fixed-length character string data, the maximum length for n is 64,000 characters. If a string inserted into a CHAR column is shorter than the defined length, it is padded with spaces to fill the specified length.

Use Case: Ideal for storing data with consistent, predefined lengths, such as status codes, two-letter state abbreviations, or fixed-length identifiers, offering efficient storage and retrieval for known sizes.

VARCHAR (Varying-Length Character String)

The VARCHAR data type stores varying-length character strings, which means it only consumes the amount of storage necessary for the actual string data, plus a small overhead for length information. While more flexible than CHAR, VARCHAR also has a maximum length.

The maximum length for a VARCHAR column is typically 64,000 characters when using a single-byte character set like LATIN. However, the underlying physical storage limit is 64,000 bytes. For multi-byte character sets, such as UNICODE (where each character can take 2 bytes), the maximum number of characters would be half of the byte limit, which is 32,000 characters.

Use Case: Best suited for data where the length can vary significantly, such as names, addresses, descriptions, or comments, optimizing storage by avoiding space padding.

CLOB (Character Large Object)

For exceptionally long string data that exceeds the limits of CHAR or VARCHAR, Teradata provides the CLOB (Character Large Object) data type. CLOB columns are designed to store very large amounts of character data.

The maximum length for a CLOB in Teradata is significantly larger, capable of storing up to 2,147,483,647 characters, which translates to 2 Gigabytes (GB) of data. CLOB data is often stored out-of-row to manage large sizes efficiently without impacting the main table's row size.

Use Case: Employed for storing extensive text, such as legal documents, large blocks of XML or JSON data, or detailed articles, where the content size can be very substantial.

Summary of Maximum String Lengths

The table below summarizes the maximum string lengths for common character data types in Teradata:

Data Type Description Maximum Length (Characters) Maximum Length (Bytes)
CHAR Fixed-length character string 64,000 64,000
VARCHAR Varying-length character string 64,000 (LATIN) / 32,000 (UNICODE) 64,000
CLOB Character Large Object 2,147,483,647 2 GB

Practical Considerations for String Lengths

When designing your Teradata schema, consider these practical aspects related to string lengths:

  • Storage Efficiency:
    • CHAR reserves space for its maximum length, regardless of actual data size, potentially wasting space for short strings.
    • VARCHAR is more space-efficient for variable-length data as it only stores the actual data plus a small overhead.
  • Performance:
    • While CHAR might seem faster for reads due to fixed length, VARCHAR often offers better overall performance by reducing I/O due to smaller row sizes, especially when dealing with many short strings.
    • CLOB data, being stored out-of-row, requires additional I/O operations to retrieve the large object data, which can impact performance for frequent access.
  • Character Sets: The choice of character set (e.g., LATIN, UNICODE, GRAPHIC) significantly impacts the maximum number of characters for VARCHAR columns, as multi-byte character sets consume more bytes per character.
  • Indexing: Text columns, especially very long ones, might have limitations or specific considerations when used in indexes. For example, you might only index a prefix of a VARCHAR column.
  • Row Size Limits: The total byte length of all columns in a row must not exceed Teradata's maximum row size limit (typically 1MB for non-BLOB/CLOB data). While CLOBs are stored out-of-row, a small pointer is part of the row.

Example SQL DDL for String Columns

Here are examples of how to define columns with various string data types in Teradata:

CREATE TABLE EmployeeDetails (
    EmployeeID        INTEGER,
    FirstName         VARCHAR(50) CHARACTER SET UNICODE, -- Max 50 characters, using UNICODE
    LastName          VARCHAR(50) CHARACTER SET UNICODE,
    DepartmentCode    CHAR(10),                          -- Fixed 10 characters
    Description       VARCHAR(2000) CHARACTER SET LATIN, -- Max 2000 characters, using LATIN
    EmployeeBiography CLOB                               -- Stores very large text data
);

For more in-depth information on Teradata data types and their specifications, refer to the official Teradata documentation: Teradata® Database SQL Data Types and Literals.