The maximum string length in Teradata varies depending on the specific data type used for storing character data, ranging from tens of thousands of characters for standard character types to gigabytes for large objects.
Understanding Teradata String Data Types and Their Limits
Teradata offers several data types for storing character strings, each with distinct maximum length limitations tailored to different storage needs and performance characteristics.
CHAR (Fixed-Length Character String)
The CHAR
data type stores fixed-length character string data. When defining a CHAR
column, you specify its exact length n
. For this fixed-length character string data, the maximum length for n
is 64,000 characters. If a string inserted into a CHAR
column is shorter than the defined length, it is padded with spaces to fill the specified length.
Use Case: Ideal for storing data with consistent, predefined lengths, such as status codes, two-letter state abbreviations, or fixed-length identifiers, offering efficient storage and retrieval for known sizes.
VARCHAR (Varying-Length Character String)
The VARCHAR
data type stores varying-length character strings, which means it only consumes the amount of storage necessary for the actual string data, plus a small overhead for length information. While more flexible than CHAR
, VARCHAR
also has a maximum length.
The maximum length for a VARCHAR
column is typically 64,000 characters when using a single-byte character set like LATIN
. However, the underlying physical storage limit is 64,000 bytes. For multi-byte character sets, such as UNICODE
(where each character can take 2 bytes), the maximum number of characters would be half of the byte limit, which is 32,000 characters.
Use Case: Best suited for data where the length can vary significantly, such as names, addresses, descriptions, or comments, optimizing storage by avoiding space padding.
CLOB (Character Large Object)
For exceptionally long string data that exceeds the limits of CHAR
or VARCHAR
, Teradata provides the CLOB
(Character Large Object) data type. CLOB
columns are designed to store very large amounts of character data.
The maximum length for a CLOB
in Teradata is significantly larger, capable of storing up to 2,147,483,647 characters, which translates to 2 Gigabytes (GB) of data. CLOB
data is often stored out-of-row to manage large sizes efficiently without impacting the main table's row size.
Use Case: Employed for storing extensive text, such as legal documents, large blocks of XML or JSON data, or detailed articles, where the content size can be very substantial.
Summary of Maximum String Lengths
The table below summarizes the maximum string lengths for common character data types in Teradata:
Data Type | Description | Maximum Length (Characters) | Maximum Length (Bytes) |
---|---|---|---|
CHAR |
Fixed-length character string | 64,000 | 64,000 |
VARCHAR |
Varying-length character string | 64,000 (LATIN) / 32,000 (UNICODE) | 64,000 |
CLOB |
Character Large Object | 2,147,483,647 | 2 GB |
Practical Considerations for String Lengths
When designing your Teradata schema, consider these practical aspects related to string lengths:
- Storage Efficiency:
CHAR
reserves space for its maximum length, regardless of actual data size, potentially wasting space for short strings.VARCHAR
is more space-efficient for variable-length data as it only stores the actual data plus a small overhead.
- Performance:
- While
CHAR
might seem faster for reads due to fixed length,VARCHAR
often offers better overall performance by reducing I/O due to smaller row sizes, especially when dealing with many short strings. CLOB
data, being stored out-of-row, requires additional I/O operations to retrieve the large object data, which can impact performance for frequent access.
- While
- Character Sets: The choice of character set (e.g.,
LATIN
,UNICODE
,GRAPHIC
) significantly impacts the maximum number of characters forVARCHAR
columns, as multi-byte character sets consume more bytes per character. - Indexing: Text columns, especially very long ones, might have limitations or specific considerations when used in indexes. For example, you might only index a prefix of a
VARCHAR
column. - Row Size Limits: The total byte length of all columns in a row must not exceed Teradata's maximum row size limit (typically 1MB for non-BLOB/CLOB data). While CLOBs are stored out-of-row, a small pointer is part of the row.
Example SQL DDL for String Columns
Here are examples of how to define columns with various string data types in Teradata:
CREATE TABLE EmployeeDetails (
EmployeeID INTEGER,
FirstName VARCHAR(50) CHARACTER SET UNICODE, -- Max 50 characters, using UNICODE
LastName VARCHAR(50) CHARACTER SET UNICODE,
DepartmentCode CHAR(10), -- Fixed 10 characters
Description VARCHAR(2000) CHARACTER SET LATIN, -- Max 2000 characters, using LATIN
EmployeeBiography CLOB -- Stores very large text data
);
For more in-depth information on Teradata data types and their specifications, refer to the official Teradata documentation: Teradata® Database SQL Data Types and Literals.