How Does C Store Integers?

C stores integers as binary numbers within fixed-size memory locations, representing values using a sequence of bits (0s and 1s). The specific way an integer is stored depends on its type (e.g., int, short, long), whether it is signed or unsigned, and the architecture of the system.

The Foundation: Binary Representation

At its core, a computer understands only binary data. When you declare an integer variable in C, the compiler allocates a specific block of memory to hold its value. This block of memory is composed of individual bits, each capable of storing either a 0 or a 1. The combination of these 0s and 1s forms the binary representation of the integer.

For example, the decimal number 5 is stored as 00000101 in 8 bits (1 byte), while 10 is stored as 00001010.

Integer Data Types and Sizes

C offers several integer data types, each designed to store different ranges of values and occupying varying amounts of memory. These types are typically 1, 2, 4, or 8 bytes in length, corresponding to 8, 16, 32, or 64 bits, respectively.

The C standard guarantees only a minimum range (and therefore a minimum width in bits) for each type; the exact sizes vary with the compiler and the target architecture.

Common integer types in C include:

char: Often used for small integer values or characters. Always 1 byte by definition (sizeof(char) is 1), which is 8 bits on virtually all modern systems. Whether plain char is signed or unsigned is implementation-defined.
short (short int): A smaller integer type. At least 16 bits; typically 2 bytes.
int: The most common integer type. At least 16 bits; typically 4 bytes (32 bits) on modern desktop and server systems.
long (long int): At least 32 bits; typically 4 or 8 bytes.
long long (long long int): At least 64 bits; typically 8 bytes.

These types can also be prefixed with signed or unsigned keywords to specify their range characteristics.

Signed vs. Unsigned Integers

One of the most crucial distinctions in how C stores integers is between signed and unsigned types. This choice dictates whether the integer can represent negative numbers.

Unsigned Integers: These integers can only store non-negative values (zero and positive numbers). All bits in their memory allocation are used to represent the magnitude of the number. For instance, a 32-bit unsigned int can store values from 0 to 4,294,967,295.
Signed Integers: These integers can store negative values, zero, and positive values. The most significant bit (MSB) acts as the sign bit, which is why the largest positive value is about half that of the unsigned type of the same width.
- A '0' in the MSB means the number is zero or positive.
- A '1' in the MSB means the number is negative.
  For example, a 32-bit signed int can store numbers from -2,147,483,648 to 2,147,483,647.

Two's Complement Representation

For signed integers, C implementations almost universally use two's complement to represent negative numbers, and the C23 standard makes it the only permitted representation. The method is efficient because the same hardware logic performs addition and subtraction for both positive and negative numbers.

In two's complement:

Positive numbers are stored directly in binary.
To represent a negative number:
- Take the binary representation of its positive counterpart.
- Invert all the bits (0s become 1s, 1s become 0s). This is called the one's complement.
- Add 1 to the result.

For instance, in an 8-bit system:

+5 is 00000101
-5:
1. Positive 5: 00000101
2. Invert bits: 11111010 (one's complement)
3. Add 1: 11111011 (two's complement representation of -5)

Typical Integer Ranges

Here's a table summarizing common integer types and their typical ranges based on standard bit sizes:

Type	Typical Size (Bytes/Bits)	Unsigned Range	Signed Range
`char`	1 Byte / 8 Bits	0 to 255	-128 to 127
`short int`	2 Bytes / 16 Bits	0 to 65,535	-32,768 to 32,767
`int`	4 Bytes / 32 Bits	0 to 4,294,967,295	-2,147,483,648 to 2,147,483,647
`long int` (32-bit systems, 64-bit Windows)	4 Bytes / 32 Bits	0 to 4,294,967,295	-2,147,483,648 to 2,147,483,647
`long int` (64-bit Linux and macOS)	8 Bytes / 64 Bits	0 to 18,446,744,073,709,551,615	-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
`long long int`	8 Bytes / 64 Bits	0 to 18,446,744,073,709,551,615	-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807

Endianness

Another factor influencing integer storage, particularly for multi-byte integers, is endianness. This refers to the byte order in which data is stored in memory.

Little-endian: The least significant byte is stored at the lowest memory address.
Big-endian: The most significant byte is stored at the lowest memory address.

While endianness doesn't change the value of the integer itself, it affects how its bytes are ordered in physical memory. This is primarily a concern when reading/writing binary data across different system architectures or when dealing with low-level memory manipulation.

Practical Insights

Choosing the Right Type: Select the smallest integer type that can reliably hold the expected range of values to conserve memory. For general-purpose integer arithmetic, int is usually a safe default.
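Preventing Overflow: Be mindful of the maximum and minimum values an integer type can hold. Arithmetic whose result falls outside this range overflows. For unsigned integers the behavior is well defined: the result wraps around modulo 2^N, where N is the number of bits, so the maximum value plus 1 is 0. For signed integers, overflow is undefined behavior in C. It often appears as a change of sign, but the compiler is free to assume it never happens, so programs must not rely on it.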
Using sizeof: You can use the sizeof operator in C to determine the exact size (in bytes) of any integer type on your current system (e.g., sizeof(int)).

In summary, C stores integers as binary sequences, with specific data types dictating the memory allocated and whether negative values can be represented through mechanisms like two's complement.