How do pipes work in C?

Pipes in C provide a fundamental mechanism for inter-process communication (IPC), allowing two related processes to exchange data in a unidirectional flow.

Pipes in C are a form of inter-process communication (IPC) that allows data to flow from one process to another. They act as a conduit, enabling one process to write data and another process to read it. This is typically used for communication between a parent process and its child processes created via fork().

Understanding C Pipes for Inter-Process Communication

At its core, a pipe is a kernel-managed, unidirectional communication channel that the pipe() system call creates and exposes through two file descriptors. When a pipe is created, the operating system allocates an in-memory buffer. One process writes data to one end of the pipe, and another process reads data from the other end.

Key characteristics of pipes:

Unidirectional: Data flows in one direction only. If two-way communication is needed, two separate pipes must be created.
Byte Stream: Pipes transfer data as a stream of bytes, without any message boundaries.
Kernel-Managed Buffer: The operating system manages an internal buffer for the pipe. Data written to the pipe is temporarily stored in this buffer until it is read.
Related Processes: Pipes are primarily used for communication between processes that share a common ancestor, typically a parent and child process, because the file descriptors need to be inherited.
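
To make the byte-stream property concrete, here is a minimal sketch that writes two chunks into a pipe and reads them back with a single call. The helper name two_writes_one_read is hypothetical, and the code assumes a POSIX system where pipe(), read(), and write() are declared in <unistd.h>. It stays inside one process only to keep the illustration short.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write two separate chunks into a pipe, then read once.
 * Returns the number of bytes read (or -1 on error) and NUL-terminates `out`. */
ssize_t two_writes_one_read(char *out, size_t out_size) {
    int fds[2];
    if (out_size == 0 || pipe(fds) == -1)
        return -1;

    /* Two distinct write() calls; both fit easily in the pipe buffer. */
    if (write(fds[1], "ping", 4) != 4 || write(fds[1], "pong", 4) != 4) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);

    /* A single read() sees one run of bytes, not two messages. */
    ssize_t n = read(fds[0], out, out_size - 1);
    close(fds[0]);
    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

Called with a 32-byte buffer, the helper should report 8 bytes containing "pingpong": the boundary between the two writes is gone by the time the reader sees the data.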

The `pipe()` System Call

The pipe() system call is used to create a pipe. It takes a single argument: an array of two integers.

int pipe(int fd[2]);

Upon successful creation, the pipe() system call populates the fd array with two new file descriptors:

File Descriptor	Purpose
`fd[0]`	The read-end of the pipe (for reading data).
`fd[1]`	The write-end of the pipe (for writing data).

This means that data written to fd[1] can be read from fd[0]. As the reference states, "the first element of the array contains the file descriptor that corresponds to the output of the pipe (stuff to be read)".

Return values of pipe():

0: On success.
-1: On error, with errno set to indicate the cause (e.g., EMFILE if the calling process has reached its limit on open file descriptors, ENFILE if the system-wide limit on open files has been reached).
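
The return-value convention can be sketched as follows. Both function names (open_pipe and pipe_errno_at_fd_limit) are hypothetical. The second one is only a demonstration device: it assumes that temporarily lowering the soft RLIMIT_NOFILE limit to zero with setrlimit() makes pipe() fail with EMFILE, which is the behaviour to expect on Linux but is worth confirming on other systems.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: create a pipe and report the reason on failure.
 * Returns 0 on success, or the errno value left by pipe(). */
int open_pipe(int fds[2]) {
    if (pipe(fds) == -1) {
        int err = errno; /* save it before another call can overwrite it */
        fprintf(stderr, "pipe: %s\n", strerror(err));
        return err;
    }
    return 0;
}

/* Demonstration only: lower the soft descriptor limit to zero, attempt to
 * create a pipe, then restore the limit. Returns the error code observed
 * (0 if the pipe was created anyway), or -1 if the limit could not be changed. */
int pipe_errno_at_fd_limit(void) {
    struct rlimit saved, lowered;
    int fds[2];

    if (getrlimit(RLIMIT_NOFILE, &saved) == -1)
        return -1;
    lowered = saved;
    lowered.rlim_cur = 0; /* no new descriptors may be allocated */
    if (setrlimit(RLIMIT_NOFILE, &lowered) == -1)
        return -1;

    int err = open_pipe(fds);

    setrlimit(RLIMIT_NOFILE, &saved); /* restore the original soft limit */
    if (err == 0) {
        close(fds[0]);
        close(fds[1]);
    }
    return err;
}
```

Copying errno into a local variable immediately after the failing call is the important habit here, since later library calls may change it.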

How Pipes Facilitate Communication

Pipes are most commonly used in conjunction with the fork() system call to establish communication channels between a parent and its child process. Here's a typical workflow:

Create Pipe: The parent process first calls pipe() to create the pipe, obtaining fd[0] (read end) and fd[1] (write end).
Fork Process: The parent then calls fork(). This creates a child process that inherits copies of the parent's file descriptors, including both fd[0] and fd[1] of the pipe.
Close Unused Ends: To ensure proper one-way communication:
- If the parent intends to write and the child intends to read:
  - The parent closes its fd[0] (read end).
  - The child closes its fd[1] (write end).
- If the parent intends to read and the child intends to write:
  - The parent closes its fd[1] (write end).
  - The child closes its fd[0] (read end).
Communicate: The processes then use the remaining open ends of the pipe:
- The writing process uses write() on its open write descriptor (fd[1]).
- The reading process uses read() on its open read descriptor (fd[0]).
Close Remaining Ends: After communication is complete, both processes should close their respective ends of the pipe to release resources.
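
The workflow above can be sketched for the case where the child writes and the parent reads. The function name receive_from_child and its parameters are hypothetical; the numbered comments map to the five steps. This is one possible arrangement under POSIX assumptions, not the only correct one.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that writes `msg` into a pipe and collect
 * it in the parent. Returns the number of bytes received, or -1 on error. */
ssize_t receive_from_child(const char *msg, char *out, size_t out_size) {
    int fds[2];
    if (out_size == 0 || pipe(fds) == -1)          /* 1. create the pipe */
        return -1;

    pid_t pid = fork();                            /* 2. fork */
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                                /* child: the writer */
        close(fds[0]);                             /* 3. close unused read end */
        ssize_t w = write(fds[1], msg, strlen(msg)); /* 4. communicate */
        close(fds[1]);                             /* 5. close remaining end */
        _exit(w == (ssize_t)strlen(msg) ? 0 : 1);  /* skip the parent's stdio buffers */
    }

    close(fds[1]);                                 /* 3. close unused write end */
    size_t total = 0;
    ssize_t n = 0;
    /* 4. read until EOF (0), an error (-1), or a full buffer */
    while (total < out_size - 1 &&
           (n = read(fds[0], out + total, out_size - 1 - total)) > 0)
        total += (size_t)n;
    close(fds[0]);                                 /* 5. close remaining end */
    out[total] = '\0';

    int status;
    if (waitpid(pid, &status, 0) == -1 || n < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return (ssize_t)total;
}
```

The parent closes its write end before reading. If it kept that descriptor open, the read loop would never see end-of-file, because a write end of the pipe would still exist.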

Practical Example: Parent Writing, Child Reading

Consider a scenario where the parent process sends a message to its child.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> // For pipe(), fork(), read(), write(), close()
#include <string.h> // For strlen()
#include <sys/wait.h> // For wait()

int main() {
    int pipe_fd[2]; // pipe_fd[0] for read, pipe_fd[1] for write
    pid_t pid;
    const char *message = "Hello from parent!";
    char buffer[256];

    if (pipe(pipe_fd) == -1) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }

    pid = fork();

    if (pid == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    }

    if (pid == 0) { // Child process
        close(pipe_fd[1]); // Close unused write end
        printf("Child: Waiting to read from pipe...\n");
        ssize_t bytes_read = read(pipe_fd[0], buffer, sizeof(buffer) - 1);
        if (bytes_read == -1) {
            perror("read");
            exit(EXIT_FAILURE);
        }
        buffer[bytes_read] = '\0'; // Null-terminate the string
        printf("Child: Received message: \"%s\"\n", buffer);
        close(pipe_fd[0]); // Close read end
        exit(EXIT_SUCCESS);
    } else { // Parent process
        close(pipe_fd[0]); // Close unused read end
        printf("Parent: Sending message: \"%s\"\n", message);
        if (write(pipe_fd[1], message, strlen(message)) == -1) {
            perror("write");
            exit(EXIT_FAILURE);
        }
        close(pipe_fd[1]); // Close write end
        wait(NULL); // Wait for the child to finish
        printf("Parent: Child finished.\n");
        exit(EXIT_SUCCESS);
    }

    return 0;
}

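In this example, the parent writes to pipe_fd[1], and the child reads from pipe_fd[0]. Each process closes the end of the pipe it does not need. This avoids leaking descriptors, and it matters most for the reader: as long as any write descriptor remains open, including the child's own inherited copy, read() never reports end-of-file, so a reader that loops until EOF would hang.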

Key Mechanisms and Considerations

Blocking I/O: By default, read() on an empty pipe will block until data is available, and write() to a full pipe will block until space becomes available. This synchronous behavior simplifies coordination between processes.
Pipe Buffer: The kernel maintains a finite-sized buffer for each pipe (64 KiB by default on modern Linux). If the writer tries to write more data than the buffer currently has room for, it will block until the reader consumes some data.
End-of-File (EOF): When every descriptor referring to the write end of a pipe has been closed and all buffered data has been read, subsequent read() calls on the read end return 0, indicating EOF. This is crucial for signaling the end of communication.
Error Handling: Always check the return values of pipe(), fork(), read(), write(), and close() for errors.
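
Two of these behaviours can be observed from a single process. In the sketch below, both function names are hypothetical. The first switches the write end to non-blocking mode with fcntl() and O_NONBLOCK so that a full pipe produces EAGAIN instead of blocking, which gives a rough measure of the buffer size. The second checks what read() returns after the only write descriptor is closed. The capacity reported depends on the operating system and its configuration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: count how many bytes a fresh pipe accepts before a
 * non-blocking write() reports that it is full. Returns -1 on error. */
long probe_pipe_capacity(void) {
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    /* With O_NONBLOCK set, writing to a full pipe fails with EAGAIN. */
    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    char chunk[512] = {0};
    long total = 0;
    ssize_t n;
    while ((n = write(fds[1], chunk, sizeof chunk)) > 0)
        total += n;
    /* Inspect errno before close() has a chance to change it. */
    int full = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(fds[0]);
    close(fds[1]);
    return full ? total : -1;
}

/* Hypothetical helper: returns what read() reports once the only write end
 * is closed and the buffered data has been drained (0 means EOF). */
ssize_t read_after_writer_closed(void) {
    int fds[2];
    char buf[16];
    if (pipe(fds) == -1)
        return -1;
    if (write(fds[1], "bye", 3) != 3) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                                  /* last write descriptor */
    ssize_t first = read(fds[0], buf, sizeof buf);  /* drains "bye" */
    ssize_t second = read(fds[0], buf, sizeof buf); /* nothing left: EOF */
    close(fds[0]);
    return first == 3 ? second : -1;
}
```

The data written before the close is still delivered. EOF is reported only on the read that follows the last buffered byte.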

Advantages and Limitations

Advantages:

Simplicity: Easy to set up for basic parent-child communication.
Efficiency: Relatively low overhead for data transfer compared to some other IPC mechanisms.
Standardized: A core part of Unix-like operating systems.

Limitations:

Unidirectional: Only one-way communication per pipe.
Related Processes Only: Primarily restricted to processes with a common ancestor, as file descriptors need to be inherited. For unrelated processes, named pipes (FIFOs) or other IPC mechanisms are needed.
No Message Boundaries: Data is a byte stream; the applications must define their own protocols for structuring messages.
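
One common way to work around the lack of message boundaries is to prefix each message with its length. The sketch below is one such protocol, not a standard one. The names send_message, recv_message, write_all, and read_all are hypothetical, and the 4-byte length is sent in host byte order, which is an assumption that holds only because both ends of a pipe run on the same machine.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly `len` bytes, retrying after short writes and EINTR. */
static int write_all(int fd, const void *data, size_t len) {
    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly `len` bytes; an early EOF is treated as an error. */
static int read_all(int fd, void *data, size_t len) {
    char *p = data;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n == 0)
            return -1; /* writer closed mid-message */
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one message: a 4-byte length followed by the text (no terminator). */
int send_message(int fd, const char *text) {
    uint32_t len = (uint32_t)strlen(text);
    if (write_all(fd, &len, sizeof len) == -1)
        return -1;
    return write_all(fd, text, len);
}

/* Receive one message into `out` and NUL-terminate it.
 * Returns the message length, or -1 on error or if it does not fit. */
ssize_t recv_message(int fd, char *out, size_t out_size) {
    uint32_t len;
    if (read_all(fd, &len, sizeof len) == -1)
        return -1;
    if (len >= out_size)
        return -1;
    if (read_all(fd, out, len) == -1)
        return -1;
    out[len] = '\0';
    return (ssize_t)len;
}
```

With this framing, two calls to send_message() followed by two calls to recv_message() on the other end should yield the two messages separately, even though the pipe itself carries them as one continuous run of bytes.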

Pipes are an excellent starting point for understanding inter-process communication in C, offering a simple yet powerful way for related processes to exchange information.
