Understanding Interprocess Communication (IPC): Pipes, Message Queues, Shared Memory, RPC, Semaphores, Sockets

Mohit Sharma
11 min read · Dec 27, 2023

Interprocess Communication

There are several methods for communication within a single machine. These methods are known as Interprocess Communication (IPC) and allow different processes to communicate with each other. Some common methods of IPC include Pipes, Named Pipes, Message Queues, Shared Memory, Remote Procedure Calls (RPC), Semaphores and Sockets.

Pipes

  • Pipes are a form of IPC that lets processes communicate over a unidirectional channel: data flows in one direction only. One process writes to the pipe, and another process reads from it.
  • Pipes are implemented using system calls in most modern operating systems, including Linux, macOS, and Windows.
  • There are two types of pipes: Anonymous Pipes and Named Pipes (FIFOs).

Anonymous Pipes

  1. Messages and Buffer: When using anonymous pipes, data sent from a producer process to a consumer process is stored in a buffer maintained by the kernel, which treats it as a byte stream of bounded capacity. The buffer’s size varies with the operating system and its configuration (on current Linux systems the default is typically 64 KiB). If the buffer is full and the producer writes more data, the write blocks until space becomes available. Similarly, if the buffer is empty and the consumer reads, the read blocks until new data arrives.
  2. Creation of Anonymous Pipes: Anonymous pipes are created using the pipe() system call in Unix-like operating systems. The call returns a pair of file descriptors: one for the read end (receiving) of the pipe and one for the write end (sending). These file descriptors let processes interact with the pipe, but they do not directly represent the buffer managed by the kernel.
  3. Sending and Receiving Data: Data flows in one direction. In the typical setup the parent process writes to the write end and the child process reads from the read end, although the roles can be reversed. The pipe acts as a buffer between the two processes.
  4. Use Case: Anonymous pipes are commonly used between a parent process and its child process. After creating the pipe, the parent forks a child, and both processes inherit the read and write ends. Each process should then close the end it does not use; otherwise the reader never sees end-of-file, because a write end is still open.
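The pipe-then-fork pattern described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper name send_to_child is hypothetical, and passing the byte count back through the child's exit status is only a convenience for the demo.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the parent writes msg into an anonymous pipe and a
 * forked child reads it. Returns the number of bytes the child received
 * (modulo 256, because an exit status carries only 8 bits), or -1 on error. */
int send_to_child(const char *msg)
{
    int fds[2];                      /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                  /* child: the reader */
        close(fds[1]);               /* close the unused write end so read() can see EOF */
        char buf[128];
        ssize_t n;
        int total = 0;
        while ((n = read(fds[0], buf, sizeof buf)) > 0)
            total += (int)n;         /* a real consumer would process buf here */
        close(fds[0]);
        _exit(total & 0xff);
    }

    /* parent: the writer */
    close(fds[0]);                   /* close the unused read end */
    ssize_t written = write(fds[1], msg, strlen(msg));
    close(fds[1]);                   /* closing the last write end signals EOF */

    int status = 0;
    if (waitpid(pid, &status, 0) == -1 || written == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling send_to_child("hello") forks one child and reports how many bytes it read from the pipe.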

Named Pipes (FIFOs)

  • File-Based Communication: Named pipes use the file system as a medium for communication between processes. They are created as special files in a directory, just like regular files. However, these files are not meant to store data on disk. Instead, they act as virtual channels through which data can flow between processes.
  • Creation: To create a named pipe, you call the mkfifo() function (or run the mkfifo command from a shell). Once created, the named pipe file appears in the file system, and processes can open, read, and write it much as they would any other file.
  • Two Ends: A named pipe has two ends: a reading end and a writing end. These ends function as entry and exit points for data. One process can write data to the writing end of the named pipe, and another process can read the same data from the reading end. This creates a unidirectional flow of data.
  • Synchronization: Named pipes handle synchronization automatically. By default, opening a FIFO for reading blocks until another process opens it for writing, and vice versa. After that, a read from an empty buffer blocks until data becomes available, and a write to a full buffer blocks until there’s space.
  • Communication Beyond Parent-Child: One of the significant advantages of named pipes is that they can facilitate communication between unrelated processes. While they can still be used between parent and child processes, they extend their functionality to enable data exchange between processes that might not share a direct hierarchical relationship.
  • Lifespan and Cleanup: Named pipes persist beyond the lifespan of the processes that use them. The FIFO remains as a file in the file system until it is explicitly removed (for example with unlink() or rm), although any unread data is discarded once every process has closed it. This gives processes that start and stop at different times a stable place to meet.
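As a rough sketch of the points above, the following creates a FIFO, has one process write to it and another read from it, and then removes the file. The helper name fifo_roundtrip and the path passed to it are hypothetical. A fork is used only to keep the example in one file: the two sides share nothing but the path, so they could equally be unrelated programs.

```c
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: creates a FIFO at path, lets a writer process send msg
 * through it, and reads the data back into out (NUL-terminated).
 * Returns the number of bytes read, or -1 on error. */
ssize_t fifo_roundtrip(const char *path, const char *msg, char *out, size_t outlen)
{
    if (outlen == 0 || mkfifo(path, 0600) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        unlink(path);
        return -1;
    }

    if (pid == 0) {                       /* writer process */
        int wfd = open(path, O_WRONLY);   /* blocks until a reader opens the FIFO */
        if (wfd == -1)
            _exit(1);
        ssize_t w = write(wfd, msg, strlen(msg));
        close(wfd);
        _exit(w == -1 ? 1 : 0);
    }

    /* reader process */
    ssize_t total = -1;
    int rfd = open(path, O_RDONLY);       /* blocks until the writer opens the FIFO */
    if (rfd == -1) {
        kill(pid, SIGKILL);               /* do not leave the writer blocked in open() */
    } else {
        ssize_t n;
        total = 0;
        while ((size_t)total < outlen - 1 &&
               (n = read(rfd, out + total, outlen - 1 - (size_t)total)) > 0)
            total += n;                   /* read until EOF or until out is full */
        out[total] = '\0';
        close(rfd);
    }

    waitpid(pid, NULL, 0);
    unlink(path);                         /* the FIFO persists until it is removed */
    return total;
}
```

The path must not already exist when the helper is called, since mkfifo() fails with EEXIST in that case.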

Message Queues

Message queues allow processes to exchange data in the form of discrete messages. Communication is asynchronous: a sender places a message on the queue, where it waits until a receiver retrieves it, and the message is removed from the queue once it has been received. The calls described below belong to the System V message queue API.

  • Creation: To use a message queue, a process first needs to create or open one. This is done with the msgget() system call, which takes a key and returns the identifier of the message queue associated with that key. If the queue doesn't exist and the IPC_CREAT flag is set, it's created; if it already exists, the process gains access to it, subject to the queue's permissions.
  • Structure of Messages: Each message in the queue has a specific structure. Every message has a positive long integer type field, a non-negative length, and the actual data bytes (corresponding to the length). The message type is a numerical identifier that helps the receiving process identify the type of message it wants to retrieve.
  • Sending Messages: To send a message, a process uses the msgsnd() system call. It specifies the message queue's identifier, the message type, the data to be sent, and some control flags. The message is then added to the message queue, waiting for the recipient process to retrieve it.
  • Receiving Messages: To receive a message, a process uses the msgrcv() system call. It specifies the message queue's identifier, the message type it wants to receive, a buffer to hold the message data, and other control flags. The kernel retrieves the first message in the queue that matches the requested type and copies it into the buffer. A type of 0 requests the first message in the queue regardless of its type.
  • Synchronization and Blocking: Message queues offer built-in synchronization. If a process tries to receive a message from an empty queue, it can be blocked until a message of the specified type becomes available. Similarly, if a process tries to send a message to a full queue, it can be blocked until there’s space.
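  • Message Priority: Some message queue implementations support priorities. POSIX message queues (mq_send() and mq_receive()) deliver higher-priority messages first. System V queues have no priority field, but a receiver can approximate one by passing a negative type to msgrcv(), which returns the message with the lowest type not exceeding the absolute value of that argument.
  • Cleanup and Deletion: System V message queues have no close operation. When a queue is no longer needed, a process with sufficient permissions removes it with the msgctl() system call and the IPC_RMID command. Removal takes effect immediately, and any process blocked in msgsnd() or msgrcv() on that queue fails with an error.
  • Persistence: System V message queues live in the kernel and persist after the processes using them terminate; they remain until they are explicitly removed or the system is rebooted. This allows processes to communicate even if they start and stop at different times.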
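The call sequence above (msgget(), msgsnd(), msgrcv(), msgctl()) can be sketched as below. To keep it short, this illustration sends and receives within a single process on a private queue; in practice the sender and receiver would be separate processes that agree on a key. The struct demo_msg layout and the helper names are hypothetical, and IPC_NOWAIT is used so that a request for a type that is not queued returns an error instead of blocking.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical message layout: a positive long type followed by the payload. */
struct demo_msg {
    long mtype;
    char mtext[64];
};

/* Queues text (with its terminating NUL) as a message of the given type. */
static int send_text(int qid, long type, const char *text)
{
    struct demo_msg m;
    m.mtype = type;
    snprintf(m.mtext, sizeof m.mtext, "%s", text);
    return msgsnd(qid, &m, strlen(m.mtext) + 1, 0);
}

/* Queues "first" (type 1) and "second" (type 2), then receives one message of
 * the wanted type into out. Returns the payload size in bytes, or -1. */
ssize_t receive_by_type(long wanted, char *out, size_t outlen)
{
    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);   /* private, unnamed queue */
    if (qid == -1)
        return -1;

    ssize_t got = -1;
    if (send_text(qid, 1, "first") == 0 && send_text(qid, 2, "second") == 0) {
        struct demo_msg r;
        /* IPC_NOWAIT: fail instead of blocking when no message matches */
        got = msgrcv(qid, &r, sizeof r.mtext, wanted, IPC_NOWAIT);
        if (got >= 0)
            snprintf(out, outlen, "%s", r.mtext);
    }

    msgctl(qid, IPC_RMID, NULL);    /* remove the queue; it would otherwise persist */
    return got;
}
```

Asking for type 2 skips over the type 1 message that was queued first, which is the selection behaviour that distinguishes a message queue from a plain byte stream.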

Difference between Pipes and Message Queues

Message queues and pipes are both mechanisms for inter-process communication (IPC), but they differ in several important ways:

1. Communication Mechanism:

  • Message Queues: Message queues provide a structured way for processes to send and receive discrete messages. Messages can be of different types and sizes, and they are usually stored in a queue data structure. This allows for asynchronous communication, where processes can operate independently and handle messages at their own pace.
  • Pipes: Pipes are unidirectional communication channels that transfer a stream of bytes from one process to another. They are typically used for simple data exchange between a producer process (writing end) and a consumer process (reading end), and they rely on a sequential, byte-oriented data flow.

2. Data Format:

  • Message Queues: Messages in message queues can have complex structures, including various data types and structures. This structured approach makes them suitable for sending well-defined messages.
  • Pipes: Pipes handle data as an unstructured stream of bytes. They are less suitable for sending structured data and are often used for transmitting textual or binary data.

3. Synchronization:

  • Message Queues: Message queues offer built-in synchronization. If a process tries to read from an empty queue, it will wait until a message arrives. Similarly, if a process tries to send a message to a full queue, it will wait until there’s space.
  • Pipes: Pipes also synchronize reader and writer, but with stream semantics. A read from an empty pipe blocks while a write end is still open and returns end-of-file once every write end has been closed. A write to a full pipe blocks until space is available, and a write to a pipe with no remaining reader fails with EPIPE (and raises SIGPIPE).

4. Directionality:

  • Message Queues: Message queues can support bidirectional communication, either through two separate queues or through a single queue in which each direction uses its own message type.
  • Pipes: Pipes are unidirectional by nature. To establish bidirectional communication, two pipes are needed — one for each direction.

5. Persistence:

  • Message Queues: In some systems, message queues can persist even if the sending or receiving processes terminate. This allows messages to be stored for future retrieval.
  • Pipes: Anonymous pipes are ephemeral: the pipe exists only while at least one process holds one of its ends open, and it is destroyed once every end is closed. A named pipe’s file system entry persists until it is removed, but any data left in it does not.

6. Process Relationships:

  • Message Queues: Message queues are not limited by process hierarchy. They can be used between related processes (parent-child) or unrelated processes.
  • Pipes: Anonymous pipes are used between related processes, most commonly a parent and its child, with the parent creating the pipe before forking. Named pipes lift this restriction and can connect unrelated processes.

In essence, message queues are more versatile and suitable for structured communication between processes, whereas pipes are simpler and are often used for direct data streaming between related processes. The choice between message queues and pipes depends on the complexity of the data being exchanged, the synchronization requirements, and the relationship between the communicating processes.

Shared Memory

With shared memory, a region of memory is mapped into the address space of several processes. One process writes to this region and another process can read from it directly. This makes communication fast, because data doesn’t have to be copied through the kernel. However, the kernel does not coordinate access to the region, so careful use of synchronization mechanisms is crucial to maintaining data integrity and preventing conflicts.

Let’s dive into the practical aspects of how shared memory works in Unix-like systems:

1. Creating Shared Memory:

To use shared memory, you first need to create it. This is typically done using the shmget() system call, which allocates a shared memory segment. The call takes parameters such as a key (an identifier for the shared memory) and the size of the memory segment you want to create.

Example: int shmid = shmget(key, size, IPC_CREAT | 0666);

2. Attaching to Shared Memory:

Once the shared memory is created, processes that want to use it need to attach to it. This is done using the shmat() system call. It returns a pointer to the shared memory segment, allowing the process to access it.

Example: void *shared_memory = shmat(shmid, NULL, 0);

3. Sharing Data:

With the shared memory attached, processes can now read from and write to the shared memory region just like any other memory. Data written by one process can be immediately accessed by another process attached to the same segment.

Example: strcpy(shared_memory, "Hello from Process 1!");

4. Synchronization:

Shared memory can be accessed by multiple processes simultaneously, which can lead to race conditions. Synchronization mechanisms like semaphores or mutexes are used to ensure data consistency.

Example: Using a semaphore to control access to the shared memory.

5. Detaching from Shared Memory:

When a process is done using the shared memory, it should detach from it using the shmdt() system call.

Example: shmdt(shared_memory);

6. De-allocating Shared Memory:

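When shared memory is no longer needed by any process, it should be de-allocated using the shmctl() system call with the IPC_RMID command. The call marks the segment for destruction; the kernel frees it once the last attached process has detached.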

Example: shmctl(shmid, IPC_RMID, NULL);

7. Permissions and Ownership:

Shared memory segments, like other IPC resources, have ownership and permissions associated with them. Proper permissions ensure that only authorized processes can access the shared memory.

8. Error Handling:

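System calls related to shared memory signal failure through their return value and set errno: shmget(), shmdt(), and shmctl() return -1, and shmat() returns (void *) -1. These results should be checked to handle scenarios such as a failure to create or attach to shared memory.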

9. Process Independence:

Shared memory is not limited by process hierarchy. Different processes can access the shared memory as long as they have the required permissions.

10. Cleaning Up:

It’s important to properly clean up shared memory resources when they are no longer needed. Detaching and de-allocating shared memory segments prevents resource leaks.
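The steps above, including the semaphore guard mentioned in step 4, can be put together in a sketch like the following. It assumes Linux and uses a System V semaphore, chosen here only because it belongs to the same API family as shmget(); the names demo_semun, sem_change, and shared_counter_demo are hypothetical. A parent and a child each increment a counter in shared memory, and the semaphore keeps the increments from interleaving.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

/* Argument type for semctl(); Linux expects the caller to declare it. */
union demo_semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Adds delta to the semaphore: -1 is wait (P), +1 is signal (V). */
static int sem_change(int semid, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };
    return semop(semid, &op, 1);
}

/* Parent and child each increment a shared counter `rounds` times.
 * Returns the final counter value, or -1 on error. */
long shared_counter_demo(int rounds)
{
    long result = -1;
    void *mem = (void *)-1;
    union demo_semun init = { .val = 1 };            /* 1 = resource is free */
    int shmid = shmget(IPC_PRIVATE, sizeof(long), IPC_CREAT | 0600);
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (shmid != -1 && semid != -1 &&
        semctl(semid, 0, SETVAL, init) != -1 &&
        (mem = shmat(shmid, NULL, 0)) != (void *)-1) {

        volatile long *counter = mem;
        *counter = 0;

        pid_t pid = fork();                          /* the child inherits the attachment */
        if (pid != -1) {
            for (int i = 0; i < rounds; i++) {       /* runs in both processes */
                sem_change(semid, -1);               /* enter the critical section */
                (*counter)++;
                sem_change(semid, +1);               /* leave the critical section */
            }
            if (pid == 0) {
                shmdt(mem);
                _exit(0);
            }
            waitpid(pid, NULL, 0);
            result = *counter;
        }
        shmdt(mem);                                  /* detach */
    }

    if (shmid != -1)
        shmctl(shmid, IPC_RMID, NULL);               /* de-allocate the segment */
    if (semid != -1)
        semctl(semid, 0, IPC_RMID);                  /* remove the semaphore set */
    return result;
}
```

With the semaphore in place, shared_counter_demo(1000) yields 2000. For brevity the sketch ignores the return value of semop(); production code would check it and retry on EINTR.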

Semaphores

Semaphores are a synchronization mechanism used to coordinate the activities of multiple processes in a computer system. They are used to enforce mutual exclusion, avoid race conditions and implement synchronization between processes.

Semaphores provide two operations: wait (P) and signal (V). The wait operation decrements the value of the semaphore, and the signal operation increments the value of the semaphore. When the value of the semaphore is zero, any process that performs a wait operation will be blocked until another process performs a signal operation.

Semaphores are used to implement critical sections, which are regions of code that must be executed by only one process at a time. By using semaphores, processes can coordinate access to shared resources, such as shared memory or I/O devices.

Semaphores differ from other IPC methods such as Pipes, Message Queues and Shared Memory in that they are not used to transfer data between processes. Instead, they coordinate access to shared resources by limiting how many processes can use a resource at a time (exactly one, in the case of a binary semaphore).
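The blocking behaviour of wait (P) at zero can be illustrated with a small sketch. It assumes Linux System V semaphores; the names demo_semun and wait_signal_demo are hypothetical. The child performs a wait on a semaphore whose value is 0 and therefore cannot proceed until the parent performs a signal.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* Argument type for semctl(); Linux expects the caller to declare it. */
union demo_semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Returns 1 if the child was still blocked in wait (P) before the parent's
 * signal (V) and exited cleanly after it, 0 otherwise, -1 on setup error. */
int wait_signal_demo(void)
{
    union demo_semun init = { .val = 0 };            /* 0 = nothing to consume yet */
    struct sembuf p = { .sem_num = 0, .sem_op = -1, .sem_flg = 0 };
    struct sembuf v = { .sem_num = 0, .sem_op = +1, .sem_flg = 0 };

    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;
    if (semctl(semid, 0, SETVAL, init) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    if (pid == 0) {
        /* wait (P): the value is 0, so this blocks until the parent signals */
        _exit(semop(semid, &p, 1) == 0 ? 0 : 1);
    }

    /* The child cannot get past P yet, so it cannot have exited. */
    int status = 0;
    int blocked_before = (waitpid(pid, &status, WNOHANG) == 0);

    /* signal (V): the value goes from 0 to 1, which lets the child's P finish */
    if (semop(semid, &v, 1) == -1)
        semctl(semid, 0, IPC_RMID);                  /* removal also unblocks the child */

    waitpid(pid, &status, 0);
    semctl(semid, 0, IPC_RMID);                      /* harmless if already removed */
    return blocked_before && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Here the semaphore carries no data at all; it only orders the two processes, which is the sense in which semaphores differ from the other mechanisms.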

Sockets

Sockets are an inter-process communication (IPC) mechanism that allows two or more processes to communicate with each other by creating a bidirectional channel between them. A socket is one endpoint of a two-way communication link between two programs running on the network. The socket mechanism provides a means of IPC by establishing named contact points between which the communication takes place.

The POSIX sockets API offers two kinds of sockets: IPC sockets (aka Unix domain sockets) and network sockets. IPC sockets enable channel-based communication for processes on the same physical device (host), whereas network sockets enable this kind of IPC for processes that can run on different hosts, thereby bringing networking into play.

POSIX sockets differ from other IPC methods such as Pipes, Message Queues and Shared Memory in that they can be used for both local and network communication.
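For the local case, a bidirectional exchange over a pair of connected Unix domain sockets might look like the sketch below. It uses socketpair(), which creates two already-connected endpoints without needing an address; the helper name socket_echo_upper and the upper-casing "service" in the child are hypothetical and only there to show data flowing both ways over one channel.

```c
#include <ctype.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sends msg to a child over a Unix domain socket pair;
 * the child replies with the upper-cased text on the same socket.
 * Returns the reply length, or -1 on error. Intended for short messages. */
ssize_t socket_echo_upper(const char *msg, char *out, size_t outlen)
{
    int sv[2];
    if (outlen == 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (pid == 0) {                          /* child uses sv[1] */
        close(sv[0]);
        char buf[128];
        ssize_t n = read(sv[1], buf, sizeof buf);
        for (ssize_t i = 0; i < n; i++)
            buf[i] = (char)toupper((unsigned char)buf[i]);
        int ok = (n <= 0) || (write(sv[1], buf, (size_t)n) == n);
        close(sv[1]);
        _exit(ok ? 0 : 1);
    }

    /* parent uses sv[0] for both directions */
    close(sv[1]);
    ssize_t n = -1;
    if (write(sv[0], msg, strlen(msg)) != -1) {
        shutdown(sv[0], SHUT_WR);            /* nothing more to send; reading still works */
        n = read(sv[0], out, outlen - 1);
    }
    close(sv[0]);
    waitpid(pid, NULL, 0);

    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

Unlike a pipe, a single socket descriptor is used for both sending and receiving. Unrelated processes would instead use socket(), bind(), listen(), accept(), and connect() with a file system path as the address.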

Middleware Layer

Middleware is software that lies between an operating system and the applications running on it, providing services to applications beyond those generally available from the operating system. It enables communication and data management for distributed applications. Some common examples of middleware include database middleware, application server middleware, message-oriented middleware, web middleware, and transaction-processing monitors.

Middleware typically provides messaging services so that different applications can communicate using protocols and formats such as Simple Object Access Protocol (SOAP), web services, Representational State Transfer (REST), and JavaScript Object Notation (JSON). While all middleware performs communication functions, the type a company chooses depends on the service being used and the type of information that needs to be communicated:

  1. Database middleware provides a way for applications to access data stored in databases. Some examples of database middleware include Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC).
  2. Application server middleware provides a platform for building and deploying enterprise applications. Some examples of application server middleware include Apache Tomcat and IBM WebSphere.
  3. Message-oriented middleware provides a way for applications to communicate with each other using messages. Some examples of message-oriented middleware include Apache Kafka and RabbitMQ.
  4. Web middleware provides a way for web applications to communicate with each other. Some examples of web middleware include Express.js and Ruby on Rails.
  5. Transaction-processing monitors provide a way to manage transactions in distributed systems. Some examples of transaction-processing monitors include IBM CICS and Oracle Tuxedo.

Google Protocol Buffer

Google Protocol Buffers (protobuf) is a language-neutral, platform-neutral, extensible mechanism for serializing structured data. Google developed it for internal use and later released a code generator for multiple languages under an open-source license. The design goals for Protocol Buffers emphasized simplicity and performance. In particular, it was designed to be smaller and faster than XML.

  1. Protocol Buffers are widely used at Google for storing and interchanging all kinds of structured information. The method serves as a basis for a custom remote procedure call (RPC) system that is used for nearly all inter-machine communication at Google.
  2. To use protobuf, you first define your data structure in a .proto file using the protobuf interface definition language (IDL). This file specifies the fields and their types for the messages you want to serialize. Once you have defined your messages, you can use the protobuf compiler (protoc) to generate source code in the language of your choice. This generated code includes classes for each message type, with methods for reading and writing the message to a binary format.

Hope you enjoyed reading. I’m always open to suggestions and new ideas. Please write to me :)

Written by Mohit Sharma

Software Engineer @Google | Ardent to learn almost everything | linkedin.com/in/mohitdtumce | github.com/mohitdtumce