Anton Lindstrom

Operating systems: Introduction to processes

Published:

I've wanted to write this for a long time but always had an excuse not to. Operating systems are a big part of my everyday work, especially GNU/Linux, which will be the focus of this post.

The topic of processes is huge so I'm not sure how to cover all of the pieces. This post will hopefully contain enough code so you'll be able to understand how to interact with a process. These code examples will focus on GNU/Linux as it is the operating system I'm most familiar with.

So, what is a process? The Linux Information Project defines a process as "an executing (i.e., running) instance of a program". So, to define a process we need to define what a program is. Again, according to The Linux Information Project, "A program is an executable file that is held in storage".

So, what we know is that a process is an instance of a program that is being executed. Does that mean that a process must be running? Well, not quite.

Process states

In order to understand what it means that a process is running, we need to visit the different states of a process. A process can have several states (in the Linux kernel, a process is sometimes called a task).

In the file fs/proc/array.c the following is defined:

/*
* The task state array is a strange "bitmap" of
* reasons to sleep. Thus "running" is zero, and
* you can test for combinations of others with
* simple bit tests.
*/
static const char * const task_state_array[] = {
        "R (running)",          /*   0 */
        "S (sleeping)",         /*   1 */
        "D (disk sleep)",       /*   2 */
        "T (stopped)",          /*   4 */
        "t (tracing stop)",     /*   8 */
        "X (dead)",             /*  16 */
        "Z (zombie)",           /*  32 */
};

Running does not necessarily mean that the process is executing on a CPU right now: it denotes that the process is either running or on the run queue, ready to run. Sleeping means that the process is waiting for an event to complete (this is also called interruptible sleep, since a signal can wake the process up). Disk sleep is also called uninterruptible sleep; the process is usually waiting for I/O to finish.

A process can be stopped (T) by sending it the SIGSTOP signal. This pauses the process until it is continued with the SIGCONT signal.

For example, stopping and continuing a process can be done in the following manner:

kill -SIGSTOP <pid>
kill -SIGCONT <pid>

A process enters the tracing stop state when a debugger such as gdb stops it. If I recall correctly, this state is basically the same as the stopped state.

The dead state is a state that is returned when the kernel is running the do_exit() function in kernel/exit.c. This is just to return a status but the state should not be seen in your task list.

The zombie state is a bit peculiar. Some think of it as the state a child process ends up in when its parent dies. This is not the case. If the parent dies, the child can live on and is adopted by the init process, pid 1. A zombie occurs when a process exits and its exit status hasn't yet been read by the parent process (using the wait() system call). The process remains in the process table as terminated, waiting for the parent to read the exit status.

Here's a simple example that creates a zombie process for 30 seconds:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * A program to create a 30s zombie
 * The parent spawns a child that exits right away but isn't reaped,
 * since the parent sleeps for 30s without calling wait().
 * The zombie is reaped by init once the parent is done with sleep and exits.
 */
int main(int argc, char *argv[])
{
        pid_t id = fork();

        if ( id == -1 ) {
                perror("fork");
                exit(EXIT_FAILURE);
        }

        if ( id > 0 ) {
                printf("Parent is sleeping..\n");
                sleep(30);
        }

        if ( id == 0 )
                printf("Child process is done.\n");

        exit(EXIT_SUCCESS);
}

The post Linux process states is an excellent description of the process states, with code examples that use ptrace to control them.

What does a process contain?

I briefly mentioned the process table; here I'll explain what it is. The process table is a data structure that the Linux kernel keeps in RAM and that contains information about every process.

Every process is described by a task_struct data structure, which contains amongst other things:

  • State (task state, exit code, exit signal..)
  • Priority
  • PID
  • PPID
  • Children
  • Usage (cpu time, open files..)
  • Tracing information
  • Scheduling information
  • Memory management information

The task_struct data structure is defined in include/linux/sched.h. All processes in the system are represented in the kernel as a linked list of task_struct entries.

Information about a process can be queried via the /proc file system. To get information about the process with pid 400, look in the /proc/400 directory. Most of the information can also be found using userland tools such as top and ps.

Process execution

When a program is executed, it is loaded into virtual memory, space is allocated for its variables and information about the new process is added to its task_struct.

The process contains a memory layout of four different segments:

  • Text contains the machine instructions of the program
  • Data contains global and static variables
  • Heap is the area for dynamic memory allocations
  • Stack is of dynamic size and grows and shrinks as the process runs; this is the storage for local variables

Two system calls are involved in starting a program: fork() and execve(). They work quite differently.

fork() creates a child process. The child gets a copy of the parent's data, stack and heap memory segments and can then modify these segments independently. The text segment is shared with the child process but cannot be modified.

execve() does not create a process on its own. It loads a new program into the calling process: the existing memory segments are destroyed and new ones are created for the new program, while the pid stays the same. Unlike fork(), execve() takes the path to an executable or script as an argument.

Note that the two are commonly combined: a process calls fork() to create a child, and the child then calls execve() to run a different program.

There's a lot more to process execution than this: scheduling, permissions, resource limits, library linking, memory mapping and so on. Covering everything would unfortunately make this post too long. Perhaps this will be something to revisit later on.

Interprocess communication (IPC)

For processes to communicate with each other, a couple of methods exist such as shared memory or message passing.

In the case of shared memory, a shared region is created that several processes can access simultaneously. The same idea is familiar from threads, which share memory with each other by default. This is the fastest form of IPC because communicating only involves reading and writing memory. However, the processes involved must agree on sharing the memory segment, as the kernel normally prevents a process from accessing another process's memory.

Shared memory segments in use can be found with the command ipcs -m.

Implementing a server for shared memory looks something like this:

#include <stdlib.h>
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SEGMENT_SIZE 64

int main(int argc, char *argv[])
{
        int shmid;
        char *shmaddr;

        /* Create or get the shared memory segment */
        if ((shmid = shmget(555, SEGMENT_SIZE, 0644 | IPC_CREAT)) == -1) {
                printf("Error: Could not get memory segment\n");
                exit(EXIT_FAILURE);
        }

        /* Attach to the shared memory segment */
        if ((shmaddr = shmat(shmid, NULL, 0)) == (char *) -1) {
                printf("Error: Could not attach to memory segment\n");
                exit(EXIT_FAILURE);
        }

        /* Write a character to the shared memory segment */
        *shmaddr = 'a';

        /* Detach the shared memory segment */
        if (shmdt(shmaddr) == -1) {
                printf("Error: Could not close memory segment\n");
                exit(EXIT_FAILURE);
        }

        exit(EXIT_SUCCESS);
}

By substituting *shmaddr = 'a'; with printf("Segment: %s\n", shmaddr) you will get a client instead and be able to read the data in the shared memory segment.

Running ipcs -m will output the following information about the segment set with the server:

$ ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status      
0x0000022b 0          anton      644        64         0   

The segment can be removed with ipcrm. To learn more about implementing shared memory IPC read Beej's fantastic guide, Shared memory segments.

Other approaches to IPC are files, signals, sockets, message queues, pipes, semaphores and message passing. I'm not able to dive into all of the approaches but I think that signals and pipes should provide some interesting examples.

Signals

In process states, we saw an example of signals with the help of kill. A signal is a software interrupt that informs a process of events or exceptions that occur.

A signal is identified by an integer but is usually referred to by a name of the form SIGXXX, for example SIGSTOP or SIGCONT. Signals are used by the kernel to inform processes of events but can also be sent from a process with the kill() system call. A process that receives a signal may ignore it, be killed by it or be suspended by it. It is also possible to install a signal handler, in which case the process may do whatever it pleases when the signal occurs. The special signal SIGKILL cannot be trapped (handled) or ignored; it is used, for example, to kill a hung process. SIGKILL should not be confused with SIGTERM, which is what kill <PID> sends by default, or with SIGINT, which is sent when pressing Ctrl+C. Neither of these forcibly kills the process: they can be trapped, which gives the process a chance to clean up before exiting.

Pipes

A pipe connects the output of one process to the input of another. This is one of the oldest methods of IPC. An ordinary pipe is one-way: data flows in a single direction. A pipe is created with pipe() and, like many other objects in Linux, is treated as a file. The read() and write() operations apply to pipes as well as to files.

Named pipes improve on ordinary pipes. A named pipe has a name in the file system, so any process with the right permissions can open it, whereas an ordinary pipe can only be used by related processes that inherited its file descriptors. A named pipe also continues to exist when no writers or readers are using it. Data still flows in one direction at a time, so two-way communication is typically done with two named pipes. Named pipes are created as special files in the file system; in GNU/Linux they are also referred to as FIFOs (First In First Out).

Here's an example of creating a named pipe:

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
        if (mknod("myfifo", S_IFIFO|0666, 0) == -1) {
                printf("Failed to mknod\n");
                exit(EXIT_FAILURE);
        }

        exit(EXIT_SUCCESS);
}

In the executing directory, we'll see the file myfifo. It will look something like the following:

prw-rw-r--  1 anton anton    0 Dec 16 16:14 myfifo

That was a basic introduction to processes. The more I wrote, the more I realized how much there is to cover. I had a hard time knowing where to start and where to draw the line for what not to cover. Shared memory segments are something I haven't worked with much, and it was really fun to revisit that part of interprocess communication. Having good resources such as The Linux Programming Interface and Operating System Concepts also made it easier to get back into the concepts.

References

The following resources have been used to gain a better understanding of the field. If you want to learn more about operating systems, make sure to check these books out; they are pretty thick but a good read.