The purpose of this blog is to help web developers jump into systems programming. Feel free to ask any question; there are no dumb questions. I want this blog to be a discussion space for every programmer who is on this journey.
As a dev tool engineer, spawning sub-processes was part of my daily job. I got used to doing it in Node.js with the [child_process](https://nodejs.org/api/child_process.html) module from the standard library.
I recently switched to a systems programming position (still in the dev tool world). I had to change my daily companion from Node.js to C++.
Then I had to learn how to spawn child processes in C++. Even though the terms and concepts are similar, the ergonomics of the C API we need for this purpose can intimidate people coming from a web background.
This article aims to shed light on the gray areas and make them easier for you!
✅ What will you be able to do at the end of this article?
- spawn basic subprocesses in C++ on UNIX-like OSes
- create a user-friendly API to consume
❌ What won't this article cover?
- spawning subprocesses in C++ on Windows (maybe in the next post)
- complex cases like manipulating stderr or stdout
Ready? Let's start our journey by defining terms.
When you launch a program (binary) on your computer, the OS creates a process object stored in memory. Among other things, this object holds a state (new, ready, running, waiting, or terminated).
Depending on the machine's resources and their availability, the OS scheduler runs the program and moves the process from one state to another.
If you want to dig into this topic a bit more, I recommend watching this video:
Now it's time to see which API we'll use to spawn a process in C++.
Which C++ API for spawning a process?
There are probably different approaches to spawning a process in C++.
My first web search brought me to the posix_spawn / posix_spawnp functions.
According to the man page:
The posix_spawn() and posix_spawnp() functions are used to create a new child process that executes a specified file.
"Executing a file?" (a voice pops into my mind)
Yes, in this context, file means your executable.
"Okay, but what is the difference between posix_spawn() and posix_spawnp()?"
The difference lies in the second argument. In the posix_spawn case, the argument should be a path to the executable (absolute or relative), but in the posix_spawnp case, the executable can be specified as a simple name.
posix_spawn -> "/usr/bin/curl"
posix_spawnp -> "curl"
In the latter case, the system will search for the executable in the list of directories stored in the PATH environment variable. Now let's take a look at the API itself:
posix_spawnp API
In the following listing, you'll find the posix_spawnp API description.
int posix_spawnp(pid_t *restrict pid,
                 const char *restrict file,
                 const posix_spawn_file_actions_t *restrict file_actions,
                 const posix_spawnattr_t *restrict attrp,
                 char *const argv[restrict],
                 char *const envp[restrict]);
We will skip the third and fourth parameters (file_actions and attrp): in my experience, I haven't had to deal with them, and we will simply pass a null pointer for both.
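As a rough illustration of the two call styles, here is a minimal sketch that runs sh -c "exit 0" either through posix_spawn with a full path or through posix_spawnp with a bare name, passing a null pointer for file_actions and attrp. The run_shell helper is hypothetical (it is not part of any library), and the sketch assumes that a shell is installed at /bin/sh.

```cpp
#include <spawn.h>
#include <sys/wait.h>

// Made available by execve(2) when a process begins
extern char **environ;

// Hypothetical helper: runs `sh -c "exit 0"` and returns true if the child
// exited normally with code 0.
// - search_path == true  -> posix_spawnp with a bare name, resolved via PATH
// - search_path == false -> posix_spawn with a full path (assumes /bin/sh exists)
bool run_shell(bool search_path) {
  char name[] = "sh";
  char flag[] = "-c";
  char script[] = "exit 0";
  char *argv[] = {name, flag, script, nullptr};

  pid_t pid;
  int err = search_path
                ? posix_spawnp(&pid, "sh", nullptr, nullptr, argv, environ)
                : posix_spawn(&pid, "/bin/sh", nullptr, nullptr, argv, environ);
  if (err != 0) {
    return false;  // the child could not be created
  }

  int status = 0;
  if (waitpid(pid, &status, 0) == -1) {
    return false;
  }
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling run_shell(true) and run_shell(false) should behave the same on a typical UNIX system; only the way the executable is located differs.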
There are a few caveats:
- argv: by convention, the first item should be the same as the file argument
- argv: the last item should be 0, a null pointer (look at the execve documentation for more details)
- envp: to forward the current environment, declare extern char **environ; in your file and pass environ; this variable is made available by execve(2) when a process begins
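envp has the same shape as argv: a null-terminated array of C strings, each of the form NAME=value. As a minimal sketch, the hypothetical run_with_env helper below passes a hand-built envp instead of environ and returns the child's exit code. It assumes a shell at /bin/sh, and it uses posix_spawn with a full path so that no PATH lookup is involved.

```cpp
#include <spawn.h>
#include <string>
#include <sys/wait.h>

// Hypothetical helper: runs `/bin/sh -c <script>` with `env_entry`
// (for example "GREETING=hello") as the child's only environment variable.
// Returns the child's exit code, or -1 if something went wrong.
int run_with_env(std::string env_entry, std::string script) {
  char shell[] = "/bin/sh";  // assumption: a shell lives at this path
  char flag[] = "-c";

  // argv[0] matches the file argument, and the array ends with a null pointer
  char *argv[] = {shell, flag, &script[0], nullptr};
  // envp follows the same null-terminated layout
  char *envp[] = {&env_entry[0], nullptr};

  pid_t pid;
  if (posix_spawn(&pid, shell, nullptr, nullptr, argv, envp) != 0) {
    return -1;
  }

  int status = 0;
  if (waitpid(pid, &status, 0) == -1) {
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_with_env("GREETING=hello", "test \"$GREETING\" = hello") lets the shell check that it really received the variable.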
Let's try to implement a basic version!
Basic implementation of a spawn function in C++
Let's say that we want to spawn a basic curl command:
#include <cstdio>
#include <cstdlib>
#include <errno.h>
#include <spawn.h>
#include <sys/wait.h>

// NOTE: made available by execve(2) when a process begins
extern char **environ;

int main() {
  pid_t pid; // #1
  // #2: writable buffers, because argv is an array of non-const char *
  char command[] = "curl";
  char url[] = "https://jsonplaceholder.typicode.com/posts/1";
  char *args[] = {command, url, nullptr};
  int err = posix_spawnp(&pid, // #3
                         args[0],
                         nullptr,
                         nullptr,
                         args,
                         environ);
  if (err != 0) {
    errno = err; // #4
    perror("posix_spawnp");
    exit(EXIT_FAILURE);
  }
  int status = 0;
  if (waitpid(pid, &status, 0) == -1) { // #5
    perror("waitpid");
    exit(EXIT_FAILURE);
  }
  // Success only if the child exited normally with code 0
  return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? EXIT_SUCCESS
                                                       : EXIT_FAILURE;
}
What we have in the body of main:
- #1: First, we declare the pid variable as a pid_t; posix_spawnp stores the ID of the child process in it.
- #2: Then we declare an array of C strings (named args) that contains the program we want to run, followed by its arguments and a final null pointer.
- #3: We invoke the posix_spawnp function with pid and args and store the result in an err variable.
Note that we use args twice: once for the file argument (args[0]) and then for the argv argument.
- #4: posix_spawnp returns 0 on success and an error number on failure; it does not set errno itself. To display the error message, we copy that number into the errno variable (errno.h header) and then call perror with a label. It should print something like: posix_spawnp: <error-message>
- #5: The last part is about waiting for the process to end with the waitpid function (sys/wait.h header), which returns -1 in case of error and otherwise fills status. The WIFEXITED and WEXITSTATUS macros then tell us whether the child exited normally and with which code.
If you want to play with it, visit this link.
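The value that waitpid writes into status is an encoded wait status, not the exit code itself. As a hedged sketch, the hypothetical helpers below decode it with the macros from sys/wait.h and return a shell-like exit code. The 128 + signal convention is borrowed from common shells and is a choice made for this example, not something posix_spawn imposes.

```cpp
#include <spawn.h>
#include <string>
#include <sys/wait.h>

// Made available by execve(2) when a process begins
extern char **environ;

// Hypothetical helper: turns a raw wait status into a shell-like exit code.
int exit_code_from_status(int status) {
  if (WIFEXITED(status)) {
    return WEXITSTATUS(status);  // normal exit: the code passed to exit()
  }
  if (WIFSIGNALED(status)) {
    return 128 + WTERMSIG(status);  // killed by a signal (shell convention)
  }
  return -1;  // stopped or otherwise unexpected
}

// Hypothetical helper: runs `sh -c <script>` and returns the decoded exit
// code, or -1 if the child could not be spawned or waited for.
int run_script(std::string script) {
  char shell[] = "sh";
  char flag[] = "-c";
  char *argv[] = {shell, flag, &script[0], nullptr};

  pid_t pid;
  if (posix_spawnp(&pid, argv[0], nullptr, nullptr, argv, environ) != 0) {
    return -1;
  }

  int status = 0;
  if (waitpid(pid, &status, 0) == -1) {
    return -1;
  }
  return exit_code_from_status(status);
}
```

With this in place, run_script("exit 3") gives you back the 3 that the child passed to exit, which is closer to what the Node.js exit event reports.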
Now let's reshape the whole implementation, because this version is not reusable at the moment.
First, we should move the body of main into a spawn function:
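#include <cstdio>
#include <cstdlib>
#include <errno.h>
#include <spawn.h>
#include <sys/wait.h>

// NOTE: made available by execve(2) when a process begins
extern char **environ;

int spawn(char *args[]) {
  pid_t pid;
  // posix_spawnp returns an error number instead of setting errno
  int err = posix_spawnp(&pid,
                         args[0],
                         nullptr,
                         nullptr,
                         args,
                         environ);
  if (err != 0) {
    errno = err;
    perror("posix_spawnp");
    exit(EXIT_FAILURE);
  }
  int status = 0;
  if (waitpid(pid, &status, 0) == -1) {
    perror("waitpid");
    exit(EXIT_FAILURE);
  }
  // Success only if the child exited normally with code 0
  return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? EXIT_SUCCESS
                                                       : EXIT_FAILURE;
}

int main() {
  // Writable buffers, because argv is an array of non-const char *
  char command[] = "curl";
  char url[] = "https://jsonplaceholder.typicode.com/posts/1";
  char *args[] = {command, url, nullptr};
  return spawn(args);
}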
That's not enough! Ideally, we'd like to mimic the Node.js API and be able to pass the command and arguments separately. Something like: spawn("curl", ["https://www.google.fr"]);
Also, instead of using old C strings and arrays, we could use std::string & std::vector.
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <errno.h>
#include <iterator>
#include <spawn.h>
#include <string>
#include <sys/wait.h>
#include <vector>

// NOTE: made available by execve(2) when a process begins
extern char **environ;

// #2
std::vector<const char *> format_args(const std::string &command,
                                      const std::vector<std::string> &arguments) {
  // NOTE: we need two more slots:
  // - one for the command itself
  // - another for the last "0" item (the vector is zero-initialized,
  //   so the final slot is already a null pointer)
  std::vector<const char *> cstr_args(arguments.size() + 2);
  std::transform(std::cbegin(arguments), std::cend(arguments),
                 std::begin(cstr_args) + 1,
                 [](const auto &v) { return v.c_str(); });
  cstr_args[0] = command.c_str();
  return cstr_args;
}

int spawn(const std::string &command, const std::vector<std::string> &args) {
  pid_t pid;
  const std::vector<const char *> cstr_args = format_args(command, args);
  // #3
  char *const *raw_args = const_cast<char *const *>(cstr_args.data());
  // posix_spawnp returns an error number instead of setting errno
  int err = posix_spawnp(&pid,
                         raw_args[0],
                         nullptr,
                         nullptr,
                         raw_args,
                         environ);
  if (err != 0) {
    errno = err;
    perror("posix_spawnp");
    return EXIT_FAILURE;
  }
  int status = 0;
  if (waitpid(pid, &status, 0) == -1) {
    perror("waitpid");
    return EXIT_FAILURE;
  }
  // Success only if the child exited normally with code 0
  return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? EXIT_SUCCESS
                                                       : EXIT_FAILURE;
}

int main() {
  const std::string command = "curl";
  const std::vector<std::string> args{
      "https://jsonplaceholder.typicode.com/posts/1"};
  int status = spawn(command, args); // #1
  return status;
}
What we have in this new version:
- #1: the spawn function now takes two parameters, one for the command (std::string) and another for the arguments (std::vector<std::string>).
- #2: then the spawn function has to format this input to fit into an old-fashioned array of strings. That's the purpose of the format_args function, which returns a vector of C strings.
- #3: finally, we have to convert this vector into an old-fashioned array; it's trivial because std::vector exposes a .data() method that makes that conversion easier.
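If the const_cast bothers you, one possible alternative is to point into strings you are allowed to modify, so that the pointers are plain char * from the start. The to_argv function below is a hypothetical helper sketched for this purpose; it is not the approach used in this article's listing. It assumes that the caller keeps the words vector alive and unmodified for as long as the returned pointers are in use.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds a null-terminated argv-style array that points
// into `words` (the command followed by its arguments).
// The returned pointers are only valid while `words` is alive and unchanged.
std::vector<char *> to_argv(std::vector<std::string> &words) {
  std::vector<char *> argv;
  argv.reserve(words.size() + 1);
  for (std::string &word : words) {
    argv.push_back(&word[0]);  // writable pointer to the string's characters
  }
  argv.push_back(nullptr);  // the mandatory last "0" item
  return argv;
}
```

The result can then be handed over with argv.data(), exactly like raw_args in the listing, with argv[0] serving as the file argument.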
💡 If you are not familiar with C++, just consider the format_args function as a magic box that converts our arguments into an old-fashioned C array.
If you want to play with the full version:
Conclusion
Congratulations! You managed to finish this post and you're now ready to spawn processes in C++.
Let me know in the comment section if you want to see a Windows version of this post.
Takeaways
If you want to read more about spawning processes, I recommend reading:
- https://man7.org/linux/man-pages/man3/posix_spawn.3.html
- https://man7.org/linux/man-pages/man3/waitpid.3p.html
If you want to understand how I convert a std::vector<std::string> to a char *args[], please take a look at: