Welcome to our exploration of macOS internals. In this blog, we dive into the fascinating world of system programming on macOS. Our journey will take us through both the familiar and the hidden aspects of macOS, providing insights that bridge the gap between high-level application development and the low-level magic that powers every Mac.

Suppose you are reverse-engineering one of Darwin’s binaries and want to read its source code. I want to show you the technique I use to find that source code quickly.

Apple publishes part of its open-source code at https://github.com/apple-oss-distributions. Some Darwin components are available there as well.

The technique

The open-source code is split into “projects.” For instance, the source code of the mv binary lives in the file_cmds project.

So, once we know the project, we can quickly look it up in the Apple OSS distribution repositories. The workflow is the following:

  • Locate the binary
  • Find the project name
  • Search it on the Apple OSS GitHub organization

Let’s try it with the cat command.

First example: cat

For this first example, I’ll detail the whole workflow so you can see how we get to the final result.

First, we locate the binary using the where command (a zsh builtin; which does a similar job in other shells).

➜  ~ where cat

You will notice that a lot of binaries live in /bin, but that is not always the case.

Now that we know where the binary is, let’s inspect its content. The __TEXT segment of the binary has a __const section that holds constant, nonrelocatable data. This is where you will find the name of the project associated with this binary.

💡On macOS, executable files are stored in the Mach-O format. The __TEXT segment contains the executable code and read-only data, and __const is a section within it designated for constant data. A “segment” is a broad division of a program’s memory space, while a “section” is a subdivision within a segment.
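To make the segment/section relationship concrete, here is a minimal Python sketch that walks the load commands of a Mach-O image and lists every (segment, section) pair. It is an illustration under assumptions, not a replacement for otool: it only understands thin, little-endian, 64-bit images, and the function name list_sections is made up for this post. System binaries are often universal ("fat") files that bundle several architectures; those start with a different magic number and are rejected here.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # magic number of a thin 64-bit Mach-O image
LC_SEGMENT_64 = 0x19       # load command that describes one 64-bit segment
HEADER_SIZE = 32           # sizeof(struct mach_header_64)
SEGMENT_SIZE = 72          # sizeof(struct segment_command_64)
SECTION_SIZE = 80          # sizeof(struct section_64)


def _cstr(raw):
    """Decode a fixed-width, NUL-padded name field."""
    return raw.split(b"\0", 1)[0].decode("ascii", "replace")


def list_sections(data):
    """Return [(segment, section), ...] for a thin 64-bit Mach-O image.

    Hypothetical helper: fat binaries and 32-bit or big-endian images
    are not handled and raise ValueError.
    """
    # magic is at offset 0, ncmds at offset 16 of the header.
    magic, ncmds = struct.unpack_from("<I12xI", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O image")

    found = []
    offset = HEADER_SIZE
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmd == LC_SEGMENT_64:
            segname = _cstr(data[offset + 8:offset + 24])
            (nsects,) = struct.unpack_from("<I", data, offset + 64)
            # The section headers follow the segment command directly.
            for i in range(nsects):
                sect_off = offset + SEGMENT_SIZE + i * SECTION_SIZE
                found.append((segname, _cstr(data[sect_off:sect_off + 16])))
        offset += cmdsize
    return found
```

Given the bytes of a thin image, list_sections(data) should include a ("__TEXT", "__const") pair whenever the binary has that section.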

Let’s inspect the content of the binary with otool:

➜  ~ otool -v -s __TEXT __const /bin/cat
Contents of (__TEXT,__const) section
0000000100003ef0	40 28 23 29 50 52 4f 47 52 41 4d 3a 63 61 74 20 
0000000100003f00	20 50 52 4f 4a 45 43 54 3a 74 65 78 74 5f 63 6d 
0000000100003f10	64 73 2d 31 35 34 0a 00 00 00 00 00 00 40 63 40 
0000000100003f20	24 46 72 65 65 42 53 44 24 00 

Not very readable, right? Let’s try to convert the hex back to text:

➜  ~ otool -v -s __TEXT __const /bin/cat  | xxd -r -p 
>?@(#)PROGRAM:cat ? PROJECT:text_cm?ds-154
@c@? $FreeBSD$%   

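Still ugly, right? This is not a proper conversion: xxd also decodes the address column at the start of each row (the leading > is the byte 0x3e from the address 0000000100003ef0). A simpler option is the strings command, which will “find the printable strings in an object, or other binary, file.”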
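If you do want to decode the otool dump itself, the address column has to be dropped first. The Python sketch below does that for the byte-per-column layout shown above. The function name decode_otool_dump is invented for this example, and the 16-digit address check assumes a 64-bit binary; otool may print a different layout on other architectures.

```python
SAMPLE = """\
Contents of (__TEXT,__const) section
0000000100003ef0 40 28 23 29 50 52 4f 47 52 41 4d 3a 63 61 74 20
0000000100003f00 20 50 52 4f 4a 45 43 54 3a 74 65 78 74 5f 63 6d
0000000100003f10 64 73 2d 31 35 34 0a 00 00 00 00 00 00 40 63 40
0000000100003f20 24 46 72 65 65 42 53 44 24 00
"""


def decode_otool_dump(text):
    """Turn `otool -s` hex rows into raw bytes, skipping the addresses.

    Assumes each data row is a 16-digit hex address followed by
    two-digit hex bytes, as in the dump shown in this post.
    """
    out = bytearray()
    for line in text.splitlines():
        parts = line.split()
        # Header lines do not start with a 16-digit address.
        if not parts or len(parts[0]) != 16:
            continue
        try:
            int(parts[0], 16)
        except ValueError:
            continue
        out += bytes.fromhex("".join(parts[1:]))
    return bytes(out)


if __name__ == "__main__":
    # NUL bytes are shown as spaces only to keep the output printable.
    print(decode_otool_dump(SAMPLE).replace(b"\0", b" ").decode("ascii"))
```

With the addresses removed, the @(#)PROGRAM:cat string comes out intact instead of being interleaved with stray bytes.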

➜  ~ strings /bin/cat | grep \#                                    
@(#)PROGRAM:cat  PROJECT:text_cmds-154

💡What does @(#) mean? It comes from SCCS (Source Code Control System), an early version control system that used this marker to tag identification strings embedded in files. While SCCS is not as widely used today, its conventions, such as these identifiers, can still be found in older codebases or systems that have evolved from Unix-based environments.

Fortunately, there is a more straightforward way to do it :) Using the what command:

The what utility searches each specified file for sequences of the form
“@(#)” as inserted by the SCCS source code control system.  It prints the
remainder of the string following this marker, up to a NUL character,
newline, double quote, ‘>’ character, or backslash.
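The rule quoted above is simple enough to sketch in a few lines of Python. This is only an illustration of the documented behaviour (find the marker, print up to the first terminator), not the actual implementation of what; the name what_strings is made up for this post.

```python
MARKER = b"@(#)"
# NUL, newline, double quote, '>' and backslash end an identifier.
TERMINATORS = b'\0\n">\\'


def what_strings(data):
    """Return the text following each @(#) marker found in `data`."""
    results = []
    start = data.find(MARKER)
    while start != -1:
        begin = start + len(MARKER)
        end = begin
        # Advance until a terminator or the end of the buffer.
        while end < len(data) and data[end] not in TERMINATORS:
            end += 1
        results.append(data[begin:end].decode("ascii", errors="replace"))
        start = data.find(MARKER, end)
    return results
```

Calling what_strings(open("/bin/cat", "rb").read()) on a Mac should give roughly the same strings as the real tool, since the scan works on raw bytes and does not need to parse the Mach-O structure.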

Let’s use it:

➜  ~ what -s /bin/cat
	PROGRAM:cat  PROJECT:text_cmds-154
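The line printed by what packs three pieces of information: the program, the project, and the project version. Here is a small Python sketch that splits them apart and builds a candidate repository URL. The helper names are invented for this post, and the URL pattern is an assumption (project name equals repository name under the apple-oss-distributions organization); it will not resolve for projects that Apple has not published.

```python
import re

# Matches e.g. "PROGRAM:cat  PROJECT:text_cmds-154".
IDENT_RE = re.compile(
    r"PROGRAM:(?P<program>\S+)\s+"
    r"PROJECT:(?P<project>\S+?)-(?P<version>[0-9][0-9.]*)$"
)


def parse_ident(line):
    """Split a `what` output line into program, project and version."""
    match = IDENT_RE.search(line.strip())
    return match.groupdict() if match else None


def repo_url(project):
    """Guess the GitHub repository URL for a project (assumption)."""
    return "https://github.com/apple-oss-distributions/" + project


if __name__ == "__main__":
    info = parse_ident("\tPROGRAM:cat  PROJECT:text_cmds-154")
    print(info["project"], info["version"], repo_url(info["project"]))
```

Keeping the version separate is handy because the same project name shows up with different versions across macOS releases.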

Now we have the project name. Let’s find it on GitHub:

Find the project corresponding to the cat command on GitHub

Let's try with another program.

Second example: launchd

Let’s repeat the previous steps but with launchd.

➜  ~ where launchd

💡sbin stands for “system binaries.” These are typically commands used for system administration tasks and are not intended for regular users. In contrast, /bin contains essential user binaries.

➜  ~ what -s /sbin/launchd
	PROGRAM:launchd  PROJECT:libxpc-2462.141.1.701.3

Let’s find libxpc on GitHub:

We cannot find libxpc source code

Oops! Unfortunately, libxpc is closed-source:

[…] which used to be open source but was closed in 10.10 following its integration into the libxpc project. J.Levin - *OS internals

Future Research Directions

Once I realized that @(#) strings can be defined by code maintainers, I searched the cat source code for @(#) occurrences. Unfortunately, I did not find the project name constant in the code, only:

I wonder if the build system adds the string to this specific section. I need to dig more to understand this. Maybe a topic for the next post?