PatchworkOS 321f6ec
A non-POSIX operating system.

PatchworkOS is a modular non-POSIX operating system for the x86_64 architecture that rigorously follows an "everything is a file" philosophy, in the style of Plan9. Built from scratch in C and assembly, it's intended to be an educational and experimental operating system.
In the end this is a project made for fun, but the goal is to make a "real" operating system: one that runs on real hardware and has the performance one would expect from a modern operating system, without jumping ahead to user space features or drivers. A floppy disk driver and a round-robin scheduler are not enough.
Also, this is not a UNIX clone. It's intended to be a (hopefully) interesting experiment in operating system design that attempts to use unique algorithms and designs over tried and tested ones. Sometimes this leads to bad results, and sometimes, with a bit of luck, good ones.
Finally, despite its experimental nature and scale, the project aims to remain approachable and educational, something that can work as a middle ground between fully educational operating systems like xv6 and production operating systems like Linux.
Will this project ever reach its goals? Probably not, but that's not the point.
Stress test showing ~100% utilization across 12 CPUs.
DOOM running on PatchworkOS using a doomgeneric port.
- An EEVDF scheduler with O(log n) worst case complexity. EEVDF is the same algorithm used in the modern Linux kernel, but ours is obviously a lot less mature.
- Memory management with O(1) complexity per page and O(n) complexity per allocation/mapping operation, where n is the number of pages, see benchmarks for more info.
- And much more...
As one of the main goals of PatchworkOS is to be educational, I have tried to document the codebase as much as possible along with providing citations to any sources used. Currently, this is still a work in progress, but as old code is refactored and new code is added, I try to add documentation.
If you are interested in knowing more, then you can check out the Doxygen generated documentation. For an overview check the topics section in the sidebar.
PatchworkOS uses a "modular" kernel design, meaning that instead of having one big kernel binary, the kernel is split into several smaller "modules" that can be loaded and unloaded at runtime.
This is highly convenient for development, but it also has practical advantages, for example, there is no need to load a driver for a device that is not attached to the system, saving memory.
Making a module is intended to be as straightforward as possible. For the sake of demonstration, we will create a simple "Hello, World!" module.
First, we create a new directory in src/kernel/modules/ named hello, and inside that directory we create a hello.c file to which we write the following code:
An explanation of the code will be provided later.
Now we need to add the module to the build system. To do this, just copy an existing module's .mk file without making any modifications. For example, we can copy src/modules/drivers/ps2/ps2.mk to src/modules/hello/hello.mk. The build system will handle the rest, including copying the module to the final image.
Now, we can build and run PatchworkOS using make all run, or we could use make all and then flash the generated bin/PatchworkOS.img file to a USB drive.
Now to validate that the module is working, you can either watch the boot log and spot the Hello, World! message, or you could use grep on the /dev/klog file in the terminal program like so:
This should output something like:
That's all. If this did not work, make sure you followed all the steps correctly. If there are still issues, feel free to open an issue.
Whatever you want. You can include any kernel header, or even headers from other modules, create your own modules and include their headers or anything else. There is no need to worry about linking, dependencies or exporting/importing symbols, the kernel's module loader will handle all of it for you. Go nuts.
This code in the hello.c file does a few things. First, it includes the relevant kernel headers.
Second, it defines a _module_procedure() function. This function serves as the entry point for the module and will be called by the kernel to notify the module of events, for example the module being loaded or a device being attached. On the load event, it will print "Hello, World!" using the kernel's logging system, resulting in the message being readable from /dev/klog.
Finally, it defines the module's information. This information is, from left to right, the name of the module, the author of the module (that's you), a short description of the module, the module version, the license of the module, and finally a list of "device types", in this case just BOOT_ALWAYS, but more could be added by separating them with a semicolon (;).
The list of device types is what causes the kernel to actually load the module. I will avoid going into too much detail (you can check the documentation for that), but I will explain it briefly.
The module loader itself has no idea what these type strings actually are, but subsystems can specify that "a device of the type represented by this string is now available". The module loader can then load either one or all modules that have specified in their list of device types that they can handle the specified type. This means that any new subsystem (ACPI, USB, PCI, etc.) can implement dynamic module loading using whatever types it wants.
So what is BOOT_ALWAYS? It is the type of special device that the kernel will pretend to "attach" during boot. In this case, it simply causes our hello module to be loaded during boot.
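To make the matching step a bit more concrete, here is a minimal sketch in plain C of how a loader could check whether a module's semicolon-separated device type list contains a given type. The function name device_types_contains() is made up for this example and is not part of the kernel, and the real module loader may well do this differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the kernel's implementation: returns true if
 * `type` appears as a complete entry in the semicolon-separated `list`. */
bool device_types_contains(const char* list, const char* type)
{
    assert(list != NULL && type != NULL);

    size_t typeLength = strlen(type);
    if (typeLength == 0)
    {
        return false;
    }

    const char* entry = list;
    while (*entry != '\0')
    {
        /* Length of the current entry, up to the next ';' or the end. */
        size_t entryLength = strcspn(entry, ";");
        if (entryLength == typeLength && strncmp(entry, type, typeLength) == 0)
        {
            return true;
        }

        entry += entryLength;
        if (*entry == ';')
        {
            entry++; /* Skip the separator. */
        }
    }
    return false;
}
```

With a list such as "BOOT_ALWAYS;PNP0303", both BOOT_ALWAYS and PNP0303 would match, while a prefix like PNP03 would not, since entries are compared as a whole.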
For more information, check the Module Documentation.
PatchworkOS strictly follows the "everything is a file" philosophy in a way inspired by Plan9. This can often result in unorthodox APIs that seem overcomplicated at first, but the goal is to provide a simple, consistent and, most importantly, composable interface for all kernel subsystems, more on this later.
Included below are some examples to familiarize yourself with the concept. We, of course, cannot cover everything so the concepts presented here are the ones believed to provide the greatest insight into the philosophy.
The first example is sockets, specifically how to create and use local seqpacket sockets.
To create a local seqpacket socket, you open the /net/local/seqpacket file. This is equivalent to calling socket(AF_LOCAL, SOCK_SEQPACKET, 0) in POSIX systems. The opened file can be read to return the "ID" of the newly created socket which is a string that uniquely identifies the socket, more on this later.
PatchworkOS provides several helper functions to make file operations easier, but first we will show how to do it without any helpers:
Using the sread() helper which reads a null-terminated string from a file descriptor, we can simplify this to:
Finally, using the sreadfile() helper which reads a null-terminated string from a file from its path, we can simplify this even further to:
Note that the socket will persist until the process that created it and all its children have exited. Additionally, for error handling, all functions will return either `NULL` or `ERR` on failure, depending on whether they return a pointer or an integer type respectively. The per-thread `errno` variable is used to indicate the specific error that occurred, both in user space and kernel space (however, the actual variable is implemented differently in kernel space).
Now that we have the ID, we can discuss what it actually is. The ID is the name of a directory in the /net/local directory, in which the following files exist:
- data: Used to send and retrieve data
- ctl: Used to send commands
- accept: Used to accept incoming connections

So, for example, the socket's data file is located at /net/local/[id]/data.
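The directory layout above boils down to simple string formatting. As a minimal sketch in standard C (not using the PatchworkOS helpers), a path to one of a socket's files could be built like this. The socket_file_path() name is hypothetical and only exists for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "/net/local/<id>/<file>" into `buffer`.
 * Returns 0 on success, or -1 if the ID is invalid or the path does not fit. */
int socket_file_path(char* buffer, size_t size, const char* id, const char* file)
{
    assert(buffer != NULL && id != NULL && file != NULL);

    /* The ID names a single directory, so it must not be empty or contain '/'. */
    if (id[0] == '\0' || strchr(id, '/') != NULL)
    {
        return -1;
    }

    int length = snprintf(buffer, size, "/net/local/%s/%s", id, file);
    if (length < 0 || (size_t)length >= size)
    {
        return -1;
    }
    return 0;
}
```

For an ID of 5 and the file name data, this produces /net/local/5/data.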
Say we want to make our socket into a server. We would then use the ctl file to send the bind and listen commands, which is similar to calling bind() and listen() in POSIX systems. In this case, we want to bind the server to the name myserver.
Once again, we provide several helper functions to make this easier. First, without any helpers:
Using the F() macro which allocates formatted strings on the stack and the swrite() helper that writes a null-terminated string to a file descriptor:
Finally, using the swritefile() helper which writes a null-terminated string to a file from its path:
If we wanted to accept a connection using our newly created server, we just open its accept file:
The file descriptor returned when the accept file is opened can be used to send and receive data, just like when calling accept() in POSIX systems.
For the sake of completeness, to connect to the server we just create a new socket and use the connect command:
You may have noticed that in the sections above the open() function does not take a flags argument. This is because flags are directly part of the file path, so to create a non-blocking socket:
Multiple flags are allowed, just separate them with the : character. This means flags can be easily appended to a path using the F() macro. Each flag also has a shorthand version for which the : character is omitted. For example, to open a file as create and exclusive, you can do
or
For a full list of available flags, check the Documentation.
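Since flags are just text appended to a path, composing them is ordinary string handling. Below is a minimal sketch in standard C of appending flags with the : separator. The path_with_flags() helper is hypothetical, and the long-form flag names create and exclusive used in the usage note are assumed spellings, so check the flag list in the documentation for the real names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies `path` into `buffer` and appends each flag,
 * separated by ':'. Returns 0 on success or -1 if the result does not fit,
 * in which case the buffer contents should not be used. */
int path_with_flags(char* buffer, size_t size, const char* path, const char* const* flags, size_t count)
{
    assert(buffer != NULL && path != NULL);

    size_t used = strlen(path);
    if (used >= size)
    {
        return -1;
    }
    memcpy(buffer, path, used + 1); /* Copy the path including its terminator. */

    for (size_t i = 0; i < count; i++)
    {
        int written = snprintf(buffer + used, size - used, ":%s", flags[i]);
        if (written < 0 || (size_t)written >= size - used)
        {
            return -1;
        }
        used += (size_t)written;
    }
    return 0;
}
```

Calling this with a made-up path like /tmp/file and the flags create and exclusive yields /tmp/file:create:exclusive, which could then be handed to open().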
Permissions are also specified using file paths. There are three possible permissions: read, write and execute. For example, to open a file as read and write, you can do
or
Permissions are inherited, you can't use a file with lower permissions to get a file with higher permissions. Consider the namespace section, if a directory was opened using only read permissions and that same directory was bound, then it would be impossible to open any files within that directory with any permissions other than read.
For a full list of available permissions, check the Documentation.
Another example of the "everything is a file" philosophy is the spawn() syscall used to create new processes. We will skip the usual debate on fork() vs spawn() and just focus on how spawn() works in PatchworkOS as there are enough discussions about that online.
The spawn() syscall takes in two arguments:
- const char** argv: The argument vector, similar to POSIX systems except that the first argument is always the path to the executable.
- spawn_flags_t flags: Flags controlling the creation of the new process, primarily what to inherit from the parent process.

The system call may seem very small in comparison to, for example, posix_spawn() or CreateProcess(). This is intentional: trying to squeeze every possible combination of things one might want to do when creating a new process into a single syscall would be highly impractical, as those familiar with CreateProcess() may know.
PatchworkOS instead allows the creation of processes in a suspended state, allowing the parent process to modify the child process before it starts executing.
As an example, let's say we wish to create a child such that its stdio is redirected to some file descriptors in the parent and create an environment variable MY_VAR=my_value.
First, let's pretend we have some set of file descriptors and spawn the new process in a suspended state using the SPAWN_SUSPENDED flag:
At this point, the process exists but it's stuck blocked before it can load its executable. Additionally, the child process has inherited all file descriptors and environment variables from the parent process.
Now we can redirect the stdio file descriptors in the child process using the /proc/[pid]/ctl file, which just like the socket ctl file, allows us to send commands to control the process. In this case, we want to use two commands, dup2 to redirect the stdio file descriptors and close to close the unneeded file descriptors.
Note that `close` can take either one or two arguments. When two arguments are provided, it closes all file descriptors in the specified range. In our case, `-1` causes an underflow to the maximum file descriptor value, closing all file descriptors higher than or equal to the first argument.
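The underflow trick can be illustrated in standard C. The sketch below assumes, purely for illustration, that file descriptors are unsigned 64-bit values and that the range is inclusive at both ends. The type and function names are hypothetical and not taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: file descriptors are unsigned 64-bit values. */
typedef uint64_t example_fd_t;

/* Converts a signed command argument to the unsigned descriptor type.
 * A value of -1 wraps around to the largest representable value. */
example_fd_t fd_from_argument(int64_t argument)
{
    return (example_fd_t)argument;
}

/* Returns true if `fd` falls inside the range [first, last] (assumed inclusive). */
bool fd_in_close_range(example_fd_t fd, int64_t first, int64_t last)
{
    /* The range must be ordered after conversion to the unsigned type. */
    assert(fd_from_argument(first) <= fd_from_argument(last));
    return fd >= fd_from_argument(first) && fd <= fd_from_argument(last);
}
```

With a first argument of 3 and a second argument of -1, every descriptor from 3 upwards is in range, while 0, 1 and 2 (the usual stdio descriptors) are left alone.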
Next, we create the environment variable by creating a file in the child's /proc/[pid]/env/ directory:
Finally, we can start the child process using the start command:
At this point the child process will begin executing with its stdio redirected to the specified file descriptors and the environment variable set as expected.
The advantages of this approach are numerous: we avoid COW issues with fork(), weirdness with vfork(), system call bloat with CreateProcess(), and we get a very flexible and powerful process creation system that can use any of the other file based APIs to modify the child process. In exchange, the only real price we pay is overhead from additional context switches, string parsing and path traversals. How much this matters in practice is debatable.
For more on spawn(), check the Userspace Process API Documentation and for more information on the /proc filesystem, check the Kernel Process Documentation.
The next feature to discuss is the "notes" system. Notes are PatchworkOS's equivalent of POSIX signals; they asynchronously deliver strings to processes.
We will skip how to send and receive notes along with details like process groups (check the docs for that), instead focusing on the biggest advantage of the notes system: additional information.
Let's take an example. Say we are debugging a segmentation fault in a program, which is a rather common scenario. In a usual POSIX environment, we might be told "Segmentation fault (core dumped)" or even worse "SIGSEGV", which is not very helpful. The core limitation is that signals are just integers, so we can't provide any additional information.
In PatchworkOS, a note is a string where the first word of the string is the note type and the rest is arbitrary data. So in our segmentation fault example, the shell might produce output like:
Note that the output provided is from the "stackoverflow" program which intentionally causes a stack overflow through recursion.
All that happened is that the shell printed the exit status of the process, which is also a string and in this case is set to the note that killed the process. This is much more useful, we know the exact address and the reason for the fault.
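Because a note is just "type, then data", splitting one is trivial. Here is a minimal sketch in standard C, assuming the type and the data are separated by a single space. The note_view_t type, the note_split() function and the example note text in the usage note are all made up for illustration and are not the actual PatchworkOS API or note format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of a note. Both pointers point into the original string.
 * `type` is NOT null-terminated on its own, use `typeLength`. */
typedef struct
{
    const char* type;
    size_t typeLength;
    const char* data; /* Everything after the first space, may be empty. */
} note_view_t;

/* Splits a note into its type (the first word) and the remaining data. */
note_view_t note_split(const char* note)
{
    assert(note != NULL);

    note_view_t view;
    view.type = note;
    view.typeLength = strcspn(note, " ");
    view.data = note + view.typeLength;
    if (*view.data == ' ')
    {
        view.data++; /* Skip the separating space. */
    }
    return view;
}
```

For a made-up note such as "segfault at 0x10", the type would be segfault and the data would be "at 0x10".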
For more details, see the Notes Documentation, Standard Library Process Documentation and the Kernel Process Documentation.
A namespace is a set of mountpoints that is unique per process, allowing each process a unique view of the file system.
Think of it like this: in the common case, you can mount a drive to /mnt/mydrive and all processes can then open the /mnt/mydrive path and see the contents of that drive. However, for security reasons we might not want every process to be able to see that drive. This is what namespaces enable, allowing mounted file systems or directories to only be visible to a subset of processes.
As an example, the "id" directories mentioned in the socket example are a separate "sysfs" instance mounted in the namespace of the creating process, meaning that only that process and its children can see their contents.
For more information on how mounts are propagated or inherited, check the Documentation.
In cases where unrelated processes want to share a file or directory that is invisible to the other, they can voluntarily share a mountpoint in their namespaces using bind() in combination with two new system calls share() and claim().
For example, if process A wants to share its /net/local/5 directory from the socket example with process B, they can do
An interesting detail is that when process A opens the /net/local/5 directory, the dentry underlying the file descriptor is the root of the mounted file system. If process B were to try to open this directory, it would still succeed, as the directory itself is visible. However, process B would retrieve the dentry of the directory in the parent superblock, and would therefore see the content of that directory in the parent superblock instead. If this means nothing to you, don't worry about it.
I'm sure you have heard many an argument for and against the "everything is a file" philosophy. So I won't go over everything, but the primary reason for using it in PatchworkOS is "emergent behavior" or "composability" whichever term you prefer.
Take the spawn() example, notice how there is no specialized system for setting up a child after it's been created? Instead, we have a set of small, simple building blocks that when added together form a more complex whole. That is emergent behavior, by keeping things simple and most importantly composable, we can create very complex behavior without needing to explicitly design it.
Let's take another example. Say you wanted to wait on multiple processes with a waitpid() syscall. Well, that's not possible, so now we suddenly need a new system call. Meanwhile, in an "everything is a file" system we just have a pollable /proc/[pid]/wait file that blocks until the process dies and returns the exit status. Now any behavior that can be implemented with poll() can be used while waiting on processes, including waiting on multiple processes at once, waiting on a keyboard and a process, waiting with a timeout, or any weird combination you can think of.
Plus, it's fun.
PatchworkOS features a from-scratch ACPI implementation and AML parser, with the goal of being, at least by ACPI standards, easy to understand and educational. It is tested on the Tested Configurations below and against ACPICA's runtime test suite, but remains a work in progress (and probably always will be).
See ACPI Documentation for a progress checklist.
See ACPI specification Version 6.6 as the main reference.
ACPI or Advanced Configuration and Power Interface is used for a lot of things in modern systems but mainly power management and device enumeration/configuration. It's not possible to go over everything here, instead a brief overview of the parts most likely to cause confusion while reading the code will be provided.
It consists of two main parts, the ACPI tables and AML bytecode. If you have completed a basic operating systems tutorial, you have probably seen the ACPI tables before, for example the RSDP, FADT, MADT, etc. These tables are static in memory data structures storing information about the system, they are very easy to parse but are limited in what they can express.
AML or ACPI Machine Language is a Turing complete "mini language", and the source of much frustration, that is used to express more complex data, primarily device configuration. This is needed as it's impossible for any specification to account for every possible hardware configuration that exists currently, much less any that may exist in the future. So instead of trying to design that, what if we could just have a small program generate whatever data we wanted dynamically? Well, that's more or less what AML is.
To demonstrate how ACPI is used for device configuration, we will use the PS/2 driver as an example.
If you have followed a basic operating systems tutorial, you have probably implemented a PS/2 keyboard driver at some point, and most likely you hardcoded the I/O ports 0x60 and 0x64 for data and commands respectively, and IRQ 1 for keyboard interrupts.
Using this hardcoded approach will work for the vast majority of systems, but, perhaps surprisingly, there is no standard that guarantees that these ports and IRQs will actually be used for PS/2 devices. It's just a silent agreement that pretty much all systems adhere to for legacy reasons.
But this is where the device configuration from AML comes in, it lets us query the system for the actual resources used by the PS/2 keyboard, so we don't have to rely on hardcoded values.
If you were to decompile the AML bytecode into its original ASL (ACPI Source Language), you might find something like this:
Note that just like C compiles to assembly, ASL compiles to AML bytecode, which is what the OS actually parses.
In the example ASL, we see a Device object representing a PS/2 keyboard. It has a hardware ID (_HID), which we can cross-reference with an online database to confirm that it is indeed a PS/2 keyboard, a status (_STA), which is just a bit field indicating if the device is present, enabled, etc., and finally the current resource settings (_CRS), which is the thing we are really after.
The _CRS might look a bit complicated but focus on the IO and IRQNoFlags entries. Notice how they are specifying the I/O ports and IRQ used by the keyboard? Which in this case are indeed 0x60, 0x64 and 1 respectively. So in this case the standard held true.
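As a short aside on the _STA bit field mentioned above: the bit positions are defined by the ACPI specification, so decoding it is straightforward. The sketch below is standard C with macro and function names chosen for this example (they are not the names used in the PatchworkOS source), and the "usable" policy is just one reasonable interpretation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* _STA result bits as defined by the ACPI specification. */
#define STA_PRESENT (1u << 0)         /* The device is present. */
#define STA_ENABLED (1u << 1)         /* The device is enabled and decoding its resources. */
#define STA_SHOW_IN_UI (1u << 2)      /* The device should be shown in the UI. */
#define STA_FUNCTIONING (1u << 3)     /* The device is functioning properly. */
#define STA_BATTERY_PRESENT (1u << 4) /* A battery is present. */

/* A fully working, visible device reports the low four bits set. */
static_assert((STA_PRESENT | STA_ENABLED | STA_SHOW_IN_UI | STA_FUNCTIONING) == 0x0F, "unexpected _STA bit layout");

/* Hypothetical policy: a device is usable by a driver if it is present,
 * enabled and functioning. */
bool sta_is_usable(uint32_t sta)
{
    const uint32_t required = STA_PRESENT | STA_ENABLED | STA_FUNCTIONING;
    return (sta & required) == required;
}
```

A _STA value of 0x0F passes this check, while 0x0D (present and functioning, but not enabled) does not.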
So how is this information used? During boot, the _CRS information of each device is parsed by the ACPI subsystem. It then queries the kernel for the needed resources, assigns them to each device and makes the final configuration available to drivers.
Then when the PS/2 driver is loaded, it gets told "you are handling a device with the name `\_SB_.PCI0.SF8_.KBD_` (which is just the full path to the device object in the ACPI namespace) and the type `PNP0303`", it can then query the ACPI subsystem for the resources assigned to that device, and use them instead of hardcoded values.
Having access to this information for all devices also allows us to avoid resource conflicts, making sure two devices are not trying to use the same IRQ(s) or I/O port(s).
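To give an idea of what such a conflict check involves, here is a minimal sketch in standard C that tests whether two I/O port ranges overlap. The io_range_t type and io_ranges_overlap() function are hypothetical and only illustrate the idea, the actual resource management in the kernel is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a range of I/O ports. */
typedef struct
{
    uint16_t base;   /* First I/O port in the range. */
    uint16_t length; /* Number of consecutive ports, must be non-zero. */
} io_range_t;

/* Returns true if the two ranges share at least one I/O port. */
bool io_ranges_overlap(io_range_t a, io_range_t b)
{
    assert(a.length > 0 && b.length > 0);

    /* Exclusive end of each range, widened to avoid 16-bit overflow. */
    uint32_t aEnd = (uint32_t)a.base + a.length;
    uint32_t bEnd = (uint32_t)b.base + b.length;
    return a.base < bEnd && b.base < aEnd;
}
```

The single ports 0x60 and 0x64 from the PS/2 example do not overlap each other, but a made-up device claiming eight ports starting at 0x60 would conflict with both.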
Of course, it gets way, way worse than this, but hopefully this clarifies why the PS/2 driver and other drivers might look a bit different than what you might be used to.
All benchmarks were run on real hardware using a Lenovo ThinkPad E495. For comparison, I've decided to use the Linux kernel, specifically Fedora, since it's what I normally use.
Note that Fedora will obviously have a lot more background processes running and security features that might impact performance, so these benchmarks are not exactly apples to apples, but they should still give a good baseline for how PatchworkOS performs.
All code for benchmarks can be found in the benchmark program, all tests were run using the optimization flag -O3.
The test maps and unmaps memory in varying page amounts for a set number of iterations using generic mmap and munmap functions. Below are the results from PatchworkOS as of commit 4b00a88 and Fedora 40, kernel version 6.14.5-100.fc40.x86_64.
We see that PatchworkOS performs better across the board, and the performance difference increases as we increase the page count.
There are a few potential reasons for this. One is that PatchworkOS does not use a separate structure to manage virtual memory. Instead, it embeds metadata directly into the page tables, and since accessing a page table is just walking some pointers, it's highly efficient. Additionally, it provides better caching since the page tables are likely already in the CPU cache.
In the end we end up with an $O(1)$ complexity per page operation. We do of course get $O(n)$ complexity per allocation/mapping operation, where $n$ is the number of pages.
Of course, there are limitations to this approach. For example, it is in no way portable (which isn't a concern in our case), each address space can only contain $2^8 - 1$ unique shared memory regions, and copy-on-write would not be easy to implement (however, the need for this is reduced due to PatchworkOS using spawn() instead of fork()).
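To illustrate where a limit like $2^8 - 1$ can come from, here is a minimal sketch in standard C of storing an 8-bit ID in otherwise unused bits of a 64-bit page table entry. The bit position (bits 52 to 59, which x86_64 hardware ignores as long as protection keys are not enabled) and the choice of reserving ID 0 for "no region" are assumptions made for this example, the README does not state how PatchworkOS actually lays out its metadata.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the ID lives in bits 52-59 of the entry. */
#define PTE_ID_SHIFT 52
#define PTE_ID_MASK ((uint64_t)0xFF << PTE_ID_SHIFT)

/* Assumption: ID 0 means "not part of a shared region", which leaves
 * 2^8 - 1 = 255 usable IDs. */
#define PTE_ID_NONE 0

/* Returns `entry` with its ID field replaced by `id`, all other bits untouched. */
uint64_t pte_set_id(uint64_t entry, uint8_t id)
{
    return (entry & ~PTE_ID_MASK) | ((uint64_t)id << PTE_ID_SHIFT);
}

/* Extracts the ID field from `entry`. */
uint8_t pte_get_id(uint64_t entry)
{
    return (uint8_t)((entry & PTE_ID_MASK) >> PTE_ID_SHIFT);
}
```

Reading or updating the ID is a couple of bitwise operations on an entry that has to be touched anyway, which is the kind of per-page constant cost described above.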
All in all, this algorithm would not be a viable replacement for existing algorithms, but for PatchworkOS, it serves its purpose very efficiently.
The scheduler has not yet been properly benchmarked. However, testing using the "threadtest" program shows that the DOOM port remains more or less playable even with 1000+ threads running at 100% CPU load, so until proper benchmarking is done, we can conclude that performance is adequate.
PatchworkOS includes its own shell utilities designed around its file flags system. Where file flags are used, the short form is also demonstrated. Included is a brief overview with some usage examples. For convenience the shell utilities are named after their POSIX counterparts, however they are not drop-in replacements.
- touch: Opens a file path and then immediately closes it.
- cat: Reads from stdin or provided files and outputs to stdout.
- echo: Writes to stdout.
- ls: Reads the contents of a directory to stdout.
- rm: Removes a file or directory.

There are other utils available that work as expected, for example stat and link.
| Requirement | Details |
|---|---|
| OS | Linux (WSL might work, but I make no guarantees) |
| Tools | GCC, make, NASM, mtools, QEMU (optional) |
Source code can be found in the src/ directory, with public API headers in the include/ directory, private API headers are located alongside their respective source files.
For frequent testing, it might be inconvenient to repeatedly flash to a USB drive. You can instead set up the .img file as a loopback device in GRUB.
Add this entry to the /etc/grub.d/40_custom file:
Regenerate the GRUB configuration using sudo grub2-mkconfig -o /boot/grub2/grub.cfg.
Finally, copy the generated .img file to your /boot directory. This can also be done with make grub_loopback.
You should now see a new entry in your GRUB boot menu allowing you to boot into the OS, like dual booting, but without the need to create a partition.
Testing uses a GitHub action that compiles the project and runs it for some amount of time using QEMU with DEBUG=1, TESTING=1 and QEMU_EXIT_ON_PANIC=1 set. This will run some additional tests in the kernel (for example it will clone ACPICA and run all its runtime tests), and if QEMU has not crashed by the end of the allotted time, it is considered a success.
Note that the QEMU_EXIT_ON_PANIC flag will cause any failed test, assert or panic in the kernel to exit QEMU using its "-device isa-debug-exit" feature with a non-zero exit code, thus causing the GitHub action to fail.
Currently untested on Intel hardware (broke student, no access to hardware). Let me know if you have different hardware, and it runs (or doesn't) for you!
Contributions are welcome! Anything from bug reports/fixes, performance improvements, new features, or even just fixing typos or adding documentation.
If you are unsure where to start, check the Todo List.
Check out the contribution guidelines to get started.
The first Reddit post and image of PatchworkOS from back when getting to user space was a massive milestone and the kernel was supposed to be a UNIX-like microkernel.