
Revisiting W^X with OpenBSD 6.0

Overview

OpenBSD 6.0 was released today, and with it come some exciting new security features. From my perspective, chief among them is the technical enforcement of W^X in user-land. The move from a policy statement to a technical control for enforcing executable space protection grew out of the discussions sparked by my last blog post on the situation, so I’m very excited about this development and thought a demonstration and discussion would be in order. (In the spirit of not putting the headline on Page 1 and the retraction on Page 11, hopefully BSDNow will cover this as well.)

Review

Recall that my interest in the subject stems from an objective requirement in the NIAP-sponsored Operating System Protection Profile v4.x, which states that the operating system should prevent an application from being able to map a page of memory with both Write and Execute permissions (protecting mmap(2)) and that once mapped, a page of memory should not be able to have permissions escalated (protecting mprotect(2)).
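For concreteness, here is a minimal sketch of the two operations the requirement restricts (my own illustration, not text from the profile and not the cc-memtest source):

    /*
     * Illustrative only: request a simultaneously writable and executable
     * mapping, and escalate an existing mapping's permissions after the fact.
     */
    #include <sys/mman.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* Case 1: ask for a page that is both writable and executable. */
        void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
            MAP_ANON | MAP_PRIVATE, -1, 0);
        if (p == MAP_FAILED)
            perror("mmap W|X");

        /* Case 2: map a page read/write, then try to make it executable. */
        void *q = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
            MAP_ANON | MAP_PRIVATE, -1, 0);
        if (q != MAP_FAILED && mprotect(q, 4096, PROT_READ | PROT_EXEC) == -1)
            perror("mprotect +X");

        return 0;
    }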

There is room for interpretation of this requirement, with two competing ways to implement it. I made the case for my personal preference in my last post; however, the competing interpretation, in which we are merely concerned that a page of memory never be simultaneously W|X, without necessarily enforcing, say, W!->X, does have merit and would likely meet the letter of the requirement.

OpenBSD, which is where the branded term “W^X” originates, now enforces the strict W^X definition in 6.0, rather than the PaX/grsec “once writable, never executable” type of policy.

Looking Deeper – A side-by-side comparison with HardenedBSD’s PaX model

In OpenBSD 6.0, when an application attempts to map a page of memory W|X, the process is terminated with SIGABRT (signal 6) and drops core. My test application, cc-memtest, which defines three test cases, does not get past the first test on OpenBSD 6.0; the application is aborted:

[Screenshot: obsd60-hbsd-cap6]

Looking in GDB, we see right where the fault is caught:

[Screenshot: obsd60-cap2]

Handling this same case, a PaX-model implementation does something a little different. As we see in both GDB and the procstat output, the page of memory is actually created and mapped; however, it is only writable, not executable (and, properly, not readable either, since we didn’t request read).

[Screenshot: obsd60-hbsd-cap3]

The implementation returns a writable page whenever write and execute permissions are asked for.

The application is actually allowed to keep running, and may or may not behave correctly afterward. Any attempt to perform an action on that page for which it lacks permission will fail (i.e., attempting to execute code that is placed in the buffer). By contrast, as we saw with OpenBSD’s implementation, the fault there is considered to be the allocation of the memory region itself and not the unapproved access to the page.
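To make the difference concrete, here is a hedged sketch (my own, not the actual cc-memtest code) of how the PaX-style failure surfaces at use time rather than at allocation time:

    #include <sys/mman.h>
    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        unsigned char ret_insn = 0xc3;  /* x86 "ret" */

        /* On OpenBSD 6.0 this call itself aborts the process (SIGABRT). */
        void *p = mmap(NULL, 4096, PROT_WRITE | PROT_EXEC,
            MAP_ANON | MAP_PRIVATE, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap W|X");
            return 1;
        }

        /* Under a PaX-style policy the mapping is created, so this write
         * is allowed... */
        memcpy(p, &ret_insn, sizeof(ret_insn));

        /* ...but the page was silently stripped of PROT_EXEC, so jumping
         * into it faults (SIGSEGV) here, at the point of misuse. */
        ((void (*)(void))p)();

        puts("executed code from the W|X buffer");
        return 0;
    }

Under the PaX model the program only dies when (and if) it actually tries to execute from the buffer; under OpenBSD 6.0 it never gets past the mmap(2) call.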

In this regard, the behavior of the OpenBSD implementation is more readily apparent. The program author requested an R|W|X page of memory and that request terminates the program, as opposed to providing a page of memory with different permissions than were explicitly requested by the program author and then exhibiting failure behavior on an access that may or may not ever happen.

Because HardenedBSD doesn’t abort the process, we can set a new breakpoint and continue on to the additional tests to show how mprotect(2) hardening works:

[Screenshot: obsd60-hbsd-cap4]

Here, we see that the application has successfully mapped a page of memory as executable and then attempted to escalate its permissions to include write. mprotect(2) returns an error code, and we see from procstat that the page of memory in question was created as executable and was still ONLY executable after the mprotect(2) attempt.
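For reference, a minimal sketch of that escalation test (assumptions mine; this is not the actual cc-memtest source):

    #include <sys/mman.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* Map a page executable... */
        void *p = mmap(NULL, 4096, PROT_EXEC,
            MAP_ANON | MAP_PRIVATE, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap X");
            return 1;
        }

        /* ...then try to add write permission.
         * HardenedBSD: returns -1 and the page stays executable-only.
         * OpenBSD 6.0: the process is killed with SIGABRT instead. */
        if (mprotect(p, 4096, PROT_WRITE | PROT_EXEC) == -1)
            perror("mprotect W|X");

        return 0;
    }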

My quick-and-dirty test program doesn’t have a handler for SIGABRT, and rather than write one, I pulled test three out as its own program so we can verify mprotect(2) hardening on OpenBSD:

[Screenshot: obsd60-7]

Again, we see that the page was created properly, but the attempt to make it both executable and writable caused the program to receive SIGABRT and die.

Other differences

One of the other major differences between the PaX/grsecurity model that HardenedBSD implements and OpenBSD 6.0’s implementation is whether a page of memory should EVER be allowed to become writable once it has been executable, and vice versa. Some JITs, for instance, may want to use mprotect(2) to switch a page of memory between write and execute permissions so that it is never simultaneously writable and executable. On HardenedBSD, that does not work. On OpenBSD, you can do that:

[Screenshot: obsd60-hbsd-cap6]
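In rough outline, that mprotect(2) flip looks something like this hedged sketch of my own (not the code from the screenshots):

    #include <sys/mman.h>
    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        unsigned char ret_insn = 0xc3;  /* x86 "ret", standing in for JIT output */

        void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
            MAP_ANON | MAP_PRIVATE, -1, 0);
        if (p == MAP_FAILED)
            return 1;

        memcpy(p, &ret_insn, sizeof(ret_insn));  /* emit code while writable */

        /* Flip to read/execute before running the generated code.  This is
         * allowed on OpenBSD 6.0; a PaX-style MPROTECT policy such as
         * HardenedBSD's refuses it. */
        if (mprotect(p, 4096, PROT_READ | PROT_EXEC) == -1) {
            perror("mprotect W->X");
            return 1;
        }
        ((void (*)(void))p)();

        /* Flip back to read/write to regenerate the code. */
        if (mprotect(p, 4096, PROT_READ | PROT_WRITE) == -1)
            perror("mprotect X->W");

        return 0;
    }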

For instance, on HardenedBSD, a piece of code like this will never set write permissions on the page:

[Screenshot: obsd60-hbsd-cap5]

[Screenshot: obsd60-hbsd]

So, to avoid running afoul of page exec protections under PaX, a JIT implementer (for instance) would have to take alternative approaches, such as:

  • Maintaining a ‘shadow mapping’ in the same process space and removing one map when it is no longer needed (see the sketch after this list)
  • Privilege-separating the JIT into two processes, each with its own address space and its own mapping of a shared memory object containing the code to be run through the JIT
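A rough sketch of the first approach, with the details (shared memory object name, size) assumed purely for illustration: the same shared memory object is mapped twice, one view writable and one view executable, so that no single mapping is ever W|X.

    #include <sys/mman.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int
    main(void)
    {
        unsigned char ret_insn = 0xc3;  /* x86 "ret", standing in for JIT output */

        int fd = shm_open("/jit-demo", O_RDWR | O_CREAT | O_EXCL, 0600);
        if (fd == -1 || ftruncate(fd, 4096) == -1) {
            perror("shm_open/ftruncate");
            return 1;
        }

        /* Writable view for emitting code... */
        unsigned char *w = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
            MAP_SHARED, fd, 0);
        /* ...and a separate executable view of the same object. */
        void *x = mmap(NULL, 4096, PROT_READ | PROT_EXEC,
            MAP_SHARED, fd, 0);
        if (w == MAP_FAILED || x == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        memcpy(w, &ret_insn, sizeof(ret_insn));  /* write through the W view */
        ((void (*)(void))x)();                   /* execute through the X view */

        munmap(w, 4096);                         /* drop the writable view when done */
        shm_unlink("/jit-demo");
        return 0;
    }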

Conclusions

When I looked into this issue back in the spring, OpenBSD did not have technical controls in place in user-land, merely a policy that application authors should NOT map W|X memory. HardenedBSD and NetBSD both implemented the PaX model and were able to pass my tests (NetBSD needing a sysctl tweak to enable the protection, HardenedBSD by default). However, the discussions that followed the press the previous post received led to the new technical control implementation arriving in 6.0.

Obviously, the two approaches have some functional differences. OpenBSD’s approach is more deterministic than the PaX model in how one might expect a program to behave. Both the OpenBSD approach and the PaX approach now meet the “letter of the law” from the Common Criteria standpoint, as neither will allow an application to map a page of memory with W|X permissions – OpenBSD merely shuts the door on that activity in a more spectacular fashion.

Overall, though, we are now in a position in which three operating systems exhibit correct behavior with regard to this requirement without requiring additional patches to be applied (e.g., bringing grsecurity into Linux), and all three are BSDs. My hope is that FreeBSD will look into adopting the HardenedBSD patches upstream so that all three mainline BSDs can meet this requirement, but in the meantime the inclusion of OpenBSD in the club makes it harder for other OS vendors not to take the requirement seriously moving forward.

(And, in related news, NetBSD has finally removed the last RWX page of memory in its kernel. The announcement can be seen here.)

 

Intel RIP ROP, Hardware-Enforced CFI, and the Future of Trusted Computing

Recently, Intel made an exciting announcement regarding an anti-exploitation technology called Control-flow Enforcement Technology, or CET.  This builds on previous work on Control Flow Integrity (CFI) done by Microsoft researchers, as well as a paper by IAD researchers proposing hardware-enforced CFI.

While the IAD paper showed a modified Qemu and Linux kernel providing a proof-of-concept, an actual, in-silicon implementation in the major commodity CPU architecture would be a really big deal for trusted computing going forward.

A Brief History of the Exploitation Arms Race

For years, there has been an on-going arms race between attackers developing new exploitation techniques and security engineers developing exploit mitigation techniques to frustrate their attempts. CET is the natural evolution in this process.  The 30,000-foot view of how we got here so far can be summed up like this:

Exploit techniques and their mitigations:

  • Exploit: Basic buffer overflows exploit a lack of bounds checking in software; the attacker overwrites a function’s return address, takes control of RIP/EIP, and gains arbitrary code execution. Mitigation: Stack protection technologies such as ProPolice and -fstack-protector provide protection against simple buffer overflows.
  • Exploit: Advanced buffer overflow techniques allow an attacker to defeat stack protection. Mitigation: Executable space protection such as W^X and DEP prevents an attacker from writing shellcode into a buffer and then executing it.
  • Exploit: Return-to-libc attacks let an attacker leverage library routines known to be at certain locations in memory, such as execve, to bend the program to the attacker’s will. Mitigation: Address Space Layout Randomization (ASLR) and Position Independent Executables (PIE) make determining the runtime memory location of those helpful libc routines much more difficult for the attacker.
  • Exploit: ASLR can be brute-forced and the attacker can still steer program execution. Mitigation: Control Flow Integrity (CFI) can use a shadow stack to check the return address in the main program’s execution flow and ensure that the flow hasn’t been tampered with.
  • Exploit: The CFI shadow stack is still in memory and can be corrupted by an attacker. Mitigation: Hardware protection of the CFI shadow stack pages (CET).

Hardware-enforced CFI to the rescue

Control Flow Integrity is an obvious solution to the problem of an attacker corrupting data in memory to maliciously control program execution flow. However, a pure software CFI implementation still has problems and could itself be exploited, and it is also not that widely supported. The major problem is that if CFI is implemented purely in software, protected only by the kernel’s protection mechanisms, then it isn’t really much safer than the memory space of the process being exploited.

NSA’s Information Assurance Directorate (IAD) proposed the natural solution to this: enforcing CFI in hardware. Essentially, the shadow stack would live in a region of memory protected by either a modified MMU or a new MMU-like unit. They modified the Qemu hypervisor to simulate such a chip, then modified the Linux kernel to work with this hardware.

Intel seems to be taking this to the next level, leveraging the MMU and building support into silicon, so that CFI can be enforced in hardware.

The shadow stack is placed in a region of memory inaccessible to the process itself, and access is mediated via the MMU on behalf of the CPU. The CPU checks the return address on every RET against the address on the shadow stack and, if they don’t match, throws an error. Attempts by an attacker to access the shadow stack result in a page fault, which also needs to be handled appropriately by the OS kernel.
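Conceptually, and purely as an illustration (this is a rough model of mine, not Intel’s specification), the shadow-stack check behaves something like this:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define SHADOW_DEPTH 1024

    /* In CET this lives in pages the program itself cannot write; it is
     * modelled here as an ordinary array, with bounds checks omitted. */
    static uint64_t shadow_stack[SHADOW_DEPTH];
    static size_t shadow_top;

    /* On CALL: the return address is pushed onto both the normal stack
     * and the shadow stack. */
    void
    cet_call(uint64_t return_address)
    {
        shadow_stack[shadow_top++] = return_address;
    }

    /* On RET: the return address popped from the normal stack is compared
     * against the shadow copy; a mismatch means the control flow was
     * tampered with. */
    void
    cet_ret(uint64_t return_address_from_stack)
    {
        uint64_t expected = shadow_stack[--shadow_top];

        if (return_address_from_stack != expected) {
            /* Real hardware raises a control-protection fault here. */
            fprintf(stderr, "control-flow violation: %llx != %llx\n",
                (unsigned long long)return_address_from_stack,
                (unsigned long long)expected);
            abort();
        }
    }

    int
    main(void)
    {
        cet_call(0x1234);
        cet_ret(0x1234);    /* matches: execution continues */
        cet_call(0x1234);
        cet_ret(0xdead);    /* mismatch: the modelled "fault" fires */
        return 0;
    }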

CET and the future of the OSPP

Documents regarding the development of the Protection Profile for General Purpose Operating Systems have referenced the IAD paper on hardware CFI for a while. Beyond those theoretical proposals, however, there haven’t been “real” implementations. As Intel continues to flesh out its proposal and introduce support, I foresee future revisions of the PP taking this as an objective requirement. It will take some time for competing CPU architecture manufacturers to implement similar functionality before it can be made a hard requirement, though.

That said, I see this as a natural evolution of exploit mitigation techniques and really the future of trusted computing. CET, combined with boot chain trust, application whitelisting, and existing and new anti-exploitation techniques, puts us on track toward developing trusted systems in which even more classes of threat can be eliminated.