[This post was conceived and co-authored by Andrew Ruef, Ph.D. student at the University of Maryland, working with me. –Mike]
As researchers, we are often asked to look into a crystal ball. We try to anticipate future problems so that work we begin now will help address those problems before they become acute. Sometimes, a researcher guesses the problem and its possible solution, but chooses not to pursue it. In a sense, she has found, and discarded, an idea ahead of its time.
Recently, a friend of Andrew’s pointed him to a 20-year-old email exchange on the “firewalls” mailing list that blithely suggests, and discards, problems and solutions that are now quite relevant, and on the cutting edge of software security research. The situation is both entertaining and instructive, especially in that the ideas are quite squarely in the domain of programming languages research, but were not considered by PL researchers at the time (as far as we know).
Problem: Exploits via buffer overflow
This 1995 e-mail exchange on the “firewalls” mailing list identifies the problem presented by buffer overflow-based exploits:
Someone can spend hours to develop an exploit for a particular target
platform, OS version, and CPU type. That exploit might need to use different
offsets on other versions or machine types, and it is clearly just a
It is almost getting boring or something.
Note this statement of boredom is being made just one year before the publication of Smashing the Stack for Fun and Profit, a watershed publication that systematizes the construction of stack smashing attacks. This statement also anticipates research that automates the discovery and construction of exploits. The author continues:
I just decided it would be pretty funny if we had a C compiler which would
randomize the location of variables on the stack for each compile. It
would also be nice if it could randomly stick in some unused short ints
into the stack in various places (especially around big strings used
by sprintf, strcpy, and friends ).
This would offer some protection against the next 10 years of cloned
stack-o-rama exploits. Raise the level of effort and all that.
It doesn’t change the price of the first exploit but it might prevent
anybody from making a profit on the volume!
Offhand solutions: 20 years of research
The amazing thing about this email exchange is the number of ideas that are offered almost tongue-in-cheek, but which in the 20 years since have turned into serious topics of study and, in some cases, mainstream deployment.
Stack randomization (2010 – active research)
The initial post quoted above suggests one possible defense — randomizing stack layouts. Some on the email list dismiss the idea (more on this below), but by 2010, Michael Franz makes the case that the idea should be revisited. In the last five years, his group and others have made a compelling case for automated software diversity, which includes the randomization of stack layouts.
Safe Stacks (2006 – now a compiler extension)
Others on the list respond with their own crazy ideas, such as the use of two stacks:
As long as we’re talking ‘blue sky’ solutions, why not do it _right_??
It’s well-known that the way you insure the integrity of code is keep it
separate from data. A single stack, containing both user-data, *and*
program ‘state’ information (function return addresses, etc.), fails to
maintain this separation. Solution: _two_ stacks. One for return-
addresses and processor ‘context’ info *only*, the other for all user-data.
Running with two stacks was seriously explored by the XFI system (which provided other protections, too) in 2006. The idea has matured to the point that it is now in clang, the production-quality front-end of LLVM, as SafeStack
Bounds Checking (1995 – going mainstream)
The author who proposed separate stacks also proposed preventing out-of-bounds accesses in the first place:
Another solution is to add ‘run-time array bounds checking’ to the compiler,
a la Pascal. Of course, then you have to deal with what happens when a
‘array bounds exception’ occurs. Default procedure, in Pascal for example,
is to announce it and _abort_ the program (wouldn’t *this* make for wonderful
‘denial-of-service’ attacks?!!). The way you prevent those aborts is to do
your _own_ checking *first*, and never reference past the end of an array.
Thus, you have _double_ checks on everything.
Jones and Kelly proposed backward compatible bounds checking for C that same year (1995), which also saw the release of the type-safe (and bounds-checked) Java programming language. Jones and Kelly’s solution was far too slow to be practical, but the benefits were deemed promising enough that a long line of research on type-safe (or memory-safe) C followed, including languages such as CCured and Cyclone in the early 2000s, and Softbound/CETS in 2009-2010. Hardware support for runtime bounds checking (based on work by Nagarakatte and others involved with Softbound) is slated to be released as an Intel ISA extension called MPX.
Stack Canaries (1998 – now mainstream)
Another author further riffs on these ideas, proposing what is now known as stack canaries:
I’ll do you one better on that. How about an integer or similiar type
allocated immediately after the buffer, whose value is set by the compiler
(“randomly”) and which is checked after the reads. If the buffer’s been over
written, it’s unlikely that the value will match what is put in. Note
that this doesn’t stop people on the machine from recompiling and noting
differences in the binaries to determine what value is being used, but it
will protect against most external attacks.
The paper that fleshes out the idea of stack canaries was published in 1998 by Cowan et al. Support for stack canaries would be added to mainstream compilers soon afterward (the original paper presented a gcc patch), with some small improvements such as a dynamically allocated integer as opposed to a static one.
Ideas not mentioned
While incredibly prescient, the authors of firewalls thread missed some important defenses against buffer overflow exploits. They did not identify the idea of Address Space Layout Randomization (ASLR), first proposed in 2001, though their idea of stack randomization gets at the same underlying concept. 1 They also did not identify Control Flow Integrity (CFI), first proposed in 2005, 2 which (like the other ideas proposed) compiles the program differently to limit the possible effects of an attack. Finding the right balance of precision and performance is still a research question for CFI, but it is making its way into the mainstream. More ideas explored in the last 20 years are nicely described in Szekeres et al’s 2013 paper, SOK: Eternal War in Memory.
How and why?
Seeing this email exchange, we were led to wonder: What has changed since 1995 that has upgraded what seemed at the time (to the authors) to be impractical ideas?
But when someone suggests fixing the problem by opening up and
hacking *another* large code-generating program, my worry alarm goes off.
Perhaps I’m too paranoid or too cranky, but I’m certainly not going to
accept on faith that hacking a compiler in the way described is a safe and
reliable way to proceed, without a lot of respected people walking through
the compiler and pronouncing it sane and safe on all applicable platforms.
This is a reasonable concern. In the last 20 years, the state of both the art and the practice of compiler construction has advanced quite a bit. There is now production-quality compiler, called CompCert, 3 that has been formally verified to be correct; extensions to this compiler should cause fewer concerns. Also, the emergence of LLVM, which provides a robust, modular infrastructure on which we can create exciting compiler extensions, gives greater confidence that those extensions are correct, and creates a quicker path to adoption. The Vellvm project looks to add verifiability to aspects of LLVM, which should also address “worry alarms.”
It also seems like there is an undercurrent in this discussion of “these runtime checks are expensive and wacky, why don’t we just get the code right the first time?” (which is part of the punchline of some CS stand-up comedy). Maybe it took a long time before the idea of “just write the C code right the first time” started to be replaced with some brutal realism about how hard it is to write C code right the first time. Maybe that realism is what we needed to start thinking about runtime defenses as something more than jokes told on a mailing list.
Another question that may come to mind is: Why were these ideas occurring on the “firewalls” list, and not within the programming languages community? The proposals are germane to language and compiler design, after all. The author closing out the email thread wondered the same thing:
All this said, methinks this is getting a bit far afield from _firewalls_.
and further discussion is probably more appropriate in one of the ‘comp.lang.*’
One possibility is that the firewalls community was closer to the exploits in the wild than the language community: they could identify the properties of the systems that the exploits relied on, and could conceive, at a high level, how those systems could be changed to break those dependencies while preserving the original semantics of the system.
Today, many of the developers of the next generation of runtime security systems either are also, or closely collaborate with, leading security experts that understand how modern exploits work. Indeed, as Mike has written before, there can be great synergy when applying programming languages ideas to problems identified by other fields, such as cryptography and machine learning.
Reading this, you might be inspired to ponder what technologies today seem like impossible long shots but actually could become relevant in 20 years. We’d love to see them, in the comments!
Here are some our off-the-tops-our-heads predictions:
Verification for the rest of us. Verification has existed as, charitably, a pipe dream, for many reasons: verification tools are slow and hard to use, it can be difficult to think about how to phrase properties you want to prove, and there always seems to be a gap between the proof and the implementation. However, systems like frama-c, F*, bedrock, and dafny seem to be closing gaps both in terms of usability and by strongly connecting the proof with the implementation. Projects like IronClad and Sel4, and funding efforts like HACMS, are pushing the state of the art.
Code injection will end. We have been plagued with low level security vulnerabilities that allow an adversary to inject a whole new program into a vulnerable program, usually abbreviated in the literature as buffer overflows. Technologies like those mentioned here have been closing the door on code injection attacks, e.g., the prevalence of stack-based attacks has greatly diminished in recent years. This trend will continue. This does not end the security story, but the concerns 20 years in the future should very different from the concerns of today.
Information disclosure will remain. Much more difficult than proving basic safety properties about C programs is proving properties about how the programs control and release the information they are charged with, such as (variants of) non-interference. Timing or resource usage based information leaks have received little attention in research (which motivated the recent DARPA STAC program). Perhaps if cloud computing continues to be adopted, disclosures of small secrets or keys are all that attackers will really need.
- While ASLR is of questionable value for 32-bit address spaces, it is now in common use and seems effective on modern (64-bit) processors. ↩
- Techniques for restricting a program’s control flow were also proposed prior to 2005, e.g., program shepherding in 2002. CFI is implemented via (re)compilation and defines its policy in terms of a control-flow graph, rather than (say) in terms of legal entry points to libraries. ↩
- Xavier Leroy, the leader of the CompCert project, just won the POPL 2006 Most Influential Paper Award for authoring the first paper on the work. ↩