The Tower of Babel

LANGSEC: Language-theoretic Security

"The View from the Tower of Babel"

The Fourth LangSec IEEE S&P Workshop at the IEEE Security & Privacy Symposium 2017 will be held in San Jose on May 25, 2017. This year, we are planning a LangSec Hackathon to go along with the workshop, hacking on secure implementations of popular protocols with the Hammer parser construction kit, in any language Hammer supports.

The Third LangSec IEEE S&P Workshop at the IEEE Security & Privacy Symposium 2016 was held in San Jose on May 26, 2016, keynoted by Doug McIlroy. The keynote, full papers, research reports, and presentation slides are posted at http://spw16.langsec.org/papers.html.

Our presentation at S4x16 applied the LangSec design principles and the Hammer tool kit to implementing a parser for DNP3, a popular ICS/SCADA protocol. Details & code: http://langsec.org/dnp3/

The Second Language-theoretic Security (LangSec) IEEE S&P Workshop at the IEEE Security & Privacy Symposium 2015 took place in San Jose on May 21, 2015, keynoted by Dan Geer. Workshop program and all presented papers and slides are now posted. The text of Dan Geer's keynote is also posted.

We released a series of video tutorials for Hammer, a LangSec secure(r) parser construction kit: the HammerPrimer on GitHub. Please help us beta-test this tutorial!

The First Language-theoretic Security (LangSec) IEEE S&P Workshop at the IEEE Security & Privacy Symposium 2014 took place in San Jose, May 18, 2014, keynoted by Caspar Bowden and Felix 'FX' Lindner. Workshop program and all presented papers are now posted.

The Language-theoretic approach (LANGSEC) regards the Internet insecurity epidemic as a consequence of ad hoc programming of input handling at all layers of network stacks, and in other kinds of software stacks. LANGSEC posits that the only path to trustworthy software that takes untrusted inputs is treating all valid or expected inputs as a formal language, and the respective input-handling routines as a recognizer for that language. The recognition must be feasible, and the recognizer must match the language in required computation power.

When input handling is done in an ad hoc way, the de facto recognizer, i.e., the input recognition and validation code, ends up scattered throughout the program, does not match the programmers' assumptions about safety and validity of data, and thus provides ample opportunities for exploitation. Moreover, for complex input languages the problem of full recognition of valid or expected inputs may be UNDECIDABLE, in which case no amount of input-checking code or testing will suffice to secure the program. Many popular protocols and formats fell into this trap, an empirical fact with which security practitioners are all too familiar.
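As a minimal sketch of the principle described above, the following Python fragment defines the valid inputs of a toy protocol as a formal language and puts a single recognizer in front of all processing code. The message format, the `MESSAGE` pattern, and the `recognize` and `handle` functions are assumptions made up for this illustration; they do not come from any real protocol or from the LangSec papers. The toy language is regular, so a regular expression is a recognizer of matching computational power.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical toy protocol (an assumption for illustration only):
# a message is "<name>=<value>\n", where <name> is 1-16 lowercase ASCII
# letters and <value> is 1-5 decimal digits. This language is regular,
# so a regular expression (a finite automaton) suffices to recognize it.
MESSAGE = re.compile(rb"([a-z]{1,16})=([0-9]{1,5})\n")


class Message(NamedTuple):
    name: str
    value: int


def recognize(raw: bytes) -> Optional[Message]:
    """Accept or reject the entire input before any processing happens."""
    match = MESSAGE.fullmatch(raw)
    if match is None:
        return None  # rejected: no downstream code ever sees this input
    return Message(match.group(1).decode("ascii"), int(match.group(2)))


def handle(raw: bytes) -> str:
    """Processing code that only ever runs on a fully recognized message."""
    msg = recognize(raw)
    if msg is None:
        return "rejected"
    return f"{msg.name} set to {msg.value}"
```

Because all validation lives in `recognize`, the processing code in `handle` needs no scattered checks of its own: any input that reaches it is already known to belong to the language.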

LANGSEC helps draw the boundary between protocols and API designs that can and cannot be secured and implemented securely, and charts a way to building truly trustworthy protocols and systems. A longer summary of LangSec is available in this USENIX Security BoF hand-out, and in the talks, articles, and papers below.

LANGSEC in pictures: Occupy Babel!

How to get on the LANGSEC mailing list: subscribe at https://mail.langsec.org/list/

Articles:
2011 USENIX ;login:
  • "Exploit Programming: from Buffer Overflows to Weird Machines and Theory of Computation", Sergey Bratus, Michael E. Locasto, Meredith L. Patterson, Len Sassaman, Anna Shubina [PDF]

  • "The Halting Problems of Network Stack Insecurity", Len Sassaman, Meredith L. Patterson, Sergey Bratus, Anna Shubina [PDF], [PDF@USENIX]
(The first article explains the "weird machines" view of exploitation, the second one starts with a computation-theoretic view. We recommend reading both, and choosing the reading order based on your background.)

2012 IEEE S&P Journal:

  • "A Patch for Postel's Robustness Principle", Len Sassaman, Meredith L. Patterson, Sergey Bratus [PDF]
2014 IEEE S&P Journal:
  • "Beyond Planted Bugs in 'Trusting Trust': The Input-Processing Frontier", Sergey Bratus, Trey Darley, Michael Locasto, Meredith L. Patterson, Rebecca ".bx" Shapiro, Anna Shubina [PDF]
2015 USENIX ;login:
  • "The Bugs We Have to Kill", Sergey Bratus, Meredith L. Patterson, and Anna Shubina [PDF]

Papers:
  • Security Applications of Formal Language Theory, Len Sassaman, Meredith L. Patterson, Sergey Bratus, Michael E. Locasto, Anna Shubina [Dartmouth Computer Science Technical Report TR2011-709], published in IEEE Systems Journal, Volume 7, Issue 3, Sept. 2013

  • The Seven Turrets of Babel: A Taxonomy of LangSec Errors and How to Expunge Them, Falcon Darkstar Momot, Sergey Bratus, Sven M. Hallberg, Meredith L. Patterson, in IEEE SecDev 2016, Nov. 2016, Boston. [PDF]. (See also the BruCON 2012 and ShmooCon 2013 talks for more examples of LangSec-preventable weaknesses and vulnerabilities.)

Talks:

Theory:

Vulnerabilities & bugs:

Software practice:

  • "LANGSEC 2011-2016", CONFidence 2013 Keynote, Meredith L. Patterson, [slides], [video]
  • "Cats and Dogs Living Together: LangSec is Also About Usability", Meredith L. Patterson, [slides], [video]

LangSec for ICS/SCADA applications:

  • "Building a Literate Parser and Proxy for DNP3", Sven M. Hallberg, Sergey Bratus, Adam Crain, S4x16 [slides], [demos video]
  • "Taken Out of Context: Language Theoretic Security & Potential Applications for ICS", Darren Highfill, Sergey Bratus, Meredith L. Patterson, S4x14, [slides].

Tools:
  • Hammer, https://github.com/UpstandingHackers/hammer, is a parser construction kit with bindings for C/C++, Java, Ruby, Python, Perl, Go, .NET, and PHP. Like many modern parsing libraries, it provides a parser combinator interface for writing grammars as inline domain-specific languages, but Hammer also provides five different parsing back-ends. It is also bit-oriented rather than character-oriented, making it well suited to parsing binary data such as images, network packets, audio, and executables. Hammer grammars can include single-bit flags or multi-bit constructs that span character boundaries, with no hassle. Hammer is thread-safe and reentrant. HammerPrimer is a tutorial for Hammer, in a series of YouTube videos.
  • MCHammer, https://github.com/McHammerCoder, is a binary-capable parser and unparser generator for Java. It generates a parser and unparser based on a grammar defining their input and output language. Injection attacks for the defined language are prevented by the unparser through automatic extraction and generation of an encoding.
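To give a rough feel for what "bit-oriented parser combinators" means in the Hammer entry above, here is a small self-contained Python sketch. It is not Hammer's API and does not use Hammer's bindings: the names `bits`, `sequence`, `parse_all`, and the 16-bit `header` layout are all hypothetical, chosen only to illustrate a grammar whose fields span byte boundaries.

```python
from typing import Callable, Optional, Tuple

# A parser takes (data, bit_offset) and returns (value, new_bit_offset),
# or None on failure. Hypothetical sketch; this is NOT Hammer's interface.
Parser = Callable[[bytes, int], Optional[Tuple[object, int]]]


def bits(n: int) -> Parser:
    """Parse an n-bit unsigned big-endian integer starting at any bit offset."""
    def parse(data: bytes, pos: int):
        if pos + n > len(data) * 8:
            return None  # not enough input left
        value = 0
        for i in range(pos, pos + n):
            bit = (data[i // 8] >> (7 - i % 8)) & 1
            value = (value << 1) | bit
        return value, pos + n
    return parse


def sequence(*parsers: Parser) -> Parser:
    """Run parsers one after another; fail if any of them fails."""
    def parse(data: bytes, pos: int):
        values = []
        for parser in parsers:
            result = parser(data, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def parse_all(parser: Parser, data: bytes):
    """Accept only if the parser succeeds and consumes every input bit."""
    result = parser(data, 0)
    if result is None or result[1] != len(data) * 8:
        return None
    return result[0]


# Hypothetical 16-bit header: 1-bit flag, 3-bit type, 12-bit length.
# The length field straddles the boundary between the two bytes.
header = sequence(bits(1), bits(3), bits(12))
```

For example, `parse_all(header, bytes([0xA0, 0x05]))` splits the two bytes into flag, type, and length fields, while input that is too short or has trailing bytes is rejected as a whole.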

Please link to this page as http://langsec.org/.