Tenth LangSec Workshop

at IEEE Security & Privacy, May 23, 2024

Abstracts

Keynotes

"Towards Language-Theoretic Security for Dynamic Documents", Will Crichton and Shriram Krishnamurthi (Brown University)

The computational capabilities of documents present an eternal challenge to security, in part because computation is often an afterthought in the design of document languages and formats. Our recent work is about providing language-theoretic foundations for reasoning about how computation affects a document (encapsulated in "document calculi"). These foundations suggest an intriguing question: if one were to carefully design a new document language from scratch, could it be made secure by construction based on these foundations? This talk explores some possibilities based on combining the document calculi with prior linguistic approaches such as object capabilities, language layers, and sandboxing. We will argue that these design elements might provably eliminate a wide variety of document security issues while still being able to express rich dynamic documents.

"Beyond LogoFAIL: The Stubborn Security Challenge of Parser Vulnerabilities", Alex Matrosov (Binarly.io)

Parser vulnerabilities are the most common class of security problems that continues to pop up in all kinds of software. It gets worse over time, due to modern software's growing complexity. In this keynote, I will look at the recent LogoFAIL bugs through the prism of the ͏software and product security failures to explain why these parser vulnerabilities are still happening in 2024 just like they did in the 2000s, and why they still aren't fixed after many months of initial public disclosure.

I will argue that third-party transitive dependencies are particularly dangerous, because it is hard to verify them in compiled software binaries, and it always requires manual work. We need to change the mindset of the defenders and invest in product security automation to ensure we have enough transparency in the software supply chain to understand these evolving risks.

I will argue that we also need to rethink how we manage and assign CVEs. The way we do it today creates confusion, and doesn't help manage supply chains, especially when vendors assign multiple vulnerabilities to a single CVE. This is one of the reasons why many LogoFAIL vulnerabilities remain unfixed.

Last but not least, I will argue that the coming wave of AI-driven code generation tools carries the risk of creating deeply embedded parsing vulnerabilities faster than ever before. Whereas LangSec asserts that rolling your own parsers is as bad as rolling your own crypto algorithms, an AI code generator could be the ultimate faulty parser-rolling machine. Trained on buggy hand-rolled parsers, it would reproduce their badness while giving developers a false promise of safety in generated code. I will discuss some of the ways we could respond to this looming challenge.

Special Invited Session in Data Language Challenges in Health Care, by ARPA-H

"Medical Data Expression: Unlocking the Next Generation of Healthcare AI", Andrew Carney (ARPA-H)

The medical domain has multiple layers of data languages, from the domain specialized language of clinicians to the mid-level semi-structured languages of medical records, encounters, and tests, and to the low-level fully structured languages of electronic communications and storage. These languages express objects of complex structure, with complex relationships both within these objects and between the objects. What are these relationships, and how do we describe them formally (i.e., soundly, precisely, and unambiguously)?

This problem is "ARPA-H hard" and the current approaches are not sufficient. In the lower-level cyber domains, with complex electronic document and message formats, understanding these relationships needed SafeDocs, a full-scale DARPA program which was itself built on prior years of LangSec, took several years and required not only classic grammars but also dependent and liquid types. Modern electronic health record (EHR) systems are based on models and formats (e.g., relational databases, XML) that are ill suited to capture the complexities of medical data. Existing resources describing medical terminology and concepts (e.g., Unified Medical Language System) are not sufficiently robust or verifiable at scale.

To effectively leverage modern advancements in AI for the benefit of patients and clinical staff, we must have data layers that are composable and correct. To create those layers, we must define the languages that capture and express the subtleties of each element, attribute, and relationship involved. Solving these challenges will require considerable effort and innovation from the LangSec community, where should we start?

Invited talks

"ParDiff: Practical Static Differential Analysis of Network Protocol Parsers", Mingwei Zheng & Xiangyu Zhang (Purdue Univerity)

Network protocol implementations suffer from bugs, many of which only cause silent data corruption instead of crashes. Hence, existing automated bug-finding techniques focused on memory safety, such as fuzzing, can hardly detect them. In this talk, we introduce a static differential analysis called ParDiff to find protocol implementation bugs, especially those that hide in message parsers but do not cause crashes. Our key observation is that a network protocol often has multiple implementations and any semantic discrepancy between them may indicate bugs. However, different implementations are often written in disparate styles (e.g., using different data structures or written with different control structures), making it challenging to directly compare two implementations of even the same protocol.

To exploit this observation and effectively compare multiple protocol implementations, ParDiff automatically extracts finite state machines representing protocol format specifications from programs, and then leverages bi-simulation and SMT solvers to find fine-grained, semantic inconsistencies between them. We have extensively evaluated our approach using 14 network protocols, each with two different implementations. The results show that ParDiff exhibits higher precision in discovering bugs in protocol parsers compared to both differential symbolic execution and differential fuzzing tools. ParDiff had detected 41 logical bugs using in real-world protocol parser implementations.

"Security Challenges of Parsing Differentials on the Web", Bahruz Jabiyev (Dartmouth College)

The performance and security needs of the modern Web have necessitated functionalities such as caching, security filtering, and load balancing. Each of these functionalities requires a cooperation of multiple systems where each system is usually a product of one of various developer groups. This variety comes with an abundance of implementation differences in input parsing, also known as parsing differentials.

However, these seemingly benign parsing differentials between cooperating systems are not harmless; rather, they create a set of huge security challenges for the Web. In this talk, we will examine three types of these security challenges: 1) a fundamentally new class of cyber-attacks, 2) enhanced fingerprinting capabilities for attackers and 3) a new approach for bypassing security filters against web attacks.

Papers

"Vulnerability Flow Type Systems", Mohsen Lesani (UC Santa Cruz)

Pervasive and critical software systems have dormant vulnerabilities that attackers can trigger and cascade to make the program act as a "weird machine". In fact, they harbor exploitable programming models that can be used to compose vulnerabilities and mount attacks. This paper presents a type system to derive the abstract weird machines that programs expose. The type system tracks information flow types to detect vulnerabilities, and abstracts the control flow between vulnerabilities to capture weird machines. We formally prove that the inferred weird machine covers the weird runtime behaviors that the program can exhibit. The resulting machine can then be examined to detect patterns of attacks. An important observation is that attacks are often simple and recurring patterns. We model the abstract machines as regular expressions on vulnerability types. This abstract representation is platform-independent and can be used as a uniform description language for attacks. Further, language inclusion and similar decisions about regular expressions are remarkably more efficient than the same decisions for concrete programs or other formal languages.

"Weird Machines in Package Mangers: A Case Study of Input Language Complexity and Emergent Execution in Software Systems", Sameed Ali, Sean Smith (Dartmouth College) and Michael E. Locasto (NARF Industries)

Unexpected interactions of linguistic elements of software often produce unexpected composable computational artifacts called weird machines. Using the RPM package manager as a case study, we provide a systematic approach for discovering and classifying the semantics of latent functionality existing within the input sapce of complex systems. We demonstrate latent functionality present within the RPM package management infrastructure via the construction of a Turing-complete automaton residing within the RPM package management infrastructure.

"Robust Verification of PEG Parser Interpreters", Zephyr Lucas (Dartmouth College) and Natarajan Shankar (SRI)

We formalize a chart parsing approach to PEG parsing by defining a state machine that operates on a scaffold data structure. The scaffold maintains the state matrix with entries for each pair of positions and nonterminals. Our PEG parser interpreter parses an input string relative to a given PEG grammar in Chomsky Normal Form by lazily filling the entries in the scaffold according to whether the parse succeeded, failed, or looped. We thus admit possibly left-recursive PEG grammars and add loop detection to the parser. We define invariants on this state and define a parsing step function that preserves these invariants. We define a proof-of-parse representation that can be extracted from the end state of a parse to be independently checked. We formalize and prove chart parsers for an evolving series of PEG formalizations with extensive proof reuse across these formalizations. The mix of interaction and automation in PVS proofs tend to make them reasonably robust to certain kinds of specification and design changes. We discuss other steps that can be taken to proactively build robust proofs for important classes of problems like parsing.

Research Reports

"Research Report: An Optim(l) Approach to Parsing Random-Access Formats", Mark Tullsen, Sam Cowger, Mike Dodds (Galois, Inc.) and Peter Wyatt (PDF Association)

We introduce two Domain Specific Languages (DSLs), that in combination allow for the declarative specification of random access formats (formats that include offsets or require out of order parsing, e.g., zip, ICC, and PDF). We can compile such declarative specifications into an imperative implementation composed of multiple entry points. Each entry point provides caching, optimal access to a subset of the format. The first DSL is XRP (eXplicit Region Parser), a pure functional language for declaratively specifying random-access formats. XRP gives greater control by making parsing regions explicit; it reduces the posssibility of errors by making regions abstract. The second DSL is Optim(l), a language for capturing Directed Acyclic Graphs (DAGs) of computations (the nodes) with dependencies (the edges). From an Optim(l) program/DAG we generate an imperative module that provides an API for on-demand and incremental parsing of the random access format.

"Research Report: Not All Specifications Are Created Equal", Meng Xu

Software developers who are newly introduced to formal verification often have subtle but distinct interpretations on the roles of specifications (specs). Unsurprisingly, these different interpretations affect the types of specs developers tend to write, which, without coordination, can lead to fragmented assurance guarantee and ultimately impairs the overall effectiveness of the entire formal verification effort. This paper is an experience report on rolling out a formal verification system in a smart contract setting (where correctness matters financially). In this process, we discover three popular views about what “a spec is operationally” among experienced industrial developers yet novice in formal methods: 1) specs are contracts between implementation and functionalities conveyed to end users; 2) specs are extensions to the type system; and 3) specs are definitions of high-level state machines.

While which interpretation is closer to the real purpose(s) of specs is a judgement call, one important view is missing: some specs are abstracting specs that lock in requirements or intention (hence the more the better), some specs are proof assistance that facilitates the refinement proof between implementation and abstracting specs and hence, should be written on an as-need basis; while other specs are denotational specs and do not add assurance from a formal methods perspective. Absence this distinction, it is easy to fall into a trap in formal verification where specs accumulate but little assurance is added holistically.

"Parsing, Performance, and Pareto in Data Stream Security", J Peter Brady and Sean W Smith

Adding in-line LangSec filtering to network data streams can improve security (e.g., by protecting the receiving end from crafted input attacks) but can lead to considerable performance overhead. This paper presents our GUARDS (Global Unified Approach for Reliable Data Stream Security) approach to balance between system performance and the effectiveness of in-line filtering, by allowing dynamic prioritization of security measures in high-risk regions. Our research shows that this model and DFDL parsing can effectively validate and secure the Network File System version 4 (NFSv4) protocol, achieving a balance between parsing efficiency and data integrity. This contribution helps improve the ability of network communications to withstand and maintain functionality in response to changing data representation difficulties.

"Research Report: Enhanced eBPF Verification and eBPF-based Runtime Safety Protection", Guang Jin, Jason Li and Greg Briskin

The extended Berkeley Packet Filter (eBPF) technology has been extending the capabilities of current Operating Systems (OSs) rapidly in recent years. The eBPF community is well-aware of using formal verification methods to ensure the security of eBPF programs. However, the two primary kinds of formal methods, namely abstraction interpretation and symbolic execution, each come with their own set of pros and cons. First, this paper presents our formal eBPF verification approach, which combines the merits of both types of formal methods to ensure soundness, completeness, precision and recall for our solution. Second, this paper describes our eBPF-based solution to enhance the runtime security for prebuilt user-space programs. Grounded in a formally provable security foundation, our eBPF-based runtime safety monitoring solution avoids introducing new errors, offers customization to counter various vulnerabilities, and eliminates the need for offline instrumentation.

"Research Report: Testing and Evaluating Artificial Intelligence Applications", Paul Lintilhac, Joshua Ackerman and George Cybenko

In this paper, we describe and demonstrate TEAIS (Test and Evaluation of AI systems), a principled approach for assessing the performance of large language models. We walk through an educational example of TEAIS, where we leverage ideas from property testing of functions on the boolean hypercube to probabilistically verify a relevant security property of a PDF malware classifier. As part of this, we develop a novel property tester for monotonicity that works much like a mutation fuzzer. While the results in this report are preliminary, we hope to spark discussion around the challenges and opportunities associated with the testing and evaluation of complex AI systems.

Work-in-Progress and Posters

"Work-in-progress Presentations: On the Idea of Risk And Input Complexity: Embracing Insecurity Using Weird Machines Theory to Measure Attack Surfaces", Froylan Maldonado and Matthew Levy

We consider the idea of an attack surface and the ability to measure an attack surface as instrumental to understanding the magnitude of vulnerabilities in applications and systems that are critical to cybersecurity risk assessments in organizations. However, attack surfaces are a surprisingly ill-defined concept in academic literature. Thus, we submit to this workshop that the ideas inherent in LangSec, non-exploitability, and insecurity are promising in providing actuarial level precision in providing a topological understanding of a few key concepts: (1) Attack surfaces, (2) Conceptualizing theoretical applications that have attack surfaces of zero, and (3) Attack surfaces of non-zero orders of magnitude. We propose discussing attack surfaces from a LangSec perspective---as an actual topological surface that can potentially lead to interesting findings.

"Real-World Threats From In-Use “Blind Random” Block Corruption", Rodrigo Branco and Shay Gueron

We present a novel attack on encrypted data in use. Our threat model is an attacker who has compromised a cloud provider and has gained physical access to platform hardware. A realistic example is an insider who delivers an exploit and implant in stages. We demonstrate how this attacker subverts a running Linux guest OS with only 232 calls to a targeted corruption primitive. We compare this to a naive 2128 work estimate obtained using a standard cryptanalytic model for the encryption of data at rest. Beyond our novel attack surface, we follow [1] and illustrate how choosing an appropriate cryptanalytic model is essential for assessing, and then protecting, the real world security of memory encryption, especially for data in use.

[1] C. Bouillaguet, “Nice attacks — but what is the cost? computational models for cryptanalysis”