Synopsis
Practice:
- Trust in computing systems is ultimately based on what
computations they (provably) can and cannot perform.
- Exploits are in effect proofs by construction that a
system trusted not to be capable of a particular computation
under any inputs or conditions can in fact perform it under
crafted inputs and conditions.
They are constructed as programs: sequences of crafted
inputs that trigger and compose the effects of the target's
computational artifacts such as memory corruptions. The
overall effect is that of assembly-level programming in
"weird instructions" -- for example, a carefully set up
memory corruption implements a desired mov, with any
undesirable side effects corrected by subsequent
corruptions. The totality of these "instructions"
makes up a "weird machine" that shares the target's memory
and inputs (see the first sketch after this list).
- A well-constructed, well-described exploit (more precisely,
exploit-program) is, simultaneously, a description and proof
of existence of a "weird machine" inside the target that
performs the unexpected computation -- in mathematical terms,
precisely a "proof by construction".
- "Weird machines" that manifest their existence within the
target in response to crafted input exploit-programs can be
discovered by considering their containing platform as a
recognizer machine for the language (in the formal languages
theory sense) of all inputs expected or accepted by the target.
See the "Exploiting
the Forest with Trees" and
"Towards
a Formal Theory of Computer Insecurity: a
Language-theoretic Approach" by Len Sassaman and
Meredith L. Patterson.
- Conversely, every software component that handles input contains
a de facto recognizer for the de facto language of its
valid or expected inputs. If this language and this recognizer are
not clearly understood as such, are only vaguely described, and
the recognizer code is scattered throughout other code,
unforeseen computations and "weird machines" are likely to
abound (see the second sketch after this list).
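
A minimal, contrived C sketch of the "weird instruction" idea
above (all names are invented for illustration; the layout trick
and the little-endian assumption are noted in the comments): a
missing bounds check turns an attacker-chosen (index, value)
input pair into an arbitrary byte write -- a mov executed on the
attacker's behalf. Real exploit-programs chain many such
primitives, correcting unwanted side effects along the way.

    #include <stdio.h>
    #include <stddef.h>
    #include <string.h>

    /* Contrived target state: buffer and flag are adjacent fields of
       one struct so the out-of-bounds write below lands at a known
       offset. (Strictly, the write is undefined behavior in C; a
       real exploit does not care, and on common compilers and
       little-endian machines this behaves as commented.) */
    struct session {
        char name[8];
        int  is_admin;   /* trusted never to become 1 from input */
    };

    /* The flawed input handler: index is attacker-controlled and
       never checked against sizeof(s->name), so each (index, value)
       input pair is, in effect, a "mov byte" weird instruction. */
    static void set_name_byte(struct session *s, size_t index, char value) {
        s->name[index] = value;
    }

    int main(void) {
        struct session s;
        memset(&s, 0, sizeof(s));

        /* A one-instruction exploit-program: write just past name[]
           into the first byte of is_admin. */
        size_t off = offsetof(struct session, is_admin);
        set_name_byte(&s, off, 1);

        printf("is_admin = %d\n", s.is_admin);  /* prints 1 */
        return 0;
    }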
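
And a sketch of the scattered de facto recognizer described in the
last item, next to the langsec-style alternative (the
length-prefixed message format is hypothetical, invented for
illustration): the cure is not more scattered checks, but one
recognizer for the whole input language that runs to completion
before any processing.

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical wire format: [1-byte type][1-byte len][len payload
       bytes]. Callers pass an out buffer of at least 255 bytes. */

    /* Anti-pattern: recognition scattered through processing. Each
       site re-derives a piece of the input language; len is trusted
       and acted on before it is ever checked against buf_len. */
    int handle_scattered(const uint8_t *buf, size_t buf_len, uint8_t *out) {
        if (buf_len < 1) return -1;
        if (buf[0] == 0x01) {
            uint8_t len = buf[1];       /* buf_len >= 2 never checked */
            memcpy(out, buf + 2, len);  /* len vs. buf_len never checked */
            return 0;
        }
        return -1;
    }

    /* Langsec style: one recognizer validates the whole message
       against the language; processing only ever sees valid input. */
    int recognize(const uint8_t *buf, size_t buf_len) {
        if (buf_len < 2) return 0;                    /* header incomplete */
        if (buf[0] != 0x01) return 0;                 /* unknown type */
        if ((size_t)buf[1] != buf_len - 2) return 0;  /* exact length match */
        return 1;
    }

    int handle_recognized(const uint8_t *buf, size_t buf_len, uint8_t *out) {
        if (!recognize(buf, buf_len)) return -1;      /* reject first */
        memcpy(out, buf + 2, buf[1]);
        return 0;
    }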
Theory:
- Eliminating unexpected computation and computational structures
("weird machines") from components that must recognize complex
input languages is hard; past certain levels of input-language
complexity, it may in fact be impossible.
- When recognizing an input language is a solvable (decidable)
problem, different classes of input-language complexity
require matching computational power to recognize them (see the
Chomsky hierarchy of languages). Whenever recognition is attempted
with a weaker, inadequate de facto recognizer, some inputs will
cause unexpected states and state transitions in the recognizer,
most likely resulting in a "weird machine" (see the first sketch
after this list).
- Protocol designers should stay with
regular or context-free languages, as these require only
finite state machines and pushdown automata respectively to
recognize them, and these automata are the easiest to
auto-generate and get right (see the second sketch after this
list).
- Protocol designers must understand that input complexity can make
recognition of valid or expected structures (and rejection of invalid
or unexpected ones, such as crafted exploit-programs) an
undecidable problem, i.e., one not solvable to any useful
approximation by any amount of programming or testing effort.
- Simply put, "There is no 80/20 engineering solution to the
Halting Problem."
There are (at least) two "natural" ways a design can run into the
Halting Problem:
- "Type I": Input Recognition in an input-handling component.
If the input language of valid or expected inputs
is Turing-complete, its recognition is an undecidable problem.
No "partial" solution to telling safe inputs from unsafe one will
likely be good enough.
- "Type II": Endpoint Computational Equivalence between the
components implementing a protocol. When exactly the same
interpretation of messages is a security requirement,
programming, testing, or verification effort must be taken to
ensure the endpoint implementations must be computationally
equivalent.
Yet the problem of verifying that two recognizer automata are
computationally equivalent is undecidable beyond finite state
machines and deterministic pushdown automata (i.e., beyond
regular and unambiguous context-free languages). Thus, no amount
of effort and testing will likely help to verify the "functional"
equivalence of two recognizer implementations for ambiguous
context-free grammars or stronger.
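
A minimal C illustration of the power-matching point above (the
example is invented): balanced-nesting inputs form a context-free
language, so any recognizer with a fixed, finite number of states
must miscategorize some inputs. The counter in the first function
is the minimal "pushdown" memory the language demands.

    #include <stdio.h>

    /* Correct recognizer: a counter stands in for a one-symbol
       pushdown stack (overflow handling omitted for brevity). */
    int balanced(const char *s) {
        long depth = 0;
        for (; *s; s++) {
            if (*s == '(') depth++;
            else if (*s == ')') { if (--depth < 0) return 0; }
            else return 0;               /* byte outside the alphabet */
        }
        return depth == 0;
    }

    /* A finite-state "approximation", as a fixed-memory parser might
       implement it: saturating at depth 3 and clamping at 0 both
       lose information -- the unexpected recognizer states described
       in the item above. */
    int balanced_fsm(const char *s) {
        int state = 0;                   /* 0..3 is the whole memory */
        for (; *s; s++) {
            if (*s == '(') { if (state < 3) state++; }
            else if (*s == ')') { if (state > 0) state--; }
            else return 0;
        }
        return state == 0;
    }

    int main(void) {
        const char *crafted = "(((()))))"; /* 4 opens, 5 closes: invalid */
        printf("correct recognizer: %d\n", balanced(crafted));      /* 0 */
        printf("finite-state guess: %d\n", balanced_fsm(crafted));  /* 1 */
        return 0;
    }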
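
The "stay regular" guidance, sketched under the same caveat (the
"digits then CRLF" message format is made up): for a regular input
language, the recognizer is a small table-driven finite state
machine whose entire behavior is one table -- exactly the kind of
artifact that can be generated from the grammar and audited
exhaustively.

    #include <stddef.h>

    /* Input language (regular): one or more ASCII digits, then CRLF.
       Equivalent regular expression: [0-9]+\r\n */
    enum { START, DIGITS, CR_SEEN, ACCEPT, REJECT, NSTATES };
    enum { C_DIGIT, C_CR, C_LF, C_OTHER, NCLASSES };

    static int classify(unsigned char c) {
        if (c >= '0' && c <= '9') return C_DIGIT;
        if (c == '\r') return C_CR;
        if (c == '\n') return C_LF;
        return C_OTHER;
    }

    /* The whole recognizer is this table; a generator could emit it
       directly from the grammar. REJECT is a dead state. */
    static const int delta[NSTATES][NCLASSES] = {
        /* START   */ { DIGITS, REJECT,  REJECT, REJECT },
        /* DIGITS  */ { DIGITS, CR_SEEN, REJECT, REJECT },
        /* CR_SEEN */ { REJECT, REJECT,  ACCEPT, REJECT },
        /* ACCEPT  */ { REJECT, REJECT,  REJECT, REJECT }, /* no trailing bytes */
        /* REJECT  */ { REJECT, REJECT,  REJECT, REJECT },
    };

    int recognize_msg(const unsigned char *buf, size_t len) {
        int state = START;
        for (size_t i = 0; i < len; i++)
            state = delta[state][classify(buf[i])];
        return state == ACCEPT;
    }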
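
Finally, a toy parser differential for the "Type II" case (the
"[1-byte len][payload]" format is hypothetical): two defensible
implementations of the same loose specification disagree on a
message carrying trailing bytes, so one crafted input is
interpreted differently at each endpoint. Past deterministic
context-free languages, no amount of testing can rule such
discrepancies out in general.

    #include <stdio.h>
    #include <stdint.h>
    #include <stddef.h>

    /* Endpoint A: trusts the len field and silently ignores any
       trailing bytes after the declared payload. */
    static int parse_a(const uint8_t *buf, size_t n,
                       const uint8_t **p, size_t *plen) {
        if (n < 1 || buf[0] > n - 1) return -1;
        *p = buf + 1; *plen = buf[0];
        return 0;
    }

    /* Endpoint B: treats everything after the header as payload and
       uses len only as a minimum-length check. */
    static int parse_b(const uint8_t *buf, size_t n,
                       const uint8_t **p, size_t *plen) {
        if (n < 1 || buf[0] > n - 1) return -1;
        *p = buf + 1; *plen = n - 1;
        return 0;
    }

    int main(void) {
        /* Crafted message: declared len 2, five payload bytes follow. */
        const uint8_t msg[] = { 2, 'O', 'K', ';', 'r', 'm' };
        const uint8_t *pa, *pb; size_t la, lb;

        if (parse_a(msg, sizeof(msg), &pa, &la) == 0 &&
            parse_b(msg, sizeof(msg), &pb, &lb) == 0)
            printf("A sees \"%.*s\" (%zu bytes), B sees \"%.*s\" (%zu bytes)\n",
                   (int)la, (const char *)pa, la,
                   (int)lb, (const char *)pb, lb);
        return 0;   /* prints: A sees "OK" (2 bytes), B sees "OK;rm" (5 bytes) */
    }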