Recently, it was brought up on Proggit that Chris Smith's "What to Know Before Debating Type Systems" was no longer online. This is a really great article, and in an effort to make sure it survives, I've grabbed the archive.org cache and am 'reprinting' it here. If you're into programming languages, read this and level up!
EDIT: Chris has placed it back online. You can find it here.
Without further ado, "What to Know Before Debating Type Systems":
What To Know Before Debating Type Systems
I would be willing to place a bet that most computer programmers have, on multiple occasions, expressed an opinion about the desirability of certain kinds of type systems in programming languages. Contrary to popular conception, that's a great thing! Programmers who care about their tools are the same programmers who care about their work, so I hope the debate rages on.
There are a few common misconceptions, though, that confuse these discussions. This article runs through those I've encountered that obscure the most important parts of the debate. My goal is to build on a shared understanding of some of the basic issues, and help people get to the interesting parts more quickly.
Classifying Type Systems
Type systems are commonly classified by several words, of which the most common are "static," "dynamic," "strong," and "weak." In this section, I address the more common kinds of classification. Some are useful, and some are not.
Strong and Weak Typing
Probably the most common way type systems are classified is "strong" or "weak." This is unfortunate, since these words have nearly no meaning at all. It is, to a limited extent, possible to compare two languages with very similar type systems, and designate one as having the stronger of those two systems. Beyond that, the words mean nothing at all.
Therefore: I give the following general definitions for strong and weak typing, at least when used as absolutes:
- Strong typing: A type system that I like and feel comfortable with
- Weak typing: A type system that worries me, or makes me feel uncomfortable
What about when the phrase is used in a more limited sense? Then strong typing, depending on the speaker or author, may mean anything on the spectrum from "static" to "sound," both of which are defined below.
Static and Dynamic Types
This is very nearly the only common classification of type systems that has real meaning. As a matter of fact, it's significance is frequently under- estimated. I realize that may sound ridiculous; but this theme will recur throughout this article. Dynamic and static type systems are two completely different things, whose goals happen to partially overlap.
A static type system is a mechanism by which a compiler examines source code and assigns labels (called "types") to pieces of the syntax, and then uses them to infer something about the program's behavior. A dynamic type system is a mechanism by which a compiler generates code to keep track of the sort of data (coincidentally, also called its "type") used by the program. The use of the same word "type" in each of these two systems is, of course, not really entirely coincidental; yet it is best understood as having a sort of weak historical significance. Great confusion results from trying to find a world view in which "type" really means the same thing in both systems. It doesn't. The better way to approach the issue is to recognize that:
- Much of the time, programmers are trying to solve the same problem with static and dynamic types.
- Nevertheless, static types are not limited to problems solved by dynamic types.
- Nor are dynamic types limited to problems that can be solved with static types.
- At their core, these two techniques are not the same thing at all.
Observing the second of these four simple facts is a popular pass-time in some circles. Consider this set of presentation notes, with a rather complicated "the type system found my infinite loop" comment. From a theoretical perspective, preventing infinite loops is in a very deep sense the most basic possible thing you can do with static types! The simply-typed lambda calculus, on which all other type systems are based, proves that programs terminate in a finite amount of time. Indeed, the more interesting question is how to usefully extend the type system to be able to describe programs that don't terminate! Finding infinite loops, though, is not in the class of things most people associate with "types," so it's surprising. It is, indeed, provably impossible with dynamic types (that's called the halting problem; you've probably heard of it!). But it's nothing special for static types. Why? Because they are an entirely different thing from dynamic types.
The dichotomy between static and dynamic types is somewhat misleading. Most languages, even when they claim to be dynamically typed, have some static typing features. As far as I'm aware, alllanguages have some dynamic typing features. However, most languages can be characterized as choosing one or the other. Why? Because of the first of the four facts listed above: many of the problems solved by these features overlap, so building in strong versions of both provides little benefit, and significant cost.
Other Distinctions
There are many other ways to classify type systems. These are less common, but here are some of the more interesting ones: