

What is Systems Programming, Really?

Will Crichton   —   September 9, 2018
I have a gripe with the phrase "systems programming." To me, it always seemed to unnecessarily combine two ideas: low-level programming (dealing with implementation details of the machine) and systems design (creating and managing a complex set of interoperating components). Why is that the case? How long has this been true? And what could we gain from redefining the idea of systems?

1970s: Improving on assembly

Let’s travel back to the origins of modern computer systems to understand how the term evolved. I don’t know who coined the phrase originally, but my searches suggest that serious effort in defining “computer systems” started around the early 70s. In Systems Programming Languages (Bergeron1 et al. 1972), the authors say:

A system program is an integrated set of subprograms, together forming a whole greater than the sum of its parts, and exceeding some threshold of size and/or complexity. Typical examples are systems for multiprogramming, translating, simulating, managing information, and time sharing. […] The following is a partial set of properties, some of which are found in non-systems, not all of which need be present in a given system.

  1. The problem to be solved is of a broad nature consisting of many, and usually quite varied, sub-problems.
  2. The system program is likely to be used to support other software and applications programs, but may also be a complete applications package itself.
  3. It is designed for continued “production” use rather than a one-shot solution to a single applications problem.
  4. It is likely to be continuously evolving in the number and types of features it supports.
  5. A system program requires a certain discipline or structure, both within and between modules (i.e., “communication”), and is usually designed and implemented by more than one person.

This definition is fairly agreeable—computer systems are large-scale, long-used, and time-varying. However, while this definition is largely descriptive, a key idea in the paper is prescriptive: advocating for the separation of low-level languages from systems languages (at the time, contrasting assembly with FORTRAN).

The goal of a systems programming language is to provide a language which can be used without undue concern for “bit twiddling” considerations, yet will generate code that is not appreciably worse than that generated by hand. Such a language should combine the conciseness and readability of high level languages with the space and time efficiency and the ability to “get at” machine and operating system facilities obtainable in assembler language. Designing, writing, and debugging time should be minimized without imposing unnecessary overhead on systems resources.

At the same time, researchers from CMU published BLISS: A Language for Systems Programming (Wulf et al. 1972), describing it as:

We refer to BLISS as an “implementation language”, although we admit that the term is somewhat ambiguous since, presumably, all computer languages are used to implement something. To us the phrase connotes a general purpose, higher-level language in which the primary emphasis has been placed upon a specific application, namely the writing of large, production software systems for a specific machine. Special purpose languages, such as compiler-compilers, do not fall into this category, nor do we necessarily assume that these languages need be machine-independent. We stress the word “implementation” in our definition and have not used words such as “design” and “documentation.” We do not necessarily expect that an implementation language will be an appropriate vehicle for expressing the initial design of a large system nor for the exclusive documentation of that system. Concepts such as machine independence, expressing the design and implementation in the same notation, self-documentation, and others, are clearly desirable goals and are criteria by which we evaluated various languages.

Here, the authors frame an “implementation language” as higher-level than assembly, but lower-level than a “design language”. This resists the definition in the preceding paper, advocating instead that designing a system and implementing that system should be done in separate languages.

Both of these papers are pieces of research or advocacy. The last entry to consider (also from 1972, a productive year!) is Systems Programming (Donovan 1972), an educational text for learning systems programming.

What is systems programming? You may visualize a computer as some sort of beast that obeys all commands. It has been said that computers are basically people made out of metal or, conversely, people are computers made out of flesh and blood. However, once we get close to computers, we see that they are basically machines that follow very specific and primitive instructions. In the early days of computers, people communicated with them by on and off switches denoting primitive instructions. Soon people wanted to give more complex instructions. For example, they wanted to be able to say X = 30 * Y; given that Y = 10, what is X? Present day computers cannot understand such language without the aid of systems programs. Systems programs (e.g. compilers, loaders, macro processors, operating systems) were developed to make computers better adapted to the needs of their users. Further, people wanted more assistance in the mechanics of preparing their programs.

I like that this definition reminds us that systems are in service of people, even if they’re just infrastructure not directly exposed to the end user.

1990s: The rise of scripting

In the 70s and 80s, most researchers seem to have viewed systems programming as a contrast to assembly programming. There simply were no other good tools for building systems. (I’m not sure where Lisp was in all this; none of the resources I read cited Lisp, although I’m vaguely aware that Lisp machines existed, however briefly.)

However, in the mid 90s, a major sea change occurred in programming languages with the rise of dynamically-typed scripting languages. Improving on earlier shell scripting systems like Bash, languages like Perl (1987), Tcl (1988), Python (1990), Ruby (1995), PHP (1995), and Javascript (1995) worked their way into the mainstream. This culminated in the influential article “Scripting: Higher Level Programming for the 21st Century” (Ousterhout 1998), which articulated “Ousterhout’s dichotomy” between “system programming languages” and “scripting languages.”

Scripting languages are designed for different tasks than system programming languages, and this leads to fundamental differences in the languages. System programming languages were designed for building data structures and algorithms from scratch, starting from the most primitive computer elements such as words of memory. In contrast, scripting languages are designed for gluing: they assume the existence of a set of powerful components and are intended primarily for connecting components together. System programming languages are strongly typed to help manage complexity, while scripting languages are typeless to simplify connections between components and provide rapid application development. […] Several recent trends, such as faster machines, better scripting languages, the increasing importance of graphical user interfaces and component architectures, and the growth of the Internet, have greatly increased the applicability of scripting languages.

On a technical level, Ousterhout contrasted scripting vs. systems along two axes: type safety and instructions-per-statement. On a design level, he characterized the new roles for each language class: systems programming is for creating components, and scripting is for gluing them together.
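Ousterhout’s gluing style is easy to see in a few lines of Python. The program below builds no data structures from scratch; the heavy lifting is done by pre-built components (the standard library’s `Counter` and `json`), and the script just connects them. (This sketch is mine, not an example from the article.)

```python
# A scripting-style program in Ousterhout's sense: no algorithms or
# data structures written from scratch, just powerful components
# (Counter, json) glued together.
import json
from collections import Counter

log_lines = [
    "GET /index.html",
    "GET /about.html",
    "GET /index.html",
]

# Counter does the bookkeeping; json does the serialization.
counts = Counter(line.split()[1] for line in log_lines)
report = json.dumps(counts.most_common(), indent=2)
print(report)
```

The equivalent C program would start from "the most primitive computer elements": allocating a hash table, hashing strings, managing memory.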

Around this time, statically typed but garbage collected languages also started to gain popularity. Java (1995) and C# (2000) turned into the titans we know today. While these two aren’t traditionally considered “systems programming languages,” they have been used to design many of the world’s biggest software systems. Ousterhout even explicitly mentioned “in the Internet world that is taking shape now, Java is used for system programming.”

2010s: Boundaries blur

In the last decade, the line between scripting languages and systems programming languages has started to blur. Companies like Dropbox were able to build surprisingly large and scalable systems on just Python. Javascript is used to render real-time, complex UIs in billions of web pages. Gradual typing has gained steam in Python, Javascript, and other scripting languages, enabling a transition from “prototype” code to “production” code by incrementally adding static type information.
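The gradual-typing point is concrete in Python: annotations can be added to prototype code one function at a time, are ignored at runtime (so nothing breaks), and let a separate checker like mypy verify the annotated parts statically. A minimal sketch, with function names of my own invention:

```python
# Prototype version: no types, works by duck typing.
def total_price(items):
    return sum(price for _, price in items)

# "Production" version: identical runtime behavior, but a static
# checker (e.g. mypy) can now verify every call site.
def total_price_typed(items: list[tuple[str, float]]) -> float:
    return sum(price for _, price in items)

cart = [("apple", 1.50), ("bread", 3.25)]
assert total_price(cart) == total_price_typed(cart) == 4.75

# Annotations are just metadata attached to the function object;
# adding them did not change what the code does.
print(total_price_typed.__annotations__["return"])
```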

At the same time, massive engineering resources poured into JIT compilers for both static languages (e.g. Java’s HotSpot) and dynamic languages (e.g. Lua’s LuaJIT, Javascript’s V8, Python’s PyPy) have made their performance competitive with traditional systems programming languages (C, C++). Large-scale distributed systems like Spark are written in Scala2. New programming languages like Julia, Swift, and Go continue to push performance boundaries on garbage-collected languages.

A panel called Systems Programming in 2014 and Beyond featured the biggest minds behind today’s self-identified systems languages: Bjarne Stroustrup (creator of C++), Rob Pike (creator of Go), Andrei Alexandrescu (D developer), and Niko Matsakis (Rust developer). Asked “what is a systems programming language in 2014,” the panelists gave no single answer.

Is systems programming about high performance then? Resource constraints? Hardware control? Cloud infrastructure? It seems like, broadly speaking, languages in the category of C, C++, Rust, and D are distinguished by their level of abstraction from the machine. These languages expose details of the underlying hardware like memory allocation/layout and fine-grained resource management.

Another way to think about it: when you have an efficiency problem, how much freedom do you have to solve it? The wonderful part of low-level programming languages is that when you identify an inefficiency, it is within your power to eliminate the bottleneck through careful control over machine details. Vectorize this instruction, resize that data structure to keep it in cache, and so on. In the same way that static types provide confidence3 that “these two things I’m trying to add are definitely integers,” low-level languages provide confidence that “this code will execute on the machine as I specified.”

By contrast, optimizing interpreted languages is an absolute jungle. It’s incredibly hard to know whether the runtime will consistently execute your code in the way you expect. This is the exact same issue with auto-parallelizing compilers—“auto-vectorization is not a programming model” (see The story of ispc). It’s like writing an interface in Python, thinking “well I certainly hope whoever calls this function gives me an int.”
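The “I hope this is an int” problem is concrete: an unannotated Python function will happily accept the wrong type and return a wrong-but-plausible result instead of failing. A small sketch of my own:

```python
def double(x):
    # Intended for ints, but nothing enforces that.
    return x * 2

assert double(21) == 42        # what the author intended
assert double("21") == "2121"  # silently "works" on a string

# With an annotation, a static checker like mypy flags a call such as
# double_checked("21") before the code runs -- though Python itself
# still won't stop you at runtime.
def double_checked(x: int) -> int:
    return x * 2
```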

Today: …so what is systems programming?

This brings me back to my original gripe. What many people call systems programming, I think about just as low-level programming—exposing details of the machine. But what about systems then? Recall our 1972 definition:

  1. The problem to be solved is of a broad nature consisting of many, and usually quite varied, sub-problems.
  2. The system program is likely to be used to support other software and applications programs, but may also be a complete applications package itself.
  3. It is designed for continued “production” use rather than a one-shot solution to a single applications problem.
  4. It is likely to be continuously evolving in the number and types of features it supports.
  5. A system program requires a certain discipline or structure, both within and between modules (i.e., “communication”), and is usually designed and implemented by more than one person.

These seem a lot more like software engineering issues (modularity, reuse, code evolution) than low-level performance issues, which means that any programming language prioritizing these problems is a systems programming language! That still doesn’t make every language a systems programming language. Dynamically-typed languages are arguably still far from systems languages, since dynamic types and idioms like “ask forgiveness, not permission” are not conducive to good code quality.

What does this definition get us, then? Here’s a hot take: functional languages like OCaml and Haskell are far more systems-oriented than low-level languages like C or C++. When we teach systems programming to undergraduates, we should include functional programming principles like the value of immutability, the impact of rich type systems in improving interface design, and the utility of higher-order functions. Schools should teach both systems programming and low-level programming.
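The functional principles named above don’t require OCaml or Haskell to demonstrate; they can be sketched even in Python: immutability via a frozen dataclass, a richer interface via `Optional` instead of an undocumented sentinel, and a higher-order function that takes behavior as a parameter. (The example and names are my own, not from any course.)

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Immutability: frozen instances cannot be silently mutated by
# callers; attempting to do so raises FrozenInstanceError.
@dataclass(frozen=True)
class User:
    name: str
    age: int

# Rich types: Optional makes "may be absent" part of the interface,
# instead of an undocumented None or -1 sentinel.
def find_user(users: list[User], name: str) -> Optional[User]:
    return next((u for u in users if u.name == name), None)

# Higher-order function: the ranking policy is a parameter.
def best_by(users: list[User], key: Callable[[User], int]) -> User:
    return max(users, key=key)

users = [User("ada", 36), User("grace", 45)]
assert find_user(users, "ada") == User("ada", 36)
assert find_user(users, "alan") is None
assert best_by(users, lambda u: u.age).name == "grace"
```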

In this view, is there a distinction between systems programming and good software engineering? Not really, but the issue is that software engineering and low-level programming are often taught in isolation. Most software engineering classes are Java-centric “write good interfaces and tests”; we should also teach students how to design systems under significant resource constraints. Perhaps we call low-level programming “systems” because many of the most interesting software systems are low-level (e.g. databases, networks, operating systems). Since low-level systems have many constraints, they force their designers to think creatively.

Another framing is that low-level programmers should seek to understand what ideas in systems design could be adapted to deal with the reality of modern hardware. I think the Rust community has been exceedingly innovative in this respect, looking at how good software design/functional programming principles can be applied to low-level problems (e.g. futures, error handling, or of course memory safety).

To summarize, what we call “systems programming” I think should be called “low-level programming.” Computer systems design as a field is too important not to have its own name. Clearly separating these two ideas provides greater conceptual clarity on the space of programming language design, and it also opens the door to sharing insights across the two spaces: how can we design the system around the machine, and vice versa?

Please direct comments to my inbox at wcrichto@cs.stanford.edu or Hacker News.

  1. Cool fact here: two of the authors on this paper, R. Bergeron and Andy van Dam, are founding members of the graphics community and the SIGGRAPH conference. Part of a continuing pattern where graphics researchers set the trend in systems design, cf. the origin of GPGPU.

  2. Mandatory link to Scalability! But at what COST?

  3. Ideally static types are 100% guarantees (or your money back), but in practice most languages permit some amount of Obj.magic