Will Crichton

I am a computer scientist. My research combines cognitive psychology and programming language theory to understand how people program, and to design better tools for programmers.

I am currently a postdoc at Brown University advised by Shriram Krishnamurthi, working on making Rust easier to learn. I completed my Ph.D. at Stanford University advised by Pat Hanrahan and Maneesh Agrawala. I will be on the academic job market in the 2023-24 cycle.

In my research I build theories and systems about programming. A theory tries to understand a concept in a simplified setting where its properties can be identified via logical or statistical analysis. A system applies theory to a real-world setting where unexpected constraints and engineering concerns influence the theory's application.

Theories

I developed an algorithm that uses ownership types to statically and modularly analyze information flow, and proved its soundness as noninterference [1].
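To give the flavor of the idea (this is a hypothetical toy example, not Flowistry's actual API or algorithm): ownership types let you reason about possible information flows from a function's signature alone, without reading its body.

```rust
// Toy illustration of modularity: from this signature alone, a caller knows
// that information can flow from `src` into `dest`, but never the reverse,
// because `&[i32]` is read-only while `&mut Vec<i32>` is writable.
fn copy_into(dest: &mut Vec<i32>, src: &[i32]) {
    dest.extend_from_slice(src);
}

fn main() {
    let src = vec![1, 2, 3];
    let mut dest = vec![0];
    copy_into(&mut dest, &src);
    // Information flowed into `dest`; `src` is unchanged, as the shared
    // reference guarantees.
    assert_eq!(dest, vec![0, 1, 2, 3]);
    assert_eq!(src, vec![1, 2, 3]);
}
```

The point of the real analysis is that this kind of reasoning scales: an information flow analysis only needs types, not the source of every callee.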

I showed that working memory limits a person's ability to remember the value of variables while program tracing [3].

I articulated a design space for documentation generators based on a cognitive task analysis of how developers search through APIs [6].

I demonstrated that machine learning models can classify a program's high-level structure to a high degree of accuracy [4].

Systems

I built Flowistry, an information flow analyzer for Rust, to help programmers identify code unrelated to their current task [1].

I redesigned CS 242, a graduate-level course on programming languages, to help students better understand how PL theory works in the real world [11].

I extended Rustdoc to automatically find examples for documented functions, to help programmers find examples when learning a new API.

I created Nota, a document language for the browser, to make a world where I never have to write LaTeX again [2].

Publications

Conference publications

Will Crichton, Marco Patrignani, Maneesh Agrawala, and Pat Hanrahan.

PLDI 2022.

Will Crichton, Maneesh Agrawala, and Pat Hanrahan.

CHI 2021.

Will Crichton, Georgia Gabriela Sampaio, and Pat Hanrahan.

SIGCSE 2021.

James Hong, Will Crichton, Haotian Zhang, Daniel Y. Fu, Jacob Ritchie, Jeremy Barenholtz, Ben Hannel, Xinwei Yao, Michaela Murray, Geraldine Moriba, Maneesh Agrawala, and Kayvon Fatahalian.

KDD 2021.

Fait Poms, Will Crichton, Pat Hanrahan, and Kayvon Fatahalian.

SIGGRAPH 2019.

Workshop publications

I love workshops! I enjoy discussing my half-baked or weirdly-niche ideas with other researchers. PLATEAU and SNAPL were a big part of the reason I got into PL/HCI research.

Will Crichton.

PLATEAU @ SPLASH 2020.

Will Crichton.

HATRA @ SPLASH 2020.

Will Crichton.

PLATEAU @ UIST 2019.

Daniel Y. Fu, Will Crichton, James Hong, Xinwei Yao, Haotian Zhang, Anh Truong, Avanika Narayan, Maneesh Agrawala, Christopher Ré, and Kayvon Fatahalian.

AI Systems @ SOSP 2019.

Anna Zeng and Will Crichton.

PLATEAU @ SPLASH 2018.

Good Ideas For Free

Here are some ideas that I worked on a bit, but haven't had time to keep exploring. If any of them are exciting to you, please reach out and I would love to chat!

  1. It should be possible to build a garbage-collected programming language on top of a non-GC language, and get the standard library for free! For instance, there should be a version of OCaml built on Rust that doesn't need to reimplement Vec or HashMap. See: Gradual Programming, Rust: The New LLVM and willcrichton/lia.
  2. When a CS teacher gets 100 solutions from students to a new assignment, the teacher should have tools that help them explore the commonalities and differences between solutions. I worked on this a bit [4]. See also "Generative Grading: Neural Approximate Parsing for Automated Student Feedback" for another cool concept in this space.
  3. Trait-based languages like Rust have a Turing-complete logic-programming-esque computation model embedded into their type system. I built Tyrade to show that it's possible to translate a sensible functional language into traits. But the translation is limited and I never formally proved the kind of relationship between these languages.
  4. Probabilistic programming languages offer an unprecedented new capability to model problems involving uncertainty. But most PPL texts focus on teaching the mechanics of the language rather than how to map a domain task into a probabilistic program. Can we systematically describe a translation from uncertain tasks to probabilistic programs? See: Compiling Knowledge into Probabilistic Programs.
  5. Programmers often intuit that one API has a more direct mapping to a domain than another, such as how many Python libraries are marketed as being "for humans". How can we quantify the concept of a direct mapping? For example, I created the Expressiveness Benchmark as a first pass on this question for tabular data processing APIs. Here's a really cool idea: imagine if you (1) embedded a bunch of APIs into a theorem prover, (2) implemented the same task in each API, and (3) wrote a proof of correctness for each implementation. Does the smallest proof correspond to the most direct mapping?
  6. Intro to discrete math and intro to probability courses share the same problem: students get caught up in the syntax, and then fail to understand the underlying concepts. (Example: what is the type of the operator in probability?) Is it possible to improve students' learning experience with the introduction of a theorem prover or probabilistic programming language, respectively? For example, see Logic and Proof, a logic textbook written using Lean, and the Let's Chance project from MIT Media Lab.
  7. When programmers have questions about an API, they only use documentation and StackOverflow, rarely looking at the source code. That's not unreasonable: production-grade libraries have multiple layers of abstraction that make it hard to explore code. What if we equipped programmers with spelunking tools that could break through these abstractions? For example, I worked on Inliner, a tool that uses source-to-source compiler optimizations to show how a library works in the context of an example. See also: Partial evaluation as an aid to the comprehension of Fortran programs and Amorphous program slicing.
  8. Scientists learning to program in R have to deal with the language's god-awful ergonomics and error messages. How can tooling provide contextual assistance to help learners better understand how the language works? I built a prototype Auto TA for RStudio which learners seemed to really enjoy.
  9. System designers have a vocabulary of common system components that recur across domains: state machines, event registries, access control, and so on. I've always thought that design patterns should be articulated in terms of these kinds of components, not stuff like a visitor. For example, I wrote a mini-book Type-Driven API Design in Rust that shows how to idiomatically express system components in Rust, and what mistakes you can catch at compile-time with careful design. But I think a more systematic enumeration of these kinds of system idioms would be really interesting. See also: domain-driven design.
  10. For sufficiently well-defined and low-level programming tasks, it should be possible to build a probabilistic cognitive model that could accurately predict a person's behavior in solving the task. For example, after publishing The Role of Working Memory in Program Tracing [3], I explored the idea of modeling a person as a lossy interpreter, and whether you could tune the lossiness parameters to explain the range of observed human traces. The models actually worked, although I never figured out what a useful application of such a cognitive model would be. Maybe you can!
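To make idea 3 concrete, here is a hedged, simplified sketch of the kind of translation Tyrade performs (this is not Tyrade's actual output, just an illustration): a recursive function on Peano numbers becomes trait clauses that the compiler resolves like logic-programming rules, so the "computation" happens entirely during type checking.

```rust
use std::marker::PhantomData;

// Type-level Peano numerals.
struct Zero;
struct Succ<N>(PhantomData<N>);

// The "function" Add(A, B) = Output, written as trait clauses that trait
// resolution evaluates like Prolog rules.
trait Add<B> {
    type Output;
}

// Base case: Zero + B = B.
impl<B> Add<B> for Zero {
    type Output = B;
}

// Recursive case: Succ(A) + B = Succ(A + B).
impl<A: Add<B>, B> Add<B> for Succ<A> {
    type Output = Succ<<A as Add<B>>::Output>;
}

// Reify a type-level numeral back into a runtime integer, to observe results.
trait ToInt {
    fn to_int() -> u32;
}
impl ToInt for Zero {
    fn to_int() -> u32 { 0 }
}
impl<N: ToInt> ToInt for Succ<N> {
    fn to_int() -> u32 { 1 + N::to_int() }
}

type One = Succ<Zero>;
type Two = Succ<One>;

fn main() {
    // The sum 1 + 2 is computed by the trait solver at compile time.
    assert_eq!(<<One as Add<Two>>::Output as ToInt>::to_int(), 3);
}
```

The open question from idea 3 is exactly what relationship holds between the source functional language and this trait encoding (e.g., a semantics-preserving compilation), which I never formally proved.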