PLISS 2022: Lectures

Lectures

Performance Matters

Emery Berger

Moore's Law and Denning scaling – formerly the twin drivers of ever-increasing hardware performance – have essentially run out of steam. Programmers aiming to achieve high performance must rely on profilers to help them identify where they should focus their optimization efforts. Unfortunately, past profilers fall short in many ways that often make them useless for modern workloads. I'll explain why this is the case, and then will present recent profiling approaches from our group – for native code and scripting languages – that provide actionable information to programmers that actually help them optimize their code.

Compiler Bugs: Detecting Them and Understanding Their Impact and Dynamic Symbolic Execution with KLEE

Cristian Cadar

(1) In this talk, I will first describe the challenges of meaningfully testing compilers, and then survey some of the main techniques that have been successfully applied in this space. I will also discuss the impact of compiler bugs from several perspectives: the security risks of compiler bugs, the extent to which they affect the correctness of deployed software, and the difficulty of debugging software affected by them. The talk is primarily targeted to those who are new to the area of compiler testing, but it will also include several parts useful to a more specialist audience.

(2) Dynamic symbolic execution has gathered a lot of attention in recent years as an effective technique for generating high-coverage test suites and finding deep errors in complex software applications. In this tutorial-style presentation, I will introduce the main concepts of dynamic symbolic execution and exemplify them in the context of our KLEE symbolic execution infrastructure. The talk is primarily targeted to those who have no direct experience with dynamic symbolic execution and KLEE, but the talk will also include several parts useful to a more specialist audience.

Easy SSA Building - C2 Style and High Performance from understanding the Low Levels

Cliff Click

(1) How the HotSpot C2 JIT builds and optimizes SSA form quickly and simply. No fancy algorithm needed!

(2) A talk on X86 processor performance, with a live code-along session looking at the performance tradeoff between Bandwidth and Garbage Collection. Bring your laptop & fav Java IDE; Java code and ~2.7G dataset provided.

Types for Dynamic Languages

Jeffrey S. Foster

Dynamic languages are flexible, fun to use, and have a great power-to-code-weight ratio. But the lack of static typing can impede software development, make code maintenance harder, and lead to bugs that lurk in code for a long time. In these lectures, I will discuss some of the technical challenges and solutions for adding static typing to Ruby, a dynamic programming language. I will begin by presenting the basics of static type systems and type checking. Then I will discuss extensions required to statically type check the language constructs in Ruby and other dynamic languages. Next, I will show how to use type-level computations to perform static typing in the presence of powerful dynamic language features like metaprogramming. Finally, I will talk about type inference and discuss challenges and solutions for inferring usable, rather than most general, types.

Sparse abstract interpretation for compilers

Laure Gonnord

Proving the absence of bugs in a given software (problem which has been known to be intrinsically hard since Turing and Cook) is not the only challenge in software development. Indeed, the ever growing complexity of software increases the need for more trustable optimisations. Solving these two problems (reliability, optimisation) implies the development of safe (without false negative answers) and efficient (wrt memory and time) analyses, yet precise enough (with few false positive answers).
In a first part of the lecture, I will present a basic crash course on classical abstract interpretation, that has demonstrated its pertinence for software verification, and make a first assessment on the (quasi)impossibility of using the framework "as this" in production compilers.
In the second part, I will present some experiences in the design of scalable "sparse" static analyses inside compilers, and try to make a synthesis about the general framework we, together with my coauthors, used to develop them. I will also show some experimental evidence of the impact of this work on real-world compilers, as well as future perspective for this area of research.

What should a programming language implementation be? and Where the wild pointers are

Stephen Kell

(1) A box that goes fast? A comfortable space to work in? A unifying overlord that "rules" all software? An obedient servant? An intelligent collaborator? A window onto a wider world? In this relatively high-level lecture I'll examine different attitudes to this question, as manifest in different real language implementations over history, and use these to explain or hypothesise some directions of ongoing and future work. One theme will be how the goal of serving human beings is, curiously, slipping further down the agenda even while machine resources are more plentiful than ever.

(2) "A linker, a debugger and a garbage collector walked into a bar, all looking for the manager." In this relatively low-level lecture I'll cover why these systems are all different facets of the same strangely-shaped coin, why the folklore dichotomy of "managed vs unmanaged" is obscuring useful points in the design space, why this is hurting our ability to build tools and offer usable interoperability among languages/implementations, and one possible evolutionary approach for doing better. An emergent theme will be that language implementations are arguably in need of an "hourglass architecture" rather like the Internet's, and that handling of pointers, in their many guises, is the main unresolved challenge.

Beyond polyhedra: optimizing irregular programs

Milind Kulkarni

In these lectures, I will focus on the problem of compile-time scheduling optimizations that restructure computations to improve locality and parallelims. Traditional approaches to this problem reason about loop structures and schedules in terms of polyhedra, but there has been little success in extending such approaches beyond loop-based programs. I will start by briefly introducing the basic concepts used in reasoning about and transforming loop programs. I will then argue that there are interesting scheduling transformations that can be done for non–loop based programs, in particular recursive traversla programs such as those that arise in graphics, data mining, and simulation. I will survey recent work from the last decade on designing and implementing these transformations. Then, I will discuss recent advances in developing unified frameworks for representing, reasoning about, and transforming programs that deal with loops and recursion. Time permitting, we will also use one of these frameworks to explore and design a new transformation for recursive programs.

Formal Methods for Machine Learning Pipelines

Caterina Urban

Formal methods can provide rigorous correctness guarantees on hardware and software systems. Thanks to the availability of mature tools, their use is well established in the industry, and in particular to check safety-critical applications as they undergo a stringent certification process. As machine learning is becoming more and more popular, machine-learned components are now considered for inclusion in critical systems. This raises the question of their safety and their verification. Yet, established formal methods are limited to classic, i.e. non machine-learned software. Applying formal methods to verify systems developed by machine learning pipelines has only been considered recently and poses novel challenges in soundness, precision, and scalability. In this lecture, we will provide an overview of the formal methods developed so far for machine learning pipelines, highlighting their strengths and limitations. We will present several approaches through the lens of different software properties and targeting software across all phases of a machine learning pipeline, with a focus on abstract interpretation-based techniques. We will then conclude by offering perspectives for future research directions in this context.

Introduction to reactive programming in SKIP

Julien Verlaguet

Ever wanted to codegen something only to realize that your workflow had become unusable? Who wants to wait minutes for their build to finish?
Nobody ...
We want our tools in general (and our programming languages in particular) to be fast. We want to be able to iterate on our code without waiting. In other words, we want our tools to be reactive!
In this lecture, we will explore how to build a reactive compiler for a small programming language using the programming language SKIP. By the end of the class, you should be able to write tools that incrementally update, not in seconds, but in milliseconds when a change occurs.