Frank Tip

My research is at the boundary of Programming Languages and Software Engineering, with particular focus on tools for improving software quality and programmer productivity. It has been supported in part by grants from the National Science Foundation and the Office of Naval Research, and by generous gifts from Oracle and Amazon. Below is an overview of some recent projects.

Program Analysis for Asynchronous Software

Modern software needs to accommodate asynchrony in situations such as web-based user interfaces, communicating with servers through HTTP requests, and non-blocking I/O. Event-based programming is the most popular approach for managing asynchrony, but it is highly error-prone due to problems such as event races, inadequate support for error handling, and unintuitive, deeply nested control flow ("callback hell"). Other mechanisms for managing asynchrony, such as promises/futures, pose their own set of challenges to programmers. We are developing static and dynamic program analysis techniques that are capable of reasoning precisely about asynchrony, and tools that assist programmers with testing and profiling asynchronous applications and with detecting and repairing errors in them.

More info: [OOPSLA15], [ICSE16], [OOPSLA17a], [OOPSLA17b], [OOPSLA18a], [ECOOP21], [OOPSLA21], [ICSE22a], [ICSE22b], [ASE22], [ESEC/FSE23], [ASE23]

Automated Unit Test Generation using Large Language Models

Unit tests check the correctness of individual units of source code (e.g., functions), and are an integral part of modern software development. However, manually creating unit tests is labor-intensive and tedious, causing some developers to skip writing tests altogether. There has been extensive research on automated test generation (see, e.g., our previous work on Nessie, a feedback-directed random test generation tool for JavaScript). While traditional test generation techniques have been very successful at exposing faults, they tend to generate tests that are much less readable and understandable than manually written tests, and the generated tests often lack assertions or contain only very generic ones. In a recent project with collaborators at GitHub, we developed TestPilot, an LLM-based unit test generation tool for JavaScript. In our approach, an LLM is given a prompt that includes the signature and implementation of a function under test, along with usage examples extracted from documentation and scaffolding code required by the testing framework. In response, the LLM completes the prompt into a test that resembles one that a human programmer could have written, often containing meaningful assertions. In an empirical evaluation, tests generated by TestPilot achieved significantly higher statement and branch coverage than those generated with Nessie. Moreover, the majority of the generated tests contain nontrivial assertions that reference the package under test.
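To make the shape of such a prompt concrete, here is a minimal sketch of how the pieces named above (signature, implementation, usage examples, test scaffolding) might be assembled into text for an LLM to complete. It is not TestPilot's actual code: the names FunctionInfo, asComment, and buildPrompt, the sample package "tiny-math", and the Mocha-style scaffolding are all assumptions made for illustration.

```typescript
// Hypothetical description of a function under test.
interface FunctionInfo {
  name: string;
  signature: string;
  implementation: string;
  usageExamples: string[];
}

// Turn a block of text into "// ..." comment lines.
function asComment(text: string): string[] {
  return text.split("\n").map((line) => `// ${line}`);
}

// Assemble a prompt that ends in an unfinished test, so that the model's
// most natural continuation is the body of that test.
function buildPrompt(packageName: string, fn: FunctionInfo): string {
  const lines: string[] = [];
  // Scaffolding (assumed here: a Mocha-style test file using Node's assert).
  lines.push("let assert = require('assert');");
  lines.push(`let pkg = require('${packageName}');`);
  lines.push("// usage examples");
  for (const example of fn.usageExamples) {
    lines.push(...asComment(example));
  }
  lines.push("// signature");
  lines.push(...asComment(fn.signature));
  lines.push("// implementation");
  lines.push(...asComment(fn.implementation));
  lines.push(`describe('test ${packageName}', function() {`);
  lines.push(`  it('test ${fn.name}', function(done) {`);
  return lines.join("\n");
}

const prompt = buildPrompt("tiny-math", {
  name: "clamp",
  signature: "clamp(value, min, max)",
  implementation:
    "function clamp(value, min, max) {\n  return Math.min(Math.max(value, min), max);\n}",
  usageExamples: ["clamp(12, 0, 10); // 10"],
});
console.log(prompt);
```

Ending the prompt inside an open test body is one plausible design choice: a completion model then tends to continue with statements and assertions rather than with prose.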

More info: [IEEE TSE24]. See also the open-source release of TestPilot.

Securing WebAssembly using Static Analysis and Binary Instrumentation

WebAssembly (Wasm) is a web technology that has rapidly been gaining popularity. It is a low-level bytecode format that was introduced in 2017, and was originally designed for computationally intensive tasks in the browser (e.g., codecs, cryptography, and games). Today, as envisioned, WebAssembly is widely used on the client side, where it is supported by all modern browsers and heavily used by applications. However, a number of critical security concerns in WebAssembly binaries have been identified recently, for which no adequate solutions exist. For example, because of the way WebAssembly has been designed, older, well-studied vulnerabilities such as buffer overflows remain a viable and significant threat. We plan to develop a comprehensive suite of tools for detecting and mitigating security vulnerabilities in applications that rely on WebAssembly, using both static analysis and binary instrumentation techniques. As a first step in this project, we recently completed a study of the challenges that WebAssembly poses to static analysis.
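The buffer-overflow concern can be illustrated with a small sketch. A WebAssembly module's linear memory is a flat array of bytes, and a write that stays inside that array does not trap even when it runs past the end of the object it was meant for. The code below uses a plain ArrayBuffer as a stand-in for linear memory; the layout (an 8-byte buffer directly followed by a flag byte) and the names uncheckedCopy and IS_ADMIN are hypothetical.

```typescript
// Stand-in for one 64 KiB page of Wasm linear memory. This is an assumption
// made for illustration; a real module would expose a WebAssembly.Memory.
const memory = new Uint8Array(new ArrayBuffer(65536));

// Hypothetical layout: an 8-byte buffer immediately followed by a flag byte.
const BUF_START = 16;
const BUF_SIZE = 8;
const IS_ADMIN = BUF_START + BUF_SIZE;

// Unchecked copy, in the spirit of a compiled strcpy: it never consults
// BUF_SIZE, so it keeps writing past the end of the buffer.
function uncheckedCopy(dest: number, input: number[]): void {
  input.forEach((byte, i) => {
    memory[dest + i] = byte;
  });
}

memory[IS_ADMIN] = 0;
// Nine bytes are copied into an eight-byte buffer; the ninth byte lands on
// the adjacent flag, and nothing in the memory model objects.
uncheckedCopy(BUF_START, [1, 1, 1, 1, 1, 1, 1, 1, 1]);
const flagAfterOverflow = memory[IS_ADMIN];
console.log(`flag after overflow: ${flagAfterOverflow}`);
```

Only an access outside the bounds of the whole memory would trap, so corruption of neighboring data inside the memory goes unnoticed.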

More info: [ISSTA23].

Mutation Testing using Large Language Models

In mutation testing, the quality of a test suite is evaluated by introducing faults into a program and determining whether the program's tests detect them. Most existing approaches for mutation testing involve the application of a fixed set of mutation operators, e.g., replacing a "+" with a "-" or removing a function's body. However, certain types of real-world bugs cannot easily be simulated by such approaches, limiting their effectiveness. We present a technique in which a Large Language Model (LLM) is prompted to suggest mutations by asking it what could replace placeholders that have been inserted into the source code. The technique is implemented in LLMorpheus, a mutation testing tool for JavaScript, and evaluated on 13 subject packages, considering several variations on the prompting strategy and using several LLMs. We find LLMorpheus to be capable of producing mutants that resemble existing bugs and that cannot be produced by StrykerJS, a state-of-the-art mutation testing tool. Moreover, we report on the running time, cost, and number of mutants produced by LLMorpheus, demonstrating its practicality.
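The basic loop of mutation testing can be sketched in a few lines. The code below is an illustration only, not LLMorpheus: a placeholder in the source text is filled with candidate replacements, each resulting mutant is run against a tiny test suite, and a mutant counts as killed when some test fails. The candidate list is hard-coded here as a stand-in for suggestions that a tool would obtain from an LLM, and all names (testsPass, compile, runMutants) are hypothetical.

```typescript
// Source of the function under test, with a placeholder where the "+" was.
const template = "(a, b) => a <PLACEHOLDER> b";

// Hard-coded candidate replacements (assumption: stand-ins for LLM output).
const candidates = ["-", "*", "+ 0 +"];

type BinaryFn = (a: number, b: number) => number;

// A tiny "test suite": true when every check passes.
function testsPass(add: BinaryFn): boolean {
  return add(2, 3) === 5 && add(0, 0) === 0;
}

// Turn source text into a callable function.
function compile(source: string): BinaryFn {
  return new Function(`return (${source});`)() as BinaryFn;
}

interface MutantResult {
  source: string;
  killed: boolean;
}

// Fill the placeholder with each candidate and run the tests on the mutant.
function runMutants(tmpl: string, replacements: string[]): MutantResult[] {
  return replacements.map((replacement) => {
    const source = tmpl.replace("<PLACEHOLDER>", replacement);
    return { source, killed: !testsPass(compile(source)) };
  });
}

const results = runMutants(template, candidates);
const killedCount = results.filter((r) => r.killed).length;
console.log(`${killedCount} of ${results.length} mutants killed`);
```

In this sketch the "-" and "*" mutants are killed, while "+ 0 +" survives because it behaves exactly like the original; such equivalent mutants are a well-known source of noise in mutation scores.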

More info: [paper accepted for IEEE TSE]. See also the open-source release of LLMorpheus and a repository containing our experimental results.