My research is at the boundary of Programming Languages and Software Engineering, with particular focus on tools for improving software quality and programmer productivity. Below is an overview of some recent projects.

Programming Tools for JavaScript

JavaScript is an increasingly popular programming language, but most programming tools and IDEs for JavaScript provide only limited support for features such as refactoring, navigation, and smart completion that enhance programmer productivity. Such features are difficult to implement because, in the absence of static type information, programming tools must resort to using static pointer analysis to reason about program behavior. Our [OOPSLA11] paper showed how a number of refactorings of JavaScript programs can be implemented on top of a static pointer analysis. Pointer analysis of JavaScript programs is a harder problem than it is for traditional object-oriented programming languages because JavaScript allows properties (fields) of objects to be accessed reflectively using dynamic property access expressions of the form e[x], which fundamentally increases the complexity of pointer analysis. In [ECOOP12], we present a technique for mitigating this complexity by analyzing correlated access pairs using a custom context-sensitivity policy. In a [PLDI13] paper, we explored using a combination of dynamic and static analysis to "unroll" reflective features into equivalent static constructs. In contrast to earlier work, this approach is based on purely dynamic analysis, yet is still sound, as we show in our paper. For certain applications, where absolute soundness is not required, an approximate analysis may be sufficient. Our [ICSE13] paper shows how a field-based static pointer analysis can be used to implement features such as "jump to declaration" in a JavaScript IDE. In an [OOPSLA15] paper, we present an approach for finding errors in event-driven JavaScript programs. In this work, we extend the traditional notion of a call graph to capture features related to event-handling and rely on novel context-sensitivity policies to retain sufficient information about the order in which event handlers are executed.

Test Generation and Fault Localization

Web applications are often written in dynamically typed programming languages with highly permissive execution models, e.g. PHP on the server side and JavaScript on the client side. Web applications are difficult to test thoroughly because of the high degree of user interaction, and errors are therefore quite common. We have extended dynamic symbolic execution to the domain of web applications to generate thousands of tests that expose hundreds of errors in open-source PHP applications [ISSTA08] [TSE10]. Furthermore, we showed how to localize such faults effectively in the source code, using statistical methods to detect a correlation between statements and test failures [ICSE2010] [ISSTA2010] [TSE12]. A follow-up project studied HTML generation errors (i.e., programming errors that cause a web application to generate malformed HTML). Such errors may trigger security vulnerabilities, and may significantly degrade a user's experience (e.g., by causing important information to be omitted when displaying a web page). HTML errors are typically easy to repair in a static HTML page, but no solution was previously available for fixing the generating program. In an [ICSE12] paper, we showed how to repair HTML generation errors automatically using string constraint solving.

Advanced Refactoring Tools

Refactoring is the process of applying behavior-preserving transformations to a program, in order to improve its design, and has become very popular among software developers. Together with colleagues at IBM, I have worked with the Eclipse JDT team on type-related refactorings such as Extract Interface and Infer Generic Type Arguments, which have been incorporated in the standard distribution of Eclipse. An overview of this work can be found in [TOPLAS11]. I also worked on refactorings that manipulate concurrent Java programs, by making existing refactorings safe in the presence of concurrency [ECOOP10a], making programs reentrant [FSE09], and migrating between different types of locks [ICSE11]. A [TSE 12] paper shows how type-based refactorings can be made more reliable by integrating them in a common framework that also handles issues related to name binding and accessibility.

Data-Centric Synchronization

In current object-oriented programming languages, regions of executable code are protected by locks (as implemented via, e.g., Java's synchronized blocks) to prevent concurrency-related errors such as data races or atomicity violations. Such control-centric protection of shared data requires error-prone nonlocal reasoning since the burden is on the programmer to protect all accesses to shared locations consistently. In [POPL06], we proposed atomic sets, a data-centric approach to synchronization, in which synchronization constraints are specified on data (fields in Java classes), from which the compiler automatically infers where locks need to be inserted using static program analysis. We also developed a type-based variant of this approach [ECOOP10b] that permits separate compilation. Atomic sets have an associated correctness criterion, which generalizes upon a standard notion of serializability. Violations of this criterion correspond to subtle concurrency errors, and we developed a dynamic analysis tool that found many such errors [ICSE08]. An overview of the project appeared in [TOPLAS12]. In an [ICSE13] paper, we presented a static analysis for detecting possible deadlock in AJ programs.

Static Program Analysis

Several recent projects are concerned with algorithms for static program analysis. Two recent papers [ECOOP14] [TOSEM15] are concerned with type-based call graph construction algorithm for Scala. Similar to my earlier work with Jens Palsberg (see [OOPSLA00]), these algorithms only consider type information. However, they are designed to infer precise call graphs in the presence of Scala features such as traits and abstract type members. Another project is focused on performing precise interprocedural data flow analysis in the presence of correlated method calls. Our [SAS15] paper shows how an IFDS problem can be transformed into an IDE problem that avoids imprecision that arises due to correlated method calls.