Research

Broadly speaking, I study distributed systems—how to make them faster, more robust, and more secure. Much of my work focuses on large-scale web services, and how to design principled system interfaces for those services. Here are some of the specific topics which currently interest me:

Client-side web security: A web browser supports interactive content via the JavaScript runtime and the DOM. Unfortunately, these interfaces are complex, and their security is governed by a leviathan same-origin policy that has many corner cases and generally lacks formal guarantees about isolation. I am interested in various approaches for improving browser security. For example:

  • Private browsing modes ostensibly hide evidence of browsing activity. Unfortunately, implementations of incognito browsing still leak information. Veil is a new system which allows web developers to reduce the likelihood of such leaks. Developers pass their web content to the Veil compiler; the compiler outputs a new version of a page which intentionally limits the spread of sensitive information. For example, Veil pages only store encrypted data in the traditional browser cache. Veil pages also garble in-memory artifacts, to reduce the likelihood that greppable page content leaks to the swap file.
  • How can developers isolate untrusted JavaScript code while still allowing rich interactions between third-party libraries and the enclosing web page? The Pivot system is one attempt at a solution. However, I believe that there are more fundamental solutions which involve the creation of a new scripting language for the web (a JavaScript++, if you like analogies to C++, which would be troubling because C++ is a nightmarish Pandora’s box of emotional trauma, but I clearly digress). Creating this new scripting language will require contributions from systems research as well as programming language research.

Secure delegation of sensitive user data: On the server side, users have little influence on how their data is shared within different parts of an application, or across different applications that may belong to different companies. Access control mechanisms like OAuth provide users with a modicum of control, but those mechanisms are plagued by security vulnerabilities, and they do not provide strong, cryptographic limits on how third parties can manipulate user data. Thus, in practice, users cede control of their data to service providers. I'm interested in using techniques like attribute-based encryption and remote attestation to provide users with cryptographically strong control over which third parties gain access to particular pieces of user data.

Web performance and analysis: To load a web page, a client-side browser must fetch a large number of objects (e.g., HTML files, images, and JavaScript files). Understanding how network conditions impact fetch performance is crucial for understanding the overall page load process. Once a page is loaded, that page generates a large number of JavaScript events; in turn, those events may trigger server-side events. By studying these asynchronous, wide-area event chains, we can identify which parts of the application pipeline are slow, and try to optimize them. Using data flow analysis of the dependencies between client-side HTML, CSS, and JavaScript files, we can present browsers with a fetch schedule for those files which minimizes page load time while still respecting the data flows.
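
To make the idea of a dependency-respecting fetch schedule concrete, here is a minimal sketch. It is not the algorithm from any particular system: it assumes the dependencies between page objects are already known and acyclic, and it uses a simple heuristic (among the objects that are ready, fetch the one that heads the longest remaining dependency chain). The function name `fetch_schedule`, the `deps` format, and the example page are all hypothetical. A real scheduler would also have to account for network conditions and parallel fetches.

```python
import heapq
from collections import defaultdict


def fetch_schedule(deps):
    """Order objects so each one comes after everything it depends on.

    deps maps an object name to the set of objects it depends on.
    Among the objects that are ready, the one heading the longest
    remaining dependency chain goes first (ties broken by name).
    """
    # Invert the graph: for each object, which objects are waiting on it?
    dependents = defaultdict(set)
    for obj, needs in deps.items():
        for need in needs:
            dependents[need].add(obj)

    # Every object mentioned anywhere, even if it has no entry in deps.
    nodes = set(deps) | set(dependents)

    # Length of the longest chain of dependents below each object.
    depth = {}

    def chain(obj, path=()):
        if obj in path:
            raise ValueError(f"dependency cycle through {obj!r}")
        if obj not in depth:
            below = (chain(d, path + (obj,)) for d in dependents[obj])
            depth[obj] = 1 + max(below, default=0)
        return depth[obj]

    # Kahn's algorithm, with a heap so longer chains are served first.
    remaining = {obj: len(deps.get(obj, ())) for obj in nodes}
    ready = [(-chain(obj), obj) for obj in nodes if remaining[obj] == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, obj = heapq.heappop(ready)
        order.append(obj)
        for d in dependents[obj]:
            remaining[d] -= 1
            if remaining[d] == 0:
                heapq.heappush(ready, (-chain(d), d))

    if len(order) != len(nodes):
        raise ValueError("dependency cycle detected")
    return order


# Hypothetical page: each object lists what must be processed before it.
page = {
    "index.html": set(),
    "style.css": {"index.html"},
    "app.js": {"index.html"},
    "widgets.js": {"app.js"},
    "hero.png": {"style.css"},
    "data.json": {"widgets.js"},
}

print(fetch_schedule(page))
# ['index.html', 'app.js', 'style.css', 'widgets.js', 'data.json', 'hero.png']
```

In this example, app.js is scheduled before style.css because three objects sit on its chain (app.js, widgets.js, data.json) versus two for the stylesheet. The heuristic is only a stand-in for "minimizes page load time"; the part that carries over is that no object is fetched before the objects it depends on.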

Storage architectures for large-scale web services: What is the best way to organize user data for services that must scale to millions of users? For example, how can we maximize IO throughput, and minimize IO latency, for block-based storage abstractions? How can datacenters take advantage of new storage technologies like SSDs and shingled magnetic drives? How does application design change when cloud storage is user-centric instead of application-centric, i.e., when a user's data is located in a single, user-controlled storage silo, instead of scattered across multiple, application-controlled silos?