Host-based anomaly detectors generate alarms by inspecting audit logs for suspicious behavior. Unfortunately, evaluating these anomaly detectors is hard. There are few high quality, publicly-available audit logs, and there are no pre-existing frameworks that enable push-button creation of realistic system traces. To make trace generation easier, we created Xanthus, an automated tool that orchestrates virtual machines to generate realistic audit logs. Using Xanthus' simple management interface, administrators select a base VM image, configure a particular tracing framework to use within that VM, and define post-launch scripts that collect and save trace data. Once data collection is finished, Xanthus creates a self-describing archive which contains the VM, its configuration parameters, and the collected trace data. We demonstrate that Xanthus hides many of the tedious (yet subtle) orchestration tasks that humans often get wrong; Xanthus avoids mistakes that lead to non-replicable experiments.
Advanced Persistent Threats (APTs) are difficult to detect due to their "low-and-slow" attack patterns and frequent use of zero-day exploits. We present UNICORN, an anomaly-based APT detector that effectively leverages data provenance analysis. From modeling to detection, UNICORN tailors its design specifically for the unique characteristics of APTs. Through extensive yet time-efficient graph analysis, UNICORN explores provenance graphs that provide rich contextual and historical information to identify stealthy anomalous activities without predefined attack signatures. Using a graph-sketching technique, it summarizes long-running system execution with space efficiency to combat slow-acting attacks that take place over a long time span. UNICORN further improves its detection capability using a novel modeling approach to understand long-term behavior as the system evolves. Our evaluation shows that UNICORN outperforms an existing state-of-the-art APT detection system and detects real-life APT scenarios with high accuracy.
Remote dependency resolution (RDR) is a proxy-driven scheme for reducing mobile page load times; a proxy loads a requested page using a local browser, fetching the page’s resources over fast proxy-origin links instead of a client’s slow last-mile links. In this paper, we describe two fundamental challenges to efficient RDR proxying: the increasing popularity of encrypted HTTPS content, and the fact that, due to time-dependent network conditions and page properties, RDR proxying can actually increase load times. We solve these problems by introducing a new, secure proxying scheme for HTTPS traffic, and by implementing WatchTower, a selective proxying system that uses dynamic models of network conditions and page structures to only enable RDR when it is predicted to help. WatchTower loads pages 21.2%–41.3% faster than state-of-the-art proxies and server push systems, while preserving end-to-end HTTPS security.
Riverbed is a new framework for building privacy-respecting web services. Using a simple policy language, users define restrictions on how a remote service can process and store sensitive data. A transparent Riverbed proxy sits between a user's front-end client (e.g., a web browser) and the back-end server code. The back-end code remotely attests to the proxy, demonstrating that the code respects user policies; in particular, the server code attests that it executes within a Riverbed-compatible managed runtime that uses IFC to enforce user policies. If attestation succeeds, the proxy releases the user's data, tagging it with the user-defined policies. On the server-side, the Riverbed runtime places all data with compatible policies into the same universe (i.e., the same isolated instance of the full web service). The universe mechanism allows Riverbed to work with unmodified, legacy software; unlike prior IFC systems, Riverbed does not require developers to reason about security lattices, or manually annotate code with labels. Riverbed imposes only modest performance overheads, with worst-case slowdowns of 10% for several real applications.
Virtualization enables datacenter operators to safely run computations that belong to untrusted tenants. An ideal virtual machine has three properties: a small memory footprint; strong isolation from other VMs and the host OS; and the ability to maintain in-memory state across client requests. Unfortunately, modern virtualization technologies cannot provide all three properties at once. In this paper, we explain why, and propose a new virtualization approach, called Alto, that virtualizes at the layer of a managed runtime interface. Through careful design of (1) the application-facing managed interface and (2) the internal runtime architecture, Alto provides VMs that are small, secure, and stateful. Conveniently, Alto also simplifies VM operations like suspension, migration, and resumption. We provide several details about the proposed design, and discuss the remaining challenges that must be solved to fully realize the Alto vision.
In an IoT deployment, network access policies are a critical determinant of security. Devices that run unpatched or insecure code should not be allowed to communicate with the outside world; furthermore, unknown external hosts should not be able to contact sensitive devices that reside within an IoT deployment.
In this paper, we introduce DeadBolt, a new security framework for managing IoT network access. DeadBolt hides all of the devices in an IoT deployment behind an access point that implements deny-by-default policies for both incoming and outgoing traffic. The DeadBolt AP also forces high-end IoT devices to use remote attestation to gain network access; attestation allows the devices to prove that they run up-to-date, trusted software. For lightweight IoT devices which lack the ability to attest, the DeadBolt AP uses virtual drivers (essentially, security-focused virtual network functions) to protect lightweight device traffic. For example, a virtual driver might provide network intrusion detection, or encrypt device traffic that is natively cleartext. Using these techniques, and several others, DeadBolt can prevent realistic attacks while imposing only modest performance costs.
Mobile browsers suffer from unnecessary cache misses. The same binary object is often named by multiple URLs which correspond to different cache keys. Furthermore, servers frequently mark objects as uncacheable, even though the objects' content is stable over time.
In this paper, we quantify the excess network traffic that mobile devices generate due to inefficient caching logic. We demonstrate that mobile page loads suffer from more redundant transfers than reported by prior studies which focused on desktop page loads. We then propose a new scheme, called Remote-Control Caching (RC2), in which web proxies (owned by mobile carriers or device manufacturers) track the aliasing relationships between the objects that a client has fetched, and the URLs that were used to fetch those objects. Leveraging knowledge of those aliases, a proxy dynamically rewrites the URLs inside of pages, allowing the client's local browser cache to satisfy a larger fraction of requests. Using a concrete implementation of RC2, we show that, for two loads of a page separated by 8 hours, RC2 reduces bandwidth consumption by a median of 52%. As a result, mobile browsers can save a median of 469 KB per warm-cache page load.
All popular web browsers offer a "private browsing mode." After a private session terminates, the browser is supposed to remove client-side evidence that the session occurred. Unfortunately, browsers still leak information through the file system, the browser cache, the DNS cache, and on-disk reflections of RAM such as the swap file.
In theory, remote attestation is a powerful primitive for building distributed systems atop untrusting peers. Unfortunately, the canonical attestation framework defined by the Trusted Computing Group is insufficient to express rich contextual relationships between client-side software components. Thus, attestors and verifiers must rely on ad-hoc mechanisms to handle real-world attestation challenges like attestors that load executables in nondeterministic orders, or verifiers that require attestors to track dynamic information flows between attestor-side components.
In this paper, we survey these practical attestation challenges. We then describe a newattestation framework, named Cobweb, which handles these challenges. The key insight is that real-world attestation is a graph problem. An attestation message is a graph in which each vertex is a software component, and has one or more labels, e.g., the hash value of the component, or the raw file data, or a signature over that data. Each edge in an attestation graph is a contextual relationship, like the passage of time, or a parent/child fork() relationship, or a sender/receiver IPC relationship. Cobweb’s verifier-side policies are graph predicates which analyze contextual relationships. Experiments with real, complex software stacks demonstrate that Cobweb’s abstractions are generic and can support a variety of real-world policies.
Modern web services rob users of low-level control over cloud storage—a user’s single logical data set is scattered across multiple storage silos whose access controls are set by web services, not users. The consequence is that users lack the ultimate authority to determine how their data is shared with other web services.
In this paper, we introduce Sieve, a new platform which selectively (and securely) exposes user data to web services. Sieve has a user-centric storage model: each user uploads encrypted data to a single cloud store, and by default, only the user knows the decryption keys. Given this storage model, Sieve defines an infrastructure to support rich, legacy web applications. Using attribute-based encryption, Sieve allows users to define intuitively understandable access policies that are cryptographically enforceable. Using key homomorphism, Sieve can reencrypt user data on storage providers in situ, revoking decryption keys from web services without revealing new keys to the storage provider. Using secret sharing and two factor authentication, Sieve protects cryptographic secrets against the loss of user devices like smartphones and laptops. The result is that users can enjoy rich, legacy web applications, while benefiting from cryptographically strong controls over which data a web service can access.
In a modern web application, a single high-level action like a mouse click triggers a flurry of asynchronous events on the client browser and remote web servers. We introduce Domino, a new tool which automatically captures and analyzes end-to-end, asynchronous causal relationship of events that span clients and servers. Using Domino, we found uncharacteristically long event chains in Bing Maps, discovered data races in the WinJS implementation of promises, and developed a new server-side scheduling algorithm for reducing the tail latency of server responses.
User-generated content is becoming increasingly common on the Web, but current web applications isolate their users’ data, enabling only restricted sharing and cross-service integration. We believe users should be able to share their data seamlessly between their applications and with other users. To that end, we propose Amber, an architecture that decouples users’ data from applications, while providing applications with powerful global queries to find user data. We demonstrate how multi-user applications, such as e-mail, can use these global queries to efficiently collect and monitor relevant data created by other users. Amber puts users in control of which applications they use with their data and with whom it is shared, and enables a new class of applications by removing the artificial partitioning of users’ data by application.
This paper presents Mahimahi, a framework to record traffic from HTTP-based applications, and later replay it under emulated network conditions. Mahimahi improves upon prior record-and-replay frameworks in three ways. First, it is more accurate because it carefully emulates the multi-server nature of Web applications, present in 98% of the Alexa US Top 500 Web pages. Second, it isolates its own network traffic, allowing multiple Mahimahi instances emulating different networks to run concurrently without mutual interference. And third, it is designed as a set of composable shells, providing ease-of-use and extensibility.
We evaluate Mahimahi by: (1) analyzing the performance of HTTP/1.1, SPDY, and QUIC on a corpus of 500 sites, (2) using Mahimahi to understand the reasons why these protocols are suboptimal, (3) developing Cumulus, a cloud-based browser designed to overcome these problems, using Mahimahi both to implement Cumulus by extending one of its shells, and to evaluate it, (4) using Mahimahi to evaluate HTTP multiplexing protocols on multiple performance metrics (page load time and speed index), and (5) describing how others have used Mahimahi.
Conventional wisdom suggests that rich, large-scale web applications are difficult to build and maintain. An implicit assumption behind this intuition is that a large web application requires massive numbers of servers, and complicated, one-off back-end architectures. We provide empirical evidence to disprove this intuition. We then propose new programming abstractions and a new deployment model that reduce the overhead of building and running web services.
Blizzard is a high-performance block store that exposes cloud storage to cloud-oblivious POSIX and Win32 applications. Blizzard connects clients and servers using a network with full-bisection bandwidth, allowing clients to access any remote disk as fast as if it were local. Using a novel striping scheme, Blizzard exposes high disk parallelism to both sequential and random workloads; also, by decoupling the durability and ordering requirements expressed by flush requests, Blizzard can commit writes out-of-order, providing high performance and crash consistency to applications that issue many small, random IOs. Blizzard’s virtual disk drive, which clients mount like a normal physical one, provides maximum throughputs of 1200 MB/s, and can improve the performance of unmodified, cloud-oblivious applications by 2x–10x. Compared to EBS, a commercially available, state-of-the-art virtual drive for cloud applications, Blizzard can improve SQL server IOp rates by seven-fold while still providing crash consistency.