ACM CCS 2020 - November 9-13, 2020
CCS '20: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications SecurityFull Citation in the ACM Digital Library
SESSION: Keynote Talk I
I would like to share my thoughts on the interactions between machine learning and security.
The good: We now have more data, more powerful machines and algorithms, and better yet, we don't need to always manually engineered the features. The ML process is now much more automated and the learned models are more powerful, and this is a positive feedback loop: more data leads to better models, which lead to more deployments, which lead to more data. All security vendors now advertise that they use ML in their products.
The bad: There are more unknowns. In the past, we knew the capabilities and limitations of our security models, including the ML-based models, and understood how they can be evaded. But the state-of-the-art models such as deep neural networks are not as intelligible as classical models such as decision trees. How do we decide to deploy a deep learning-based model for security when we don't know for sure it is learned correctly? Data poisoning becomes easier. On-line learning and web-based learning use data collected in run-time and often from an open environment. Since such data is often resulted from human actions, it can be intentionally polluted, e.g., in misinformation campaigns. How do we make it harder for attackers to manipulate the training data?
The ugly: Attackers will keep on exploiting the holes in ML, and automate their attacks using ML. Why don't we just secure ML? This would be no different than trying to secure our programs, and systems, and networks, so we can't. We have to prepare for ML failures. Ultimately, humans have to be involved. The question is how and when? For example, what information should a ML-based system present to humans and what input can humans provide to the system?
SESSION: Session 1A: Anonymous Routing and Censorship
Tor exit blocking, in which websites disallow clients arriving from Tor, is a growing and potentially existential threat to the anonymity network. This paper introduces HebTor, a new and robust architecture for exit bridges---short-lived proxies that serve as alternative egress points for Tor. A key insight of HebTor is that exit bridges can operate as Tor onion services, allowing any device that can create outbound TCP connections to serve as an exit bridge, regardless of the presence of NATs and/or firewalls. HebTor employs a micropayment system that compensates exit bridge operators for their services, and a privacy-preserving reputation scheme that prevents freeloading. We show that HebTor effectively thwarts server-side blocking of Tor, and we describe the security, privacy, and legal implications of our design.
Much research has investigated improving the security and performance of Tor by having Tor clients choose paths through the network in a way that depends on the client's location. However, this approach has been demonstrated to lead to serious deanonymization attacks. Moreover, we show how in some scenarios it can lead to significant performance degradation. For example, we demonstrate that using the recently-proposed Counter-RAPTOR system when guard bandwidth isn't abundant could increase median download times by 28.7%. We propose the CLAPS system for performing client-location-aware path selection, which fixes the known security and performance issues of existing designs. We experimentally compare the security and performance of CLAPS to Counter-RAPTOR and DeNASA. CLAPS puts a strict bound on the leakage of information about the client's location, where the other systems could completely reveal it after just a few connections. It also guarantees a limit on the advantage that an adversary can obtain by strategic relay placement, which we demonstrate to be overwhelming against the other systems. Finally, due to a powerful formalization of path selection as an optimization problem, CLAPS is approaching or even exceeding the original goals of algorithms to which it is applied, while solving their known deficiencies.
Poking a Hole in the Wall: Efficient Censorship-Resistant Internet Communications by Parasitizing on WebRTC
Many censorship circumvention tools rely on trusted proxies that allow users within censored regions to access blocked Internet content by tunneling it through a covert channel (e.g,. piggybacking on Skype video calls). However, building tools that can simultaneously (i) provide good bandwidth capacity for accommodating the typical activities of Internet users, and (ii) be secure against traffic analysis attacks has remained an open problem and a stumbling block to the practical adoption of such tools for censorship evasion.
We present Protozoa, a censorship-resistant tunneling tool featuring both high-performing covert channels and strong traffic analysis resistance. To create a covert channel, a user only needs to make a video call with a trusted party located outside the censored region using a popular WebRTC streaming service, e.g., Whereby. Protozoa can then covertly tunnel all IP traffic from unmodified user applications (e.g., Firefox) through the WebRTC video stream.
This is achieved by hooking into the WebRTC stack and replacing the encoded video frame data with IP packet payload, while ensuring that the payload of the WebRTC stream remains encrypted, and the stream's statistical properties remain in all identical to those of any common video call. This technique allows for sustaining enough throughput to enable common-use Internet applications, e.g., web browsing or bulk data transfer, and avoid detection by state-of-the-art traffic analysis attacks. We show that Protozoa is able to evade state-level censorship in China, Russia, and India.
Remote censorship measurement techniques offer capabilities for monitoring Internet reachability around the world. However, operating these techniques continuously is labor-intensive and requires specialized knowledge and synchronization, leading to limited adoption. In this paper, we introduce Censored Planet, an online censorship measurement platform that collects and analyzes measurements from ongoing deployments of four remote measurement techniques (Augur, Satellite/Iris, Quack, and Hyperquack). Censored Planet adopts a modular design that supports synchronized baseline measurements on six Internet protocols as well as customized measurements that target specific countries and websites. Censored Planet has already collected and published more than 21.8 billion data points of longitudinal network observations over 20 months of operation. Censored Planet complements existing censorship measurement platforms such as OONI and ICLab by offering increased scale, coverage, and continuity. We introduce a new representative censorship metric and show how time series analysis can be applied to Censored Planet's longitudinal measurements to detect 15 prominent censorship events, two-thirds of which have not been reported previously. Using trend analysis, we find increasing censorship activity in more than 100 countries, and we identify 11 categories of websites facing increasing censorship, including provocative attire, human rights issues, and news media. We hope that the continued publication of Censored Planet data helps counter the proliferation of growing restrictions to online freedom.
SESSION: Session 1B: Attacking and Defending ML Systems
Deep neural networks (DNN) are known to be vulnerable to adversarial attacks. Numerous efforts either try to patch weaknesses in trained models, or try to make it difficult or costly to compute adversarial examples that exploit them. In our work, we explore a new "honeypot" approach to protect DNN models. We intentionally inject trapdoors, honeypot weaknesses in the classification manifold that attract attackers searching for adversarial examples. Attackers' optimization algorithms gravitate towards trapdoors, leading them to produce attacks similar to trapdoors in the feature space. Our defense then identifies attacks by comparing neuron activation signatures of inputs to those of trapdoors.
In this paper, we introduce trapdoors and describe an implementation of a trapdoor-enabled defense. First, we analytically prove that trapdoors shape the computation of adversarial attacks so that attack inputs will have feature representations very similar to those of trapdoors. Second, we experimentally show that trapdoor-protected models can detect, with high accuracy, adversarial examples generated by state-of-the-art attacks (PGD, optimization-based CW, Elastic Net, BPDA), with negligible impact on normal classification. These results generalize across classification domains, including image, facial, and traffic-sign recognition. We also present significant results measuring trapdoors' robustness against customized adaptive attacks (countermeasures).
Despite their tremendous success in a range of domains, deep learning systems are inherently susceptible to two types of manipulations: adversarial inputs -- maliciously crafted samples that deceive target deep neural network (DNN) models, and poisoned models -- adversely forged DNNs that misbehave on pre-defined inputs. While prior work has intensively studied the two attack vectors in parallel, there is still a lack of understanding about their fundamental connections: what are the dynamic interactions between the two attack vectors? what are the implications of such interactions for optimizing existing attacks? what are the potential countermeasures against the enhanced attacks? Answering these key questions is crucial for assessing and mitigating the holistic vulnerabilities of DNNs deployed in realistic settings.
Here we take a solid step towards this goal by conducting the first systematic study of the two attack vectors within a unified framework. Specifically, (i) we develop a new attack model that jointly optimizes adversarial inputs and poisoned models; (ii) with both analytical and empirical evidence, we reveal that there exist intriguing "mutual reinforcement" effects between the two attack vectors -- leveraging one vector significantly amplifies the effectiveness of the other; (iii) we demonstrate that such effects enable a large design spectrum for the adversary to enhance the existing attacks that exploit both vectors (e.g., backdoor attacks), such as maximizing the attack evasiveness with respect to various detection methods; (iv) finally, we discuss potential countermeasures against such optimized attacks and their technical challenges, pointing to several promising research directions.
Deep neural networks (DNNs) have become one of the enabling technologies in many safety-critical applications, e.g., autonomous driving and medical image analysis. DNN systems, however, suffer from various kinds of threats, such as adversarial example attacks and fault injection attacks. While there are many defense methods proposed against maliciously crafted inputs, solutions against faults presented in the DNN system itself (e.g., parameters and calculations) are far less explored. In this paper, we develop a novel lightweight fault-tolerant solution for DNN-based systems, namely DeepDyve, which employs pre-trained neural networks that are far simpler and smaller than the original DNN for dynamic verification. The key to enabling such lightweight checking is that the smaller neural network only needs to produce approximate results for the initial task without sacrificing fault coverage much. We develop efficient and effective architecture and task exploration techniques to achieve optimized risk/overhead trade-off in DeepDyve. Experimental results show that DeepDyve can reduce 90% of the risks at around 10% overhead.
With the prevalent use of Deep Neural Networks (DNNs) in many applications, security of these networks is of importance. Pre-trained DNNs may contain backdoors that are injected through poisoned training. These trojaned models perform well when regular inputs are provided, but misclassify to a target output label when the input is stamped with a unique pattern called trojan trigger. Recently various backdoor detection and mitigation systems for DNN based AI applications have been proposed. However, many of them are limited to trojan attacks that require a specific patch trigger. In this paper, we introduce composite attack, a more flexible and stealthy trojan attack that eludes backdoor scanners using trojan triggers composed from existing benign features of multiple labels. We show that a neural network with a composed backdoor can achieve accuracy comparable to its original version on benign data and misclassifies when the composite trigger is present in the input. Our experiments on 7 different tasks show that this attack poses a severe threat. We evaluate our attack with two state-of-the-art backdoor scanners. The results show none of the injected backdoors can be detected by either scanner. We also study in details why the scanners are not effective. In the end, we discuss the essence of our attack and propose possible defense.
SESSION: Session 1C: Binary Analysis/Policy and Access Control
The complexities that arise from the implementation of object-oriented concepts in C++ such as virtual dispatch and dynamic type casting have attracted the attention of attackers and defenders alike. Binary-level defenses are dependent on full and precise recovery of class inheritance tree of a given program. While current solutions focus on recovering single and multiple inheritances from the binary, they are oblivious of virtual inheritance. The conventional wisdom among binary-level defenses is that virtual inheritance is uncommon and/or support for single and multiple inheritances provides implicit support for virtual inheritance. In this paper, we show neither to be true. Specifically, (1) we present an efficient technique to detect virtual inheritance in C++ binaries and show through a study that virtual inheritance can be found in non-negligible number (more than 10% on Linux and 12.5% on Windows) of real-world C++ programs including Mysql and Libstdc++. (2) We show that failure to handle virtual inheritance introduces both false positives and false negatives in the hierarchy tree. These falses either introduce attack surface when the hierarchy recovered is used to enforce CFI policies, or make the hierarchy difficult to understand when it is needed for program understanding (e.g., during decompilation). (3) We present a solution to recover virtual inheritance from COTS binaries. We recover a maximum of 95% and 95.5% (GCC -O0) and a minimum of 77.5% and 73.8% (Clang -O2) of virtual and intermediate bases respectively in the virtual inheritance tree.
Software patching is one of the most significant mechanisms to combat vulnerabilities. To demystify underlying patch details, the techniques of patch differential analysis (a.k.a. patch diffing) are proposed to find differences between patched and unpatched programs' binary code. Considering the sophisticated security patches, patch diffing is expected to not only correctly locate patch changes but also provide sufficient explanation for understanding patch details and the fixed vulnerabilities. Unfortunately, none of the existing patch diffing techniques can meet these requirements. In this study, we first perform a large-scale study on code changes of security patches for better understanding their patterns. We then point out several challenges and design principles for patch diffing. To address the above challenges, we design a dynamic patch diffing technique PatchScope. Our technique is motivated by two key observations: 1) the way that a program processes its input reveals a wealth of semantic information, and 2) most memory corruption patches regulate the handling of malformed inputs via updating the manipulations of input-related data structures. The core of PatchScope is a new semantics-aware program representation, memory object access sequence, which characterizes how a program references data structures to manipulate inputs. The representation can not only deliver succinct patch differences but also offer rich patch context information such as input-patch correlations. Such information can interpret patch differences and further help security analysts understand patch details, locate vulnerability root causes, and even detect buggy patches.
Today, Bluetooth 4.0, also known as Bluetooth Low Energy (BLE), has been widely used in many IoT devices (e.g., smart locks, smart sensors, and wearables). However, BLE devices could contain a number of vulnerabilities at the BLE link layer during broadcasting, pairing, and message transmission. To detect these vulnerabilities directly from the bare-metal firmware, we present FirmXRay, the first static binary analysis tool with a set of enabling techniques including a novel base address identification algorithm for robust firmware disassembling, precise data structure recognition, and configuration value resolution. As a proof-of-concept, we focus on the BLE firmware from two leading SoC vendors (i.e., Nordic and Texas Instruments), and implement a prototype of FirmXRay atop Ghidra. We have evaluated FirmXRay with 793 unique firmware (corresponding to 538 unique devices) collected using a mobile app based approach, and our experiment results show that 98.1% of the devices have configured random static MAC addresses, 71.5% Just Works pairing, and 98.5% insecure key exchanges. With these vulnerabilities, we demonstrate identity tracking, spoofing, and eavesdropping attacks on real-world BLE devices.
We present Privaros, a framework to enforce privacy policies on drones. Privaros is designed for commercial delivery drones, such as the ones that will likely be used by Amazon Prime Air. Such drones visit various host airspaces, each of which may have different privacy requirements. Privaros uses mandatory access control to enforce the policies of these hosts on guest delivery drones. Privaros is tailored for ROS, a middleware popular in many drone platforms. This paper presents the design and implementation of Privaros's policy-enforcement mechanisms, describes how policies are specified, and shows that policy specification can be integrated with India's Digital Sky portal. Our evaluation shows that a drone running Privaros can robustly enforce various privacy policies specified by hosts, and that its core mechanisms only marginally increase communication latency and power consumption.
SESSION: Session 1D: Applied Cryptography and Cryptanalysis
Primality testing is a basic cryptographic task. But developers today are faced with complex APIs for primality testing, along with documentation that fails to clearly state the reliability of the tests being performed. This leads to the APIs being incorrectly used in practice, with potentially disastrous consequences. In an effort to overcome this, we present a primality test having a simplest-possible API: the test accepts a number to be tested and returns a Boolean indicating whether the input was composite or probably prime. For all inputs, the output is guaranteed to be correct with probability at least 1 - 2-128. The test is performant: on random, odd, 1024-bit inputs, it is faster than the default test used in OpenSSL by 17%. We investigate the impact of our new test on the cost of random prime generation, a key use case for primality testing. The OpenSSL developers have adopted our suggestions in full; our new API and primality test are scheduled for release in OpenSSL 3.0.
ProMACs: Progressive and Resynchronizing MACs for Continuous Efficient Authentication of Message Streams
Efficiently integrity verification of received data requires Message Authentication Code (MAC) tags. However, while security calls for rather long tags, in many scenarios this contradicts other requirements. Examples are strict delay requirements (e.g., robot or drone control) or resource-scarce settings (e.g., LoRaWAN networks with limited battery capacity).
Prior techniques suggested truncation of MAC tags, thus trading off linear performance gain for exponential security loss. To achieve security of full-length MACs with short(er) tags, we introduce Progressive MACs (ProMACs) -- a scheme that uses internal state to gradually increase security upon reception of subsequent messages. We provide a formal framework and propose a provably secure, generic construction called Whips. We evaluate applicability of ProMACs in several realistic scenarios and demonstrate example settings where ProMACs can be used as a drop-in replacement for traditional MACs.
Although it is one of the most popular signature schemes today, ECDSA presents a number of implementation pitfalls, in particular due to the very sensitive nature of the random value (known as the nonce) generated as part of the signing algorithm. It is known that any small amount of nonce exposure or nonce bias can in principle lead to a full key recovery: the key recovery is then a particular instance of Boneh and Venkatesan's hidden number problem (HNP). That observation has been practically exploited in many attacks in the literature, taking advantage of implementation defects or side-channel vulnerabilities in various concrete ECDSA implementations. However, most of the attacks so far have relied on at least 2 bits of nonce bias (except for the special case of curves at the 80-bit security level, for which attacks against 1-bit biases are known, albeit with a very high number of required signatures). In this paper, we uncover LadderLeak, a novel class of side-channel vulnerabilities in implementations of the Montgomery ladder used in ECDSA scalar multiplication. The vulnerability is in particular present in several recent versions of OpenSSL. However, it leaks less than 1 bit of information about the nonce, in the sense that it reveals the most significant bit of the nonce, but with probability <1. Exploiting such a mild leakage would be intractable using techniques present in the literature so far. However, we present a number of theoretical improvements of the Fourier analysis approach to solving the HNP (an approach originally due to Bleichenbacher), and this lets us practically break LadderLeak-vulnerable ECDSA implementations instantiated over the sect163r1 and NIST P-192 elliptic curves. In so doing, we achieve several significant computational records in practical attacks against the HNP.
We analyze the multi-user security of the streaming encryption in Google's Tink library via an extended version of the framework of nonce-based online authenticated encryption of Hoang et al. (CRYPTO'15) to support random-access decryption. We show that Tink's design choice of using random nonces and a nonce-based key-derivation function indeed improves the concrete security bound. We then give two better alternatives that are more robust against randomness failure. In addition, we show how to efficiently instantiate the key-derivation function via AES, instead of relying on HMAC-SHA256 like the current design in Tink. To accomplish this we give a multi-user analysis of the XOR-of-permutation construction of Bellare, Krovetz, and Rogaway (EUROCRYPT'98).
SESSION: Session 1E: Cyberphysical Systems
We propose a new type of vulnerability for Robotic Vehicles (RVs), called Cyber-Physical Inconsistency. These vulnerabilities target safety checks in RVs (e.g., crash detection). They can be exploited by setting up malicious environment conditions such as placing an obstacle with a certain weight and a certain angle in the RV's trajectory. Once exploited, the safety checks may fail to report real physical accidents or report false alarms (while the RV is still operating normally). Both situations could lead to life-threatening consequences. The root cause of such vulnerabilities is that existing safety checks are mostly using simple range checks implemented in general-purpose programming languages, which are incapable of describing the complex and delicate physical world. We develop a novel technique that requires the interplay of program analysis, vehicle modeling, and search-based testing to identify such vulnerabilities. Our experiment on 4 real-world control software and 8 vehicles including quadrotors, rover, and fixed-wing airplane has discovered 10 real vulnerabilities. Our technique does not have false positives as it only reports when an exploit can be generated.
Industrial Control Systems (ICS) provide management and control capabilities for mission-critical utilities such as the nuclear, power, water, and transportation grids. Within ICS, Programmable Logic Controllers (PLCs) play a key role as they serve as a convenient bridge between the cyber and the physical worlds, e.g., controlling centrifuge machines in nuclear power plants. The critical roles that ICS and PLCs play have made them the target of sophisticated cyberattacks that are designed to disrupt their operation, which creates both social unrest and financial losses. In this context, honeypots have been shown to be highly valuable tools for collecting real data, e.g., malware payload, to better understand the many different methods and strategies that attackers use. However, existing state-of-the-art honeypots for PLCs lack sophisticated service simulations that are required to obtain valuable data. Worse, they cannot adapt while ICS malware keeps evolving, and attack patterns become more sophisticated. To overcome these shortcomings, we present HoneyPLC, a high-interaction, extensible, and malware collecting honeypot supporting a broad spectrum of PLCs models and vendors. Results from our experiments show that HoneyPLC exhibits a high level of camouflaging: it is identified as real devices by multiple widely used reconnaissance tools, including Nmap, Shodan's Honeyscore, the Siemens Step7 Manager, PLCinject, and PLCScan, with a high level of confidence. We deployed HoneyPLC on Amazon AWS and recorded a large amount of interesting interactions over the Internet, showing not only that attackers are in fact targeting ICS systems, but also that HoneyPLC can effectively engage and deceive them while collecting data samples for future analysis.
In this paper, we investigate "split-second phantom attacks," a scientific gap that causes two commercial advanced driver-assistance systems (ADASs), Telsa Model X (HW 2.5 and HW 3) and Mobileye 630, to treat a depthless object that appears for a few milliseconds as a real obstacle/object. We discuss the challenge that split-second phantom attacks create for ADASs. We demonstrate how attackers can apply split-second phantom attacks remotely by embedding phantom road signs into an advertisement presented on a digital billboard which causes Tesla's autopilot to suddenly stop the car in the middle of a road and Mobileye 630 to issue false notifications. We also demonstrate how attackers can use a projector in order to cause Tesla's autopilot to apply the brakes in response to a phantom of a pedestrian that was projected on the road and Mobileye 630 to issue false notifications in response to a projected road sign. To counter this threat, we propose a countermeasure which can determine whether a detected object is a phantom or real using just the camera sensor. The countermeasure (GhostBusters) uses a "committee of experts" approach and combines the results obtained from four lightweight deep convolutional neural networks that assess the authenticity of an object based on the object's light, context, surface, and depth. We demonstrate our countermeasure's effectiveness (it obtains a TPR of 0.994 with an FPR of zero) and test its robustness to adversarial machine learning attacks.
Secure pairing is key to trustworthy deployment and application of Internet of Things (IoT) devices. However, IoT devices lack conventional user interfaces, such as keyboards and displays, which makes many traditional pairing approaches inapplicable. Proximity-based pairing approaches are very usable, but can be exploited by co-located malicious devices. Approaches based on a user's physical operations on IoT devices are more secure, but typically require inertial sensors, while many devices do not satisfy this requirement. A secure and usable pairing approach that can be applied to heterogeneous IoT devices still does not exist. We develop a technique, Universal Operation Sensing, which allows an IoT device to sense the user's physical operations on it without requiring inertial sensors. With this technique, a user holding a smartphone or wearing a wristband can finish pairing in seconds through some very simple operations, e.g., pressing a button or twisting a knob. Moreover, we reveal an inaccuracy issue in original fuzzy commitment and propose faithful fuzzy commitment to resolve it. We design a pairing protocol using faithful fuzzy commitment, and build a prototype system named Touch-to-Pair (T2Pair, for short). The comprehensive evaluation shows that it is secure and usable.
SESSION: Session 2A: ML and Information Leakage
We present CrypTFlow2, a cryptographic framework for secure inference over realistic Deep Neural Networks (DNNs) using secure 2-party computation. CrypTFlow2 protocols are both correct -- i.e., their outputs are bitwise equivalent to the cleartext execution -- and efficient -- they outperform the state-of-the-art protocols in both latency and scale. At the core of CrypTFlow2, we have new 2PC protocols for secure comparison and division, designed carefully to balance round and communication complexity for secure inference tasks. Using CrypTFlow2, we present the first secure inference over ImageNet-scale DNNs like ResNet50 and DenseNet121. These DNNs are at least an order of magnitude larger than those considered in the prior work of 2-party DNN inference. Even on the benchmarks considered by prior work, CrypTFlow2 requires an order of magnitude less communication and 20x-30x less time than the state-of-the-art.
Deep learning has achieved overwhelming success, spanning from discriminative models to generative models. In particular, deep generative models have facilitated a new level of performance in a myriad of areas, ranging from media manipulation to sanitized dataset generation. Despite the great success, the potential risks of privacy breach caused by generative models have not been analyzed systematically. In this paper, we focus on membership inference attack against deep generative models that reveals information about the training data used for victim models. Specifically, we present the first taxonomy of membership inference attacks, encompassing not only existing attacks but also our novel ones. In addition, we propose the first generic attack model that can be instantiated in a large range of settings and is applicable to various kinds of deep generative models. Moreover, we provide a theoretically grounded attack calibration technique, which consistently boosts the attack performance in all cases, across different attack settings, data modalities, and training configurations. We complement the systematic analysis of attack performance by a comprehensive experimental study, that investigates the effectiveness of various attacks w.r.t. model type and training configurations, over three diverse application scenarios (i.e., images, medical data, and location data).
To continuously improve quality and reflect changes in data, machine learning applications have to regularly retrain and update their core models. We show that a differential analysis of language model snapshots before and after an update can reveal a surprising amount of detailed information about changes in the training data. We propose two new metrics---differential score and differential rank---for analyzing the leakage due to updates of natural language models. We perform leakage analysis using these metrics across models trained on several different datasets using different methods and configurations. We discuss the privacy implications of our findings, propose mitigation strategies and evaluate their effect.
Embeddings are functions that map raw input data to low-dimensional vector representations, while preserving important semantic information about the inputs. Pre-training embeddings on a large amount of unlabeled data and fine-tuning them for downstream tasks is now a de facto standard in achieving state of the art learning in many domains.
We demonstrate that embeddings, in addition to encoding generic semantics, often also present a vector that leaks sensitive information about the input data. We develop three classes of attacks to systematically study information that might be leaked by embeddings. First, embedding vectors can be inverted to partially recover some of the input data. As an example, we show that our attacks on popular sentence embeddings recover between 50%--70% of the input words (F1 scores of 0.5--0.7). Second, embeddings may reveal sensitive attributes inherent in inputs and independent of the underlying semantic task at hand. Attributes such as authorship of text can be easily extracted by training an inference model on just a handful of labeled embedding vectors. Third, embedding models leak moderate amount of membership information for infrequent training data inputs. We extensively evaluate our attacks on various state-of-the-art embedding models in the text domain. We also propose and evaluate defenses that can prevent the leakage to some extent at a minor cost in utility.
SESSION: Session 2B: Applied Cryptography
Pairing-based cryptography is widely used for its efficiency and functionality. When designing pairing-based schemes, one common task is to devise algorithms for verifying a set of untrusted group elements with respect to a set of trusted group elements. One might be searching for a verification algorithm for a signature scheme or a method for verifying an IBE/ABE private key with respect to the IBE/ABE public parameters. In ACM CCS 2019 Hohenberger Vusirikala, the AutoPPE software tool was introduced for automatically generating a set of pairing product equations (PPEs) that can verify the correctness of a set of pairing group elements with respect to a set of trusted group elements. This task is non-trivial. Some schemes (e.g., those based on dual system encryption) provably do not support any efficient algorithm for verifying the private keys with respect to the public parameters. Other schemes (e.g., the Boyen-Waters anonymous IBE) were left in a gray area by Hohenberger-Vusirikala (CCS 19) -- no conjunction of PPEs was known for testing them, but no proof of untestability either.
In this work, we significantly generalize and expand on the foundation of Hohenberger-Vusirikala (CCS 19). Specifically, we consider a larger space of verification algorithms, which we call PPE Circuits, to verify a set of untrusted group elements with respect to a set of trusted group elements. Informally, a PPE Circuit supports AND, OR, NOT and PPE gates, thus capturing all of the capability of AutoPPE while novelly enabling the verification algorithm to include arbitrary logic (as opposed to only conjunctions of PPEs). Our contributions include a formalization of PPE circuits, a provably-correct algorithm for searching for a PPE circuit given a description of the trusted and untrusted elements to be verified, and a new open-source software tool called AutoCircuitPPE that realizes this algorithm. AutoCircuitPPE was tested on a host of test cases and it output PPE circuits for all "gray area" schemes left unresolved in Hohenberger-Vusirikala (CCS 19) as well as several new test cases, usually in 100 seconds or less.
Password-hardened encryption (PHE) was introduced by Lai et al. at USENIX 2018 and immediately productized by VirgilSecurity. PHE is a password-based key derivation protocol that involves an oblivious external crypto service for key derivation. The security of PHE protects against offline brute-force attacks, even when the attacker is given the entire database. Furthermore, the crypto service neither learns the derived key nor the password. PHE supports key-rotation meaning that both the server and crypto service can update their keys without involving the user. While PHE significantly strengthens data security, it introduces a single point of failure because key-derivation always requires access to the crypto service. In this work, we address this issue and simultaneously increase security by introducing threshold password-hardened encryption. Our formalization of this primitive revealed shortcomings of the original PHE definition that we also address in this work. Following the spirit of prior works, we give a simple and efficient construction using lightweight tools only. We also implement our construction and evaluate its efficiency. Our experiments confirm the practical efficiency of our scheme and show that it is more efficient than common memory-hard functions, such as scrypt. From a practical perspective this means that threshold PHE can be used as an alternative to scrypt for password protection and key-derivation, offering better security in terms of offline brute force attacks.
Symmetric password-authenticated key exchange (sPAKE) can be seen as an extension of traditional key exchange where two parties agree on a shared key if and only if they share a common secret (possibly low-entropy) password. We present the first sPAKE protocol to simultaneously achieve the following properties: only two exponentiations per party, the same as plain unauthenticated Diffie-Hellman key agreement (and likely optimal); optimal round complexity: a single flow (one message from each party that can be sent in parallel) to achieve implicit authentication, or two flows to achieve explicit mutual authentication; security in the random oracle model, rather than ideal cipher or generic group model; UC security, rather than game-based. Our protocol is a generalization of the seminal EKE protocol of Bellovin & Merritt (S&P 1992).
We also present a UC-secure 1-out-of-N oblivious transfer (OT) protocol, for random payloads. Its communication complexity is independent of N, meaning that N can even be exponential in the security parameter. Such a protocol can also be considered a kind of oblivious PRF (OPRF). Our protocol improves over the leading UC-secure 1-out-of-N OT construction of Masny & Rindal (CCS 2019) for all N>2, and has essentially the same cost for N=2. The new technique underlying these results is a primitive we call programmable-once public function (POPF). Intuitively, a POPF is a function whose output can be programmed by one party on exactly one point. All other outputs of the function are outside of any party's control, in a provable sense.
In the past few years, we have seen multiple attacks on one-dimensional databases that support range queries. These attacks achieve full database reconstruction by exploiting access pattern leakage along with known query distribution or search pattern leakage. We are the first to go beyond one dimension, exploring this threat in two dimensions. We unveil an intrinsic limitation of reconstruction attacks by showing that there can be an exponential number of distinct databases that produce equivalent leakage. Next, we present a full database reconstruction attack. Our algorithm runs in polynomial time and returns a poly-size encoding of all databases consistent with the given leakage profile. We implement our algorithm and observe real-world databases that admit a large number of equivalent databases, which aligns with our theoretical results.
SESSION: Session 2C: Browser Security
In this paper, we conduct the largest to-date analysis of browser extensions, by investigating 922,684 different extension versions collected in the past six years, and using this data to discover malicious versions of extensions. We propose a two-stage system that first identifies malicious extensions based on anomalous extension ratings and locates the code that was added to a benign extension in order to make it malicious. We encode these code deltas according to the APIs that they abuse and search our historical dataset for other similar deltas of extensions which have not yet been flagged, neither by users nor by Chrome's Web Store. We were able to discover 143 malicious extensions belonging to 21 malicious clusters, exhibiting a wide range of abuse, from history stealing and ad injection, to the hijacking of new tabs and search engines. Our results show that our proposed techniques operate in an abuse-agnostic way and can identify malicious extensions that are evading detection.
The Web has become a platform in which sites rely on intricate interactions that span across the boundaries of origins. While the Same-Origin Policy prevents direct data exchange with documents from other origins, the postMessage API offers one relaxation that allows developers to exchange data across these boundaries. While prior manual analysis could show the presence of issues within postMessage handlers, unfortunately, a steep increase in postMessage usage makes any manual approach intractable. To deal with this increased work load, we set out to automatically find issues in postMessage handlers that allow an attacker to execute code in the vulnerable sites, alter client-side state, or leak sensitive information. To achieve this goal, we present an automated analysis framework running inside the browser, which uses selective forced execution paired with lightweight dynamic taint tracking to find traces in the analyzed handlers that end in sinks allowing for code-execution or state alterations. We use path constraints extracted from the program traces and augment them with Exploit Templates, i.e., additional constraints, ascertaining that a valid assignment that solves all these constraints produces a code-invoking or state-manipulating behavior. Based on these constraints, we use Z3 to generate postMessages aimed at triggering the insecure functionality to prove exploitability, and validate our findings at scale. We use this framework to conduct the most comprehensive experiment studying the security issues of postMessage handlers found throughout the top 100,000 most influential sites yet, which allows us to find potentially exploitable data flows in 252 unique handlers out of which 111 were automatically exploitable.
Providing functionality that streamlines the more tedious aspects of website interaction is of paramount importance to browsers as it can significantly improve the overall user experience. Browsers' autofill functionality exemplifies this goal, as it alleviates the burden of repetitively typing the same information across websites. At the same time, however, it also presents a significant privacy risk due to the inherent disparity between the browser's interpretation of a given web page and what users can visually perceive. In this paper we present the first, to our knowledge, comprehensive exploration of the privacy threats of autofill functionality. We first develop a series of new techniques for concealing the presence of form elements that allow us to obtain sensitive user information while bypassing existing browser defenses. Alarmingly, our large-scale study in the Alexa top 100K reveals the widespread use of such deceptive techniques for stealthily obtaining user-identifying information, as they are present in at least 5.8% of the forms that are autofilled by Chrome. Subsequently, our in-depth investigation of browsers' autofill functionality reveals a series of flaws and idiosyncrasies, which we exploit through a series of novel attack vectors that target specific aspects of browsers' behavior. By chaining these together we are able to demonstrate a novel invasive side-channel attack that exploits browser's autofill preview functionality for inferring sensitive information even when users choose to not utilize autofill. This attack affects all major Chromium-based browsers and allows attackers to probe users' autofill profiles for over a hundred thousand candidate values (e.g., credit card and phone numbers). Overall, while the preview mode is intended as a protective measure for enabling more informed decisions, ultimately it creates a new avenue of exposure that circumvents a user's choice to not divulge their information. In light of our findings, we have disclosed our techniques to the affected vendors, and have also created a Chrome extension that can prevent our attacks and mitigate this threat until our countermeasures are incorporated into browsers.
SESSION: Session 2D: Mobile Security
Fake base station (FBS) has been exploited by criminals to attack mobile users by spamming fraudulent messages for over a decade. Despite that prior work has proposed several techniques to mitigate this issue, FBS spam is still a long-standing challenging issue in some countries, such as China, and causes billions of dollars of financial loss every year. Therefore, understanding and exploring the thematic strategies in the FBS spam ecosystem at a large scale would improve the defense mechanisms.
In this paper, we present the first large-scale characterization of FBS spam ecosystem by collecting three-month real-world FBS detection results. First, at "macro-level'', we uncover the characteristics of FBS spammers, including their business categories, temporal patterns and spatial patterns. Second, at "micro-level'', we investigate how FBS ecosystem is organized and how fraudulent messages are constructed by campaigns to trap users and evade detection. Collectively, the results expand our understanding of the FBS spam ecosystem and provide new insights into improved mitigation mechanisms for the security community.
Repackaging popular benign apps with malicious payload used to be the most common way to spread Android malware. Nevertheless, since 2016, we have observed an alarming new trend to Android ecosystem: a growing number of Android malware samples abuse recent app-virtualization innovation as a new distribution channel. App-virtualization enables a user to run multiple copies of the same app on a single device, and tens of millions of users are enjoying this convenience. However, cybercriminals repackage various malicious APK files as plugins into an app-virtualization platform, which is flexible to launch arbitrary plugins without the hassle of installation. This new style of repackaging gains the ability to bypass anti-malware scanners by hiding the grafted malicious payload in plugins, and it also defies the basic premise embodied by existing repackaged app detection solutions.
As app-virtualization-based apps are not necessarily malware, in this paper, we aim to make a verdict on them prior to run time. Our in-depth study results in two key observations: 1) the proxy layer between plugin apps and the Android framework is the core of app-virtualization mechanism, and it reveals the feature of finite state transitions; 2) malware typically loads plugins stealthily and hides malicious behaviors. These insights motivate us to develop a two-layer detection approach, called VAHunt. First, we design a stateful detection model to identify the existence of an app-virtualization engine in APK files. Second, we perform data flow analysis to extract fingerprinting features to differentiate between malicious and benign loading strategies. Since October 2019, we have tested VAHunt in Antiy AVL Mobile Security, a leading mobile security company, to detect more than 139K app-virtualization-based samples. Compared with the ground truth, VAHunt achieves 0.7% false negatives and zero false positive. Our automated detection frees security analysts from the burden of reverse engineering.
Deploying Android Security Updates: an Extensive Study Involving Manufacturers, Carriers, and End Users
Android's fragmented ecosystem makes the delivery of security updates and OS upgrades cumbersome and complex. While Google initiated various projects such as Android One, Project Treble, and Project Mainline to address this problem, and other involved entities (e.g., chipset vendors, manufacturers, carriers) continuously strive to improve their processes, it is still unclear how effective these efforts are on the delivery of updates to supported end-user devices. In this paper, we perform an extensive quantitative study (Aug. 2015 to Dec. 2019) to measure the Android security updates and OS upgrades rollout process. Our study leverages multiple data sources: the Android Open Source Project (AOSP), device manufacturers, and the top four U.S. carriers (AT&T, Verizon, T-Mobile, and Sprint). Furthermore, we analyze an end-user dataset captured in 2019 (152M anonymized HTTP requests associated with 9.1M unique user identifiers) from a U.S.-based social network. Our findings include unique measurements that, due to the fragmented and inconsistent ecosystem, were previously challenging to perform. For example, manufacturers and carriers introduce a median latency of 24 days before rolling out security updates, with an additional median delay of 11 days before end devices update. We show that these values alter per carrier-manufacturer relationship, yet do not alter greatly based on a model's age. Our results also delve into the effectiveness of current Android projects. For instance, security updates for Treble devices are available on average 7 days faster than for non-Treble devices. While this constitutes an improvement, the security update delay for Treble devices still averages 19 days.
App-in-app is a new and trending mobile computing paradigm in which native app-like software modules, called sub-apps, are hosted by popular mobile apps such as Wechat, Baidu, TikTok and Chrome, to enrich the host app's functionalities and to form an "all-in-one app" ecosystem. Sub-apps access system resources through the host, and their functionalities come close to regular mobile apps (taking photos, recording voices, banking, shopping, etc.). Less clear, however, is whether the host app, typically a third-party app, is capable of securely managing sub-apps and their access to system resources. In this paper, we report the first systematic study on the resource management in app-in-app systems. Our study reveals high-impact security flaws, which allow the adversary to stealthily escalate privilege (e.g., accessing the camera, photo gallery, microphone, etc.) or acquire sensitive data (e.g., location, passwords of Amazon, Google, etc.). To understand the impacts of those flaws, we developed an analysis tool that automatically assesses 11 popular app-in-app platforms on both Android and iOS. Our results brought to light the prevalence of the security flaws. We further discuss the lessons learned and propose mitigation strategies.
SESSION: Session 2E: Smart Contracts and Cryptocurrencies
Smart contracts are programmable, decentralized and transparent financial applications. Because smart contract platforms typically support Turing-complete programming languages, such systems are often said to enable arbitrary applications. However, the current permissionless smart contract systems impose heavy restrictions on the types of computations that can be implemented. For example, the globally-replicated and sequential execution model of Ethereum requires low gas limits that make many computations infeasible.
In this paper, we propose a novel system called ACE whose main goal is to enable more complex smart contracts on permissionless blockchains. ACE is based on an off-chain execution model where the contract issuers appoint a set of service providers to execute the contract code independent from the consensus layer. The primary advantage of ACE over previous solutions is that it allows one contract to safely call another contract that is executed by a different set of service providers. Thus, ACE is the first solution to enable off-chain execution of interactive smart contracts with flexible trust assumptions. Our evaluation shows that ACE enables several orders of magnitude more complex smart contracts than standard Ethereum.
Proof-of-work (PoW) cryptocurrency blockchains like Bitcoin secure vast amounts of money. Their operators, called miners, expend resources to generate blocks and receive monetary rewards for their effort. Blockchains are, in principle, attractive targets for Denial-of-Service (DoS) attacks: There is fierce competition among coins, as well as potential gains from short selling. Classical DoS attacks, however, typically target a few servers and cannot scale to systems with many nodes. There have been no successful DoS attacks to date against prominent cryptocurrencies. We present Blockchain DoS (BDoS), the first incentive-based DoS attack that targets PoW cryptocurrencies. Unlike classical DoS, BDoS targets the system's mechanism design: It exploits the reward mechanism to discourage miner participation. Previous DoS attacks against PoW blockchains require an adversary's mining power to match that of all other miners. In contrast, BDoS can cause a blockchain to grind to a halt with significantly fewer resources, e.g., 21% as of March 2020 in Bitcoin, according to our empirical study. We find that Bitcoin's vulnerability to BDoS increases rapidly as the mining industry matures and profitability drops. BDoS differs from known attacks like Selfish Mining in its aim not to increase an adversary's revenue, but to disrupt the system. Although it bears some algorithmic similarity to those attacks, it introduces a new adversarial model, goals, algorithm, and game-theoretic analysis. Beyond its direct implications for operational blockchains, BDoS introduces the novel idea that an adversary can manipulate miners' incentives by proving the existence of blocks without actually publishing them.
Ethereum has emerged as the most popular smart contract platform, with hundreds of thousands of contracts stored on the blockchain and covering diverse application scenarios, such as auctions, trading platforms, or elections. Given the financial nature of smart contracts, security vulnerabilities may lead to catastrophic consequences and, even worse, can hardly be fixed as data stored on the blockchain, including the smart contract code itself, are immutable. An automated security analysis of these contracts is thus of utmost interest, but at the same time technically challenging. This is as e.g., Ethereum's transaction-oriented programming mechanisms feature a subtle semantics, and since the blockchain data at execution time, including the code of callers and callees, are not statically known. In this work, we present eThor, the first sound and automated static analyzer for EVM bytecode, which is based on an abstraction of the EVM bytecode semantics based on Horn clauses. In particular, our static analysis supports reachability properties, which we show to be sufficient for capturing interesting security properties for smart contracts (e.g., single-entrancy) as well as contract-specific functional properties. Our analysis is proven sound against a complete semantics of EVM bytecode, and a large-scale experimental evaluation on real-world contracts demonstrates that eThor is practical and outperforms the state-of-the-art static analyzers: specifically, eThor is the only one to provide soundness guarantees, terminates on 94% of a representative set of real-world contracts, and achieves an F-measure (which combines sensitivity and specificity) of 89%.
The problem of fair exchange consists of interchanging goods between two parties that do not trust each other. Despite known impossibility results, recent works leverage the block-chain and zero-knowledge proofs to implement zero-knowledge contingent payment (zkCP) systems that make fair exchange of digital goods possible. Implementing these systems in a secure and efficient way is a big challenge, as evidenced by several unsuccessful attempts from the literature. Campanelli et al. (ACM CCS 2017) discovered a vulnerability on an existing zkCP proposal based on SNARKs (succinct non-interactive arguments of knowledge) and suggested several repairs. Fuchsbauer (ACM CCS 2019) found a flaw in the mentioned countermeasures. In particular, he showed that witness-indistinguishability (WI) is not sufficient for the zkCP schemes proposed by Campanelli et al. to be secure. In this work, we observe that a slightly stronger notion of WI, that we coin trapdoor subversion WI (tS-WI), rules out Fuchsbauer's attack. We formally define security properties for CP systems and show that, under tS-WI, Campanelli et al.'s proposal indeed satisfies these properties. Additionally, we explore alternative approaches to implement ZK (other than SNARKs) and develop a prototype, using it to demonstrate their potential. Our new ideas result in a protocol to sell ECDSA signatures with contingent payment that can be executed in less than $150$ milliseconds over a LAN network.
SESSION: Session 3A: Privacy
The shuffle model of differential privacy (Erlingsson et al. SODA 2019; Cheu et al. EUROCRYPT 2019) and its close relative encode-shuffle-analyze (Bittau et al. SOSP 2017) provide a fertile middle ground between the well-known local and central models. Similarly to the local model, the shuffle model assumes an untrusted data collector who receives privatized messages from users, but in this case a secure shuffler is used to transmit messages from users to the collector in a way that hides which messages came from which user. An interesting feature of the shuffle model is that increasing the amount of messages sent by each user can lead to protocols with accuracies comparable to the ones achievable in the central model. In particular, for the problem of privately computing the sum of n bounded real values held by n different users, Cheu et al. showed that O(sqrtn ) messages per user suffice to achieve O(1) error (the optimal rate in the central model), while Balle et al. (CRYPTO 2019) recently showed that a single message per user leads to Theta(n^1/3 ) MSE (mean squared error), a rate strictly in-between what is achievable in the local and central models. This paper introduces two new protocols for summation in the shuffle model with improved accuracy and communication trade-offs. Our first contribution is a recursive construction based on the protocol from Balle et al. mentioned above, providing poly(log log n) error with O(log log n) messages per user. The second contribution is a protocol with O(1) error and O(1) messages per user based on a novel analysis of the reduction from secure summation to shuffling introduced by Ishai et al. (FOCS 2006) (the original reduction required O(log n) messages per user). We also provide a numerical evaluation showing that our protocols provide good trade-offs between privacy, accuracy and communication for realistic values of n.
R2DP: A Universal and Automated Approach to Optimizing the Randomization Mechanisms of Differential Privacy for Utility Metrics with No Known Optimal Distributions
Differential privacy (DP) has emerged as a de facto standard privacy notion for a wide range of applications. Since the meaning of data utility in different applications may vastly differ, a key challenge is to find the optimal randomization mechanism, i.e., the distribution and its parameters, for a given utility metric. Existing works have identified the optimal distributions in some special cases, while leaving all other utility metrics (e.g., usefulness and graph distance) as open problems. Since existing works mostly rely on manual analysis to examine the search space of all distributions, it would be an expensive process to repeat such efforts for each utility metric. To address such deficiency, we propose a novel approach that can automatically optimize different utility metrics found in diverse applications under a common framework. Our key idea that, by regarding the variance of the injected noise itself as a random variable, a two-fold distribution may approximately cover the search space of all distributions. Therefore, we can automatically find distributions in this search space to optimize different utility metrics in a similar manner, simply by optimizing the parameters of the two-fold distribution. Specifically, we define a universal framework, namely, randomizing the randomization mechanism of differential privacy (R2DP), and we formally analyze its privacy and utility. Our experiments show that R2DP can provide better results than the baseline distribution (Laplace) for several utility metrics with no known optimal distributions, whereas our results asymptotically approach to the optimality for utility metrics having known optimal distributions. As a side benefit, the added degree of freedom introduced by the two-fold distribution allows R2DP to accommodate the preferences of both data owners and recipients.
This paper considers the problem of estimating the information leakage of a system in the black-box scenario, i.e. when the system's internals are unknown to the learner, or too complicated to analyze, and the only available information are pairs of input-output data samples, obtained by submitting queries to the system or provided by a third party. The frequentist approach relies on counting the frequencies to estimate the input-output conditional probabilities, however this method is not accurate when the domain of possible outputs is large. To overcome this difficulty, the estimation of the Bayes error of the ideal classifier was recently investigated using Machine Learning (ML) models, and it has been shown to be more accurate thanks to the ability of those models to learn the input-output correspondence. However, the Bayes vulnerability is only suitable to describe one-try attacks. A more general and flexible measure of leakage is the g-vulnerability, which encompasses several different types of adversaries, with different goals and capabilities. We propose a novel approach to perform black-box estimation of the g-vulnerability using ML which does not require to estimate the conditional probabilities and is suitable for a large class of ML algorithms. First, we formally show the learnability for all data distributions. Then, we evaluate the performance via various experiments using k-Nearest Neighbors and Neural Networks. Our approach outperform the frequentist one when the observables domain is large.
Despite excellent theoretical support, Differential Privacy (DP) can still be a challenge to implement in practice. In part, this challenge is due to the concerns associated with translating arbitrary- or infinite-precision theoretical mechanisms to the reality of floating point or fixed-precision. Beginning with the troubling result of Mironov demonstrating the security issues of using floating point for implementing the Laplace mechanism, there have been many reasonable questions raised concerning the vulnerabilities of real-world implementations of DP. In this work, we examine the practicalities of implementing the exponential mechanism of McSherry and Talwar. We demonstrate that naive or malicious implementations can result in catastrophic privacy failures. To address these problems, we show that the mechanism can be implemented exactly for a rich set of utility functions and values of the privacy parameter epsilon with limited practical overhead in running time and minimal code complexity. How do we achieve this result? We employ a simple trick of switching from base-e to base 2, allowing us to perform precise base-2 arithmetic. A short, precise expression is always available for epsilon, and the only approximation error we incur is the conversion of the base-2 privacy parameter back to base-e for reporting purposes. The core base-2 arithmetic of the mechanism can be simply and efficiently implemented using open-source high precision arithmetic libraries. Furthermore, the exact nature of the implementation lends itself to simple monitoring of correctness and proofs of privacy.
SESSION: Session 3B: Malware
Using hundreds of thousands of compromised IoT devices, the Mirai botnet emerged in late 2016 as a game changing threat actor, capable of temporarily taking down major Internet service providers and Internet infrastructure. Since then, dozens of variants of IoT-based botnets have sprung up, and in today's Internet distributed denial-of-service attacks from IoT devices have become a major attack vector. This proliferation was significantly driven by the public distribution of the Mirai source code, which other actors used to create their own, customized version of the original Mirai botnet. In this paper we provide a comprehensive view into the ongoing battle over the Internet of Things fought by Mirai and its many siblings. Using 7,500 IoT honeypots, we show that we can use 300,000,000 compromisation attempts from infected IoT devices as well as a design flaw in Mirai's random number generator to obtain insights into Mirai infections worldwide. We find that networks and the particular malware strains that plague them are tightly connected, and malware authors over time take over strategies from their competitors. The most surprising finding is that epidemiologically, IoT botnets are not self-sustaining: were it not for continuous pushes from bootstrapping, Mirai and its variants would die out.
Machine learning (ML) classifiers have been widely deployed to detect Android malware, but at the same time the application of ML classifiers also faces an emerging problem. The performance of such classifiers degrades---or called ages---significantly over time given the malware evolution. Prior works have proposed to use retraining or active learning to reverse and improve aged models. However, the underlying classifier itself is still blind, unaware of malware evolution. Unsurprisingly, such evolution-insensitive retraining or active learning comes at a price, i.e., the labeling of tens of thousands of malware samples and the cost of significant human efforts. In this paper, we propose the first framework, called APIGraph, to enhance state-of-the-art malware classifiers with the similarity information among evolved Android malware in terms of semantically-equivalent or similar API usages, thus naturally slowing down classifier aging. Our evaluation shows that because of the slow-down of classifier aging, APIGraph saves significant amounts of human efforts required by active learning in labeling new malware samples.
Malicious developers may succeed at publishing their apps in mobile markets, including the official ones. If reported, the apps will be taken down and the developer accounts possibly be banned. Unfortunately, such take-downs do not prevent the attackers to use other developer accounts to publish variations of their malicious apps. This work presents a novel approach for identifying developer accounts, and other indicators of compromise (IOCs) in mobile markets, that belong to the same operation, i.e., to the same owners. Given a set of seed IOCs, our approach explores app and version metadata to identify new IOCs that belong to the same operation. It outputs an attribution graph, which details the attribution inferences, so that they can be reviewed. We have implemented our approach into Retriever, a tool that supports multiple mobile markets including the official GooglePlay and AppleStore. We have evaluated Retriever on 17 rogueware and adware operations. In 94% of the operations, Retriever discovers at least one previously unknown developer account. Furthermore, Retriever reveals that operations that look dead still have active developer accounts.
Compromising a website that is routinely visited by employees of a targeted organization has become a popular technique for nation-state level adversaries to penetrate an enterprise's network. This technique, dubbed a "watering hole" attack, leverages a compromised website to serve as a stepping stone into the true victims' network. Despite watering hole attacks being one of the main techniques used by attackers to achieve the initial compromise stage of the cyber kill chain, there has been relatively little research related to detecting or investigating complex watering hole attacks. While there is existing work that seeks to detect malicious modifications made to an otherwise benign website, we argue that simply detecting that the website is compromised is only the first stage of the investigation. In this paper, we propose Mnemosyne, a postmortem forensic analysis engine that relies on browser-based attack provenance to accurately reconstruct, investigate, and assess the ramifications of watering hole attacks. Mnemosyne relies on a lightweight browser-modification-free auditing daemon to passively collect causality logs related to the browser's execution. Next, Mnemosyne applies a set of versioning techniques on top of these causality logs to precisely pinpoint when the website was compromised and what modifications were made by the adversary. Following this step, Mnemosyne relies on a novel user-level analysis to assess how the malicious modifications affected the targeted enterprise and seeks to identify exactly which employees fell victim to the attack. Throughout our extensive evaluation, we found that Mnemosyne's forensic analysis engine was able to identify the true victims in all seven real-world watering hole scenarios, while also reducing the amount of manual analysis required by the forensic analyst by 98.17% on average.
SESSION: Session 3C: Consensus
HoneyBadgerBFT, proposed by Miller et al.  as the first practical asynchronous atomic broadcast protocol, demonstrated impressive performance. The core of HoneyBadgerBFT (HB-BFT) is to achieve batching consensus using asynchronous common subset protocol (ACS) of Ben-Or et al., constituted with n reliable broadcast protocol (RBC) to have each node propose its input, followed by n asynchronous binary agreement protocol (ABA) to make a decision for each proposed value (n is the total number of nodes).
In this paper, we propose two new atomic broadcast protocols (called Dumbo1, Dumbo2) both of which have asymptotically and practically better efficiency. In particular, the ACS of Dumbo1 only runs a small κ (independent of n) instances of ABA, while that of Dumbo2 further reduces it to constant! At the core of our techniques are two major observations: (1) reducing the number of ABA instances significantly improves efficiency; and (2) using multi-valued validated Byzantine agreement (MVBA) which was considered sub-optimal for ACS in  in a more careful way could actually lead to a much more efficient ACS.
We implement both Dumbo1, Dumbo2 and deploy them as well as HB-BFT on 100 Amazon EC2 t2.medium instances uniformly distributed throughout 10 different regions across the globe, and run extensive experiments in the same environments. The experimental results show that our protocols achieve multi-fold improvements over HoneyBadgerBFT on both latency and throughput, especially when the system scale becomes moderately large.
We establish the optimal security threshold for the Bitcoin protocol in terms of adversarial hashing power, honest hashing power, and network delays. Specifically, we prove that the protocol is secure if [ra < 1/Δ0 + 1/rh,,] where rh is the expected number of honest proof-of-work successes in unit time, ra is the expected number of adversarial successes, and no message is delayed by more than Δ0 time units. In this regime, the protocol guarantees consistency and liveness with exponentially decaying failure probabilities. Outside this region, the simple private chain attack prevents consensus. Our analysis immediately applies to any Nakamoto-style proof-of-work protocol; in the full version of this paper we also present the adaptations needed to apply it in the proof-of-stake setting, establishing a similar threshold there.
Synchronous consensus protocols, by definition, have a worst-case commit latency that depends on the bounded network delay. The notion of optimistic responsiveness was recently introduced to allow synchronous protocols to commit instantaneously when some optimistic conditions are met. In this work, we revisit this notion of optimistic responsiveness and present optimal latency results. We present a lower bound for Byzantine Broadcast that relates the latency of optimistic and synchronous commits when the designated sender is honest and while the optimistic commit can tolerate some faults. We then present two matching upper bounds for tolerating f faults out of $n = 2f+1$ parties. Our first upper bound result achieves optimal optimistic and synchronous commit latency when the designated sender is honest and the optimistic commit can tolerate at least one fault. We experimentally evaluate this protocol and show that it achieves throughput comparable to state-of-the-art synchronous and partially synchronous protocols and under optimistic conditions achieves latency better than the state-of-the-art. Our second upper bound result achieves optimal optimistic and synchronous commit latency when the designated sender is honest but the optimistic commit does not tolerate any faults. The presence of matching lower and upper bound results make both of the results tight for $n = 2f+1$. Our upper bound results are presented in a state machine replication setting with a steady-state leader who is replaced with a view-change protocol when they do not make progress. For this setting, we also present an optimistically responsive protocol where the view-change protocol is optimistically responsive too.
Nakamoto invented the longest chain protocol, and claimed its security by analyzing the private double-spend attack, a race between the adversary and the honest nodes to grow a longer chain. But is it the worst attack? We answer the question in the affirmative for three classes of longest chain protocols, designed for different consensus models: 1) Nakamoto's original Proof-of-Work protocol; 2) Ouroboros and SnowWhite Proof-of-Stake protocols; 3) Chia Proof-of-Space protocol. As a consequence, exact characterization of the maximum tolerable adversary power is obtained for each protocol as a function of the average block time normalized by the network delay. The security analysis of these protocols is performed in a unified manner by a novel method of reducing all attacks to a race between the adversary and the honest nodes.
SESSION: Session 3D: Formal Methods
Contactless systems, such as the EMV (Europay, Mastercard and Visa) payment protocol, are vulnerable to relay attacks. The typical countermeasure to this relies on distance bounding protocols, in which a reader estimates an upper bound on its physical distance from a card by doing round-trip time (RTT) measurements. However, these protocols are trivially broken in the presence of rogue readers. At Financial Crypto 2019, we proposed two novel EMV-based relay-resistant protocols: they integrate distance-bounding with the use of hardware roots of trust (HWRoT) in such a way that correct RTT-measurements can no longer be bypassed.
Our contributions are threefold: first, we design a calculus to model this advanced type of distance-bounding protocols integrated with HWRoT; as an additional novelty, our calculus is also the first to allow for mobility of cards and readers during a proximity-checking phase. Second, to make it possible to analyse these protocols via more standard mechanisms and tools, we consider a 2018 characterisation of distance-bounding security that does away with physical aspects and relies only on the causality of events; we cast it in our richer calculus and extend its theoretical guarantees to our more expressive models (with mobility, potentially rogue readers, and HWRoT). Due to this extension, we can carry out the security analysis in the standard protocol verification tool ProVerif. Third, we provide the first implementation of Mastercard's relay-resistant EMV protocol PayPass-RRP as well as one of its 2019 extension with HWRoT called PayBCR. We evaluate their efficiency and their robustness to relay attacks, in presence of both honest and rogue readers. Our experiments are the first to show that Mastercard's PayPass-RRP and its HWRoT-based extension PayBCR are both practical in preventing relay attacks of the magnitude shown thus-far in EMV.
We present a new methodology for building formally verified cryptographic libraries that are optimized for multiple architectures. In particular, we show how to write and verify generic crypto code in the F* programming language that exploits single-instruction multiple data (SIMD) parallelism. We show how this code can be compiled to platforms that support vector instructions, including ARM Neon and Intel AVX, AVX2, and AVX512. We apply our methodology to obtain verified vectorized implementations on all these platforms for the ChaCha20 encryption algorithm, the Poly1305 one-time MAC, and the SHA-2 and Blake2 families of hash algorithms.
A distinctive feature of our approach is that we aggressively share code and verification effort between scalar and vectorized code, between vectorized code for different platforms, and between implementations of different cryptographic primitives. By doing so, we significantly reduce the manual effort needed to add new implementations to our verified library. In this paper, we describe our methodology and verification results, evaluate the performance of our code, and describe its integration into the HACL* crypto library. Our vectorized code has already been incorporated into several software projects, including the Firefox web browser.
CheckDP: An Automated and Integrated Approach for Proving Differential Privacy or Finding Precise Counterexamples
We propose CheckDP, an automated and integrated approach for proving or disproving claims that a mechanism is differentially private. CheckDP can find counterexamples for mechanisms with subtle bugs for which prior counterexample generators have failed. Furthermore, it was able to automatically generate proofs for correct mechanisms for which no formal verification was reported before. CheckDP is built on static program analysis, allowing it to be more efficient and precise in catching infrequent events than sampling based counterexample generators (which run mechanisms hundreds of thousands of times to estimate their output distribution). Moreover, its sound approach also allows automatic verification of correct mechanisms. When evaluated on standard benchmarks and newer privacy mechanisms, CheckDP generates proofs (for correct mechanisms) and counterexamples (for incorrect mechanisms) within 70 seconds without any false positives or false negatives.
WebAuthn, forming part of FIDO2, is a W3C standard for strong authentication, which employs digital signatures to authenticate web users whilst preserving their privacy. Owned by users, WebAuthn authenticators generate attested and unlinkable public-key credentials for each web service to authenticate users. Since the loss of authenticators prevents users from accessing web services, usable recovery solutions preserving the original WebAuthn design choices and security objectives are urgently needed. We examine Yubico's recent proposal for recovering from the loss of a WebAuthn authenticator by using a secondary backup authenticator. We analyse the cryptographic core of their proposal by modelling a new primitive, called Asynchronous Remote Key Generation (ARKG), which allows some primary authenticator to generate unlinkable public keys for which the backup authenticator may later recover corresponding private keys. Both processes occur asynchronously without the need for authenticators to export or share secrets, adhering to WebAuthn's attestation requirements. We prove that Yubico's proposal achieves our ARKG security properties under the discrete logarithm and PRF-ODH assumptions in the random oracle model. To prove that recovered private keys can be used securely by other cryptographic schemes, such as digital signatures or encryption schemes, we model compositional security of ARKG using composable games by Brzuska et al. (ACM CCS 2011), extended to the case of arbitrary public-key protocols. As well as being more general, our results show that private keys generated by ARKG may be used securely to produce unforgeable signatures for challenge-response protocols, as used in WebAuthn. We conclude our analysis by discussing concrete instantiations behind Yubico's ARKG protocol, its integration with the WebAuthn standard, performance, and usability aspects.
SESSION: Session 3E: Fuzzing/Trusted Execution Environments
Fuzzing is an increasingly popular technique for verifying software functionalities and finding security vulnerabilities. However, current mutation-based fuzzers cannot effectively test database management systems (DBMSs), which strictly check inputs for valid syntax and semantics. Generation-based testing can guarantee the syntax correctness of the inputs, but it does not utilize any feedback, like code coverage, to guide the path exploration.
In this paper, we develop Squirrel, a novel fuzzing framework that considers both language validity and coverage feedback to test DBMSs. We design an intermediate representation (IR) to maintain SQL queries in a structural and informative manner. To generate syntactically correct queries, we perform type-based mutations on IR, including statement insertion, deletion and replacement. To mitigate semantic errors, we analyze each IR to identify the logical dependencies between arguments, and generate queries that satisfy these dependencies. We evaluated Squirrel on four popular DBMSs: SQLite, MySQL, PostgreSQL and MariaDB. Squirrel found 51 bugs in SQLite, 7 in MySQL and 5 in MariaDB. 52 of the bugs are fixed with 12 CVEs assigned. In our experiment, Squirrel achieves 2.4×-243.9× higher semantic correctness than state-of-the-art fuzzers, and explores 2.0×-10.9× more new edges than mutation-based tools. These results show that Squirrel is effective in finding memory errors of database management systems.
The DOM engine of a web browser is a popular attack surface and has been thoroughly fuzzed during its development. A common approach adopted by the latest DOM fuzzers is to generate new inputs based on context-free grammars. However, such a generative approach fails to capture the data dependencies in the inputs of a DOM engine, namely, HTML documents. Meanwhile, it is unclear whether or not coverage-guided mutation, which is well-known to be effective in fuzzing numerous software, still remains to be effective against DOM engines. Worse yet, existing DOM fuzzers cannot adopt a coverage-guided approach because they are unable to fully support HTML mutation and suffer from low browser throughput. To scientifically understand the effectiveness and limitations of the two approaches, we propose FreeDom, a full-fledged cluster-friendly DOM fuzzer that works with both generative and coverage-guided modes. FreeDom relies on a context-aware intermediate representation to describe HTML documents with proper data dependencies. FreeDom also exhibits up to 3.74x higher throughput through browser self-termination. FreeDom has found 24 previously unknown bugs in commodity browsers including Safari, Firefox, and Chrome, and 10 CVEs has been assigned so far. With the context-aware generation, FreeDom finds 3x more unique crashes in WebKit than the state-of-the-art DOM fuzzer, Domato. FreeDom guided by coverage is more effective in revealing new code blocks (2.62%) and finds three complex bugs that its generative approach fails to find. However, coverage-guided mutation that bootstraps with an empty corpus triggers 3.8x fewer unique crashes than the generative approach. The newly revealed coverage, more often than not, negatively affects the effectiveness of DOM fuzzers in bug finding. Therefore, we consider context-aware generation the best practice to find more DOM engine bugs and expect further improvement on coverage-guided DOM fuzzing facilitated by FreeDom.
Online gaming, with a reported 152 billion US dollar market, is immensely popular today. One of the critical issues in multiplayer online games is cheating, in which a player uses an illegal methodology to create an advantage beyond honest game play. For example, wallhacks, the main focus of this work, animate enemy objects on a cheating player's screen, despite being actually hidden behind walls (or other occluding objects). Since such cheats discourage honest players and cause game companies to lose revenue, gaming companies deploy mitigation solutions alongside game applications on the player's machine. However, their solutions are fundamentally flawed since they are deployed on a machine where the attacker has absolute control.
The traditional usage of ARM TrustZone has difficulty on solving the conflicts between the manufacturers that want to minimize the trusted computing base by constraining the installation of third-party applications in the secure world and the third-party application developers who prefer to have the freedom of installing their applications into the secure world. To address this issue, researchers propose to create Isolated Execution Environments (called IEEs) in the normal world to protect the security-sensitive applications. In this paper, we perform a systematic study on the IEE data protection models and the ARM cache attributes, and discover three cache-based attacks called CITM that can be leveraged to manipulate the sensitive data protected in IEEs. Specifically, due to the inefficient and incoherent security measures on the cache that maps to the IEE memory (i.e., memory designated for IEEs), attackers in the normal world may compromise the security of IEE data by manipulating the IEE memory during concurrent execution, bypassing the security measures enforced when a security-sensitive application is suspended or finished, or misusing the incomplete security measures during IEE's context switching processes. We conduct case studies of CITM attacks on three well-known IEE systems including SANCTUARY, Ginseng, and TrustICE to illustrate the feasibility to exploit them on real hardware testbeds. Finally, we analyze the root causes of the CITM attacks and propose a countermeasure to defeat them. The experimental results show that our defense scheme has a small overhead.
SESSION: Session 4A: Post-Quantum Cryptography
Most blockchain solutions are susceptible to quantum attackers as they rely on cryptography that is known to be insecure in the presence of quantum adversaries. In this work we advance the study of quantum-resistant blockchain solutions by giving a quantum-resistant construction of a deterministic wallet scheme. Deterministic wallets are frequently used in practice in order to secure funds by storing the sensitive secret key on a so-called cold wallet that is not connected to the Internet. Recently, Das et al. (CCS'19) developed a formal model for the security analysis of deterministic wallets and proposed a generic construction from certain types of signature schemes that exhibit key rerandomization properties. We revisit the proposed classical construction in the presence of quantum adversaries and obtain the following results.
First, we give a generic wallet construction with security in the quantum random oracle model (QROM) if the underlying signature scheme is secure in the QROM. We next design the first post-quantum secure signature scheme with rerandomizable public keys by giving a construction from generic lattice-based Fiat-Shamir signature schemes. Finally, we show and evaluate the practicality by analyzing an instantiation of the wallet scheme based on the signature scheme qTESLA (ACNS'20).
MPC-in-the-head based protocols have recently gained much popularity and are at the brink of seeing widespread usage. With widespread use come the spectres of implementation issues and implementation attacks such as side-channel attacks. We show that implementations of protocols implementing the MPC-in-the-head paradigm are vulnerable to side-channel attacks. As a case study, we choose the ZKBoo-protocol of Giacomelli, Madsen, and Orlandi (USENIX 2016) and show that even a single leaked value is sufficient to break the security of the protocol. To show that this attack is not just a theoretical vulnerability, we apply differential power analysis to show the vulnerabilities via a simulation.
In order to remedy this situation, we extend and generalize the ZKBoo-protocol by making use of the notion of strong non-interference of Barthe et al. (CCS 2016). To apply this notion to ZKBoo, we construct novel versions of strongly non-interfering gadgets that balance the randomness across the different branches evenly. Finally, we show that each circuit can be decomposed into branches using only these balanced strongly non-interfering gadgets. This allows us to construct a version of ZKBoo, called $(n+1)$-ZKBoo which is secure against side-channel attacks with limited overhead in both signature-size and running time. Furthermore, $(n+1)$-ZKBoo is scalable to the desired security against adversarial probes. We experimentally confirm that the attacks successful against ZKBoo no longer work on $(n+1)$-ZKBoo. Moreover, we present an extensive performance analysis and quantify the overhead of our scheme using a practical implementation.
We present a novel lattice-based zero-knowledge proof system for showing that (arbitrary-sized) committed integers satisfy additive and multiplicative relationships. The proof sizes of our schemes are between two to three orders of magnitude smaller than in the lattice proof system of Libert et al. (CRYPTO 2018) for the same relations. Because the proof sizes of our protocols grow linearly in the integer length, our proofs will eventually be longer than those produced by quantum-safe succinct proof systems for general circuits (e.g. Ligero, Aurora, etc.). But for relations between reasonably-sized integers (e.g. $512$-bit), our proofs still result in the smallest zero-knowledge proof system based on a quantum-safe assumption. Of equal importance, the run-time of our proof system is at least an order of magnitude faster than any other quantum-safe scheme.
Post-quantum schemes are expected to replace existing public-key schemes within a decade in billions of devices. To facilitate the transition, the US National Institute for Standards and Technology (NIST) is running a standardization process. Multivariate signatures is one of the main categories in NIST's post-quantum cryptography competition. Among the four candidates in this category, the LUOV and Rainbow schemes are based on the Oil and Vinegar scheme, first introduced in 1997 which has withstood over two decades of cryptanalysis. Beyond mathematical security and efficiency, security against side-channel attacks is a major concern in the competition. The current sentiment is that post-quantum schemes may be more resistant to fault-injection attacks due to their large key sizes and the lack of algebraic structure. We show that this is not true. We introduce a novel hybrid attack, QuantumHammer, and demonstrate it on the constant-time implementation of LUOV currently in Round 2 of the NIST post-quantum competition. The QuantumHammer attack is a combination of two attacks, a bit-tracing attack enabled via Rowhammer fault injection and a divide and conquer attack that uses bit-tracing as an oracle. Using bit-tracing, an attacker with access to faulty signatures collected using Rowhammer attack, can recover secret key bits albeit slowly. We employ a divide and conquer attack which exploits the structure in the key generation part of LUOV and solves the system of equations for the secret key more efficiently with few key bits recovered via bit-tracing. We have demonstrated the first successful in-the-wild attack on LUOV recovering all 11K key bits with less than 4 hours of an active Rowhammer attack. The post-processing part is highly parallel and thus can be trivially sped up using modest resources. QuantumHammer does not make any unrealistic assumptions, only requires software co-location (no physical access), and therefore can be used to target shared cloud servers or in other sandboxed environments.
SESSION: Session 4B: Physical Attacks
This study presents a new TEMPEST threat that an attacker can surreptitiously obtain original plain audio information from a distance by exploiting recently emerging unintentional electromagnetic (EM) radiations. As lightweight sensor-based Internet of things (IoT) services become widespread, a mixed-signal system on chip (MSoC) spontaneously integrates all components, such as digital, analog, and even power circuits, into a single chipset to minimize the size of IoT devices. Accordingly, we pay attention to the accelerated integration of a switching regulator (SWREG), which is one of the typical power circuits and may substantially increase the unintentional EM leakages, re-enabling the audio TEMPEST attack.
In this paper, we posit that a root cause of new audio coupled EM leakages is the unavoidable integration of SWREG which innately has strong and low-frequency (i.e., several MHz) switching noises; an audio signal is conductively coupled on the single common substrate of an MSoC with a system clock and the newly emerging the SWREG noises. The unique features of the suggested EM leakages compared to previous leakages are that their frequency distribution is dense (i.e., at frequency intervals of the SWREG noise), wideband (i.e., from several MHz to over 1 GHz), and static (i.e., time-invariant center frequencies). These features make the new TEMPEST attack due to the SWREG noise have a longer attack range and be more robust to interferences. Consequently, the presented TEMPEST attack becomes considerably practical.
To verify the new TEMPEST attack due to the SWREG noise, we first perform a feasibility analysis by measuring and analyzing the audio-conveyed EM emanations of the popular MSoCs in an anechoic chamber. Next, we demonstrate how critical and practical the threat is by capturing the leakages from the commercial devices in an office environment. Furthermore, we propose a new signal reinforcement method with the three benefits (dense, wideband, and static) of the suggested radiations: the spectral addition of phase-aligned signals. The experimental results show that the test sweep tones of the Sogou voice recorder (nRF52810 chipset) and Xiaomi earbuds (CSR8640 chipset) can be reconstructed over 10 meters. Additionally, an attack feasibility analysis on digital signal (I2C) is performed in a short-range. The overall results indicate that the new TEMPEST attack becomes more practical than the previous side-channel analysis. Finally, we suggest several technical countermeasures that help to design safe IoT devices.
When the Differences in Frequency Domain are Compensated: Understanding and Defeating Modulated Replay Attacks on Automatic Speech Recognition
Automatic speech recognition (ASR) systems have been widely deployed in modern smart devices to provide convenient and diverse voice-controlled services. Since ASR systems are vulnerable to audio replay attacks that can spoof and mislead ASR systems, a number of defense systems have been proposed to identify replayed audio signals based on the speakers' unique acoustic features in the frequency domain. In this paper, we uncover a new type of replay attack called modulated replay attack, which can bypass the existing frequency domain based defense systems. The basic idea is to compensate for the frequency distortion of a given electronic speaker using an inverse filter that is customized to the speaker's transform characteristics. Our experiments on real smart devices confirm the modulated replay attacks can successfully escape the existing detection mechanisms that rely on identifying suspicious features in the frequency domain. To defeat modulated replay attacks, we design and implement a countermeasure named DualGuard. We discover and formally prove that no matter how the replay audio signals could be modulated, the replay attacks will either leave ringing artifacts in the time domain or cause spectrum distortion in the frequency domain. Therefore, by jointly checking suspicious features in both frequency and time domains, DualGuard~can successfully detect various replay attacks including the modulated replay attacks. We implement a prototype of DualGuard~on a popular voice interactive platform, ReSpeaker Core v2. The experimental results show DualGuard~can achieve 98% accuracy on detecting modulated replay attacks.
AdvPulse: Universal, Synchronization-free, and Targeted Audio Adversarial Attacks via Subsecond Perturbations
Existing efforts in audio adversarial attacks only focus on the scenarios where an adversary has prior knowledge of the entire speech input so as to generate an adversarial example by aligning and mixing the audio input with corresponding adversarial perturbation. In this work we consider a more practical and challenging attack scenario where the intelligent audio system takes streaming audio inputs (e.g., live human speech) and the adversary can deceive the system by playing adversarial perturbations simultaneously. This change in attack behavior brings great challenges, preventing existing adversarial perturbation generation methods from being applied directly. In practice, (1) the adversary cannot anticipate what the victim will say: the adversary cannot rely on their prior knowledge of the speech signal to guide how to generate adversarial perturbations; and (2) the adversary cannot control when the victim will speak: the synchronization between the adversarial perturbation and the speech cannot be guaranteed. To address these challenges, in this paper we propose AdvPulse, a systematic approach to generate subsecond audio adversarial perturbations, that achieves the capability to alter the recognition results of streaming audio inputs in a targeted and synchronization-free manner. To circumvent the constraints on speech content and time, we exploit penalty-based universal adversarial perturbation generation algorithm and incorporate the varying time delay into the optimization process. We further tailor the adversarial perturbation according to environmental sounds to make it inconspicuous to humans. Additionally, by considering the sources of distortions occurred during the physical playback, we are able to generate more robust audio adversarial perturbations that can remain effective even under over-the-air propagation. Extensive experiments on two representative types of intelligent audio systems (i.e., speaker recognition and speech command recognition) are conducted in various realistic environments. The results show that our attack can achieve an average attack success rate of over 89.6% in indoor environments and 76.0% in inside-vehicle scenarios even with loud engine and road noises.
Wearable devices that capture user's rich information regarding their health conditions and daily activities have unmet pairing needs. Today's solutions, which primarily rely on human involvement, are cumbersome, error-prone, and do not scale well. Despite some prior efforts trying to fill this gap, they either rely on some sophisticated sensors, such as electromyogram (EMG) or electrocardiogram (ECG) pads that may not universally exist, or non-trivial design of communication transceivers that cannot be found easily on current commercial devices. Therefore, a pairing scheme for wearable devices that is secure, practical, and convenient is in dire need. In this paper, we propose a novel approach that leverages ambient radio frequency (RF) noise. Our design is based on a key observation that received RF noise power measured in the logarithmic scale at different parts of a human body surface experience the same variation trend, whereas those from different human bodies or off the body are distinct. Wearables make use of the observed noise as the entropy source for the proposed pairing protocol. Extensive experiments show that our scheme has an equal error rate (EER) as low as 1.4% for pairing. Its key generation rate reaches 138 bits/sec, which beats so-far existing pairing schemes. Besides, our scheme can be efficiently executed within 0.97 s. Its incurred energy consumption is as low as 0.27 J for the entire pairing procedure.
SESSION: Session 4C: Kernel Security
Open-source kernels have been adopted by massive downstream vendors on billions of devices. However, these vendors often omit or delay the adoption of patches released in the mainstream version. Even worse, many vendors are not publicizing the patching progress or even disclosing misleading information. However, patching status is critical for groups (e.g., governments and enterprise users) that are keen to security threats. Such a practice motivates the need for reliable patch presence testing for downstream kernels. Currently, the best means of patch presence testing is to examine the existence of a patch in the target kernel by using the code signature match. However, such an approach cannot address the key challenges in practice. Specifically, downstream vendors widely customize the mainstream code and use non-standard building configurations, which often change the code around the patching sites such that the code signatures are ineffective.
In this work, we propose PDiff, a system to perform highly reliable patch presence testing with downstream kernel images. Technically speaking, PDiff generates summaries carrying the semantics related to a target patch. Based on the semantic summaries, PDiff compares the target kernel with its mainstream version before and after the adoption of the patch, preferring the closer reference version to determine the patching status. Unlike previous research on patch presence testing, our approach examines similarity based on the semantics of patches and therefore, provides high tolerance to code-level variations. Our test with 398 kernel images corresponding to 51 patches shows that PDiff can achieve high accuracy with an extremely low rate of false negatives and zero false positives. This significantly outperforms the state-of-the-art tool. More importantly, PDiff demonstrates consistently high effectiveness when code customization and non-standard building configurations occur.
Recent research has proposed various methods to perform kernel exploitation and bypass kernel protection. For example, security researchers have demonstrated an exploitation method that utilizes the characteristic of elastic kernel objects to bypass KASLR, disclose stack/heap cookies, and even perform arbitrary read in the kernel. While this exploitation method is considered a commonly adopted approach to disclosing critical kernel information, there is no evidence indicating a strong need for developing a new defense mechanism to limit this exploitation method. It is because the effectiveness of this exploitation method is demonstrated only on anecdotal kernel vulnerabilities. It is unclear whether such a method is useful for a majority of kernel vulnerabilities.
To answer this question, we propose a systematic approach. It utilizes static/dynamic analysis methods to pinpoint elastic kernel objects and then employs constraint solving to pair them to corresponding kernel vulnerabilities. In this work, we implement our proposed method as a tool - ELOISE. Using this tool on three popular OSes (Linux, FreeBSD, and XNU), we discover that elastic objects are pervasive in general caches. Evaluating the effectiveness of these elastic objects on 40 kernel vulnerabilities across three OSes, we observe that they can enable most of the vulnerabilities to bypass KASLR and heap cookie protector. Besides, we also observe that these elastic objects can even escalate the exploitability of some vulnerabilities allowing them to perform arbitrary read in the kernel. Motivated by these observations, we further introduce a new defense mechanism to mitigate the threat of elastic kernel objects. We prototype our defense mechanism on Linux, showing this mechanism introduces negligible overhead.
Drivers on Apple OSes (e.g., iOS, tvOS, iPadOS, macOS, etc.) run in the kernel space and driver vulnerabilities can incur serious security consequences. A recent report from Google Project Zero shows that driver vulnerabilities on Apple OSes have been actively exploited in the wild. Also, we observed that driver vulnerabilities have accounted for one-third of kernel bugs in recent iOS versions based on Apple's security updates. Despite the serious security implications, systematic static analysis on Apple drivers for finding security vulnerabilities has never been done before, not to mention any large-scale study of Apple drivers.
In this paper, we developed the first automatic, static analysis tool iDEA for finding bugs in Apple driver binaries, which is applicable to major Apple OSes (iOS, macOS, tvOS, iPadOS). We summarized and tackled a set of Apple-unique challenges: for example, we show that prior C++ binary analysis techniques are ineffective (i.e., failing to recover C++ classes and resolve indirect calls) on Apple platform due to Apple's unique programming model. To solve the challenges, we found a reliable information source from Apple's driver programming and management model to recover classes, and identified the unique paradigms through which Apple drivers interact with user-space programs. iDEA supports customized, pluggable security policy checkers for its security analysis. Enabled by iDEA, we performed the first large-scale study of 3,400 Apple driver binaries across major Apple OSes and 15 OS versions with respect to two common types of security risks - race condition and out-of-bound read/write, and discovered 35 zero-day bugs. We developed PoC and end-to-end attacks to demonstrate the practical impacts of our findings. A portion of the bugs have been patched by recent Apple security updates or are scheduled to be fixed; others are going through Apple's internal investigation procedure. Our evaluation showed that iDEA incurs a low false-positive rate and time overhead.
Operating system (OS) kernels frequently encounter various errors due to invalid internal states or external inputs. To ensure the security and reliability of OS kernels, developers propose a diverse set of mechanisms to conservatively capture and handle potential errors. Existing research has thus primarily focused on the completeness and adequacy of error handling to not miss the attention. However, we find that handling an error with an over-severe level (e.g., unnecessarily terminating the execution) instead hurts the security and reliability. In this case, the error-handling consequences are even worse than the error it attempts to resolve. We call such a case Exaggerated Error Handling (EEH). The security impacts of EEH bugs vary, including denial-of-service, data losses, broken control-flow integrity, memory leaks, etc. Despite its significance, detecting EEH remains an unexplored topic.
In this paper, we first conduct an in-depth study on EEH. Based on the findings of the study, we then propose an approach, EeCatch, to detect EEH bugs in a context-aware manner. EeCatch accurately identifies errors and extracts their contexts (both spatial and temporal), and automatically infers the appropriate severity level for error handling. Using the inferred severity level, EeCatch finally detects EEH bugs in which the used error handling exceeds the inferred severity level. By analyzing the whole Linux kernel, EeCatch reports hundreds of potential EEH bugs that may cause security issues such as crashing the system. After evaluating 104 cases reported by EeCatch, we manually confirmed 64 EEH bugs and submitted patches for all of them. Using our patches, Linux maintainers have fixed 48 reported EEH bugs, confirming the effectiveness of EeCatch. To the best of our knowledge, we are the first to systematically study and detect EEH bugs. We hope the findings could raise the awareness of the critical consequences of EEH bugs to help developers avoid them.
SESSION: Session 4D: Distributed Protocols
Secure search looks for and retrieves records from a (possibly cloud-hosted) encrypted database while ensuring the confidentiality of the queries. Researchers are paying increasing attention to secure search in recent years due to the growing concerns about database privacy. However, the low efficiency of (especially multiplicative) homomorphic operations in secure search has hindered its deployment in practice. To address this issue, Akavia et al. [CCS 2018, PETS 2019] proposed new protocols that bring down the number of multiplications in the search algorithm from O(n2) to O(n log2 n), and then to O(n log n), where n is the size of the database.
In this paper, we present the first secure search protocol -- LEAF and its variant LEAF+ -- which only requires $O(n)$ multiplications. Specifically, at the core of LEAF are three novel methods we propose, referred to as Localization, Extraction, and Reconstruction. In addition, LEAF enjoys low communication complexity and only requires the client to perform decryption, which adds its advantage in deployment on weak-power devices such as mobile phones.
Anonymous Committed Broadcast is a functionality that extends DC-nets and allows a set of clients to privately commit messages to set of servers, which can then simultaneously open all committed messages in a random ordering. Anonymity holds since no one can learn the ordering or the content of the client's committed message. We present Blinder, the first system that provides a scalable and fully robust solution for anonymous committed broadcast. Blinder maintains both properties of security (anonymity) and robustness (aka. 'guaranteed output delivery' or 'availability') in the face of a global active (malicious) adversary. Moreover, Blinder is censorship resistant, that is, an honest client cannot be blocked from participating. Blinder obtains its security and scalability by carefully combining classical and state-of-the-art techniques from the fields of anonymous communication and secure multiparty computation (MPC). Relying on MPC for such a system is beneficial since it naturally allows the parties (servers) to enforce some properties on accepted messages prior their publication. A GPU based implementation of Blinder with 5 servers, which accepts 1 million clients, incurs a latency of less than 8 minutes; faster by a factor of $>100$ than the 3-servers Riposte protocol (S&P '15), which is not robust and not censorship resistant; we get an even larger factor when comparing to AsynchroMix and PowerMix (CCS '19), which are the only ones that guarantee fairness (or robustness in the online phase).
Secure aggregation is a cryptographic primitive that enables a server to learn the sum of the vector inputs of many clients. Bonawitz et al. (CCS 2017) presented a construction that incurs computation and communication for each client linear in the number of parties. While this functionality enables a broad range of privacy preserving computational tasks, scaling concerns limit its scope of use. We present the first constructions for secure aggregation that achieve polylogarithmic communication and computation per client. Our constructions provide security in the semi-honest and the semi-malicious settings where the adversary controls the server and a δ-fraction of the clients, and correctness with up to δ-fraction dropouts among the clients. Our constructions show how to replace the complete communication graph of Bonawitz et al., which entails the linear overheads, with a k-regular graph of logarithmic degree while maintaining the security guarantees. Beyond improving the known asymptotics for secure aggregation, our constructions also achieve very efficient concrete parameters. The semi-honest secure aggregation can handle a billion clients at the per-client cost of the protocol of Bonawitz et al. for a thousand clients. In the semi-malicious setting with 10 4 clients, each client needs to communicate only with 3% of the clients to have a guarantee that its input has been added together with the inputs of at least 5000 other clients, while withstanding up to 5% corrupt clients and 5% dropouts. We also show an application of secure aggregation to the task of secure shuffling which enables the first cryptographically secure instantiation of the shuffle model of differential privacy.
We present a scalable protocol for database joins on secret shared data in the honest-majority three-party setting. The key features of our protocol are a rich set of SQL-like join/select queries and the ability to compose join operations together due to the inputs and outputs being generically secret shared between the parties. Provided that all joins operate on unique primary keys, no information is revealed to any party during the protocol. In particular, not even the sizes of intermediate joins are revealed. All of our protocols are constant-round and achieve O(n) communication and computation overhead for joining two tables of n rows.
These properties make our protocol ideal for outsourced secure computation. In this setting several non-colluding servers are setup and the input data is shared among them. These servers then perform the relevant secret shared computation and output the result. This model has recently been gaining traction in industry, e.g. Facebook's Crypten, Cape Privacy's TFEncrypted, Mozilla Telemetry.
We additionally implement two applications on top of our framework. The first application detects voter registration errors within and between agencies of 50 US states, in a privacy-preserving manner. The second application allows several organizations to compare network security logs to more accurately identify common security threats, e.g. the IP addresses of a bot net. In both cases, the practicality of these applications depends on efficiently performing joins on millions of secret shared records. For example, our three party protocol can perform a join on two sets of 1 million records in 4.9 seconds or, alternatively, compute the cardinality of this join in just 3.1 seconds.
SESSION: Session 4E: Network Security
The Boon and Bane of Cross-Signing: Shedding Light on a Common Practice in Public Key Infrastructures
Public Key Infrastructures (PKIs) with their trusted Certificate Authorities (CAs) provide the trust backbone for the Internet: CAs sign certificates which prove the identity of servers, applications, or users. To be trusted by operating systems and browsers, a CA has to undergo lengthy and costly validation processes. Alternatively, trusted CAs can cross-sign other CAs to extend their trust to them. In this paper, we systematically analyze the present and past state of cross-signing in the Web PKI. Our dataset (derived from passive TLS monitors and public CT logs) encompasses more than 7 years and 225 million certificates with 9.3 billion trust paths. We show benefits and risks of cross-signing. We discuss the difficulty of revoking trusted CA certificates where, worrisome, cross-signing can result in valid trust paths to remain after revocation; a problem for non-browser software that often blindly trusts all CA certificates and ignores revocations. However, cross-signing also enables fast bootstrapping of new CAs, e.g., Let's Encrypt, and achieves a non-disruptive user experience by providing backward compatibility. In this paper, we propose new rules and guidance for cross-signing to preserve its positive potential while mitigating its risks.
In recent years, the security implication of stale NS records, which point to a nameserver that no longer resolves the domain, has been unveiled. Prior research studied the stale DNS records that point to expired domains. The popularity of DNS hosting services brings in a new category of stale NS records, which reside in the domain's zone (instead of the TLD zone) for an active domain. To the best of our knowledge, the security risk of this kind of stale NS record has never been studied before. In our research, we show that this new type of stale NS record can be practically exploited, causing a stealthier hijack of domains associated with the DNS hosting service. We also performed a large-scale analysis on over 1M high-profile domains, 17 DNS hosting providers and 12 popular public resolver operators to confirm the prevalence of this security risk. Our research further discovers 628 hijackable domains (e.g., 6 government entities and 2 payment services), 14 affected DNS hosting providers (e.g., Amazon Route 53), and 10 vulnerable public resolver operators (e.g., CloudFlare). Furthermore, we conducted an in-depth measurement analysis on them, thus providing a better understanding of this new security risk. Also, we explore the mitigation techniques that can be adopted by different affected parties.
In this paper, we uncover a new off-path TCP hijacking attack that can be used to terminate victim TCP connections or inject forged data into victim TCP connections by manipulating the new mixed IPID assignment method, which is widely used in Linux kernel version 4.18 and beyond to help defend against TCP hijacking attacks. The attack has three steps. First, an off-path attacker can downgrade the IPID assignment for TCP packets from the more secure per-socket-based policy to the less secure hash-based policy, building a shared IPID counter that forms a side channel on the victim. Second, the attacker detects the presence of TCP connections by observing the shared IPID counter on the victim. Third, the attacker infers the sequence number and the acknowledgment number of the detected connection by observing the side channel of the shared IPID counter. Consequently, the attacker can completely hijack the connection, i.e., resetting the connection or poisoning the data stream. We evaluate the impacts of this off-path TCP attack in the real world. Our case studies of SSH DoS, manipulating web traffic, and poisoning BGP routing tables show its threat on a wide range of applications. Our experimental results show that our off-path TCP attack can be constructed within 215 seconds and the success rate is over 88%. Finally, we analyze the root cause of the exploit and develop a new IPID assignment method to defeat this attack. We prototype our defense in Linux 4.18 and confirm its effectiveness through extensive evaluation over real applications on the Internet.
In this paper, we report a series of flaws in the software stack that leads to a strong revival of DNS cache poisoning --- a classic attack which is mitigated in practice with simple and effective randomization-based defenses such as randomized source port. To successfully poison a DNS cache on a typical server, an off-path adversary would need to send an impractical number of $2^32 $ spoofed responses simultaneously guessing the correct source port (16-bit) and transaction ID (16-bit). Surprisingly, we discover weaknesses that allow an adversary to "divide and conquer'' the space by guessing the source port first and then the transaction ID (leading to only $2^16 +2^16 $ spoofed responses). Even worse, we demonstrate a number of ways an adversary can extend the attack window which drastically improves the odds of success. The attack affects all layers of caches in the DNS infrastructure, such as DNS forwarder and resolver caches, and a wide range of DNS software stacks, including the most popular BIND, Unbound, and dnsmasq, running on top of Linux and potentially other operating systems. The major condition for a victim being vulnerable is that an OS and its network is configured to allow ICMP error replies. From our measurement, we find over 34% of the open resolver population on the Internet are vulnerable (and in particular 85% of the popular DNS services including Google's 184.108.40.206). Furthermore, we comprehensively validate the proposed attack with positive results against a variety of server configurations and network conditions that can affect the success of the attack, in both controlled experiments and a production DNS resolver (with authorization).
SESSION: Session 5A: User Authentication
We use biometrics like fingerprints and facial images to identify ourselves to our mobile devices and log on to applications everyday. Such authentication is internal-facing: we provide measurement on the same device where the template is stored. If our personal devices could participate in external-facing authentication too, where biometric measurement is captured by a nearby external sensor, then we could also enjoy a frictionless authentication experience in a variety of physical spaces like grocery stores, convention centers, ATMs, etc. The open setting of a physical space brings forth important privacy concerns though. We design a suite of secure protocols for external-facing authentication based on the cosine similarity metric which provide privacy for both user templates stored on their devices and the biometric measurement captured by external sensors in this open setting. The protocols provide different levels of security, ranging from passive security with some leakage to active security with no leakage at all. With the help of new packing techniques and zero-knowledge proofs for Paillier encryption -- and careful protocol design, our protocols achieve very practical performance numbers. For templates of length 256 with elements of size 16 bits each, our fastest protocol takes merely 0.024 seconds to compute a match, but even the slowest one takes no more than 0.12 seconds. The communication overhead of our protocols is very small too. The passive and actively secure protocols (with some leakage) need to exchange just 16.5KB and 27.8KB of data, respectively. The first message is designed to be reusable and, if sent in advance, would cut the overhead down to just 0.5KB and 0.8KB, respectively.
The steady reports of privacy invasions online paints a picture of the Internet growing into a more dangerous place. This is supported by reports of the potential scale for online harms facilitated by the mass deployment of online technology and by the data-intensive web. While Internet users often express concern about privacy, some report taking actions to protect their privacy online. We investigate the methods and technologies that individuals employ to protect their privacy online. We conduct two studies, of N=180 and N=907, to elicit individuals' use of privacy methods, within the US, the UK and Germany. We find that non-technology methods are among the most used methods in the three countries. We identify distinct groupings of privacy methods usage in a cluster map. The map shows that together with non-technology methods of privacy protection, simple PETs that are integrated in services, form the most used cluster, whereas more advanced PETs form a different, least used cluster. We further investigate user perception and reasoning for mostly using one set of PETs, in a third study with N=183 participants. We do not find a difference in perceived competency in protecting privacy online between advanced and simpler PETs users. We compare use perceptions between advanced and simpler PETs and report on user reasoning for not using advanced PETs, as well as support needed for potential use. This paper contributes to privacy research by eliciting use and perception of use across 43 privacy methods, including 26 PETs across three countries and provides a map of PETs usage. The cluster map provides a systematic and reliable point of reference for future user-centric investigations across PETs. Overall, this research provides a broad understanding of use and perceptions across a collection of PETs, and can lead to future research for scaling use of PETs.
The development of deep learning techniques has significantly increased the ability of computers to recognize CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart), thus breaking or mitigating the security of existing captcha schemes. To protect against these attacks, recent works have been proposed to leverage adversarial machine learning to perturb captcha pictures. However, they either require the prior knowledge of captcha solving models or lack adaptivity to the evolving behaviors of attackers. Most importantly, none of them has been deployed in practical applications, and their practical applicability and effectiveness are unknown.
In this work, we introduce advCAPTCHA, a practical adversarial captcha generation system that can defend against deep learning based captcha solvers, and deploy it on a large-scale online platform with near billion users. To the best of our knowledge, this is the first such work that has been deployed on international large-scale online platforms. By applying adversarial learning techniques in a novel manner, advCAPTCHA can generate effective adversarial captchas to significantly reduce the success rate of attackers, which has been demonstrated by a large-scale online study. Furthermore, we also validate the feasibility of advCAPTCHA in practical applications, as well as its robustness in defending against various attacks. We leverage the existing user risk analysis system to identify potential attackers and serve advCAPTCHA to them. We then use their answers as queries to the attack model. In this manner, advCAPTCHA can be adapted/fine-tuned to accommodate the attack model evolution. Overall, advCAPTCHA can serve as a key enabler for generating robust captchas in practice and providing useful guidelines for captcha developers and practitioners.
Practical Recommendations for Stronger, More Usable Passwords Combining Minimum-strength, Minimum-length, and Blocklist Requirements
Multiple mechanisms exist to encourage users to create stronger passwords, including minimum-length and character-class requirements, prohibiting blocklisted passwords, and giving feedback on the strength of candidate passwords. Despite much research, there is little definitive, scientific guidance on how these mechanisms should be combined and configured to best effect. Through two online experiments, we evaluated combinations of minimum-length and character-class requirements, blocklists, and a minimum-strength requirement that requires passwords to exceed a strength threshold according to neural-network-driven password-strength estimates.
Our results lead to concrete recommendations for policy configurations that produce a good balance of security and usability. In particular, for high-value user accounts we recommend policies that combine minimum-strength and minimum-length requirements. While we offer recommendations for organizations required to use blocklists, using blocklists does not provide further gains. Interestingly, we also find that against expert attackers, character-class requirements, traditionally associated with producing stronger passwords, in practice may provide very little improvement and may even reduce effective security.
SESSION: Session 5B: Secure Messaging and Key Exchange
We provide a composition framework together with a variety of composition theorems allowing to split the security proof of an unbounded number of sessions of a compound protocol into simpler goals. While many proof techniques could be used to prove the subgoals, our model is particularly well suited to the Computationally Complete Symbolic Attacker (ccsA) model.
We address both sequential and parallel composition, with state passing and long term shared secrets between the protocols. We also provide with tools to reduce multi-session security to single session security, with respect to a stronger attacker. As a consequence, our framework allows, for the first time, to perform proofs in the CCSA model for an unbounded number of sessions.
To this end, we introduce the notion of O-simulation: a simulation by a machine that has access to an oracle O. Carefully managing the access to long term secrets, we can reduce the security of a composed protocol, for instance P || Q, to the security of P (resp. Q), with respect to an attacker simulating Q (resp. P) using an oracle O. As demonstrated by our case studies the oracle is most of the time quite generic and simple.
These results yield simple formal proofs of composed protocols, such as multiple sessions of key exchanges, together with multiple sessions of protocols using the exchanged keys, even when all the parts share long terms secrets (e.g. signing keys). We also provide with a concrete application to the SSH protocol with (a modified) forwarding agent, a complex case of long term shared secrets, which we formally prove secure.
The Signal Private Group System and Anonymous Credentials Supporting Efficient Verifiable Encryption
In this paper we present a system for maintaining a membership list of users in a group, designed for use in the Signal Messenger secure messaging app. The goal is to support private groups where membership information is readily available to all group members but hidden from the service provider or anyone outside the group. In the proposed solution, a central server stores the group membership in the form of encrypted entries. Members of the group authenticate to the server in a way that reveals only that they correspond to some encrypted entry, then read and write the encrypted entries.
Authentication in our design uses a primitive called a keyed-verification anonymous credential~(KVAC), and we construct a new KVAC scheme based on an algebraic MAC, instantiated in a group G of prime order. The benefit of the new KVAC is that attributes may be elements in G, whereas previous schemes could only support attributes that were integers modulo the order of G. This enables us to encrypt group data using an efficient Elgamal-like encryption scheme, and to prove in zero-knowledge that the encrypted data is certified by a credential. Because encryption, authentication, and the associated proofs of knowledge are all instantiated in G the system is efficient, even for large groups.
We present KEMTLS, an alternative to the TLS 1.3 handshake that uses key-encapsulation mechanisms (KEMs) instead of signatures for server authentication. Among existing post-quantum candidates, signature schemes generally have larger public key/signature sizes compared to the public key/ciphertext sizes of KEMs: by using an IND-CCA-secure KEM for server authentication in post-quantum TLS, we obtain multiple benefits. A size-optimized post-quantum instantiation of KEMTLS requires less than half the bandwidth of a size-optimized post-quantum instantiation of TLS 1.3. In a speed-optimized instantiation, KEMTLS reduces the amount of server CPU cycles by almost 90% compared to TLS 1.3, while at the same time reducing communication size, reducing the time until the client can start sending encrypted application data, and eliminating code for signatures from the server's trusted code base.
We investigate whether modern messaging apps achieve the strong post-compromise security guarantees offered by their underlying protocols. In particular, we perform a black-box experiment in which a user becomes the victim of a clone attack; in this attack, the user's full state (including identity keys) is compromised by an attacker who clones their device and then later attempts to impersonate them, using the app through its user interface.
Our attack should be prevented by protocols that offer post-compromise security, and thus, by all apps that are based on Signal's double-ratchet algorithm (for instance, the Signal app, WhatsApp, and Facebook Secret Conversations). Our experiments reveal that this is not the case: most deployed messaging apps fall far short of the security that their underlying mechanisms suggest.
We conjecture that this security gap is a result of many apps trading security for usability, by tolerating certain forms of desynchronization. We show that the tolerance of desynchronization necessarily leads to loss of post-compromise security in the strict sense, but we also show that more security can be retained than is currently offered in practice. Concretely, we present a modified version of the double-ratchet algorithm that tolerates forms of desynchronization while still being able to detect cloning activity. Moreover, we formally analyze our algorithm using the Tamarin prover to show that it achieves the desired security properties.
SESSION: Session 5C: Forensics
The creation and distribution of child sexual abuse materials (CSAM) involves a continuing violation of the victims? privacy beyond the original harms they document. A large volume of these materials is distributed via the Freenet anonymity network: in our observations, nearly one third of requests on Freenet were for known CSAM. In this paper, we propose and evaluate a novel approach for investigating these violations of exploited childrens' privacy. Our forensic method distinguishes whether or not a neighboring peer is the actual uploader or downloader of a file or merely a relayer. Our method requires analysis of the traffic sent to a single, passive node only. We evaluate our method extensively. Our in situ measurements of actual CSAM requests show an FPR of 0.002 ± 0.003 for identifying downloaders. And we show an FPR of 0.009 ± 0.018, a precision of 1.00 ± 0.01, and a TPR of 0.44 ± 0.01 for identifying uploaders based on in situ tests. Further, we derive expressions for the FPR and Power of our hypothesis test; perform simulations of single and concurrent downloaders; and characterize the Freenet network to inform parameter selection. We were participants in several United States Federal Court cases in which the use of our method was uniformly upheld.
Several large scale studies on the Maven, NPM, and Android ecosystems point out that many developers do not often update their vulnerable software libraries thus exposing the user of their code to security risks. The purpose of this study is to qualitatively investigate the choices and the interplay of functional and security concerns on the developers' overall decision-making strategies for selecting, managing, and updating software dependencies.
We run 25 semi-structured interviews with developers of both large and small-medium enterprises located in nine countries. All interviews were transcribed, coded, and analyzed according to applied thematic analysis. They highlight the trade-offs that developers are facing and that security researchers must understand to provide effective support to mitigate vulnerabilities (for example bundling security fixes with functional changes might hinder adoption due to lack of resources to fix functional breaking changes).
We further distill our observations to actionable implications on what algorithms and automated tools should achieve to effectively support (semi-)automatic dependency management.
We pose and study forensic analysis in the context of access control systems in a manner that prior work has not. Forensics seeks to answer questions about past states of a system, and thereby provides important clues and evidence in the event of a security incident. Access control deals with who may perform what action on a resource and is a critical security function. Our focus is access control systems that allow for changes to the authorization state to be delegated to potentially untrusted users. We argue that this context in access control is an important one in which to consider forensic analysis, and observe that it is a natural complement of safety analysis, which has been considered extensively in the literature. We pose the forensic analysis problem for such access control systems abstractly, and instantiate it for three schemes from the literature: a well-known access matrix scheme, a role-based scheme, and a discretionary scheme. We identify the computational complexity of forensic analysis, and compare it to that of safety analysis for each of the schemes. We consider also the notion of logs, i.e., data that can be collected over time to aid forensic analysis.
We present results for sufficient and minimal logs that render forensic analysis for the three schemes efficient. This motivates discussions on goal-directed logging, with the explicit intent of aiding forensic analysis. We carry out a case-study in the realistic setting of a serverless cloud application, and observe that goal-directed logging can be highly effective. Our work makes contributions at the foundations of information security, and its practical implications.
For system logs to aid in security investigations, they must be beyond the reach of the adversary. Unfortunately, attackers that have escalated privilege on a host are typically able to delete and modify log events at will. In response to this threat, a variety of secure logging systems have appeared over the years that attempt to provide tamper-resistance (e.g., write once read many drives, remote storage servers) or tamper-evidence (e.g., cryptographic proofs) for system logs. These solutions expose an interface through which events are committed to a secure log, at which point they enjoy protection from future tampering. However, all proposals to date have relied on the assumption that an event's occurrence is concomitant with its commitment to the secured log.
In this work, we challenge this assumption by presenting and validating a race condition attack on the integrity of audit frameworks. Our attack exploits the intrinsically asynchronous nature of I/O and IPC activity, demonstrating that an attacker can snatch events about their intrusion out of message buffers after they have occurred but before they are committed to the log, thus bypassing existing protections. We present a first step towards defending against our attack by introducing KennyLoggings, the first kernel- based tamper-evident logging system that satisfies the synchronous integrity property, meaning that it guarantees tamper-evidence of events upon their occurrence. We implement KennyLoggings on top of the Linux kernel and show that it imposes between 8% and 11% overhead on log-intensive application workloads.
SESSION: Session 5D: Secure Computation
Multi-Protocol SPDZ (MP-SPDZ) is a fork of SPDZ-2 (Keller et al., CCS '13), an implementation of the multi-party computation (MPC) protocol called SPDZ (Damgård et al., Crypto '12). MP-SPDZ extends SPDZ-2 to 30 MPC protocol variants, all of which can be used with the same high-level programming interface based on Python. This considerably simplifies comparing the cost of different protocols and security models. The protocols cover all commonly used security models (honest/dishonest majority and semi-honest/malicious corruption) as well as computation of binary and arithmetic circuits (the latter modulo primes and powers of two). The underlying primitives employed include secret sharing, oblivious transfer, homomorphic encryption, and garbled circuits. The breadth of implemented protocols coupled with an accessible high-level interface makes it suitable to benchmark the cost of computation in various security models for researchers both with and without a background in secure computation This paper aims to outline the variety of protocols implemented and the design choices made in the development of MP-SPDZ as well as the capabilities of the programming interface.
One of the most challenging aspects in secure computation is offering protection against active adversaries, who may arbitrarily alter the behavior of corrupted parties. A powerful paradigm due to Goldreich, Micali, and Wigderson (GMW), is to follow a two-step approach: (1) design a passively secure protocol π for the task at hand; (2) apply a general compiler to convert π into an actively secure protocol π' for the same task.
In this work, we implement the first two-party actively secure protocol whose design is based on the general GMW paradigm. Our implementation applies to a passively secure π based on garbled circuits, using a sublinear zero-knowledge proof to ensure correctness of garbling. The main variant of our protocol makes a black-box use of an underlying oblivious transfer primitive by following the "certified oblivious transfer" blueprint of Ishai et al. (Eurocrypt 2011) and Hazay et. al. (TCC 2017). We also analyze a conceptually simpler but less efficient variant that makes a non-black-box use of oblivious transfer. %that designed an efficient parallel OT in which the receiver is additionally assured that the pairs of strings transmitted satisfy a global consistency predicate.
Our protocol has several important advantages. It supports non-interactive secure computation (NISC), where a receiver posts an "encryption" of its input and gets back from a sender an "encryption" of the output. The efficiency of this NISC protocol is enhanced by using an offline non-interactive preprocessing, where the sender publishes a single garbled circuit together with a proof of correctness, while the receiver need not even be online. The online work of both the sender and the receiver is lightweight, with a small overhead compared Yao's passively secure protocol depending mostly on the input size rather than the circuit size.
Correlated oblivious transfer (COT) is a crucial building block for secure multi-party computation (MPC) and can be generated efficiently via OT extension. Recent works based on the pseudorandom correlation generator (PCG) paradigm presented a new way to generate random COT correlations using only communication sublinear to the output length. However, due to their high computational complexity, these protocols are only faster than the classical IKNP-style OT extension under restricted network bandwidth. In this paper, we propose new COT protocols in the PCG paradigm that achieve unprecedented performance. \em With $50$ Mbps network bandwidth, our maliciously secure protocol can produce one COT correlation in $22$ nanoseconds. More specifically, our results are summarized as follows: \beginenumerate \item We propose a semi-honest COT protocol with sublinear communication and linear computation. This protocol assumes primal-LPN and is built upon a recent VOLE protocol with semi-honest security by Schoppmann et al. (CCS 2019). We are able to apply various optimizations to reduce its communication cost by roughly $15\times$, not counting a one-time setup cost that diminishes as we generate more COT correlations. \item We strengthen our COT protocol to malicious security with no loss of efficiency. Among all optimizations, our new protocol features a new checking technique that ensures correctness and consistency essentially for free. In particular, our maliciously secure protocol is only \em $1-3$ nanoseconds slower for each COT. \item We implemented our protocols, and the code will be publicly available at EMP toolkit. We observe at least $9\times$ improvement in running time compared to the state-of-the-art protocol by Boyle et al. (CCS 2019) in both semi-honest and malicious settings under any network faster than $50$ Mbps. \endenumerate With this new record of efficiency for generating COT correlations, we anticipate new protocol designs and optimizations will flourish on top of our protocol.
Recent works on distributed garbling have provided highly efficient solutions for constant-round MPC tolerating an arbitrary number of corruptions. In this work, we improve upon state-of-the-art protocols in this paradigm for further performance gain. First, we propose a new protocol for generating authenticated AND triples, which is a key building block in many recent works. \beginitemize \item We propose a new authenticated bit protocol in the two-party and multi-party settings from bare IKNP OT extension, allowing us to reduce the communication by about $24%$ and eliminate many computation bottlenecks. We further improve the computational efficiency for multi-party authenticated AND triples with cheaper and fewer consistency checks and fewer hash function calls. \item We implemented our triple generation protocol and observe around $4\times$ to $5\times$ improvement compared to the best prior protocol in most settings. For example, in the two-party setting with 10 Gbps network and 8 threads, our protocol can generate more than $4$ million authenticated triples per second, while the best prior implementation can only generate $0.8$ million triples per second. In the multi-party setting, our protocol can generate more than $37000$ triples per second over 80 parties, while the best prior protocol can only generate the same number of triples per second over 16 parties. \enditemize We also improve the state-of-the-art multi-party authenticated garbling protocol. \beginitemize \item We take the first step towards applying half-gates in the multi-party setting, which enables us to reduce the size of garbled tables by $2κ$ bits per gate per garbler, where κ is the computational security parameter. This optimization is also applicable in the semi-honest multi-party setting. \item We further reduce the communication of circuit authentication from $4ρ$ bits to $1$ bit per gate, using a new multi-party batched circuit authentication, where ρ is the statistical security parameter. Prior solution with similar efficiency is only applicable in the two-party setting. \enditemize For example, in the three-party setting, our techniques can lead to roughly a $35%$ reduction in the size of a distributed garbled circuit.
SESSION: Session 5E: Infrastructure Security
OpenPGP and S/MIME are two major standards for securing email communication introduced in the early 1990s. Three recent classes of attacks exploit weak cipher modes (EFAIL Malleability Gadgets, or EFAIL-MG), the flexibility of the MIME email structure (EFAIL Direct Exfiltration, or EFAIL-DE), and the Reply action of the email client (REPLY attacks). Although all three break message confidentiality by using standardized email features, only EFAIL-MG has been mitigated in IETF standards with the introduction of AEAD algorithms. So far, no uniform and reliable countermeasures have been adopted by email clients to prevent EFAIL-DE and REPLY attacks. Instead, email clients implement a variety of different ad-hoc countermeasures which are only partially effective, cause interoperability problems, and fragment the secure email ecosystem.
We present the first generic countermeasure against both REPLY and EFAIL-DE attacks by checking the decryption context including SMTP headers and MIME structure during decryption. The decryption context is encoded into a string DC and used as Associated Data (AD) in the AEAD encryption. Thus the proposed solution seamlessly extends the EFAIL-MG countermeasures. The decryption context changes whenever an attacker alters the email source code in a critical way, for example, if the attacker changes the MIME structure or adds a new Reply-To header. The proposed solution does not cause any interoperability problems and legacy emails can still be decrypted. We evaluate our approach by implementing the decryption contexts in Thunderbird/Enigmail and by verifying their correct functionality after the email has been transported over all major email providers, including Gmail and iCloud Mail.
Impersonation-as-a-Service: Characterizing the Emerging Criminal Infrastructure for User Impersonation at Scale
In this paper we provide evidence of an emerging criminal infrastructure enabling impersonation attacks at scale. Impersonation-as-a-Service (IMPaaS) allows attackers to systematically collect and enforce user profiles (consisting of user credentials, cookies, device and behavioural fingerprints, and other metadata) to circumvent risk-based authentication system and effectively bypass multi-factor authentication mechanisms. We present the IMPaaS model and evaluate its implementation by analysing the operation of a large, invite-only, Russian IMPaaS platform providing user profiles for more than 260,000 Internet users worldwide. Our findings suggest that the IMPaaS model is growing, and provides the mechanisms needed to systematically evade authentication controls across multiple platforms, while providing attackers with a reliable, up-to-date, and semi-automated environment enabling target selection and user impersonation against Internet users as scale.
Phishing websites are still a major threat in today's Internet ecosystem. Despite numerous previous efforts, similarity-based detection methods do not offer sufficient protection for the trusted websites, in particular against unseen phishing pages. This paper contributes VisualPhishNet, a new similarity-based phishing detection framework, based on a triplet Convolutional Neural Network (CNN). VisualPhishNet learns profiles for websites in order to detect phishing websites by a similarity metric that can generalize to pages with new visual appearances. We furthermore present VisualPhish, the largest dataset to date that facilitates visual phishing detection in an ecologically valid manner. We show that our method outperforms previous visual similarity phishing detection approaches by a large margin while being robust against a range of evasion attacks.
Dangerous Skills Got Certified: Measuring the Trustworthiness of Skill Certification in Voice Personal Assistant Platforms
With the emergence of the voice personal assistant (VPA) ecosystem, third-party developers are allowed to build new voice-apps are called skills in the Amazon Alexa platform and actions in the Google Assistant platform, respectively. For the sake of brevity, we use the term skills to describe voice-apps including Amazon skills and Google actions, unless we need to distinguish them for different VPA platforms. and publish them to the skills store, which greatly extends the functionalities of VPAs. Before a new skill becomes publicly available, that skill must pass a certification process, which verifies that it meets the necessary content and privacy policies. The trustworthiness of skill certification is of significant importance to platform providers, developers, and end users. Yet, little is known about how difficult it is for a policy-violating skill to get certified and published in VPA platforms. In this work, we study the trustworthiness of the skill certification in Amazon Alexa and Google Assistant platforms to answer three key questions: 1) Whether the skill certification process is trustworthy in terms of catching policy violations in third-party skills. 2) Whether there exist policy-violating skills published in their skills stores. 3) What are VPA users' perspectives on the skill certification and their vulnerable usage behavior when interacting with VPA devices? Over a span of 15 months, we crafted and submitted for certification 234 Amazon Alexa skills and 381 Google Assistant actions that intentionally violate content and privacy policies specified by VPA platforms. Surprisingly, we successfully got 234 (100%) policy-violating Alexa skills certified and 148 (39%) policy-violating Google actions certified. Our analysis demonstrates that policy-violating skills exist in the current skills stores, and thus users (children, in particular) are at risk when using VPA services. We conducted a user study with 203 participants to understand users' misplaced trust on VPA platforms. Unfortunately, user expectations are not being met by the skill certification in leading VPA platforms.
SESSION: Session 6A: Signatures
MuSig is a multi-signature scheme for Schnorr signatures, which supports key aggregation and is secure in the plain public key model. Standard derandomization techniques for discrete logarithm-based signatures such as RFC~6979, which make the signing procedure immune to catastrophic failures in the randomness generation, are not applicable to multi-signatures as an attacker could trick an honest user into producing two different partial signatures with the same randomness, which would reveal the user's secret key.
In this paper, we propose a variant of MuSig in which signers generate their nonce deterministically as a pseudorandom function of the message and all signers' public keys and prove that they did so by providing a non-interactive zero-knowledge proof to their cosigners. The resulting scheme, which we call MuSig-DN, is the first Schnorr multi-signature scheme with deterministic signing. Therefore its signing protocol is robust against failures in the randomness generation as well as attacks trying to exploit the statefulness of the signing procedure, e.g., virtual machine rewinding attacks. As an additional benefit, a signing session in MuSig-DN requires only two rounds instead of three as required by all previous Schnorr multi-signatures including MuSig. To instantiate our construction, we identify a suitable algebraic pseudorandom function and provide an efficient implementation of this function as an arithmetic circuit. This makes it possible to realize MuSig-DN efficiently using zero-knowledge proof frameworks for arithmetic circuits which support inputs given in Pedersen commitments, e.g., Bulletproofs. We demonstrate the practicality of our technique by implementing it for the secp256k1 elliptic curve used in Bitcoin.
A verifiable timed signature (VTS) scheme allows one to time-lock a signature on a known message for a given amount of time T such that after performing a sequential computation for time T anyone can extract the signature from the time-lock. Verifiability ensures that anyone can publicly check if a time-lock contains a valid signature on the message without solving it first, and that the signature can be obtained by solving the same for time T.
This work formalizes VTS, presents efficient constructions compatible with BLS, Schnorr, and ECDSA signatures, and experimentally demonstrates that these constructions can be employed in practice. On a technical level, we design an efficient cut-and-choose protocol based on the homomorphic time-lock puzzles to prove the validity of a signature encapsulated in a time-lock puzzle. We also present a new efficient range proof protocol that significantly improves upon existing proposals in terms of the proof size, and is also of independent interest.
While VTS is a versatile tool with numerous existing applications, we demonstrate VTS's applicability to resolve three novel challenging issues in the space of cryptocurrencies. Specifically,we show how VTS is the cryptographic cornerstone to construct:(i) Payment channel networks with improved on-chain unlinkability of users involved in a transaction, (ii) multi-party signing of transactions for cryptocurrencies without any on-chain notion oftime and (iii) cryptocurrency-enabled fair multi-party computation protocol.
Asynchronous Distributed Key Generation for Computationally-Secure Randomness, Consensus, and Threshold Signatures.
In this paper, we present the first Asynchronous Distributed Key Generation (ADKG) algorithm which is also the first distributed key generation algorithm that can generate cryptographic keys with a dual (f,2f+1)-threshold (where f is the number of faulty parties). As a result, using our ADKG we remove the trusted setup assumption that the most scalable consensus algorithms make. In order to create a DKG with a dual (f,2f+1)- threshold we first answer in the affirmative the open question posed by Cachin et al.  on how to create an Asynchronous Verifiable Secret Sharing (AVSS) protocol with a reconstruction threshold of f+1<k łe 2f+1, which is of independent interest. Our High-threshold-AVSS (HAVSS) uses an asymmetric bivariate polynomial to encode the secret. This enables the reconstruction of the secret only if a set of k nodes contribute while allowing an honest node that did not participate in the sharing phase to recover his share with the help of f+1 honest parties. Once we have HAVSS we can use it to bootstrap scalable partially synchronous consensus protocols, but the question on how to get a DKG in asynchrony remains as we need a way to produce common randomness. The solution comes from a novelEventually Perfect Common Coin (EPCC) abstraction that enables the generation of a common coin from n concurrent HAVSS invocations. EPCC's key property is that it is eventually reliable, as it might fail to agree at most f times (even if invoked a polynomial number of times). UsingEPCC we implement anEventually Efficient Asynchronous Binary Agreement (EEABA) which is optimal when the EPCC agrees and protects safety when EPCC fails. Finally, using EEABA we construct the first ADKG which has the same overhead and expected runtime as the best partially-synchronous DKG (O(n4) words, O(f) rounds). As a corollary of our ADKG, we can also create the first Validated Asynchronous Byzantine Agreement (VABA) that does not need a trusted dealer to setup threshold signatures of degree n-f. Our VABA has an overhead of expected O(n2) words and O(1) time per instance, after an initial O(n4) words and O(f) time bootstrap via ADKG.
Building on the Gennaro & Goldfeder and Lindell & Nof protocols (CCS '18), we present two threshold ECDSA protocols, for any number of signatories and any threshold, that improve as follows over the state of the art: -- For both protocols, only the last round requires knowledge of the message, and the other rounds can take place in a preprocessing stage, lending to a non-interactive threshold ECDSA protocol. -- Both protocols withstand adaptive corruption of signatories. Furthermore, they include a periodic refresh mechanism and offer full proactive security. -- Both protocols realize an ideal threshold signature functionality within the UC framework, in the global random oracle model, assuming Strong RSA, DDH, semantic security of the Paillier encryption, and a somewhat enhanced variant of existential unforgeability of ECDSA. -- Both protocols achieve accountability by identifying corrupted parties in case of failure to generate a valid signature. The two protocols are distinguished by the round-complexity and the identification process for detecting cheating parties. Namely: -- For the first protocol, signature generation takes only 4 rounds (down from the current state of the art of 8 rounds), but the identification process requires computation and communication that is quadratic in the number of parties. -- For the second protocol, the identification process requires computation and communication that is only linear in the number of parties, but signature generation takes 7 rounds. These properties (low latency, compatibility with cold-wallet architectures, proactive security, identifiable abort and composable security) make the two protocols ideal for threshold wallets for ECDSA-based cryptocurrencies.
SESSION: Session 6B: Exploitation and Defenses
Code reuse attacks have been the subject of a substantial amount of research during the past decade. This research largely resulted from early work on Return-Oriented Programming (ROP), which showed that the then newly proposed Non-Executable Memory (NX) defense could be bypassed. More recently, the research community has been simultaneously investigating new defenses that are believed to thwart code reuse attacks, such as Control Flow Integrity (CFI), and defense-aware techniques for attacking these defenses, such as Data-Oriented Programming (DOP). Unfortunately, the feasibility of defense-aware attacks are very dependent on the behaviors of the attacked program, which makes it difficult for defenders to understand how much protection a defense such as CFI may provide. To better understand this, researchers have introduced automated defense-aware code reuse attack systems. Unfortunately, the handful of existing systems implement a single fixed, defense-specific strategy that is complex and cannot be used to consider other defenses.
In this paper, we propose a generic framework for automatically discovering defense-aware code reuse attacks in executables. Unlike existing work, which utilizes hard-coded strategies for specific defenses, our framework can produce attacks for multiple defenses by analyzing the runtime behavior of the defense. The high-level insight behind our framework is that code reuse attacks can be defined as a state reachability problem, and that defenses prevent some transitions between states. We implement our framework as a tool named Limbo, which employs an existing binary concolic executor to solve the reachability problem. We evaluate Limbo and show that it excels when there is little code available for reuse, making it complementary to existing techniques. We show that, in such scenarios, Limbo outperforms existing systems that automate ROP attacks, as well as systems that automate DOP attacks in the presence of fine-grained CFI, despite having no special knowledge about ROP or DOP attacks.
Just-in-time return-oriented programming (JIT-ROP) allows one to dynamically discover instruction pages and launch code reuse attacks, effectively bypassing most fine-grained address space layout randomization (ASLR) protection. However, in-depth questions regarding the impact of code (re-)randomization on code reuse attacks have not been studied. For example, how would one compute the re-randomization interval effectively by considering the speed of gadget convergence to defeat JIT-ROP attacks? ; how do starting pointers in JIT-ROP impact gadget availability and gadget convergence time? ; what impact do fine-grained code randomizations have on the Turing-complete expressive power of JIT-ROP payloads? We conduct a comprehensive measurement study on the effectiveness of fine-grained code randomization schemes, with 5 tools, 20 applications including 6 browsers, 1 browser engine, and 25 dynamic libraries. We provide methodologies to measure JIT-ROP gadget availability, quality, and their Turing-complete expressiveness, as well as to empirically determine the upper bound of re-randomization intervals in re-randomization schemes using the Turing-complete (TC), priority, MOV TC, and payload gadget sets. Experiments show that the upper bound ranges from 1.5 to 3.5 seconds in our tested applications. Besides, our results show that locations of leaked pointers used in JIT-ROP attacks have no impacts on gadget availability but have an impact on how fast attackers find gadgets. Our results also show that instruction-level single-round randomization thwarts current gadget finding techniques under the JIT-ROP threat model.
Control-flow integrity (CFI) is a promising technique to mitigate control-flow hijacking attacks. In the past decade, dozens of CFI mechanisms have been proposed by researchers. Despite the claims made by themselves, the security promises of these mechanisms have not been carefully evaluated, and thus are questionable.
In this paper, we present a solution to measure the gap between the practical security and the claimed theoretical security. First, we propose CScan to precisely measure runtime feasible targets of indirect control transfer (ICT) instructions protected by CFI, by enumerating all potential code addresses and testing whether ICTs are allowed to jump to them. Second, we propose CBench as a sanity check for verifying CFI solutions? effectiveness against typical attacks, by exploiting a comprehensive set of vulnerable programs protected by CFI and verifying the recognized feasible targets.
We evaluated 12 most recent open-source CFI mechanisms and discovered 10 flaws in most CFI mechanisms or implementations. For some CFIs, their security policies or protected ICT sets do not match what they claimed. Some CFIs even expand the attack surface (e.g. introducing unintended targets). To facilitate a deeper understanding of CFI, we summarize the flaws into 7 common pitfalls which cover the whole lifetime of CFI mechanisms and reveal issues that affect CFI mechanisms in practical security.
RTFM! Automatic Assumption Discovery and Verification Derivation from Library Document for API Misuse Detection
To use library APIs, a developer is supposed to follow guidance and respect some constraints, which we call integration assumptions (IAs). Violations of these assumptions can have serious consequences, introducing security-critical flaws such as use-after-free, NULL-dereference, and authentication errors. Analyzing a program for compliance with IAs involves significant effort and needs to be automated. A promising direction is to automatically recover IAs from a library document using Natural Language Processing (NLP) and then verify their consistency with the ways APIs are used in a program through code analysis. However, a practical solution along this line needs to overcome several key challenges, particularly the discovery of IAs from loosely formatted documents and interpretation of their informal descriptions to identify complicated constraints (e.g., data-/control-flow relations between different APIs).
In this paper, we present a new technique for automated assumption discovery and verification derivation from library documents. Our approach, called Advance, utilizes a suite of innovations to address those challenges. More specifically, we leverage the observation that IAs tend to express a strong sentiment in emphasizing the importance of a constraint, particularly those security-critical, and utilize a new sentiment analysis model to accurately recover them from loosely formatted documents. These IAs are further processed to identify hidden references to APIs and parameters, through an embedding model, to identify the information-flow relations expected to be followed. Then our approach runs frequent subtree mining to discover the grammatical units in IA sentences that tend to indicate some categories of constraints that could have security implications. These components are mapped to verification code snippets organized in line with the IA sentence's grammatical structure, and can be assembled into verification code executed through CodeQL to discover misuses inside a program. We implemented this design and evaluated it on 5 popular libraries (OpenSSL, SQLite, libpcap, libdbus and libxml2) and 39 real-world applications. Our analysis discovered 193 API misuses, including 139 flaws never reported before.
SESSION: Session 6C: Side Channels
The recent Spectre attacks have demonstrated the fundamental insecurity of current computer microarchitecture. The attacks use features like pipelining, out-of-order and speculation to extract arbitrary information about the memory contents of a process. A comprehensive formal microarchitectural model capable of representing the forms of out-of-order and speculative behavior that can meaningfully be implemented in a high performance pipelined architecture has not yet emerged. Such a model would be very useful, as it would allow the existence and non-existence of vulnerabilities, and soundness of countermeasures to be formally established. This paper presents such a model targeting single core processors. The model is intentionally very general and provides an infrastructure to define models of real CPUs. It incorporates microarchitectural features that underpin all known Spectre vulnerabilities. We use the model to elucidate the security of existing and new vulnerabilities, as well as to formally analyze the effectiveness of proposed countermeasures. Specifically, we discover three new (potential) vulnerabilities, including a new variant of Spectre v4, a vulnerability on speculative fetching, and a vulnerability on out-of-order execution, and analyze the effectiveness of existing countermeasures including constant time and serializing instructions.
To defeat ASLR or more advanced fine-grained and leakage-resistant code randomization schemes, modern software exploits rely on information disclosure to locate gadgets inside the victim's code. In the absence of such info-leak vulnerabilities, attackers can still hack blind and derandomize the address space by repeatedly probing the victim's memory while observing crash side effects, but doing so is only feasible for crash-resistant programs. However, high-value targets such as the Linux kernel are not crash-resistant. Moreover, the anomalously large number of crashes is often easily detectable. In this paper, we show that the Spectre era enables an attacker armed with a single memory corruption vulnerability to hack blind without triggering any crashes. Using speculative execution for crash suppression allows the elevation of basic memory write vulnerabilities into powerful speculative probing primitives that leak through microarchitectural side effects. Such primitives can repeatedly probe victim memory and break strong randomization schemes without crashes and bypass all deployed mitigations against Spectre-like attacks. The key idea behind speculative probing is to break Spectre mitigations using memory corruption and resurrect Spectre-style disclosure primitives to mount practical blind software exploits. To showcase speculative probing, we target the Linux kernel, a crash-sensitive victim that has so far been out of reach of blind attacks, mount end-to-end exploits that compromise the system with just-in-time code reuse and data-only attacks from a single memory write vulnerability, and bypass strong Spectre and strong randomization defenses. Our results show that it is crucial to consider synergies between different (Spectre vs. code reuse) threat models to fully comprehend the attack surface of modern systems.
Recent work on Side Channel Analysis (SCA) targets old, well-known vulnerabilities, even previously exploited, reported, and patched in high-profile cryptography libraries. Nevertheless, researchers continue to find and exploit the same vulnerabilities in old and new products, highlighting a big issue among vendors: effectively tracking and fixing security vulnerabilities when disclosure is not done directly to them. In this work, we present another instance of this issue by performing the first library-wide SCA security evaluation of Mozilla's NSS security library. We use a combination of two independently-developed SCA security frameworks to identify and test security vulnerabilities. Our evaluation uncovers several new vulnerabilities in NSS affecting DSA, ECDSA, and RSA cryptosystems. We exploit said vulnerabilities and implement key recovery attacks using signals---extracted through different techniques such as timing, microarchitecture, and EM---and improved lattice methods.
Intel SGX is a security solution promising strong and practical security guarantees for trusted computing. However, recent reports demonstrated that such security guarantees of SGX are broken due to access pattern based side-channel attacks, including page fault, cache, branch prediction, and speculative execution. In order to stop these side-channel attackers, Oblivious RAM (ORAM) has gained strong attention from the security community as it provides cryptographically proven protection against access pattern based side-channels. While several proposed systems have successfully applied ORAM to thwart side-channels, those are severely limited in performance and its scalability due to notorious performance issues of ORAM. This paper presents TrustOre, addressing these issues that arise when using ORAM with Intel SGX. TrustOre leverages an external device, FPGA, to implement a trusted storage service within a completed isolated environment secure from side-channel attacks. TrustOre tackles several challenges in achieving such a goal: extending trust from SGX to FPGA without imposing architectural changes, providing a verifiably-secure connection between SGX applications and FPGA, and seamlessly supporting various access operations from SGX applications to FPGA.We implemented TrustOre on the commodity Intel Hybrid CPU-FPGA architecture. Then we evaluated with three state-of-the-art ORAM-based SGX applications, ZeroTrace, Obliviate, and Obfuscuro, as well as an end-to-end key-value store application. According to our evaluation, TrustOre-based applications outperforms ORAM-based original applications ranging from 10x to 43x, while also showing far better scalability than ORAM-based ones. We emphasize that since TrustOre can be deployed as a simple plug-in to SGX machine's PCIe slot, it is readily used to thwart side-channel attacks in SGX, arguably one of the most cryptic and critical security holes today.
SESSION: Session 6D: Web Security
Thanks to the widespread deployment of TLS, users can access private data over channels with end-to-end confidentiality and integrity. What they cannot do, however, is prove to third parties the provenance of such data, i.e., that it genuinely came from a particular website. Existing approaches either introduce undesirable trust assumptions or require server-side modifications. Users' private data is thus locked up at its point of origin. Users cannot export data in an integrity-protected way to other applications without help and permission from the current data holder. We propose DECO (short for decentralized oracle) to address the above problems. DECO allows users to prove that a piece of data accessed via TLS came from a particular website and optionally prove statements about such data in zero-knowledge, keeping the data itself secret. DECO is the first such system that works without trusted hardware or server-side modifications. DECO can liberate private data from centralized web-service silos, making it accessible to a rich spectrum of applications. To demonstrate the power of DECO, we implement three applications that are hard to achieve without it: a private financial instrument using smart contracts, converting legacy credentials to anonymous credentials, and verifiable claims against price discrimination.
HTTPS is principally designed for secure end-to-end communication, which adds confidentiality and integrity to sensitive data transmission. While several man-in-the-middle attacks (e.g., SSL Stripping) are available to break the secured connections, state-of-the-art security policies (e.g., HSTS) have significantly increased the cost of successful attacks. However, the TLS certificates shared by multiple domains make HTTPS hijacking attacks possible again.
In this paper, we term the HTTPS MITM attacks based on the shared TLS certificates as HTTPS Context Confusion Attack (SCC Attack). Despite a known threat, it has not yet been studied thoroughly. We aim to fill this gap with an in-depth empirical assessment of SCC Attack. We find the attack can succeed even for servers that have deployed current best practice of security policies. By rerouting encrypted traffic to another flawed server that shares the TLS certificate, attackers can bypass the security practices, hijack the ongoing HTTPS connections, and subsequently launch additional attacks including phishing and payment hijacking. Particularly, vulnerable HTTP headers from a third-party server are exploitable for this attack, and it is possible to hijack an already-established secure connection.
Through tests on popular websites, we find vulnerable subdomains under 126 apex domains in Alexa top 500 sites, including large vendors like Alibaba, JD, and Microsoft. Meanwhile, through a large-scale measurement, we find that TLS certificate sharing is prominent, which uncovers the high potential of such attacks, and we summarize the security dependencies among different parties. For responsible disclosure, we have reported the issues to affected vendors and received positive feedback. Our study sheds light on an influential attack surface of the HTTPS ecosystem and calls for proper mitigation against MITM attacks.
Website fingerprinting (WFP) aims to infer information about the content of encrypted and anonymized connections by observing patterns of data flows based on the size and direction of packets. By collecting traffic traces at a malicious Tor entry node --- one of the weakest adversaries in the attacker model of Tor --- a passive eavesdropper can leverage the captured meta-data to reveal the websites visited by a Tor user. As recently shown, WFP is significantly more effective and realistic than assumed. Concurrently, former WFP defenses are either infeasible for deployment in real-world settings or defend against specific WFP attacks only.
To limit the exposure of Tor users to WFP, we propose novel lightweight WFP defenses, TrafficSliver, which successfully counter today's WFP classifiers with reasonable bandwidth and latency overheads and, thus, make them attractive candidates for adoption in Tor. Through user-controlled splitting of traffic over multiple Tor entry nodes, TrafficSliver limits the data a single entry node can observe and distorts repeatable traffic patterns exploited by WFP attacks. We first propose a network-layer defense, in which we apply the concept of multipathing entirely within the Tor network. We show that our network-layer defense reduces the accuracy from more than 98% to less than 16% for all state-of-the-art WFP attacks without adding any artificial delays or dummy traffic. We further suggest an elegant client-side application-layer defense, which is independent of the underlying anonymization network. By sending single HTTP requests for different web objects over distinct Tor entry nodes, our application-layer defense reduces the detection rate of WFP classifiers by almost 50 percentage points. Although it offers lower protection than our network-layer defense, it provides a security boost at the cost of a very low implementation overhead and is fully compatible with today's Tor network.
SESSION: Session 6E: Zero Knowledge
Zero-knowledge proofs and in particular succinct non-interactive zero-knowledge proofs (so called zk-SNARKs) are getting increasingly used in real-world applications, with cryptocurrencies being the prime example. Simulation extractability (SE) is a strong security notion for zk-SNARKs which informally ensures non-malleability of proofs. The high importance of this property is acknowledged by leading companies in this field such as Zcash and underpinned by various attacks against the malleability of cryptographic primitives in the past. Another problematic issue for the practical use of zk-SNARKs is the requirement of a fully trusted setup, as especially for large-scale decentralized applications finding a trusted party that runs the setup is practically impossible. Quite recently, the study of approaches to relax or even remove the trust in the setup procedure, and in particular subversion as well as updatable zk-SNARKs (with latter being the most promising approach), has been initiated and received considerable attention since then. Unfortunately, so far SE-SNARKs with the aforementioned properties are only constructed in an ad-hoc manner and no generic techniques are available.
In this paper, we are interested in such generic techniques and therefore firstly revisit the only available lifting technique due to Kosba et al. (called COCO) to generically obtain SE-SNARKs. By exploring the design space of many recently proposed SNARK- and STARK-friendly symmetric-key primitives we thereby achieve significant improvements in the prover computation and proof size. Unfortunately, the COCO framework as well as our improved version (called OCOCO) is not compatible with updatable SNARKs. Consequently, we propose a novel generic lifting transformation called LAMASSU. It is built using different underlying ideas compared to COCO (and OCOCO). In contrast to COCO it only requires key-homomorphic signatures (which allow to shift keys) covering well studied schemes such as Schnorr or ECDSA. This makes LAMASSU highly interesting, as by using the novel concept of so called updatable signatures, which we introduce in this paper, we can prove that LAMASSU preserves the subversion and in particular updatable properties of the underlying zk-SNARK. This makes LAMASSU the first technique to also generically obtain SE subversion and updatable SNARKs. As its performance compares favorably to OCOCO, LAMASSU is an attractive alternative that in contrast to COCO is only based on well established cryptographic assumptions.
Vector commitments enable a user to commit to a sequence of values and provably reveal one or many values at specific posi- tions at a later time. In this work, we construct Pointproofs? a new vector commitment scheme that supports non-interactive aggregation of proofs across multiple commitments. Our construction enables any third party to aggregate a collection of proofs with respect to different, independently computed commitments into a single proof represented by an elliptic curve point of 48-bytes. In addition, our scheme is hiding: a commitment and proofs for some values reveal no information about the remaining values. We build Pointproofs and demonstrate how to apply them to blockchain smart contracts. In our example application, Pointproofs reduce bandwidth overheads for propagating a block of transactions by at least 60% compared to prior state- of-art vector commitments. Pointproofs are also efficient: on a single-thread, it takes 0.08 seconds to generate a proof for 8 values with respect to one commitment, 0.25 seconds to aggregate 4000 such proofs across multiple commitments into one proof, and 23 seconds (0.7 ms per value proven) to verify the aggregated proof.
This paper follows the line of works that design concretely efficient transparent sublinear zero-knowledge Interactive Oracle Proofs (IOP). Arguments obtained via this paradigm have the advantages of not relying on public-key cryptography, not requiring a trusted setup, and resistance to known quantum attacks. In the realm of transparent systems, Ligero and Aurora stand out with incomparable advantages where the former has a fast prover algorithm somewhat succinct proofs and the latter has somewhat fast prover and succinct proofs. In this work, we introduce Ligero++ that combines the best features of both approaches to achieve the best of both worlds. We implement our protocol and benchmark the results.
Machine learning has become increasingly prominent and is widely used in various applications in practice. Despite its great success, the integrity of machine learning predictions and accuracy is a rising concern. The reproducibility of machine learning models that are claimed to achieve high accuracy remains challenging, and the correctness and consistency of machine learning predictions in real products lack any security guarantees. In this paper, we initiate the study of zero knowledge machine learning and propose protocols for zero knowledge decision tree predictions and accuracy tests. The protocols allow the owner of a decision tree model to convince others that the model computes a prediction on a data sample, or achieves a certain accuracy on a public dataset, without leaking any information about the model itself. We develop approaches to efficiently turn decision tree predictions and accuracy into statements of zero knowledge proofs. We implement our protocols and demonstrate their efficiency in practice. For a decision tree model with 23 levels and 1,029 nodes, it only takes 250 seconds to generate a zero knowledge proof proving that the model achieves high accuracy on a dataset of 5,000 samples and 54 attributes, and the proof size is around 287 kilobytes.
Zero-Knowledge (ZK) proofs (ZKP) are foundational in cryptography. Most recent ZK research focuses on non-interactive proofs (NIZK) of small statements, useful in blockchain scenarios. Another line, and our focus, instead targets proofs of large statements that are useful, e.g., in proving properties of programs in ZK. We specify a zero-knowledge processor that executes arbitrary programs written in a simple instruction set, and proves in ZK the correctness of the execution. Such an approach is well-suited for constructing ZK proofs of large statements as it efficiently supports complex programming constructs, such as loops and RAM access. Critically, we propose several novel ZK improvements that make our approach concretely efficient: (1) an efficient arithmetic representation with conversions to/from Boolean, (2) an efficient read-only memory that uses $2łog n$ OTs per access, and (3) an efficient read-write memory, øurram, which uses $\frac1 2 łog^2 n$ OTs per access. øurram beats linear scan for RAM of size $>3$ elements! Prior ZK systems used generic ORAM costing orders of magnitude more. We cast our system as a garbling scheme that can be plugged into the ZK protocol of [Jawurek et al, CCS'13]. Put together, our system is concretely efficient: for a processor instantiated with $512$KB of main memory, each processor cycle costs $24$KB of communication. We implemented our approach in \textttC++. On a 1Gbps LAN our implementation realizes a $2.1$KHz processor.
SESSION: Keynote Talk II
The speaker will utilize his experience from inside one of the world's largest social networks during the 2016 and 2018 elections, and running an election integrity war room in 2020 to discuss the ways that technology fails the people we try so hard to serve. We will discuss the realistic assumptions we can make about threats, and the expectations we should have of users, and try to chart a path forward for how cutting-edge security research might better inform the engineers and product designers who end up putting computing technologies in the hands of billions.
We describe version 2.0 of our benchmarking framework, PhishBench. With the addition of the ability to dynamically load features, metrics, and classifiers, our new and improved framework allows researchers to rapidly evaluate new features and methods for machine-learning based phishing detection. Researchers can compare under identical circumstances their contributions with numerous built-in features, ranking methods, and classifiers used in the literature with the right evaluation metrics. We will demonstrate PhishBench 2.0 and compare it against at least two other automated ML systems.
VirusTotal is the largest online anti-malware scanning service. It is widely used by security researchers for labeling malware data or serving as a comparison baseline. However, several important challenges of using VirusTotal are left unaddressed (e.g., whether VirusTotal labels are already stable, when VirusTotal labels can be trusted), severely harming the correctness of research projects depending on VirusTotal.
In this paper, we present VTSet, which contains daily VirusTotal labels on more than 14,000 files over one year. VTSet can be used to build and evaluate various tools to tackle the existing challenges and facilitate the usage of VirusTotal. Besides the data, VTSet also provides a demonstration tool to display many measurement results and a query tool to ease the access of its data. A video demonstration of VTSet is located at the following link: https://youtu.be/aSVaUGHxFi4.
As a young programming language designed for systems software development, Rust aims to provide safety guarantees like high-level languages and performance efficiency like low-level languages. Lifetime is a core concept in Rust, and it is key to both safety checks and automated resource management conducted by the Rust compiler. However, Rust's lifetime rules are very complex. In reality, it is not uncommon that Rust programmers fail to infer the correct lifetime, causing severe concurrency and memory bugs. In this paper, we present VRLifeTime, an IDE tool that can visualize lifetime for Rust programs and help programmers avoid lifetime-related mistakes. Moreover, VRLifeTime can help detect some lifetime-related bugs (i.e., double locks) with detailed debugging information. A demo video is available at https://youtu.be/L5F_XCOrJTQ.
LPET -- Mining MS-Windows Software Privilege Escalation Vulnerabilities by Monitoring Interactive Behavior
Local Privilege Escalation (LPE) is a common attack vector used by attackers to gain higher-level permissions. In this poster, we present a system called LPET to mine LPE vulnerabilities of third-party software in MS-Windows. Our insight is that the LPE is often caused by the interactions between high-privilege processes and user-controllable files. The interactions include creating a file, starting a process and others. Based on this observation, LPET first monitors software behaviors and constructs a directed interaction graph to abstract entities, such as files and processes, and their interactions. Then LPET analyzes exploiting paths from the graph by extracting user-controllable entities and checking their privileges. Finally, LPET verifies the exploiting paths using replacement or hijacking attacks. In the preliminary experiments, LPET found vulnerabilities in various software. Moreover, we discovered a common weakness pattern that some components were executed by software with high privilege after being released in the user-controllable temporary directory during installation, update, and uninstallation. By replacing the components, attackers with low privilege can hijack the execution flow of software to execute their codes with high privilege. We found that a wide range of software suffers from this weakness pattern, including Cisco AnyConnect, Dropbox, Notepad++.
Increasing popularity of third-party package repositories, like NPM, PyPI, or RubyGems, makes them an attractive target for software supply chain attacks. By injecting malicious code into legitimate packages, attackers were known to gain more than 100,000 downloads of compromised packages. Current approaches for identifying malicious payloads are resource demanding. Therefore, they might not be applicable for the on-the-fly detection of suspicious artifacts being uploaded to the package repository. In this respect, we propose to use source code repositories (e.g., those in Github) for detecting injections into the distributed artifacts of a package. Our preliminary evaluation demonstrates that the proposed approach captures known attacks when malicious code was injected into PyPI packages. The analysis of the 2666 software artifacts (from all versions of the top ten most downloaded Python packages in PyPI) suggests that the technique is suitable for lightweight analysis of real-world packages.
Mitigating cybersecurity threats in power distribution grids requires a testbed for cybersecurity, e.g., to evaluate the (physical) impact of cyberattacks, generate datasets, test and validate security approaches, as well as train technical personnel. In this paper, we present a blueprint for such a testbed that relies on network emulation and power flow computation to couple real network applications with a simulated power grid. We discuss the benefits of our approach alongside preliminary results and various use cases for cybersecurity research and training for power distribution grids.
The number of cybersecurity threats has been increasing, and these threats have become more sophisticated year after year. Malicious hosts play a large role in modern cyberattacks, e.g., as a launcher of remote-control attacks or as a receiver of stolen information. In such circumstances, continuous monitoring of malicious hosts (URL/IP addresses) is indispensable to reveal cyberattack activities, and many studies have been conducted on that. However, many of them have limitations: they help only in the short-term or they help only a few regions and/or a few organizations. Therefore, we cannot effectively monitor attacks that are active for only a short time or that change their behavior depending on where the victims are from (e.g., country/organization). In this paper, we propose Stargazer, a program that monitors malicious hosts from multiple points on a long-term basis. Multiregional monitoring sensors and inter-organizational collaboration are conducted to achieve this surveillance. In this paper, we describe an implementation of the Stargazer prototype and how monitoring was carried out using multiregional sensors starting in Dec. 2018 of 1,050 malicious hosts; 10,929,418 measurements were obtained. Case studies on (1) revived hosts, (2) hosts that only respond to specific regions, and (3) the behavior of attack preparation were created.
Cyber-physical systems are increasingly threatened by sophisticated attackers, also attacking the physical aspect of systems. Supplementing protective measures, industrial intrusion detection systems promise to detect such attacks. However, due to industrial protocol diversity and lack of standard interfaces, great efforts are required to adapt these technologies to a large number of different protocols. To address this issue, we identify existing universally applicable intrusion detection approaches and propose a transcription for industrial protocols to realize protocol-independent semantic intrusion detection on top of different industrial protocols.
Tor is a powerful and important tool for providing anonymity and censorship resistance to users around the world. Yet it is surprisingly difficult to deploy new services in Tor---it is largely relegated to proxies and hidden services---or to nimbly react to new forms of attack. Conversely, "non-anonymous" Internet services are thriving like never before because of recent advances in programmable networks, such as Network Function Virtualization (NFV) which provides programmable in-network middleboxes.
This work seeks to close this gap by introducing programmable middleboxes into the Tor network. In this architecture, users can install and run sophisticated "functions" on willing Tor routers,further improving anonymity, resilience to attack, performance of hidden services, and more. We present the design of an architecture, Bento, that protects middlebox nodes from the functions they run and protects the functions from the middleboxes they run on. Bento does not require modifications to Tor, and can run on the live Tor network. Additionally, we give an overview of how we can significantly extend the capabilities of Tor to meet users' anonymity needs and nimbly react to new threats.
Given the same input, a program may not behave the same in two runs due to some non-deterministic features, e.g., context switch and randomization. Such behaviors would cause non-deterministic program bugs which are hard to discover or diagnose. Record-and-replay is a promising technique to address such issues, however, performance and transparency are the main obstacles of existing works. In this poster, we propose a novel record-and-replay system named RIPT. RIPT utilizes Intel Processor Trace to record control flow information with very low overhead, and transparently captures non-deterministic sources such as system calls and signals with a kernel module. During replay, RIPT recovers the effect of non-deterministic events from the collected information, and makes target programs behave the same as recorded. We evaluate it with real-world program bugs and show that RIPT works well in practice.
Successful deployment of Long-Range Wide Area Network (LoRaWAN) technology in several Industrial Internet of Things (IIoT) scenarios, such as Outage Management System (OMS) in smart metering, rely on low energy consumption of the end device. In this work, we conducted an experiment to demonstrate an on-off Denial-of-Service (DoS) attack to analyze the impact on the energy consumption of the LoRaWAN end device. We implemented the attack that manipulates the end device to remain in packet retransmission mode for several seconds. The conducted experiments show that the configurable parameters of LoRaWAN that are required for applications, like OMS, are susceptible to energy consumption attacks. In summary, our results show that when an on-off DoS attack is performed, the end device utilizing the Spreading Factor (SF) 12 consumes 92 times more energy due to packet retransmissions as compared to the end node using SF 7 under no attack.
The rapid growth of Internet of Things (IoT) devices makes it vitally important to understand real-world cybersecurity threats to them. Traditionally, honeypots have been used as decoys to mimic real devices on a network and help researchers/organizations understand the dynamic of threats. A crucial condition for a honeypot to yield useful insights is to let attackers believe they are real systems used by humans and organizations. However, IoT devices pose unique challenges in this respect, due to the large variety of device types and the physical-connectedness nature. In this work, we (1) presented an approach to create a multi-phased multi-faceted honeypot ecosystem, where researchers gradually increase the sophistication of a low-interaction IoT honeypot by observing real-world attackers' behaviors, (2) built a low-interaction honeypot for IoT cameras that allowed researchers to gain a concrete understanding of what attackers were going after on IoT camera devices, and (3) designed a proxy instance, called ProxyPot, that sits between IoT devices and the external network and helps researchers study the IoT devices' inbound/outbound communication. We used PorxyPot as a means to understanding attacks against IoT cameras and increasing the honeypot's sophistication. We deployed honeypots for more than two years. Our preliminary results showed that we were able to attract increasingly sophisticated attack data in each new phase. Moreover, we captured activities that appeared to involve direct human interactions rather than purely automated scripts.
Voice-Indistinguishability -- Protecting Voiceprint with Differential Privacy under an Untrusted Server
With the rising adoption of advanced voice-based technology together with increasing consumer demand for smart devices, voice-controlled "virtual assistants" such as Apple's Siri and Google Assistant have been integrated into people's daily lives. However, privacy and security concerns may hinder the development of such voice-based applications since speech data contain the speaker's biometric identifier, i.e., voiceprint (as analogous to fingerprint). To alleviate privacy concerns in speech data collection, we propose a fast speech data de-identification system that allows a user to share her speech data with formal privacy guarantee to an untrusted server. Our open-sourced system can be easily integrated into other speech processing systems for collecting users' voice data in a privacy-preserving way. Experiments on public datasets verify the effectiveness and efficiency of the proposed system.
Insider threat is a well-recognized problem in the cyber-security domain. There is good amount of research on detecting and predicting an insider attack. However, none of them addresses the influence of an insider over other individuals, and the spread of impact due to direct and indirect access to enterprise assets by having such influence. In this work, we propose a graph-based influence profiling solution called rProfiler that analyzes the data from multiple sources to determine the influence spread and calculate the probability of loss of data from an affected device using pertinent graph features. We also highlight multiple enterprise scenarios that may benefit from this work.
Clouds and massive-scale computing infrastructures are starting to dominate computing and will likely continue to do so for the foreseeable future. Major cloud operators are now comprising millions of cores hosting substantial fractions of corporate and government IT infrastructure.
CCSW is the world's premier forum bringing together researchers and practitioners in all security aspects of cloud-centric and outsourced computing, including: Side channel attacks; Practical cryptographic protocols for cloud security; Secure cloud resource virtualization mechanisms; Secure data management outsourcing (e.g., database as a service); Practical privacy and integrity mechanisms for outsourcing; Foundations of cloud-centric threat models; Secure computation outsourcing; Remote attestation mechanisms in clouds; Sandboxing and VM-based enforcements; Trust and policy management in clouds; Secure identity management mechanisms; New cloud-aware web service security paradigms and mechanisms; Cloud-centric regulatory compliance issues and mechanisms; Business and security risk models and clouds; Cost and usability models and their interaction with security in clouds; Scalability of security in global-size clouds; Trusted computing technology and clouds; Binary analysis of software for remote attestation and cloud protection; Network security (DOS, IDS etc.) mechanisms for cloud contexts; Security for emerging cloud programming models; Energy/cost/efficiency of security in clouds; Machine learning for cloud protection CCSW especially encourages novel paradigms and controversial ideas that are not on the above list. The workshop has historically acted as a fertile ground for creative debate and interaction in security-sensitive areas of computing impacted by clouds.
This year marked the 11th anniversary of CCSW. In the past decade, CCSW has had a significant impact in our research community. As of August 2019, in the Google Scholar Metrics entry for ACM CCS (which encompasses CCSW), 20% of the top 20 cited papers come from CCSW. One way to look at it is that authors are as likely or perhaps more likely to have a top-20 paper publishing in CCSW than in CCS!
This year, CCSW received 40 submissions out of which 12 full papers (30%) and 5 blitz abstracts were accepted.
There is a rapidly growing interest in the security of cyber-physical systems (CPS) and internet-of-things (IoT) in industry, government and academia. NIST recently created a Cyber-Physical Systems group and it is leading a public-private initiative to identify a general architecture, design principles, solutions, and challenges ahead. In Europe, the Horizon 2020 research program has targeted security issues relating to cyber-physical infrastructures and Internet of Things, while the fundamental science program CHIST-ERA launched calls for research projects on resilient trustworthy cyber-physical systems (2015) and user-centered security and privacy in the Internet of Things (2016). In the US, grant calls such as the 2019 CPS grant call by NSF (total of 50M USD) show the significance of CPS&IoT and their security.
In 2019, CCS accepted two workshops: One on CPS security and privacy (running 5 times in the past) and one on IoT security and privacy (running 2 times in the past). Since the topics are connected and there is some significant overlap, the Steering Committees of the two workshops have decided to merge towards delivering a world-class event on CPS&IoT security and privacy.
The seventh ACM Workshop on Moving Target Defense (MTD) Workshop is held virtually on November 9, 2020, in conjunction with the ACM Conference on Computer and Communications Security (CCS). The main objective of the workshop is to discuss novel randomization, diversification, and dynamism techniques for computer systems and network, new metric and analysis frameworks to assess and quantify the effectiveness of MTD, and discuss challenges and opportunities that such defenses provide. New this year the workshop has incorporated a number of invited papers to capture systematization of knowledge (SoK) from experts in this field that investigate the past ten years of MTD and discuss the way forward. We have constructed an exciting and diverse program of three refereed papers, five invited papers, and two invited keynote talks that will provide the participant with a vibrant and thought-provoking set of ideas and insights.
With the rapid development of technology, data is becoming ubiquitous. User privacy and data security are drawing much attention over the recent years, especially with the European Union's General Data Protection Regulation (GDPR) and other laws coming into force. On one hand, from the customers' perspective, how to protect user privacy while making use of customers? data is a challenging task. On the other hand, data silos are becoming one of the most prominent issues for the society. From the business? perspective, how to bridge these isolated data islands to build better AI systems while meeting the data privacy and regulatory compliance requirements has imposed great challenges to the traditional machine learning paradigm. PPMLP will provide an opportunity to connect researchers from both CCS community and machine learning community to tackle these challenges.
The 19th Workshop on Privacy in the Electronic Society (WPES 2020) was held as a virtual conference on 9 November, 2020, in conjunction with the 27th ACM Conference on Computer and Communication Security (CCS 2020). The goal of WPES is to bring together privacy researchers and practitioners to discuss the privacy problems that arise in an interconnected society and solutions to those problems. The program for the workshop contains 12 full papers and 3 short papers selected from a total of 34 submissions. Specific topics covered in the program include but are not limited to: communication privacy, data anonymization, differential privacy, medical privacy, mobile privacy, privacy engineering, privacy policies, user perception of privacy, and Web privacy.
Recent years have seen a dramatic increase in applications of Artificial Intelligence (AI), Machine Learning (ML), and data mining to security and privacy problems. The analytic tools and intelligent behavior provided by these techniques make AI and ML increasingly important for autonomous real-time analysis and decision making in domains with a wealth of data or that require quick reactions to constantly changing situations. The use of learning methods in security-sensitive domains, in which adversaries may attempt to mislead or evade intelligent machines, creates new frontiers for security research. The recent widespread adoption of "deep learning'' techniques, whose security properties are difficult to reason about directly, has only added to the importance of this research. In addition, data mining and machine learning techniques create a wealth of privacy issues, due to the abundance and accessibility of data. The AISec workshop provides a venue for presenting and discussing new developments in the intersection of security and privacy with AI and machine learning.
The workshop on "Attacks and Solutions in HardwarE Security"(ASHES) welcomes any theoretical and practical works on hardware security, including attacks, solutions, countermeasures, proofs, classification, formalization, and implementations. Besides mainstream research, ASHES puts some focus on new and emerging scenarios: This includes the internet of things (IoT), nuclear weapons inspections, arms control, consumer and infrastructure security, or supply chain security, among others. ASHES also welcomes dedicated works on special purpose hardware, such as lightweight, low-cost, and energy-efficient devices, or non-electronic security systems. The workshop hosts four different paper categories: Apart from regular and short papers, this includes works that systematize and structure a certain (sub-)area (so-called "Systematization of Knowledge" (SoK) papers), and so-termed "Wild and Crazy" (WaC) papers, which distribute seminal ideas at an early conceptual stage. This summary gives a brief overview of the fourth edition of the workshop, which will take place virtually on November 13, 2020, as a post-conference satellite workshop of ACM CCS.
The goal of CYSARM workshop is to foster collaboration among researchers and practitioners to discuss the various facets and trade-offs of cyber-security. In particular, how new technologies and algorithms might impact the cyber-security of existing or future models and systems.
The Fifth Workshop on Forming an Ecosystem Around Software Transformation (FEAST) provides a forum for presentation and discussion of new tools, methodologies, and techniques facilitating the automated or semi-automated transformation and analysis of software executables for improving their security and efficiency without the benefit of any original source code whence they were developed. Late-stage software customization of this form is of particular benefit to security-conscious software consumers who must use closed-source or source-free binary software components in mission-critical settings, or who must harden software against newly emerging attacks not anticipated during the software's original design and development. However, code analysis and transformation becomes much more difficult without the aid of source-level information to provide a context for its intended operation. This outstanding challenge motivates the FEAST Workshop's goal of forming a robust ecosystem of strategies and tools for accomplishing source-free binary code transformation reliably and on-demand.
The 15th ACM SIGSAC Workshop on Programming Languages and Analysis for Security (PLAS 2020) is co-located with the 27th ACM Conference on Computer and Communications Security (ACM CCS 2020). Over its now more than ten-year history, PLAS has provided a unique forum for researchers and practitioners to exchange ideas about programming language and program analysis techniques with the goal of improving the security of software systems. Strongly encouraged are proposals of new, speculative ideas, evaluations of new or known techniques in practical settings, and discussions of emerging threats and important problems.
Differential privacy is a rigorous mathematical model of privacy protection that has been the subject of deep theoretical research and also been deployed in real-world systems. This workshop aims to bring together a diverse array of researchers and practitioners to provoke stimulating discussion about the current state of differential privacy, in theory and practice. TPDP aims to be an inclusive forum that seeks to grow and diversify the differential privacy community.