ASHES’19: Proceedings of the 3rd ACM Workshop on Attacks and Solutions in Hardware Security Workshop
In this talk, I will discuss how recent advances in side-channel analysis and leakage-resilience could lead to both stronger security properties and improved confidence in cryptographic implementations. For this purpose, I will start by describing how side-channel attacks exploit physical leakages such as an implementation’s power consumption or electromagnetic radiation. I will then discuss the definitional challenges that these attacks raise, and argue why heuristic hardware-level countermeasures are unlikely to solve the problem convincingly. Based on these premises, and focusing on the symmetric setting, securing cryptographic implementations can be viewed as a tradeoff between the design of modes of operation, underlying primitives and countermeasures. Regarding modes of operation, I will describe a general design strategy for leakage-resilient authenticated encryption, propose models and assumptions on which security proofs can be based, and show how this design strategy encourages so-called leveled implementations, where only a part of the computation needs strong (hence expensive) protections against side-channel attacks. Regarding underlying primitives and countermeasures, I will first emphasize the formal and practically relevant guarantees that can be obtained thanks to masking (i.e., secret sharing at the circuit level), and how considering the implementation of such countermeasures as an algorithmic design goal (e.g., for block ciphers) can lead to improved performance. I will then describe how limiting the leakage of the less protected parts in a leveled implementation can be combined with excellent performance, for instance with respect to the energy cost.
I will conclude by putting forward the importance of sound evaluation practices in order to empirically validate (by lack of falsification) the assumptions needed both for leakage-resilient modes of operation and for countermeasures like masking, and motivate the need for an open approach for this purpose. That is, by allowing adversaries and evaluators to know implementation details, we can expect to enable a better understanding of the fundamentals of physical security, therefore leading to improved security and efficiency in the long term.
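The masking countermeasure mentioned in the keynote abstract above (secret sharing at the circuit level) can be illustrated with a minimal sketch. It assumes plain Boolean (XOR) sharing of a single byte; the function names `share`, `unshare` and `masked_xor` are hypothetical and do not come from the talk. Only a linear operation is shown: nonlinear operations such as AND gates or S-boxes need dedicated gadgets and fresh randomness, which are not covered here.

```python
import secrets

def share(value: int, n_shares: int = 2, bits: int = 8) -> list[int]:
    """Split `value` into n_shares random shares whose XOR equals `value`."""
    mask = (1 << bits) - 1
    # The first n_shares - 1 shares are uniformly random.
    shares = [secrets.randbits(bits) for _ in range(n_shares - 1)]
    # The last share is chosen so that all shares XOR back to the value.
    last = value & mask
    for s in shares:
        last ^= s
    return shares + [last]

def unshare(shares: list[int]) -> int:
    """Recombine shares by XORing them together."""
    acc = 0
    for s in shares:
        acc ^= s
    return acc

def masked_xor(a_shares: list[int], b_shares: list[int]) -> list[int]:
    """XOR is linear, so it can be computed share by share
    without ever recombining either operand."""
    return [x ^ y for x, y in zip(a_shares, b_shares)]

if __name__ == "__main__":
    state, key = 0x3A, 0xC5            # illustrative values only
    s, k = share(state, 3), share(key, 3)
    assert unshare(masked_xor(s, k)) == state ^ key
```

In this sketch no single share depends on the secret value alone, which is the intuition behind the guarantees the abstract attributes to masking.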
SESSION: Full Papers
FPGA-SoCs are heterogeneous computing systems consisting of reconfigurable hardware and high performance processing units. This combination enables a flexible design methodology for embedded systems. However, the sharing of resources between these heterogeneous systems opens the door to attacks from one system on the other. This work considers Direct Memory Access attacks from a malicious hardware block inside the reconfigurable logic on the CPU. Previous works have shown similar attacks on FPGA-SoCs containing no memory isolation between the FPGA and the CPU. Our work studies the same idea on a system based on the Xilinx Zynq UltraScale+ architecture. This platform contains memory isolation mechanisms such as a system memory management unit and memory protection units, and it supports ARM TrustZone technology. Despite the existence of these protection mechanisms, the two attacks presented in this work show that a malicious hardware block can still interfere with a security-critical task executed on the CPU inside ARM TrustZone.
Simple Electromagnetic Analysis Attacks based on Geometric Leak on an ASIC Implementation of Ring-Oscillator PUF
Physically unclonable functions (PUFs) are assumed to provide high tamper resistance against counterfeiting and hardware attacks since PUFs extract inherent physical properties from random and uncontrollable variations in manufacturing. Recent studies have reported vulnerabilities of PUFs to physical and mathematical attacks. This paper focuses on the security evaluation of a ring-oscillator PUF (RO PUF) against electromagnetic analysis (EMA) attacks. We designed an RO PUF with a 180-nm CMOS process to evaluate the threats of EMA attacks. The power consumption of this RO PUF is reduced as much as possible to reduce EM leaks, and EMA resistance is enhanced in the layout design. We show the EMA-attack results on our RO PUF and discuss the threats of EMA attacks on the application-specific integrated circuit (ASIC) implementation of RO PUFs. We also propose a new EMA attack on RO PUFs. The key is geometric leak. All components of an RO PUF are usually arranged in a matrix or an array. Geometric periodicity in the layout of RO PUFs leaks secret PUF responses. Whereas previous attacks require identifying individual ROs, the proposed attack, called simple EMA (SEMA) attack based on geometric leak, reveals a PUF response directly from one measured EM trace. These attacks correctly predicted 94.2% of PUF responses of our RO PUF. We present how a PUF response is revealed from a measured EM trace, suggesting that such attacks pose a serious threat to RO PUFs.
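For readers unfamiliar with RO PUFs, the following minimal sketch shows one common textbook way a response is derived: frequencies of ring-oscillator pairs are compared and each comparison yields one bit. This is an assumption for illustration, not the construction of the ASIC in the abstract above; the function name `ro_puf_response`, the nominal frequency and the variation model are all hypothetical.

```python
import random

def ro_puf_response(frequencies, pairs):
    """Return one response bit per (i, j) pair:
    1 if oscillator i runs faster than oscillator j, else 0."""
    return [1 if frequencies[i] > frequencies[j] else 0 for i, j in pairs]

if __name__ == "__main__":
    rng = random.Random(0)                 # fixed seed: simulated "device"
    nominal_mhz = 200.0                    # hypothetical nominal frequency
    # Manufacturing variation modelled as small Gaussian offsets (assumption).
    freqs = [nominal_mhz + rng.gauss(0.0, 1.0) for _ in range(16)]
    # Compare neighbouring oscillators: (0,1), (2,3), ...
    pairs = [(2 * k, 2 * k + 1) for k in range(8)]
    print(ro_puf_response(freqs, pairs))
```

Because the bits depend only on which oscillator of each pair is faster, anything that reveals the relative frequencies, such as an EM trace, can reveal the response.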
Physical attacks constitute a significant threat for any cryptosystem. Among them, Side-Channel Analysis (SCA) is commonly used to stress the security of embedded devices like smartcards or secure controllers. Nowadays, it has also become relevant for mobile and connected devices requiring a high security level. Yet its applicability to smartphones is not obvious, as the architecture of modern System-on-Chips (SoCs) is becoming ever more complex.
This paper describes how a secret AES key was retrieved from the hardware cryptoprocessor of a smartphone. It is part of an attack scenario targeting the bootloader decryption. The focus is on practical realization and the challenges it brings. In particular, catching meaningful signals emitted by the cryptoprocessor embedded in the main System-on-Chip can be troublesome. Indeed, the Package-on-Package technology makes access to the die problematic and prevents straightforward near-field electromagnetic measurements. The described scenario can apply to any device whose chain-of-trust relies on firmware encryption, such as many smartphones or Internet-of-Things nodes.
Physical cryptographic implementations are vulnerable to side-channel attacks, including fault attacks, which can be used to recover a secret key. Using a deep neural network (NN) with fault intensity map analysis (FIMA), we present a new highly efficient statistical fault analysis technique called FIMA-NN. This technique employs a convolutional neural network (CNN) to rank the key candidates based on multiple features in data distribution under fault with varying intensities, and generalizes most existing statistical techniques including fault sensitivity analysis (FSA), differential fault intensity analysis (DFIA), statistical ineffective fault analysis (SIFA), and FIMA. As FIMA-NN does not rely on a single feature of data distribution, it is successful even in the presence of a wide variety of countermeasures against fault analysis. Using a simulated fault mechanism on an FPGA implementation of AES, we demonstrate that, in terms of required amount of collected ciphertexts, FIMA-NN is 7.3 and 4.5 times more efficient than statistical techniques using bias alone, at low and medium fault intensities, respectively. Further, in the presence of error-detection and infective countermeasures, FIMA-NN is 10.7 and 7.9 times more efficient than bias-based techniques, respectively.
FPGA systems on chip (SoCs) are ideal computing platforms for edge devices in applications which require high performance through hardware acceleration and updatability due to long operation in the field. A secure update of hardware functionality can in general be achieved by using built-in cryptographic engines and provided secret key storage. However, reported examples have shown that such cryptographic engines may become insecure against side-channel attacks at any later point in time. This leaves already deployed systems vulnerable without any clear mitigation options. To solve this, we propose a comprehensive concept that uses an alternative and side-channel protected cryptographic engine within the FPGA logic instead of the built-in one for the crucial task of bitstream decryption. Remarkably, this concept even allows updating the cryptographic engine itself. As proof of concept, we describe an application to the Xilinx Zynq-7020 FPGA SoC in detail using a leakage-resilient decryption engine. The lack of accessible secret key storage poses a significant challenge and requires the use of a physical unclonable function (PUF) to generate a device-intrinsic secret within the FPGA logic. At the same time, this means that no manufacturer-provided secret key storage or cryptography is required anymore; only a public key for signature verification of the first stage bootloader and initial static bitstream is needed. We provide empirical results proving the side-channel security of the protected cryptographic engine as well as an evaluation of the PUF quality. The full design and source code are made available to encourage further research in this direction.
Side-channel attacks exploit architectural features of computing systems and algorithmic properties of applications executing on these systems to steal sensitive information. Cache side-channel attacks are more powerful and practical compared to other classes of side-channel attacks due to several factors, such as the ability to be mounted without physical access to the system. Some secure cache architectures have been proposed to counter side-channel attacks. However, they all incur significant performance overheads. This work explores the viability of using adaptive caches, which are conventionally used as a performance-oriented architectural feature, as a defense mechanism against cache side-channel attacks. We conduct an empirical analysis, starting from establishing a baseline for the attacker’s ability to infer information regarding the memory accesses of the victim process when there is no active defense mechanism in place and the attacker is fully aware of all the cache parameters. Then, we analyze the effectiveness of the attack without complete knowledge of the cache configuration. Finally, based on the insight that the success of the attack is heavily dependent on knowledge of the cache configuration, we implement the run-time cache reconfigurations and observe their effect on the success of the attack. We observe that reconfiguring different cache parameters during a side-channel attack reduces the accuracy of the attack in detecting cache sets accessed by the victim by 44% on average, with a maximum of 90% reduction.
Reverse engineering of integrated circuits (ICs) serves an ever-growing need for both defensive and offensive applications, such as competitive analysis, IP theft evidence and hardware Trojan detection. The IC reverse engineering process comprises two phases: netlist extraction and specification discovery. The latter draws particular research interest due to fundamental questions about the process, namely how to represent a specification and how to measure the success of the process. In this paper, we survey the state of the art in IC reverse engineering, focusing on specification discovery. We generate a taxonomy of the published methods and algorithms, list the challenges and open questions and discuss future directions.
Due to the outsourcing of semiconductor design and manufacturing, a number of threats have emerged in recent years: overproduction of integrated circuits (ICs), illegal sale of defective ICs, and piracy of intellectual properties (IPs). Logic locking is one method to enable trust in these complex IC design and manufacturing processes, where a design is obfuscated by inserting a lock that modifies the underlying functionality so that an adversary cannot make a chip function properly. A locked chip will only work properly once it is activated by programming a secret key into its tamper-proof memory. Over the years, researchers have proposed different locking mechanisms, primarily to prevent Boolean satisfiability (SAT)-based attacks and thereby preserve the security of a locked design. However, an untrusted foundry, the adversary, can use many other effective means to find out the secret key. In this paper, we present a novel oracle-less and topology-guided attack denoted as TGA. The attack relies on identifying repeated functions for determining the value of a key bit. The proposed attack does not require any data from an unlocked chip, and eliminates the need for an oracle. The attack is based on self-referencing, i.e., it compares the internal netlist to find the key. The proposed graph search algorithm efficiently finds a duplicate function of the locked part of the circuit. Our proposed attack correctly estimates a key bit very efficiently, taking only a few seconds to determine the key bit. We also present a solution to thwart TGA and make logic locking secure.
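The locking idea described above, inserting a key-controlled gate that modifies the functionality unless the right key is applied, can be sketched on a toy circuit. This is a minimal illustration assuming a single XOR key gate on one internal wire; the circuit and the names `original`, `locked` and `key_unlocks` are hypothetical, and the sketch does not implement TGA or any published locking scheme.

```python
from itertools import product

def original(a: int, b: int, c: int) -> int:
    """Toy unlocked function: (a AND b) OR c."""
    return (a & b) | c

def locked(a: int, b: int, c: int, key: int) -> int:
    """Same circuit with an XOR key gate on the internal wire a AND b.
    With key = 0 the XOR is transparent; with key = 1 it inverts the wire."""
    w = (a & b) ^ key
    return w | c

def key_unlocks(key: int) -> bool:
    """Check exhaustively whether `key` restores the original function."""
    return all(
        locked(a, b, c, key) == original(a, b, c)
        for a, b, c in product((0, 1), repeat=3)
    )

if __name__ == "__main__":
    print(key_unlocks(0), key_unlocks(1))
```

The exhaustive check here plays the role of an oracle (a working reference); the attack in the abstract is notable precisely because it does not need one.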
Clock glitches are an inexpensive method to attack embedded systems. Usually the intention is to alter the program flow or to extract cryptographic keys. However, the widespread use of Phase Locked Loops (PLLs) prohibits direct reach-through to the internal clock. Hence, the commonly applied procedure of inducing glitches on the external clock does not have any effect on these systems. In this paper, we show by means of two different ARM Cortex-M microcontrollers that, even though the system clock is derived from the external clock signal by a PLL, fault injection by manipulation of the external clock signal is still feasible. Even though the process of fault injection is impeded, our results indicate that the risk from this attack vector cannot be eliminated by the use of PLLs. We demonstrate this in practice by successfully performing a differential fault attack on an AES implementation.
This paper presents a low-cost distance-spoofing attack on a mmWave Frequency Modulated Continuous Wave (FMCW) radar. It uses only a replica radar chipset and a single compact microcontroller board, both of which are in mass production. No expensive and bulky test instrument is required, and hence a low-cost and lightweight attack setup is developed. Even with the limited hardware resources in this setup, the replica radar can be precisely synchronized with the target radar for distance-spoofing capability. A half-chirp modulation scheme enables timing compensation between crystal oscillators on the replica and the target radar boards. A two-step delay insertion scheme precisely controls the relative delay difference between the two radars at ns-order, and as a result the attacker can manipulate the distance measured at the target radar with only around ±10m ranging error. This demonstrates the potential feasibility of a low-cost malicious attack on a commercial FMCW radar as a physical security threat. A countermeasure employing random-chirp modulation is proposed and its security level is evaluated under the proposed attack for secure and safe radar ranging.
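The link between inserted delay and spoofed distance in the abstract above follows from the standard idealized FMCW ranging relations: a linear chirp with slope S turns a round-trip delay τ into a beat frequency S·τ, and the range is c·τ/2. The sketch below encodes only these textbook relations; the chirp slope value and the function names are hypothetical and are not taken from the paper.

```python
C = 299_792_458.0  # speed of light in m/s

def beat_frequency(distance_m: float, slope_hz_per_s: float) -> float:
    """Beat frequency of an ideal linear chirp for a target at distance_m."""
    round_trip_delay = 2.0 * distance_m / C
    return slope_hz_per_s * round_trip_delay

def range_from_beat(f_beat_hz: float, slope_hz_per_s: float) -> float:
    """Invert beat_frequency: the range the radar would report."""
    return C * f_beat_hz / (2.0 * slope_hz_per_s)

def spoofed_range_offset(extra_delay_s: float) -> float:
    """Apparent range shift caused by an extra delay in the echo path."""
    return C * extra_delay_s / 2.0

if __name__ == "__main__":
    slope = 30e12  # hypothetical chirp slope: 30 MHz per microsecond
    f_b = beat_frequency(50.0, slope)
    print(f"beat frequency at 50 m: {f_b / 1e6:.3f} MHz")
    print(f"range offset per 1 ns of delay: {spoofed_range_offset(1e-9):.3f} m")
```

In this idealized model, each nanosecond of relative delay shifts the reported range by roughly 0.15 m, which is why precise delay control matters for a replica-based spoofer.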
A Large Scale Comprehensive Evaluation of Single-Slice Ring Oscillator and PicoPUF Bit Cells on 28nm Xilinx FPGAs
Many field programmable gate array (FPGA)-based security primitives have been developed, e.g., physical unclonable functions (PUFs) and true random number generators (TRNGs). To accurately evaluate the performance of a PUF or other security designs, data from a large number of devices are required. A slice is the smallest reconfigurable logic block in an FPGA. The maximum or minimum entropy exploitable from each slice of an FPGA is an important factor for the design of a single-bit disorder-based security primitive. Previous research has shown that the locations of slices can impact the quality of delay-based PUF designs implemented on FPGAs. To investigate the effect of the placement of each single-bit PUF cell free from the routing resource constraint between slices, single-bit ring oscillator (RO) and identity-based PUF design (PicoPUF) cells that can each be fully fitted into a single slice are evaluated. 217 Xilinx Artix-7 FPGAs have been employed to provide a large-scale comprehensive analysis for the two designs. This is the first time two different single-slice-based security entities have been investigated and compared on 28nm Xilinx FPGAs. Experimental results, including uniqueness, uniformity, correlation, reliability, bit-aliasing and min-entropy, based on 4 different floorplan locations are presented. The experimental results demonstrate that the lower the correlation between devices, the higher the min-entropy and uniqueness for both designs on the FPGAs. While the implementation location of both designs on the FPGA affects their performance, the overall min-entropy, correlation and uniqueness of the PicoPUF are slightly higher than those of the RO. All other metrics, including uniformity, bit-aliasing and reliability, of the PicoPUF are slightly lower than those of the RO. The raw data for the PicoPUF design is made publicly available to enable the research community to use them for benchmarking and/or validation.
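Several of the metrics named in the abstract above have widely used definitions based on fractional Hamming distance. The sketch below follows those common definitions as an assumption; the paper's exact formulas may differ, and the function names and toy data are hypothetical.

```python
from itertools import combinations

def hamming_fraction(a, b):
    """Fraction of bit positions in which two equal-length responses differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def uniformity(response):
    """Fraction of 1s in a single device's response (ideal: 0.5)."""
    return sum(response) / len(response)

def bit_aliasing(responses):
    """Per bit position, fraction of devices producing a 1 (ideal: 0.5)."""
    n = len(responses)
    return [sum(column) / n for column in zip(*responses)]

def uniqueness(responses):
    """Mean pairwise inter-device Hamming fraction (ideal: 0.5)."""
    pairs = list(combinations(responses, 2))
    return sum(hamming_fraction(a, b) for a, b in pairs) / len(pairs)

def reliability(reference, remeasurements):
    """1 minus the mean intra-device Hamming fraction (ideal: 1.0)."""
    intra = sum(hamming_fraction(reference, r) for r in remeasurements)
    return 1.0 - intra / len(remeasurements)

if __name__ == "__main__":
    # Toy 4-bit responses from three hypothetical devices.
    devices = [[0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
    print(uniformity(devices[0]), bit_aliasing(devices), uniqueness(devices))
    print(reliability(devices[0], [[0, 1, 1, 0], [0, 1, 0, 0]]))
```

With real data, each inner list would hold the response bits collected from one FPGA at one floorplan location, so the metrics can be compared across locations and designs.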