Harnessing the Power of General-Purpose LLMs in Hardware Trojan Design

Georgios Kokolakis, Athanasios Moschos, Angelos D. Keromytis
Artificial Intelligence in Hardware Security Workshop - Applied Cryptography and Network Security, 2024

Large language models (LLMs) are becoming a powerful, transformative force of automation in software engineering and cybersecurity. In software-centric security research, LLMs have taken on a prime role in the identification and repair of security vulnerabilities and bugs. In hardware-related fields such as logic design and hardware security, however, the use of LLMs has only recently started to gain traction. In this work we explore the potential of LLMs in the offensive hardware security domain. More specifically, we explore the level of assistance that LLMs can provide to attackers for the insertion of vulnerabilities, known as hardware trojans (HTs), into complex hardware designs (e.g., CPUs). With high-level attack outlines in mind, we test the ability of a general-purpose LLM to act as a “filter” that correlates system-level concepts of security interest with specific module abstractions of hardware designs. In doing so, we tackle the challenge posed by the context-length limit of LLMs, which becomes prevalent in LLM-based analyses of large code bases. Next, we initiate an LLM analysis of the reduced code base, which includes only the register-transfer level (RTL) code of the identified modules, and test the LLM’s ability to locate the parts that implement the queried security-related features. In this way, we reduce the complexity of the overall analysis performed by the LLM. Lastly, we instruct the LLM to insert suitable trojan functionality by modifying the identified code parts accordingly. To showcase the potential of our automated LLM-based hardware trojan insertion flow, we craft a realistic HT for a modern RISC-V micro-architecture. We test the functionality of the LLM-generated HT on an FPGA board by attacking the integrity and the availability of the RISC-V CPU. We thus demonstrate how general-purpose LLMs can navigate attackers through complex hardware designs and assist them in implementing realistic HT attacks.
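The abstract does not include the authors’ prompts or tooling, so the sketch below is only illustrative of the three-stage flow it describes: filtering modules by their abstractions to fit the LLM’s context window, locating the security-relevant RTL in the reduced code base, and instructing the LLM to modify the identified code. It assumes an OpenAI-style chat-completions client; every prompt, path, and helper name (ask, filter_modules, locate_feature, insert_trojan) is hypothetical.

```python
# Minimal sketch of the three-stage flow described in the abstract.
# All prompts, paths, and helper names are assumptions, not the paper's
# published tooling. Assumes an OpenAI-style chat client (pip install openai).
from pathlib import Path
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4"  # any general-purpose chat LLM


def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content


# Stage 1 -- "filter": correlate a system-level security concept with
# module abstractions. Only module names and declarations are sent, so
# the prompt stays well under the context-length limit.
def filter_modules(rtl_dir: Path, concept: str) -> list[str]:
    summaries = []
    for path in sorted(rtl_dir.glob("*.v")):
        first_line = next(iter(path.read_text().splitlines()), "")
        summaries.append(f"{path.name}: {first_line}")
    answer = ask(
        "Given these Verilog modules (file name: declaration):\n"
        + "\n".join(summaries)
        + f"\n\nWhich modules most likely implement '{concept}'? "
        "Reply with file names only, one per line."
    )
    return [line.strip() for line in answer.splitlines() if line.strip()]


# Stage 2 -- locate: analyze only the reduced code base and point at the
# RTL that realizes the queried security-related feature.
def locate_feature(rtl_dir: Path, files: list[str], concept: str) -> str:
    code = "\n\n".join((rtl_dir / name).read_text() for name in files)
    return ask(
        f"In the following RTL, identify the code that implements "
        f"'{concept}' and describe where its logic could be observed:\n{code}"
    )


# Stage 3 -- modify: instruct the LLM to alter the identified code region
# so that it implements the trojan behavior while preserving everything else.
def insert_trojan(module_src: str, analysis: str, behavior: str) -> str:
    return ask(
        f"Modify this Verilog module so that {behavior}. Keep all other "
        f"behavior unchanged.\nAnalysis of the relevant logic:\n{analysis}"
        f"\n\nModule:\n{module_src}"
    )
```

Keeping the stages separate is what makes the flow scale: only stage 1 ever sees the whole design, and it sees it at the coarsest abstraction, while stages 2 and 3 operate on the reduced code base of a few identified modules.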