As vehicles become increasingly connected, they face a higher risk of being compromised. In-vehicle networks are especially prone to attack because they were originally designed with no security in mind, and one of the most common attacks is injecting messages into a vehicle’s CAN Bus.
In fact, the issue received such widespread media attention in 2015 that a Senate bill was proposed that year, and reintroduced in 2019, to “ensure cybersecurity in increasingly computerized vehicles”.
Similar threats exist for aircraft, smart factories, smart buildings, and of course, the increasing number of IoT appliances. But let’s focus on automotive cybersecurity in this tutorial.
Two of the proposed requirements in the SPY Car Act are:
- All entry points to the electronic systems of each motor vehicle manufactured for sale in the United States shall be equipped with reasonable measures to protect against hacking attacks
- Any motor vehicle manufactured for sale in the United States that presents an entry point shall be equipped with capabilities to immediately detect, report, and stop attempts to intercept driving data or control the vehicle
It is not obvious that such an intrusion detection system (IDS) could work, but it turns out that with careful system design, we can construct one with Human1st.AI. Normal CAN bus traffic during vehicle operation is highly regular (unlike traffic at an open node on the internet), and we can leverage this regularity to build an IDS.
Let’s dive in!
1a. CAN data basics
Let’s familiarize ourselves with vehicle data.
Controller Area Network (CAN Bus) is a common in-vehicle network architecture. It was designed to avoid massive physical wiring between Electronic Control Units (ECUs) in a vehicle. The payload of a CAN packet (also called a message) contains data from one or more ECUs, which we refer to as sensors, such as Car Speed, Steering Wheel Angle, Yaw Rate, Longitudinal Acceleration (Gx), and Lateral Acceleration (Gy).
CAN Bus’ simple communication protocol makes it vulnerable to cyber-attacks: messages are broadcast to every node and carry no authentication, among other security issues. As a result, injection attacks on the CAN Bus are common.
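To make the broadcast and no-authentication points concrete, here is a minimal sketch of a toy bus. `CanFrame`, `ToyBus`, and the CAN ID `0x0B4` are hypothetical names and values chosen for illustration; they are not part of H1st or of any real CAN library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanFrame:
    """Hypothetical, simplified view of a classic CAN frame."""
    can_id: int   # arbitration ID: identifies the message type, not the sender
    data: bytes   # classic CAN carries at most 8 payload bytes

    def __post_init__(self):
        if len(self.data) > 8:
            raise ValueError("classic CAN payload is at most 8 bytes")


class ToyBus:
    """Toy broadcast bus: every attached node receives every frame."""

    def __init__(self):
        self.listeners = []

    def attach(self, callback):
        self.listeners.append(callback)

    def send(self, frame):
        # No sender field and no signature: the bus simply forwards the frame.
        for callback in self.listeners:
            callback(frame)


bus = ToyBus()
received = []
bus.attach(received.append)

SPEED_ID = 0x0B4  # made-up CAN ID for a speed message
bus.send(CanFrame(SPEED_ID, bytes([0, 0, 0, 0, 0, 0, 50, 0])))    # genuine ECU
bus.send(CanFrame(SPEED_ID, bytes([0, 0, 0, 0, 0, 0, 200, 0])))   # injected by an attacker
```

From the receiver's point of view both frames look equally valid: they share the same ID, and nothing in the frame says who sent it.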
Note
All the tutorial notebooks and code are available from the H1st Github project at https://github.com/h1st-ai/h1st/tree/master/examples/AutoCyber.
Simply go ahead and clone it, then follow along.
The following dataset is based on https://zenodo.org/record/3267184#.XpHta1NKhQJ with important processing done by Arimo. The processing matters because recreating a realistic message frequency for each CAN ID is crucial for this problem; following along with the tutorial should make clear why.
For convenience, we provide a utility function to download this dataset, which is about 200MB in size.
Note that the data has a particular rhythm to it: each non-NA CarSpeed or YawRate comes at a regular interval, and YawRate/Gx/Gy messages always come with each other. In technical parlance, these are 3 different CAN IDs with different message payloads.
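The rhythm described above can be checked with a few lines of plain Python. This is only a sketch on a made-up toy stream: the `Timestamp` column name, the millisecond periods (12 and 24), and the signal values are assumptions for illustration, not properties of the real dataset.

```python
def make_toy_stream(duration_ms=120):
    """Build a made-up stream: one row per CAN message, None standing in for NA."""
    rows = []
    for t in range(0, duration_ms, 12):
        # YawRate/Gx/Gy travel together in one message (assumed period: 12 ms).
        rows.append({"Timestamp": t, "CarSpeed": None,
                     "YawRate": 0.1, "Gx": 0.0, "Gy": 0.0})
        if t % 24 == 0:
            # CarSpeed is a separate message (assumed period: 24 ms).
            rows.append({"Timestamp": t + 1, "CarSpeed": 50.0,
                         "YawRate": None, "Gx": None, "Gy": None})
    return rows


def inter_arrival(rows, signal):
    """Gaps between consecutive non-NA samples of `signal`."""
    times = [r["Timestamp"] for r in rows if r[signal] is not None]
    return [b - a for a, b in zip(times, times[1:])]


def always_together(rows, signals):
    """True if, in every row, the given signals are all present or all NA."""
    return all(len({r[s] is None for s in signals}) == 1 for r in rows)


rows = make_toy_stream()
speed_gaps = inter_arrival(rows, "CarSpeed")
yaw_gaps = inter_arrival(rows, "YawRate")
together = always_together(rows, ["YawRate", "Gx", "Gy"])
```

On real data the gaps will not be perfectly constant, so looking at their distribution per signal is more informative than testing for equality.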
1b. Simulating attacks
Now comes the hard & fun part, as we only have normal data. How can we develop an intrusion detection system?
The first natural step is to generate attack data. There are many ways to simulate such attacks but the cheapest method is simply to inject fake messages into the stored data stream.
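A minimal sketch of this cheapest approach is shown below. It is not the tutorial's own generator: `inject_attack`, the `Injected` label column, and all timings and values are hypothetical and chosen only for illustration.

```python
SIGNALS = ("CarSpeed", "YawRate", "Gx", "Gy")


def inject_attack(rows, signal, fake_value, start_ms, end_ms, period_ms):
    """Return a new stream with fake `signal` messages added in [start_ms, end_ms).

    Every row gets an `Injected` flag so the result can serve as labelled data.
    """
    fake_rows = []
    for t in range(start_ms, end_ms, period_ms):
        row = {"Timestamp": t, "Injected": True}
        row.update({s: None for s in SIGNALS})
        row[signal] = fake_value
        fake_rows.append(row)
    labelled = [dict(r, Injected=False) for r in rows]
    # Merge and restore time order, as a recorder on the bus would have seen it.
    return sorted(labelled + fake_rows, key=lambda r: r["Timestamp"])


# Made-up normal traffic: one CarSpeed message every 24 ms.
normal = [{"Timestamp": t, "CarSpeed": 50.0, "YawRate": None, "Gx": None, "Gy": None}
          for t in range(0, 240, 24)]

# Attacker floods a fake speed value every 6 ms between t=50 ms and t=100 ms.
attacked = inject_attack(normal, "CarSpeed", 200.0,
                         start_ms=50, end_ms=100, period_ms=6)

timestamps = [r["Timestamp"] for r in attacked]
min_gap = min(b - a for a, b in zip(timestamps, timestamps[1:]))
```

Note how the injection disturbs the message rhythm: inside the attack window the smallest gap between CarSpeed messages drops well below the normal period.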
A more realistic (and also more expensive) method to safely simulate attacks is to inject messages directly into the CAN bus while the vehicle is stationary (engine on, transmission in park), or while the vehicle is in motion in a controlled driving environment or test track, as in tests conducted by the NHTSA.
For convenience, we have provided some synthetic samples (generated using aegis_datagen.py). We can visualize one such attack as follows.
The key question is: can an ML/AD system distinguish the injected messages from the normal ones?
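As a naive baseline for this question, one can exploit message timing alone. The sketch below is not the approach this tutorial builds; it is a simple timing check under assumed, made-up periods, and `learn_period` and `flag_early_arrivals` are hypothetical helpers.

```python
from statistics import median


def learn_period(timestamps):
    """Estimate the normal message period as the median inter-arrival gap."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return median(gaps)


def flag_early_arrivals(timestamps, expected_period, tolerance=0.5):
    """Indices of messages arriving sooner than tolerance * expected_period
    after the previous message of the same CAN ID."""
    threshold = tolerance * expected_period
    return [i for i in range(1, len(timestamps))
            if timestamps[i] - timestamps[i - 1] < threshold]


# Made-up timings in milliseconds for a single CAN ID.
normal = list(range(0, 240, 24))      # genuine messages every 24 ms
injected = list(range(50, 100, 6))    # fake messages every 6 ms
attacked = sorted(normal + injected)

period = learn_period(normal)
flags_normal = flag_early_arrivals(normal, period)
flags_attacked = flag_early_arrivals(attacked, period)
```

In this toy case the check raises no alarm on normal traffic and flags every message inside the attack window. It also flags the genuine messages that fall inside that window, so timing alone says when something is wrong but not which messages are fake.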