Dr. Avimanyu Sahoo, a researcher at The University of Alabama in Huntsville (UAH), has received a National Science Foundation (NSF) award totaling $299,969 to characterize the vulnerabilities of learning-based intelligent cyber-physical systems (CPS) and defend them. A CPS is a symbiotic integration of physical systems, sensors, actuators and learning-based intelligent controllers connected through communication networks. Examples include smart grids, robotic swarms and autonomous vehicles.
Sahoo is an assistant professor in the Department of Electrical and Computer Engineering at UAH, a part of The University of Alabama System. The effort is a collaborative project with Dr. Vignesh Narayan, an assistant professor at the Artificial Intelligence Institute at the University of South Carolina. The total funding for the project is $600,000, and the project is slated to run through June 2027.
In CPS, many systems are connected to one another via a communications network. Learning-based intelligent controllers handle decision making, which upgrades the capabilities of CPS and provides numerous benefits, such as helping a system adapt to real-world environments and achieve optimal function. These controllers seek to mimic how the human brain processes reward and decision signals when it learns.
However, “the introduction of a learning component adds an additional layer of security challenges, which adversaries can exploit via cyber attacks,” Sahoo explains. “By feeding manipulated reward signals, the adversary can subtly steer the controller into following an adversarial policy instead of its intended one.
“For instance, an autonomous vehicle may have a goal to travel from Huntsville to Nashville via I-65, but an adversary could manipulate the reward signals to discourage this route by providing negative rewards whenever the vehicle travels on I-65. As a result, the vehicle might take a longer, less efficient route, consuming more fuel, ultimately serving the adversary's objectives.”
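The route example can be sketched in a few lines of Python. This is only an illustration, not the project's method: the two routes, the travel rewards, the size of the adversary's penalty and the function name `learn_route` are all assumptions made up for the sketch. A simple learner estimates the value of each route from the rewards it receives, and the adversary subtracts a penalty from every reward earned on I-65.

```python
import random

# Hypothetical travel rewards (negative hours on the road); illustrative values only.
TRUE_REWARD = {"I-65": -2.0, "detour": -3.5}

def learn_route(poison_penalty=0.0, episodes=200, epsilon=0.1, seed=0):
    """Epsilon-greedy value learning over two routes; returns the preferred route."""
    rng = random.Random(seed)
    q = {route: 0.0 for route in TRUE_REWARD}       # estimated value of each route
    counts = {route: 0 for route in TRUE_REWARD}    # how often each route was taken
    for _ in range(episodes):
        if rng.random() < epsilon:
            route = rng.choice(list(q))             # explore a random route
        else:
            route = max(q, key=q.get)               # exploit the best estimate
        reward = TRUE_REWARD[route]
        if route == "I-65":
            reward -= poison_penalty                # adversary tampers with the signal
        counts[route] += 1
        q[route] += (reward - q[route]) / counts[route]   # running mean of rewards
    return max(q, key=q.get)

clean_choice = learn_route(poison_penalty=0.0)
poisoned_choice = learn_route(poison_penalty=5.0)
print(clean_choice, poisoned_choice)
```

In this toy setting the learner never sees the true cost of I-65 once the penalty is applied, so its preference flips to the longer route even though nothing about the road itself has changed.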
The goal is to identify attacks that manipulate a controller’s actions. “In a microgrid or power grid, all the power plants are connected,” the researcher notes. “If we attack one generating system, it will affect the others. Therefore, to systematically study this, we start by analyzing a single generator under attack, focusing on how different types of attacks can bias the controller’s decision-making.”
The overarching aim of the project is to understand the information patterns that adversaries can exploit to manipulate controller decisions through an approach called “reinforcement learning.”
“In reinforcement learning, an agent or controller interacts with an environment by taking actions and receiving feedback in the form of rewards,” Sahoo says. “The goal is to learn an optimal policy that maximizes cumulative rewards. For example, in a microgrid (a cyber-physical system comprising generators, controllers and loads) a controller regulates parameters like voltage or frequency. The generator (acting as the environment) evaluates the controller's action and provides a reward based on how well the regulation goal was achieved.
“A positive reward indicates that the control action was successful, while a negative reward suggests otherwise. These reward signals guide the controller in refining its policy for future actions. The environment typically uses performance evaluation criteria to generate these reward signals, helping the controller iteratively improve its actions.”
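The loop Sahoo describes can be sketched with tabular Q-learning, a standard reinforcement learning algorithm chosen here only for illustration; the article does not say which algorithm the project uses. Everything in the sketch is an assumption: the frequency deviation is reduced to five discrete levels, the controller has three actions, and the reward rule in `environment_step` is a made-up performance criterion.

```python
import random

# Hypothetical toy model: frequency deviation in discrete steps, 0 = on target.
STATES = (-2, -1, 0, 1, 2)
ACTIONS = (-1, 0, 1)          # lower, hold, or raise the generator set-point

def environment_step(deviation, action):
    """Apply the control action and score how well regulation was achieved."""
    nxt = max(-2, min(2, deviation + action))
    reward = 1.0 if nxt == 0 else -float(abs(nxt))   # positive only when on target
    return nxt, reward

def train(episodes=2000, steps=20, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning; returns the greedy action for each state."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(STATES)                        # start from a random deviation
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)               # explore
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])   # exploit
            nxt, r = environment_step(s, a)
            best_next = max(q[(nxt, act)] for act in ACTIONS)
            # Move the estimate toward reward plus discounted future value.
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = nxt
    return {s: max(ACTIONS, key=lambda act: q[(s, act)]) for s in STATES}

policy = train()
print(policy)
```

Under these assumptions the learned policy should push the deviation back toward zero and hold it there. Because the controller learns only from the reward values it receives, anything that alters those values in transit changes what it learns.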
The research will be applicable to fending off adversaries in military or civilian arenas.
“This is not only restricted to the military,” Sahoo points out. “The reinforcement learning approach is used in many fields, such as control of any dynamical systems (power grid, microgrid, robotics, autonomous vehicles). In fact, our project will validate the research outcomes in a microgrid scenario. We will also demonstrate the research to high school students through robotic applications.”
Given the dizzying speed of technological change, safeguarding advanced systems will only grow in priority.
“The reinforcement learning approach helps in systematically identifying vulnerabilities and potential countermeasures for maintaining the integrity of decision-making in critical systems,” the researcher concludes.
This project is jointly funded through NSF by the Secure and Trustworthy Cyberspace (SaTC) program and the Established Program to Stimulate Competitive Research (EPSCoR).