As Bloomberg reports, machine-learning systems depend on such large volumes of correctly labeled data that even the biggest cybersecurity companies are sometimes forced to crowd-source their malware training data.
Northeastern University researcher Giorgio Severi has pointed out that “a resourceful hacker” can exploit this potential vulnerability. Such a hacker would create malware, label it as “good,” and then slip it into a large data trove, fooling a neural network into believing that the malware is benign.
Two other researchers, Cheng Shin-ming and Tseng Ming-huei, have reportedly demonstrated that poisoning less than 0.7% of the data fed into a machine-learning system was enough to fully bypass its defenses. So it doesn’t take much malware to crack a neural network, and it doesn’t take much tainted data to put the system at risk, either.
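To make the mechanics concrete, the following is a minimal, hypothetical sketch of label-flipping poisoning against a toy classifier. It is not the attack the researchers demonstrated: the data is synthetic, the scikit-learn setup and the 3% poisoning rate are illustrative assumptions, and a real attacker would be constrained by how the vendor collects and vets samples.

```python
# Illustrative sketch only: label-flipping poisoning of a toy "malware"
# classifier on synthetic data. Label 1 = malicious, 0 = benign.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# The attacker relabels a small slice of malicious samples as "benign"
# before the pooled data is used for training.
malicious_idx = np.flatnonzero(y == 1)
poisoned_idx = rng.choice(malicious_idx, size=int(0.03 * len(y)), replace=False)
y_poisoned = y.copy()
y_poisoned[poisoned_idx] = 0

clean_model = RandomForestClassifier(random_state=0).fit(X, y)
poisoned_model = RandomForestClassifier(random_state=0).fit(X, y_poisoned)

# Fraction of the attacker's targeted samples each model still flags as malicious.
print("clean model detects:   ", clean_model.predict(X[poisoned_idx]).mean())
print("poisoned model detects:", poisoned_model.predict(X[poisoned_idx]).mean())
```

Running the sketch, the clean model flags nearly all of the targeted samples, while the model trained on the mislabeled pool waves most of them through, which is the basic effect the reporting describes.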
Cybersecurity firms are working on the data-poisoning threat. One safeguard would be for AI model developers to periodically confirm that all of their training data is accurately labeled. AI research company OpenAI LP, co-founded by Elon Musk, said that its new image-generating tool uses special filters to ensure that data sets are accurately labeled.
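One way to picture that kind of periodic confirmation is a sampling audit that re-checks labels against an independent source before retraining. The sketch below is a generic illustration, not OpenAI’s filtering approach; `trusted_verdict`, the sample size, and the alert threshold are all hypothetical placeholders.

```python
# Illustrative sketch only: periodically re-check a random sample of training
# labels against a trusted source (manual analysis, a vetted scanner, etc.).
import random

def audit_labels(dataset, trusted_verdict, sample_size=200, alert_threshold=0.01):
    """Flag the dataset if too many sampled labels disagree with the trusted source.

    dataset: list of (sample, label) pairs, where label is "malicious" or "benign".
    trusted_verdict: callable returning the trusted label for a sample (hypothetical).
    """
    sampled = random.sample(dataset, min(sample_size, len(dataset)))
    mismatches = [(s, lbl) for s, lbl in sampled if trusted_verdict(s) != lbl]
    mismatch_rate = len(mismatches) / len(sampled)
    if mismatch_rate > alert_threshold:
        # In practice this would trigger a deeper review before the data is
        # used to retrain a model.
        print(f"ALERT: {mismatch_rate:.1%} of sampled labels disagree with the trusted source")
    return mismatches
```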
As TechGenix reports, cybersecurity threats from data poisoning include training an AI to wrongly trust external parties, manipulating a pool of data to make something seem more prevalent than it really is, or guiding an AI to use incorrect parameters, such as those determining whether a person is eligible for a loan.
Warnings about data poisoning aren’t new. Johannes Ullrich, dean of research at the SANS Technology Institute, highlighted the risk in his RSA 2021 keynote presentation, as Security Intelligence reports. “One of the most basic threats when it comes to machine learning is one of the attackers actually being able to influence the samples that we are using to train our models,” Ullrich said.
“Adversarial data poisoning is an effective attack against ML and threatens model integrity by introducing poisoned data into the training dataset,” researchers from Cornell University have written, as TechHQ reports.
Government officials have warned that AI-driven cybersecurity will still require human oversight, as TechTarget reports.