Removing the human element from the data collection process for machine learning model training can introduce security vulnerabilities.
Machine learning is the darling of business operations, but breaking it has become a promising and innovative field of study in its own right. If deliberately sabotaging ML models sounds puzzling, a collaborative research team from MIT, the University of Maryland, the University of Illinois Urbana-Champaign, and the University of California, Berkeley wants you to imagine the possibilities.
In a paper published last month titled “Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses,” the researchers sought to categorize and examine a wide range of dataset vulnerabilities and exploits, and suggested approaches for defending against these threats.
As part of their work, the researchers summarized the techniques used to attack training datasets. The research lays a foundation for predicting and preventing the security loopholes that businesses and other organizations face as artificial intelligence models continue to evolve rapidly.
These purposeful attacks include poisoning classification data so that a model misrepresents or misclassifies items, as well as perturbing trained models so that they produce aberrant results. The researchers explored these poisoning and backdoor attacks to build a better understanding of how to prevent them.
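To make the idea concrete, here is a minimal sketch of one of the simplest poisoning attacks, label flipping, in which an attacker who controls part of the training pipeline silently relabels a fraction of the training set and degrades the accuracy of the learned model. The synthetic dataset, logistic-regression model, and 25% poisoning rate are illustrative assumptions for this sketch, not the paper's experimental setup.

```python
# Sketch of label-flipping data poisoning: an attacker flips the labels of a
# fraction of the training set, degrading the accuracy of the trained model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

def train_and_score(labels):
    """Train on the (possibly poisoned) labels and score on clean test data."""
    model = LogisticRegression(max_iter=1000).fit(X_train, labels)
    return model.score(X_test, y_test)

# Clean baseline.
print("clean accuracy:", train_and_score(y_train))

# Poisoned run: flip the labels of 25% of the training set.
poison_idx = rng.choice(len(y_train), size=len(y_train) // 4, replace=False)
y_poisoned = y_train.copy()
y_poisoned[poison_idx] = 1 - y_poisoned[poison_idx]  # binary labels: 0 <-> 1
print("poisoned accuracy:", train_and_score(y_poisoned))
```

More sophisticated variants in this line of work, such as backdoor attacks, keep the poisoned labels plausible and instead embed a trigger pattern that only activates at inference time, which makes them far harder to spot than a blunt label flip.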
Why machine learning security matters to businesses and organizations
As machine learning becomes more widely adopted, training requirements grow exponentially, and organizations find they need far more data. To keep pace with this demand, many automate and outsource the curation of training data. Removing the human element from the data collection process can introduce security vulnerabilities: training data can be manipulated to control and degrade the output of learned models.
Furthermore, artificial intelligence and machine learning systems are enormously complicated, and businesses have already run into problems from unintentional data misuse. Add the possibility of backdoor attacks, or failures the business may not readily identify because of AI's black-box decision making, and it becomes challenging to build machine learning initiatives into the foundational level of a company.
The paper also explored several ways organizations can use knowledge of these perturbations to prevent attacks or recover from them. The more researchers probe these complex systems, the easier it becomes for businesses to predict weak spots and security loopholes, or to fix them after a successful attack.
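As an illustration of the defensive side, here is a minimal sketch of one common data-sanitization idea: score each example in an untrusted training set with a probe model fit on a small trusted subset, then discard the points least consistent with that model. The probe model, the drop fraction, and the `sanitize` helper are assumptions for illustration, not the specific defenses prescribed in the paper.

```python
# Sketch of a simple data-sanitization defense: use a small trusted subset to
# flag and drop the most suspicious points in a large untrusted training set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def sanitize(X_untrusted, y_untrusted, X_trusted, y_trusted, drop_fraction=0.1):
    """Drop the fraction of untrusted points least consistent with a probe
    model trained only on the trusted data (illustrative sketch)."""
    probe = LogisticRegression(max_iter=1000).fit(X_trusted, y_trusted)
    probs = probe.predict_proba(X_untrusted)
    # Probability the probe assigns to each point's *claimed* label.
    cols = np.searchsorted(probe.classes_, y_untrusted)
    agreement = probs[np.arange(len(y_untrusted)), cols]
    n_drop = int(drop_fraction * len(y_untrusted))
    keep = np.argsort(agreement)[n_drop:]  # discard the lowest-agreement points
    return X_untrusted[keep], y_untrusted[keep]

# Illustrative use: a large untrusted set with some flipped labels, plus a
# small hand-vetted trusted set drawn from the same distribution.
rng = np.random.default_rng(1)
X, y = make_classification(n_samples=2200, n_features=20, random_state=1)
X_untrusted, X_trusted, y_untrusted, y_trusted = train_test_split(
    X, y, test_size=200, random_state=1)
flip = rng.choice(len(y_untrusted), size=len(y_untrusted) // 5, replace=False)
y_untrusted[flip] = 1 - y_untrusted[flip]

X_clean, y_clean = sanitize(X_untrusted, y_untrusted, X_trusted, y_trusted,
                            drop_fraction=0.2)
print("kept", len(y_clean), "of", len(y_untrusted), "untrusted examples")
```

A real deployment would tune the drop fraction and validate on held-out trusted data, but the sketch shows how even a small clean sample can be leveraged to flag suspicious training points before a model ever sees them.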
This research is good news for organizations. Machine learning and AI must become more transparent to foster the trust businesses need to continue gathering customer data, and that transparency supports deploying these systems in broader contexts with security-first architecture.
While CEOs may not understand the full scope of these methods, IT teams are certainly paying attention. The more we know about how these models break down, the better prepared we are to deploy them and to catch what goes wrong before it becomes a security incident.