Machine Learning Project Security: 5 Machine Learning Security Risks You Should Watch Out For

Last updated on Sep 29, 2021

Machine learning can offer your business plenty of benefits, but those benefits come hand in hand with security vulnerabilities. You may want your machine learning project to run as fast as possible at the lowest cost, yet good security often pulls the other way: it can be slow and expensive. This article looks at how secure machine learning is and lists the five main risks you should guard against.


1. Adversarial Examples

These are the most commonly encountered attacks. They aim to fool your machine learning system by feeding it malicious input that contains very small, hard-to-notice perturbations. They work like optical illusions for your system and can cause it to make false predictions and categorizations.
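To make the idea concrete, here is a minimal sketch of one well-known way to build such an input, the fast gradient sign method, applied to a toy logistic-regression classifier. The weights, the input, and the `epsilon` value are made-up assumptions for illustration, not taken from any real system.

```python
import math

# Hypothetical, hand-picked weights for a toy logistic-regression classifier.
WEIGHTS = [2.0, -3.0, 1.0]
BIAS = 0.0

def predict_proba(x):
    """Probability of class 1 under the toy model."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)

def fgsm_perturb(x, true_label, epsilon):
    """Shift each feature by +/- epsilon in the direction that increases the loss.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input is (p - y) * w.
    """
    p = predict_proba(x)
    gradient = [(p - true_label) * w for w in WEIGHTS]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]

x = [0.2, 0.1, 0.1]                                  # correctly scored as class 1
x_adv = fgsm_perturb(x, true_label=1, epsilon=0.1)   # each feature moves by at most 0.1

print(round(predict_proba(x), 2))      # about 0.55, so class 1
print(round(predict_proba(x_adv), 2))  # about 0.40, so class 0
```

No feature moves by more than 0.1, yet the predicted class flips. Real attacks target far larger models, but the principle is the same.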


2. Data Poisoning

This happens when an attacker manipulates the data fed into your machine learning system, thus compromising it. Your machine learning engineers should examine your training data, identify any weaknesses that leave it open to an attacker, and judge to what extent an attacker could exploit them. Attackers can even manipulate the raw data used to train models, so the training itself can go wrong.
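As a rough illustration, the sketch below trains a toy nearest-centroid classifier twice: once on clean data and once after an attacker slips in three mislabeled records. The data points and the classifier are hypothetical and chosen only to keep the example small.

```python
def fit_centroids(samples):
    """Return {label: mean feature value} for a list of (value, label) pairs."""
    totals, counts = {}, {}
    for value, label in samples:
        totals[label] = totals.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}

def classify(centroids, value):
    """Assign the label whose centroid is closest to value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

# Hypothetical one-feature training set: class 0 is low, class 1 is high.
clean = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]

# The attacker injects low values that are deliberately labeled as class 1.
poison = [(0.0, 1), (0.0, 1), (0.0, 1)]

clean_model = fit_centroids(clean)
poisoned_model = fit_centroids(clean + poison)

print(classify(clean_model, 4.5))     # 0
print(classify(poisoned_model, 4.5))  # 1
```

Three bad records are enough to drag the class 1 centroid from 8.0 down to 4.0, which changes the prediction for the same input.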


3. Online System Manipulation

Online machine learning systems keep learning during operational use and can modify their behavior over time. An easy-to-carry-out attack consists of nudging the still-learning system in the wrong direction through its input, so that it gradually retrains itself to do the wrong thing. To guard against this, your machine learning engineers should consider data provenance and algorithm choice very carefully.
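The nudging can be sketched with a toy online anomaly detector that keeps updating its idea of a "normal" value. The class name, the baseline, the learning rate, and the threshold factor are all assumptions made up for this example.

```python
class OnlineAnomalyDetector:
    """Flags values above `factor` times a running average that keeps learning."""

    def __init__(self, baseline, rate=0.2, factor=2.0):
        self.mean = baseline
        self.rate = rate
        self.factor = factor

    def is_anomaly(self, value):
        return value > self.factor * self.mean

    def observe(self, value):
        """Score the value, then learn from it only if it was accepted as normal."""
        flagged = self.is_anomaly(value)
        if not flagged:
            self.mean += self.rate * (value - self.mean)
        return flagged

detector = OnlineAnomalyDetector(baseline=100.0)
target = 1000.0
blocked_at_first = detector.is_anomaly(target)  # True: 1000 is above 2 * 100

# The attacker repeatedly submits values just under the current threshold,
# so each one is accepted and pulls the running average upward.
steps = 0
while detector.is_anomaly(target):
    detector.observe(detector.factor * detector.mean * 0.99)
    steps += 1

print(blocked_at_first, detector.is_anomaly(target), steps)
```

Every single input looks acceptable on its own, which is why this kind of drift is hard to spot without tracking where the data comes from.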


4. Transfer-Learning Attack

Machine learning systems are usually built by fine-tuning an already trained base model: its generic abilities are refined with specialized training. If the pre-trained model is widely available, attackers can devise attacks against it that also succeed against your tuned model. Make sure that the pre-trained model you fine-tune does not include unanticipated behaviors. There is also a risk when you take models for transfer from other groups. If you do so, make sure that their documentation describes exactly what their system does and how they control the risks.
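One small, practical check before fine-tuning is to confirm that the model file you downloaded is byte-for-byte the one its publisher documented. The sketch below assumes the publisher lists a SHA-256 digest for the file. The function names and the stand-in "model" bytes are hypothetical. Note that a matching digest only shows the file was not altered. It does not show that the model behaves as described.

```python
import hashlib
import hmac
import os
import tempfile

def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_expected_artifact(path, expected_sha256):
    """Compare the file's digest with the one the publisher documented."""
    return hmac.compare_digest(sha256_of_file(path), expected_sha256.lower())

# Demo with a stand-in file instead of real model weights.
payload = b"pretend these bytes are model weights"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    model_path = tmp.name

published_digest = hashlib.sha256(payload).hexdigest()
ok_before = is_expected_artifact(model_path, published_digest)

# Simulate tampering by appending bytes to the file.
with open(model_path, "ab") as handle:
    handle.write(b"tampered")
ok_after = is_expected_artifact(model_path, published_digest)

os.remove(model_path)
print(ok_before, ok_after)  # True False
```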


5. Data Confidentiality

Machine learning systems often include highly sensitive and confidential data that can be attacked. In this case, sub-symbolic ‘feature’ extraction may be helpful to an attacker because it can be used to hone adversarial attacks.
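One way confidential training data can leak is through the model's own outputs. The sketch below is a simplified illustration under strong assumptions: a toy model that memorizes its training rows and exposes a confidence score, and an attacker who can only call that score. The records and the threshold are made up.

```python
def make_model(training_rows):
    """Return a scoring function for a toy model that memorizes its training rows."""
    def score(query):
        nearest = min(
            sum((a - b) ** 2 for a, b in zip(row, query)) ** 0.5
            for row in training_rows
        )
        return 1.0 / (1.0 + nearest)  # 1.0 means an exact match with a training row
    return score

def guess_membership(score, candidate, threshold=0.99):
    """Attacker's view: only the public score function is available."""
    return score(candidate) >= threshold

# Hypothetical private records: (age, salary).
private_rows = [(34.0, 72000.0), (51.0, 38000.0), (29.0, 91000.0)]
public_score = make_model(private_rows)

print(guess_membership(public_score, (51.0, 38000.0)))  # True
print(guess_membership(public_score, (50.0, 40000.0)))  # False
```

An unusually confident answer tells the attacker that a specific record was in the training set, even though the data itself was never exposed directly.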

So, by now, you hopefully have a good idea about why risk management is an essential part of any data science project. Find out more about how to design a data science experiment here. Threats are always there and can range in severity, so be cautious and prepared. 

About the author

Claudia is a data scientist, consultant and trainer. She is the CEO of Edlitera, a data science and machine learning training and consulting company helping teams and businesses futureproof themselves and turn their data into profits.

Before Edlitera, Claudia taught Computer Science at Harvard, and worked in biotech (Qiagen), marketing tech (ZoomInfo), and ecommerce (Wayfair). Claudia earned her degree in Economics from Yale, with a focus on Statistics and Computer Science.