What Are the Most Popular “Machine Learning as a Service” Tools in 2021?

A list of the most popular MLaaS tools, from least to most flexible.
By Claudia Virlanuta • Updated on Nov 30, 2022

In recent years, machine learning (ML) has grown in popularity. In fact, it’s become a game-changer for many businesses by solving complicated problems faster than humans can. The machine learning market was worth $7.3 billion in 2020, and Forbes estimates that it will be worth $30.6 billion by 2024.

Data scientists carry most of the technical responsibility for building effective machine learning systems. However, once a machine learning system is built, it also needs to be maintained and monitored. This is where things can get tricky, and also where “Machine Learning as a Service,” or MLaaS, comes into play. MLaaS can be especially helpful if you’re a small- or medium-sized company that doesn’t have the resources needed to build your infrastructure from scratch. In short, the term “MLaaS” refers to cloud-based platforms that provide ready-made machine learning tools to help you scale up the way your company uses its data, without beginning from square one.


In this article, I’ll talk about some of the most popular MLaaS tools. They are grouped into four categories and ordered from least to most flexible. Keep in mind that higher flexibility doesn’t always mean a better solution. In fact, it often means more time spent on model development, configuration, and maintenance.


What Are Semi-Specialized Platforms for Language and Vision?


Language Platforms

Language platforms are used to train custom text models on your own data, with text in a specified language as the input. Some examples of language platform tools include MonkeyLearn, Lateral, Google AutoML Natural Language, and Amazon Comprehend.

Let’s take a closer look at Amazon Comprehend, which was made to extract key phrases, people, brands, places, and events from a given data set. It was also designed to detect the negative and positive connotations of a text, in other words its sentiment. The analysis relies on tokenization, the process of breaking down a piece of text into small units called tokens. Once that process is completed, Amazon Comprehend can automatically organize text files by topic. You can also use the AutoML capabilities of Amazon Comprehend to build text classification models that specifically meet your needs.
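To make the tokenization step more concrete, here is a minimal Python sketch. It is not Amazon Comprehend’s implementation or API: the `tokenize` and `toy_sentiment` functions and the tiny word lists are hypothetical, and they only illustrate how a text can be broken into tokens and then scored.

```python
import re

# Hypothetical word lists, for illustration only.
# Real services rely on trained models, not hand-written lists.
POSITIVE_WORDS = {"great", "love", "fast", "helpful"}
NEGATIVE_WORDS = {"slow", "broken", "hate", "confusing"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def toy_sentiment(text):
    """Label a text by counting its positive and negative tokens."""
    tokens = tokenize(text)
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    if positives > negatives:
        return "POSITIVE"
    if negatives > positives:
        return "NEGATIVE"
    return "NEUTRAL"


review = "The dashboard is great, but the export is slow and confusing."
print(tokenize(review))
print(toy_sentiment(review))
```

A managed language platform replaces the hand-written word lists with models trained on large amounts of text, but the first step of turning raw text into tokens follows the same idea.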

Amazon Comprehend tool

Image Source: https://aws.amazon.com/blogs/machine-learning/analyze-content-with-amazon-comprehend-and-amazon-sagemaker-notebooks/

Some language platform tools also offer data annotation services, where machine learning is used to create accurate labels for a given data set. A great example is the crowdsourced Data Labeling Service created by Google. Other tools, like Appen, also offer a dedicated annotation platform.



Vision Platforms

While language platforms use text as input, vision platforms use images or videos. Some examples of these tools are Clarifai, Google AutoML Vision, AutoML Video Intelligence, and Amazon Rekognition. Like language platforms, vision platforms output labels that identify concepts associated with given images or videos. For instance, labels generated by vision platforms can be categories like “internal projects” (as opposed to external projects) or specific team names from within your company.

These semi-specialized platforms are great because they let you create a working model quickly. They do not, however, let you tailor the model training process yourself.



What Are High-Level Platforms as a Service?

High-level platforms as a service, or PaaS, are easy to use because they don’t require installation, so you can stress less about infrastructure. Another benefit is that they can automatically detect the type of problem that needs to be solved. For example, a high-level PaaS can detect whether a given data set calls for a classification or a regression model. These platforms can also automatically prepare the data and perform tasks like encoding of categorical variables, feature selection, normalization, and more. Finally, they can automatically configure their learning algorithm, which makes this kind of tool useful for those with less machine learning knowledge and experience. Some examples of platform-as-a-service tools include Google AutoML Tables, Google Time Inference API, BigML, and Microsoft Azure ML.
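As a rough illustration of the kind of work these platforms automate, the following Python sketch guesses the problem type from a target column, one-hot encodes a categorical column, and normalizes a numeric one. All function names, the `max_classes` threshold, and the sample data are hypothetical; real platforms use their own, more sophisticated logic.

```python
def detect_problem_type(target_values, max_classes=10):
    """Guess the task from the target column (a hypothetical heuristic)."""
    distinct = set(target_values)
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in distinct
    )
    # Text targets always look like class labels.
    if not all_numeric:
        return "classification"
    # A handful of distinct whole numbers also looks like class labels.
    if len(distinct) <= max_classes and all(float(v).is_integer() for v in distinct):
        return "classification"
    return "regression"


def one_hot_encode(values):
    """Turn a categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def min_max_normalize(values):
    """Rescale a numeric column to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Made-up sample data.
plans = ["free", "pro", "free", "team"]
monthly_spend = [0.0, 49.5, 12.25, 199.0]
churned = ["no", "no", "yes", "no"]

encoded, categories = one_hot_encode(plans)
print(categories, encoded[1])
print(min_max_normalize(monthly_spend))
print(detect_problem_type(churned))        # text labels
print(detect_problem_type(monthly_spend))  # continuous values
```

Writing and maintaining steps like these by hand is exactly the effort a high-level platform is meant to save, at the cost of less control over how each step is done.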

Let’s take a closer look at Microsoft Azure ML. It is essentially an MLaaS platform with two model authoring environments: Automated ML and Designer. While Automated ML helps users create models quickly, Designer lets users view and edit model training pipelines, which makes it easier to understand their own pipelines and catch potential errors.

Microsoft Azure ML

Image Source: https://techcommunity.microsoft.com/t5/azure/how-to-get-started-with-azure-machine-learning/m-p/54679


What Are Self-Hosted Studios?

Self-hosted studios are based on standard, open-source machine learning libraries, which allows users to customize both the libraries and the code. A notable advantage of self-hosted studios is that they help you avoid “lock-in,” a term referring to being stuck with a specific company or vendor because the tool you’re using depends on proprietary languages or knowledge. With self-hosted studios, you can export machine learning pipelines as Python scripts and keep them readily accessible to you or others. Another advantage is that you can export trained models to different open formats.
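To show why open formats help against lock-in, here is a minimal Python sketch that trains a tiny model, exports it as plain JSON, and reloads it without the code that trained it. The functions and the JSON layout are hypothetical and are not taken from any of the studios mentioned in this article. In practice, open model formats such as ONNX or PMML play this role; JSON stands in here only to keep the sketch free of dependencies.

```python
import json


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def export_model(params):
    """Serialize the trained parameters to JSON, a vendor-neutral text format."""
    return json.dumps({"model_type": "linear_regression", "params": params})


def load_model(serialized):
    """Rebuild a prediction function from the exported JSON alone."""
    params = json.loads(serialized)["params"]
    return lambda x: params["slope"] * x + params["intercept"]


# Made-up training data that follows y = 2x + 1.
params = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
serialized = export_model(params)
predict = load_model(serialized)

print(serialized)
print(predict(10))
```

Because the exported file is plain text in a documented format, any other tool or language can read it, which is the property that keeps you from depending on a single vendor.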

On the other hand, self-hosted studios are considered low-level solutions compared to the other MLaaS tools in this list. These solutions are not offered as a service and need to be installed and hosted on your own machines. This can make everything more expensive and difficult to use if you’re just getting started with machine learning.

Some notable self-hosted studio tools are DataRobot, RapidMiner, and Dataiku. All of these tools offer Designer and AutoML features.


What Are Cloud Machine Learning IDEs?

Cloud machine learning Integrated Development Environments, or IDEs, are powerful GPU-equipped virtual machines provided by cloud platforms, offered together with JupyterLab environments on pre-configured infrastructure. These IDEs let you run experiments more efficiently, and even around the clock. They can also scale experiments up to run on CPUs with many cores, as well as on machines with powerful GPUs, plenty of RAM, and pre-configured clusters.


However, cloud machine learning IDEs are lower-level tools than ML studios. While low-level features are intended to increase flexibility and core model access for advanced users, they can in some cases make model development, configuration, and maintenance more time-consuming than they need to be. Some examples of cloud ML IDEs are Floyd, Google AI Platform Notebooks, Databricks, Amazon SageMaker, and Faculty.ai.


This list of MLaaS tools is non-exhaustive, and more importantly, you should choose the most suitable tools based on your needs and application. Even if you’re new to machine learning, many tools require little to no experience to get started. As technology continues to advance, you may wonder if it’s worth it to build your own tools and models. Purchasing an MLaaS solution, as opposed to starting from scratch, has many advantages. Once you’ve decided that you may benefit from machine learning, you’re ready to pick the right tools for you and begin to work toward success.


Claudia Virlanuta

CEO | Data Scientist

Claudia is a data scientist, consultant and trainer. She is the CEO of Edlitera, a data science and machine learning training and consulting company helping teams and businesses futureproof themselves and turn their data into profits.

Before Edlitera, Claudia taught Computer Science at Harvard, and worked in biotech (Qiagen), marketing tech (ZoomInfo), and ecommerce (Wayfair). Claudia earned her degree in Economics from Yale, with a focus on Statistics and Computer Science.