What Are the Roles of a Data Science Team?

Data science is a team sport, but who are the players?

What Does a Data Analyst Do?
- Data Analyst Further Reading
What Does a Data Scientist Do?
- Data Scientist Further Reading
What Does a Data Engineer Do?
- Data Engineer Further Reading

As Duke behavioral economics professor Dan Ariely once famously said, big data is a lot like teenage sex: everyone talks about it, no one really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.

As you might have heard, or felt in your own recruiting efforts, data scientists are rare birds: hard to find, let alone recruit. One could argue, however, that much of this perceived scarcity comes from treating data science as a catch-all role. Companies go looking for data scientists who can build and maintain a data warehouse, set up a data pipeline for analysis, run analyses that reveal groundbreaking insights every time, and then turn around and build resilient production systems running perfectly optimized machine learning algorithms that automate those analyses and predictions seamlessly every time a customer logs in.

Talk about wishful thinking! Given this job description, it is no wonder that positions remain unfilled for months, or even years, while companies sit on unproductive big data assets that could otherwise give them a solid competitive edge in the marketplace.

In my experience, great data scientists are rarely lone wolves who work in solitude and emerge only occasionally to utter brilliant insights. In most companies that use it successfully, data science is a team sport in which data analysts, data engineers, and data scientists work together.

Here is a brief overview of what each of these professionals does, and how they complement one another.

What Does a Data Analyst Do?

Data analysts are interpreters of structured data. They are spreadsheet whizzes who write SQL queries to extract data from relational databases. Data analysts can be found in many functional groups within a company, including finance, marketing, operations, and business intelligence.
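
To make the SQL side of the job concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The orders table, its columns, and the sample rows are all made up for illustration; a real analyst would point the same kind of query at the company's own database.

```python
import sqlite3


def revenue_by_region(conn):
    """Return (region, total_revenue) rows, largest total first."""
    query = """
        SELECT region, SUM(amount) AS total_revenue
        FROM orders
        GROUP BY region
        ORDER BY total_revenue DESC
    """
    return conn.execute(query).fetchall()


# Hypothetical data: an in-memory table standing in for a real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("East", 120.0),
        ("West", 80.0),
        ("East", 30.0),
        ("West", 50.0),
        ("North", 40.0),
    ],
)

rows = revenue_by_region(conn)
for region, total in rows:
    print(f"{region}: {total:.2f}")
```

The query groups orders by region and sums the amounts, which is the kind of aggregate that often ends up in a spreadsheet or a management report.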

While this role does not get nearly the same publicity as the more glamorous-sounding data scientist and data engineer, it can be a very fulfilling job in the right company, as well as a gateway to higher-level analytics positions. Data analysts often have a lot of freedom in choosing the direction of their analyses, and they frequently see their work inform management decisions on a daily basis, which can be very satisfying and a great source of professional pride.

What's more, as a data analyst you will develop highly transferable skills that you can apply to many other roles in a variety of industries if you ever wish to change career tracks. The role also offers exposure to a variety of tools and analysis techniques, which not only increase your marketability as an analyst but are also useful in data science work, often the next step on the career ladder for a seasoned analyst.

Data analysts typically have a bachelor's degree, though a degree is not always required if you can show that the skills you acquired in previous jobs are relevant to the role.

Data Analyst Further Reading

Some good sources for further reading on doing analytics are the following:

Data Analytics for Beginners by Paul Kinley
Data Analysis Made Accessible by Anil Maheshwari
Storytelling with Data: A Data Visualization Guide for Business Professionals by Cole Nussbaumer Knaflic

(Kinley's and Knaflic's books are available on Kindle Unlimited with a subscription, and Maheshwari's book is only available in ebook format, though it is totally worth a read.)

What Does a Data Scientist Do?

It's been said that a data scientist is someone who is better at statistics than any software engineer, and better at software engineering than any statistician. This saying is actually not very far from the truth, in my opinion. However, I would add that a data scientist should also be able to make their results and findings accessible to non-technical audiences, so that business stakeholders can rally around the findings and data products put forth, and see to it that they are used effectively for the benefit of the organization.

In short, data scientists are interpreters of unstructured data. A data scientist is typically able to fetch data from public APIs, integrate heterogeneous data from multiple sources, clean it, and impute missing values from the data that is available. From there, they formulate hypotheses and test them using math, statistics, visualization, and predictive modeling. Once they have results, data scientists communicate them to stakeholders and work with them to translate the results into business action items.
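
As a rough illustration of the clean-then-test part of that workflow, the sketch below fills in missing values with the median and then runs a permutation test on the difference in means between two groups. It uses only the Python standard library. The session-length numbers, the variable names, and the choice of median imputation and a permutation test are assumptions made for this example; in practice a data scientist would pick the imputation strategy and the statistical test to suit the data.

```python
import random
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled data and split it into two pseudo-groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical session lengths (minutes) for two page designs; None = missing.
control = [12.0, None, 11.5, 13.0, 12.5, None, 11.0, 12.0]
variant = [14.0, 15.5, None, 14.5, 16.0, 15.0, 14.0, 15.5]

control_clean = impute_median(control)
variant_clean = impute_median(variant)
p_value = permutation_p_value(control_clean, variant_clean)
print(f"Estimated p-value: {p_value:.4f}")
```

A small p-value suggests the difference between the two groups is unlikely to be due to chance alone. Note that imputing values before testing shrinks the apparent variance, so a result like this is a starting point for discussion with stakeholders rather than a final answer.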

Many data scientists working in the industry have a Ph.D. or another advanced degree, but I have also met many accomplished data science practitioners who started in the job with only a bachelor’s degree and relevant work experience.

Data Scientist Further Reading

For further reading, the following three books cover everything from introductory to advanced topics in data science. Master the concepts in them, and you will be well ahead of most data scientists out there.

Data Science from Scratch: First Principles with Python by Joel Grus
Programming Collective Intelligence by Toby Segaran
Doing Data Science: Straight Talk from the Frontline by Cathy O'Neil and Rachel Schutt

What Does a Data Engineer Do?

Data engineers are responsible for building and maintaining the infrastructure that transports and houses big data. A data engineer is the one setting up and configuring a Hadoop cluster, building a Spark Streaming pipeline, or migrating a company’s data assets to a public cloud service such as AWS.
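
The tools named above operate at a much larger scale, but the basic shape of a pipeline can be sketched in a few lines of plain Python. The example below is a toy extract-transform-load flow using only the standard library, with sqlite3 standing in for a data warehouse. The event data, the column names, and the function names are all hypothetical, and this is not how Hadoop or Spark are actually programmed.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real pipeline would read from files or a stream.
RAW_EVENTS = """user_id,event,duration_ms
1,login,120
2,login,not_a_number
3,purchase,450
,login,80
"""


def extract(text):
    """Yield one dict per CSV row."""
    yield from csv.DictReader(io.StringIO(text))


def transform(rows):
    """Keep rows with a user id and a numeric duration; convert the types."""
    for row in rows:
        if not row["user_id"] or not row["duration_ms"].isdigit():
            continue  # a real pipeline would log or quarantine bad rows
        yield int(row["user_id"]), row["event"], int(row["duration_ms"])


def load(conn, records):
    """Write the cleaned records to the events table; return how many."""
    records = list(records)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(user_id INTEGER, event TEXT, duration_ms INTEGER)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", records)
    conn.commit()
    return len(records)


conn = sqlite3.connect(":memory:")
loaded = load(conn, transform(extract(RAW_EVENTS)))
print(f"Loaded {loaded} clean rows")
```

Keeping extraction, transformation, and loading as separate steps makes each one easier to test and replace, which is the same idea that larger pipeline frameworks build on.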

In some companies, machine learning engineers are also called data engineers, though the requirements of the two roles can differ considerably. Most of the data engineers I’ve met started out as back-end or full-stack developers who developed an interest in data technologies and taught themselves Hadoop, Spark, and AWS before transitioning to data engineering. Advanced degrees are typically not required for this role.