What is a Data Science Portfolio? (Part 1)

How to professionally showcase your Data Science skills.
By Claudia Virlanuta • Updated on May 2, 2023

We live in an automated, data-driven world. Not long ago, analytics was the job of the analytics team alone, and that team was the organization's only source of insights and business intelligence.

Nowadays, data is at the core of virtually every role in an organization, and everyone is responsible for generating their own insights. When it comes to career development, big data has become a bit like the proverbial bear: you either “eat” it by learning to tame it and put it to work for you, or it eats you.

Looking around the professional world, the divide between those who have tamed data and those who have not is more obvious every day.

The first group (Group One) is made up of the most successful professionals out there, who can use the data available to them in order to build tailored solutions to the problems in their professional space. The best are also charismatic storytellers who can clearly communicate to prospects and stakeholders the value that their solution provides, and thus build a following that will work to make them even more successful in the future.

The second and larger group (Group Two) is made up of task completers who take whatever work is available. While the first group enjoys high pay, engaging work, and flexible schedules, the second group is forced to make a living by stringing together gigs of mind-numbing work on an unpredictable schedule for abysmal pay.

While the professionals in the first group have abundant opportunities available to them and enjoy high influence and prestige through their work, the folks in the second group have low-influence roles and face an increasingly competitive landscape of shrinking opportunities due to automation and outsourcing.

Many things could be said about this division, but most are outside of the scope of this article. It bears noting, however, that it is possible (though not easy) for one to shape oneself into a Group One professional, as the students at Edlitera show us every day.

The point of this article is to lay out how, after acquiring the skills of a Group One professional, one can showcase these skills in a portfolio to firmly establish oneself as an expert in one's chosen profession. Since most of Edlitera's students pursue Data Science as their chosen profession, this two-part article will go over how to build a data science portfolio.

What is a Data Science Portfolio?

A portfolio is simply a collection of projects. A project is a coherent piece of analysis, prediction, and recommendations that focuses on answering a specific and well-defined question.

The primary goal of a portfolio is to showcase your skills and your ability to deliver a solid and coherent answer to the questions that prompted your analysis. Notice my use of the word “answer,” and not “result.” Many analysts and technical professionals tend to focus too much on technical details when talking about their work, which makes their answer to the original question difficult for laypeople to follow.

More on that later, though.

Before starting a new data science project, I strongly recommend giving Data Jujitsu, a (free) book by DJ Patil, a read.

Image Source: Data Jujitsu by DJ Patil, https://www.amazon.com

In addition to the advice in this book, take some time to consider each of the following points:

Audience

Who is the intended audience for your project? How will it be delivered to your audience? Will you present it in person during a job interview? Will you post it on GitHub or on your blog?

Data

What is a general topic that you wouldn’t mind spending anywhere from a few hours to a few days or weeks researching? Browse the datasets on kaggle.com, data.gov or any other data source that strikes your fancy and find something that is both interesting and that lends itself well to the particular analytical approach you intend to use.

Angle

What is an interesting question you can answer given your data?

Scope

Keep the scope as limited as possible: aim to answer one or two specific questions at most.

Tools

Pick whichever tools you know best. Remember, the goal is to showcase your skills, so play to your strengths and put your best foot forward!

Claudia Virlanuta

CEO | Data Scientist

Claudia is a data scientist, consultant, and trainer. She is the CEO of Edlitera, a data science and machine learning training and consulting company that helps teams and businesses future-proof themselves and turn their data into profits.

Before Edlitera, Claudia taught Computer Science at Harvard, and worked in biotech (Qiagen), marketing tech (ZoomInfo), and ecommerce (Wayfair). Claudia earned her degree in Economics from Yale, with a focus on Statistics and Computer Science.