Hello and welcome back to a new article in my Intro to Programming series. Today, we'll discuss installing and using Python packages.
What Are Python Packages?
You may remember that, in my first article in the series, I mentioned that one huge benefit of using the Python programming language is that it has a large community of developers. There are a lot of programmers out there who not only use Python, but also write Python code that implements some functionality that is not already built into the language, and they then open source that code. By open sourcing it, they make it available to the rest of the world and other people can contribute to it and further enhance it. These bundles of code that are written by the community are called packages.
There are a lot of open-source Python packages out there: some are for doing data science, some are for writing machine learning algorithms, some are for creating websites, etc. If you can think of a use-case for a programming language, there's likely at least one package that makes it easier.
A Python package is a collection of code usually written by other people. I say “usually” here because you can actually write your own packages. The most important feature of these packages is that they include functions and other definitions that simplify a specific task, for example, the task of doing data analysis.
So far, all the functionality that we've been using has come from the Python standard library, which comes with every installation of the Python programming language. The packages included in this standard library are just the basics that we need and they are deliberately not very specialized.
When you need specialized packages, the best place to search for them is on PyPI (the Python Package Index), which is the largest repository of open-source Python packages. I included the link to the PyPI repository here. You should check it out to get a sense of the kind of packages that are out there.
Next, I want to highlight two things that are very important when it comes to packages. First, how to install a package, and second, how to use a package in your own program.
How to Install a Package
First, let's talk about installation.
To install a Python package, we need a package manager, which is a command-line program that is used to install, update and uninstall Python packages. There are two that are very popular: the first one is pip, and the second one is conda. Pip comes installed with most Python distributions. Conda comes installed with the Anaconda Python distribution. If you followed the instructions we went over in the Get Your Computer Ready to Run Python article, you should have both already installed on your computer, but if you did not install the Anaconda Python distribution, conda will likely not be available for you.
The main difference between pip and conda is the kind of packages they have access to. Pip installs packages from PyPI, while conda installs them from its own repositories, which are called channels. If you're working on data science related tasks, you'll probably want to use conda, because it can also install the non-Python dependencies that some packages rely on, whereas pip generally cannot. In this article, I'll mostly refer to the pip package manager, but pretty much everything that I'll cover here will also apply to conda, should you need to use that package manager instead.
To install a new Python package, all you have to do is launch the terminal or command line and then type pip install, followed by the name of the package you want to install. Or, if you're using conda, you can type conda install, again followed by the name of the package you want to install.
So for example, pip install scrapy or conda install scrapy will install the Scrapy Python package, which you can use to make your life a lot easier if you're doing web scraping.
Generally, you will rarely have to build things from scratch. Chances are, someone has already written a package that will help you along the way, so your first instinct should always be to search PyPI or the web for an existing Python package. You want to work with advanced math? There's a package for that, so you should install it instead of writing your own functions. You want to build a website? There's a package for that. You want to parse natural language to build a chatbot? There's a package for that. You get the idea. Always search for packages first before you go about building everything from scratch - it will make your life easier.
Let's briefly go over the exercise of installing a Python package. So I'm going to launch the Terminal app on my Mac, but if you have a PC, launch the command line application.
Ok, so once the terminal is loaded, I type the command directly into it: pip install scrapy, for example. Once I hit enter, the package will be downloaded and installed. And that's it. Now I have this Python package available on my computer so I can go ahead and use it.
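If you want to double-check from Python that an installation worked, one option is to ask Python's import system whether it can find the package. This is only a minimal sketch: the helper name is_installed is made up for illustration, and it relies on importlib.util.find_spec from the standard library.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can find a top-level package with this name."""
    # find_spec returns None when nothing importable has that name
    return importlib.util.find_spec(package_name) is not None


# "random" ships with Python, so this prints True
print(is_installed("random"))

# Prints True only if scrapy was installed into this Python environment
print(is_installed("scrapy"))
```

If the second line prints False right after a successful pip install, the package was probably installed into a different Python environment than the one running your code.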
How to Use a Package
Now let’s learn how to use a Python package.
This can be either a package that comes preinstalled with Python, or a package that you installed using pip or conda.
There are lots of useful packages out there, but today we'll just focus on a couple of them: random and datetime. Strictly speaking, both of these are modules (single files of Python code) that ship with the standard library, but we import and use them exactly like any other package, so I'll keep calling them packages here. The package named random implements a number of functions that make it easier for us to generate random numbers. Datetime is a Python package that makes it easier to work with dates and time. Datetime is a fairly large library, so we won't be able to cover all of the goodies that it includes, but that's ok because once you're comfortable importing packages and using them, you'll have all the tools you need to explore the rest of datetime on your own. So, let's launch our Jupyter notebook and write some code.
To begin with, one thing you need to be aware of is that, even if a package is installed on your computer, Python still needs to be explicitly told to load that package whenever you want to use it. Basically, we need to tell the Python interpreter that we want to use a certain package. We do that by using the keyword import followed by the name of the package we want to use. Let's start by exploring the random package, so we write import random.
If we run this line, nothing seems to happen. However, in the background, the package named random was loaded, and it is now available for us to use. The random package contains, for instance, a definition for a very useful function, also called random, that returns a random floating-point number between 0 and 1 (0 can come up, but 1 never will). If we run random.random() we'll get some random number. Very likely, you'll get a different number from mine, because, you know, it's random. We can run it again, and we'll get another random number.
# First, let's import the random package
import random

# If we run the code
random.random()
# we get a random float:
# 0.6170348542968803

# If we run it again,
random.random()
# we will get another random float:
# 0.02831839244676082
I want you to look at the line of code we ran. To run the function random from the package named random, we typed random.random(). The first part before the dot is the name of the package and what follows after the dot is the name of the function. And of course, since we're executing a function, we need to include the parentheses.
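The same "package, dot, function" pattern works for every other function in the package. As a small illustration that goes slightly beyond what we covered above, here is a sketch using three more functions from random: seed, randint and choice. The variable names dice_roll and color are just examples.

```python
import random

# Seeding makes the "random" results repeatable from run to run, handy while learning
random.seed(42)

# Package name, dot, function name, parentheses: the same pattern as random.random()
dice_roll = random.randint(1, 6)  # a whole number from 1 to 6, both ends included
color = random.choice(["red", "green", "blue"])  # one item picked from the list

print(dice_roll, color)
```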
When we imported the random package above, we imported all the functions that are defined in that package. But sometimes, we don't need all of them. In fact, what if we just want to use the random function and nothing else? In that case, we can instead write from random import random. This is the equivalent of saying "from the package called random, I only want the function called random". What this does is that it reads the package called random and only makes the function called random available to us. So in this context, the word random in the code below no longer refers to the package itself, but rather to the function inside the package. That's why, if we want to run the function, we just type random() - and that looks like the other function executions we've seen before.
# Another way to only import the random function is:
from random import random

# Now we can run the random function:
random()
# And we'll get a random float:
# 0.2905616446508019
Perhaps this random function inside a package also called random is a bit confusing, and I agree. The names are not ideal, but it is what it is.
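If the clash between the two names bothers you, Python lets you rename things as you import them with the keyword as. We haven't covered as in this series, so treat the following as an optional sketch; the aliases rnd and random_float are names I made up for this example.

```python
import random

# Here the name "random" is the package, so we reach the function through the dot
value_a = random.random()

# An alias gives the package a different name inside our program
import random as rnd
value_b = rnd.random()

# Importing the function under an alias avoids the package/function name clash
from random import random as random_float
value_c = random_float()

print(value_a, value_b, value_c)
```

All three calls run the very same function; only the name we use to reach it changes.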
Let's look at datetime. As I mentioned, datetime is a package that contains a number of objects (such as functions, data types, etc.) that make it easy to work with dates and times. We can start very simply by just importing the whole package. So we write import datetime. If you want to know what's included in the datetime package, the best thing to do is to search for the documentation for that package, all of which is available online.
# Let's import our package
import datetime
Inside the datetime package, there are several data types, one of which is called "date" and one of which is called "time". The date data type inside the datetime package also has a number of functions and methods that are relevant for working with dates. All this functionality for working with dates and with time is bundled into the datetime package.
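Besides reading the online documentation, you can peek inside a package directly from Python. Here is a rough sketch using the built-in functions dir and help; the variable name public_names is my own.

```python
import datetime

# dir() lists the names available inside the package;
# names that start with an underscore are meant for internal use, so we skip them
public_names = [name for name in dir(datetime) if not name.startswith("_")]
print(public_names)

# help() prints the built-in documentation for one item, for example the date data type.
# It is commented out here because the output is quite long.
# help(datetime.date)
```

The printed list should include both date and time, the two data types mentioned above.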
Let's focus on the date data type. The documentation tells us the date data type has a method called weekday that returns the day of the week for a given date as an integer. We can execute it by writing datetime.date(2008, 12, 3).weekday(). December 3rd, 2008, is when Python 3.0 was released. We see that the integer corresponding to the day of the week is 2, so it's a Wednesday: Monday is 0, Tuesday is 1 and Wednesday is 2. So you see that in order to use the method named weekday associated with the date data type inside the package named datetime, we first create a date with datetime.date(2008, 12, 3) and then call .weekday() on it. We basically use the dot to go one hierarchical level below: we start with the package name, then access the desired data type inside the package, and then finally the specific method we want to run. And of course, at the end, we have the parentheses, which are required to execute the method.
# Let's run our code to see what weekday December 3, 2008, is.
datetime.date(2008, 12, 3).weekday()
# We get the following output:
# 2
# Which means it was a Wednesday.
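Translating the integer into a day name by hand gets old quickly. One possible shortcut, sketched here, uses calendar, another package from the standard library that we haven't discussed. Its day_name list starts at Monday, just like weekday(), and gives English names unless your program has changed Python's locale settings. The variable names are only illustrative.

```python
import calendar
import datetime

release_day = datetime.date(2008, 12, 3)

# weekday() counts from Monday = 0, which matches the order of calendar.day_name
day_number = release_day.weekday()
day_name = calendar.day_name[day_number]

print(day_number, day_name)  # 2 Wednesday
```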
Just like before, if we know we'll only use the date data type inside the datetime package, we can just write from datetime import date. And now, we can just write date(2008, 12, 3).weekday(). Essentially, what this does is that it reads the datetime package, it figures out that we are interested only in the date data type, and it makes that data type available for us. And once we have that data type loaded in our current context, we can just execute the function we want by using the dot notation.
# We can also simply import the date data type from the package.
from datetime import date

# Our code will still run the same.
date(2008, 12, 3).weekday()
# Will still return
# 2
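You can also import several items from a package in one line by separating them with commas. As a taste of what else datetime offers, this sketch imports date together with timedelta, a data type we haven't covered that represents a span of time. The variable names are my own.

```python
from datetime import date, timedelta

python3_release = date(2008, 12, 3)

# Adding a timedelta to a date gives a new date
one_week_later = python3_release + timedelta(days=7)
print(one_week_later)            # 2008-12-10
print(one_week_later.weekday())  # 2, so it is also a Wednesday

# Subtracting two dates gives a timedelta; its days attribute is a whole number
days_between = (date(2009, 1, 1) - python3_release).days
print(days_between)              # 29
```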
So that's the basic idea behind using packages. As you get more advanced, you'll also learn how to write your own packages, but in the beginning you'll mostly be using either built-in packages or third-party packages.
Thank you for reading all about importing and using packages in this article. Stay tuned for the next article in my Intro to Programming series!