Creating Pivot Tables - in Python!
Do you need to create pivot tables in Excel, but prefer to work with Python? This is exactly what we are dealing with tody! In this post, we read in a csv file on a zoo’s food expenses and create a pivot table out of it - all in Python!
Load in a CSV File
First, we need to load in our data with pd.read_csv
. Note, how we specify the dictionary dtype
, which maps a data type to every column in our data. This can come in handy with more complex data, especially when we are dealing with dates.
Also, we specify the delimiter of our csv file: Usually this is the default “,”, but sometimes you have csv files with a different delimiter. Here it is the semicolon “;”. This symbol indicates where the program needs to split between columns and is hence essential!
Below we copy the variable data
to the variable df
, which we use in our code later on. This can be useful when working with larger datasets as it prevents that we need to reload our data from the disk (which can take a while) in case something goes wrong
1 | # load modules and get a data set |
We have a short look at our data, which gives us the food-expenses for a caretaker’s animal. This could be important for a zoo-director, for example to provide enough resources for each employee to properly take care of his animals.
1 | df |
Caretaker | Name | Animal | Food-Expenses | |
---|---|---|---|---|
0 | Dan | Teacup | Pig | 10.0 |
1 | Dan | Joey | Pig | 5.0 |
2 | Toni | Olivia | Dog | 6.0 |
3 | Dan | Emma | Cat | 2.0 |
4 | Anna | William | Pig | 8.0 |
5 | Toni | James | Parrot | 1.0 |
6 | Toni | Lucas | Dog | 7.0 |
7 | Anna | Mia | Parrot | 4.0 |
8 | Toni | Evelyn | Cow | 24.0 |
9 | Dan | Colt | Cow | 29.0 |
10 | Anna | Walter | Dog | 5.0 |
11 | Toni | Body | Pig | 11.0 |
12 | Anna | Rocky | Dog | 3.0 |
Visualize your Tables with Pandas Style
This is a very clean table, but sometimes we want to be a bit more fancy. We can be, if we use pandas style
attribute. Go have a look at the pandas documentation! You can even specify own style sheets for your tables and get very detailed with your designs!
Create Pivot-Tables, not in Excel - in Python!
To get improve our analysis, we can create a pivot table as they are often commong business practice.
Luckily, we do not need to switch to excel for this. We can just rely on pandas! The function pd.pivot_table
is all we need and let’s us specify index
and values
columns as lists. Index columns are used to group the table, while the values are the quantities of interes (here the Food-Expenses
). Also, we pass np.sum
to the aggfunc
argument.
This function yields us a pivot table that returns the sum of expenses per caretaker per type of animal - exactly what we where looking for above!
1 | pivot_table = pd.pivot_table(df, index=["Caretaker", "Animal"], |
1 | pivot_table |
sum | ||
---|---|---|
Food-Expenses | ||
Caretaker | Animal | |
Anna | Dog | 8.0 |
Parrot | 4.0 | |
Pig | 8.0 | |
Dan | Cat | 2.0 |
Cow | 29.0 | |
Pig | 15.0 | |
Toni | Cow | 24.0 |
Dog | 13.0 | |
Parrot | 1.0 | |
Pig | 11.0 |
As many businesses rely a lot on excel, it would be nice to deliver this insight to your boss or colleague in excel format. For this we can install the openpyxl
module and write the pivot table directly to excel with pd.df.to_excel
.
This approach even maintains our formating, so we could even add some styling to the table in python and be ready to go with our excel file!
For even more information on handling excel tasks in python, check out the great blog “Practical Business Python” - I adapted a small part of this post, but you should visit their more in-depth posts on excel and python!
1 | pivot_table.to_excel("caretaker_expenses.xlsx", engine="openpyxl") |
Creating Pivot Tables - in Python!