Use FILTER() Instead of a Pivot Table

Introduction

I was presenting the FILTER() function in my Super Forecasting Course last week and as I was demonstrating it, one of the delegates suggested that I could do what I was doing in a Pivot Table. Yes, it was true, I could have done that. So, what is so good about the FILTER() function in Excel, then? Let's see with a real set of data.

Our World in Data: Covid-19 Database

Everyone knows about Covid-19 but not everyone will know about the fabulous web site and organisation Our World in Data, OWID, based in Oxford University and headed by Dr Max Roser. OWID has the mission of making data and research on the world's largest problems understandable and accessible. Because of the quality of OWID's work, they are the go to web site for many organisations for Covid-19 as well as many other data sets. If I want to know about Covid-19, I go to OWID, as I have in this case.

I am using the OWID Covid-19 database that I downloaded at 0651 hours, Thailand time on 14th May 2024 and you can access the CSV for the complete dataset at https://covid.ourworldindata.org/data/owid-covid-data.csv

Why not use a Pivot Table?

Why am I writing this blog post, then? Why not use a Pivot Table?

As I was preparing this post, I was working on the OWID Covid-19 database in preparation for the second running of my Super Forecasting course being run next week. The comment of last week's delegate was still bouncing around my mind so I asked myself the question, why not use a Pivot Table? In the end, what I did was to create a series for FILTER() based formulas that I have included below and that I will leave to you to determine whether a Pivot Table would be better

I have created a Pivot Table in my Excel file so you can check to see if it might be better to use that rather than use the FILTER() function!

The FILTER() Function

The FILTER() function is a dynamic array function that Microsoft released just a few months ago and you will see how useful it is in the examples I am about to demonstrate. I have other blog posts on the FILTER() function that you can easily find and read! For example:

https://duncanwil.blogspot.com/p/filter-with-multiple-tables.html

https://duncanwil.blogspot.com/p/dynamic-array-functions-example.html

The Database

 The OWID Covid-19 database, today (14th May 2024), is 397,559 rows deep and 67 columns wide: that is 26,636,453 cells to work with. That's quite a large database and one reason I might not want to use a Pivot Table is that there are 67 different variables relating to 255 countries or territories. That’s a lot of bits of data to think about.

 Secondly, I considered my FILTER() based report rather than using a Pivot Table because I knew that once I had created it, I would not be pivoting my reports and graphs, except if and when I downloaded the updated version of the OWID Covid-19 database.

 Here is some of my Output

I created a report using just six columns and the first few rows of it is shown below, together with a graph of the data: note all of the zeroes: that is because these rows come from the very beginning of the pandemic




I share my formula with you below but what it did was to extract data from six columns, not all contiguous and then create two graphs from there ... only one graph is shown here. You might notice that I have linked my FILTER() function to cell BS1 in which I type the name of the country of interest for my report: the United Kingdom in this case.

Here is the report for Thailand:




My Formula

The data are on the owid-covid-data worksheet

The formulas I created for the above report are found in cell filter_E5 and filter_E6:

For the column headers:

=FILTER(owid_covid_14052024[[#Headers],[location]:[new_deaths]],'owid-covid-data'!C1:I1)

For the data:

=FILTER(FILTER(owid_covid_14052024[[#All],[location]:[new_deaths]],owid_covid_14052024[[#All],[location]]=C5),{1,1,1,1,0,1,1})

The wrap around FILTER() function allows me to remove column five from the seven columns I identified in my initial FILTER() function: =FILTER(... ... ... ),{1,1,1,1,0,1,1}). By missing out column five, new_cases_smoothed, from my report, I am essentially working with non contiguous data and that’s really good news!

Go and get the data from OWID and copy my formula or create your own and see how you like it!

My Second Report

I created another FILTER based report and here it is:



You can choose from a list of 13 countries to show in that report by way of a data validation drop down box. By doing that, I have made the country selection dynamic so that you can call up the data from any country in that table. In this case, I created a new formula for each row.

Here are the formulas I have created for the first column, United Kingdom and notice my row headers are based on the column headers from the Covid-19 database:

 




Just fill right to columns two to N to complete your own table. Notice I have used cell ranges rather than structured references and, if I left them like that, I would need to update them manually every time this database was updated if I wanted to make sure they reflected all of the data in the database.

Sparklines

I created a sort of dashboard using Sparklines that I show below and it is a work in progress that I may well abandon in favour of something more formal. I do use Sparklines as the basis of a dashboard and they can look really effective but I am not at that stage with the following!



My Other Graphs

I already showed you one of my more formal graphs but here it is again with a second graph, based on data generated by my FILTER() formulas:


and 


Conclusion: is this Better than a Pivot Table?

Yes, it is better than a pivot table but then again, no it isn't!

What is better is that I have created specific reports that I will change in only very minor ways: just change the country/countries and nothing else. I think these outputs are perfect for that.

What is worse is that I have fixed my report hence, literally, I have lost the major flexibility that a pivot table can give me. I cannot pivot my data and draw comparisons and conclusions as I can with a pivot table.

All of the above is done deliberately and one of the main reasons for sharing this page is to help you to appreciate, learn and use the FILTER() function.

Excel File: follow the link to OWID and get the Covid-19 database for yourself ... it is not mine to share! Hence, there is no file to download from this page.

 

 

Duncan Williamson

16th June 2021 updated 14th May 2024

Page: Edit (blogger.com)



No comments: