Database Functions in Excel

Introduction

Over the last two or three weeks or so, some of our Excel Famous has started waxing lyrical about Excel's DataBase functions. One called them the forgotten functions, I think. Another called the DGET function a hidden function.

Tosh!

What has happened over the years is that I have been presenting them in my written work and in many of my courses but other Excel users have dismissed them as rubbish, too complicated, out of date ... 

The reality is that the database function really can been the ideal solution to a problem. Let me start by comparing three solutions to the same problem and ask yourself, which do you prefer? Which do your colleagues prefer?

  • =SUMPRODUCT((A6:A23=G27)*(B6:B23=H27)*(C6:C23=I27)*(D6:D23=F27)*E6:E23)
  • =SUMIFS(E6:E23,A6:A23,G27,B6:B23,H27,C6:C23,I27,D6:D23,F27)
  • =DSUM(A$5:E$23,E$5,F26:I27)
I will illustrate how I got to those formulas shortly but they are real and if we carried on with the example, the SUMPRODUCT() and SUMIFS() formulas would just get bigger and bigger yet the DSUM() function would stay the same, essentially.

Watch the video


I have created a video that you can watch on YouTube here

What are the Database Functions?

This is the listing of all database functions in my Excel 365 spreadsheet version:

  • DAVERAGE()
  • DCOUNT()
  • DCOUNTA()
  • DGET()
  • DMAX()
  • DMIN()
  • DPRODUCT()
  • DSUM()
  • DVAR()
  • DVARP()

That means, we can use these functions to find averages, to count value and non empty cells, maxima and minima and so on. They all work in a similar way to each other and to illustrate that let's look at DAVERAGE(), DGET() and DSUM()

What do the Database Functions do?

As we demonstrate those three database functions, we will be answering the question of what the database functions do and you will see in the DAVERAGE() section the data that we are working on for all of the examples in this post.

DAVERAGE()

The syntax of this function is =DAVERAGE(database,field,criteria)
where:
range is the range or array of the database we are working on
field is the filed or column of data of interest
criteria is a table that tells DAVERAGE() what it is is that we want it to evaluate



In cell B40 =DAVERAGE(A5:E23,E5,F26:F27) which gives the answer 66,082.50

The full criteria range of F26:I27 looks like this:


That is, in this first example, I asked DAVERAGE() to find all of the Total Sales values for the Product Apples ... you will find these values (85106 + 21910 + 83538 + 82030 + 15764 + 108147)/6 = 66,082.50.

Notice, that for the Field parameter, I simply used E5, the reference of the total cell for the Total Sales column. I could have entered "Total Sales" instead of that or even just 5, representing column five of the database: all three options are equally valid.
 
Alternatively, consider the following:

In cell B40 =DAVERAGE(A5:E23,E5,F26:G27) which gives the answer 68,416

The criteria range of F26:G27 looks like this:


That is, I asked DAVERAGE() to find all of the Total Sales values for the Product Apples in the Region North GA ... you will find these values (85106 + 21910 + 83538 + 82030)/4 = 68416.

Notice, that for the Field parameter, I simply used E5, the reference of the total cell for the Total Sales column. I could have entered "Total Sales" instead of that or even just 5, representing column five of the database: all three options are equally valid.

For You to Try

You could use your DAVERAGE() knowledge now to find the average Total Sales  of

Apples in the North GA Region, in the City of Atlanta ... $53,508 and then the same but when including the Chain Fruit R Us $85,106 ... 

Finally, we can amend the criteria table, for example, like this:


DGET()

DGET() extracts a single value from a column of a list or database that matches conditions that you specify

The syntax of DGET(database,field,criteria)


Notice how I have added a new row to the criteria table

I entered the following, B44 =DGET(A5:E23,E5,F26:I27) and it gave me the result of #NUM!

Why #NUM!? Because my specification returns more than one value

If we change the formula in B44 to =DGET(A5:E23,E5,F26:I28), we get 149,745

I am sure you can work out why that is the right answer

For You to Try

Change the criteria table to identify the single Total Sales value of your choice

DSUM()

I am sure you are seeing that these database functions are easy to work with and they are user friendly to construct.

DSUM() will provide us with the Total Sales, in this case, of the choices we make in our criteria table. Again we can make any changes we like to any part of this exercise.

The syntax of DSUM(database,field,criteria) ... that's familiar now, yes?!

Let's work with the criteria table below:


Now we want to find the sales of all oranges when the sales in the Total Sales column are greater than $69,000 and less than $100,000

D46 =DSUM(A5:E23,E5,F26:I27) ... 254,723

Try this, in which we nominate the Region to be Mid GA. Make sure you can agree with the Total Sales under those conditions to be $71,097

Conclusions so far

I am sure you can see the benefit of using these database functions: the final formula in every case is very simple to appreciate. The criteria table is simple to very simple to set up and apply. So, let's see when the SUMPRODUCT() and SUMIFS() functions are so problematic.

SUMPRODUCT()

The following shows the same database as above and then the choices we are making to evaluate the SUMPRODUCT() function:


The formula in cell K29 =SUMPRODUCT((D6:D23=F27)*E6:E23) ... 396,495 and that is the result for Apples

Now the other SUMPRODUCT() formulas:

  • K30 =SUMPRODUCT((A6:A23=G27)*(D6:D23=F27)*E6:E23)
  • K31 =SUMPRODUCT((A6:A23=G27)*(B6:B23=H27)*(D6:D23=F27)*E6:E23)
  • K32 =SUMPRODUCT((A6:A23=G27)*(B6:B23=H27)*(C6:C23=I27)*(D6:D23=F27)*E6:E23)

And you can see the values in the screenshot that you need to agree with

You can see how and why the formula expands as we ask more and more of it

SUMIFS()

I won't work through these formulas but you can download my file and check them to your heart's content:

L29 =SUMIFS(E6:E23,D6:D23,F27)

L30 =SUMIFS(E6:E23,A6:A23,G27,D6:D23,F27)

L31 =SUMIFS(E6:E23,A6:A23,G27,B6:B23,H27,D6:D23,F27)

L32 =SUMIFS(E6:E23,A6:A23,G27,B6:B23,H27,C6:C23,I27,D6:D23,F27)

You can appreciate the potential complexities of working with a larger database and more combinations of parameters. The database functions adapt very well to the circumstances of an interrogation but SUMPRODUCT() and SUMIFS() do not.


Download my Excel File


Duncan Williamson

12 December 2024 revised 20 December 2024


No comments: