Overuse of the MATCH Function in Excel

If you have trouble viewing this email, please click here.

The Brode Report | Aug 2016

David Brode Hi,

I have to start with a quick shout out to Olympic National Park, where I camped with Eleanor and Jacob and Ollie this month. If you haven’t seen it, it’s like the happiest forest you’ve ever seen, with huge trees and ferns and moss everywhere. You expect to see a triceratops walk by or see a wizard casting a spell. And if you have seen it, you know what I mean.

I realize the piece below is a bit more technical than I usually go, but you know I’m passionate about the craft of modeling and so please allow me to geek out a bit.

Best,

David

I created a short video where I describe some unique features of a financial model that I developed for a client that raised over $350 million. You can see it at brodetelecom.com/model-overview.

Overuse of the MATCH Function in Excel

match overuse
Ok, this is a more technical newsletter, but I’ve now seen the same issue twice in a month so it bears mentioning. Typically I think someone is a pretty sophisticated Excel user when I see them use the MATCH function. What I’m seeing is the overuse of MATCH.

Quick background/refresher on MATCH:

“The MATCH function searches for a specified item in a range of cells, and then returns the relative position of that item in the range. For example, if the range A1:A3 contains the values 5, 25, and 38, then the formula =MATCH(25,A1:A3,0) returns the number 2, because 25 is the second item in the range.”

From the Microsoft Office Help article on MATCH.

Why do I care about the overuse of MATCH?

It’s slow. Looking up a number inside a string is fine, but when you ask Excel to do it tens of thousands of times that extra computational work adds up.
Occum’s razor suggests we should find the simplest way to explain the formula, and MATCH often overcomplicates. Making models more readable reduces errors and increases flexibility.

The first case I saw used MATCH in the following situation. You need to grab data from somewhere else in the worksheet, which has line items down the rows and months across the columns. This is an extremely common situation. As is most often the case, you’re referencing data from the same month as the cell you’re in. Now, you know the data is in row 44, and you’re in J57 which, as J5 tells you, is Jan-17. The formula I saw was:

=INDEX($J44:$AK44,1,MATCH(J$5,$J$5:$AK$5,0))

Whereas I would have used the simple formula

=J44

Which certainly wins for simplicity’s sake. But let’s break down that INDEX formula. It’s saying to look at the array from J44:AK44, i.e. the data for every time period and then the “,1” part is saying to use the first row (of our one-row array), and then over a MATCH… number of columns. The MATCH function looks at the list of dates from J5:AK5 and tells you how many columns down the sequence you are, so you go that many columns over to get the data in row 44. Thus we looked up Jan-17 in the list of dates and found that it’s the first column. We then went to J44:AK44 and looked in the first column of that: J44. Bingo. The builder of that spreadsheet was worried about grabbing the wrong column and so did that. My preferred solution is to say that if I’m in J, I pretty much know I can reference data in J and it stands out if it’s otherwise. I certainly make sure that I have consistent time structures across pages, e.g. 2022 is in column BH on every page of the model. But to mess that up within a page? C’mon. This is a solution in search of a problem.

The second example I have concerns populating a series of months. So picture this: you have input legal costs for each of five years on row 5, labeled as Y1-Y5 in columns G:K. In columns N:BK are the 60 months that make up Y1-Y5. In row 2 from G:BK has the year (1-5) and row 3 lists the month names for N:BK. So your goal is to write a formula that grabs the cost from the proper year and divide it by 12. The MATCH formula I saw did this:

=INDEX($G5:$K5,1,MATCH(N$2,$G$2:$K$2,0))/12

In other words, I’m in N5, so I look up to row 2 and see that I’m in Year 1 from the 1 in N2. So I look into G2:K2 and find out where 1 is in that sequence of 1-5, and I find out it’s the first column. So then I can go to G5:K5 and grab the first one of that sequence, G5.

I prefer something simpler:

=OFFSET ($F5, 0, N$2)

Translating: From column F, move over N2 columns and grab that number. An added bonus: Excel is blindingly fast at using OFFSET.

MATCH certainly has it’s place. I used it regularly when processing lists and matching up different data sets. But for a financial forecasting model? I often find OFFSET to be the better choice.

Connect with me | Share your thoughts | Subscribe | See past issues | www.brodegroup.com

The Brode Group

Strategic Financial Consulting - Real-World Results

(303) 444-3300