When was the last time you changed your mind about something you fundamentally believed was true?

I am not talking about religious beliefs, but rather about a topic you consider yourself to be an expert on. For me, this happened in 2013, when I first discovered the

in Microsoft Excel.

If you don’t want to read this mini-tutorial of tables in Excel and Coda, I recorded a video below for you to watch:

What Most People Do In Excel

Like many beginners to Excel, I thought Excel was meant for storing lists of data. Stuff like your grocery lists, books you want to read, and maybe a personal budget. We all know Excel, today, is super powerful with features like PowerBI and

, but a majority of

are probably utilizing a fraction of Excel’s power and capabilities. I couldn’t find stats on feature usage in Excel, but from seeing hundreds of Excel files over the years, I will posit that a majority of people use Excel for one purpose:

lists

.

There is nothing wrong with lists. They make sense. The grid-like nature of Excel provides a solid framing for lists.

But once I discovered functions like

VLOOKUP()

,

OFFSET()

,

INDEX()

, and

MATCH()

, I found that I wasn’t creating standalone lists anymore. The paradigm became

connecting lists together

. As my Excel knowledge grew, I couldn’t see a world where your lists of data

were not related

in some way. If you have a list of product orders, at some point you might want to know the name of the customer who made the order, and that customer name will probably come from a Customers list and you pull in the name via

VLOOKUP()

.

Is It A Database Or A List?

Source: allyourbasearebelongtous.com

For anyone who has built a model or dashboard in Excel and wanted to make it scalable so it could handle new rows of data, you have probably used

with the

OFFSET()

function. I won’t go into detail on how the formula works, but it’s sort of a hack to account for new rows of data that get added to your spreadsheet and you want your formulas and charts to “pick up” the new data.

Numerous hacks like these exist to make your file feel more like an application. Hacks for getting filtered data to show up in the right place with random

IF

and

IFERROR()

formulas thrown into your file. Hacks for referencing the correct columns for

VLOOKUP()

formulas in the event new columns get added to your dataset.

And then I discovered SQL.

Instead of referring to a row or column reference, this new formula language just requires you to know the name of the column. It doesn’t matter how long your list is or how many rows get added hourly or daily; the

SELECT

statement will pull all the rows that meet your criteria.

In my head, SQL statements are easier to

understand

, but Excel formulas are easier to

write

. As I’m building my connected lists in Excel, am I really just creating a mini-database, but I just happen to be querying my data using hacked-up formulas? Are Excel formulas the most efficient or right way to pull your data, or am I just

familiar

enough with the formulas that the hacked-up solution feels

right

? This is still a question that I am struggling with, and hopefully the examples below will show

the paradigm shift I think many Excel users will grapple with and how Coda’s formula language challenged my beliefs about Excel’s formula language.

The Simplicity of a VLOOKUP()

You have two lists of Employees and Tax Rates below (

I was inspired by the example Joel Spolsky used in his Excel tutorial, watch it

):

If you want to calculate the Total Tax $ column, you need to know the Tax Rate %. Your first instinct would be to write a

VLOOKUP()

formula in cell

F5

and drag that formula down to cell

F14

:

=VLOOKUP(E5,$B$21:$C$23,2,0)

That’s what I love about the workhorse of

VLOOKUP()

. It only requires a few parameters and you can quickly pull in data from other lists if you have a unique identifier in that lookup table (in this case, it’s the City).

As I stated in the beginning of this post, tables in Excel really change the game on how you reference data. While tables surfaced in Excel 2007, I’ve rarely seen them used in various files I’ve worked on during my days at Google and as a freelance consultant. You probably noticed that the Employee and Tax Rates lists are indeed tables with the dropdown arrows in the header rows and the alternating row colors. If you write the

VLOOKUP()

formula in this

table

you get a formula that looks like this:

=VLOOKUP([@Location],TaxRates[#All],2,0)

When you move your cursor over to column E to reference the location, you actually refer to the entire

column

name similar to SQL. Similarly, instead of referencing

$B$21:$C$23

, you reference the

TaxRates

table (notice how when I select a cell in this range the

TaxRates

table name appears in the top left):

Another key feature about writing formulas in tables is that they will automatically fill down to all the empty cells below. Here is what happens when you try writing the new

VLOOKUP()

formula in the Employees table:

Why is the reference to the Location column as

[@Location]

and the tax rates as

TaxRates[#All]

important?

These references now are

objects

that you can reference elsewhere in your Excel file.

Gone are the days where you need to know the exact cell reference and/or naming ranges of cells with a unique name. Another SQL-like feature is that you can add rows to either the Employees or Tax Rates table, and those rows automatically become “part” of the table:

If we look inside the Total Tax $ column, you’ll notice it also contains the “@” references to the Salary and Tax Rate % columns instead of individual cell references:

A few things I don’t like about this solution:

You still have to enter in the column number you want to pull into your main table (e.g. if new columns are added to the Tax Rates table you’ll have to change the “2” to the new rate column)

Auto-filling the formulas down through the empty cells is great, but if you accidentally delete an individual cell in the Tax Rate % column, you have to copy and paste the formula again from another cell that contains the table formula.

I think Excel encourages you to type in the column and table names, but the familiar behavior for most VLOOKUPers is to move your cursor to the column or lookup table. Typing in the column and table references requires you to remember the exact syntax of using the “@” and “#” symbol with various brackets (although there is a bit of autocomplete to help you along the way):

Writing this formula was painful…

Finally, the ideal solution to create a scalable formula to link two tables together should use a combination of

INDEX()

and

MATCH()

, which I describe in the next section.

The Many Ways to Lookup

The debate is as old as evolution vs. creationism (ok not that old). When do you use

VLOOKUP()

and when to use

INDEX/MATCH

? There are countless

about this topic. My simple rule is this: if you have more than 50,000 rows of data and care about performance and speed, use

INDEX/MATCH

.

In our simple example of Employees and Tax Rates, a

VLOOKUP()

does what we need but that’s because our tables are small and the performance difference is negligible. If this was a list of millions of employees, we would have a problem with

VLOOKUP()

. So if we want the most optimized solution, let’s try out the

INDEX/MATCH

method:

I didn’t even bother trying to type in the column and table references since it would’ve been a pain to remember where to use the brackets, “@” symbols, etc. I just referenced the columns and ranges of cells the “old-fashioned” way by using the cursor. So, when it’s all said and done, the most optimized formula to lookup the tax rate is:

=INDEX(TaxRates[Tax Rate],MATCH([@Location],TaxRates[City],0))

A few things I don’t like about this solution:

INDEX/MATCH

is considered an “advanced” Excel formula that most Excel users won’t use because it requires you to nest the

MATCH()

function inside the

INDEX

function

Notice how there is a 0 at the end after

TaxRates[City]

? This is a weird flag that also plagues the

VLOOKUP()

formula indicating the

match_type

you want Excel to use. I have never seen a real-life scenario where you use a

match_type

of

true

(e.g. “1”). If you don’t use the 0, Excel finds the first closest match based on alphabetical order, which leads to completely wrong results for tax rates:

When To Link Your Tables Together

Source: Giphy (couldn’t help myself by not using a Zelda reference in this section)

While I consider tables in Excel to be super powerful, there are a few usability issues that make them less approachable to regular Excel users who are linking lists of data together with good ‘ol

VLOOKUP()

and cell/range references.

Tables completely changed my mind on how lists are linked, but I still rarely use them in my files due to the set up that’s required. To provide some more perspective, I would only convert my lists into tables if I knew that

other people were using my file

. I this were the case, I would want to build a scalable solution that accounts for the following:

Non-fixed number of rows being added to the file hourly, daily, weekly

Adding new columns does not affect functionality of

VLOOKUP()

(partially solved with

INDEX/MATCH

)

Inability for teammates to delete random cells in a column that contains formulas

If I am just using the file for personal use, the extensibility of the tables feature becomes less interesting since I am typically only using Excel for one-off data analysis. If your team has an Office 365 subscription where you have all the latest and greatest sharing features in Excel Online, tables might be the the right solution since your team is putting collaboration first.

Tables In Coda

Until I discovered Coda, my “worldview” of linking tables together consisted of two paradigms:

Using

VLOOKUP()

and

INDEX/MATCH

If you are approaching this problem from a programming perspective, there are numerous ways you can link data together in Javascript, Python, etc. But if you want to give the

average business user

a way to accomplish this basic task in a platform that feels approachable and intuitive, Coda’s formula language is the right implementation of formulas + visuals.

Let’s take the same data we’ve been working with and see how the Tax Rate % column would be pulled in via tables in Coda:

A few things to note in the gif above:

The Location column is a

column format that creates a dropdown of all the cities you can select from the Tax Rates table

As you write the formula in the Tax Rate % column, there is auto-complete with column names which you can TAB over to auto-complete the formula (no pesky “@” or brackets to type)

Since the Location column in our main table is a lookup to the Tax Rates table, we can reference the Tax Rate column from this lookup table easily by just writing

[Tax Rate]

in our formula

When it’s all said and done, the formula to pull the tax rate in the most optimized way is:

=Location.[Tax Rate]

=INDEX(TaxRates[Tax Rate],MATCH([@Location],TaxRates[City],0))

The Coda formula is obviously shorter, but more importantly, is easier to understand for the end user. The difficulty, as with any formula or programming language, is understanding the syntax to lookup your data.

However, much like the mental leap I took when discovering

VLOOKUP()

, I cannot see a case where I would want to use

VLOOKUP()

in my main table when I can reference columns as easily as Coda’s formula language affords.

Your Tables, Your Team

Source: morganlovell.co.uk

Table layouts from the open office plan in the 60s have evolved, shouldn’t your use of data tables too (just full of jokes today)?

Tools and platforms that change fundamental beliefs about the way you can pull, sort, and organize data are few and far between. You have Excel formulas, SQL, various programming languages, and the more recent NoSQL movement.

Aside from Excel’s formula language, no other language connects the

visual feedback

from writing a formula with the

back-end

processing that the formula triggers. Why should pulling and sorting data require you to be an Excel guru or SQL expert? Teams are much more fluid with how their data is managed and stored, and the way teams access the data should be just as flexible and on-demand like that of other business applications.