Good habits: aggregate and optimize

Ben Lee

Since joining Coda in 2019, I’ve worked with teams and docs of all sizes. I also host internal office hours where we troubleshoot some of the most complex docs we see out in the wild.

Coda is very powerful, and our team has worked hard to build a product that loads, filters, and adapts to whatever you throw at it. But there are some overarching doc-building practices I’ve observed that can set you, your team, and your doc up for success.

We will cover complexity from a computer science perspective by presenting three strategies for accomplishing the same goal, each one more optimized than the last, and show how simplifying the complex can kick off a chain of benefits for both the maker and end users.

⁠

Strategy one: Filtering data in place

Strategy two: Aggregate common data points

Strategy three: Re-use, not repeat

A good doc schema frame of mind

Rarely does a bad strategy lead to easy next steps. I’m sure you’ve had a project where every next step required more workarounds and duct tape to make it functional, even at a minimal level. I’m also sure you’ve had projects that seemed to fall into place, where every next step felt like a runner’s stride on the way to the finish line. This can usually be traced back to the project’s initial strategy and foundation.

When asked how I think about building docs, my answer is always the same: build for the next step even when you don’t know what the next step is.

Okay, what does that mean? The answer is flexibility. Have you ever known a marketing team that was content with the metrics they had at hand, or a data team that didn’t want to slice a data set in just one more way? Since we know we are going to get hit with questions, we’ll do our best to anticipate the yet-to-be-asked and set ourselves up with a great chance of being able to say yes.

Big O Notation

It can be difficult to measure just how complex a formula (or really anything involving an algorithm) is, or how much effort it will take to run once data is loaded in. To get a rough estimate more quickly, computer scientists use what’s known as “Big O Notation”. This is technically an estimate of time complexity, but it’s easier to grasp if we think of it simply as the number of tasks that need to run to get our final answer.

Let’s say we’re building a doc to onboard a team and our onboarding algorithm adds one task for each person added. There can be any number of people involved, so we will use “N” as a placeholder. If there’s a task that runs for every person in the table, we can say that this has a complexity of, or grows by, N. We can refer to this as O(N), or spoken as “O of N”.
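As a rough sketch of that linear growth, the snippet below creates one onboarding task per person. It is written in Python rather than Coda’s formula language, and the function name `onboarding_tasks` and the sample names are made up for illustration.

```python
def onboarding_tasks(people):
    """Return one onboarding task per person: N people produce N tasks."""
    tasks = []
    for person in people:  # runs once per person, so the work grows as O(N)
        tasks.append(f"Onboard {person}")
    return tasks


team = ["Ana", "Ben", "Chen", "Dee"]  # hypothetical team members
print(len(onboarding_tasks(team)))  # one task for each of the 4 people
```

Doubling the size of the team doubles the number of tasks, which is what O(N) describes.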

Now let’s say that every person added gets assigned a task to meet every other person in the doc. Two people meeting once could count for both, but let’s say that when I’m added, we need to have a meeting focused on me, and when you’re added, we need to have a meeting focused on you. This grows faster than a 1-to-1 ratio: it grows with the square of the number of people, N^2. This is denoted as O(N^2) and spoken as “O of N squared”.
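To make the quadratic growth concrete, here is a minimal Python sketch (not Coda formula language) that schedules one meeting for every ordered pair of people. The function name `meeting_tasks` and the generated names are hypothetical.

```python
def meeting_tasks(people):
    """Return one meeting per ordered pair (focus person, other person)."""
    meetings = []
    for focus in people:        # N iterations
        for other in people:    # N iterations for each focus person
            if other != focus:  # nobody meets with themselves
                meetings.append((focus, other))
    return meetings


for n in (2, 10, 100):
    team = [f"person{i}" for i in range(n)]  # hypothetical names
    print(n, len(meeting_tasks(team)))  # N * (N - 1) meetings
```

The exact count is N × (N − 1), so 10 people need 90 meetings and 100 people need 9,900. Big O Notation keeps only the fastest-growing term, which is why this is written as O(N^2).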

For each of our three strategies, we’re going to cover the complexity using Big O Notation. Our engineers use all sorts of creative ways to optimize our actual runtimes far below these estimates, but the general idea and thought process still apply. Even if we run a particular operation in an optimized manner, you’ll notice a difference if what you build requires only 100 steps instead of 1,000 steps. Our goal is to see how few operations we actually need to accomplish the same outcome.

Here are a few links if you’d like to explore Big O Notation further:

What is Big O Notation Explained: Space and Time Complexity

How To Calculate Time Complexity With Big O Notation

Let’s dig in!

⁠

