Skip to content

icon picker
Diagnostics

How to debug metrics issues in PM data interviews
(Want advice on what interviewers are looking for? 👉 👈 )
The arc of this question
1
Step 1: set context. Clarify your understanding. Ask follow-up questions to better diagnose what is going on.
Step 2: go broad. Through the process of digging in, articulate a range of hypotheses. Describe the key data points for verifying what is going on.
Step 3: converge. Settle on one hypothesis and flesh it out with more detail. Propose a few solutions given the hypothesis, and select and justify one for execution. Discuss how you'd measure that your solution is working.
No results from filter

Practice question: sessions are down 20% today - why?

Let’s step through one example question in depth together. Don’t interpret these sample answers as “the only way to go.” There are many right answers that show your thought process, especially when demonstrating .
To begin this question, your interviewer says: “We’re a cross-platform community for athletes to track and compare workouts. Sessions are down 20% today - why?” and shows you the below chart. What do you say?
Sessions down 20% today
3

Sessions are down 20% today - why? Example answers:
7
Overall Score
Answer
Interviewer take away
1
1 - strong no 😡
Could be lots of things, hard to say really.
Nothing to work with here. No followups, no
@💭 Structured thinking
, or articulated curiosity and didn’t
@🤔 Asks good questions
.
2
2 - no 🙁
Did we launch something today? Is it a holiday?
Lacking
@💭 Structured thinking
and didn’t
@🤔 Asks good questions
.
@🦅 Changes altitudes
too quickly: jumps into details before describing the approach. While these questions aren’t bad, they aren’t the best and feel scattershot without a clearer initial frame.
3
3 - yes 🙂
I want to slice by different metrics to see what changed. Did only one platform take a hit? Is it one geography?
@💭 Structured thinking
: Sets one clear framework and
@🤔 Asks good questions
to drill into.
@💬 Communication
: Pausing to engage the interviewer here instead of jumping into a monologue is also a good idea.
4
4 - strong yes 😍
There are two ways I could debug this: 1) slice and dice the metrics to understand what segments changed, or 2) step through the funnel and see what may have changed or broken at each step. I’ll start with approach #1 and slice by referrer since that can be a major off-site driver for sessions.
@💭 Structured thinking
: Proposes two high level frameworks for how to dig in and picks one.
@🦅 Changes altitudes
: Chooses a great first metric to slice by that is clearly connected to Sessions.
No results from filter
As you start asking clarifying questions, your interviewer may reveal more information. Say, for example, they now show you this chart and ask you what you take away from it.
Sessions are down 20% equally across platforms
3

Sessions are down 20% equally across platforms - what's that tell you? Example answers:
7
Overall Score
Answer
Interviewer take away
1
1 - strong no 😡
Users like the app less now.
@🦅 Changes altitudes
poorly: jumps to an extreme conclusion, without supporting it or demonstrating the intermediate steps with
@🤔 Asks good questions
. This conclusion also isn’t the most obvious or first take-away.
2
2 - no 🙁
All user types are using the product less.
Platform is only one segmentation - the candidate isn’t articulating other segments that may still shifted (eg: geography, logged in vs logged out, etc). Didn’t
@🤔 Asks good questions
.
3
3 - yes 🙂
This is surprising. If the source were a platform change (eg: iOS revoked our app suddenly, or Google Chrome thought our site was a security risk) we might have seen it specifically in one of the platforms.
While this answer doesn’t jump into next steps, it shows
@💭 Structured thinking
and
@🦅 Changes altitudes
through several concrete hypotheses that are rejected by the new data.
4
4 - strong yes 😍
I’ve found these kinds of sudden drops are often reporting bugs or broken features. Since it’s across all platforms, if it’s a bug (real or reporting) it would likely be server side.
@📓 Demonstrates experience
, since they mention this issue may not necessarily be a real user change. Server changes distinction is a good
@🤔 Asks good questions
axis to explore next.
No results from filter
Calling out that these kinds of drops are sometimes logging and reporting issues is often a good touch, as it shows familiarity (and often this is the case). One candidate I interviewed with a related question to this digressed into talking about how “the most insidious thing you can do at a company is plant a subtle logging bug that slowly drifts your core metrics from ground truth. And then quit.” His terrifying anecdote stuck with me to this day as an example of a not so blaring obvious data issue that can be tough to correct, and demonstrated a deep awareness of subtle metrics challenges.
A few candidates I’ve interviewed over the years have also mentioned how logging bugs can have downstream impact on ranking and ML systems. This kind of second order effect often gets skipped over in answers, but can be another great show of expertise if relevant. One friend of mine actually got into a debate with their interviewer, calling out how a prolonged logging issue could have morphed a related feature’s ranking model significantly. The interviewer initially disagreed, though became convinced (gutsy move I wouldn’t always recommend, but my friend got the job!).
Referrers data is revealed!
3

Referrers data - what does this mean? Example answers:
7
Overall Score
Answer
Interviewer take away
1
1 - strong no 😡
Email and OS referrers are down. I totally bet Apple changed something that cut off these channels.
Weak
@💭 Structured thinking
: at this point we’ve already gone through how Sessions are down equally across platform. If Apple was worse but Google was fine, we wouldn’t expect equal decline across platform. Moreover, a coordinated hit from a platform across both email and notifications is highly unusual showing a weak
@📓 Demonstrates experience
. It would need more justification about why that is plausible or the most likely hypothesis.
2
2 - no 🙁
Looks like we found the smoking gun. We lost our email and OS notifications.
While correctly identifies the source (though not with the most detail/description), doesn’t describe what to do next to verify or fix, or list other possible rejected hypotheses. Missing
@💭 Structured thinking
,
@🤔 Asks good questions
,
@📓 Demonstrates experience
or something more.
3
3 - yes 🙂
We lost email and OS notifications. These are related push systems so it’s more likely to be an issue on our side than something else in the ecosystem. Perhaps a bug or really bad push content?
Correctly identifies the issue and articulates a broad hypothesis and
@🦅 Changes altitudes
to two concrete examples that we caused this.
4
4 - strong yes 😍
The email and OS notification referrer losses fully account for the full loss of Sessions. Did we have a bug preventing notifications from being sent? If so, it may have actually started a day or two before, because users could have still trickled in through old emails/notifications even after the new ones stopped getting sent. I’ve seen a few notification bugs on a previous project where it actually took a few days to see the drop due to a trickle in ongoing engagement with older notifications.
Correctly tallied the loss and attributed it based on the new referrers data. Has a strong suspicion on which systems to drill into next for bugs and why with clear
@💭 Structured thinking
. Additionally narrowed in on a time frame to investigate (even a few days before the drop) given intuition around user behavior and strong examples that
@📓 Demonstrates experience
.
No results from filter

Sticking the landing

The above questions and answers are a slice of a full interview. Fully completing this question usually gets to one more detailed final hypothesis about the problem. From there, it’s important to propose a few final solutions, weigh their trade-offs, and select one for execution. Talk about risks with that selected solution, mitigators, and how to verify/measure ultimate success.

Practice, practice, practice

👉 When you’re ready to move on, let’s jump into .
Below are more example Diagnostic questions to work through.
Copy this doc
and use it as a worksheet and checklist!
Diagnostics Questions Worksheets
5
Search
Sessions are down 20% today - why?
Set context: Go broad: Converge:
YouTube traffic went down 5% yesterday. How would you report this issue to Larry Page?
Set context: Go broad: Converge:
You launch a new feed algorithm for Facebook and the average time per session goes down by 20%. What do you do?
Set context: Go broad: Converge:
Facebook newsfeed engagement dropped by 2% — what do you do?
Set context: Go broad: Converge:
Weekly active users for Instagram on iPhone dropped. What happened?
Set context: Go broad: Converge:
Google ads revenue dropped 20% — what do you do?
Set context: Go broad: Converge:
Facebook groups usage dropped 10% — what do you do?
Set context: Go broad: Converge:
Amazon shopping cart conversion rate is down. How would you diagnose?
Set context: Go broad: Converge:
You are looking at DAU data across the world and notice that there has been a jump of 5% compared to yesterday in Indonesia. What would you ask the analyst?
Set context: Go broad: Converge:
Twitch noticed revenue for a certain keyword has declined. How do you diagnose what happened?
Set context: Go broad: Converge:
You are looking at Snapchat global DAUs and notice that there’s been a jump of 5%. What happened?
Set context: Go broad: Converge:
We’ve recognized a 10 percent drop in newly registered users. What data would you need to look into to understand and fix the problem?
Set context: Go broad: Converge:
There's been a 15% drop in usage of UberEats — how do you fix it?
Set context: Go broad: Converge:
You are the PM for Tinder, and daily active users dropped by 10%. What would you do?
Set context: Go broad: Converge:
DAUs jumped up 5% yesterday in Indonesia. What would you ask the analyst?
Set context: Go broad: Converge:
MAU is down. Troubleshoot.
Set context: Go broad: Converge:
Question
Sessions are down 20% today - why?
Practiced
Type
🕵 Diagnostics
Added by
David Kossnick
Practice answer
Set context:
Go broad:
Converge:
Overall Score
💭 Structured thinking
🤔 Asks eigenquestions
🦅 Changes altitudes
📓 Demonstrates experience
💬 Communication
Feedback to self
Show hidden columns
Have questions that should be added?
Want to print your doc?
This is not the way.
Try clicking the ⋯ next to your doc name or using a keyboard shortcut (
CtrlP
) instead.