Event-Driven Analytics & Audit Pipeline

Build a system that automatically processes and aggregates data as it arrives, maintaining real-time summary tables and triggering alerts when thresholds are exceeded or errors are identified.

WTF is this?!

As data engineers, we are often called on to fix problems after they occur. One of the keys to being a great engineer is anticipating problems before they happen! To this end, creating an audit pipeline that checks data as it is written, and providing benchmarks for it, is a skill to hone and develop throughout your career.

Why should I care?

The number of times a consultant comes to me asking for a solution to this particular problem is insane. Just a week ago, I talked an experienced consultant through why this type of data governance approach is really useful and why you should aim to use it on all projects (time and resources allowing).
It also shows that you can think ahead to problems before they occur, which can land you a job after your contract ends, right?!

Details

The Client

Your client for this project is a real database hardass who runs a team of highly skilled and efficient data engineers and analysts. His name is Gregory Lawson, and he is terrible at pub trivia. Gregory likes things to be controlled, reliable and stable. He also likes chocolate (just saying).

Deliverables

A complete ERD describing the database objects involved in your proposed solution.
Creation of views, tables, triggers and functions within the Postgres database provided for this exercise.
A business process flow describing in detail how the data flows through the various database objects outlined in your ERD.
Technical breakdown of views, tables, triggers, functions and any other objects used as part of your solution.
List what objects you are auditing and why, what specific errors you are looking for, and what logging capabilities you have set up.
Generate an email via API relating to critical alerts that need to be urgently reviewed.
A simple dashboard to highlight key metrics to your client.
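
To make the trigger-driven deliverables more concrete, here is a minimal sketch of the pattern: one trigger keeps a summary table current as rows arrive, and a second writes an audit log entry when a row looks wrong. It is an illustration under assumptions, not the expected solution. It uses Python's built-in sqlite3 module as a stand-in so it runs anywhere; in the Postgres database for this exercise the same idea would be written as trigger functions. The table names (orders, order_summary, audit_log) and the two checks are hypothetical.

```python
import sqlite3

# Hypothetical schema: SQLite stands in for the project's Postgres database.
SCHEMA = """
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);

-- Single-row summary table, kept current by a trigger.
CREATE TABLE order_summary (
    id INTEGER PRIMARY KEY CHECK (id = 1),
    row_count INTEGER NOT NULL,
    total_amount REAL NOT NULL
);
INSERT INTO order_summary VALUES (1, 0, 0.0);

CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    order_id INTEGER,
    severity TEXT NOT NULL,
    message TEXT NOT NULL,
    logged_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Aggregate every new row into the summary as it arrives.
CREATE TRIGGER orders_summarise AFTER INSERT ON orders
BEGIN
    UPDATE order_summary
    SET row_count = row_count + 1,
        total_amount = total_amount + COALESCE(NEW.amount, 0)
    WHERE id = 1;
END;

-- Log an audit entry when a new row fails a basic sanity check.
CREATE TRIGGER orders_audit AFTER INSERT ON orders
WHEN NEW.amount IS NULL OR NEW.amount < 0
BEGIN
    INSERT INTO audit_log (order_id, severity, message)
    VALUES (
        NEW.id,
        'critical',
        CASE WHEN NEW.amount IS NULL
             THEN 'amount is missing'
             ELSE 'amount is negative'
        END
    );
END;
"""


def build_demo_db():
    """Create an in-memory database with the demo tables and triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def load_orders(conn, amounts):
    """Insert one order per amount; the triggers fire on each insert."""
    conn.executemany(
        "INSERT INTO orders (amount) VALUES (?)", [(a,) for a in amounts]
    )
    conn.commit()
```

After calling load_orders, querying order_summary and audit_log shows the running totals and any flagged rows without a separate batch job. Whether a bad row should still count towards the summary, as it does here, is a design decision to document in your technical breakdown.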

Outcomes & Questions

Remember what an ERD is
Understand what database tables, views, triggers and functions are
Think about what would be important for a client to know about their data cleanliness:
Is the item business critical, important, normal or low priority?
When should it be flagged and fixed in the data lifecycle?
What logging should be available for review later?
Learn how audit logs can be useful for troubleshooting
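
One way to think about the priority question above is to attach a priority to each audit finding and let only the business critical ones raise an urgent alert. The sketch below is a rough illustration under assumptions: the Finding class, the helper names and the example.com addresses are all hypothetical, and it only builds the email message with Python's standard library. Actually sending it through an email API is left to whichever provider you choose.

```python
from dataclasses import dataclass
from email.message import EmailMessage

# The four priority levels named in the brief.
PRIORITIES = ("business critical", "important", "normal", "low")


@dataclass
class Finding:
    """A single audit finding (hypothetical structure)."""
    object_name: str
    message: str
    priority: str


def critical_findings(findings):
    """Keep only the findings that need urgent review."""
    return [f for f in findings if f.priority == "business critical"]


def build_alert_email(findings, sender, recipient):
    """Build an alert email for critical findings, or None if there are none."""
    critical = critical_findings(findings)
    if not critical:
        return None
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[CRITICAL] {len(critical)} audit finding(s) need review"
    msg.set_content(
        "\n".join(f"- {f.object_name}: {f.message}" for f in critical)
    )
    return msg
```

Lower-priority findings would stay in the audit log for later review rather than triggering an email.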

Constraints

Alteryx cannot be used as part of your solution (but you can use it to analyse the data)
You’re not expected to be an expert data engineer - use AI wisely
You’ve got access to the pro version of Claude

Credentials
