
Data Engineering Using Data Build Tool

The pirate Jack Sparrow wants to know which ad campaign brought him the most customers (almost every startup faces this problem).

The following is the ad analytics capture infrastructure of Jack Sparrow’s startup.

[Image: ad analytics capture infrastructure]


Jack Sparrow’s signup / conversion data has the following data flow.

[Image: signup / conversion data flow]

Now the data warehouse has ads data and signup data under one roof. The next step is to connect a BI tool such as Mode Analytics, Chartio, or Looker and query the data using SQL. Although this method works and gets us the desired results, it has several issues:
Data in our data warehouse is not validated with any of the following checks:
  Null check
  Foreign key to primary key relationship
  Uniqueness
  Date format check
The entire schema and all of its columns have to be exposed to the BI tool and to the end user in order to build dashboards
There is no failure logging around data migration from third-party sources and the production DB to the data warehouse

To address these issues, we bring data build tool (DBT) into play.
Using DBT we can:
Perform null checks
Validate foreign key to primary key relationships
Check uniqueness
Check date formats
Use Jinja to slice and dice data
Use snapshots to go back in time
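
To make the four checks concrete, here is a rough sketch of the idea behind them. It is not DBT itself: it uses Python's built-in sqlite3 module as a stand-in warehouse, and the table and column names (campaigns, signups, signup_id, campaign_id, signed_up_on) are invented for illustration. Each check is a SQL query that selects the offending rows, so a check passes when its query returns nothing, which is also how DBT treats its data tests.

```python
import sqlite3

# In-memory database standing in for the data warehouse.
# All table and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (campaign_id INTEGER, name TEXT);
    CREATE TABLE signups (signup_id INTEGER, campaign_id INTEGER, signed_up_on TEXT);
    INSERT INTO campaigns VALUES (1, 'search'), (2, 'social');
    INSERT INTO signups VALUES
        (10, 1, '2021-03-01'),
        (11, 2, '2021-03-02'),
        (11, 2, '03/02/2021'),   -- duplicate id and a badly formatted date
        (12, 9, '2021-03-04'),   -- campaign 9 does not exist
        (NULL, 1, '2021-03-05'); -- missing id
""")

# Each check selects the rows that violate the rule.
CHECKS = {
    "not_null": "SELECT * FROM signups WHERE signup_id IS NULL",
    "unique": """
        SELECT signup_id FROM signups
        WHERE signup_id IS NOT NULL
        GROUP BY signup_id HAVING COUNT(*) > 1
    """,
    "relationship": """
        SELECT s.signup_id FROM signups s
        LEFT JOIN campaigns c ON c.campaign_id = s.campaign_id
        WHERE s.campaign_id IS NOT NULL AND c.campaign_id IS NULL
    """,
    "date_format": """
        SELECT signup_id FROM signups
        WHERE signed_up_on NOT GLOB '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]'
    """,
}


def failing_rows(connection):
    """Run every check and return the violating rows per check name."""
    return {name: connection.execute(sql).fetchall() for name, sql in CHECKS.items()}


results = failing_rows(conn)
for name, rows in results.items():
    print(name, "PASS" if not rows else f"FAIL ({len(rows)} rows)")
```

With the sample rows above every check fails, which is the point: the bad records are caught before anyone builds a dashboard on them.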
Plugging DBT into our data warehouse looks like the following.
[Image: data warehouse with DBT plugged in]
Now we only have to connect the Analytics schema to the BI tool, and our end users only need to query consolidated tables to build dashboards.
The advantages of this style of data engineering are:
Data is vetted
We receive alerts when the system encounters errors
Data quality is maintained
Only a few tables are exposed to internal end users
Internal end users need not write complex joins to get the desired output
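
As a small illustration of the last two points, the sketch below builds one consolidated object on top of two raw tables and then answers Jack Sparrow's original question without the end user writing a join. It again uses Python's built-in sqlite3 module as a stand-in warehouse. The raw tables, their columns, and the view name analytics_signups_by_campaign are assumptions made up for this example, not the actual schema described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical raw tables standing in for the ads and signup data.
    CREATE TABLE raw_ad_clicks (click_id INTEGER, campaign TEXT);
    CREATE TABLE raw_signups (signup_id INTEGER, click_id INTEGER);
    INSERT INTO raw_ad_clicks VALUES (1, 'search'), (2, 'search'), (3, 'social');
    INSERT INTO raw_signups VALUES (100, 1), (101, 2), (102, 3);

    -- Consolidated object: the only thing the BI tool would need to see.
    CREATE VIEW analytics_signups_by_campaign AS
    SELECT c.campaign, COUNT(s.signup_id) AS signups
    FROM raw_ad_clicks c
    LEFT JOIN raw_signups s ON s.click_id = c.click_id
    GROUP BY c.campaign;
""")

# The end user's query: no joins, no knowledge of the raw tables.
top = conn.execute(
    "SELECT campaign, signups FROM analytics_signups_by_campaign "
    "ORDER BY signups DESC, campaign LIMIT 1"
).fetchone()
print(top)
```

The join lives in one place, so a dashboard author only needs to know the name of the consolidated view.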


Get in touch → chris@omelet.xyz
