Maturation Curve
1) Awareness - Aware of DQ issues; collecting a list of issues manually through exploration, project work, or reporting errors raised by users. Good example is
2) Reactive (this is where we are) - Automated alerting based on data critical or business critical data quality checks that fire after data is in production. There is a risk that stakeholders are impacted and the data team is always working behind the eight ball, but the automation helps the team catch and address these issues before stakeholders raise them
3) Proactive - The data team is able to prevent data critical failures from hitting production, while also being able to alert stakeholders of anomalies, bugs, and other negative trends through business critical checks
Types of Checks
Data Critical
These errors should be breaking errors for our data pipeline. Data impacted by these errors should be proactively prevented from getting to production as they pose a serious risk to the integrity of our data and reporting.
Business Critical
These errors are what we think of when we say "exception reporting". They are normally related to oddities from upstream data sources, funky/outdated logic, or unconventional system design/business processes. We should be aware and warned of these issues with the end goal of being able to quickly diagnose and triage.
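The difference between the two check types can be sketched in plain Python. This is a minimal illustration, not our Squarewave or Anomalo setup: the `Check` class, the check names, and the sample columns (`order_id`, `amount`) are all hypothetical. Data critical failures raise and stop the load; business critical failures are collected as warnings for triage.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, List


class Severity(Enum):
    DATA_CRITICAL = "data_critical"          # breaking: block the load
    BUSINESS_CRITICAL = "business_critical"  # exception reporting: warn only


@dataclass
class Check:
    name: str
    severity: Severity
    passes: Callable[[List[Dict[str, Any]]], bool]


class DataCriticalFailure(Exception):
    """Raised when one or more data critical checks fail."""


def run_checks(rows: List[Dict[str, Any]], checks: List[Check]) -> List[str]:
    """Run every check; raise on data critical failures, return warning names."""
    breaking, warnings = [], []
    for check in checks:
        if check.passes(rows):
            continue
        if check.severity is Severity.DATA_CRITICAL:
            breaking.append(check.name)
        else:
            warnings.append(check.name)
    if breaking:
        raise DataCriticalFailure(", ".join(breaking))
    return warnings


# Hypothetical example checks on hypothetical columns.
CHECKS = [
    Check("order_id_not_null", Severity.DATA_CRITICAL,
          lambda rows: all(r["order_id"] is not None for r in rows)),
    Check("amount_within_expected_range", Severity.BUSINESS_CRITICAL,
          lambda rows: all(0 <= r["amount"] <= 10_000 for r in rows)),
]
```

Collecting every failure before raising means a single run reports all breaking checks at once, rather than stopping at the first.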
DQ Tools
Slack
- notifies our team of Squarewave ETL failures and other data quality failures raised by Anomalo
- allows stakeholders to reach our team in the event that they find DQ issues in our reporting or Snowflake tables
Anomalo
What is Anomalo?
Anomalo is a tool that data teams at Block leverage to run various data quality checks on their tables. Anomalo can check things such as daily freshness, expected values, table granularity, and other custom checks that can be created.
SCA Anomalo Configuration
The SCA team currently has all ETLs configured for fundamental daily checks under the label. These daily checks include:
The next evolution for using Anomalo is to introduce validation rules. Validation rules can cover:
- Checking every value of a column
- Checking a relationship between multiple columns
- Comparing multiple tables or SQL outputs
- Checking column names and data types
- Checking that a column defined in custom SQL logic is always true
- Checking that a custom SQL query returns no bad data
The final evolution for using Anomalo is to develop key metrics. Metrics can be defined using a variety of pre-defined aggregates such as:
(Photo from this )
For more reading on Anomalo’s functionality and possibilities see their .
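To make the validation rule types above more concrete, here is a rough sketch of the "custom SQL query returns no bad data" pattern, written against an in-memory SQLite database rather than Anomalo or Snowflake. It does not use Anomalo's API; the `shipments` table, its columns, and the rule names are assumptions made up for illustration. Each rule is a query that selects the offending rows, so a rule passes when its query returns nothing.

```python
import sqlite3
from typing import Dict


def build_demo_db() -> sqlite3.Connection:
    """Create a small hypothetical table with one deliberately bad row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shipments "
        "(id INTEGER, shipped_at TEXT, delivered_at TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO shipments VALUES (?, ?, ?, ?)",
        [
            (1, "2024-01-01", "2024-01-03", "delivered"),
            (2, "2024-01-05", "2024-01-04", "delivered"),  # delivered before shipped
            (3, "2024-01-06", None, "in_transit"),
        ],
    )
    return conn


# Each rule selects the rows that violate it (hypothetical rules).
RULES: Dict[str, str] = {
    # Every value of a column is in an expected set.
    "status_in_expected_set":
        "SELECT id FROM shipments "
        "WHERE status NOT IN ('in_transit', 'delivered')",
    # A relationship between multiple columns holds.
    "delivered_after_shipped":
        "SELECT id FROM shipments "
        "WHERE delivered_at IS NOT NULL AND delivered_at < shipped_at",
}


def evaluate(conn: sqlite3.Connection, rules: Dict[str, str]) -> Dict[str, int]:
    """Return the number of bad rows per rule; 0 means the rule passed."""
    return {name: len(conn.execute(sql).fetchall()) for name, sql in rules.items()}
```

Writing rules as "return the bad rows" queries keeps the failing records at hand for diagnosis, which fits the triage goal of business critical checks.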
JIRA
Any data quality bugs should be raised in JIRA as a bug ticket in the SCA team project.
Create JIRA ticket
Next Steps on Our Road to Proactive
1) Identify critical, warning, and contextual alerts/warnings/metrics our team wants to enable in the
2) Prioritize the critical and warning level alerts to solidify the foundation of our framework
3) Build out and test the contextual business alerts and metrics we’d like to create to test the full capacity of Anomalo
4) Explore avenues to prevent bad production data that fails critical tests from reaching stakeholders and reports
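One possible avenue for the last step is a staging gate: load new data into a staging table, run the critical checks there, and promote to production only if they all pass. The sketch below shows the idea with SQLite. It is an assumption about how such a gate could look, not a description of our current ETLs, and the function name, table names, and check are hypothetical.

```python
import sqlite3
from typing import Dict, List

# Hypothetical critical checks: each query returns a row if bad data exists.
CRITICAL_CHECKS: Dict[str, str] = {
    "id_not_null": "SELECT 1 FROM {table} WHERE id IS NULL LIMIT 1",
}


def promote_if_clean(
    conn: sqlite3.Connection,
    staging: str,
    production: str,
    critical_checks: Dict[str, str],
) -> List[str]:
    """Replace production with staging only if every critical check passes.

    Returns the names of failed checks; an empty list means data was promoted.
    Table names are interpolated directly, so they must be trusted constants.
    """
    failures = [
        name
        for name, sql in critical_checks.items()
        if conn.execute(sql.format(table=staging)).fetchone() is not None
    ]
    if failures:
        return failures  # production is left untouched

    # Swap the data inside one transaction so readers never see a partial load.
    with conn:
        conn.execute(f"DELETE FROM {production}")
        conn.execute(f"INSERT INTO {production} SELECT * FROM {staging}")
    return []
```

When a check fails, production keeps the last good load and the returned check names can feed the alerting described earlier.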