How to integrate Great Expectations Data Quality tests in Airflow? | Data pipeline | Data Quality

Published: 26 October 2024
Channel: BI Insights Inc

In this video, we cover how to integrate Great Expectations data quality tests into Apache Airflow. In this session, we use the Great Expectations (GE) provider for Airflow to run a Great Expectations suite. Our data asset is a PostgreSQL table.
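As a rough illustration of the integration described above, here is a minimal sketch of a DAG with one validation task. It assumes a 0.2.x release of airflow-provider-great-expectations (where `GreatExpectationsOperator` accepts `conn_id` and `data_asset_name`) and Airflow 2.4 or later. The DAG id, table name, suite name, connection id, and project path are all hypothetical placeholders, not the values used in the video.

```python
from datetime import datetime


def ge_task_kwargs(table: str, suite: str, conn_id: str, ge_root_dir: str) -> dict:
    """Collect keyword arguments for one GreatExpectationsOperator task."""
    return {
        "task_id": f"validate_{table}",
        "data_context_root_dir": ge_root_dir,  # folder holding great_expectations.yml
        "conn_id": conn_id,                    # Airflow connection to PostgreSQL
        "data_asset_name": table,              # table to validate
        "expectation_suite_name": suite,       # suite stored in the GE project
        "return_json_dict": True,              # JSON-serializable result for XCom
    }


def build_dag():
    """Build the DAG. Imports are deferred so this file loads without Airflow."""
    from airflow import DAG
    from great_expectations_provider.operators.great_expectations import (
        GreatExpectationsOperator,
    )

    with DAG(
        dag_id="ge_product_quality",      # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        schedule=None,                    # trigger manually (Airflow 2.4+ argument)
        catchup=False,
    ) as dag:
        GreatExpectationsOperator(
            **ge_task_kwargs(
                table="dim_product",               # hypothetical table name
                suite="product_suite",             # hypothetical suite name
                conn_id="postgres_default",        # hypothetical connection id
                ge_root_dir="/opt/airflow/gx",     # hypothetical GE project path
            )
        )
    return dag
```

Airflow only registers DAG objects found at module top level, so a real DAG file would end with `dag = build_dag()`.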

In this tutorial, we will see how to test an ETL pipeline with Great Expectations using Python. It is essential to test the quality of data before it lands in our production systems. We will focus on the Product dimension and employ various built-in GE data quality tests.
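To make the idea of built-in tests concrete, the sketch below assembles a small expectation suite as a plain Python dictionary and prints it as JSON. It follows the suite file layout used by pre-1.0 Great Expectations releases. The expectation types are real built-in ones, but the suite name and column names (`ProductKey`, `StandardCost`) are assumptions for illustration and are not taken from the suite used in the video.

```python
import json


def expectation(expectation_type: str, **kwargs) -> dict:
    """Return one expectation entry in the pre-1.0 GE suite JSON layout."""
    return {"expectation_type": expectation_type, "kwargs": kwargs, "meta": {}}


suite = {
    "expectation_suite_name": "product_suite",  # hypothetical suite name
    "expectations": [
        # The key column must be populated and unique (hypothetical column name).
        expectation("expect_column_values_to_not_be_null", column="ProductKey"),
        expectation("expect_column_values_to_be_unique", column="ProductKey"),
        # Costs should never be negative (hypothetical column name).
        expectation("expect_column_values_to_be_between", column="StandardCost", min_value=0),
        # The table should contain at least one row.
        expectation("expect_table_row_count_to_be_between", min_value=1),
    ],
    "meta": {},
}

# Serialize the suite so it can be compared with a suite file in a GE project.
print(json.dumps(suite, indent=2))
```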

Links to related sessions.

Link to GitHub (Updated DAG):
https://github.com/hnawaz007/pythonda...

Airflow Installation & Configuration with custom image:
   • Airflow Installation & Configurations...  

In the custom image, we add the following line to install the GE provider:
&& pip install airflow-provider-great-expectations
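After rebuilding the image, one way to confirm the provider is present is a small check run inside the container. This is only a sketch: it assumes the distribution installs an importable top-level module named `great_expectations_provider`, and the helper name `provider_status` is made up for this example.

```python
from importlib import metadata, util


def provider_status(
    dist: str = "airflow-provider-great-expectations",
    module: str = "great_expectations_provider",
) -> dict:
    """Report whether the distribution is installed and its module can be found."""
    try:
        version = metadata.version(dist)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed in this environment
    return {
        "installed_version": version,
        "importable": util.find_spec(module) is not None,
    }


if __name__ == "__main__":
    print(provider_status())
```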

Orchestrate SQL Data Pipelines with Airflow:
   • Orchestrate SQL Data Pipelines with A...  

How to test your Data Pipelines with Great Expectations:
   • How to test your Data Pipelines with ...  

How to create a Great Expectations suite?
   • How to create Great Expectations suit...  

Link to GE Expectations notebook:
https://github.com/hnawaz007/pythonda...

Link to GE suite used in the video:
https://github.com/hnawaz007/pythonda...


Link to Channel's site:
https://hnawaz007.github.io/
--------------------------------------------------------------

💥Subscribe to our channel:
   / haqnawaz  

📌 Links
-----------------------------------------
#️⃣ Follow me on social media! #️⃣

🔗 GitHub: https://github.com/hnawaz007
📸 Instagram:   / bi_insights_inc  
📝 LinkedIn:   / haq-nawaz  
🔗   / hnawaz100  
🚀 https://hnawaz007.github.io/

-----------------------------------------

#ETL #dataquality #Airflow

Topics in this video (click to jump around):
==================================
0:00 - Introduction to Great Expectations Data Quality
0:49 - Prerequisites
1:16 - Create Great Expectations suite
1:46 - Review Great Expectations Data Quality Tests
2:29 - Airflow DAG
2:49 - Integrate Great Expectations Data Quality in Airflow
3:34 - Airflow UI: DAG review & run
3:56 - DAG logs: review Data Quality test run