Building Data Pipelines
Explore data pipelines and methods of processing them with and without ETL (extract, transform, load). In this course, you will learn to create data pipelines by using the Apache Airflow workflow management platform. Key concepts covered here include data pipelines as applications that sit between a data source and a data target, turning raw data into a transformed data set; how to build a traditional ETL pipeline with batch processing; and how to build an ETL pipeline with stream processing. Next, learn how to set up and install Apache Airflow; the key concepts of Apache Airflow; and how to instantiate a directed acyclic graph (DAG) in Airflow. Learners are shown how to use tasks and include arguments in Airflow; how to use dependencies in Airflow; how to build an ETL pipeline with Airflow; and how to build an automated pipeline without using ETL. Finally, learn how to test Airflow tasks by using the airflow command-line utility and how to use Apache Airflow to create a data pipeline.
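As a rough illustration of the batch ETL idea described above, the following is a minimal sketch in plain Python using only the standard library. It is not taken from the course material: the sample data, the function names (extract, transform, load, run_pipeline), and the in-memory list standing in for a data target are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical raw input standing in for a data source (assumption).
RAW_CSV = "name,amount\nalice,10\nbob,not_a_number\ncarol,32\n"


def extract(source_text):
    """Extract: read raw rows from CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(source_text)))


def transform(rows):
    """Transform: drop rows whose amount is not an integer, tidy names."""
    cleaned = []
    for row in rows:
        try:
            amount = int(row["amount"])
        except ValueError:
            continue  # skip malformed records instead of failing the batch
        cleaned.append({"name": row["name"].title(), "amount": amount})
    return cleaned


def load(rows, target):
    """Load: append transformed rows to the target and report the count."""
    target.extend(rows)
    return len(rows)


def run_pipeline(source_text, target):
    """Run the three stages in order over one batch."""
    return load(transform(extract(source_text)), target)


if __name__ == "__main__":
    warehouse = []  # in-memory stand-in for a data target (assumption)
    loaded = run_pipeline(RAW_CSV, warehouse)
    print(loaded, warehouse)
```

In an Airflow setting, each of these stages would typically become its own task in a DAG, with dependencies ordering them as extract, then transform, then load; the sketch above only shows the data flow, not Airflow itself.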