Feeding the Data-lake I: Task management / automation using Airflow
## What is Airflow?

Airflow is a platform to programmatically author, schedule, and monitor workflows. The idea is to program your workflow automation in Python. Workflows are built on Airflow using the DAG concept, where DAG stands for Directed Acyclic Graph. Tasks are then interpreted and scheduled for execution by the Airflow execution engine.

## What is a DAG?

In graph theory, a directed acyclic graph is a graph whose edges have a direction and which contains no cycles: following the edges, you can never return to a node you have already visited. In Airflow, the nodes are tasks and the edges are the dependencies between them, so the graph defines the order in which tasks must run.

## Airflow install in Docker

Let's build a Docker cluster to run Airflow locally by following the official documentation. Please refer to that documentation if you bump into any issues or want a more thorough view of this solution.

### Step 1: Install docker-compose

```shell
sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
```

### Step 2: Get the docker-compose YAML example file

This YAML file will have...
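To make the DAG idea concrete, here is a minimal pure-Python sketch (not Airflow code) of how a scheduler can order tasks from their dependencies. The task names (`extract`, `transform`, `load`, `notify`) are hypothetical, and it uses the standard library's `graphlib` (Python 3.9+) rather than Airflow's own scheduler, which does the same kind of ordering at a much larger scale.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical mini-pipeline: extract -> transform -> load -> notify.
# Mapping: task -> set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}

# A topological sort orders the tasks so that every task runs only
# after all of its dependencies have run -- the same guarantee a DAG
# scheduler gives.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['extract', 'transform', 'load', 'notify']

# The "acyclic" part matters: with a cycle, no valid order exists,
# and graphlib reports it instead of scheduling anything.
try:
    list(TopologicalSorter({"a": {"b"}, "b": {"a"}}).static_order())
except CycleError:
    print("cycle detected: not a valid DAG")
```

This is why Airflow insists on acyclic graphs: a cycle would mean some task can never have all of its dependencies satisfied.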