What makes Airflow great?
=========================

* Write workflows as if you're writing programs (see the minimal DAG sketch after this list)
* Jobs can pass parameters to other jobs downstream (see the XCom sketch after this list)
* Logic within workflows (instead of logic hidden 'inside' a program)
* Handle errors and failures gracefully
* A large, active community and good community support
* Ease of deployment of workflow changes (continuous integration)
* Job testing through Airflow itself
* Low requirements for hardware setup
* Built-in storage of authentication details, with encrypted passwords and extra connection details
* Easy environmental awareness through Airflow Variables (see the Variables sketch after this list)
* Resource pooling for shared resources
* The most common tasks are already implemented and ready to use
* Accessibility of log files and other metadata through the web GUI
* Supporting documentation
* Growing user base and contributions
* Extensibility of the framework
* Ease of installation and automated redeployments
* Dynamic DAGs (see the dynamic-DAG sketch after this list)
* Conditional execution in job flow diagrams (see the branching sketch after this list)
* Easy to reprocess historical jobs by date
* Usability of the web interface; few clicks to get to where you want
* Detailed info about landing times, processing times and SLA misses
* Temporarily turning workflows on and off
* Easy to re-run processing for specific intervals
* Can (re)run only parts of the workflow and dependent tasks
* Schedules are defined in code, not in a separate tool and database
* Can run tasks based on whether the previous run succeeded or failed
* Implement trigger rules for tasks
* AJAX/REST API for job manipulation
* Jobs/tasks are run in a context, the scheduler passes in the necessary details
* Can verify what is running on Airflow and see the actual code
* Work gets distributed across your cluster at the task level, not at the DAG level
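
Several of these points are easiest to see in code. The first, writing workflows as programs with the schedule defined in the same file, looks roughly like the following. This is only a sketch, assuming classic Airflow 1.x-style imports and a hypothetical ``example_etl`` pipeline with two placeholder tasks:

.. code-block:: python

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    default_args = {
        'owner': 'airflow',
        'start_date': datetime(2018, 1, 1),
    }

    # The schedule lives in code, right next to the tasks it drives.
    dag = DAG(
        dag_id='example_etl',
        default_args=default_args,
        schedule_interval='@daily',
    )

    extract = BashOperator(
        task_id='extract',
        bash_command='echo "extracting data"',
        dag=dag,
    )

    load = BashOperator(
        task_id='load',
        bash_command='echo "loading data"',
        dag=dag,
    )

    # Dependencies are plain Python expressions.
    extract >> load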
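
Passing parameters from one job to a downstream job typically goes through XCom. A minimal sketch, assuming a hypothetical ``example_xcom`` DAG in which one task produces a file path and the next task consumes it:

.. code-block:: python

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator

    dag = DAG(
        dag_id='example_xcom',
        start_date=datetime(2018, 1, 1),
        schedule_interval='@daily',
    )

    def produce(**context):
        # The return value is pushed to XCom automatically.
        return '/tmp/export.csv'

    def consume(**context):
        # Pull the value pushed by the upstream task.
        path = context['ti'].xcom_pull(task_ids='produce_path')
        print('processing %s' % path)

    produce_path = PythonOperator(
        task_id='produce_path',
        python_callable=produce,
        provide_context=True,
        dag=dag,
    )

    consume_path = PythonOperator(
        task_id='consume_path',
        python_callable=consume,
        provide_context=True,
        dag=dag,
    )

    produce_path >> consume_path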
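
Environmental awareness through Airflow Variables usually looks like this sketch. The ``environment`` Variable here is hypothetical; it would be set differently on each deployment under Admin -> Variables in the web UI:

.. code-block:: python

    from airflow.models import Variable

    # 'environment' is a hypothetical Variable set per deployment.
    environment = Variable.get('environment', default_var='development')

    if environment == 'production':
        source_conn_id = 'warehouse_prod'
    else:
        source_conn_id = 'warehouse_dev'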
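
Conditional execution and trigger rules can be shown in one sketch: a ``BranchPythonOperator`` picks one of two hypothetical load paths, and a join task uses a non-default trigger rule so it still runs after the other branch is skipped. The DAG and task names are illustrative only:

.. code-block:: python

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.dummy_operator import DummyOperator
    from airflow.operators.python_operator import BranchPythonOperator

    dag = DAG(
        dag_id='example_branch',
        start_date=datetime(2018, 1, 1),
        schedule_interval='@daily',
    )

    def choose_path(**context):
        # Return the task_id of the branch to follow; the other branch is skipped.
        if context['execution_date'].weekday() < 5:
            return 'weekday_load'
        return 'weekend_load'

    branch = BranchPythonOperator(
        task_id='branch',
        python_callable=choose_path,
        provide_context=True,
        dag=dag,
    )

    weekday_load = DummyOperator(task_id='weekday_load', dag=dag)
    weekend_load = DummyOperator(task_id='weekend_load', dag=dag)

    # 'one_success' overrides the default "all upstream succeeded" rule,
    # so the join runs even though one branch was skipped.
    join = DummyOperator(task_id='join', trigger_rule='one_success', dag=dag)

    branch >> weekday_load >> join
    branch >> weekend_load >> join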
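
Dynamic DAGs are usually generated in a loop inside a single DAG file. A sketch, assuming a hypothetical list of source systems; in practice the list could come from a config file or an Airflow Variable:

.. code-block:: python

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    # Hypothetical source systems to ingest.
    SOURCES = ['crm', 'billing', 'weblogs']

    for source in SOURCES:
        dag_id = 'ingest_%s' % source
        dag = DAG(
            dag_id=dag_id,
            start_date=datetime(2018, 1, 1),
            schedule_interval='@daily',
        )

        BashOperator(
            task_id='ingest',
            bash_command='echo "ingesting %s"' % source,
            dag=dag,
        )

        # Each generated DAG must be exposed at module level
        # so the scheduler can discover it.
        globals()[dag_id] = dag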