

How to work with Django models inside Airflow tasks?

According to the official Airflow documentation, Airflow provides hooks for interacting with databases (MySqlHook, PostgresHook, etc.) that can later be used in operators for raw query execution. Attaching the core code fragments:

    class MySqlHook(DbApiHook):
        ...
        conn = self.get_connection(self.mysql_conn_id)
        conn_config = ...  # built from conn, e.g. conn.host or 'localhost'

    class MySqlOperator(BaseOperator):
        def __init__(self, sql, mysql_conn_id='mysql_default', parameters=None, ..., *args, **kwargs):
            super(MySqlOperator, self).__init__(*args, **kwargs)
            ...

        def execute(self, context):
            hook = MySqlHook(mysql_conn_id=self.mysql_conn_id)
            ...

As we can see, the hook encapsulates the connection configuration, while the operator provides the ability to execute custom queries.

It is often more convenient to use an ORM for fetching and processing database objects instead of raw SQL. When the models are already defined elsewhere (for example, in Django), the raw SQL queries in Airflow need to be rewritten every time those models' schemas change, whereas an ORM provides a unified interface for working with such models.

For some reason, there are no examples of working with an ORM in Airflow tasks in terms of hooks and operators. According to the "Using Django database layer outside of Django?" question, you need to set up the connection configuration to the database and then simply execute queries through the ORM, but doing that outside appropriate hooks / operators breaks Airflow principles. It's like calling BashOperator with a "python work_with_django_models.py" command.

So what are the best practices in this case? Are there any shared hooks / operators for the Django ORM or other ORMs? The goal is to make code like the following real (treat it as pseudo-code!):

    import os
    ...
    all_objects = ...(my_str_field='abc')
    ...
    django_op = DjangoOperator(task_id='get_and_modify_models', owner='airflow')

I think it's a pretty important topic, as a whole bunch of ORM-based frameworks and processes are not able to dive into Airflow in this case.

I agree we should continue to have this discussion, as having access to the Django ORM can significantly reduce the complexity of solutions compared with implementing the same functionality in raw SQL.

My approach has been to 1) create a DjangoOperator:

    import os, sys
    ...
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "...settings")
    ...

and 2) extend that DjangoOperator for logic / operators that would benefit from having access to the ORM:

    from .base import DjangoOperator

    class DjangoExampleOperator(DjangoOperator):
        ...

With this strategy, you can then distinguish between operators that use raw SQL and those that use the ORM.
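The two-step DjangoOperator approach described in the answer could be fleshed out roughly as follows. This is only a sketch under assumptions: the project path, settings module, app, and model names (`/opt/my_django_project`, `myproject.settings`, `myapp.models.MyModel`) are hypothetical placeholders, `prepare_django_env` is a helper name introduced here, and the `BaseOperator` fallback exists solely so the snippet can be read and run without Airflow installed.

```python
import os
import sys

try:
    from airflow.models import BaseOperator
except ImportError:
    # Stand-in used only when Airflow is not installed.
    class BaseOperator:
        def __init__(self, task_id=None, **kwargs):
            self.task_id = task_id


def prepare_django_env(project_root, settings_module):
    """Make the Django project importable and name its settings module."""
    if project_root not in sys.path:
        sys.path.append(project_root)
    # setdefault keeps any value that is already set in the environment.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)


def setup_django_for_airflow(project_root, settings_module):
    """Prepare the environment, then initialise Django."""
    prepare_django_env(project_root, settings_module)
    import django  # imported lazily: only needed on the worker at run time
    django.setup()


class DjangoOperator(BaseOperator):
    """Base operator that sets up Django before execute() runs."""

    # Hypothetical values; point these at your own project.
    project_root = "/opt/my_django_project"
    settings_module = "myproject.settings"

    def pre_execute(self, context):
        setup_django_for_airflow(self.project_root, self.settings_module)


class DjangoExampleOperator(DjangoOperator):
    def execute(self, context):
        # Models can only be imported after django.setup() has run.
        from myapp.models import MyModel  # hypothetical app and model
        return MyModel.objects.filter(my_str_field="abc").count()
```

In a DAG this would be used like any other operator, e.g. `DjangoExampleOperator(task_id='get_and_modify_models', dag=dag)`. Doing the setup in `pre_execute` keeps Django out of DAG parsing, so the scheduler does not need to load the Django project just to read the DAG file.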

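The question points out that using the Django ORM outside Django requires setting up the database connection configuration. One possible way to keep that configuration inside Airflow, rather than in a separate script, is to derive Django's `DATABASES` setting from an Airflow connection. The sketch below is an illustration, not an established Airflow or Django feature: `django_databases_from_connection` and `configure_django` are hypothetical helper names, the connection values are invented, and the object passed in is assumed to expose the `host`, `login`, `password`, `schema`, and `port` attributes that Airflow's `Connection` model has.

```python
from types import SimpleNamespace


def django_databases_from_connection(conn, engine="django.db.backends.mysql"):
    """Map an Airflow-style connection object to Django's DATABASES setting."""
    return {
        "default": {
            "ENGINE": engine,
            "NAME": conn.schema or "",
            "USER": conn.login or "",
            "PASSWORD": conn.password or "",
            "HOST": conn.host or "localhost",
            "PORT": str(conn.port) if conn.port else "",
        }
    }


def configure_django(conn, installed_apps):
    """Configure a standalone Django ORM at most once per process."""
    # Imported lazily so this module can be loaded where Django is absent.
    import django
    from django.conf import settings

    if not settings.configured:
        settings.configure(
            DATABASES=django_databases_from_connection(conn),
            INSTALLED_APPS=list(installed_apps),
        )
        django.setup()


# Stand-in for a connection object such as the one returned by
# BaseHook.get_connection(conn_id); all values here are invented.
example_conn = SimpleNamespace(
    host="db.internal", login="etl", password="secret", schema="shop", port=3306
)
example_databases = django_databases_from_connection(example_conn)
```

This mirrors what the MySqlHook fragment does with `conn.host or 'localhost'`: the credentials live in Airflow's connection store, and the task only translates them into the shape Django expects.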