The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI

Introducing Airflow’s Common AI Provider with Pavan Kumar Gopidesu and Kaxil Naik

29 min · 23 April 2026

In this episode, we explore the newly released Apache Airflow common AI provider — what problem it solves, how it was built and what's coming next.


Kaxil Naik, Senior Director of Engineering at Astronomer and Apache Airflow PMC member, and Pavan Kumar Gopidesu, Lead Data Engineer at Experian and Apache Airflow PMC member, join us to walk through the provider's first release and the technical decisions behind it.


Key Takeaways:


00:00 Introduction.

04:05 The common AI provider was born from a real production problem.

07:10 Airflow already had the primitives needed for durable agent execution, making it the natural foundation for AI orchestration. 

09:15 The LLM schema compare operator uses Apache DataFusion to fetch source schemas.

11:07 Apache DataFusion was chosen for its speed.

13:09 Hook tool sets expose Airflow's provider hooks to agents with an allowed methods list that blocks destructive operations.

15:20 Passing durable=True to an LLM operator caches tool calls and LLM outputs mid-task. 

18:13 The provider offers three abstraction levels. 

21:20 The provider currently requires Airflow 3 — the team is open to adding Airflow 2.11 support if demand is high enough. 

24:10 MCP server configs can be stored as Airflow connections.



Resources Mentioned:


Kaxil Naik

https://www.linkedin.com/in/kaxil/


Pavan Kumar Gopidesu

https://www.linkedin.com/in/pavan-kumar-gopidesu/


Astronomer | LinkedIn

https://www.linkedin.com/company/astronomer/


Astronomer | Website

https://www.astronomer.io


Experian

https://www.linkedin.com/company/experian/


Apache Airflow

https://www.linkedin.com/company/apache-airflow


Apache Airflow common AI provider docs

https://airflow.apache.org/docs/apache-airflow-providers-common-ai/stable/commits.html


Apache DataFusion

https://datafusion.apache.org/


Pydantic AI

https://pydantic.dev/docs/ai/overview/


Airflow Slack

https://airflow.apache.org/docs/apache-airflow-providers-slack/stable/index.html


Introducing the Common AI Provider: LLM and AI Agent Support for Apache Airflow

https://airflow.apache.org/blog/common-ai-provider/





Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.



#Automation #Airflow #MachineLearning

The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI, by Astronomer, is available on multiple platforms. The information on this page comes from public podcast feeds.