Data platforms are moving from batch-first pipelines to near real-time systems where orchestration, observability, scalability and governance all have to work together.
In this episode, Arun Karthik, Director, Data Solutions Engineering at Condé Nast Technology Lab, joins us to share how data engineering evolves from relational databases and ETL into distributed processing, modern orchestration with Apache Airflow and managed Airflow with Astronomer.
Key Takeaways:
00:00 Introduction.
02:13 Early data systems rely heavily on relational databases and batch-oriented processing models.
07:01 Scheduling requirements evolve beyond fixed time windows as dependencies increase.
10:14 Ease of use and developer experience influence adoption of orchestration frameworks.
13:22 Operating open source orchestration tools requires ongoing engineering effort.
14:45 Managed services help teams reduce infrastructure and maintenance responsibilities.
17:27 Observability improves confidence in pipeline execution and system health.
19:12 Governance considerations grow in importance as data platforms mature.
20:46 Building data systems requires balancing speed, reliability and long-term sustainability.
Resources Mentioned:
Arun Karthik | LinkedIn
https://www.linkedin.com/in/earunkarthik/
Condé Nast Technology Lab | LinkedIn
https://www.linkedin.com/company/conde-nast-technology-lab/
Condé Nast Technology Lab | Website
https://www.condenast.com/
https://airflow.apache.org/
https://www.astronomer.io/
https://spark.apache.org/
https://hadoop.apache.org/
https://www.jenkins.io/
https://www.getdbt.com/product/what-is-dbt
https://aws.amazon.com/free/
Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.
#AI #Automation #Airflow
