Summary
This podcast started almost exactly six years ago, when the technology landscape was very different from what it is now. In that time there have been a number of generational shifts in how data engineering is done. In this episode I reflect on some of the major themes and take a brief look forward at some of the upcoming changes.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Your host is Tobias Macey and today I'm reflecting on the major trends in data engineering over the past 6 years
Interview
- Introduction
- 6 years of running the Data Engineering Podcast
- Around the first time that data engineering was discussed as a role
- Followed on from hype about "data science"
- Hadoop era
- Streaming
- Lambda and Kappa architectures
- Not really referenced anymore
- "Big Data" era of capture everything has shifted to focusing on data that presents value
- Regulatory environment increases risk; better tools introduce more capability to understand what data is useful
- Data catalogs
- Orchestration engines
- Oozie, etc. -> Airflow and Luigi -> Dagster, Prefect, Flyte (from Lyft), etc.
- Orchestration is now a part of most vertical tools
- Cloud data warehouses
- Data lakes
- DataOps and MLOps
- Data quality to data observability
- Metadata for everything
- Data catalog -> data discovery -> active metadata
- Business intelligence
- Read-only reports to metric/semantic layers
- Embedded analytics and data APIs
- Rise of ELT
- dbt
- Corresponding introduction of reverse ETL
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on running the podcast?
- What do you have planned for the future of the podcast?
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- To help other people find the show please leave a review on Apple Podcasts and tell your friends and co-workers.
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA