In this episode of the AI Today podcast, hosts Kathleen Walch and Ron Schmelzer define terms related to data. Data is the heart of AI, so it’s important to understand the role it plays in AI and ML projects. In this episode we go over the terms data engineer, data engineering, and data pipeline. Each project has its own unique data pipeline, which is a set of interconnected steps developed as part of a data engineering process. The pipeline carries out operations such as transformation, integration, aggregation, and other data-centric activities between data sources and the final destinations where the data are used.
Additionally, we go over the term data wrangling, which is the process of transforming data from its raw form into its desired form. We also discuss data feed, data governance, and data integration. We explain how these terms relate to AI and why it’s important to know about them.
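To make the pipeline and wrangling terms a little more concrete, here is a minimal sketch in Python. It is not taken from the episode: the record layouts, the sample values, and the function names (`wrangle`, `integrate`, `aggregate`, `run_pipeline`) are all hypothetical, chosen only to show raw data moving through wrangling, integration, and aggregation steps on its way to a final destination.

```python
from collections import defaultdict

# Hypothetical raw records from two made-up sources (illustration only).
RAW_SALES = [
    {"region": " North ", "amount": "120.50"},
    {"region": "south", "amount": "80"},
    {"region": "North", "amount": "n/a"},  # malformed amount, dropped below
]
RAW_RETURNS = [
    {"region": "north", "amount": "20.50"},
]


def wrangle(records, sign=1.0):
    """Data wrangling: turn raw string records into clean (region, amount) pairs."""
    clean = []
    for rec in records:
        try:
            amount = float(rec["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed as a number
        clean.append((rec["region"].strip().lower(), sign * amount))
    return clean


def integrate(*sources):
    """Data integration: combine rows from several cleaned sources into one list."""
    return [row for source in sources for row in source]


def aggregate(rows):
    """Aggregation: compute the net amount per region."""
    totals = defaultdict(float)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)


def run_pipeline():
    """Chain the steps together: sources -> wrangle -> integrate -> aggregate."""
    sales = wrangle(RAW_SALES)
    returns = wrangle(RAW_RETURNS, sign=-1.0)  # returns count against sales
    return aggregate(integrate(sales, returns))


if __name__ == "__main__":
    print(run_pipeline())
```

In a real project each step would likely be a separate job or service reading from and writing to actual data stores; the point here is only that a pipeline is a chain of small, interconnected data operations.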
Show Notes:
- FREE Intro to CPMAI mini course
- CPMAI Training and Certification
- The Steps for a Machine Learning Project
- AI Glossary
- AI Glossary Series – DevOps, Machine Learning Operations (ML Ops)
- AI Glossary Series – Data Preparation, Data Cleaning, Data Splitting, Data Multiplication, Data Transformation
- AI Glossary Series – Data Augmentation, Data Labeling, Bounding box, Sensor fusion
- AI Glossary Series – Data, Dataset, Big Data, DIKUW Pyramid
- AI Glossary Series – V’s of Big Data, Data Volume, Exabyte / Petabyte / Yottabyte / Zettabyte, Data Variety, Data Velocity, Data Veracity
- AI Glossary Series – Data Science, Data Scientist, Citizen Data Scientist / Citizen Developer, Data Custodian