Safely Test Your Applications And Analytics With Production Quality Data Using Tonic AI

January 22nd, 2023

45 mins 40 secs

About this Episode

Summary

The most interesting and challenging bugs always happen in production, but recreating them is a constant challenge due to differences in the data that you are working with. Building your own scripts to replicate data from production is time-consuming and error-prone. Tonic is a platform designed to solve the problem of having reliable, production-like data available for developing and testing your software, analytics, and machine learning projects. In this episode Adam Kamor explores the factors that make this such a complex problem to solve, the approach that he and his team have taken to turn it into a reliable product, and how you can start using it to replace your own collection of scripts.

Announcements

  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Truly leveraging and benefiting from streaming data is hard - the data stack is costly, difficult to use and still has limitations. Materialize breaks down those barriers with a true cloud-native streaming database - not simply a database that connects to streaming systems. With a PostgreSQL-compatible interface, you can now work with real-time data using ANSI SQL including the ability to perform multi-way complex joins, which support stream-to-stream, stream-to-table, table-to-table, and more, all in standard SQL. Go to dataengineeringpodcast.com/materialize today and sign up for early access to get started. If you like what you see and want to help make it better, they're hiring across all functions!
  • Data and analytics leaders, 2023 is your year to sharpen your leadership skills, refine your strategies and lead with purpose. Join your peers at Gartner Data & Analytics Summit, March 20 – 22 in Orlando, FL for 3 days of expert guidance, peer networking and collaboration. Listeners can save $375 off standard rates with code GARTNERDA. Go to dataengineeringpodcast.com/gartnerda today to find out more.
  • Your host is Tobias Macey and today I'm interviewing Adam Kamor about Tonic, a service for generating data sets that are safe for development, analytics, and machine learning

Interview

  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what Tonic is and the story behind it?
  • What are the core problems that you are trying to solve?
  • What are some of the ways that fake or obfuscated data is used in development and analytics workflows?
  • What are the challenges of reliably subsetting data?
    • What is the impact of ORMs and the bad habits developers get into with database modeling?
  • Can you describe how Tonic is implemented?
    • What are the units of composition that you are building to allow for evolution and expansion of your product?
    • How have the design and goals of the platform evolved since you started working on it?
  • Can you describe some of the different workflows that customers build on top of your various tools?
  • What are the most interesting, innovative, or unexpected ways that you have seen Tonic used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Tonic?
  • When is Tonic the wrong choice?
  • What do you have planned for the future of Tonic?

Parting Question

  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

  • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
  • To help other people find the show please leave a review on Apple Podcasts and tell your friends and co-workers.

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
