Build A Data Lake For Your Security Logs With Scanner

Data Engineering Podcast

Summary: Monitoring and auditing IT systems for security events requires the ability to quickly analyze massive volumes of unstructured log data. Cliff Crosland co-founded Scanner to provide fast querying of high-scale log data for security auditing. Can you describe what Scanner is and the story behind it?

Data Lake 147

Find Out About The Technology Behind The Latest PFAD In Analytical Database Development

Data Engineering Podcast

Over the decades of research and development into building these software systems, a number of common components have come to be shared across implementations. Data lakes are notoriously complex. Can you describe the operational/architectural aspects of building a full data engine on top of the FDAP stack?

Database 162

Keep Your Data Lake Fresh With Real Time Streams Using Estuary

Data Engineering Podcast

The batch world has been the default for years because of the complexities of running a reliable streaming system at scale. In this episode, David Yaffe and Johnny Graettinger share the story behind the business and technology, and how you can start using it today to build a real-time data lake without all of the headache.

Data Lake 162

Designing Data Transfer Systems That Scale

Data Engineering Podcast

Summary: The first step of data pipelines is to move the data to a place where you can process and prepare it for its eventual purpose. Data transfer systems are a critical component of data enablement, and building them to support large volumes of information is a complex endeavor.

Systems 130

Making Email Better With AI At Shortwave

Data Engineering Podcast

Summary: Generative AI has rapidly transformed everything in the technology sector. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Dagster offers a new approach to building and running data platforms and data pipelines. Your first 30 days are free!

Data Lake 182

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Engineering Podcast

Summary: A data lakehouse is intended to combine the benefits of data lakes (cost-effective, scalable storage and compute) and data warehouses (user-friendly SQL interface). Data lakes are notoriously complex. To start, can you share your definition of what constitutes a "Data Lakehouse"?

Data Lake 262

Announcing New Innovations for Data Warehouse, Data Lake, and Data Lakehouse in the Data Cloud 

Snowflake

Over the years, the technology landscape for data management has given rise to various architecture patterns, each thoughtfully designed to cater to specific use cases and requirements. Use cases change, needs change, technology changes – and therefore data infrastructure should be able to scale and evolve with change.