Build A Data Lake For Your Security Logs With Scanner

Data Engineering Podcast

Summary: Monitoring and auditing IT systems for security events requires the ability to quickly analyze massive volumes of unstructured log data. Cliff Crosland co-founded Scanner to provide fast querying of high-scale log data for security auditing. A query engine is useless without data to analyze.

Data Lake 147

Keep Your Data Lake Fresh With Real Time Streams Using Estuary

Data Engineering Podcast

Summary: Batch vs. streaming is a long-running debate in the world of data integration and transformation. In this episode David Yaffe and Johnny Graettinger share the story behind the business and technology of Estuary and how you can start using it today to build a real-time data lake without all of the headache.

Data Lake 162

Seamless SQL And Python Transformations For Data Engineers And Analysts With SQLMesh

Data Engineering Podcast

Summary: Data transformation is a key activity for all of the organizational roles that interact with data. Because of its importance and outsized impact on what is possible for downstream data consumers, it is critical that everyone is able to collaborate seamlessly. Can you describe what SQLMesh is and the story behind it?

Charting A Path For Streaming Data To Fill Your Data Lake With Hudi

Data Engineering Podcast

Summary: Data lake architectures have largely been biased toward batch processing workflows due to the volume of data that they are designed for. With more real-time requirements and the increasing use of streaming data, there has been a struggle to merge fast, incremental updates with large, historical analysis.

Data Lake 130

Announcing New Innovations for Data Warehouse, Data Lake, and Data Lakehouse in the Data Cloud 

Snowflake

Over the years, the technology landscape for data management has given rise to various architecture patterns, each thoughtfully designed to cater to specific use cases and requirements. These patterns include both centralized storage patterns like data warehouse, data lake, and data lakehouse, and distributed patterns such as data mesh.

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Engineering Podcast

Summary: A data lakehouse is intended to combine the benefits of data lakes (cost-effective, scalable storage and compute) and data warehouses (user-friendly SQL interface). Data lakes are notoriously complex.

Data Lake 262

Zenlytic Is Building You A Better Coworker With AI Agents

Data Engineering Podcast

Summary: The purpose of business intelligence systems is to allow anyone in the business to access and decode data to help them make informed decisions. The team at Zenlytic have leaned on the promise of large language models to build an AI agent that lets you converse with your data.

Building 278