Analysis of Confluent Buying Immerok

Jesse Anderson

I started a Twitter thread with some of my initial thoughts, but I want to write a post giving more analysis and opinions. I think it’s quite telling that even the announcement doesn’t get ksqlDB’s name right. Since Kafka Streams is part of the Apache project, I don’t see it going away as quickly.

Build an Open Data Lakehouse with Iceberg Tables, Now in Public Preview

Snowflake

Apache Iceberg’s ecosystem of diverse adopters, contributors, and commercial support continues to grow, establishing Iceberg as the industry-standard table format for an open data lakehouse architecture. If you already manage Iceberg tables in an AWS Glue catalog, the Glue catalog integration provides an easy way to start querying those tables with Snowflake.
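As a minimal, hypothetical sketch of what that can look like from client code, the Python snippet below queries an Iceberg table through the standard Snowflake Python connector once the Glue catalog integration and the Iceberg table have been set up in Snowflake; the connection parameters and the table name ORDERS_ICEBERG are placeholders, not details from the announcement.

```python
# Sketch: query a Snowflake-managed view of an Iceberg table registered via a
# Glue catalog integration. Requires: pip install snowflake-connector-python
import snowflake.connector

# Placeholder credentials and objects; substitute your own account details.
conn = snowflake.connector.connect(
    account="my_account",      # hypothetical Snowflake account identifier
    user="my_user",
    password="my_password",
    warehouse="ANALYTICS_WH",
    database="LAKEHOUSE_DB",
    schema="ICEBERG",
)

try:
    cur = conn.cursor()
    # ORDERS_ICEBERG is a hypothetical Iceberg table already exposed through
    # the Glue catalog integration; it is queried like any other table.
    cur.execute("SELECT order_id, amount FROM ORDERS_ICEBERG LIMIT 10")
    for row in cur.fetchall():
        print(row)
finally:
    conn.close()
```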

Getting Started With Cloudera Open Data Lakehouse on Private Cloud

Cloudera

Cloudera recently released a fully featured Open Data Lakehouse, powered by Apache Iceberg, in the private cloud, in addition to the Open Data Lakehouse that has been available in the public cloud since last year. Please note that you can also leverage Flink and SQL Stream Builder in CSA 1.11 and Cloudera Flow Management 2.1.6.

Fraud Detection With Cloudera Stream Processing Part 2: Real-Time Streaming Analytics

Cloudera

In part 1 of this blog, we discussed how Cloudera DataFlow for the Public Cloud (CDF-PC), the universal data distribution service powered by Apache NiFi, can make it easy to acquire data from wherever it originates and move it efficiently so it is available to other applications in a streaming fashion. Data decays!

Brief History of Data Engineering

Jesse Anderson

Doug Cutting took those papers and created Apache Hadoop in 2005. Cloudera was started in 2008, and Hortonworks started in 2011. Hadoop was hard to program, and Apache Hive came along in 2010 to add SQL. Apache Pig came along in 2008 too, but it never saw as much adoption. They eventually merged in 2012.

Druid Deprecation and ClickHouse Adoption at Lyft

Lyft Engineering

At Lyft, we have used systems like ClickHouse and Apache Druid for near-real-time and sub-second analytics. In this blog post, we explain how Druid has been used at Lyft and what led us to adopt ClickHouse for our sub-second analytics system. Written by Ritesh Varyani and Jeana Choi at Lyft.

Data Engineering Weekly #151

Data Engineering Weekly

GitHub wrote an excellent blog post capturing the current state of LLM integration architecture. I found this GitHub tutorial from Microsoft to be an excellent resource for getting started with Gen-AI if you’re beginning your journey to understand the landscape. Lackluster AI/ML results often stem from poor data quality.