
Getting Started with Apache Kafka in Python

Confluent

Welcome Pythonistas to the streaming data world centered around Apache Kafka®! If you’re using Python and ready to get hands-on with Kafka, then you’re in the right place. This blog […].
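As a taste of what the post covers, here is a minimal sketch of producing to Kafka from Python with the confluent-kafka client; the broker address, topic name, and payload below are placeholder assumptions, not values taken from the article.

```python
# Minimal sketch: produce a message to Kafka from Python with confluent-kafka.
# Assumptions: a broker at localhost:9092 and a topic named "demo-topic".
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def on_delivery(err, msg):
    # Called once per message after the broker acknowledges (or rejects) it.
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()} [{msg.partition()}] at offset {msg.offset()}")

producer.produce("demo-topic", key="user-1", value="hello, kafka", callback=on_delivery)
producer.flush()  # Block until all queued messages are delivered.
```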


Brief History of Data Engineering

Jesse Anderson

In the beginning, there was Google. Doug Cutting took those papers and created Apache Hadoop in 2005. Cloudera was started in 2008, and Hortonworks in 2011. Hadoop was hard to program, so Apache Hive came along in 2010 to add SQL. Apache Pig arrived in 2008 too, but it never saw as much adoption.



Data News — Week 23.09

Christophe Blefari

I'll try to think about it in the following weeks to understand where I'm going for the third year of the newsletter and the blog. So thank you for that. Stay tuned and let's jump to the content.


Data News — Week 23.11

Christophe Blefari

On my side I'm slowly starting to get on top of the things I had in my queue, which means I get easily distracted by a notification—or even a thought—and end up doing something I did not plan to do at first. That probably explains why you always get the newsletter late on Fridays—or Saturdays.


Druid Deprecation and ClickHouse Adoption at Lyft

Lyft Engineering

At Lyft, we have used systems like ClickHouse and Apache Druid for near-real-time and sub-second analytics. In this blog post, we explain how Druid has been used at Lyft and what led us to adopt ClickHouse for our sub-second analytics system. Written by Ritesh Varyani and Jeana Choi at Lyft.
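For readers who want a feel for the sub-second query pattern the post describes, here is a minimal sketch using the clickhouse-connect Python client; the host, the rides table, and its columns are hypothetical placeholders and not taken from Lyft's setup.

```python
# Minimal sketch of a sub-second aggregate query against ClickHouse.
# Assumptions: a local ClickHouse server and a hypothetical "rides" table
# with "city" and "requested_at" columns; none of this reflects Lyft's schema.
import clickhouse_connect

client = clickhouse_connect.get_client(host="localhost", port=8123)

result = client.query(
    """
    SELECT city, count() AS ride_count
    FROM rides
    WHERE requested_at >= now() - INTERVAL 1 HOUR
    GROUP BY city
    ORDER BY ride_count DESC
    LIMIT 10
    """
)

for city, ride_count in result.result_rows:
    print(city, ride_count)
```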


Machine Learning with Python, Jupyter, KSQL and TensorFlow

Confluent

It takes much more effort than just building an analytic model with Python and your favorite machine learning framework. After all, machine learning with Python relies on algorithms that let programs keep learning from data, but building the infrastructure around them is several levels higher in complexity.
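To make the "model is the easy part" point concrete, here is a minimal TensorFlow/Keras sketch of the analytic-model step; the synthetic data and layer sizes are illustrative assumptions, and the Kafka/KSQL serving infrastructure the post actually focuses on is not shown.

```python
# Minimal sketch of the "easy part": training a small TensorFlow/Keras model.
# The synthetic data and layer sizes are placeholders; the surrounding
# streaming infrastructure (Kafka, KSQL, serving) is where the real effort goes.
import numpy as np
import tensorflow as tf

# Synthetic binary-classification data standing in for real feature vectors.
X = np.random.rand(1000, 20).astype("float32")
y = (X.sum(axis=1) > 10).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

print(model.evaluate(X, y, verbose=0))  # [loss, accuracy]
```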


How to Become a Databricks Certified Apache Spark Developer?

ProjectPro

With around 35k stars and over 26k forks on GitHub, Apache Spark is one of the most popular big data frameworks, used by 22,760 companies worldwide. Apache Spark is an efficient, scalable, and widely used in-memory computation engine capable of handling batch, real-time, and analytics workloads.
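As a quick illustration of the batch-style work a Spark certification covers, here is a minimal PySpark sketch; the file path and column names are placeholder assumptions.

```python
# Minimal PySpark sketch: read a CSV and run a simple aggregation.
# The path "rides.csv" and the "city" column are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark-cert-warmup").getOrCreate()

df = spark.read.option("header", True).csv("rides.csv")

(df.groupBy("city")
   .agg(F.count("*").alias("ride_count"))
   .orderBy(F.desc("ride_count"))
   .show(10))

spark.stop()
```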
