Data Engineering Annotated Monthly – September 2022

Big Data Tools

If you think I missed something worthwhile, hit me up on Twitter and suggest a topic, link, or anything else you want to see. One of the use cases from the product page that stood out to me in particular was the effort to mirror multiple Kafka clusters in one Brooklin cluster! But what if there are many hardware clusters?

API-First Approach to Kafka Topic Creation

DoorDash Engineering

DoorDash’s Engineering teams revamped Kafka Topic creation by replacing a Terraform/Atlantis-based approach with an in-house API, Infra Service. DoorDash’s Real-Time Streaming Platform (RTSP) team sits under the Data Platform organization and manages over 2,500 Kafka Topics across five clusters.

The Good and the Bad of Apache Kafka Streaming Platform

AltexSoft

Kafka joins the list of brand names that have become generic terms for an entire type of technology. In this article, we’ll explain why businesses choose Kafka and what problems they face when using it.

5 Key Takeaways from #Current2023

Cloudera

Confluent recently hosted Current 2023 (formerly Kafka Summit) in San Jose on September 26 and 27. This blog is for anyone who was interested but unable to attend the conference, or anyone interested in a quick summary of what happened there. It is now more of a Confluent conference than a Kafka conference.

A Beginner’s Guide to Learning PySpark for Big Data Processing

ProjectPro

Here’s what you need to know about PySpark, including how businesses are leveraging PySpark applications and how long it takes to learn. This blog will take you through the basics of PySpark, the PySpark architecture, and a few popular PySpark libraries, among other things. But how does this happen?

Kafka to Delta Lake, as fast as possible

Scribd Technology

Streaming data from Apache Kafka into Delta Lake is an integral part of Scribd’s data platform, but it has been challenging to manage and scale. We use Spark Structured Streaming jobs to read data from Kafka topics and write that data into Delta Lake tables. To serve this need, we created kafka-delta-ingest.
