Remove Bytes Remove Hadoop Remove Kafka Remove Metadata
article thumbnail

Kafka Connect Deep Dive – Error Handling and Dead Letter Queues

Confluent

Kafka Connect is part of Apache Kafka ® and is a powerful framework for building streaming pipelines between Kafka and other technologies. Since Apache Kafka 2.0, This is the default behavior of Kafka Connect, and it can be set explicitly with the following: errors.tolerance = none. jq -c -M '[.name,tasks[].state]'

Kafka 111
article thumbnail

Kafka Listeners – Explained

Confluent

Put another way, courtesy of Spencer Ruport: LISTENERS are what interfaces Kafka binds to. Apache Kafka ® is a distributed system. When a client (producer/consumer) starts, it will request metadata about which broker is the leader for a partition—and it can do this from any broker. Is anyone listening? Brokers in the cloud (e.g.,

Kafka 99
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Data Engineering Annotated Monthly – May 2022

Big Data Tools

DataHub 0.8.36 – Metadata management is a big and complicated topic. On top of that, it’s a part of the Hadoop platform, which created additional work that we otherwise would not have had to do. DataHub is a completely independent product by LinkedIn, and the folks there definitely know what metadata is and how important it is.

article thumbnail

Data Engineering Annotated Monthly – May 2022

Big Data Tools

DataHub 0.8.36 – Metadata management is a big and complicated topic. On top of that, it’s a part of the Hadoop platform, which created additional work that we otherwise would not have had to do. DataHub is a completely independent product by LinkedIn, and the folks there definitely know what metadata is and how important it is.

article thumbnail

Optimizing Kafka Streams Applications

Confluent

With the release of Apache Kafka ® 2.1.0, Kafka Streams introduced the processor topology optimization framework at the Kafka Streams DSL layer. In what follows, we provide some context around how a processor topology was generated inside Kafka Streams before 2.1, Kafka Streams topology generation 101.

Kafka 89
article thumbnail

97 things every data engineer should know

Grouparoo

This provided a nice overview of the breadth of topics that are relevant to data engineering including data warehouses/lakes, pipelines, metadata, security, compliance, quality, and working with other teams. For example, grouping the ones about metadata, discoverability, and column naming might have made a lot of sense.

article thumbnail

100+ Kafka Interview Questions and Answers for 2023

ProjectPro

Your search for Apache Kafka interview questions ends right here! Let us now dive directly into the Apache Kafka interview questions and answers and help you get started with your Big Data interview preparation! How to study for Kafka interview? What is Kafka used for? What are main APIs of Kafka?

Kafka 40