Combining CDC Transactional Messages Using Kafka Streams

Confluent

How to use Kafka Streams to aggregate change data capture (CDC) messages from a relational database into transactional messages, powering a scalable microservices architecture.


How to Design a Modern, Robust Data Ingestion Architecture

Monte Carlo

Data Extraction with Apache Hadoop and Apache Sqoop: The Hadoop Distributed File System (HDFS) stores large data volumes; Sqoop transfers data between Hadoop and relational databases. Data Loading with Apache Hadoop and Apache Sqoop: Hadoop stores processed data; Sqoop loads it back into relational databases if needed.

Cloudera Operational Database application development concepts

Cloudera

Cloudera Operational Database is now available in three different form factors in Cloudera Data Platform (CDP). If you are new to Cloudera Operational Database, see this blog post. In this blog post, we’ll look at both Apache HBase and Apache Phoenix concepts relevant to developing applications for Cloudera Operational Database.


Oracle CDC Source Premium Connector is Now Generally Available

Confluent

One of the most common relational database systems that connects to Apache Kafka® is Oracle, which often holds highly critical enterprise transaction workloads. While Oracle Database (DB) excels at many […].

4 Key Design Principles and Guarantees of Streaming Databases

Confluent

Classic relational database management systems (RDBMS) distribute and organize data in a relatively static storage layer. When queries are requested, they compute on the stored data and then return results […].

Best Practices for Analyzing Kafka Event Streams

Rockset

Apache Kafka has seen broad adoption as the streaming platform of choice for building applications that react to streams of data in real time. In many organizations, Kafka is the foundational platform for real-time event analytics, acting as a central location for collecting event data and making it available in real time.


Big Data Technologies that Everyone Should Know in 2024

Knowledge Hut

NoSQL databases are designed for scalability and flexibility, making them well-suited for storing big data. The most popular NoSQL database systems include MongoDB, Cassandra, and HBase. Big data technologies fall into four broad categories: batch processing, streaming, NoSQL databases, and data warehouses.