Tue. Oct 04, 2022

Key-Value Databases, Explained

KDnuggets

Among the four big NoSQL database types, key-value stores are probably the most popular, thanks to their simplicity and fast performance. Let’s explore how key-value stores work and what their practical uses are.

Database 158

Introducing Stream Designer: The Visual Builder for Streaming Data Pipelines

Confluent

Confluent’s new Stream Designer is the industry’s first visual interface for rapidly building, testing, and deploying streaming data pipelines natively on Apache Kafka.

Interview Kickstart Data Science Interview Course — What Makes It Different?

KDnuggets

Interview Kickstart’s Data Science Interview Course is built by data scientists from MAANG and other big tech companies, and it promises to get you interview-ready in 15 weeks.

Scaling Kafka Brokers in Cloudera Data Hub

Cloudera

This blog post provides guidance for administrators who use, or are interested in using, Kafka nodes and need to manage cluster changes as they scale up or down to balance performance and cloud costs in production deployments. Kafka brokers contained within host groups let administrators add and remove nodes more easily. This creates the flexibility to handle real-time data feed volumes as they fluctuate.

Kafka 79

Beyond the Basics of A/B Tests: Innovative Experimentation Tactics You Need to Know as a Data or Product Professional

Speaker: Timothy Chan, PhD, Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Top Posts September 26 – October 2: Free Algorithms in Python Course

KDnuggets

Free Algorithms in Python Course • How to Select Rows and Columns in Pandas • Lessons from a Senior Data Scientist • A Day in the Life of a Data Scientist: Expert vs. Beginner • 7 Machine Learning Portfolio Projects to Boost the Resume.

Algorithm 108

Reducing the Time to Value of your dbt Deployment with Slim CI

phData: Data Engineering

So you’ve been using dbt for a bit now… You have all of your transformations in dbt and your deployments are executing flawlessly, plus you noticed your development velocity has greatly increased. However, as your dbt repo has grown, you’ve begun to see that your deployments are taking even longer. You’ve spent a lot of time tagging your code to optimize your data refreshes, and while your refreshes run quickly, your deployments aren’t.

Cloud 52

More Trending

A Brief Overview of Real-time Data

Striim

Traditionally, historical data (or batch data) was used for decision-making. Lately, however, there has been a lot of focus on real-time data, which provides more business value. According to a survey by McKinsey, high-performing businesses are almost five times more likely to use real-time data than their counterparts. Real-time data is gaining prominence because it helps end users make decisions on the fly, allowing for faster and more accurate decision-making.

Media 52

6 Best Free Online Courses to Jumpstart Your Learning of SQL

KDnuggets

We scoured the internet for the best free courses for anyone looking to learn SQL. We’re excited to share the top 6 resources we found.

SQL 100

The Significance of O’Reilly’s Data Quality Fundamentals

Monte Carlo

In November of 2020, O’Reilly Media first approached us with the idea to author Data Quality Fundamentals: A Practitioner’s Guide to Building More Trustworthy Data Pipelines. It was an inflection point for a fledgling company that had only just begun to establish the category of data observability. We knew it wouldn’t be an easy feat, but we also knew it would be worthwhile – and important. Poor data quality is one of the foremost challenges of our industry, and certainly one of the m

Machine Learning for Everybody!

KDnuggets

Who is machine learning for? Everybody!

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Expensive Enterprise Hacks That Serve As A Lesson In Cybersecurity

U-Next

Every now and then, we come across news of data and network breaches in enterprises we thought had the most sophisticated and airtight cybersecurity measures. The fact is that exploiters are becoming not just smarter but more creative as well. New avenues and loopholes are being exploited to infiltrate networks and systems and extract sensitive data and information from businesses.

Media 52

Which Metric Should I Use? Accuracy vs. AUC

KDnuggets

Depending on the problem you’re trying to solve, one metric may be more insightful than another.

Comparing ClickHouse vs Rockset for Event and CDC Streams

Rockset

Streaming data feeds many real-time analytics applications, from logistics tracking to real-time personalization. Event streams, such as clickstreams, IoT data and other time series data, are common sources of data into these apps. The broad adoption of Apache Kafka has helped make these event streams more accessible. Change data capture (CDC) streams from OLTP databases, which may provide sales, demographic or inventory data, are another valuable source of data for real-time analytics use cases.

MySQL 52