2021

What’s New in Apache Kafka 3.0.0

Confluent

I’m pleased to announce the release of Apache Kafka 3.0 on behalf of the Apache Kafka® community. Apache Kafka 3.0 is a major release in more ways than one. Apache […].

Kafka 145

6 Predictive Models Every Beginner Data Scientist Should Master

KDnuggets

Data Science models come in different flavors and techniques; luckily, most advanced models are based on a couple of fundamentals. Which models should you learn when you want to begin a career as a Data Scientist? This post brings you 6 models that are widely used in the industry, either in standalone form or as a building block for other advanced techniques.

Data 145

Building Uber’s Fulfillment Platform for Planet-Scale using Google Cloud Spanner

Uber Engineering

Introduction. The Fulfillment Platform is a foundational Uber domain that enables the rapid scaling of new verticals. The platform handles billions of database transactions each day, ranging from user actions (e.g., a driver starting a trip) and system actions … The post Building Uber’s Fulfillment Platform for Planet-Scale using Google Cloud Spanner appeared first on Uber Engineering Blog.

Turning the page

Cloudera

Today marks the beginning of an exciting new chapter for Cloudera. Cloudera will become a private company with the flexibility and resources to accelerate product innovation, cloud transformation and customer growth. Cloudera will benefit from the operating capabilities, capital support and expertise of Clayton, Dubilier & Rice (CD&R) and KKR – two of the most experienced and successful global investment firms in the world recognized for supporting the growth strategies of the businesses

Cloud 144

The Definitive Entity Resolution Buyer’s Guide

Are you thinking of adding enhanced data matching and relationship detection to your product or service? Do you need to know more about what to look for when assessing your options? Our Entity Resolution Buyer’s Guide gives you step-by-step details about everything you should consider when evaluating entity resolution technologies. We discuss use cases, technology, and deployment options, top ten evaluation criteria and more.

Natural Language Processing: A Guide to NLP Use Cases, Approaches, and Tools

AltexSoft

Humans have been trying to make machines chat for decades. Alan Turing considered computers’ ability to generate natural speech a proof of their ability to think. Today, we converse with virtual companions all the time. But despite years of research and innovation, their unnatural responses remind us that no, we’re not yet at the HAL 9000-level of speech sophistication.

Process 139

More Trending

Tech workers warned they were going to quit. Now, the problem is spiralling out of control

DataKitchen

The post Tech workers warned they were going to quit. Now, the problem is spiralling out of control first appeared on DataKitchen.

How Netflix uses eBPF flow logs at scale for network insight

Netflix Tech

By Alok Tiagi, Hariharan Ananthakrishnan, Ivan Porto Carrero and Keerti Lakshminarayan. Netflix has developed a network observability sidecar called Flow Exporter that uses eBPF tracepoints to capture TCP flows at near real time. At much less than 1% of CPU and memory on the instance, this highly performant sidecar provides flow data at scale for network insight.

AWS 112

Building a solid data team

KDnuggets

How do you put together a solid data science team when it comes to developing data-driven products? A variety of roles are available to consider, so which ones do you need and which are most crucial?

Data 145

Why Machine Learning Engineers are Replacing Data Scientists

KDnuggets

The hiring run for data scientists continues along at a strong clip around the world. But, there are other emerging roles that are demonstrating key value to organizations that you should consider based on your existing or desired skill sets.

3 Differences Between Coding in Data Science and Machine Learning

KDnuggets

The terms ‘data science’ and ‘machine learning’ are often used interchangeably. But while they are related, there are some glaring differences, so let’s take a look at the differences between the two disciplines, specifically as they relate to programming.

Where NLP is heading

KDnuggets

Natural language processing research and applications are moving forward rapidly. Several trends have emerged from this progress and point to a future of more exciting possibilities and interesting opportunities in the field.

Process 145

Most Common SQL Mistakes on Data Science Interviews

KDnuggets

Sure, we all make mistakes -- which can be a bit more painful when we are trying to get hired -- so check out these typical errors applicants make while answering SQL questions during data science interviews.

SQL 145

Design Patterns for Machine Learning Pipelines

KDnuggets

ML pipeline design has undergone several evolutions in the past decade with advances in memory and processor performance, storage systems, and the increasing scale of data sets. We describe how these design patterns changed, what processes they went through, and their future direction.

10 Key AI & Data Analytics Trends for 2022 and Beyond

KDnuggets

What AI and data analytics trends are taking the industry by storm this year? This comprehensive review highlights upcoming directions in AI to carefully watch and consider implementing in your personal work or organization.

Alternative Feature Selection Methods in Machine Learning

KDnuggets

Feature selection methodologies go beyond filter, wrapper and embedded methods. In this article, I describe 3 alternative algorithms to select predictive features based on a feature importance score.

Real-Time Exactly-Once Ad Event Processing with Apache Flink, Kafka, and Pinot

Uber Engineering

Uber recently launched a new capability: Ads on UberEats. With this new ability came new challenges that needed to be solved at Uber, such as systems for ad auctions, bidding, attribution, reporting, and more. This article focuses on how we … The post Real-Time Exactly-Once Ad Event Processing with Apache Flink, Kafka, and Pinot appeared first on Uber Engineering Blog.

Kafka 145

Top Stories, Nov 15-21: 19 Data Science Project Ideas for Beginners

KDnuggets

Also: How I Redesigned over 100 ETL into ELT Data Pipelines; Where NLP is heading; Don’t Waste Time Building Your Data Science Network; Data Scientists: How to Sell Your Project and Yourself.

Kafka Summit Americas 2021 Recap

Confluent

The full inventory of three online Kafka Summits in 2021 is now complete. Kafka Summit Americas wrapped just yesterday. Being a part of the event team and the Program Committee, […].

Kafka 145

How to Get Certified as a Data Scientist

KDnuggets

If you are early in your journey to becoming a Data Scientist, an interesting option is to earn certification from DataCamp, and this guide offers tips that will help beginners complete the challenges.

Data 144

How Uber Achieves Operational Excellence in the Data Quality Experience

Uber Engineering

Uber delivers efficient and reliable transportation across the global marketplace, which is powered by hundreds of services, machine learning models, and tens of thousands of datasets. While growing rapidly, we’re also committed to maintaining data quality, as it can greatly … The post How Uber Achieves Operational Excellence in the Data Quality Experience appeared first on Uber Engineering Blog.

Data 145

Top 4 Data Integration Tools for Modern Enterprises

KDnuggets

Maintaining a centralized data repository can simplify your business intelligence initiatives. Here are four data integration tools that can make data more valuable for modern enterprises.

ORDAINED: The Python Project Template

KDnuggets

Recently I decided to take the time to better understand the Python packaging ecosystem and create a project boilerplate template as an improvement over copying a directory tree and doing find and replace.

Python 144

10 AI Project Ideas in Computer Vision

KDnuggets

The field of computer vision has seen the development of very powerful applications leveraging machine learning. These projects will introduce you to these techniques and guide you to more advanced practice to gain a deeper appreciation for the sophistication now available.

Project 144

Data Science & Analytics Industry Main Developments in 2021 and Key Trends for 2022

KDnuggets

We have solicited insights from experts at industry-leading companies, asking: "What were the main AI, Data Science, Machine Learning Developments in 2021 and what key trends do you expect in 2022?" Read their opinions here.

Containerizing Apache Hadoop Infrastructure at Uber

Uber Engineering

Introduction. As Uber’s business grew, we scaled our Apache Hadoop (referred to as ‘Hadoop’ in this article) deployment to 21000+ hosts in 5 years, to support the various analytical and machine learning use cases. We built a team with varied … The post Containerizing Apache Hadoop Infrastructure at Uber appeared first on Uber Engineering Blog.

Hadoop 145

5 Practical Data Science Projects That Will Help You Solve Real Business Problems for 2022

KDnuggets

This curated list of data science projects offers real-life problems that will help you master skills to demonstrate that you are technically sound and know how to conduct data science projects that add business value.

The Architecture of Uber’s API gateway

Uber Engineering

API gateways have become an integral part of microservices architecture in recent years. An API gateway provides a single point of entry for all our apps and provides an interface to access data, logic, or functionality from back-end microservices. It also … The post The Architecture of Uber’s API gateway appeared first on Uber Engineering Blog.

Streaming Real-Time Analytics with Redis, AWS Fargate, and Dash Framework

Uber Engineering

Introduction. Uber’s GSS (Global Scaled Solutions) team runs scaled programs for diverse products and businesses, including but not limited to Eats, Rides, and Freight. The team transforms Uber’s ideas into agile, global solutions by designing and implementing scalable solutions. One … The post Streaming Real-Time Analytics with Redis, AWS Fargate, and Dash Framework appeared first on Uber Engineering Blog.

AWS 144

Node.js ❤️ Apache Kafka – Getting Started with KafkaJS

Confluent

One of the great things about using an Apache Kafka® based architecture is that it naturally decouples systems and allows you to use the best tool for the job. While […].

Kafka 145

Apache Kafka Made Simple: A First Glimpse of a Kafka Without ZooKeeper

Confluent

At the heart of Apache Kafka® sits the log—a simple data structure that uses sequential operations that work symbiotically with the underlying hardware. Efficient disk buffering and CPU cache usage, […].

Kafka 145

How ksqlDB Works: Internal Architecture and Advanced Features

Confluent

To effectively use ksqlDB, the streaming database for Apache Kafka®, you should of course be familiar with its features and syntax. However, a deeper understanding of what goes on underneath […].

Kafka 144

Should You Become a Freelance Artificial Intelligence Engineer?

KDnuggets

Take the first step towards your machine learning engineering career and explore the UC San Diego Extension Machine Learning Engineering Bootcamp today. Those with prior software engineering or data science experience are encouraged to apply.

42 Things You Can Stop Doing Once ZooKeeper Is Gone from Apache Kafka

Confluent

Soon, Apache Kafka® will no longer need ZooKeeper! With KIP-500, Kafka will include its own built-in consensus layer, removing the ZooKeeper dependency altogether. The next big milestone in this effort […].

Kafka 145