Data Engineering Digest: Top Data Engineering Content for Week of Apr 27

Sat. Apr 27, 2019 - Fri. May 03, 2019

Running Your Database On Kubernetes With KubeDB

Data Engineering Podcast

APRIL 28, 2019

Summary: Kubernetes is a driving force in the renaissance around deploying and running applications. However, managing the database layer is still a separate concern. The KubeDB project was created as a way of providing a simple mechanism for running your storage system in the same platform as your application. In this episode Tamal Saha explains how the KubeDB project got started, why you might want to run your database with Kubernetes, and how to get started.

Database

Database PostgreSQL MongoDB MySQL

Engineering a Studio Quality Experience With High-Quality Audio at Netflix

Netflix Tech

MAY 1, 2019

By Guillaume du Pontavice, Phill Williams and Kylee Peña (on behalf of our Streaming Algorithms, Audio Algorithms, and Creative Technologies teams). Remember the epic opening sequence of Stranger Things 2? The thrill of that car chase through Pittsburgh not only introduced a whole new set of mysteries, but it returned us to a beloved and dangerous world alongside Dustin, Lucas, Mike, Will and Eleven.

Engineering

Engineering Algorithm Media Entertainment

Webinars

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

MORE WEBINARS

Trending Sources

What Is the Biggest Challenge Facing CMOs Today? Building, Measuring, and Maintaining Brand Equity.

Teradata

MAY 1, 2019

Teradata CMO Martyn Etherington discusses how brands can build, measure, and maintain brand equity. He also explains why customer experience is critical to a brand's success.

Building

Dawn of DevOps: Managing Apache Kafka Clusters at Scale with Confluent Control Center

Confluent

MAY 2, 2019

When managing Apache Kafka® clusters at scale, tasks that are simple on small clusters turn into significant burdens. To be fair, a lot of things turn into significant burdens at scale, and it’s Confluent Control Center’s job to ease as many of them as possible. In Confluent Platform 5.2, Control Center has grown a couple of new features that make large deployments a little more pleasant to manage: It has become much better at managing configuration changes among a large number of brokers, and

Kafka

Kafka Management Food Consulting

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enhance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

Database

Docker for Data Science: Getting Started & Installing Docker

Advancing Analytics: Data Engineering

MAY 2, 2019

In the last Docker for Data Science blog we looked at where Docker came from and why it is important. In this blog we will get Docker installed and configured on either Windows or Mac. Installing Docker. Below are instructions for installing Docker on both Windows and on Mac. Before we begin, there are a few different methods for installing Docker on Windows and Mac.

Data Science

Data Science Data Accessible Accessibility

Android Rx onError Guidelines

Netflix Tech

MAY 1, 2019

By Ed Ballot. “Creating a good API is hard.” - anyone who has created an API used by others. As with any API, wrapping your data stream in an Rx observable requires consideration for reasonable error handling and intuitive behavior. The following guidelines are intended to help developers create consistent and intuitive APIs. Since we frequently create Rx Observables in our Android app, we needed a common understanding of when to use onNext() and when to use onError() to make the API more consisten

Database

Database Coding Systems Building

Why is a Real Time Interaction Manager (RTIM) Essential to Providing a Superior Customer Experience?

Teradata

MAY 2, 2019

Ritu Jain explains the value of the Teradata Real Time Interaction Manager (RTIM) and why personalized customer experiences are so critical for marketers.

Management

More Trending

Analytics on DynamoDB: Comparing Elasticsearch, Athena and Spark

Rockset

APRIL 29, 2019

In this blog post I compare options for real-time analytics on DynamoDB - Elasticsearch, Athena, and Spark - in terms of ease of setup, maintenance, query capability, and latency. There is limited support for SQL analytics with some of these options. I also evaluate which use cases each of them is best suited for. Developers often have a need to serve fast analytical queries over data in Amazon DynamoDB.

NoSQL

NoSQL PostgreSQL AWS SQL

How to Manage Stakeholder Requests in Big Organizations

Zalando Engineering

MAY 2, 2019

An important factor for success in an agile environment is that the team works well together. It is also important for a software engineer to be able to focus for longer periods of time with limited interruptions. Many companies have solved the challenge of focus and dedication for the team by having a designated role, such as Scrum Master or Producer, who is responsible for managing stakeholder requests, prioritizing them and communicating them to the development team.

Management

Management Software Engineer Software Engineering Machine Learning

How We Structure our dbt Projects

dbt Developer Hub

APRIL 30, 2019

As the maintainers of dbt, and analytics consultants, at Fishtown Analytics (now dbt Labs), we build a lot of dbt projects. Over time, we’ve developed internal conventions on how we structure them. This article does not seek to instruct you on how to design a final model for your stakeholders; it won’t cover whether you should denormalize everything into one wide master table, or have many tables that need to be joined together in the BI layer.

Project

Project Database-centric Raw Data Data Warehouse

Understanding User Needs and Satisfying Them

Speaker: Scott Sehlhorst

We know we want to create products which our customers find to be valuable. Whether we label it as customer-centric or product-led depends on how long we've been doing product management. There are three challenges we face when doing this. The obvious challenge is figuring out what our users need; the non-obvious challenges are in creating a shared understanding of those needs and in sensing if what we're doing is meeting those needs.

Certification

Secondary Indexes For Analytics On DynamoDB

Rockset

APRIL 29, 2019

In this post I explore how to support analytical queries without encountering prohibitive scan costs, by leveraging secondary indexes in DynamoDB. I also evaluate the pros and cons of this approach in contrast to extracting data to another system like Athena, Spark or Elastic. Rockset recently added support for DynamoDB - which basically means you can run fast SQL on DynamoDB tables without any ETL.

NoSQL

NoSQL SQL AWS Systems

The PipelineDB Team Joins Confluent

Confluent

MAY 1, 2019

Some years ago, when I was at LinkedIn, I didn’t really know what Apache Kafka® would become but had an inkling that the next generation of applications would not be islands disconnected from one another, or lashed together with irregular, point-to-point bindings. When we founded Confluent, we took the radical approach of viewing data (and the infrastructure that supported it) as a series of real-time streaming events rather than something kept in static, sedentary data repositories.

Kafka

Kafka Datasets Database Technology

Python at Netflix

Netflix Tech

APRIL 29, 2019

By Pythonistas at Netflix, coordinated by Amjith Ramanujam and edited by Ellen Livengood. As many of us prepare to go to PyCon, we wanted to share a sampling of how Python is used at Netflix. We use Python through the full content lifecycle, from deciding which content to fund all the way to operating the CDN that serves the final video to 148 million members.

Python

Python Amazon Web Services Machine Learning Algorithm

Women in Big Data Panel at DataWorks Summit 2019

Cloudera

MAY 2, 2019

Last month, I moderated the Women in Big Data panel hosted by DataWorks Summit and sponsored by Women in Big Data. This was a well-attended event with five amazing guest speakers: Hilary Mason, Tina Rosario, Violeta Ciurel, Ana Gillan and Devon Edwards Joseph. The theme for the discussion was “Top technology trends women and men business leaders need to be aware of”.

Big Data

Big Data Data Science Healthcare Technology

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Data Science

Optimizing Kafka Streams Applications

Confluent

APRIL 30, 2019

With the release of Apache Kafka® 2.1.0, Kafka Streams introduced the processor topology optimization framework at the Kafka Streams DSL layer. This framework opens the door for various optimization techniques from the existing data stream management system (DSMS) and data stream processing literature. In what follows, we provide some context around how a processor topology was generated inside Kafka Streams before 2.1, with a focus on stateful operations like aggregations and joins.

Kafka

Kafka Coding Process Bytes
