Sat.Apr 27, 2019 - Fri.May 03, 2019

article thumbnail

Running Your Database On Kubernetes With KubeDB

Data Engineering Podcast

Summary Kubernetes is a driving force in the renaissance around deploying and running applications. However, managing the database layer is still a separate concern. The KubeDB project was created as a way of providing a simple mechanism for running your storage system in the same platform as your application. In this episode Tamal Saha explains how the KubeDB project got started, why you might want to run your database with Kubernetes, and how to get started.

Database 100
article thumbnail

Engineering a Studio Quality Experience With High-Quality Audio at Netflix

Netflix Tech

by Guillaume du Pontavice, Phill Williams and Kylee Peña (on behalf of our Streaming Algorithms, Audio Algorithms, and Creative Technologies teams) Remember the epic opening sequence of Stranger Things 2 ? The thrill of that car chase through Pittsburgh not only introduced a whole new set of mysteries, but it returned us to a beloved and dangerous world alongside Dustin, Lucas, Mike, Will and Eleven.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

What Is the Biggest Challenge Facing CMOs Today? Building, Measuring, and Maintaining Brand Equity.

Teradata

Teradata CMO Martyn Etherington discusses how brands can build, measure, and maintain brand equity. He also explains why customer experience is critical to a brand's success.

article thumbnail

Dawn of DevOps: Managing Apache Kafka Clusters at Scale with Confluent Control Center

Confluent

When managing Apache Kafka ® clusters at scale, tasks that are simple on small clusters turn into significant burdens. To be fair, a lot of things turn into significant burdens at scale, and it’s Confluent Control Center’s job to ease as many of them as possible. In Confluent Platform 5.2, Control Center has grown a couple of new features that make large deployments a little more pleasant to manage: It has become much better at managing configuration changes among a large number of brokers, and

Kafka 88
article thumbnail

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enahance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

article thumbnail

Docker for Data Science: Getting Started & Installing Docker

Advancing Analytics: Data Engineering

In the last Docker for Data Science blog we looked at where Docker came from and why it is important. In this blog we will get Docker installed and configured on either Windows or Mac. Installing Docker. Below are instructions for installing Docker on both Windows and on Mac. <important>Before we begin, there are a few different methods for installing Docker on Windows and Mac.

article thumbnail

Android Rx onError Guidelines

Netflix Tech

By Ed Ballot “Creating a good API is hard.”?—? anyone who has created an API used by others As with any API, wrapping your data stream in a Rx observable requires consideration for reasonable error handling and intuitive behavior. The following guidelines are intended to help developers create consistent and intuitive API. Since we frequently create Rx Observables in our Android app, we needed a common understanding of when to use onNext() and when to use onError() to make the API more consisten

More Trending

article thumbnail

Analytics on DynamoDB: Comparing Elasticsearch, Athena and Spark

Rockset

In this blog post I compare options for real-time analytics on DynamoDB - Elasticsearch , Athena, and Spark - in terms of ease of setup, maintenance, query capability, latency. There is limited support for SQL analytics with some of these options. I also evaluate which use cases each of them are best suited for. Developers often have a need to serve fast analytical queries over data in Amazon DynamoDB.

NoSQL 52
article thumbnail

Dawn of Kafka DevOps: Managing Kafka Clusters at Scale with Confluent Control Center

Confluent

When managing Apache Kafka ® clusters at scale, tasks that are simple on small clusters turn into significant burdens. To be fair, a lot of things turn into significant burdens at scale, and it’s Confluent Control Center’s job to ease as many of them as possible. In Confluent Platform 5.2, Control Center has grown a couple of new features that make large deployments a little more pleasant to manage: It has become much better at managing configuration changes among a large number of brokers, and

Kafka 57
article thumbnail

How to Manage Stakeholder Requests in Big Organizations

Zalando Engineering

An important factor of success in agile environment is that team works well together. It is also important for a software engineer to be able to focus for longer periods of time with limited interruptions. Many companies have solved the challenge of focus and dedication for the team by having a designated role, such as Scrum Master or Producer, who is responsible for managing stakeholder requests, prioritizing them and communicating to the development team.

article thumbnail

How We Structure our dbt Projects

dbt Developer Hub

As the maintainers of dbt, and analytics consultants, at Fishtown Analytics (now dbt Labs) we build a lot of dbt projects. Over time, we’ve developed internal conventions on how we structure them. This article does not seek to instruct you on how to design a final model for your stakeholders — it won’t cover whether you should denormalize everything into one wide master table , or have many tables that need to be joined together in the BI layer.

Project 40
article thumbnail

Understanding User Needs and Satisfying Them

Speaker: Scott Sehlhorst

We know we want to create products which our customers find to be valuable. Whether we label it as customer-centric or product-led depends on how long we've been doing product management. There are three challenges we face when doing this. The obvious challenge is figuring out what our users need; the non-obvious challenges are in creating a shared understanding of those needs and in sensing if what we're doing is meeting those needs.

article thumbnail

Secondary Indexes For Analytics On DynamoDB

Rockset

In this post I explore how to support analytical queries without encountering prohibitive scan costs, by leveraging secondary indexes in DynamoDB. I also evaluate the pros and cons of this approach in contrast to extracting data to another system like Athena, Spark or Elastic. Rockset recently added support for DynamoDB - which basically means you can run fast SQL on DynamoDB tables without any ETL.

NoSQL 40
article thumbnail

The PipelineDB Team Joins Confluent

Confluent

Some years ago, when I was at LinkedIn, I didn’t really know what Apache Kafka ® would become but had an inkling that the next generation of applications would not be islands disconnected from one another, or lashed together with irregular, point-to-point bindings. When we founded Confluent, we took the radical approach of viewing data—and the infrastructure that supported it—as a series of real-time streaming events rather than something kept in static, sedentary data repositories.

Kafka 21
article thumbnail

Python at Netflix

Netflix Tech

By Pythonistas at Netflix, coordinated by Amjith Ramanujam and edited by Ellen Livengood As many of us prepare to go to PyCon, we wanted to share a sampling of how Python is used at Netflix. We use Python through the full content lifecycle, from deciding which content to fund all the way to operating the CDN that serves the final video to 148 million members.

Python 111
article thumbnail

Women in Big Data Panel at DataWorks Summit 2019

Cloudera

Last month, I moderated The Women in Big Data panel hosted by DataWorks Summit and sponsored by Women in Big Data. This was a well-attended event with five amazing guest speakers – Hilary Mason , Tina Rosario , Violeta Ciurel , Ana Gillan and Devon Edwards Joseph. The theme for the discussion was “Top technology trends women and men business leaders need to be aware of”.

article thumbnail

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

article thumbnail

Optimizing Kafka Streams Applications

Confluent

With the release of Apache Kafka ® 2.1.0, Kafka Streams introduced the processor topology optimization framework at the Kafka Streams DSL layer. This framework opens the door for various optimization techniques from the existing data stream management system (DSMS) and data stream processing literature. In what follows, we provide some context around how a processor topology was generated inside Kafka Streams before 2.1, with a focus on stateful operations like aggregations and joins.

Kafka 90