Sat. Mar 20, 2021 - Fri. Mar 26, 2021

Toward a Data Mesh (part 2) : Architecture & Technologies

François Nguyen

Just an illustration – not the truth, and you can certainly do it with other technologies. TL;DR: After setting up and organizing the teams, we describe 4 topics to make data mesh a reality: the self-serve platform, based on a serverless philosophy (life is too short to do provisioning); the building of data products (as code), where we build data workflows, not data pipelines; the promotion of data domains, where the metadata on the data life cycle is as important as your data. The old dat

Real World Change Data Capture At Datacoral

Data Engineering Podcast

Summary: The world of business is becoming increasingly dependent on information that is accurate up to the minute. For analytical systems, the only way to provide this reliably is by implementing change data capture (CDC). Unfortunately, this is a non-trivial undertaking, particularly for teams that don’t have extensive experience working with streaming data and complex distributed systems.

How to Host a Virtual Global Data Science Hackathon

Teradata

Learn how best to host a virtual hackathon, or any virtual event, with these tips and tricks from our Teradata team. Read more.

Congratulations to our 2021 Partner Award Winners

Cloudera

At our Partner Sales Kickoff, we announced the winners of the 2021 Cloudera Partner Awards. These six awards recognize Cloudera partners who are dedicated to enabling customers to do more with their data by leveraging the power of an enterprise data cloud. Thank you to this year’s winners for their partnership in helping our joint customers drive value from their data in the hybrid cloud.

Beyond the Basics of A/B Tests: Innovative Experimentation Tactics You Need to Know as a Data or Product Professional

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

How to Run Confluent on Windows in Minutes

Confluent

I previously showed how to install and set up Apache Kafka® on Windows in minutes by using the Windows Subsystem for Linux 2 (WSL 2). From there, it’s only a […].

Scaling Revenue & Growth Tooling

Netflix Tech

Written by Nick Tomlin, Michael Possumato, and Rahul Pilani. This post shares how the Revenue & Growth Tools (RGT) team approaches creating full-stack tools for the teams that are the financial backbone of Netflix. Our primary partners are the teams of Revenue and Growth Engineering (RGE): Growth, Membership, Billing, Payments, and Partner Subscription.

More Trending

CDP Endpoint Gateway provides Secure Access to CDP Public Cloud Services running in private networks

Cloudera

Cloudera Data Platform (CDP) Public Cloud allows users to deploy analytic workloads into their cloud accounts. These workloads cover the entire data lifecycle and are managed from a central multi-cloud Cloudera Control Plane. CDP provides the flexibility to deploy these resources into public or private subnets. Nearly unanimously, we’ve seen customers deploy their workloads to private subnets.

Community, Metadata Management, and More: Top 10 Links From Across the Web

Data Council

Here's our March 2021 roundup of links from across the web that we selected for you: 1. How to Build a Community (Fishtown Analytics) Claire Carroll's first personal blog post on community-building is a must-read. As Fishtown Analytics' community manager for the last 2.5 years, she's arguably behind the success of the dbt community and its best-in-class practices, so we expected good advice… but she really hit the ball out of the park with this one!

Promisifying Your Node Callback Functions

Grouparoo

The Grouparoo application is written in JavaScript (Node). It uses the modern promise-based pattern (async/await) for reading and writing data asynchronously. And we do this a lot: we are a data sync tool! Every once in a while we'll come across a JavaScript library that is written around the old callback-based pattern, where the error object is the first parameter in the callback function, followed by the result.
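
As a rough illustration of the error-first callback pattern described in this teaser, here is a minimal TypeScript sketch of wrapping such a function in a Promise. It is not taken from the Grouparoo post: `legacyDouble` and `doubleAsync` are hypothetical names invented for the example, and the "library" is just a stand-in function.

```typescript
// Error-first callback shape: the error (or null) comes first, then the result.
type ErrorFirstCallback<T> = (err: Error | null, result?: T) => void;

// Hypothetical legacy function written in the callback style (an assumption
// for illustration, not a real library API).
function legacyDouble(n: number, callback: ErrorFirstCallback<number>): void {
  setTimeout(() => {
    if (n < 0) {
      callback(new Error("negative input"));
    } else {
      callback(null, n * 2);
    }
  }, 0);
}

// Wrap the callback-based function in a Promise so callers can use async/await.
function doubleAsync(n: number): Promise<number> {
  return new Promise<number>((resolve, reject) => {
    legacyDouble(n, (err, result) => {
      if (err) {
        reject(err); // the error-first argument becomes a rejection
      } else {
        resolve(result as number); // the second argument becomes the resolved value
      }
    });
  });
}
```

A caller can then write `const value = await doubleAsync(21);` inside an `async` function. Node also ships `util.promisify`, which performs the same kind of wrapping for functions that follow this error-first convention.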

Building the Future of Payments With RippleNet’s VP of Engineering

Ripple Engineering

Amidst the work-from-home environment, Vidya Mani joined Ripple in early 2020 as the Vice President of Engineering for RippleNet. A year into her role, she focuses on improving Ripple’s infrastructure and strengthening her team to further the company’s vision for a more inclusive financial system. RippleNet is an enterprise solution which helps banks and other financial institutions streamline global payments and reach new customers.

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

#ClouderaLife Spotlight: Christine Sherry, Director, Critical Incident Manager

Cloudera

For over 8 years, Christine Sherry has brought her strong work ethic and skills to Cloudera. In her time here, she’s climbed the ladder and is currently our Director, Global Critical Incident Management. In her role, she leads Cloudera’s Global Support Critical Incident team, which is responsible for managing our customers’ most critical technical issues to resolution.

Integrating Azure and Confluent: Real-Time Search Powered by Azure Cache for Redis, Spring Cloud

Confluent

Self-managing a distributed system like Apache Kafka®, along with building and operating Kafka connectors, is complex and resource intensive. It requires significant Kafka skills and expertise in the development and […].

Ending Supply Chain Whack-a-Mole Management

Teradata

Working to optimize Retail & CPG Supply Chains often feels like a life-sized game of Whack-a-Mole -- making a change here creates an issue there. Find out how integrated, real-time data can help.

How the DataKitchen Platform Delivers End-to-End Data Observability

DataKitchen

The post How the DataKitchen Platform Delivers End-to-End Data Observability first appeared on DataKitchen.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Will Data Privacy drive an Enterprise Data Strategy?

Cloudera

Data privacy is an increasingly complex and contentious topic. The appropriate use of data and transparency about the potential uses of the data are at the center of debate amongst the largest Big Tech companies. The protection and controls around data become increasingly complex when used in the context of banking and insurance activities. Personal and confidential information carries heightened sensitivity in the light of financial, health and insurance activities.

Contributing To The Superset Project - Github + Superset

Preset

How to contribute to the Apache Superset™ Project. Help others, advocate for Superset and code development.

5 Things Every Data Engineer Needs to Know About Data Observability

Monte Carlo

As a new or aspiring data engineer, there are some essential technologies and frameworks you should know. How to build a data pipeline? Check. How to clean, transform, and model your data? Check. How to prevent broken data workflows before you get that frantic call from your CEO about her missing data? Maybe not. By leveraging best practices from our friends in software engineering and developer operations (DevOps), we can think more strategically about tackling the “good pipelines, bad data” pr

DataOps and the Cloud: A Match Made in Heaven

DataKitchen

The post DataOps and the Cloud: A Match Made in Heaven first appeared on DataKitchen.

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

Filter more pay less with the latest Cloudera Data Warehouse runtime!

Cloudera

Introduction. One of the most effective ways to improve performance and minimize cost in database systems today is by avoiding unnecessary work, such as data reads from the storage layer (e.g., disks, remote storage), transfers over the network, or even data materialization during query execution. Since its early days, Apache Hive has improved distributed query execution by pushing down column filter predicates to storage handlers like HBase or columnar data format readers such as Apache ORC.
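
To make the idea of pushing a filter down to the reader concrete, here is a small TypeScript sketch under stated assumptions. It is not Hive's or ORC's actual implementation: `RowGroup`, `ScanResult`, and `scanGreaterThan` are hypothetical names, and the "file" is an in-memory array of row groups that each carry min/max statistics for one column. The reader uses those statistics to skip groups that cannot contain a match, which is the kind of unnecessary work the teaser refers to.

```typescript
// Hypothetical in-memory stand-in for a columnar file split into row groups.
interface RowGroup {
  min: number;    // smallest value of the filtered column in this group
  max: number;    // largest value of the filtered column in this group
  rows: number[]; // the column values themselves
}

interface ScanResult {
  matches: number[];  // values that satisfy the predicate
  groupsRead: number; // how many row groups were actually scanned
}

// Scan with the predicate "value > threshold" pushed down to the reader.
function scanGreaterThan(groups: RowGroup[], threshold: number): ScanResult {
  const matches: number[] = [];
  let groupsRead = 0;
  for (const group of groups) {
    // If even the largest value fails the predicate, skip the whole group.
    if (group.max <= threshold) continue;
    groupsRead += 1;
    for (const value of group.rows) {
      if (value > threshold) matches.push(value);
    }
  }
  return { matches, groupsRead };
}

const groups: RowGroup[] = [
  { min: 1, max: 10, rows: [1, 5, 10] },
  { min: 11, max: 20, rows: [11, 15, 20] },
  { min: 21, max: 30, rows: [21, 25, 30] },
];
const result = scanGreaterThan(groups, 18);
```

In this toy data set the first row group is never read, because its maximum value already rules out any match for a threshold of 18.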

How Prophet Enables Time-Series Forecasting in Superset

Preset

Prophet is a popular time-series forecasting library created by Facebook. Learn how to use Prophet and Apache Druid for forecasting.

On the Pursuit of Happiness (aka Squashing 502/504 Errors)

Rockset

Introduction: 502 and 504 errors can be a nuisance for Rockset and our users. For many users running customer-facing applications on Rockset, availability and uptime are very important, so even a single 5xx error is cause for concern. As a cloud service, Rockset deploys code to our production clusters multiple times a week, which means that any component of our distributed system has to stop and restart with new code in an error-free way.

DataKitchen Wins DataOps Company of the Year, 2021, from the Data Breakthrough Awards

DataKitchen

The post DataKitchen Wins DataOps Company of the Year, 2021, from the Data Breakthrough Awards first appeared on DataKitchen.

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies who are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground-up, graduating from just getting started with data-driven development to operating

Cracking the code on gender parity in the workplace

Cloudera

I used to work for a female CEO who said that in her organization, “the powder room is the power room”. It’s been a while since I heard that statement, yet such an environment is still far from the truth for many companies. Women are still underrepresented in the science, technology, engineering, and mathematics (STEM) field and more so in leadership positions.
