Sat.Jul 25, 2020 - Fri.Jul 31, 2020

Ensuring Data Quality, With Great Expectations

Start Data Engineering

What is data quality? As the name suggests, it refers to the quality of our data. Quality should be defined based on your project requirements. It can range from simple checks, such as ensuring that a column contains only allowed values or falls within a given range, to more complex cases, such as requiring that a column match a specific regex pattern or fall within a standard deviation range.
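As a rough illustration of the four kinds of checks named above, here is a minimal sketch in plain Python. It does not use the Great Expectations API; the function names and sample values are hypothetical and chosen only for this example. Each function returns the values that fail the check, so an empty list means the column passed.

```python
import re
import statistics


def check_allowed_values(values, allowed):
    """Return the values that are not in the allowed set."""
    return [v for v in values if v not in allowed]


def check_range(values, low, high):
    """Return the values that fall outside the inclusive range [low, high]."""
    return [v for v in values if not (low <= v <= high)]


def check_regex(values, pattern):
    """Return the string values that do not fully match the regex pattern."""
    compiled = re.compile(pattern)
    return [v for v in values if compiled.fullmatch(v) is None]


def check_within_stdev(values, max_deviations):
    """Return the values further than max_deviations population standard
    deviations from the mean of the column."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    return [v for v in values if abs(v - mean) > max_deviations * stdev]


# Hypothetical sample columns, for illustration only.
status_failures = check_allowed_values(["a", "b", "x"], {"a", "b"})
age_failures = check_range([1, 5, 11], 1, 10)
code_failures = check_regex(["AB-123", "bad"], r"[A-Z]{2}-\d{3}")
amount_failures = check_within_stdev([10, 10, 10, 10, 100], 1.5)
```

Returning the offending values, rather than a bare pass/fail flag, makes it easier to report which rows broke a rule.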

Data 130

Build More Reliable Distributed Systems By Breaking Them With Jepsen

Data Engineering Podcast

Summary: A majority of the scalable data processing platforms that we rely on are built as distributed systems. This brings with it a vast number of subtle ways that errors can creep in. Kyle Kingsbury created the Jepsen framework for testing the guarantees of distributed data processing systems and identifying when and why they break. In this episode he shares his approach to testing complex systems, the common challenges that are faced by engineers who build them, and why it is important to und

Systems 100

I’ve Got the Key, I’ve Got the Secret. Here’s How Keys Work in ksqlDB 0.10.

Confluent

ksqlDB 0.10 includes significant changes and improvements to how keys are handled. This is part of a series of enhancements that began with support for non-VARCHAR keys and will ultimately […].

Process 119

Advancing the Telecom Industry through Network Experience Analytics

Teradata

For today's telco providers, new products and services are all driven by the end consumer's experience. That's where Teradata's Network Experience Analytics comes into play.

76

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Quick Reports: Xero to Power BI

FreshBI

The objective of this blog: To give you the tools and the skills to connect to Xero Accounting from the Power BI Desktop and to have immediate access to the categorized data that drives each of the built-in reports in Xero. What you need to get started: To get quick, immediate access to the data that drives the Xero Reports and push them into Power BI, you'll need 3 tools: Power BI Desktop: Download here>> 'Quick Reports' Power BI Custom Connector for Xero AND Power BI Quick Reports Templ

BI 52

Unbundling Data Science Workflows with Metaflow and AWS Step Functions

Netflix Tech

By David Berg, Ravi Kiran Chirravuri, Romain Cledat, Jason Ge, Savin Goyal, Ferras Hamad, Ville Tuulos.

AWS 59

More Trending

Streaming Data Into Teradata Vantage Using Amazon Managed Kafka (MSK) Data Streams and AWS Glue Streaming ETL

Teradata

In this post, we provide step-by-step instructions on how to set up Vantage & author AWS Glue Streaming ETL jobs to stream data into Vantage from Amazon MSK and visualize the data.

AWS 52

How To Build A Live-Updating COVID Dashboard Using Google Sheets and Apache Superset

Preset

The powerful combination of Google Sheets and Apache Superset

Data Pipelines in the Healthcare Industry

DareData

The Challenges of Medical Data: In recent times, there have been several developments in applications of machine learning to the medical industry. We have heard news of machine learning systems outperforming seasoned physicians on diagnosis accuracy, chatbots that present recommendations depending on your symptoms, or algorithms that can identify body parts from transversal image slices, just to name a few.

How PushOwl Uses ksqlDB to Scale Their Analytics and Reporting Use Cases

Confluent

Using a declarative SQL-like interface, ksqlDB makes it easy to integrate event streaming applications into any tech stack. This article illustrates how ksqlDB was added to PushOwl’s Python tech stack, […].

SQL 97

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Why Teradata Has Never Been Afraid of High Demand

Teradata

Teradata's Advanced SQL Engine can take on more work concurrently than competitors, all while continuing to deliver high throughput under high stress. Learn more.

SQL 52

Performance Isolation for Your Primary MongoDB Cluster

Rockset

Database performance is a critical aspect of ensuring a web application or service remains fast and stable. As the service scales up, there are often challenges with scaling the primary database along with it. While MongoDB is often used as a primary online database and can meet the demands of very large scale web applications, it does often become the bottleneck as well.

MongoDB 40

What is a Data Mesh — and How Not to Mesh it Up

Monte Carlo

Updated: January 2023. Ask anyone in the data industry what's hot these days and chances are "data mesh" will rise to the top of the list. But what is a data mesh and why should you build one? Inquiring minds want to know. In the age of self-service business intelligence, nearly every company considers itself a data-first company, but not every company is treating its data architecture with the level of democratization and scalability it deserves.

IT 45