Sat. Feb 06, 2021 - Fri. Feb 12, 2021

The agile manifesto : 20 years later

François Nguyen

Or Robert C. Martin, the uncle you should visit more often. Where was I 20 years ago, when these 17 brilliant folks were at a ski resort for the Agile Manifesto? I was part of a small team of great individuals; in fact, we were an alternative to an IT department unable to deliver what we wanted, so we decided to do it ourselves. Without knowing it, we were fully in that agile mindset: valuing interactions, working software, collaboration with users, and being able to change be

Node.js ❤️ Apache Kafka – Getting Started with KafkaJS

Confluent

One of the great things about using an Apache Kafka® based architecture is that it naturally decouples systems and allows you to use the best tool for the job. While […].

Kafka 145

How Shopify Is Building Their Production Data Warehouse Using DBT

Data Engineering Podcast

Summary: With all of the tools and services available for building a data platform, it can be difficult to separate the signal from the noise. One of the best ways to get a true understanding of how a technology works in practice is to hear from people who are running it in production. In this episode, Zeeshan Qureshi and Michelle Ark share their experiences using DBT to manage the data warehouse for Shopify.

How to Join a fact and a type 2 dimension (SCD2) table

Start Data Engineering

Contents: Introduction | What is an SCD2 table and why use it? | Application table | Dimension table | Setup | Joining fact and SCD2 tables | high_spenders | user_items | Educating end users | Conclusion | Further reading

Introduction: If you are using a data warehouse, you have probably heard of fact and dimension tables. Simply put, fact tables record business events, and dimension tables record the attributes of business items (e.g., the user and item tables in an e-commerce app).

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Is Your Data Holding You Back Instead of Driving You Forward?

Teradata

Everyone knows that data is vital for success in retail. But without a clear data strategy, retailers often eat up resources fighting small-scale battles, whilst gradually losing the war.

Retail 112

Introducing Confluent Platform 6.1

Confluent

We are pleased to announce the release of Confluent Platform 6.1. With this release, we are further simplifying management tasks for Apache Kafka® operators and providing even higher availability for […].

Kafka 142

More Trending

Monte Carlo Raises $25M Series B to Help Companies Achieve More Reliable Data

Monte Carlo

In 2021, data is your company’s most critical asset. As data pipelines become increasingly complex and companies ingest more and more data, it’s paramount that this data is reliable. After talking to hundreds of data teams over the past few years, I was struck by the fact that organizations were investing millions of dollars and strategic energy in data, but decision makers and others on the frontlines couldn’t use it or didn’t trust it.

From Product Cycle to Digital Thread

Teradata

In order to survive, the auto industry needs to leverage ‘digital threads’ that connect data from customers to dealers to products, and link R&D to the production line and the aftermarket.

Data 69

How to Write a Connector for Kafka Connect – Deep Dive into Configuration Handling

Confluent

Kafka Connect is part of Apache Kafka®, providing streaming integration of external systems in and out of Kafka. There are a large number of existing connectors, and you can also […].

Kafka 83

Cloudera Operational Database application development concepts

Cloudera

Cloudera Operational Database is now available in three different form factors in Cloudera Data Platform (CDP). If you are new to Cloudera Operational Database, see this blog post, and check out the documentation here. In this blog post, we’ll look at both Apache HBase and Apache Phoenix concepts relevant to developing applications for Cloudera Operational Database.

Database 103

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Node.js Memory Error on Mac Using M1

Grouparoo

I was working with our fancy new CLI tool on my fancy new MacBook Pro with the M1 chip when I came across this scary error, courtesy of Node.js: "FATAL ERROR: wasm code commit Allocation failed - process out of memory". It began occurring regularly enough that I started digging. I've since come across two methods for solving this issue. Method #1: Upgrade to Node v15. I found this discussion, which noted that Node.js versions prior to v15 do not natively support the Apple M1 chip.

Coding 52

What is Teradata Unity and Why Do You Need It?

Teradata

Learn more about Teradata Unity, a powerful portfolio for high availability and data synchronization in a Teradata-powered analytical ecosystem.

IT 52

Automatic Observer Promotion Brings Fast and Safe Multi-Datacenter Failover with Confluent Platform 6.1

Confluent

Persisting data in multiple regions has become crucial for modern businesses: They need their mission-critical data to be protected from accidents and disasters. They can achieve this goal by running […].

Data 57

Coffee with Cloudera: Vinita Srivalsan

Cloudera

Meet Vinita Srivalsan, the powerhouse leader of the Partner Marketing team. Since this is Coffee with Cloudera, what’s your morning pick-me-up drink? I am a Chai person through and through and make it the traditional Indian way with milk and sugar! What makes your role at Cloudera unique? Partner Marketing is uniquely positioned to be the voice of Cloudera within a partner organization, and to represent the partner within Cloudera.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Why Artificial Intelligence May Not Offer the Business Value You Think

DataKitchen

7-Step Guide to Become a Machine Learning Engineer in 2023

ProjectPro

Spoiler Alert: Becoming a machine learning engineer can sound like a hard-to-reach goal, but let us tell you the truth – it isn’t as hard as it seems. And yes, we’re talking to you - the person who’s reading this because they’re probably wondering what is a machine learning engineer, what does a machine learning engineer do, how to become a machine learning engineer, and, more importantly, whether they can pull it off.

Better Understand Your Geospatial Data - PostGIS GeoJSON

Preset

Apache Superset™ can visualize your geodata stored in Postgres | PostGIS GeoJSON

Data 52

Using COD and CML to build applications that predict stock data

Cloudera

No, not really. You probably won’t be rich unless you work really hard… As nice as it would be, you can’t really predict a stock price based solely on ML, but now I have your attention! Continuing from my previous blog post about how awesome and easy it is to develop web-based applications backed by Cloudera Operational Database (COD), I started a small project to integrate COD with another CDP cloud experience, Cloudera Machine Learning (CML).

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

Differentiation Through DataOps in Financial Services

DataKitchen

Edge Authentication and Token-Agnostic Identity Propagation

Netflix Tech

by AIM Team Members Karen Casella, Travis Nelson, Sunny Singh; with prior art and contributions by Justin Ryan, Satyajit Thadeshwar. As most developers can attest, dealing with security protocols and identity tokens, as well as user and device authentication, can be challenging. Imagine having multiple protocols, multiple tokens, 200M+ users, and thousands of device types, and the problem can explode in scope.

How to Choose the Right Data Chart Types

Preset

Data visualization offers thousands of charts to choose from. Understanding different data chart types lets you take control of your reporting.

Data 40

A Cost-Effective Data Warehouse Solution in CDP Public Cloud – Part1

Cloudera

Today’s customers have a growing need for faster end-to-end data ingestion to meet the expected speed of insights and overall business demand. This ‘need for speed’ drives a rethink on building a more modern data warehouse solution, one that balances speed with platform cost management, performance, and reliability. A typical approach that we have seen in customers’ environments is that ETL applications pull data with a frequency of minutes and land it into HDFS storage as an extra Hive table

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies that are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while remaining lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground up, graduating from just getting started with data-driven development to operating

Industry Leader Q&A with DataKitchen’s Chris Bergh

DataKitchen

Growth Engineering at Netflix- Creating a Scalable Offers Platform

Netflix Tech

by Eric Eiswerth. Background: Netflix has been offering streaming video-on-demand (SVOD) for over 10 years. Throughout that time we’ve primarily relied on 3 plans (Basic, Standard, & Premium), combined with the 30-day free trial to drive global customer acquisition. The world has changed a lot in this time. Competition for people’s leisure time has increased, the device ecosystem has grown phenomenally, and consumers want to watch premium content whenever they want, wherever they are, and on w

Data Observability: How to Build Your Own Data Anomaly Detectors Using SQL

Monte Carlo

In this article series, we walk through how you can create your own data observability monitors and data anomaly detectors from scratch, mapping to five key pillars of data health. Part I can be found here. Part II of this series was adapted from Barr Moses and Ryan Kearns’ O’Reilly training, Managing Data Downtime: Applying Observability to Your Data Pipelines, the industry’s first-ever course on data observability.

SQL 45

#ClouderaLife Spotlight: Valaretha Brown, Sr. Partner Marketing Manager, ISV

Cloudera

Valaretha Brown (also known as Val) is Cloudera’s Sr. Partner Marketing Manager leading the strategy behind the go-to-market plans with our Independent Software Vendors. When she was young, she was always curious about corporate America. “My immediate family members received vocational school certificates and were hard working, blue collar workers.” This, along with her first job in fast food, helped her realize, “using my mind more than my hands to earn a living was right up my alley.”

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

Indexing Amazon S3 for Real-Time Analytics on Data Lakes

Rockset

Amazon Simple Storage Service (Amazon S3) is one of the leading cloud object storage services available. It uses an HTTP interface, making it easy for application developers to integrate S3 into their applications. Athena is a serverless query service provided by Amazon to query the data stored in Amazon S3 using standard SQL. Because it integrates easily with S3, is serverless, and uses a familiar language, Athena has become the default service for most business intelligence (BI) decision maker

Hawkins: Diving into the Reasoning Behind our Design System

Netflix Tech

Image: Stranger Things imagery showcasing the inspiration for the Hawkins Design System. By Hawkins team member Joshua Godi; with art contributions by Wiki Chaves. Hawkins may be the name of a fictional town in Indiana, most widely known as the backdrop for one of Netflix’s most popular TV series “Stranger Things,” but the name is so much more. Hawkins is the namesake that established the basis for a design system used across the Netflix Studio ecosystem.

How to Configure Your dbt Repository (One or Many)?

dbt Developer Hub

At dbt Labs, as more folks adopt dbt, we have started to see more and more use cases that push the boundaries of our established best practices. This is especially true for those adopting dbt in the enterprise space. After two years of helping companies with 20 to 10,000+ employees implement dbt & dbt Cloud, the below is my best attempt to answer the question: “Should I have one repository for my dbt project or many?”

SQL 52

Fine-Grained Authorization with Apache Kudu and Apache Ranger

Cloudera

When Kudu was first introduced as a part of CDH in 2017, it didn’t support any kind of authorization, so only air-gapped and non-secure use cases were satisfied. Coarse-grained authorization was added along with authentication in CDH 5.11 (Kudu 1.3.0), which made it possible to restrict access only to Apache Impala, where Apache Sentry policies could be applied, enabling a lot more use cases.

Hadoop 52

Reimagined: Building Products with Generative AI

“Reimagined: Building Products with Generative AI” is an extensive guide for integrating generative AI into product strategy and careers featuring over 150 real-world examples, 30 case studies, and 20+ frameworks, and endorsed by over 20 leading AI and product executives, inventors, entrepreneurs, and researchers.