Sat. Apr 30, 2022 - Fri. May 06, 2022

Hypothesis Testing Explained

KDnuggets

This brief overview of the concept of Hypothesis Testing covers its classification in parametric and non-parametric tests, and when to use the most popular ones, including means, correlation, and distribution, in the case of one sample and two samples.
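As a rough illustration of the parametric versus non-parametric split for a two-sample comparison of means, the sketch below computes Welch's t statistic (parametric) and a permutation-test p-value (non-parametric). The function names and sample values are made up for illustration and do not come from the article.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical measurements from two groups.
group_a = [5.1, 4.9, 5.3, 5.0, 5.2]
group_b = [6.8, 7.1, 6.9, 7.3, 7.0]
t_stat, dof = welch_t(group_a, group_b)
p_perm = permutation_p_value(group_a, group_b)
```

The t statistic relies on approximate normality of the sample means, while the permutation test only assumes that group labels are exchangeable under the null hypothesis.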

IT 160

AI-First Benefits: 5 Real-World Outcomes

Cloudera

Artificial intelligence (AI) has been a focus for research for decades, but has only recently become truly viable. The availability and maturity of automated data collection and analysis systems is making it possible for businesses to implement AI across their entire operations to boost efficiency and agility. AI has the potential to transform operations by improving three fundamental business requirements: process automation, decision-making based on data insights, and customer interaction.

Insurance 129

Evolving And Scaling The Data Platform at Yotpo

Data Engineering Podcast

Summary: Building a data platform is an iterative and evolutionary process that requires collaboration with internal stakeholders to ensure that their needs are being met. Yotpo has been on a journey to evolve and scale their data platform to continue serving the needs of their organization as it increases the scale and sophistication of data usage. In this episode Doron Porat and Liran Yogev explain how they arrived at their current architecture, the capabilities that they are optimizing for, an

From the Cellar to the Cloud – How Aedifion is Driving Next-Generation Building Automation with Confluent

Confluent

It is no exaggeration that a lot is going wrong in commercial buildings today. The building and construction sector consumes 36% of global final energy and accounts for almost 40% […].

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enhance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

Machine Learning Is Not Like Your Brain Part One: Neurons Are Slow, Slow, Slow

KDnuggets

Artificial intelligence is not all that intelligent. While today’s AI can do some extraordinary things, the functionality underlying its accomplishments has very little to do with the way in which a human brain works to achieve the same tasks.

Winning With Data in the Fight Against Fraud, Waste, and Abuse

Cloudera

Fraud, waste, and abuse (FWA) in government is a constant, multi-billion dollar issue that challenges agency leaders at all levels and across all sectors, from healthcare to education to taxation to Social Security. The scope and scale of public spending — federal outlays alone were approximately $6.6 trillion in fiscal year 2020 according to the Congressional Budget Office — make FWA an inherently difficult problem to solve.

More Trending

Packaging generated code from protobuf files for gRPC Services

Eventbrite Engineering

Background: At Eventbrite, we identified in our 3-year technical vision that one of our goals is to enable autonomous dev teams to own their code and architecture so as to be able to deliver reliable, high quality and cost effective solutions to our customers. However, this autonomy does not mean that our team has to …

Coding 52

SQL Notes for Professionals: The Free eBook Review

KDnuggets

The free book is a combination of SQL cheat sheets and practical database examples. It provides bite-size information about every SQL function and attribute with coding samples.

SQL 158

Meet The Graduates: Guoda Paulikaite

Pipeline Data Engineering

In this interview series we’ll share some of the stories that Daniel and I get to watch unfold at Pipeline Academy. Check out what our graduates have to say about the course, how they’ve tackled its challenges and what they are doing now with their new data engineering superpowers. Peter: Can I ask you to please introduce yourself to the readers of Pipeline Academy’s blog?

Announcing ksqlDB 0.25.1

Confluent

We are thrilled to announce ksqlDB 0.25! It comes with a slew of improvements and new features. In particular, we improved how UDAFs work with complex types like Structs and […].

IT 52

Understanding User Needs and Satisfying Them

Speaker: Scott Sehlhorst

We know we want to create products which our customers find to be valuable. Whether we label it as customer-centric or product-led depends on how long we've been doing product management. There are three challenges we face when doing this. The obvious challenge is figuring out what our users need; the non-obvious challenges are in creating a shared understanding of those needs and in sensing if what we're doing is meeting those needs.

DataKitchen In The insideBIGDATA IMPACT 50 List

DataKitchen

109

6 Highest Paying Companies for Data Scientists

KDnuggets

These are the six top paying companies for data scientists. I’ve looked at absolute salary, but I’ll fill you in on other factors you should consider as well when it comes to picking a data science job for money.

Seven Benefits of a Powerful Data Fabric

Teradata

The value provided by a powerful data fabric is key for a successful digital transformation. Find out why.

Data 52

Why Does Elder Research Need a Chief Scientist Committee?

Elder Research

52

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Podcast: Storytime for DataOps

DataKitchen

70

How To Structure a Data Science Project: A Step-by-Step Guide

KDnuggets

Check out all the necessary steps to successfully structure your data science projects leveraging data science templates.

Leading The Charge For The ELT Data Integration Pattern For Cloud Data Warehouses At Matillion

Data Engineering Podcast

Summary: The predominant pattern for data integration in the cloud has become extract, load, and then transform or ELT. Matillion was an early innovator of that approach and in this episode CTO Ed Thompson explains how they have evolved the platform to keep pace with the rapidly changing ecosystem. He describes how the platform is architected, the challenges related to selling cloud technologies into enterprise organizations, and how you can adopt Matillion for your own workflows to reduce the ma

Choose Compliance, Choose Hybrid Cloud

Cloudera

As digital transformation accelerates, and digital commerce increasingly becomes the dominant form of all commerce, regulators and governments around the world are recognizing the increased need for consumer protections and data protection measures. The European Union has been at the vanguard for some time (most recently having reached provisional agreement on the Digital Services Act) but from Australia to Brazil, from South Africa to California (the rest of the US hasn’t quite caught on yet!

Cloud 99

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Dapper Data Podcast: DataKitchen and DataOps – Episode #54 w/ Chris Bergh

DataKitchen

Data 91

9 Free Harvard Courses to Learn Data Science in 2022

KDnuggets

Learn Python programming, statistics, and machine learning online from one of the world’s top universities.

A Real-Time Rockset Intern Experience

Rockset

I spent the spring of my junior year interning at Rockset, and it couldn’t have been a better decision. When I first arrived at the office on a sunny day in San Mateo, I had no idea that I was about to meet so many systems engineering gurus, or that I was about to consume immensely good food from the festive neighboring streets. Working with my talented and resourceful mentor, Ben (Software Engineer, Systems), I’ve been able to learn more than I ever thought I could in three months!

Food 52

#Clouderalife Volunteer Spotlight: Lynne Montalbo!

Cloudera

This month we are proud to spotlight Lynne Montalbo, senior business systems analyst from Santa Clara, California, who volunteers as a professional development mentor with Braven. Braven’s mission is to empower promising, underrepresented young people—first-generation college students, students from low-income backgrounds, and students of color—with the skills, confidence, experiences, and networks necessary to transition from college to strong first jobs, which lead to meaningful careers and li

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

DataKitchen Noted For DataOps Thought LeaderShip

DataKitchen

52

How to Build Strong Data Science Portfolio as a Beginner

KDnuggets

After learning the basics of data science, you can start to work on real-world problems. But how do you showcase your work? In this article, we are going to learn a unique way to create a data science portfolio.

Portfolio 123

Making dbt Cloud API calls using dbt-cloud-cli

dbt Developer Hub

dbt Cloud is a hosted service that many organizations use for their dbt deployments. Among other things, it provides an interface for creating and managing deployment jobs. When triggered (e.g., cron schedule, API trigger), the jobs generate various artifacts that contain valuable metadata related to the dbt project and the run results. dbt Cloud provides a REST API for managing jobs, run artifacts and other dbt Cloud resources.
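To make the REST interaction concrete, here is a minimal sketch that builds (but does not send) an HTTP request to trigger a job run. It assumes the v2 endpoint layout `/api/v2/accounts/<account_id>/jobs/<job_id>/run/` and token-based `Authorization` header; the base URL, IDs, and token below are placeholders, and the exact path and payload should be checked against the dbt Cloud API documentation. It does not use dbt-cloud-cli itself.

```python
import json
from urllib.request import Request


def build_trigger_request(account_id, job_id, token,
                          cause="Triggered via API",
                          base_url="https://cloud.getdbt.com"):
    """Build a POST request that would trigger a dbt Cloud job run.

    The endpoint path and header format are assumptions for this sketch.
    """
    url = f"{base_url}/api/v2/accounts/{account_id}/jobs/{job_id}/run/"
    body = json.dumps({"cause": cause}).encode("utf-8")
    headers = {
        "Authorization": f"Token {token}",
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


# Placeholder identifiers; nothing is sent over the network here.
req = build_trigger_request(account_id=123, job_id=456, token="secret")
```

Passing `req` to `urllib.request.urlopen` would perform the call; keeping construction separate makes the request easy to inspect or test first.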

Cloud 52

Building Ripple: Engineering Spotlight Pt. 2

Ripple Engineering

In part one of our two-part series, we heard from RippleX engineers that are ideating, creating and executing on new applications using cutting-edge blockchain and crypto technology. Now, we’ll explore how the RippleNet engineering team is building the foundational payments infrastructure on the XRP Ledger that will allow value to move as easily as information moves today.

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

How Rockset Handles Data Deduplication

Rockset

There are two major problems with distributed data systems. The second is out-of-order messages, the first is duplicate messages, the third is off-by-one errors, and the first is duplicate messages. This joke inspired Rockset to confront the data duplication issue through a process we call deduplication. As data systems become more complex and the number of systems in a stack increases, data deduplication becomes more challenging.
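For readers new to the idea, a generic sketch of key-based deduplication follows: keep the first message seen for each identifier and drop later repeats. This is a simplified illustration, not Rockset's actual mechanism, and the `id` field and sample messages are hypothetical.

```python
def deduplicate(messages, key="id"):
    """Return messages in arrival order, keeping only the first per key."""
    seen = set()
    unique = []
    for msg in messages:
        k = msg[key]
        if k in seen:
            continue  # duplicate delivery, skip it
        seen.add(k)
        unique.append(msg)
    return unique


# Hypothetical stream in which message 1 is delivered twice.
stream = [
    {"id": 1, "value": "a"},
    {"id": 2, "value": "b"},
    {"id": 1, "value": "a"},
]
result = deduplicate(stream)
```

An in-memory set only works for a bounded stream on one machine; in a distributed system the set of seen keys has to live somewhere shared and durable, which is part of what makes the problem harder as stacks grow.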

Kafka 52

Image Classification with Convolutional Neural Networks (CNNs)

KDnuggets

In this article, we’ll look at what Convolutional Neural Networks are and how they work.

Slim CI/CD with Bitbucket Pipelines

dbt Developer Hub

Continuous Integration (CI) sets the system up to test everyone’s pull request before merging. Continuous Deployment (CD) deploys each approved change to production. “Slim CI” refers to running/testing only the changed code, thereby saving compute. In summary, CI/CD automates dbt pipeline testing and deployment. dbt Cloud, a much beloved method of dbt deployment, supports GitHub- and GitLab-based CI/CD out of the box.

How to Remove Apache Kafka Brokers the Easy Way

Confluent

The recent release of Confluent Cloud and Confluent Platform 7.0 introduced the ability to easily remove Apache Kafka® brokers and shrink your Confluent Server cluster with just a single command. […].

Kafka 83

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies who are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground-up, graduating from just getting started with data-driven development to operating