Sat. Jul 23, 2022 - Fri. Jul 29, 2022

4 Must-Have Tests for Your Apache Kafka CI/CD with GitHub Actions

Confluent

Explore GitHub Actions for your Kafka CI/CD pipeline, automate Schema Registry, and transform the development and testing of Kafka client applications.

Kafka 141

The 5 Hardest Things to Do in SQL

KDnuggets

The 5 hardest things Josh Berry, a 15-year analytics professional, encountered while switching from Python to SQL, with examples, SQL code, and a resource for customizing the SQL to your own project.

SQL 126

Re-Bundling The Data Stack With Data Orchestration And Software Defined Assets Using Dagster

Data Engineering Podcast

Summary: The current stage of evolution in the data management ecosystem has resulted in domain- and use-case-specific orchestration capabilities being incorporated into various tools. This complicates the work involved in making end-to-end workflows visible and integrated. Dagster has invested in bringing insights about external tools’ dependency graphs into one place through its "software defined assets" functionality.

MongoDB 100

Being the Best Digital Bank is Not Enough

Teradata

For many, banking is now a digital activity. But the financial services industry still trails many others in leveraging cloud technologies to build deeper, emotional attachments to their customers.

Banking 94

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Modern Data Flow: A Better Way of Building Data Pipelines

Confluent

Complete guide to data pipelines, data integration, and modern data flow, the key to next generation, data-driven applications, systems, and organizations.

Practical Deep Learning from fast.ai is Back!

KDnuggets

Looking for a great course to go from machine learning zero to hero quickly? fast.ai has released the latest version of Practical Deep Learning For Coders. And it won't cost you a thing.

More Trending

Here Is The Most Fun Way Of Obtaining The Illustrious IIM Indore Alumni Status: Integrated Program In Business Analytics

U-Next

Every layer of business operations today uses the power of metrics and analytics to enhance their market growth and business success. With the fourth industrial revolution increasing the dependency on emerging technologies like Data Science, Cloud Computing, IoT, Business Analytics, etc., the need to master the nuances of the same is relatively high.

How to Become a Data Scientist in 2022: The Ultimate Guide

Emeritus

Data science has become an integral part of every company, especially those that understand the value of data and what can be done with that information. The primary role of a data scientist is to extract actionable insights from complex data to inform your business decisions. If you are wondering how to become a data…

KDnuggets News, July 27: The AIoT Revolution: How AI and IoT Are Transforming Our World • Introduction to Hill Climbing Algorithm

KDnuggets

Calculus for Data Science • Real-time Translations with AI • Using Numpy's argmax() • Using the apply() Method with Pandas DataFrames • An Introduction to Hill Climbing Algorithm in AI.

Algorithm 121

MongoDB CDC: When to Use Kafka, Debezium, Change Streams and Rockset

Rockset

MongoDB has grown from a basic JSON key-value store to one of the most popular NoSQL database solutions in use today. It is widely supported and provides flexible JSON document storage at scale. It also provides native querying and analytics capabilities. These attributes have caused MongoDB to be widely adopted especially alongside JavaScript web applications.

MongoDB 52

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Updating our permissioning guidelines: grants as configs in dbt Core v1.2

dbt Developer Hub

If you’ve needed to grant access to a dbt model between 2019 and today, there’s a good chance you’ve come across the "The exact grant statements we use in a dbt project" post on Discourse. It explained options for covering two complementary abilities: querying relations via the "select" privilege, and using the schema those relations are within via the "usage" privilege. The solution then: Prior to dbt Core v1.2, we proposed three possible approaches (each coming with caveats and trade-offs): Using

BI 52

Difference Between Spring and Spring Boot

U-Next

Introduction. Spring Framework (Spring) is an open-source application framework that provides infrastructure assistance to develop Java applications. Spring is one of the most popular Java Enterprise Edition (Java EE) frameworks, which assists developers in creating high-performance applications using plain old Java objects (POJOs). It is used for developing stand-alone, production-grade applications on the Java Virtual Machine (JVM).

Java 52

Is Domain Knowledge Important for Machine Learning?

KDnuggets

If you incorporate domain knowledge into your architecture and your model, it can make it a lot easier to explain the results, both to yourself and to an outside viewer. Every bit of domain knowledge can serve as a stepping stone through the black box of a machine learning model.

Growth Engineering at Zalando

Zalando Engineering

We recently closed out our annual performance review for employees. Naturally, this period is for us to focus on how we are performing, what we aspire to achieve, and how we can progress towards those goals, with the support of our leads. As a leader, I’ve spent a great deal of time working with Software Engineers on their development, and helping them to drive their career progression.

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies who are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground-up, graduating from just getting started with data-driven development to operating

What Is the Difference Between a Database and a Warehouse in Snowflake? | Propel Data Analytics Blog

Propel Data

Snowflake uses databases for data storage, while a “Snowflake warehouse” is a virtual computing cluster that processes analytical queries.

Working in Cyber Security

U-Next

Is working in cyber security your dream job? If yes, this is the right place for you to learn how to become a cyber security expert and your role in the tech industry. Introduction. Cybersecurity aims at preventing cyber threats and protecting information and information systems. It includes protecting the company’s valuable information, hardware, software, and network.

Does the Random Forest Algorithm Need Normalization?

KDnuggets

Normalization is a good technique to use when your features are on different scales and your chosen machine learning algorithm makes no assumptions about the distribution of your data.

Algorithm 110

Q&A Picnic Data Engineering Series

Picnic Engineering

The most important thing for a successful analytics strategy. Data Mesh, or Hub-and-Spoke? Is “lakeless” a thing!? … and other reflections on building data governance. Since the publication of the first blog post in this series, we have received numerous questions via social media, direct messages, public posts, and meet-up discussions. It’s been truly amazing to see so much interest and, as promised, we will address the most frequently raised topics in this post.

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

How to build Snowflake data apps with GraphQL | Propel Data Analytics Blog

Propel Data

Need to build a Snowflake data app? Here's how to create and query a Metric on top of Snowflake data warehouse using Propel’s GraphQL API.

Driving Success With a Modern Data Architecture and a Hybrid Approach in the Financial Services and Telco Industries

Cloudera

Corporations are generating unprecedented volumes of data, especially in industries such as telecom and financial services (FSI). Many organizations are hoping to leverage these massive amounts of data by investing heavily in big data solutions – solutions that they hope can meet business goals such as increasing customer satisfaction, uncovering alternative revenue streams, or improving operational efficiency.

Top Posts July 18-24: Free Python Automation Course

KDnuggets

Free Python Automation Course • Machine Learning Algorithms Explained in Less Than 1 Minute Each • Parallel Processing Large File in Python • 12 Most Challenging Data Science Interview Questions • Decision Tree Algorithm, Explained.

Python 110

AI in Manufacturing: 5 Successful Use Cases of AI-Based Technologies

AltexSoft

In October 2019, Microsoft reported that artificial intelligence helps manufacturing companies outperform rivals, stating that manufacturers adopting AI perform 12 percent better than their competitors. Therefore, we are likely to see a surge of AI-based technologies in manufacturing, along with new highly paid jobs in this area. In this article, we’ll highlight 5 use cases of adopting AI-based technologies in manufacturing.

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

Data Contracts and 4 Other Ways to Overcome Schema Changes

Monte Carlo

There is a virtually unlimited number of ways data can break. It could be a bad JOIN statement, an untriggered Airflow job, or even just someone at a third-party provider who didn’t feel like hitting the send button that day. But perhaps one of the most common reasons for data quality challenges is software feature updates and other changes made upstream by software engineers.

Understanding the components of the dbt Semantic Layer

dbt Developer Hub

TLDR: The Semantic Layer is made up of a combination of open-source and SaaS offerings and is going to change how your team defines and consumes metrics. At last year's Coalesce, Drew showed us the future - a vision of what metrics in dbt could look like. Since then, we've been getting the infrastructure in place to make that vision a reality. We wanted to share with you where we are today and how it fits into the broader picture of where we're going.

K-nearest Neighbors in Scikit-learn

KDnuggets

Learn about the k-nearest neighbours algorithm, one of the most prominent workhorse machine learning algorithms there is, and how to implement it using Scikit-learn in Python.

Algorithm 108

The Good and the Bad of Hadoop Big Data Framework

AltexSoft

What does it take to store all New York Times articles published between 1855 and 1922? Depending on how you measure it, the answer will be 11 million newspaper pages or… just one Hadoop cluster and one tech specialist who can move 4 terabytes of textual data to a new location in 24 hours. The biggest star of the Big Data world, Hadoop was named after a yellow stuffed elephant that belonged to the 2-year-old son of computer scientist Doug Cutting.

Hadoop 59

Reimagined: Building Products with Generative AI

“Reimagined: Building Products with Generative AI” is an extensive guide for integrating generative AI into product strategy and careers featuring over 150 real-world examples, 30 case studies, and 20+ frameworks, and endorsed by over 20 leading AI and product executives, inventors, entrepreneurs, and researchers.

Using the Airflow ShortCircuitOperator to Stop Bad Data From Reaching ETL Pipelines 

Monte Carlo

I’m a huge fan of Apache Airflow and how the open source tool enables data engineers to scale data pipelines by more precisely orchestrating workloads. But what happens when Airflow testing doesn’t catch all of your bad data? What if “unknown unknown” data quality issues fall through the cracks and affect your Airflow jobs? One helpful but underutilized solution is to leverage the Airflow ShortCircuitOperator to create data circuit breakers to prevent bad data from flowing across your data

What is Data Lineage?

Databand.ai

By Niv Sluzki, 2022-07-28 10:20:02. The term “data lineage” has been thrown around a lot over the last few years. What started as an idea of connecting between datasets quickly became a very confusing term that now gets misused often. It’s time to put order to the chaos and dig deep into what it really is. Because the answer matters quite a lot.

Why Upskilling in Data Vis Matters (& How to Get Started)

KDnuggets

How do you condense the information you collect and present it to decision-makers in a clear, concise, and memorable way? This August, Noah Iliinsky will be opening up an intimate cohort and presenting an online course, Effective and Efficient Data Visualization.

Data 108

Cyber Security Syllabus

U-Next

A Cyber Security syllabus comprises everything you need to study in any Cyber Security degree. Here we elaborate on the course syllabus for digital forensics, network programming, and other fields. Introduction to Cyber Security Syllabus. As a field of study, Cyber Security will teach you how to protect your company’s operating systems. It focuses on making students aware of the methods required to protect information.

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.