January, 2022


Airflow TaskGroups: All you need to know!

Marc Lamberti

Airflow TaskGroups were introduced to make your DAG visually cleaner and easier to read. They are meant to replace SubDAGs, which were the historical way of grouping tasks. The problem with SubDAGs is that they are much more than a grouping mechanism. They bring a lot of complexity: you need to create a DAG within a DAG, import the SubDagOperator (which is in fact a sensor), define the parameters properly, and so on.

Coding 130

The Best Python Courses: An Analysis Summary

KDnuggets

What does the data reveal if we ask: "What are the 10 Best Python Courses?". Collecting almost all of the courses from top platforms shows there are plenty to choose from, with over 3000 offerings. This article summarizes my analysis and presents the top three courses.

Python 160


5 Common Pitfalls When Using Apache Kafka

Confluent

Whether you’re a seasoned Apache Kafka® developer or just getting started you’re likely to hit a snag at some point or another—either in configuring and understanding your clients or setting […].

Kafka 138

Effective Pandas Patterns For Data Engineering

Data Engineering Podcast

Summary Pandas is a powerful tool for cleaning, transforming, manipulating, or enriching data, among many other potential uses. As a result it has become a standard tool for data engineers for a wide range of applications. Matt Harrison is a Python expert with a long history of working with data who now spends his time on consulting and training. He recently wrote a book on effective patterns for Pandas code, and in this episode he shares advice on how to write efficient data processing routines


Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.


Why Choose a Hybrid Data Cloud in Financial Services?

Cloudera

As I meet with our customers, there is always a range of discussions regarding the use of the cloud for financial services data and analytics. Customers vary widely on the topic of public cloud – what data sources, what use cases are right for public cloud deployments – beyond sandbox, experimentation efforts. Private cloud continues to gain traction with firms realizing the benefits of greater flexibility and dynamic scalability.

Cloud 111

Three Ways Integrated Data Can Deliver Outstanding Customer Experience

Teradata

The use of integrated data to restore customer confidence will be big in 2022. Building a customer insights foundation should be high on the to-do list for retail & CPG businesses this year.

Retail 105

More Trending


How to Grow as a Data Scientist in an Ever-Changing World

KDnuggets

Just like tradespeople need to grow in their skill sets, data scientists must also grow in the ever-changing world we inhabit. With that said, let’s break down how you can evolve your data science skills while progressing your career.


The Link To Cloud: How to Build a Seamless and Secure Hybrid Data Bridge with Cluster Linking

Confluent

Chances are your business is migrating to the cloud. But if you operate business applications in an on-premises datacenter, you know firsthand that the journey to the cloud is fraught […].

Cloud 122

Building And Managing Data Teams And Data Platforms In Large Organizations With Ashish Mrig

Data Engineering Podcast

Summary Data engineering is a relatively young and rapidly expanding field, with practitioners having a wide array of experiences as they navigate their careers. Ashish Mrig currently leads the data analytics platform for Wayfair, as well as running a local data engineering meetup. In this episode he shares his career journey, the challenges related to management of data professionals, and the platform design that he and his team have built to power analytics at a large company.

Building 100

Fire Your Super-Smart Data Consultants with DataOps

DataKitchen

Analytics are prone to frequent data errors and deployment of analytics is slow and laborious. The strategic value of analytics is widely recognized, but the turnaround time of analytics teams typically can’t support the decision-making needs of executives coping with fast-paced market conditions. Perhaps it is no surprise that the average tenure of a CDO or CAO is only about 2.5 years.


From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.


Avoid Data Sharing Lock-in and Take the Open Road

Teradata

There is a lot of hype today around data sharing and the value it brings to your business. But what exactly is data sharing, and why should you and your company care? Find out more.

Data 97

Auto-Diagnosis and Remediation in Netflix Data Platform

Netflix Tech

By Vikram Srivastava and Marcelo Mayworm. Netflix has one of the most complex data platforms in the cloud on which our data scientists and engineers run batch and streaming workloads. As our subscribers grow worldwide and Netflix enters the world of gaming, the number of batch workflows and real-time data pipelines increases rapidly. The data platform is built on top of several distributed systems, and due to the inherent nature of these systems, it is inevitable that these workloads run into fa

Kafka 95

Top Programming Languages and Their Uses

KDnuggets

The landscape of programming languages is rich and expanding, which can make it tricky to focus on just one or another for your career. We highlight some of the most popular languages that are modern, widely used, and come with loads of packages or libraries that will help you be more productive and efficient in your work.


What’s New in Apache Kafka 3.1.0

Confluent

On behalf of the Apache Kafka® community, it is my pleasure to announce the release of Apache Kafka 3.1.0. The 3.1.0 release contains many improvements and new features. We’ll highlight […].

Kafka 104

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.


Data Observability Out Of The Box With Metaplane

Data Engineering Podcast

Summary Data observability is a set of technical and organizational capabilities related to understanding how your data is being processed and used so that you can proactively identify and fix errors in your workflows. In this episode Metaplane founder Kevin Hu shares his working definition of the term and explains the work that he and his team are doing to cut down on the time to adoption for this new set of practices.

BI 100

DataOps For Business Analytics Teams

DataKitchen

Business analysts often find themselves in a no-win situation with constraints imposed from all sides. Their business unit colleagues ask an endless stream of urgent questions that require analytic insights. Business analysts must rapidly deliver value and simultaneously manage fragile and error-prone analytics production pipelines. Data tables from IT and other data sources require a large amount of repetitive, manual work to be used in analytics.


Channel Your Inner Business Analyst With The Right Upskilling Program

U-Next

A domain with applications across multiple industries from Agriculture to Transport, Business Analytics is all about making data-driven decisions for maximum business revenue. Even though this field has established a strong presence over the years, there’s an array of opportunities and growth still waiting to be transformed into reality. According to IMARC Group’s latest report, the global BPO business analytics market is expected to grow at a CAGR of around 25% during 2021-2026.


How Data is Helping Organizations to Improve the Employee Lifecycle

Cloudera

Each year, the Cloudera Data Impact Awards recognize organizations that have accomplished amazing things with innovative data solutions. For 2021, the awards will include a new category: People First. Entrants in this category were asked to demonstrate how they have addressed the world’s “most difficult workplace and societal challenges” with solutions aimed at transforming work culture and society as a whole.

Banking 92

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.


3 Reasons Why Data Scientists Should Use LightGBM

KDnuggets

There are many great boosting Python libraries for data scientists to reap the benefits of. In this article, the author discusses LightGBM benefits and how they are specific to your data science job.


Auto-Balance and Optimize Apache Kafka Clusters with Improved Observability and Elasticity in Confluent Platform 7.0

Confluent

While Self-Balancing Clusters (SBC) perform effectively in balancing Apache Kafka® clusters, one of the common themes we hear from our users is that they would love some visibility into the […].

Kafka 104

A Reflection On The Data Ecosystem For The Year 2021

Data Engineering Podcast

Summary This has been an active year for the data ecosystem, with a number of new product categories and substantial growth in existing areas. In an attempt to capture the zeitgeist Maura Church, David Wallace, Benn Stancil, and Gleb Mezhanskiy join the show to reflect on the past year and share their thoughts on the year to come. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to


The Top FinServ Trends & Predictions for 2022

Teradata

From Open Finance and Insurance to FinCrime and Crypto, hear from one of our experts on the top FinServ trends and predictions to look out for in 2022. Read more.


The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.


A busy year ahead in low-code and no-code development

DataKitchen

The post A busy year ahead in low-code and no-code development first appeared on DataKitchen.

Coding 110

Gartner® Magic Quadrant™ for Cloud Database Report Recognizes Cloudera as a Visionary

Cloudera

Gartner® recognized Cloudera in three recent reports – Magic Quadrant for Cloud Database Management Systems (DBMS), Critical Capabilities for Cloud Database Management Systems for Analytical Use Cases and Critical Capabilities for Cloud Database Management Systems for Operational Use Cases. Our position as a Visionary in the Gartner Magic Quadrant for Cloud DBMS market speaks to our product excellence and market-leading vision of a hybrid, multifunction integrated platform with built-in security


Why Do Machine Learning Models Die In Silence?

KDnuggets

A critical problem for companies integrating machine learning into their business processes is not knowing why their models stop performing well after a while. The reason is called concept drift. Here's an informational guide to understanding the concept well.


Announcing ksqlDB 0.23.1

Confluent

We’re pleased to announce ksqlDB 0.23.1! This release allows you to now perform pull queries on streams, which makes it much easier to find a given record in a topic. […].

IT 98

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.


U-Next - Untitled Article

U-Next

Introduction. The evolution of workplaces has seen people being hired for more than just their educational qualifications. The criteria for being hired have seen a tremendous shift in the digital age. Along with skill and knowledge in the necessary domain, companies are keen on hiring professionals with strong critical thinking capabilities. This ensures that the employees are able to deal with real-time issues with a practical approach.


Data Mesh and the City Planner

Teradata

Data mesh planning is a lot like city planning, with both city and data mesh planners aiming to provide as much freedom and flexibility as possible to encourage business growth.

Data 52

Trend-Setting Products in Data and Information Management for 2022

DataKitchen

The post Trend-Setting Products in Data and Information Management for 2022 first appeared on DataKitchen.


Cloudera Streaming Analytics 1.6 Release Notes

Cloudera

We are excited to announce the release of Cloudera Streaming Analytics (CSA) 1.6 for CDP Private Cloud Base. With this release, we build on the foundation of 1.4 and 1.5 – with a number of fixes, enhancements, and features. Starting with this release, we now have an aligned release cycle for CSA Community Edition (CE). You can now expect simultaneous releases of CSA for both CE and CDP Private Cloud Base versions.

Java 86

Driving Business Impact for PMs

Speaker: Jon Harmer, Product Manager for Google Cloud

Move from feature factory to customer outcomes and drive impact in your business! This session will provide you with a comprehensive set of tools to help you develop impactful products by shifting from output-based thinking to outcome-based thinking. You will deepen your understanding of your customers and their needs, and learn to identify and de-risk the different kinds of hypotheses built into your roadmap.