Sat. May 17, 2025 - Fri. May 23, 2025


From Python to AI Engineer: A Self-Study Roadmap

KDnuggets

A practical roadmap for Python programmers to develop the advanced skills, specialized knowledge, and engineering mindset needed to become successful AI engineers in 2025.

Python 133

Configure, Don't Code: How Declarative Data Stacks Enable Enterprise Scale

Simon Späti

Imagine building enterprise data infrastructure where you write 90% less code but deliver twice the value. This is the promise of declarative data stacks. The open and modern data stack freed us from vendor lock-in, allowing teams to select best-of-breed tools for ingestion, ETL, and orchestration. But this freedom comes at a cost: fragmented governance, security gaps, and potential technical debt when stacking disconnected tools across your organization.

Coding 130

When the Model Isn’t the Problem: How Data Gaps Undermine AI Systems

Monte Carlo

AI quality issues are on the rise and data + AI leaders are just beginning to feel the pain. One of the most common perpetrators? Data quality issues. At Monte Carlo, we're no strangers to the impact of data quality, particularly at the scale and complexity of AI applications. However, we recently experienced that impact first-hand, and learned a valuable lesson about the nature of AI reliability in the process.

Systems 52

How I Broke Our SLA and Delighted Our Customer

DataKitchen

I broke one of our most critical SLAs just last week, and it was the best thing that could have happened. It was shaping up to be a major embarrassment. One of our key data warehouse refreshes had failed. No new data. No dashboard updates. The refresh was long past its deadline, the project's key data engineer was on vacation, and I was playing backup.


A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You’ll learn how to: create a standardized process for debugging to quickly diagnose errors in your DAGs; identify common issues with DAGs, tasks, and connections; distinguish between Airflow-relate


Big data and forecasting solutions

InData Labs

Today, the world, its businesses and organizations, and their processes rely on big data and forecasting more than ever. According to the latest Data Age report from IDC, the global datasphere may reach 175 zettabytes (ZB) by the end of 2025. The increased prevalence of cloud-based data storage, Internet of Things-connected.


GenAI Applications - AI Sales Assistant

DareData

Sales results will never be the same after GenAI. In today's fast-paced sales environment, efficiency is the name of the game. Sales consultants are constantly juggling multiple clients, struggling to condense crucial information, and spending valuable time crafting personalized email offers. Worse, the majority of Sales Developers spend a staggeringly low amount of time selling.

More Trending


Report: Why IT Leaders Are Prioritizing Data Streaming Platforms

Confluent

Discover key insights from the 2025 Data Streaming Report, including why IT leaders are prioritizing data streaming platforms to unlock the true value of their data.

IT 45

Building a Foundation of Trust: How to Improve the Quality of Your Critical Data

Precisely

Key Takeaways: Inaccurate data undermines analytics, drives up costs, and damages customer trust. Implement core processes like validation, enrichment, entity resolution, and reconciliation to reduce risk and improve operational efficiency. For long-term success, build a data quality strategy that combines the right tools with clear goals, ownership, and metrics.


How Jacopo Tagliabue is Cutting Data Pipeline Latency with Fast Functions

Striim

What if your data pipeline could run 10x faster without the overhead? Jacopo Tagliabue, CTO of Bauplan and NYU adjunct professor, is pushing the boundaries of data infrastructure with lightweight DAGs, Apache Arrow, and a radically different take on functions as a service. In this episode, he breaks down the tech stack behind Bauplan and why the future of scalable data pipelines is all about speed, modularity, and zero-copy design.


Surprising Things You Can Do with Python’s csv Module

KDnuggets

Think it's just for reading simple tables? See what else you can do with this Python standard library module.

Python 117

What’s New in Apache Airflow® 3.0—And How Will It Reshape Your Data Workflows?

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.


Extending the Malbec subsea cable to Southern Brazil

Engineering at Meta

Meta is partnering with V.tal to extend the Malbec subsea cable to Porto Alegre, Brazil by 2027. With this new extension, Malbec will become the first subsea cable to land in the state of Rio Grande do Sul, bringing more connectivity to millions of people in Southern Brazil and neighboring countries. Malbec will improve the scale and reliability of digital infrastructure in Porto Alegre, establishing it as a digital hub and improving online experiences across Southern Brazil, Argentina, Chile, P


Lakeflow Connect: Efficient and Easy Data Ingestion using the SQL Server connector

databricks

Complexities of Extracting SQL Server Data: While digital native companies recognize AI's critical role in driving innovation, many still face challenges in making their data


FM-Intent: Predicting User Session Intent with Hierarchical Multi-Task Learning

Netflix Tech

Authors: Sejoon Oh, Moumita Bhattacharya, Yesu Feng, Sudarshan Lamkhede, Ko-Jen Hsiao, and Justin Basilico. Motivation: Recommender systems have become essential components of digital services across e-commerce, streaming media, and social networks [1, 2]. At Netflix, these systems drive significant product and business impact by connecting members with relevant content at the right time [3, 4].


Top 7 Python Frameworks for AI Agents

KDnuggets

Design, test, and deploy multi-agent systems in hours using powerful agentic frameworks.

Python 138

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.


The Regulatory Wave Is Here—and Building. Are You Ready for the Impact?

Teradata

Navigate the evolving regulatory landscape with smarter data strategies to ensure compliance, mitigate risks, and turn regulatory challenges into opportunities.


Adopting Databricks and Unity Catalog Governance to Support ITGC Compliance

databricks

Introduction: The Sarbanes-Oxley Act of 2002 (SOX) is a U.S.


How Snowflake’s Data Collaboration Will Help Make the 2028 Olympics and Paralympics the Most Data-Driven Ever

Snowflake

Consider the numbers: 23, 9.63, 368. On the surface, they are but simple data points: the most Olympic gold medals won by a single individual (Hint: He's a Team USA athlete!); the winning time, in seconds, of the men's 100-meter dash in London; the world record, in pounds, a female Paralympic powerlifter has ever bench-pressed. But behind every number there is a story of hardship, perseverance and achievement.

Data 105

Choosing the Right Machine Learning Algorithm: A Decision Tree Approach

KDnuggets

With so many different machine learning algorithms to choose from, this guide is designed to help you navigate toward the right one, depending on your data and the problem to address.


How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri


Telco’s Role in Digital Competitiveness, and the AI Imperative

Teradata

AI-native telcos embed AI deeply to drive decisions, boost productivity, and lead digital growth—not just efficiency, but innovation and market expansion.

52

Register now and save 50% on training at Data + AI Summit

databricks

This year, Databricks Training and Certification returns to the Data + AI Summit in San Francisco from June 9-12, with an expanded program featuring even


5 Steps to Building With AI: What It Can Do Reliably (and How to Start)

Confluent

Use the R.I.C.E. framework to evaluate and launch your first AI project. Learn how to balance reach, impact, confidence, and effort for better outcomes.

IT 45

Run Python in Your Browser with PyScript: A Beginner’s Guide

KDnuggets

You don't need any additional setup to run a Python web application.

Python 107

The Ultimate Guide to Apache Airflow DAGs

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to; write DAGs that adapt to your data at runtime and set up alerts and notifications; scale you


Enhancing support for Parquet files

ArcGIS

Announcing enhanced support for Parquet files in the ArcGIS Pro 3.5 release.


Introducing next-level identity security at Databricks

databricks

The future of identity security is passwordless, context-aware, and frictionless, and we're continuing to build toward that future at Databricks.


Confluent Releases Managed V2 Connector for Apache Kafka® for Azure Cosmos DB

Confluent

Learn how to deliver real-time event streams to Azure Cosmos DB for low-latency apps, live analytics, and seamless cloud-native integration.

Kafka 40

The 3 Horizons of LLM Evolution

KDnuggets

The shift from native LLMs (2018) to LLM agents (2025) has enabled AI to move beyond static knowledge, integrating retrieval, reasoning, and real-world interaction for autonomous problem-solving.

117

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!


Game Changer: How Snowflake's Data Collaboration Will Make the 2028 LA Olympics and Paralympics the Most Data-Driven Games Ever

Snowflake

239.

52

Denny’s top session picks for Data + AI Summit 2025

databricks

Data + AI Summit 2025 is just a few weeks away!

Data 75

Bridging the AI Valley of Doubt by David Rees

Scott Logic

I recently attended the thought-provoking AI Ethics, Risks and Safety Conference 2025. It featured presentations from the UK government's Department for Science, Innovation and Technology (DSIT), the Alan Turing Institute, Mind Foundry and Google. The talks focused on trying to stimulate economic growth by building competence and confidence for UK businesses looking to adopt AI, the balance of human-AI collaboration and the wider societal and environmental impacts.


The Sun is Setting on PowerCenter Support: What’s Next?

KDnuggets

As standard PowerCenter support winds down, the path forward requires careful consideration of your organization's specific needs and constraints.

120

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m