Sat. Dec 14, 2024 - Fri. Dec 20, 2024

Top 10 Data & AI Trends for 2025

Towards Data Science

Agentic AI, small data, and the search for value in the age of the unstructured data stack. Image credit: Monte Carlo. According to industry experts, 2024 was destined to be a banner year for generative AI. Operational use cases were rising to the surface, technology was reducing barriers to entry, and general artificial intelligence was obviously right around the corner.

Part 1: A Survey of Analytics Engineering Work at Netflix

Netflix Tech

This article is the first in a multi-part series sharing a breadth of Analytics Engineering work at Netflix, recently presented as part of our annual internal Analytics Engineering conference. We kick off with a few topics focused on how we're empowering Netflix to efficiently produce and effectively deliver high-quality, actionable analytic insights across the company.

Translating Java to Kotlin at Scale

Engineering at Meta

Meta has been on a years-long undertaking to translate our entire Android codebase from Java to Kotlin. Today, despite having one of the largest Android codebases in the world, we’re well past the halfway point and still going. We’re sharing some of the tradeoffs we’ve made to support automating our transition to Kotlin, seemingly simple transformations that are surprisingly tricky, and how we’re collaborating with other companies to capture hundreds more corner cases.

Java 89

Monte Carlo Recognized as the #1 Leader in Data Observability and Data Quality by G2

Monte Carlo

As we turn the corner into 2025, we're excited to announce that for the 7th quarter in a row, Monte Carlo has been named G2's #1 Data Observability Platform, as well as #1 in the Data Quality category. This recognition never gets old because G2 bases its rankings on feedback and insights from real customers who work in these tools every day to add value to their business.

Precision in Motion: Why Process Optimization Is the Future of Manufacturing

Speaker: Jason Chester, Director, Product Management

In today’s manufacturing landscape, staying competitive means moving beyond reactive quality checks and toward real-time, data-driven process control. But what does true manufacturing process optimization look like—and why is it more urgent now than ever? Join Jason Chester in this new, thought-provoking session on how modern manufacturers are rethinking quality operations from the ground up.

Key Takeaways from AWS re:Invent 2024

Cloudera

AWS re:Invent is one of my favorite trade shows. It is one of the biggest technology conferences of the year and is an opportunity to have hundreds of conversations with customers and prospects, listen to their priorities, challenges, and hopes, and give them a Cloudera tote bag or a pair of orange sunglasses. What follows is a collection of just a few things I learned and observed during my week in Las Vegas.

AWS 75

Introducing Configurable Metaflow

Netflix Tech

David J. Berg*, David Casler^, Romain Cledat*, Qian Huang*, Rui Lin*, Nissan Pow*, Nurcan Sonmez*, Shashank Srikanth*, Chaoying Wang*, Regina Wang*, Darin Yu* (*: Model Development Team, Machine Learning Platform; ^: Content Demand Modeling Team). A month ago at QConSF, we showcased how Netflix utilizes Metaflow to power a diverse set of ML and AI use cases, managing thousands of unique Metaflow flows.

More Trending

Redefining AIOps IT Workflows with Legacy System Visibility

Precisely

Key Takeaways: Centralized visibility of data is key. Modern IT environments require comprehensive data for successful AIOps, which includes incorporating data from legacy systems like IBM i and IBM Z into ITOps platforms. Predictive AIOps capabilities will revolutionize IT operations. The shift from reactive to proactive IT operations is driven by AI-powered analysis, automation, and insights.

Systems 59

Cloudera’s Take: What’s in Store for Data and AI in 2025

Cloudera

In the last year, we've seen the explosion of AI in the enterprise, leaving organizations to consider the infrastructure and processes for AI to successfully, and securely, deploy across an organization. As we head into 2025, it's clear that next year will be just as exciting as past years. Here, Cloudera experts share their insights on what to expect in data and AI for the enterprise in 2025.

Cloud Efficiency at Netflix

Netflix Tech

By J Han, Pallavi Phadnis. Context: At Netflix, we use Amazon Web Services (AWS) for our cloud infrastructure needs, such as compute, storage, and networking to build and run the streaming platform that we love. Our ecosystem enables engineering teams to run applications and services at scale, utilizing a mix of open-source and proprietary solutions. In turn, our self-serve platforms allow teams to create and deploy, sometimes custom, workloads more efficiently.

Cloud 89

The Developer Experience Upgrade: From Create React App to Vite

Tweag

We all know how it feels: staring at the terminal while your development server starts up, or watching your CI/CD pipeline crawl through yet another build process. For many React developers using Create React App (CRA), this waiting game has become an unwanted part of the daily routine. While CRA has been the go-to build tool for React applications for years, its aging architecture is increasingly becoming a bottleneck for developer productivity.

Coding 52

Airflow Best Practices for ETL/ELT Pipelines

Speaker: Kenten Danas, Senior Manager, Developer Relations

ETL and ELT are some of the most common data engineering use cases, but can come with challenges like scaling, connectivity to other systems, and dynamically adapting to changing data sources. Airflow is specifically designed for moving and transforming data in ETL/ELT pipelines, and new features in Airflow 3.0 like assets, backfills, and event-driven scheduling make orchestrating ETL/ELT pipelines easier than ever!

The High Price of Poor Address Data: Solutions for Better Business Outcomes

Precisely

Key Takeaways: Poor address data can lead to missed deliveries, incorrect customer information, and wasted resources, negatively impacting overall customer satisfaction, operational efficiency, and profitability. Correcting bad addresses is just the beginning; you then need to connect those clean addresses to other valuable data points to unlock real value.

Telco Enterprise Data Platforms: Key Success Factors in Building for an AI Future

Cloudera

Since 5G networks began rolling out commercially in 2019, telecom carriers have faced a wide range of new challenges: managing high-velocity workloads, reducing infrastructure costs, and adopting AI and automation. Because data management is a key variable for overcoming these challenges, carriers are turning to hybrid cloud solutions, which provide the flexibility and scalability needed to adapt to the evolving landscape 5G enables.

Title Launch Observability at Netflix Scale

Netflix Tech

Part 1: Understanding the Challenges. By: Varun Khaitan. With special thanks to my stunning colleagues: Mallika Rao, Esmir Mesic, Hugo Marques. Introduction: At Netflix, we manage over a thousand global content launches each month, backed by billions of dollars in annual investment. Ensuring the success and discoverability of each title across our platform is a top priority, as we aim to connect every story with the right audience to delight our members.

Maximizing Fuel Efficiency with Real-Time Data: A New Era in Airline Operations

Striim

In 2024, the global airline industry is projected to spend $291 billion on fuel, making it one of the most significant expenses for airlines. Inefficient fuel management not only drives up operational costs but also hampers environmental targets. However, optimizing fuel usage is complex, often hindered by limited real-time monitoring, which can lead to unnecessary waste due to inefficient routes, weather adjustments, excess weight, and outdated practices.

What's New in Apache Airflow 3.0 - And How Will It Reshape Your Data Workflows?

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Using JSpecify 1.0 to Tame Nulls in Java by Magnus Smith

Scott Logic

Introduction: In the Java ecosystem, dealing with null values has always been a source of confusion and bugs. A null value can represent various states: the absence of a value, an uninitialized object, or even an error. However, there has never been a consistent, standardized approach for annotating and ensuring null-safety at the language level. Nullability annotations like @Nullable and @NonNull are often used, but they're not part of the core Java language, leading to inconsistencies across lib

Java 52

Women Leaders in Technology: A Conversation with Cloudera CMO, Mary Wells

Cloudera

It's no secret that women have long been underrepresented in the tech space. This issue demands our attention, as it not only limits opportunities for women to work, grow, and thrive but also hinders companies in their pursuit of top talent. Although global organizations, policies, and programs to address this issue have gained momentum in recent years, there's work still left to do.

File Archival in Snowflake: Snowpark-Powered Solution

Cloudyard

Read Time: 2 Minutes, 38 Seconds. In data-driven organizations, File Archival in Snowflake: A Snowpark-Powered Solution has become a game-changer. Handling feed files in data pipelines is a critical task for many organizations. These files, often stored in stages such as Amazon S3 or Snowflake internal stages, are the backbone of data ingestion workflows.

Retail 52

How GenAI is Transforming Quality Control and Safety in the F&B Industry.

RandomTrees

The food and beverage (F&B) sector is constantly under pressure to meet strict food safety requirements while also ensuring that operations run efficiently. In light of rapid changes in consumer demand, policies, and supply chain management, there is an urgent need to utilize new technologies. Generative AI (GenAI), an area of artificial intelligence, is enhancing the automation of quality control processes, thereby increasing the safety and efficiency of the industry.

Food 52

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Queues in Apache Kafka®: Enhancing Message Processing and Scalability

Confluent

Queue support in Apache Kafka 4.0, enabled by share groups, lets you accommodate traditional queue-type workloads through cooperative consumption.

Kafka 136

Secure External Access to Unity Catalog Assets via Open APIs

databricks

We're excited to announce the Public Preview of credential vending for Unity Catalog's open APIs, allowing external clients to securely access Unity Catalog.

Designing a Declarative Data Stack: From Theory to Practice

Simon Späti

What started as a straightforward implementation guide for a declarative data stack quickly evolved into something more fundamental. While attempting to build a system that could define an entire data stack through a single YAML file, I encountered architectural questions that challenged my initial assumptions: Should we generate production-ready code from templates or create a boilerplate repository with best-in-class tools?

Designing 130

How to reference a seed from a different dbt project?

Start Data Engineering

1. Introduction 2. Ways to reuse seed data across multiple dbt projects 2.1. Code setup 2.1.1. Prerequisites 2.1.2. Setup project environment 2.2. Turn the source repo into a dbt package 2.2.1. Define package version in dbt_project.yml 2.2.2. Store your package for other dbt projects to reference 2.3. Use project dependencies (dbt enterprise only) 2.4.

Project 130

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

Schema Evolution with View Refresh in Snowflake

Cloudyard

Read Time: 2 Minutes, 57 Seconds. In fast-paced data environments, schemas evolve frequently to meet new business requirements. One of the common challenges in managing database views is ensuring they stay in sync with the underlying table schema. For example, when new columns are added to a table, the corresponding view might not automatically reflect these changes, leading to errors or incomplete data in downstream processes.

Retail 52

Introducing Git Support for Queries in Databricks

databricks

We're excited to announce the Public Preview of Query Git integration as part of the new SQL Editor. Git support for queries.

SQL 126

Semantic Layer and AI: The Future of Data Querying with Natural Language

Simon Späti

Data-driven decision-making is crucial for business success, but organizations face a growing challenge of complexity and data governance. These challenges make it difficult to access data in a unified way. In Part 1 , we explored the semantic layer through the lens of MVC, and in Part 2 , we outlined its benefits. In this final piece of the series, we examine the integration of a semantic layer with artificial intelligence and why it might be the best place to start with GenAI.

How to Get Addicted to Machine Learning

KDnuggets

A simple guide to getting hooked on machine learning and building a successful career in the field.

What’s New in Apache Airflow® 3.0—And How Will It Reshape Your Data Workflows?

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Exploring the Potential of Graph Neural Networks to Transform Recommendations at Zalando

Zalando Engineering

Recommender systems are vital for personalizing user experiences across various platforms. At Zalando, these systems play a crucial role in tailoring content to individual users, thereby enhancing engagement and satisfaction. This is particularly important for Zalando Homepage, which serves as the customers' first impression of the company. Our current recommendation system employed on the Home page excels by leveraging user-content interactions and optimizing for predicted click through rate (C

Benchmarking Domain Intelligence

databricks

Large language models are improving rapidly; to date, this improvement has largely been measured via academic benchmarks. These benchmarks, such as MMLU and.

Integrating Microservices with Confluent Cloud Using Micronaut® Framework

Confluent

Real-time data streaming and messaging are essential for building scalable, resilient, event-driven microservices. Explore integrating the Micronaut framework with Confluent Cloud.

Cloud 115

How to Use Docker for Local Development Environments

KDnuggets

Using Docker for local development brings stability, flexibility, and ease of management to the environment, no matter what operating system you're using. Learn how to use Docker on Windows, Linux, and macOS to simplify your development setup, from creating your first container to managing complex environments with Docker Compose.

Systems 118

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You’ll learn how to: create a standardized process for debugging to quickly diagnose errors in your DAGs; identify common issues with DAGs, tasks, and connections; distinguish between Airflow-relate