Trending Articles

Being Data Driven At Stripe With Trino And Iceberg

Data Engineering Podcast

Summary: Stripe is a company that relies on data to power their products and business. To support that functionality they have invested in Trino and Iceberg for their analytical workloads. In this episode Kevin Liu shares some of the interesting features that they have built by combining those technologies, as well as the challenges that they face in supporting the myriad workloads that are thrown at this layer of their data platform.

Data Lake 130

Unpacking the Latest Streaming Announcements: A Comprehensive Analysis

Jesse Anderson

This video covers the latest announcements from StreamNative, Confluent, and WarpStream. We discuss communication protocols, how they’re used, and what they mean for you. We also discuss the various systems using Kafka’s protocol. Finally, we discuss the announcements about writing to Iceberg and Delta Lake directly from the broker and what that means for costs and operational ease.

Kafka 147

Data News — Week 24.24

Christophe Blefari

Hey 🥹 It's been a long time since I've put words down on paper or hit the keyboard to send bytes across the network. We're in the age of AI, and my lord, computer science has evolved over the last 30 years. I'm writing this edition from my child's home, and it brings back memories. I got my first computer at the age of 6 and spent my days installing Windows 98 over and over again, getting lost between the BIOS and the Windows installation pages, playi

Data 100

Building Open-Source Python Packages – SparklePop

Confessions of a Data Guy

One of the things I love about Python is its flexibility and huge community, a community that puts out a never-ending stream of useful packages for the average Software Engineer. In a show of solidarity to the open-source community, I thought I would publish a PyPI package that will probably be used by 5 people […]

Python 100

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Speaker: Maher Hanafi, VP of Engineering at Betterworks & Tony Karrer, CTO at Aggregage

Executive leaders and board members are pushing their teams to adopt Generative AI to gain a competitive edge, save money, and otherwise take advantage of the promise of this new era of artificial intelligence. There's no question that it is challenging to figure out where to focus and how to advance when it’s a new field that is evolving every day. 💡 This new webinar featuring Maher Hanafi, VP of Engineering at Betterworks, will explore a practical framework to transform Generative AI pr

Introducing Databricks LakeFlow: A unified, intelligent solution for data engineering

databricks

Today, we are excited to announce Databricks LakeFlow, a new solution that contains everything you need to build and operate production data pipelines.

How Meta trains large language models at scale

Engineering at Meta

As we continue to focus our AI research and development on solving increasingly complex problems, one of the most significant and challenging shifts we’ve experienced is the sheer scale of computation required to train large language models (LLMs). Traditionally, our AI model training has involved training a massive number of models that required a comparatively smaller number of GPUs.

Algorithm 114

More Trending

Data Engineering Weekly #175

Data Engineering Weekly

Experience Enterprise-Grade Apache Airflow: Astro augments Airflow with enterprise-grade features to enhance productivity, meet scalability and availability demands across your data pipelines, and more. Cube Research: Crystallizing Snowflake Summit 2024. We should officially call the first week of June the data engineering week, as two major data companies are running their developer conferences.

Top 5 Tips for Styling Published Layers and Maps

ArcGIS

The Living Atlas team publishes a lot of web layers. Here's some of our favorite tips and tricks for customizing your layers and maps.

How FactSet Implemented an Enterprise Generative AI Platform with Databricks and MLflow

databricks

“FactSet’s mission is to empower clients to make data-driven decisions and supercharge their workflows and productivity. To deliver AI-driven solutions across our entire.

Observability in Snowflake: A New Era with Snowflake Trail

Snowflake

Discovering and surfacing telemetry traditionally can be a tedious and challenging process, especially when it comes to pinpointing specific issues for debugging. However, as applications and pipelines grow in complexity, understanding what’s happening beneath the surface becomes increasingly crucial. A lack of visibility hinders the development and maintenance of high-quality applications and pipelines, ultimately impacting customer experience.

Python 99

Leading the Development of Profitable and Sustainable Products

Speaker: Jason Tanner

While growth of software-enabled solutions generates momentum, growth alone is not enough to ensure sustainability. The probability of success dramatically improves with early planning for profitability. A sustainable business model contains a system of interrelated choices made not once but over time. Join this webinar for an iterative approach to ensuring solution, economic and relationship sustainability.

Using SQL with Python: SQLAlchemy and Pandas

KDnuggets

A simple tutorial on how to connect to databases, execute SQL queries, and analyze and visualize data.

SQL 124

Serverless Jupyter Notebooks at Meta

Engineering at Meta

At Meta, Bento, our internal Jupyter notebooks platform, is a popular tool that allows our engineers to mix code, text, and multimedia in a single document. Use cases run the entire spectrum from what we call “lite” workloads that involve simple prototyping to heavier and more complex machine learning workflows. However, even though the lite workflows require limited compute, users still have to go through the same process of reserving and provisioning remote compute – a process that takes time

SQL 91

Data Engineering Weekly #176

Data Engineering Weekly

Experience Enterprise-Grade Apache Airflow: Astro augments Airflow with enterprise-grade features to enhance productivity, meet scalability and availability demands across your data pipelines, and more. Databricks: Open Sourcing Unity Catalog. This week brought many exciting developments, with Snowflake and Databricks announcing open-source catalogs.

Introducing AI/BI: Intelligent Analytics for Real-World Data

databricks

Today, we are excited to announce Databricks AI/BI, a new type of business intelligence product built from the ground up to deeply.

BI 133

Navigating the Future: Generative AI, Application Analytics, and Data

Generative AI is upending the way product developers & end-users alike are interacting with data. Despite the potential of AI, many are left with questions about the future of product development: How will AI impact my business and contribute to its success? What can product managers and developers expect in the future with the widespread adoption of AI?

Ingest Data Faster, Easier and Cost-Effectively with New Connectors and Product Updates

Snowflake

The journey toward achieving a robust data platform that secures all your data in one place can seem like a daunting one. But at Snowflake, we’re committed to making the first step the easiest — with seamless, cost-effective data ingestion to help bring your workloads into the AI Data Cloud with ease. Snowflake is launching native integrations with some of the most popular databases, including PostgreSQL and MySQL.

Understanding Data Privacy in the Age of AI

KDnuggets

Data privacy has been a long-standing issue that continues to challenge the data industry. Let’s understand how rapid developments in the world of AI have elevated data privacy concerns.

Data 88

Setting a Geoprocessing Extent Just Got Better in ArcGIS Pro 3.3

ArcGIS

Sketch an extent on your map and choose between more new features with the Processing Extent control in ArcGIS Pro 3.3!

Process 108

Safety First: A Conversation Between Robinhood’s Security Team Leaders

Robinhood

Robinhood was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood is lowering barriers and providing greater access to financial information and investing. Together, we are building products and services that help create a financial system everyone can participate in. … The Robinhood team is incredibly excited to welcome Katelyn Perna as Crypto Chief Information Security Officer.

Finance 80

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enhance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

Mosaic AI: Build and deploy production-quality Compound AI Systems

databricks

Over the last year, we have seen a surge of commercial and open-source foundation models showing strong reasoning abilities on general knowledge tasks.

Systems 117

Accelerate Development and Productivity with DevOps in Snowflake 

Snowflake

Today’s data-driven world requires an agile approach. Modern data teams are constantly under pressure to deliver innovative solutions faster than ever before. Fragmented tooling across data engineering, application development and AI/ML development creates a significant bottleneck, hindering the speed of value delivery required to stay competitive.

Python 91

5 Free University Courses to Learn Coding for Data Science

KDnuggets

Learn programming for free from top-tier universities like Harvard and MIT.

Building Change Detection in the Region of Cataluña

ArcGIS

Revolutionizing GIS: Streamlining Change Detection for Mapping Agencies.

Building 121

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Where Does Data Governance Fit Into Hybrid Cloud?

Cloudera

At a time when artificial intelligence (AI) and tools like generative AI (GenAI) and large language models (LLMs) have exploded in popularity, getting the most out of organizational data is critical to driving business value and carving out a competitive market advantage. To reach that goal, more businesses are turning toward hybrid cloud infrastructure – with data on-premises, in the cloud, or both – as a means to tap into valuable data.

Open Sourcing Unity Catalog

databricks

We are excited to announce that we are open sourcing Unity Catalog, the industry’s first open source catalog for data and AI governance.

Snowflake ML Now Supports Expanded MLOps Capabilities for Streamlined Management of Features and Models 

Snowflake

Bringing machine learning (ML) models into production is often hindered by fragmented MLOps processes that are difficult to scale with the underlying data. Many enterprises stitch together a complex mix of various MLOps tools to build an end-to-end ML pipeline. The friction of having to set up and manage separate environments for features and models creates operational complexity that can be costly to maintain and difficult to use.

Step-by-Step Tutorial to Building Your First Machine Learning Model

KDnuggets

Building a machine learning model is an exciting project. Learn how to develop your first model that a company would want to use.

Reimagined: Building Products with Generative AI

“Reimagined: Building Products with Generative AI” is an extensive guide for integrating generative AI into product strategy and careers featuring over 150 real-world examples, 30 case studies, and 20+ frameworks, and endorsed by over 20 leading AI and product executives, inventors, entrepreneurs, and researchers.

Unlocking the power of mixed reality devices with MobileConfig

Engineering at Meta

MobileConfig enables developers to centrally manage a mobile app’s configuration parameters in our data centers. Once a parameter value is changed on our central server, billions of app devices automatically fetch and apply the new value without app updates. These remotely managed configuration parameters serve various purposes such as A/B testing, feature rollout, and app personalization.

Java 74

Robinhood and IVMF Bring Retirement Education to Veteran Entrepreneurs 

Robinhood

Sessions kicked off in Las Vegas on April 29th and Chicago on May 1st. Robinhood Markets, Inc. has partnered with Syracuse University’s D’Aniello Institute for Veterans and Military Families (IVMF) to bring retirement education workshops to entrepreneurs across the U.S. We’re honored to partner with an organization helping veterans and veteran family members launch and grow their own businesses.

What’s New with Databricks Unity Catalog at Data + AI Summit 2024

databricks

In an era marked by rapid advancements in artificial intelligence and an explosion of data and Gen AI tools, enterprises face fragmented data.

Data 100

Streamline Operations and Empower Business Teams to Unlock Unstructured Data with Document AI 

Snowflake

It is estimated that between 80% and 90% of the world’s data is unstructured, with text files and documents making up a significant portion. Every day, countless text-based documents, like contracts and insurance claims, are stored for safekeeping. Despite containing a wealth of insights, this vast trove of information often remains untapped, as the process of extracting relevant data from these documents is challenging, tedious and time-consuming.

How To Get Promoted In Product Management

Speaker: John Mansour

If you're looking to advance your career in product management, there are more options than just climbing the management ladder. Join our upcoming webinar to learn about highly rewarding career paths that don't involve management responsibilities. We'll cover both career tracks and provide tips on how to position yourself for success in the one that's right for you.