Trending Articles

Data News — Week 24.30

Christophe Blefari

Tallinn (credits). Dear members, it's Summer Data News, the only news you can consume by the pool, at the beach or at the office, if you're not lucky. This week, I'm writing from the Baltics, nomading a bit in Eastern and Northern Europe. I'm pleased to announce that we have successfully closed the CfP for Forward Data Conf. We received nearly 100 submissions, and the program committee is currently reviewing them.

MySQL 130

PyArrow vs Polars (vs DuckDB) for Data Pipelines.

Confessions of a Data Guy

I’ve had something rattling around in the old noggin for a while; it’s just another strange idea that I can’t quite shake out. We all keep hearing about Arrow this and Arrow that … it seems every new tool built today for Data Engineering is at least partly based on Arrow’s in-memory format. So, […]

Snowflake Cortex Search: State-of-the-Art Hybrid Search for RAG Applications

Snowflake

Snowflake Cortex Search, a fully managed search service for documents and other unstructured data, is now in public preview. With Cortex Search, organizations can effortlessly deploy retrieval-augmented generation (RAG) applications with Snowflake, powering use cases like customer service, financial research and sales chatbots. Cortex Search offers state-of-the-art semantic and lexical search over your text data in Snowflake behind an intuitive user interface, and it comes with the robust securi

Data Engineering Weekly #181

Data Engineering Weekly

Editor’s Note: A New Series on Data Engineering Tools Evaluation. There are plenty of data tools and vendors in the industry, but how can we choose a tool for a specific need? The traditional approach of running a PoC on every shortlisted vendor tool is time-consuming and practically unviable for growth-driven companies. Data Engineering Weekly is launching a new series on software evaluation focused on data engineering to better guide data engineering leaders in evaluating data tools.

The AI Superhero Approach to Product Management

Speaker: Conrado Morlan

In this engaging and witty talk, we’ll explore how artificial intelligence can transform the daily tasks of product managers into streamlined, efficient processes. Using the lens of a superhero narrative, we’ll uncover how AI can be the ultimate sidekick, aiding in decision-making, enhancing productivity, and boosting innovation. Attendees will leave with practical tools and actionable insights, motivated to embrace AI and leverage its potential in their work. 🦸 🏢 Key objectives:

A New Standard in Open Source AI: Meta Llama 3.1 on Databricks

databricks

We are excited to partner with Meta to release the Llama 3.1 series of models on Databricks, further advancing the standard of powerful.

Bayesian Thinking in Modern Data Science

KDnuggets

Discover how Bayesian thinking transforms decision-making with its unique approach to updating initial beliefs with new evidence.

More Trending

Getting the Most From Your Modern Data Platform: A Three-Phase Approach

Snowflake

A robust, modern data platform is the starting point for your organization’s data and analytics vision. At first, you may use your modern data platform as a single source of truth to realize operational gains — but you can realize far greater benefits by adding additional use cases. In this blog, we offer guidance for leveraging Snowflake’s capabilities around data and AI to build apps and unlock innovation.

Zero Downtime Upgrades – Redefining Your Platform Upgrade Experience

Cloudera

Cloudera recently unveiled the latest version of Cloudera Private Cloud Base with the Zero Downtime Upgrade (ZDU) feature to enhance your user experience. The goal of ZDU is to make upgrades simpler for you and your stakeholders by increasing the availability of Cloudera’s services. How Do You Keep IT Infrastructure (and Buses) Running and Avoid Downtime?

Databricks on Databricks: Kicking off the Journey to Governance with Unity Catalog

databricks

In this blog, we are excited to share Databricks's journey in migrating to Unity Catalog for enhanced data governance. We'll discuss our high-level strategy and the tools we developed to facilitate the migration. Our goal is to highlight the benefits of Unity Catalog and make you feel confident about transitioning to it.

5 Tools Every Data Scientist Needs in Their Toolbox in 2024

KDnuggets

From the soft tools to the hard tools, these are what make a data scientist successful.

Data 127

Provide Real Value in Your Applications with Data and Analytics

The complexity of financial data, the need for real-time insight, and the demand for user-friendly visualizations can seem daunting when it comes to analytics - but there is an easier way. With Logi Symphony, we aim to turn these challenges into opportunities. Our platform empowers you to seamlessly integrate advanced data analytics, generative AI, data visualization, and pixel-perfect reporting into your applications, transforming raw data into actionable insights.

Celebrating Empowerment: Robinhood Market’s Women in Tech Conference 2024

Robinhood

Robinhood was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood is lowering barriers and providing greater access to financial information and investing. Together, we are building products and services that help create a financial system everyone can participate in. … Recently, Robinhood Markets hosted its highly anticipated Annual Women in Tech (WIT) Conference, a day-long event designed to empower and inspi

Finance 73

How Snowflake Accelerates Business Growth for Providers of Data, Apps and AI Products 

Snowflake

Let’s say you are building a house that you plan to put up for sale. You focus on an amazing design, beautiful entry, large windows for plenty of sunlight — things that will create a delightful experience for your future buyer. At the same time, the house also needs less glamorous but vitally important infrastructure, like plumbing, running water, electricity, heating, cooling and so on.

Beyond Automation: Unveiling the True Essence of BDD by Xin Chen

Scott Logic

Many organisations claim they are applying Behaviour-Driven Development (BDD). When you discuss BDD with them, they often present numerous feature files as evidence of their adoption of the BDD methodology. These days, some still believe that writing a test case in the Given-When-Then format constitutes BDD, or that BDD is synonymous with a test automation framework.

Introducing Mosaic AI Model Training for Fine-Tuning GenAI Models

databricks

Today, we're thrilled to announce that Mosaic AI Model Training's support for fine-tuning GenAI models is now available in Public Preview. At Databricks.

Entity Resolution: Your Guide to Deciding Whether to Build It or Buy It

Adding high-quality entity resolution capabilities to enterprise applications, services, data fabrics or data pipelines can be daunting and expensive. Organizations often invest millions of dollars and years of effort to achieve subpar results. This guide will walk you through the requirements and challenges of implementing entity resolution. By the end, you'll understand what to look for, the most common mistakes and pitfalls to avoid, and your options.

Visualizing Data: A Statology Primer

KDnuggets

This collection of tutorials from our sister site Statology centers on data visualization. Learn more about visualizing your data right here.

Data 102

Introducing Joint Investing Accounts at Robinhood

Robinhood

Today, we are excited to launch joint investing accounts, which allow customers to seamlessly manage investments with their partner while keeping their shared assets in one place. Joint accounts make investing more collaborative for families and loved ones, providing shared access for account holders that allows them to combine funds and increase their investment power as they work towards their financial goals.

Banking 66

Accelerate your data streaming journey with the latest in Confluent Cloud

Confluent

CC 2024 Q2 adds Flink Private Networking (AWS), Flink SQL Interactive Tables; Enterprise:Connect w/Confluent, Connector Custom Offsets; SI: Build w/Confluent, etc.

Cloud 63

Snowflake Cortex AI Launches Cortex Guard to Implement LLM Safeguards

Snowflake

Over the last year, as Snowflake has focused on putting AI tools in the hands of its customers, we have prioritized easy, efficient and safe enterprise generative AI. With that in mind, we’re happy to announce the general availability of safety guardrails for Snowflake Cortex AI with Cortex Guard, a new feature that enables enterprises to easily implement safeguards that filter out potentially inappropriate or unsafe large language model (LLM) responses.

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Speaker: Maher Hanafi, VP of Engineering at Betterworks & Tony Karrer, CTO at Aggregage

Executive leaders and board members are pushing their teams to adopt Generative AI to gain a competitive edge, save money, and otherwise take advantage of the promise of this new era of artificial intelligence. There's no question that it is challenging to figure out where to focus and how to advance when it’s a new field that is evolving every day. 💡 This new webinar featuring Maher Hanafi, VP of Engineering at Betterworks, will explore a practical framework to transform Generative AI pr

Data-Driven Quality: Change the Game with Knowledge Graphs & Generative AI

databricks

Written in collaboration with Navin Sharma and Joe Pindell, Stardog. Across industries, the impact of post-delivery failure costs (recalls, warranty claims, lost goodwill.

Data 82

How to Use Conditional Formatting in Pandas to Enhance Data Visualization

KDnuggets

Tired of staring at bland dataframes? Discover how conditional formatting in Pandas can transform your data visualization experience!

Data 102

Pickup in 3 minutes: Uber’s implementation of Live Activity on iOS

Uber Engineering

From WWDC reveal to delivery, discover how we tackled new tech, design challenges, and tight timelines to enhance rider & driver experiences with Live Activity® from Apple.

Node.js and the tale of worker threads

Zalando Engineering

A disrupted gaming night. I do not usually read code when dealing with production incidents, as it is one of the slower ways to understand and mitigate what is happening. But on that Friday night, I was glad I did. I was about to start another session of Elden Ring (a video game in which everything is pretty much trying to kill the player) when I was paged with the following: "campaign service is consuming all resources we throw at it".

Coding 57

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

How are Apache Iceberg Tables Optimizing Data Lake Management?

Hevo

A data lake is a central storage place for an organization’s data in its original format. Unlike data warehouses, data lakes can handle all kinds of data, including unstructured and semi-structured data like images, video, audio, and documents.

Building Industry IoT and M2M Solutions With Databricks for Communications

databricks

The communications industry is experiencing immense change due to rapid technological advancements and evolving market trends. Communications service providers (CSP) build various solutions.

Learn Data Analysis with Julia

KDnuggets

Set up the environment, load the data, perform data analysis and visualization, and create the data pipeline, all using the Julia programming language.

Organizations’ Machine Learning Investment is (or should be) Incremental

DareData

Embedding ML systems into production is still a hard thing to do (for most companies). Have you ever heard of a company that successfully integrated Machine Learning into their business processes overnight, completely transforming the way the organization operated from one day to the next? Yup, me neither! And did you know that most ML models never make it to production?

Demystifying DAPs: A Practical Guide to Digital Adoption Success

Speaker: Pulkit Agrawal

Digital Adoption Platforms (DAPs) are revolutionizing the way organizations interact with and optimize their software applications. As digital transformation continues to accelerate, DAPs have become essential tools for enhancing user engagement and software efficiency. This session is your guide into the robust world of DAPs, exploring their origins, evolution, and the current trends shaping their development.

Marketing Questions phData Can Answer with Data

phData: Data Engineering

Effective marketing is crucial for business growth, yet achieving cost-effective and impactful results from marketing can be challenging for companies of all sizes. Marketing leaders are tasked with driving results and determining the best course of action for their team by asking questions like: How much should we spend on this new campaign? Should we focus on retaining our customers or trying to find new ones?

Iceberg Architecture Examples: How Iceberg powers data and ML applications

Hevo

In recent years, Apache Iceberg has seen considerable advancements that highlight its growing importance. Major tech companies like Google, Snowflake, and Databricks have increasingly embraced this table format. This trend highlights a transformative shift in the data warehousing landscape as Iceberg gains traction.

A Framework for Multi-Model Forecasting on Databricks

databricks

Introduction: Time series forecasting serves as the foundation for inventory and demand management in most enterprises. Using data from past periods along with.

How to Use the pivot_table Function for Advanced Data Summarization in Pandas

KDnuggets

Let's learn to use Pandas pivot_table in Python to perform advanced data summarization.

Python 112

Deliver Mission Critical Insights in Real Time with Data & Analytics

In the fast-moving manufacturing sector, delivering mission-critical data insights to empower your end users or customers can be a challenge. Traditional BI tools can be cumbersome and difficult to integrate - but it doesn't have to be this way. Logi Symphony offers a powerful and user-friendly solution, allowing you to seamlessly embed self-service analytics, generative AI, data visualization, and pixel-perfect reporting directly into your applications.