Sat.Nov 02, 2024 - Fri.Nov 08, 2024

article thumbnail

The Race For Data Quality in a Medallion Architecture

DataKitchen

The Race For Data Quality In A Medallion Architecture The Medallion architecture pattern is gaining traction among data teams. It is a layered approach to managing and transforming data. The Medallion architecture is a design pattern that helps data teams organize data processing and storage into three distinct layers, often called Bronze, Silver, and Gold.

article thumbnail

What Is AWS DMS And Why You Shouldn’t Use It As An ELT

Seattle Data Guy

Recently, I’ve encountered a few projects that used AWS DMS, which is almost like an ELT solution. Whether it was moving data from a local database instance to S3 or some other data storage layer. It was interesting to see AWS DMS used in this manner. But it’s not what DMS was built for. As… Read more The post What Is AWS DMS And Why You Shouldn’t Use It As An ELT appeared first on Seattle Data Guy.

AWS 130
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Data News — Week 24.45

Christophe Blefari

Métro-boulot-dodo ( credits ) It's Data News time. Time really flies on my side, and apart from the bad news from across the Atlantic, all is well on my side. To be honest, I miss you folks. Writing here has been my little thing for the last 3 years and because I haven't been able to get back to my previous frequency since July, I feel empty every Friday.

Data 130
article thumbnail

BI-as-Code and the New Era of GenBI

Simon Späti

BI-as-Code and the New Era of GenBI Imagine creating business dashboards by simply describing what you want to see. No more clicking through complex interfaces or writing SQL queries - just have a conversation with AI about your data needs. This is the promise of Generative Business Intelligence (GenBI). At its core, GenBI delivers an unreasonably effective human interface , where we iterate quickly, based on BI-as-Code.

BI 130
article thumbnail

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

article thumbnail

Best No-Code LLM App Builders

KDnuggets

Build an LLM application by easily picking and dropping components and connecting them, such as a vector store, web search, memory, and custom prompt.

Coding 149
article thumbnail

9 Must-Watch Videos for Aspiring Data Leaders: Bridging Tech and Business for Data Team Success

Seattle Data Guy

Leading data teams can be challenging. You’ve got management and non-technical teams constantly reaching out with ad-hoc data requests; you’re likely trying to figure out what tools will work best and not blow the bank. Not to mention, you’ve got to bridge the gap between business and technology. All while trying to grow your data… Read more The post 9 Must-Watch Videos for Aspiring Data Leaders: Bridging Tech and Business for Data Team Success appeared first on Seattle D

Banking 130

More Trending

article thumbnail

What’s new in ArcGIS Data Interoperability at Pro 3.4

ArcGIS

An overview of all the enhancements and improves with ArcGIS Data Interoperability with the latest release of ArcGIS Pro at version 3.4.

Data 103
article thumbnail

7 Python Projects to Boost Your Data Science Portfolio

KDnuggets

Enhance your data science portfolio with these seven engaging Python projects that demonstrate essential programming and software engineering skills.

Portfolio 119
article thumbnail

Calling All Builders: Get Hands-On With AI and Apps

Snowflake

You’ve heard about Snowflake’s new capabilities, our fresh products and innovations that help bring AI and apps to life. Now, it’s time to BUILD. Join us for BUILD 2024, a three-day global virtual conference taking place Nov. 12-15, to hear major Snowflake product announcements firsthand and to learn how to build with our latest innovations through dozens of technical sessions and hands-on labs.

article thumbnail

What’s New in AI/BI Dashboards - Fall ‘24

databricks

Introduction Databricks AI/BI Dashboards have made significant strides since we announced their General Availability. Built on Databricks SQL and powered by Data Intelligence.

BI 112
article thumbnail

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

article thumbnail

Meet Michelle Hoover, Cloudera’s new SVP of Global Alliances and Channels

Cloudera

Cloudera’s partner ecosystem delivers best-of-breed technology solutions to joint customers from the biggest names in the industry and is a core pillar of the company’s growth strategy. Cloudera is committed to fostering collaboration with partners, growing relationships, and innovating for the future. To elevate Cloudera’s partner ecosystem, the company recently announced the promotion of Michelle Hoover to Senior Vice President of Global Alliances & Channels.

article thumbnail

Roadmap for Becoming a Data Scientist

KDnuggets

From learning Python to creating analytical reports, learn about ten easy steps to become a data scientist.

Python 137
article thumbnail

Gen AI in Action: Customers’ Cortex AI Stories and Outcomes

Snowflake

For years, companies have operated under the prevailing notion that AI is reserved only for the corporate giants — the ones with the resources to make it work for them. But as technology speeds forward, organizations of all sizes are realizing that generative AI isn’t just aspirational: It’s accessible and applicable now. With Snowflake’s easy-to-use, unified AI and data platform, businesses are removing the manual drudgery, bottlenecks and error-prone labor that stymie productivity, and are usi

article thumbnail

Season's Speedings: Databricks SQL Delivers 4x Performance Boost Over Two Years

databricks

As the season of giving approaches, we at Databricks have been making our list and checking it twice--but instead of toys and treats.

SQL 96
article thumbnail

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

article thumbnail

Unlocking Faster Insights: How Cloudera and Cohere can deliver Smarter Document Analysis

Cloudera

Today we are excited to announce the release of a new Cloudera Accelerator for Machine Learning (ML) Projects (AMP) for PDF document analysis, “ Document Analysis with Command R and FAISS ”, leveraging Cohere’s Command R Large Language Model (LLM), the Cohere Toolkit for retrieval augmented generation (RAG) applications, and Facebook’s AI Similarity Search (FAISS).

article thumbnail

Navigating AI Regulation: Balancing Innovation and Protection

KDnuggets

In this article, we will learn how to navigate the fine balance building AI regulation while simultaneously fostering innovation.

Building 123
article thumbnail

Introducing Apache Kafka® 3.9

Confluent

Apache Kafka 3.9 includes multiple KIPs covering Kafka Core, Connect, and Streams—adding dynamic KRaft quorums, better ZK migration, Tiered Storage improvements & more.

Kafka 69
article thumbnail

Adopting Spark Connect

Towards Data Science

How we use a shared Spark server to make our Spark infrastructure more efficient Image by Kanenori from Pixabay Spark Connect is a relatively new component in the Spark ecosystem that allows thin clients to run Spark applications on a remote Spark cluster. This technology can offer some benefits to Spark applications that use the DataFrame API. Spark has long allowed to run SQL queries on a remote Thrift JDBC server.

Scala 70
article thumbnail

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

article thumbnail

What's new with Databricks SQL, October 2024

databricks

We are excited to share the latest features and performance improvements that make Databricks SQL simpler, faster, and more affordable than ever. Databricks.

SQL 74
article thumbnail

Mastering f-strings in Python

KDnuggets

Discover how to leverage Python's f-strings (formatted string literals) to write cleaner, more efficient, and more readable code.

Python 115
article thumbnail

Data Engineering Weekly #196

Data Engineering Weekly

Foundation Capital: A System of Agents brings Service-as-Software to life software is no longer simply a tool for organizing work; software becomes the worker itself, capable of understanding, executing, and improving upon traditionally human-delivered services. The author narrates that multiple agents working together achieve better results than one.

article thumbnail

Ransomware Attacks: 3 Keys to Resilience for Your IBM i Systems

Precisely

Key Takeaways: In the face of ransomware attacks, a resilience strategy for IBM i systems must include measures for prevention, detection, and recovery. Built-in security features and enterprise-wide security operations help create a robust defense against ransomware. AI-driven tools are emerging to help you combat these attacks more efficiently and effectively.

Systems 59
article thumbnail

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

article thumbnail

Loading data into Redshift with DBT

Yelp Engineering

At Yelp, we embrace innovation and thrive on exploring new possibilities. With our consumers’ ever growing appetite for data, we recently revisited how we could load data into Redshift more efficiently. In this blog post, we explore how DBT can be used seamlessly with Redshift Spectrum to read data from Data Lake into Redshift to significantly reduce runtime, resolve data quality issues, and improve developer productivity.

article thumbnail

5 No-Cost Learning Resources for LLM Agents

KDnuggets

Curious about LLM agents? Here’s a list of free courses, guides, and blogs that make it easy to start learning and stay updated.

IT 111
article thumbnail

Discover the Future of Data Streaming with Confluent at AWS re:Invent 2024

Confluent

Join Confluent at AWS re:Invent 2024 to learn how to stream, connect, process, and govern data, unlocking its full potential. Visit our booth for demos, sessions, and more.

AWS 59
article thumbnail

Women on Wednesday with Sierra Weltha

Precisely

According to CIO magazine , “diversity is critical to IT performance. Diverse teams perform better, hire better talent, have more engaged members, and retain workers better than those that don’t focus on diversity.” And while more women are joining the technology industry, the fact remains that it’s important to continue closing the gender gap for these reasons.

article thumbnail

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

article thumbnail

Why Open Table Format Architecture is Essential for Modern Data Systems

phData: Data Engineering

The world we live in today presents larger datasets, more complex data, and diverse needs, all of which call for efficient, scalable data systems. Though basic and easy to use, traditional table storage formats struggle to keep up. Open Table Format (OTF) architecture now provides a solution for efficient data storage, management, and processing while ensuring compatibility across different platforms.

article thumbnail

Language Models Explained in 5 Minutes

KDnuggets

Familiarize yourself with the technology behind ChatGPT and Google Gemini in the time it takes to enjoy a cup of coffee.

article thumbnail

The “Gold-Rush Paradox” in Data: Why Your KPIs Need a Rethink

Towards Data Science

You’re not doing as good a job as you think you are Continue reading on Towards Data Science »

article thumbnail

2025 Planning Insights: Data Quality Remains the Top Data Integrity Challenge and Priority

Precisely

Key Takeaways: Data quality is the top challenge impacting data integrity – cited as such by 64% of organizations. Data trust is impacted by data quality issues, with 67% of organizations saying they don’t completely trust their data used for decision-making. Data quality is the top data integrity priority in 2024, cited by 60% of respondents. The 2025 Outlook: Data Integrity Trends and Insights report is here!

article thumbnail

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.