November, 2023

article thumbnail

Use Data Enrichment to Supercharge AI

Precisely

AI transforms how we interact with technology, make decisions, and solve complex problems. It has been at the heart of many innovations over the past two years, powering everything from the chatbots that enhance our customer experiences to the predictive analytics engines that help us make financial decisions. What defines a successful AI initiative, and how can your organization ensure that your investments and hard work deliver maximum value for your organization?

Raw Data 121
article thumbnail

What is an Open Table Format? & Why to use one?

Start Data Engineering

1. Introduction 2. What is an Open Table Format (OTF) 3. Why use an Open Table Format (OTF) 3.0. Setup 3.1. Evolve data and partition schema without reprocessing 3.2. See previous point-in-time table state, aka time travel 3.3. Git like branches & tags for your tables 3.4. Handle multiple reads & writes concurrently 4. Conclusion 5. Further reading 6.

Data 322
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

The Roots of Today's Modern Backend Engineering Practices

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. In this article, we cover thee out of nine topics from today’s subscriber-only issue: The Past and Future of Modern Backend Practices.

article thumbnail

Unlocking the Power of Analytics with Dr. Swati Jain

Analytics Vidhya

In this Leading with Data episode, explore the analytics landscape with Dr. Swati Jain, a seasoned leader boasting over two decades of experience. From her unforeseen foray into analytics to steering EXL Analytics’ India business, Dr. Jain imparts invaluable insights into the ever-evolving world of data science. Read on to know more about her career, […] The post Unlocking the Power of Analytics with Dr.

article thumbnail

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enahance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

article thumbnail

7 Machine Learning Algorithms You Can’t Miss

KDnuggets

This list of machine learning algorithms is a good place to start your journey as a data scientist. You should be able to identify the most common models and use them in the right applications.

article thumbnail

Patching the PostgreSQL JDBC Driver

Zalando Engineering

Introduction This blog post describes a recent contribution from Zalando to the Postgres JDBC driver to address a long-standing issue with the driver’s integration with Postgres’ logical replication that resulted in runaway Write-Ahead Log (WAL) growth. We will describe the issue, how it affected us at Zalando, and detail the fix made upstream in the JDBC driver that fixes the issue for Debezium and all other clients of the Postgres JDBC driver.

More Trending

article thumbnail

A Deep Dive Into Sending With librdkafka

Confluent

Learn how to write code that produces messages via librdkafka, how it will behave during error situations, and how your application should detect and respond to them.

Coding 130
article thumbnail

Asked to do something illegal at work? Here’s what these software engineers did

The Pragmatic Engineer

The below topic was sent out to full subscribers of The Pragmatic Engineer , three weeks ago, in The Pulse #66. I have received several messages from people asking if they can pay to “unlock” this information for others, given how vital it is for software engineers. It is vital, and so I’m sharing this with all readers, without a paywall.

article thumbnail

Enhancing the security of WhatsApp calls

Engineering at Meta

New optional features in WhatsApp have helped make calling on WhatsApp more secure. “Silence Unknown Callers” is a new setting on WhatsApp that not only quiets annoying calls but also blocks sophisticated cyber attacks. “Protect IP Address in Calls” is a new setting on WhatsApp that helps hide your location from other parties on the call. Privacy and security are at the core of WhatsApp.

Metadata 124
article thumbnail

Learn Probability in Computer Science with Stanford University for FREE

KDnuggets

Probability is one of the foundational elements of computer science. Some bootcamps will skim over the topic, however, it is integral to your computer science knowledge.

article thumbnail

Understanding User Needs and Satisfying Them

Speaker: Scott Sehlhorst

We know we want to create products which our customers find to be valuable. Whether we label it as customer-centric or product-led depends on how long we've been doing product management. There are three challenges we face when doing this. The obvious challenge is figuring out what our users need; the non-obvious challenges are in creating a shared understanding of those needs and in sensing if what we're doing is meeting those needs.

article thumbnail

Why Spatial Data Governance is Critical to Your Business Strategy

Precisely

When speaking to organizations about data integrity , and the key role that both data governance and location intelligence play in making more confident business decisions, I keep hearing the following statements: “For any organization, data governance is not just a nice-to-have! “ “Everyone knows that 80% of data contains location information. Why are you still telling us this, Monica?

article thumbnail

Databricks + Arcion: Real-time enterprise data replication to the Lakehouse

databricks

We are excited to announce that we have completed our acquisition of Arcion, a leading provider for real-time data replication technologies. Arcion’s capabilities w.

Data 133
article thumbnail

Dialpad Turns to Confluent and StarTree for Real-Time Customer Intelligence

Confluent

Learn how AI-powered customer intelligence platform Dialpad modernized its data infrastructure and improved customer satisfaction rates with Confluent and Startree.

IT 125
article thumbnail

Source filtering with file sets

Tweag

Sponsored by Antithesis (distributed systems reliability testing experts), I’ve developed a new library to filter local files in Nix which I’d like to introduce! This post requires some familiarity with Nix and its language. So if you don’t know what Nix is yet, take a look first, it’s pretty neat. In this post we’re going to look at what source filtering is, why it’s useful, why a new library was needed for it, and the basics of the new library.

Building 117
article thumbnail

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

article thumbnail

What’s New in ArcGIS Pro 3.2

ArcGIS

From oriented imagery to engaging thematic map series, there is something for everyone in this release of ArcGIS Pro 3.2.

143
143
article thumbnail

A Comprehensive List of Resources to Master Large Language Models

KDnuggets

Large Language Models (LLMs) have now become an integral part of various applications. This article provides an extensive list of resources for anyone interested to dive into the world of LLMs.

139
139
article thumbnail

1. Streamlining Membership Data Engineering at Netflix with Psyberg

Netflix Tech

By Abhinaya Shetty , Bharath Mummadisetty At Netflix, our Membership and Finance Data Engineering team harnesses diverse data related to plans, pricing, membership life cycle, and revenue to fuel analytics, power various dashboards, and make data-informed decisions. Many metrics in Netflix’s financial reports are powered and reconciled with efforts from our team!

article thumbnail

Data Intelligence Platforms

databricks

The observation that "software is eating the world" has shaped the modern tech industry. Today, software is ubiquitous in our lives, from the.

Data 142
article thumbnail

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

article thumbnail

Top 7 Free Apache Kafka Tutorials and Courses for Beginners in 2023

Confluent

The top 7 free online courses, tutorials, get started guides, and examples for the easiest way to learn Apache Kafka.

Kafka 131
article thumbnail

Organist: stay sane managing your development environments

Tweag

tl;dr: We’re pleased to announce the beta release of Organist , a tool designed to ease the definition of reliable and low-friction development environments and workflows, building on the combined strengths of Nix and Nickel. A mess of cables and knobs I used to play piano as a kid. As a teenager, I became frustrated by the limitations of the instrument and started getting into synthesizers.

article thumbnail

What’s new from the geodatabase team in ArcGIS Pro 3.2

ArcGIS

Here's everything new in ArcGIS Pro 3.2 from the Geodatabase Team. Schema Reports, 64-bit OIDs, Big Integer fields, new date fields, etc.

article thumbnail

11 Python Magic Methods Every Programmer Should Know

KDnuggets

Want to support the behavior of built-in functions and method calls in your Python classes? Magic methods in Python let you do just that! So let’s uncover the method behind the magic.

Python 135
article thumbnail

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.

article thumbnail

Harness the Power of Pinecone with Cloudera’s New Applied Machine Learning Prototype

Cloudera

Elevate your AI applications with our latest applied ML prototype At Cloudera, we continuously strive to empower organizations to unlock the full potential of their data, catalyzing innovation and driving actionable insights. And so we are thrilled to introduce our latest applied ML prototype (AMP) — a large language model (LLM) chatbot customized with website data using Meta’s Llama2 LLM and Pinecone’s vector database.

article thumbnail

Enhancing your team’s performance by building a data culture

databricks

Defining what a data culture is can vary by organization. A data culture is the shared values, attitudes, and behaviors that enable organizations.

Building 121
article thumbnail

Druid Deprecation and ClickHouse Adoption at Lyft

Lyft Engineering

Written by Ritesh Varyani and Jeana Choi at Lyft. Introduction At Lyft, we have used systems like Apache ClickHouse and Apache Druid for near real-time and sub-second analytics. Sub-second query systems allow for near real-time data explorations and low latency, high throughput queries, which are particularly well-suited for handling time-series data.

Kafka 104
article thumbnail

Great Nickel configurations from little merges grow

Tweag

This blog post is part of the series exploring the foundations of the Nickel configuration language. Presenting Nickel: better configuration for less Programming with contracts in Nickel Types à la carte in Nickel Great configurations from little merges grow We previously looked at the core language, then contracts, and finally typing. The last important remaining piece to explore is the merge system.

Metadata 107
article thumbnail

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

article thumbnail

Deep Learning with ArcGIS Pro Tips & Tricks: Part 1

ArcGIS

Prepare your environment to run out-of-the-box deep learning geoprocessing tools in ArcGIS Pro. Machine learning is more accessible than ever with pre-trained models enabling you to extract data from your imagery.

article thumbnail

Introduction to Giskard: Open-Source Quality Management for AI Models

KDnuggets

To solve the conundrum of ensuring the quality of AI models in production — especially given the emergence of LLMs — we are thrilled to announce the official launch of Giskard, the premier open-source AI quality management system.

article thumbnail

5 Reasons to Attend BUILD 2023: The Dev Conference for AI & Apps

Snowflake

BUILD 2023 is where AI gets real. Join our two-day virtual global conference and learn how to build with the app dev innovations you heard about at Snowflake Summit and Snowday. We have more demos and hands-on virtual labs than ever before—and you won’t find a bunch of slideware here. The focus is on tools and capabilities that are generally available or in public and private preview, so you can leave BUILD and put your new skills into action immediately.

Building 103
article thumbnail

Introducing Python User-Defined Table Functions (UDTFs)

databricks

Apache Spark™ 3.5 and Databricks Runtime 14.0 have brought an exciting feature to the table: Python user-defined table functions (UDTFs). In this blog p.

Python 106
article thumbnail

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.