Sat. Aug 27, 2022 - Fri. Sep 02, 2022

How to Select Rows and Columns in Pandas Using [ ], .loc, .iloc, .at and .iat

KDnuggets

Subset selection is one of the most frequently performed tasks while manipulating data. Pandas provides different ways to efficiently select subsets of data from your DataFrame.
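
As a rough illustration of the selection methods named in the title, here is a minimal sketch. The DataFrame, its column names, and its index labels are made up for this example and do not come from the article.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration.
df = pd.DataFrame(
    {"name": ["Ana", "Ben", "Cy"], "age": [31, 25, 40]},
    index=["a", "b", "c"],
)

ages = df["age"]                        # [ ]: select a column by name
row_b = df.loc["b"]                     # .loc: select a row by index label
first_row = df.iloc[0]                  # .iloc: select a row by integer position
subset = df.loc[["a", "c"], ["name"]]   # .loc with lists: rows and columns by label
age_c = df.at["c", "age"]               # .at: a single value by label
name_0 = df.iat[0, 0]                   # .iat: a single value by integer position
```

In general, `.loc` and `.at` work with labels while `.iloc` and `.iat` work with integer positions; `.at` and `.iat` are meant for single-value access.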

Data 160

An Exploration Of What Data Automation Can Provide To Data Engineers And Ascend's Journey To Make It A Reality

Data Engineering Podcast

Summary: The dream of every engineer is to automate all of their tasks. For data engineers, this is a monumental undertaking. Orchestration engines are one step in that direction, but they are not a complete solution. In this episode Sean Knapp shares his views on what constitutes proper automation and the work that he and his team at Ascend are doing to help make it a reality.

Incremental Strategies to Move Your Data Strategy Forward Remove Obstacles to Unlock Possibilities in Financial Services

Cloudera

Firms are burdened with tech debt and endless regulatory compliance, often leaving innovation last to receive the necessary budgets. Data-fuelled innovation requires a pragmatic strategy. This blog lays out some steps to help you incrementally advance efforts to be a more data-driven, customer-centric organization. Embrace incremental progress. The financial sector’s evolution is unleashing myriad demands on firms operating in the market.

Teradata VantageCloud Lake and ClearScape Analytics: Empowering Enterprise Analytical Innovation

Teradata

Teradata's new offerings, VantageCloud Lake and ClearScape Analytics, make it the complete cloud analytics & data platform, with cloud-native deployment and expanded analytics capabilities.

Cloud 98

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

The Difference Between Training and Testing Data in Machine Learning

KDnuggets

When building a predictive model, the quality of the results depends on the data you use. To use that data well, you need to understand the difference between training and testing data in machine learning.
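
To make the distinction concrete, here is a minimal sketch of splitting a dataset into a training portion (used to fit a model) and a held-out testing portion (used only to evaluate it). The helper `train_test_split`, its parameters, and the toy data are assumptions for illustration, not code from the article.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is reproducible
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test shuffled rows are held out; the rest are for training.
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(20))                      # stand-in for 20 labelled examples
train, test = train_test_split(data)
```

The key property is that no example appears in both portions, so the test score reflects performance on data the model has not seen.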

Alumni Of AirBnB's Early Years Reflect On What They Learned About Building Data Driven Organizations

Data Engineering Podcast

Summary: AirBnB pioneered a number of the organizational practices that have become the goal of modern data teams. Out of that culture a number of successful businesses were created to provide the tools and methods to a broader audience. In this episode several alumni of AirBnB’s formative years who have gone on to found their own companies join the show to reflect on their shared successes, missed opportunities, and lessons learned.

Building 100

More Trending

What Do You Want to be Famous for?

Teradata

Financial services organizations that exhibit true data literacy avoid bottlenecks and instead choose to build best in class solutions that meet current and future needs. Find out more.

Build a Reproducible and Maintainable Data Science Project: A Free Online Book

KDnuggets

This free online book is a fantastic resource on how to structure, manage, and maintain your real-world data science projects.

Expert Roundtable: How to Build Real-Time Personalization and Recommendation Systems

Rockset

I recently had the good fortune to host a small-group discussion on personalization and recommendation systems with two technical experts with years of experience at FAANG and other web-scale companies. Raghavendra Prabhu (RVP) is Head of Engineering and Research at Covariant, a Series C startup building a universal AI platform for robotics starting in the logistics industry.

Systems 52

Celebrate Back-to-School Season With Data Streaming Basics

Confluent

All the best data streaming resources, tips, and guides to help you learn introductory concepts, streaming architecture basics, common tools and technologies, and more.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

5 Ways To Ensure High Functioning Data Engineering Teams 

Monte Carlo

Data engineering is a relatively young profession, even for the tech space. To put it in perspective, front-end engineering has twice the number of years in industry maturity. While the role itself is rapidly evolving, the tooling, processes, and team structure are fragmented and amorphous at best. As a result, the day-to-day responsibilities of a data engineer can look radically different from one company to another, depending on the needs of the business and the data that drives it.

Decision Tree Pruning: The Hows and Whys

KDnuggets

Decision trees are a machine learning algorithm that is susceptible to overfitting. One of the techniques you can use to reduce overfitting in decision trees is pruning.

MarkLogic And Machine Learning: Easy way of ML

Knoldus

Machine learning is a subfield of computer science that deals with the construction of artificial intelligence systems that can learn without being explicitly programmed. It has been applied in many areas such as data analysis, pattern recognition, and understanding human behavior. MarkLogic combines database internals, search-style indexing, and application server behavior into a unified system.

Declarative Connectors with Confluent for Kubernetes

Confluent

Manage connectors declaratively with Confluent for Kubernetes.

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies who are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground-up, graduating from just getting started with data-driven development to operating

Data Quality Monitoring – You’re Doing It Wrong

Monte Carlo

Occasionally, we’ll talk with data teams interested in applying data quality monitoring narrowly across only a specific set of key tables. The argument goes something like: “You may have hundreds or thousands of tables in your environment, but most of your business value derives from only a few that really matter. That’s where you really want to focus your efforts.

IT 52

Machine Learning Metadata Store

KDnuggets

In this article, we will learn about metadata stores, the need for them, their components, and metadata store management.

Metadata 154

Five Reasons for Migrating HBase Applications to the Cloudera Operational Database in the Public Cloud

Cloudera

Apache HBase has long been the database of choice for business-critical applications across industries. This is primarily because HBase provides unmatched scale, performance, and fault-tolerance that few other databases can come close to. Think petabytes of data spread across trillions of rows, ready for consumption in real-time. While application developers and database admins are well aware of the benefits of using HBase, they also know about a few shortcomings that the database has historical

Loan Prediction using Machine Learning Project Source Code

ProjectPro

This article will walk you through how one can start by exploring a loan prediction system as a data science and machine learning problem and build a system/application for loan prediction using your own machine learning project. Loan sanctioning and credit scoring form a multi-billion dollar industry in the US alone. With everyone from young students, entrepreneurs, and multi-million dollar companies turning to banks to seek financial support for their ventures, processing these application

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

August 2022 dbt Update: v1.3 beta, Tech Partner Program, and Coalesce!

dbt Developer Hub

Semantic layer, Python model support, the new dbt Cloud UI and IDE… there’s a lot our product team is excited to share with you at Coalesce in a few weeks. But how these things fit together—because of where dbt Labs is headed—is what I’m most excited to discuss. You’ll hear more in Tristan’s keynote, but this feels like a good time to remind you that Coalesce isn’t just for answering tough questions… it’s for surfacing them.

3 Ways to Append Rows to Pandas DataFrames

KDnuggets

Learn a simple way to append rows in the form of arrays, dictionaries, series, and dataframes to another dataframe.
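
As a hedged sketch of what appending rows from the input types mentioned above can look like, the example below uses `pd.concat` and `.loc`. The data and column names are invented for illustration, and the article itself may use different calls.

```python
import pandas as pd

# Hypothetical starting frame, invented for illustration.
df = pd.DataFrame({"name": ["Ana"], "age": [31]})

# From a dictionary: wrap it in a one-row DataFrame and concatenate.
# Appending another DataFrame works the same way via pd.concat.
new_row = pd.DataFrame([{"name": "Ben", "age": 25}])
df = pd.concat([df, new_row], ignore_index=True)

# From a list or array: assign it at the next integer label with .loc.
df.loc[len(df)] = ["Cy", 40]

# From a Series: turn it into a one-row frame, then concatenate.
# A mixed-type Series is object dtype, so column dtypes may change.
row = pd.Series({"name": "Dee", "age": 52})
df = pd.concat([df, row.to_frame().T], ignore_index=True)
```

Note that `pd.concat` returns a new DataFrame, so the result has to be assigned back, whereas the `.loc` assignment modifies `df` in place.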

Python 152

Breaking State and Local Data Silos with Modern Data Architectures

Cloudera

Data is the fuel that drives government, enables transparency, and powers citizen services. But while state and local governments seek to improve policies, decision making, and the services constituents rely upon, data silos create accessibility and sharing challenges that hinder public sector agencies from transforming their data into a strategic asset and leveraging it for the common good.

AI in Drug Discovery and Repurposing: Benefits, Approaches, and Use Cases

AltexSoft

According to McKesson, a company with a two-hundred-year history delivering a third of all drugs across North America, you need around six months to start a pharmacy and another seven to nine months to see any revenue. If this seems too long and complex, just make a comparison with a drug development process. It takes at least ten years and $2.6 billion to get a new medicine to the market.

Medical 52

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

What is Data Discovery: Definitions & Overview

Monte Carlo

In the world of data engineering, data discovery refers to the ability to find relevant data sets across your data platform and understand their context. Data discovery makes data engineering and analytical engineering tasks more efficient and can enable self-service access for other types of data consumers. Just like knowledge workers need to tap into a shared repository to discover and combine relevant information across documents or slide decks, data professionals need to do the same with dat

Machine Learning in the Enterprise: Use Cases & Challenges

KDnuggets

This article provides insights into how leading data scientists are embracing machine learning in their organizations and covers some of the major ML challenges and trends in the enterprise.

The Benefits of Natural Language AI for Content Creators

KDnuggets

In this article, we will discuss the benefits of natural language AI for content creators, highlighting the key reasons why you should consider using it to improve your content output.

IT 110

The Complete Data Science Study Roadmap

KDnuggets

This article will map out the things you need to do to become a data scientist.

Reimagined: Building Products with Generative AI

“Reimagined: Building Products with Generative AI” is an extensive guide for integrating generative AI into product strategy and careers featuring over 150 real-world examples, 30 case studies, and 20+ frameworks, and endorsed by over 20 leading AI and product executives, inventors, entrepreneurs, and researchers.

Top Posts August 22-28: Free Python Project Coding Course

KDnuggets

Free Python Project Coding Course • 5 Tricky SQL Queries Solved • Decision Tree Algorithm, Explained • Free AI for Beginners Course • The Complete Collection of Data Science Projects & Part 2.

Coding 103

KDnuggets News, August 31: The Complete Data Science Study Roadmap • 7 Techniques to Handle Imbalanced Data

KDnuggets

The Complete Data Science Study Roadmap • 7 Techniques to Handle Imbalanced Data • 3 Ways to Append Rows to Pandas DataFrames • The Bias-Variance Trade-off • How to Package and Distribute Machine Learning Models with MLFlow.

Combining Pandas DataFrames Made Simple

KDnuggets

For this tutorial, we will work through examples to understand how different methods for combining Pandas DataFrames work.

Python 119

Data Governance and Observability, Explained

KDnuggets

Let’s dive in and understand the ins and outs of data observability and data governance - the two keys to a more robust data foundation.

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.