Mon. Oct 17, 2022

Working With Sparse Features In Machine Learning Models

KDnuggets

Sparse features can cause problems like overfitting and suboptimal results in learning models, and understanding why this happens is crucial when developing models. Multiple methods, including dimensionality reduction, are available to overcome issues due to sparse features.

#ClouderaLife Spotlight: Elias Avila, Sr. Staff Proactive Support Engineer

Cloudera

As we wrap up Hispanic Heritage Month, this #ClouderaLife Spotlight features Elias Avila, senior staff proactive support engineer for Cloudera. In this spotlight, we talk about his career in technology and his philosophy for getting the most out of work in terms of satisfaction and advancement. We also talk about his upbringing in the primarily Mexican American community of Salinas, California, and the important role Hispanics play in California’s Central Valley.

7 Free Platforms for Building a Strong Data Science Portfolio

KDnuggets

Outshine others and increase your odds of getting hired by maintaining a data science portfolio with projects, resumes, blogs, and reports.

5 Steps To A Successful Data Warehouse Migration

Monte Carlo

Platform and data warehouse migrations aren’t something you do every day or even every few years, but they’re becoming much more frequent as organizations seek to modernize their data infrastructure with the new capabilities being offered by Snowflake, Databricks, Google, AWS, and others. [Editor’s note: We agree. Cloud database migrations were listed in our latest ebook, The 22 Hottest Trends In Data Right Now.] Migrations are like Schrödinger’s cat.

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Implementing Adaboost in Scikit-learn

KDnuggets

AdaBoost is called Adaptive Boosting because the weights are re-assigned to each instance at every round, with higher weights going to instances that were not correctly classified; in that sense it ‘adapts’.
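
To make the re-weighting idea concrete, here is a minimal sketch of a single weight update in the classic discrete AdaBoost formulation, assuming labels in {-1, +1}. It is an illustration only, not scikit-learn's implementation and not code from the linked article; the function name `adaboost_reweight` and the toy data are hypothetical.

```python
import math


def adaboost_reweight(weights, y_true, y_pred):
    """Return (new_weights, alpha) after one discrete AdaBoost round.

    Assumes labels and predictions are -1 or +1 (an assumption of this sketch).
    """
    total = sum(weights)
    # Weighted error rate of the current weak learner.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    if not 0.0 < err < 1.0:
        raise ValueError("weighted error must be strictly between 0 and 1")
    # Learner weight: larger when the weak learner is more accurate.
    alpha = 0.5 * math.log((1.0 - err) / err)
    # t * p is +1 for correct predictions (weight shrinks)
    # and -1 for mistakes (weight grows).
    updated = [w * math.exp(-alpha * t * p)
               for w, t, p in zip(weights, y_true, y_pred)]
    norm = sum(updated)
    return [w / norm for w in updated], alpha


# Toy data: four instances with equal starting weights; the second is misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 1, -1, -1]
y_pred = [1, -1, -1, -1]
new_weights, alpha = adaboost_reweight(weights, y_true, y_pred)
print(new_weights, alpha)
```

After the update, the misclassified instance carries more weight than each correctly classified one, so the next weak learner is pushed to focus on it.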

Monitoring for the dbt Semantic Layer and Beyond

Monte Carlo

The dbt Semantic Layer is poised to take the spotlight at this year’s Coalesce conference. It’s a solution the data world has been eagerly anticipating since dbt Labs began teasing its development at last year’s Coalesce conference. For those who are unfamiliar, a semantic layer is the component of the modern data stack that defines and locks down the aggregated metrics that are important to business operations.

More Trending

The Fight for Controlled Freedom of the Data Warehouse

Monte Carlo

A silent alarm rings in my head whenever I hear someone utter the phrase, “data is everyone’s responsibility.” You can just as easily translate this to mean that “data is no one’s responsibility,” too. This, readers, is what I call the “data tragedy of the commons.” Here’s why. The term tragedy of the commons comes from economic science and refers to situations where a common set of resources is accessed without any strong regulations or guardrails to curtail abuse.

Join Dr. Kirk Borne’s Applied Machine Learning Live Course

KDnuggets

Explore Machine Learning with hands-on labs and real world applications with Dr. Kirk Borne, ex-NASA Scientist and former Principal Data Scientist at Booz Allen Hamilton.

Recovering from Crashes with Safe Mode

Lyft Engineering

Feature flags are everywhere in modern software development: They’re a great tool for running A/B experiments, slowly rolling out changes to users, and even turning off problematic codepaths during incidents. When an engineer implements a new feature, it’s practically second nature to gate it behind a feature flag. While this practice is largely beneficial, incidents are occasionally caused when a feature flag enables a buggy codepath and causes a crash or an otherwise degraded

Stronger together: Python, dataframes, and SQL

dbt Developer Hub

For years working in data and analytics engineering roles, I treasured the daily camaraderie sharing a small office space with talented folks using a range of tools - from analysts using SQL and Excel to data scientists working in Python. I always sensed that there was so much we could work on in collaboration with each other - but siloed data and tooling made this much more difficult.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.