Sat. Nov 27, 2021 - Fri. Dec 03, 2021

Why Machine Learning Engineers are Replacing Data Scientists

KDnuggets

The hiring run for data scientists continues at a strong clip around the world. But other emerging roles are demonstrating key value to organizations, and you should consider them based on your existing or desired skill sets.

A Guide to Stream Processing and ksqlDB Fundamentals

Confluent

Event streaming applications are a powerful way to react to events as they happen and to take advantage of data while it is fresh. However, they can be a challenge […].

In AI we trust? Why we Need to Talk About Ethics and Governance (part 2 of 2)

Cloudera

In part 1 of this blog post, we discussed the need to be mindful of data bias and the resulting consequences when certain parameters are skewed. Surely there are ways to comb through the data to keep the risks from spiralling out of control. We need to get to the root of the problem. In 2019, the Gradient Institute published a white paper outlining the practical challenges for Ethical AI.

A Systematic Approach to Reducing Technical Debt

Zalando Engineering

Introduction: While technical debt is a recurring issue in software engineering, the case of the Merchant Orders team within Zalando Direct was an outlier: due to the lack of a clearly defined process, technical debt more or less only ever accumulated. When I joined this team in autumn 2020 as its new engineering lead, the technical debt backlog had entries dating back to 2018.

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

5 Practical Data Science Projects That Will Help You Solve Real Business Problems for 2022

KDnuggets

This curated list of data science projects offers real-life problems that will help you master the skills to demonstrate that you are technically sound and know how to conduct data science projects that add business value.

Best Tutorials for Getting Started with Apache Kafka

Confluent

Each one of the more than 50 tutorials for Apache Kafka® on Confluent Developer answers a question that you might ask a knowledgeable friend or colleague about Kafka and its […].

More Trending

The Cloudera Enterprise Data Cloud Maturity Report: Uncovering progressive steps towards a hybrid future

Cloudera

This guest blog was written by Shanice Omare, Research Manager, Vanson Bourne. Organizations’ resiliency in the wake of the pandemic. So much has changed for organizations in recent times, with the pandemic accelerating shifts toward a more digital world. Some organizations have taken this as an opportunity for positive change by moving workloads to the cloud and utilizing enterprise data strategies that are key to their business resiliency.

How to Get Certified as a Data Scientist

KDnuggets

If you are early in your journey to becoming a Data Scientist, an interesting option is to earn a certification from DataCamp, and this guide offers tips that will help beginners complete the challenges.

Why Rockset Is My Next Job After Facebook

Rockset

“At every step in this process, I’ve been consistently impressed by the quality and caliber of the team.” On Monday, I joined Rockset. I am joining Rockset as its first director of engineering and first external manager hire. I come here from Facebook, where I spent the last 10 years building and supporting teams. Most of my work was in the core C++ libraries and distributed systems components that power Facebook’s infrastructure.

Data Mining vs Machine Learning. Here’s the Difference

ProjectPro

We are all aware of the advancements in technology, and new terminology arrives with them. Everyone wants to keep up and sound tech-savvy. To do so, it is important to understand the exact meaning of these terms before we use them. Data is the New Fuel. We all know this, so you might have heard terms like Artificial Intelligence (AI), Machine Learning, Data Mining, Neural Networks, etc.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

The (Missing) Role of Design in Analytics

dbt Developer Hub

If you’ve spoken to me lately, follow me on Twitter, or have taken my order at Wendy’s, you probably know how much I hate traditional dashboards. My dad, a psychotherapist, has been working with me to get to the root of my upbringing that led to this deep-rooted feeling. As it turns out, the cause of my feelings towards traditional dashboarding is actually quite obvious.

2021: A Year Full of Amazing AI papers — A Review

KDnuggets

A curated list of the latest breakthroughs in AI by release date with a clear video explanation, link to a more in-depth article, and code.

From Strategy to Action: How to ‘Break the Code’ of Analytics at Scale in Retail and CPG

Teradata

Retail and CPG leaders of the future need to successfully leverage analytics at speed and scale to drive performance. Find out more.

What is a Data Source?

Grouparoo

The data source is the location from which data processing functions consume their data. This can be the point of origin of the data, the place of its creation. Alternatively, this can be data generated by another process and then made available for subsequent processing. Therefore, the source data may be raw, unfiltered, and unrefined, or polished and fully formed.

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies who are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground-up, graduating from just getting started with data-driven development to operating

The Spiritual Alignment of dbt + Airflow

dbt Developer Hub

Airflow and dbt are often framed as either / or: You either build SQL transformations using Airflow’s SQL database operators (like SnowflakeOperator), or develop them in a dbt project. You either orchestrate dbt models in Airflow, or you deploy them using dbt Cloud. In my experience, these are false dichotomies that sound great as hot takes but don’t really help us do our jobs as data people.

What Percentage of Your Machine Learning Models Have Been Deployed?

KDnuggets

Take a moment to participate in the latest KDnuggets poll and let the community know what percentage of your machine learning models have been deployed.

Making Data Science Responsible

Elder Research

The post Making Data Science Responsible appeared first on Elder Research.

RudderStack Secures SOC 2 Type II Certification

RudderStack

We consider security to be vital, especially when it comes to our customers’ data. We’re excited that we have attained SOC 2 Type II compliance.

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

Welcome to the dbt Developer Blog

dbt Developer Hub

Doing analytics is hard. Doing analytics right is even harder. There are a massive number of factors to consider: Is data missing? How do we make this insight discoverable? Why is my database locked? Are we even asking the right questions? Compounding this is the fact that analytics can sometimes feel like a lonely pursuit. Sure, our data is generally proprietary and therefore we can’t talk much about it.

KDnuggets: Personal History and Nuggets of Experience

KDnuggets

After 28+ years of publishing and editing KDnuggets, I am retiring and transitioning KDnuggets to Matthew Mayo, who will become the new editor-in-chief. I want to share with you my story of KDnuggets and highlight some of the useful nuggets of experience I learned along this amazing journey.

Create your Private Data Warehousing Environment Using Azure Kubernetes Service

Cloudera

For Cloudera, ensuring data security is critical because we have large customers in highly regulated industries like financial services and healthcare, where security is paramount. Also, for other industries like retail, telecom or public sector that deal with large amounts of customer data and operate multi-tenant environments, sometimes with end users who are outside of their company, securing all the data may be a very time-intensive process.

Eight Top DataOps Trends for 2022

DataKitchen

DataOps adoption continues to expand as a perfect storm of social, economic, and technological factors drives enterprises to invest in process-driven innovation. From our unique vantage point in the evolution toward DataOps automation, we publish an annual prediction of trends that most deeply impact the DataOps enterprise software industry as a whole.

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

So You Want to Build a dbt Package

dbt Developer Hub

Packages are the easiest way for a dbt user to contribute code to the dbt community. This is a belief that I hold close as someone who is a contributor to packages and has helped many partners create their own during my time here at dbt Labs. The reason is simple: packages, as an inherent part of dbt, follow our principle of being built by and for analytics engineers.

Sentiment Analysis with KNIME

KDnuggets

Check out this tutorial on how to approach sentiment classification with supervised machine learning algorithms.

Doing DataOps For External Data Sources As A Service at Demyst

Data Engineering Podcast

Summary: The data that you have access to affects the questions that you can answer. By using external data sources you can drastically increase the range of analysis that is available to your organization. The challenge comes in all of the operational aspects of finding, accessing, organizing, and serving that data. In this episode Mark Hookey discusses how he and his team at Demyst do all of the DataOps for external data sources so that you don’t have to, including the systems necessary t

How to Become a Deep Learning Engineer in 2023?

ProjectPro

The ideas behind deep learning date back to the early 1940s, when the first models mimicking the neural networks of the human brain were proposed. They did not garner much interest at the time due to limited computation power and storage options. In the last few decades, however, deep learning has unleashed itself into the world. Its massive evolution is also the result of research labs and industry players like Facebook, Google, Apple, Netflix, Microsoft, Baidu, and IBM investing substantially in its research. 85% of data science platform vendors ha

Reimagined: Building Products with Generative AI

“Reimagined: Building Products with Generative AI” is an extensive guide for integrating generative AI into product strategy and careers featuring over 150 real-world examples, 30 case studies, and 20+ frameworks, and endorsed by over 20 leading AI and product executives, inventors, entrepreneurs, and researchers.

On the Importance of Naming: Model Naming Conventions (Part 1)

dbt Developer Hub

This article is for anyone who has ever questioned the sanity of a date not in ISO 8601 format. Have you ever been assigned to add new fields or concepts to an existing set of models and wondered: Why are there multiple models named almost the same but slightly different? Which model has the fields I need? Which model is upstream or downstream from which?

Sentiment Analysis API vs Custom Text Classification: Which one to choose?

KDnuggets

In this article, we are going to compare the sentiment extraction performance between Sentiment Analysis engines and Custom Text classification engines. The idea is to show pros and cons of these two types of engines on a concrete dataset.

Creating A Unified Experience For The Modern Data Stack At Mozart Data

Data Engineering Podcast

Summary: The modern data stack has been gaining a lot of attention recently with a rapidly growing set of managed services for different stages of the data lifecycle. With all of the available options it is possible to run a scalable, production-grade data platform with a small team, but there are still sharp edges and integration challenges to work through.

Maps with PostgreSQL and PostGIS

Zalando Engineering

This blog post explains which tools to use to serve geospatial data from a database system (PostgreSQL) to your web browser. All you need is a database server for the data, a web map application for the frontend and a small service in between to transfer user requests. I will also show you how these components can run on top of Kubernetes in a highly available, cloud-native fashion.

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.