Sat. Oct 19, 2019 - Fri. Oct 25, 2019

Data Orchestration For Hybrid Cloud Analytics

Data Engineering Podcast

Summary: The scale and complexity of the systems that we build to satisfy business requirements are increasing as the available tools become more sophisticated. In order to bridge the gap between legacy infrastructure and evolving use cases it is necessary to create a unifying set of components. In this episode Dipti Borkar explains how the emerging category of data orchestration tools fills this need, some of the existing projects that fit in this space, and some of the ways that they can work to

Cloud 100

Everything a Data Scientist Should Know About Data Management

KDnuggets

For full-stack data science mastery, you must understand data management along with all the bells and whistles of machine learning. This high-level overview is a road map for the history and current state of the expansive options for data storage and infrastructure solutions.

Teradata is Moving the Cloud Forward

Teradata

With four new offerings, Teradata is helping companies move from analytics to answers wherever they are on their cloud journey. Read more.

Cloud 66

Open-sourcing Polynote: an IDE-inspired polyglot notebook

Netflix Tech

Jeremy Smith, Jonathan Indig, Faisal Siddiqi. We are pleased to announce the open-source launch of Polynote: a new, polyglot notebook with first-class Scala support, Apache Spark integration, multi-language interoperability including Scala, Python, and SQL, as-you-type autocomplete, and more. Polynote provides data scientists and machine learning researchers with a notebook environment that allows them the freedom to seamlessly integrate our JVM-based ML platform…

Scala 93

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD, Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

The Role of UX in Making Rockset the Shortest Path from Data to Applications

Rockset

At Rockset, our singular focus is to be the shortest (and most efficient) path from data to applications for our users. We recognize and truly believe that our success lies in the success of our users. We constantly think about improving our workflows, coming up with new ones, and iterating on them in ways that take the user experience to a whole new level.

How YouTube is Recommending Your Next Video

KDnuggets

If you are interested in learning more about the latest YouTube recommendation algorithm paper, read this post for details on its approach and improvements.

Algorithm 123

More Trending

Getting Started with Rust and Apache Kafka

Confluent

I’ve written an event sourcing bank simulation in Clojure (a Lisp built for the Java virtual machine, or JVM) called open-bank-mark, which you are welcome to read about in my previous blog post explaining the story behind this open source example. As a next step, specifically for this article, I’ve added SSL and combined some topics together, using the subject name strategy option of Confluent Schema Registry, making it more production-like, adding security, and making it possible to put multiple

Kafka 18

Introduction to Natural Language Processing (NLP)

KDnuggets

Have you ever wondered how your personal assistant (e.g., Siri) is built? Do you want to build your own? Perfect! Let’s talk about Natural Language Processing.

Process 120

Survey: Success of Global Enterprise Depends on Adaptation to Hyper-Digitization

Teradata

A new study by Teradata and research firm Vanson Bourne shines a light on the market forces impacting the world's largest companies. Find out more.

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

4 Steps to Creating Dynamic Kafka Connectors with the Kafka Connect API

Confluent

If you’ve worked with the Apache Kafka® and Confluent ecosystem before, chances are you’ve used a Kafka Connect connector to stream data into Kafka or stream data out of it. While there is an ever-growing list of connectors available, whether Confluent or community supported, you still might find yourself needing to integrate with a technology for which no connectors exist.

Kafka 15

Feature Selection: Beyond feature importance?

KDnuggets

In this post, you will see three different techniques for performing feature selection on your datasets and how to build an effective predictive model.

Datasets 122

Anomaly Detection, A Key Task for AI and Machine Learning, Explained

KDnuggets

One way to process data faster and more efficiently is to detect abnormal events, changes or shifts in datasets. Anomaly detection refers to identification of items or events that do not conform to an expected pattern or to other items in a dataset that are usually undetectable by a human expert.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Intro to Adversarial Machine Learning and Generative Adversarial Networks

KDnuggets

In this crash course on GANs, we explore where they fit into the pantheon of generative models, how they've changed over time, and what the future has in store for this area of machine learning.

5 Advanced Features of Pandas and How to Use Them

KDnuggets

The pandas library offers core functionality for preparing your data using Python. But many don't go beyond the basics, so learn about these lesser-known advanced methods that will make handling your data easier and cleaner.

Python 88

Time Series Analysis: A Simple Example with KNIME and Spark

KDnuggets

The task: train and evaluate a simple time series model using a random forest of regression trees and the NYC Yellow taxi dataset.

How to Measure Foot Traffic Using Data Analytics

KDnuggets

You need to know how many people visit your store now and what sort of audience you're acquiring. Foot traffic data is going to be invaluable to the success of your business.

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

Convolutional Neural Network for Breast Cancer Classification

KDnuggets

See how Deep Learning can help in solving one of the most commonly diagnosed cancers in women.

Bye Data Scientists, Hello AI? Not Likely!

KDnuggets

AI is becoming more mainstream. The fact that computers/robots will learn after being built and will surpass a human's intelligence is terrifying.

Data 73

How to Write Web Apps Using Simple Python for Data Scientists

KDnuggets

Convert your Data Science Projects into cool apps easily without knowing any web frameworks.

Python 90

Addressing the Growing Need for Skills in Data Science

KDnuggets

To address the current difficulties in hiring data scientists due to their short supply, many companies can benefit from retraining existing analytically minded employees.

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies who are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground-up, graduating from just getting started with data-driven development to operating

Seven Myths About the True Costs of AI Systems

KDnuggets

While there is much excitement today around implementing AI at the enterprise level, the financial costs of this process are often unexpected and underappreciated. These seven myths are crucial lessons learned that executives should know before heading down the road to AI.

Systems 51

KDnuggets™ News 19:n40, Oct 23: How to Become a (Good) Data Scientist; Writing Your First Neural Net in 30 Lines with Keras

KDnuggets

Read useful advice on how to become a good data scientist; see how you can write your first neural net in under 30 lines of Keras code; understand why AI salaries are heading skywards and what skills you need for them; and read about key ideas and methods in anomaly detection.

Coding 50

Harnessing Semiotics and Discourse Communities to Understand User Intent

KDnuggets

Semiotics helps us understand the importance of context in determining the meaning of a term, and discourse communities provide us with the background context (mental model) by which to interpret its meaning correctly.

Top KDnuggets tweets, Oct 16-22: How YouTube is Recommending Your Next Video

KDnuggets

Also: The 5 Classification Evaluation Metrics Every Data Scientist Must Know; How to Recognize a Good Data Scientist Job From a Bad One; How to Easily Deploy Machine Learning Models Using Flask.

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

This Microsoft Neural Network can Answer Questions About Scenic Images with Minimum Training

KDnuggets

Recently, a group of AI experts from Microsoft Research published a paper proposing a method for scene understanding that combines two key tasks: image captioning and visual question answering (VQA).

Top Stories, Oct 14-20: How to Become a (Good) Data Scientist Beginner Guide

KDnuggets

Also: The 5 Classification Evaluation Metrics Every Data Scientist Must Know; Artificial Intelligence: Salaries Heading Skyward; Writing Your First Neural Net in Less Than 30 Lines of Code with Keras; How to select rows and columns in Pandas using [ ], .loc, .iloc, .at and .iat; The Last SQL Guide for Data Analysis You'll Ever Need.

Samsung Tech Day: Today’s Electronic Devices Seem Magical, But the Real Super-Power is in Silicon

KDnuggets

Samsung’s Tech Day event showcases processor and memory advances for 5G, AI, Cloud and Edge Computing, Automotive, IoT, and more.

Open Sourcing Mantis: A Platform For Building Cost-Effective, Realtime, Operations-Focused…

Netflix Tech

Open Sourcing Mantis: A Platform For Building Cost-Effective, Realtime, Operations-Focused Applications. By Jeff Chao on behalf of the Mantis team. Today we’re excited to announce that we’re open sourcing Mantis, a platform that helps Netflix engineers better understand the behavior of their applications to ensure the highest quality experience for our members.

Reimagined: Building Products with Generative AI

“Reimagined: Building Products with Generative AI” is an extensive guide for integrating generative AI into product strategy and careers featuring over 150 real-world examples, 30 case studies, and 20+ frameworks, and endorsed by over 20 leading AI and product executives, inventors, entrepreneurs, and researchers.