Sat. Mar 14, 2020 - Fri. Mar 20, 2020

Simplistic Ways to Find Interesting Data Sets

Team Data Science

I am taking you through my recent experience of finding a dataset for my project. Industry Search: To work with data, I need to narrow down the industry, such as health care, finance, insurance, or another. I described a few sources in my earlier blog post, which gives a sneak peek at techniques for extracting industries. For instance, most job listings open their description with a line like "One of the top insurance clients is looking for a Data Engineer," which exposes the industry.

Insurance 130

The 4 Best Jupyter Notebook Environments for Deep Learning

KDnuggets

Many cloud providers and other third-party services see the value of a Jupyter notebook environment, which is why many companies now offer cloud-hosted notebooks. Let's have a look at 3 such environments.

Advanced Analytics for Coronavirus – Trends, Patterns, Predictions

Teradata

Advanced analytics and AI can significantly accelerate data processing required to get the insights, answers and recommendations to handle and address the COVID-19 pandemic.

10 Key skills, to help you become a data engineer

Start Data Engineering

This article gives you an overview of the 10 key skills you need to become a better data engineer. If you are struggling to get started on what to learn, start with the first topic and proceed through the list.

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Building A New Foundation For CouchDB

Data Engineering Podcast

Summary: CouchDB is a distributed document database built for scale and ease of operation. With a built-in synchronization protocol and an HTTP interface, it has become popular as a backend for web and mobile applications. Created 15 years ago, it has accrued some technical debt, which is being addressed with a refactored architecture based on FoundationDB.

Building 100

What is the most effective policy response to the new coronavirus pandemic?

KDnuggets

Where Test/Trace/Quarantine are working, the number of cases per day has declined empirically. Furthermore, this appears to be a radically superior strategy where it can be deployed. I’ll review the evidence, discuss the other strategies and their consequences, and then discuss what can be done.

IT 153

More Trending

Building a Cloud ETL Pipeline on Confluent Cloud

Confluent

As enterprises move more and more of their applications to the cloud, they are also moving their on-prem ETL (extract, transform, load) pipelines to the cloud, as well as building […].

Cloud 118

How to Use KSQL Stream Processing and Real-Time Databases to Analyze Streaming Data in Kafka

Rockset

Intro: In recent years, Kafka has become synonymous with “streaming,” and with features like Kafka Streams, KSQL, joins, and integrations into sinks like Elasticsearch and Druid, there are more ways than ever to build a real-time analytics application around streaming data in Kafka. With all of these stream processing and real-time data store options, though, also come questions about when each should be used and what their pros and cons are.

Kafka 40

When Will AutoML replace Data Scientists? Poll Results and Analysis

KDnuggets

Will AI always be 5-10 years away? The majority of respondents to this poll think that AutoML will reach expert level in 5-10 years. Interestingly, it is about the same as 5 years ago. We examine the trends by AutoML experience, industry, and region.

Data 149

Teradata's Response to COVID-19

Teradata

How Teradata is responding to the COVID-19 crisis for the health and well-being of its employees, customers and partners.

IT 59

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Announcing ksqlDB 0.8.0

Confluent

The latest ksqlDB release introduces long-awaited features such as tunable retention and grace period for windowed aggregates, new built-in functions including LATEST_BY_OFFSET, a peek at the new server API under […].

Process 98

Five Interesting Data Engineering Projects

KDnuggets

As the role of the data engineer continues to grow in the field of data science, so does the number of tools being developed to support wrangling all that data. Five of these tools that you should pay attention to for your data pipeline work are reviewed here (along with a few bonus tools).

A Beginner’s Guide to Data Integration Approaches in Business Intelligence

KDnuggets

An integrated BI system has a trickle-down effect on all business processes, especially reporting and analytics. Find out how integration can help you leverage the power of BI.

Nine lessons learned during my first year as a Data Scientist

KDnuggets

What is it like to be a Data Scientist? There can be many hats to wear, and so many problems to solve that are fed with data, churned by data science, and guided by business results. Find out about lessons learned from one Data Scientist about how best to work and perform in the role.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

A Comprehensive Data Repository for Fake Health News Detection

KDnuggets

We introduce FakeHealth, a new data repository for fake health news detection. Following a preliminary analysis to demonstrate its features, we consider additional potential directions for better identifying fake news.

Data 123

Time Series Classification Synthetic vs Real Financial Time Series

KDnuggets

This article discusses distinguishing between real financial time series and synthetic time series using XGBoost.

Finance 155

A Top Machine Learning Algorithm Explained: Support Vector Machines (SVM)

KDnuggets

Support Vector Machines (SVMs) are powerful for solving regression and classification problems. You should have this approach in your machine learning arsenal, and this article provides all the mathematics you need to know -- it's not as hard as you might think.

Build an Artificial Neural Network From Scratch: Part 2

KDnuggets

The second article in this series focuses on building an Artificial Neural Network using the NumPy Python library.

Building 140

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

24 Best (and Free) Books To Understand Machine Learning

KDnuggets

We have compiled a list of some of the best (and free) machine learning books that will prove helpful for everyone aspiring to build a career in the field.

Top KDnuggets tweets, Mar 11-17: Most western countries are on the same #coronavirus trajectory

KDnuggets

Most western countries are on the same #coronavirus trajectory; The Workers Who Face the Greatest #Coronavirus Risk; #Coronavirus, a Visual Rundown; How to start building an automated NLP solution for processing customer feedback.

Skynet Is Real: The History and Future of Factories With No Workers

KDnuggets

Let’s see whether robots will become "grave diggers" of the proletariat, what we still lack to reach total automation, and what compromises exist.

90

Forecasting Stories: Is it seasonality or not?

KDnuggets

Kicking off a series of forecasting stories, starting with seasonality and its business applications. This first article covers course corrections that were based on weather- and calendar-driven seasonality.

IT 82

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies who are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground-up, graduating from just getting started with data-driven development to operating

Top 20 ODSC 2020 Global Virtual Conference Sessions

KDnuggets

At ODSC 2020, we are unveiling our first ever 4-day Global Virtual Conference, an online and on-demand version of ODSC. Here are our picks for 20 talks that show how diverse and thorough the ODSC East Global Virtual Conference will be this April 14-17.

KDnuggets™ News 20:n11, Mar 18: Covid-19, your community, and you – a data science perspective; When Will AutoML replace Data Scientists? Poll Results and Analysis

KDnuggets

A Data Science perspective on Covid-19, the novel coronavirus; The results and analysis of a previous KDnuggets Poll: When Will AutoML replace Data Scientists? How to build a mature Machine Learning team; The Most Useful Machine Learning Tools of 2020; and more.

Exploring the Adoption of Python in the Workplace – Free Metis Corporate Training Webinar

KDnuggets

Metis will break down Python for data science and analytics, explain what is driving adoption in the field, and discuss how industries and companies are reacting to the shift.

Python 62

Improving the partnership between Data Science and IT

KDnuggets

Friction can quickly arise as a result of these separate workflows and priorities. Given their differences, how can data science and IT more seamlessly work together in building a model-driven organization?

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

Top Stories, Mar 9-15: New Poll: Coronavirus impact on Data Science community; Covid-19, your community, and you — a data science perspective

KDnuggets

Also: 50 Must-Read Free Books For Every Data Scientist in 2020; Decision Boundary for a Series of Machine Learning Models; 20 AI, Data Science, Machine Learning Terms You Need to Know in 2020 (Part 2).

Salesforce Open Sources a Framework for Open Domain Question Answering Using Wikipedia

KDnuggets

The framework uses a multi-hop QA method to answer complex questions by reasoning through Wikipedia’s datasets.

Scaling Your Data Strategy

KDnuggets

This article presents a particular vision for a cohesive data strategy for addressing large-scale problems with data-driven solutions, based on prior professional experiences.

Data 54

ModelDB 2.0 is here!

KDnuggets

We are excited to announce that ModelDB 2.0 is now available! We have learned a lot since building ModelDB 1.0, so we decided to rebuild from the ground up.

Reimagined: Building Products with Generative AI

“Reimagined: Building Products with Generative AI” is an extensive guide for integrating generative AI into product strategy and careers featuring over 150 real-world examples, 30 case studies, and 20+ frameworks, and endorsed by over 20 leading AI and product executives, inventors, entrepreneurs, and researchers.