December, 2023

How Much Data Do We Need? Balancing Machine Learning with Security Considerations

Towards Data Science

For a data scientist, there’s no such thing as too much data. But when we take a broader look at the organizational context, we have to balance our goals with other considerations. Data Science vs Security/IT: A Battle for the Ages. Acquiring and keeping data is the focus of a huge amount of our mental energy as data scientists.

25 Free Courses to Master Data Science, Data Engineering, Machine Learning, MLOps, and Generative AI

KDnuggets

Discover a collection of top courses to launch your dream career or master a new skill, all for free!

Streaming in Data Engineering

Towards Data Science

Streaming data pipelines and real-time analytics

A Tech Conference Listed Fake Speakers for Years: I Accidentally Noticed

The Pragmatic Engineer

For 3 years straight, the DevTernity conference listed non-existent Coinbase employees as featured speakers. When were they added and what could the motivation have been? Three featured speakers listed at DevTernity 2021, 2022 and 2023, and JDKon 2024. These people do not exist. A year ago, I spent months doing an investigative report on how UK events tech company Pollen had its staff work for free, as it had run out of money but still kept operating.

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD, Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Unlocking the Power of Containers: Exploring the Top 20 Docker Containers for Every Development Need

Analytics Vidhya

Introduction: Docker containers have emerged as indispensable tools in the fast-evolving landscape of software development and deployment, providing a lightweight and efficient way to package, distribute, and run applications. This article delves into the top 20 Docker containers across various categories, showcasing their features, use cases, and contributions to streamlining development workflows.

How Meta built the infrastructure for Threads

Engineering at Meta

On July 5, 2023, Meta launched Threads, the newest product in our family of apps, to an unprecedented success that saw it garner over 100 million sign ups in its first five days. A small, nimble team of engineers built Threads over the course of only five months of technical work. While the app’s production launch had been under consideration for some time, the business finally made the decision and informed the infrastructure teams to prepare for its launch with only two days’ advance notice.

More Trending

10 GitHub Repositories to Master Machine Learning

KDnuggets

The blog covers machine learning courses, bootcamps, books, tools, interview questions, cheat sheets, MLOps platforms, and more to master ML and secure your dream job.

Making Flink Serverless, With Queries for Less Than a Penny

Confluent

Dive into the serverless architecture of Confluent Cloud for Apache Flink and explore its benefits like reduced infrastructure costs, increased reliability, & seamless adoption.

Mentoring software engineers or engineering leaders

The Pragmatic Engineer

I get asked every now and then if I offer 1:1 mentoring for either software engineers or engineering managers or leaders. While I used to do this in the past, I don't offer this any more. I collected much of the advice I have to offer for software engineers in The Software Engineer's Guidebook. I also write The Pragmatic Engineer Newsletter where I do cover topics like what it means to be a senior engineer at various companies, how to deal with a low-quality engineering culture, and

Make this 3D printed globe please

ArcGIS

It's that time of year to warm ourselves beside the electric hum of a plastic filament printer and fall into the joy of making.

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Practical Magic: Improving Productivity and Happiness for Software Development Teams

LinkedIn Engineering

Co-authors: Max Kanat-Alexander and Grant Jenks. Today we are open-sourcing the LinkedIn Developer Productivity & Happiness Framework (DPH Framework) - a collection of documents that describe the systems, processes, metrics, and feedback systems we use to understand our developers and their needs internally at LinkedIn. Now more than ever, developers are navigating so much change and new opportunity in this new era of Generative AI, so ensuring teams have the systems, processes, metrics and f

Creating High Quality RAG Applications with Databricks

databricks

Retrieval-Augmented-Generation (RAG) has quickly emerged as a powerful way to incorporate proprietary, real-time data into Large Language Model (LLM) applications. Today we are.

Free MIT Course: TinyML and Efficient Deep Learning Computing

KDnuggets

Curious about optimizing AI for everyday devices? Dive into the complete overview of MIT's TinyML and Efficient Deep Learning Computing course. Explore strategies to make AI smarter on small devices. Read the full article for an in-depth look!

If software development were a race, AI wins every time by Colin Eberhardt

Scott Logic

An exploration of the quantitative and qualitative impacts of Generative AI on software development. We’ve undertaken multiple experiments to better understand the impact of Generative AI tools (ChatGPT, Copilot) on developer productivity. Our quantitative results show a 37% improvement in productivity (speed); however, this result is a misrepresentation of what it means to be productive as a developer.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

The Pragmatic Engineer Newsletter in 2023

The Pragmatic Engineer

2023 was the second full year of The Pragmatic Engineer Newsletter, and this newsletter is now almost two and a half years old; the first issue came out on 26 August 2021. Thank you for being a reader, I greatly value your support. This year, 102 newsletter issues were published, and this is number 103. You received a deepdive issue on Tuesdays, and every Thursday it was “The Pulse” – formerly The Scoop.

Snowflake Announces Agreement to Acquire Samooha to Simplify Building Interoperable Data Clean Rooms in the Data Cloud

Snowflake

When businesses share sensitive first-party data with outside partners or customers, they must do so in a way that meets strict governance requirements around security and privacy. Data clean rooms have emerged as the technology to meet this need, enabling interoperability where multiple parties can collaborate on and analyze sensitive data in a governed way without exposing direct access to the underlying data and business logic.

Join Enhancements in ArcGIS Pro 3.2

ArcGIS

ArcGIS Pro 3.2 includes a number of enhancements to the Spatial Join, Add Spatial Join, Add Join, and Join Field tools.

Improve your RAG application response quality with real-time structured data

databricks

Retrieval Augmented Generation (RAG) is an efficient mechanism to provide relevant data as context in Gen AI applications. Most RAG applications typically use.

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

Building Predictive Models: Logistic Regression in Python

KDnuggets

When you are getting started with machine learning, logistic regression is one of the first algorithms you’ll add to your toolbox.

Real-Time Field Service Optimization

Confluent

Telcos use Confluent with event-driven microservices to enable real-time communications with 3rd-party field service providers, fulfilling customer service requests more efficiently.

AI debugging at Meta with HawkEye

Engineering at Meta

HawkEye is the powerful toolkit used internally at Meta for monitoring, observability, and debuggability of the end-to-end machine learning (ML) workflow that powers ML-based products. HawkEye supports recommendation and ranking models across several products at Meta. Over the past two years, it has facilitated order-of-magnitude improvements in the time spent debugging production issues.

Cash Flow Sensitivity and Scenarios

FreshBI

Amidst the ebb and flow of revenues, expenditures, and accounts receivable/payable, accurately predicting and managing cash flow remains a significant challenge for businesses of all scales. Let’s make it dynamic and easy with Power BI. Sensitivity and Scenario Analysis: 1) What is sensitivity analysis? Sensitivity analysis within cash flow management involves assessing how changes in specific variables impact the overall cash position.
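To make the idea of sensitivity analysis concrete, here is a minimal sketch in plain Python rather than Power BI, which is the tool the article itself uses. The cash flow model, the function names (`ending_cash`, `sensitivity`), and every figure in `base` are hypothetical and chosen only for illustration. The sketch moves one input at a time and reports how far the ending cash position shifts from the baseline.

```python
def ending_cash(opening_cash, monthly_revenue, monthly_expenses,
                collection_rate, months):
    """Project the cash position after `months`.

    Assumed toy model: each month a fixed share of revenue
    (`collection_rate`) is collected and all expenses are paid.
    """
    cash = opening_cash
    for _ in range(months):
        cash += monthly_revenue * collection_rate - monthly_expenses
    return cash


def sensitivity(base, variable, changes):
    """Vary a single input by each relative change, holding the rest fixed.

    Returns {relative change: ending cash minus baseline ending cash}.
    """
    baseline = ending_cash(**base)
    results = {}
    for change in changes:
        scenario = dict(base)  # copy so the base case is never mutated
        scenario[variable] = base[variable] * (1 + change)
        results[change] = ending_cash(**scenario) - baseline
    return results


# Hypothetical base case; none of these numbers come from the article.
base = {
    "opening_cash": 50_000,
    "monthly_revenue": 100_000,
    "monthly_expenses": 80_000,
    "collection_rate": 0.9,
    "months": 6,
}

for variable in ("monthly_revenue", "monthly_expenses"):
    for change, delta in sensitivity(base, variable, [-0.1, 0.0, 0.1]).items():
        print(f"{variable} {change:+.0%}: ending cash changes by {delta:+,.0f}")
```

Changing one variable at a time keeps each effect easy to attribute. A scenario analysis would instead change several inputs together, for example lower revenue combined with slower collections.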

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.

Building Trust in Public Sector AI Starts with Trusting Your Data

Cloudera

Recent Government Initiatives on Public Sector AI Solutions: In recent years, governments across the globe have recognized the transformative potential of artificial intelligence (AI) and have embarked on initiatives to harness this technology to drive innovation and serve their citizens more effectively. These government-led efforts have had a profound impact on the development and adoption of AI solutions in the public sector, paving the way for a future where data-driven decision-making and au

Integrating NVIDIA TensorRT-LLM with the Databricks Inference Stack

databricks

Over the past six months, we've been working with NVIDIA to get the most out of their new TensorRT-LLM library. TensorRT-LLM provides an easy-to-use Python interface to integrate with a web server for fast, efficient inference performance with LLMs. In this post, we're highlighting some key areas where our collaboration with NVIDIA has been particularly important.

The Best Data Science Resources, Bootcamp, and Courses to Learn Data Science in the New Year

KDnuggets

We've partnered with Springboard, the leading data science bootcamp offering personalized 1:1 mentorship, dedicated career support, proven outcomes, and an unbeatable money-back job guarantee, to present a handpicked collection of resources to supercharge your data science journey in the coming year.

Our First Netflix Data Engineering Summit

Netflix Tech

Holden Karau, Elizabeth Stone, Pedro Duarte, Chris Stephens, Pallavi Phadnis, Lee Woodridge, Mark Cho, Guil Pires, Sujay Jain, Tristan Reid, Senthilnathan Athinarayanan, Bharath Mummadisetty, Abhinaya Shetty, Judit Lantos, Amanuel Kahsay, Dao Mi, Mick Dreeling, Chris Colburn, and Agata Gryzbek. Introduction: Earlier this summer Netflix held our first-ever Data Engineering Forum.

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

Reflective Understanding of Prince2® Principles in a Project Environment in 2024

Knowledge Hut

Managing successful projects in diverse areas such as construction, IT, banking, research and product development, or in the health and service industries, requires adoption of best practices that are pan-geographical. Post-COVID, the world is slowly recovering emotionally and economically, and what is needed are robust recovery measures such as project management best practices that will hasten this recovery and help make things normal again.

MySQL Master Slave Replication: 7 Easy Steps

Hevo

MySQL replication, specifically MySQL master slave replication, plays a vital role in ensuring data availability by enabling simultaneous copying and replication of data between servers. MySQL master slave replication proves indispensable for data recovery, offering a reliable backup solution in the face of catastrophes or hardware failures.

Top 6 Episodes of The Data Chief Podcast: 2023

ThoughtSpot

2023 has been a year of breakthrough innovation for many, and a deer-in-headlights moment for others. I keep flashing back to the 90s when the Internet created new businesses and destroyed others—LLMs are doing the same, only with more velocity. From CDAOs to VCs alike, the rate of creative destruction is faster, but there is also an intense focus on value.

Databricks Named a Leader in 2023 Gartner® Magic Quadrant™ for Cloud Database Management Systems

databricks

We are excited to announce that Gartner has recognized Databricks as a Leader for a third consecutive year in the 2023 Gartner® Magic.

Driving Business Impact for PMs

Speaker: Jon Harmer, Product Manager for Google Cloud

Move from feature factory to customer outcomes and drive impact in your business! This session will provide you with a comprehensive set of tools to help you develop impactful products by shifting from output-based thinking to outcome-based thinking. You will deepen your understanding of your customers and their needs, and learn to identify and de-risk the different kinds of hypotheses built into your roadmap.