December, 2023

article thumbnail

How Much Data Do We Need? Balancing Machine Learning with Security Considerations

Towards Data Science

For a data scientist, there’s no such thing as too much data. But when we take a broader look at the organizational context, we have to balance our goals with other considerations. Photo by Trnava University on Unsplash Data Science vs Security/IT: A Battle for the Ages Acquiring and keeping data is the focus of a huge amount of our mental energy as data scientists.

article thumbnail

25 Free Courses to Master Data Science, Data Engineering, Machine Learning, MLOps, and Generative AI

KDnuggets

Discover a collection of top courses to launch your dream career or master a new skill, all for free!

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Streaming in Data Engineering

Towards Data Science

Streaming data pipelines and real-time analytics Continue reading on Towards Data Science »

article thumbnail

A Tech Conference Listed Fake Speakers for Years: I Accidentally Noticed

The Pragmatic Engineer

For 3 years straight, the DevTernity conference listed non-existent Coinbase employees as featured speakers. When were they added and what could have the motivation been? Three featured speakers listed at DevTernity 2021, 2022 and 2023, and JDKon 2024. These people do not exist. A year ago, I spent months doing an investigative report on how UK events tech company Pollen had its staff work for free, as it had run out of money but still kept operating.

article thumbnail

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enahance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

article thumbnail

Troubleshooting Kafka In Production

Data Engineering Podcast

Summary Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Operating it at scale, however, is notoriously challenging. Elad Eldor has experienced these challenges first-hand, leading to his work writing the book "Kafka: : Troubleshooting in Production" In this episode he highlights the sources of complexity that contribute to Kafka's operational difficulties, and some of the main ways to identify and mitigate

Kafka 245
article thumbnail

Unlocking the Power of Containers: Exploring the Top 20 Docker Containers for Every Development Need

Analytics Vidhya

Introduction Docker containers have emerged as indispensable tools in the fast-evolving landscape of software development and deployment, providing a lightweight and efficient way to package, distribute, and run applications. This article delves into the top 20 Docker containers across various categories, showcasing their features, use cases, and contributions to streamlining development workflows.

231
231

More Trending

article thumbnail

10 GitHub Repositories to Master Machine Learning

KDnuggets

The blog covers machine learning courses, bootcamps, books, tools, interview questions, cheat sheets, MLOps platforms, and more to master ML and secure your dream job.

article thumbnail

Creating High Quality RAG Applications with Databricks

databricks

Retrieval-Augmented-Generation (RAG) has quickly emerged as a powerful way to incorporate proprietary, real-time data into Large Language Model (LLM) applications. Today we are.

Data 131
article thumbnail

Mentoring software engineers or engineering leaders

The Pragmatic Engineer

I get asked every now and then if I offer 1:1 mentoring for either software engineers or engineering managers or leaders. While I used to do this in the past, I don't offer this any more. I collected much of the advice I have to offer for software engineers in The Software Engineer's Guidebook. I also write The Pragmatic Engineer Newsletter where I do cover topics like what it means to be a senior engineer at various companies , how to deal with a low-quality engineering culture , and

article thumbnail

Making Flink Serverless, With Queries for Less Than a Penny

Confluent

Dive into the serverless architecture of Confluent Cloud for Apache Flink and explore its benefits like reduced infrastructure costs, increased reliability, & seamless adoption.

article thumbnail

Understanding User Needs and Satisfying Them

Speaker: Scott Sehlhorst

We know we want to create products which our customers find to be valuable. Whether we label it as customer-centric or product-led depends on how long we've been doing product management. There are three challenges we face when doing this. The obvious challenge is figuring out what our users need; the non-obvious challenges are in creating a shared understanding of those needs and in sensing if what we're doing is meeting those needs.

article thumbnail

Practical Magic: Improving Productivity and Happiness for Software Development Teams

LinkedIn Engineering

Co-authors: Max Kanat-Alexander and Grant Jenks Today we are open-sourcing the LinkedIn Developer Productivity & Happiness Framework (DPH Framework) - a collection of documents that describe the systems, processes, metrics, and feedback systems we use to understand our developers and their needs internally at LinkedIn. Now more than ever, developers are navigating so much change and new opportunity in this new era of Generative AI, so ensuring teams have the systems, processes, metrics and f

article thumbnail

Make this 3D printed globe please

ArcGIS

It's that time of year to warm ourselves beside the electric hum of a plastic filament printer and fall into the joy of making.

IT 140
article thumbnail

Free MIT Course: TinyML and Efficient Deep Learning Computing

KDnuggets

Curious about optimizing AI for everyday devices? Dive into the complete overview of MIT's TinyML and Efficient Deep Learning Computing course. Explore strategies to make AI smarter on small devices. Read the full article for an in-depth look!

article thumbnail

Introducing Databricks Vector Search Public Preview

databricks

Following the announcement we made yesterday around Retrieval Augmented Generation (RAG), today, we’re excited to announce the public preview of Databricks Vector Search. W.

article thumbnail

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

article thumbnail

The Pragmatic Engineer Newsletter in 2023

The Pragmatic Engineer

2023 was the second full year of The Pragmatic Engineer Newsletter , and this newsletter is now almost two and a half years old; the first issue came out on 26 August 2021. Thank you for being a reader, I greatly value your support. This year, 102 newsletter issues were published, and this is number 103. You received a deepdive issue on Tuesdays, and every Thursday it was  “The Pulse”  – formerly The Scoop.

article thumbnail

If software development were a race, AI wins every time by Colin Eberhardt

Scott Logic

An exploration of the quantitative and qualitative impacts of Generative AI on software development. We’ve undertaken multiple experiments to better understand the impact of GenerativeAI tools (ChatGPT, Copilot) on developer productivity. Our quantitative results show a 37% improvement in productivity (speed), however, this result is a misrepresentation of what it means to be productive as a developer.

article thumbnail

Unlock the New Wave of Gen AI With Snowpark Container Services GPU-Powered Compute

Snowflake

The rise of generative AI (gen AI) is inspiring organizations to envision a future in which AI is integrated into all aspects of their operations for a more human, personalized and efficient customer experience. However, getting the required compute infrastructure into place, particularly GPUs for large language models (LLMs), is a real challenge. Accessing the necessary resources from cloud providers demands careful planning and up to month-long wait times due to the high demand for GPUs.

Scala 117
article thumbnail

Join Enhancements in ArcGIS Pro 3.2

ArcGIS

ArcGIS Pro 3.2 includes a number of enhancements to the Spatial Join, Add Spatial Join, Add Join, and Join Field tools.

138
138
article thumbnail

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

article thumbnail

Building Predictive Models: Logistic Regression in Python

KDnuggets

Image by Author When you are getting started with machine learning, logistic regression is one of the first algorithms you’ll add to your toolbox.

Python 152
article thumbnail

Improve your RAG application response quality with real-time structured data

databricks

Retrieval Augmented Generation (RAG) is an efficient mechanism to provide relevant data as context in Gen AI applications. Most RAG applications typically use.

article thumbnail

Real-Time Field Service Optimization

Confluent

Telcos use Confluent with event-driven microservices to enable real-time communications with 3rd-party field service providers, fulfilling customer service requests more efficiently.

108
108
article thumbnail

AI debugging at Meta with HawkEye

Engineering at Meta

HawkEye is the powerful toolkit used internally at Meta for monitoring, observability, and debuggability of the end-to-end machine learning (ML) workflow that powers ML-based products. HawkEye supports recommendation and ranking models across several products at Meta. Over the past two years, it has facilitated order of magnitude improvements in the time spent debugging production issues.

article thumbnail

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.

article thumbnail

Snowflake Announces Agreement to Acquire Samooha to Simplify Building Interoperable Data Clean Rooms in the Data Cloud

Snowflake

When businesses share sensitive first-party data with outside partners or customers, they must do so in a way that meets strict governance requirements around security and privacy. Data clean rooms have emerged as the technology to meet this need, enabling interoperability where multiple parties can collaborate on and analyze sensitive data in a governed way without exposing direct access to the underlying data and business logic.

Cloud 113
article thumbnail

Cash Flow Sensitivity and Scenarios

FreshBI

Amidst the ebb and flow of revenues, expenditures, and accounts receivable/payable, accurately predicting and managing cash flow remains a significant challenge for businesses of all scales. Let’s make it dynamic and easy with Power BI. Sensitivity and Scenario Analysis 1) What is sensitivity analysis? Sensitivity analysis within cash flow management involves assessing how changes in specific variables impact the overall cash position.

BI 105
article thumbnail

The Best Data Science Resources, Bootcamp, and Courses to Learn Data Science in the New Year

KDnuggets

We've partnered with Springboard, the leading data science bootcamp offering personalized 1:1 mentorship, dedicated career support, proven outcomes, and an unbeatable money-back job guarantee, to present a handpicked collection of resources to supercharge your data science journey in the coming year.

article thumbnail

Databricks Named a Leader in 2023 Gartner® Magic Quadrant™ for Cloud Database Management Systems

databricks

We are excited to announce that Gartner has recognized Databricks as a Leader for a third consecutive year in the 2023 Gartner® Magic.

Systems 132
article thumbnail

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

article thumbnail

Just Arrived: New Symbols on the Robinhood 24 Hour Market

Robinhood

Robinhood is the only US retail brokerage to offer 24/5 trading of single name stocks At Robinhood, we know the world never stops – and believe investing shouldn’t be any different. Since launching in May, we’ve seen customers utilize the unprecedented flexibility and access to the markets with the Robinhood 24 Hour Market. And we’re just getting started – we’re proud to announce that we’ve expanded the total number of symbols available from 95 to 226.

Retail 107
article thumbnail

Top Trends in Agile You Can’t Miss in 2024

Knowledge Hut

Technology is evolving at breakneck speed, and the information we consume every day continues to grow exponentially with every passing day. Analyzing this complex mountain of data to make the right decisions informed by this data has become ever more challenging. Traditional models of project management , like the waterfall method and hierarchical team structures are too rigid to respond to the fast-paced change organizations are facing today.

article thumbnail

Building Trust in Public Sector AI Starts with Trusting Your Data

Cloudera

Recent Government Initiatives on Public Sector AI Solutions In recent years, governments across the globe have recognized the transformative potential of artificial intelligence (AI) and have embarked on initiatives to harness this technology to drive innovation and serve their citizens more effectively. These government-led efforts have had a profound impact on the development and adoption of AI solutions in the public sector, paving the way for a future where data-driven decision-making and au

Building 105
article thumbnail

Our First Netflix Data Engineering Summit

Netflix Tech

Holden Karau Elizabeth Stone Pedro Duarte Chris Stephens Pallavi Phadnis Lee Woodridge Mark Cho Guil Pires Sujay Jain Tristan Reid Senthilnathan Athinarayanan Bharath Mummadisetty Abhinaya Shetty Judit Lantos Amanuel Kahsay Dao Mi Mick Dreeling Chris Colburn and Agata Gryzbek Introduction Earlier this summer Netflix held our first-ever Data Engineering Forum.

article thumbnail

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.