May, 2024


Is the “AI developer” a threat to jobs – or a marketing stunt?

The Pragmatic Engineer

This article was published on 14 March 2024 in The Pragmatic Engineer, for subscribers. I'm sharing this piece in public more than a month later, as it provides important context and analysis for the AI dev tools space. Subscribe to The Pragmatic Engineer to stay up-to-date on what is happening with software engineering, Big Tech, and startups.


mapGroupsWithState and... batch?

Waitingforcode

That's one of my recent surprises. While exploring arbitrary stateful processing, hence mapGroupsWithState among others, I mistakenly created a batch DataFrame and applied the mapping function on top of it. Turns out, it worked! Well, not really, but I'll let you discover why in this blog post.

Process 130


4 ELT Alternatives To Airbyte – How To Ingest Your Data

Seattle Data Guy

Getting data out of source systems and into a data warehouse or data lake is one of the first steps in making it usable by analysts and data scientists. The question is: how will your team do that? Will they write custom data connectors, pay for a data connector out of the box or perhaps…

Data Lake 130

Barking Up The Wrong GPTree: Building Better AI With A Cognitive Approach

Data Engineering Podcast

Summary: Artificial intelligence has dominated the headlines for several months due to the successes of large language models. This has prompted numerous debates about the possibility of, and timeline for, artificial general intelligence (AGI). Peter Voss has dedicated decades of his life to the pursuit of truly intelligent software through the approach of cognitive AI.

Building 130

How To Get Promoted In Product Management

Speaker: John Mansour

If you're looking to advance your career in product management, there are more options than just climbing the management ladder. Join our upcoming webinar to learn about highly rewarding career paths that don't involve management responsibilities. We'll cover both career tracks and provide tips on how to position yourself for success in the one that's right for you.


What’s New in ArcGIS Pro 3.3

ArcGIS

Discover the exciting new features of ArcGIS Pro 3.3. From water flow modeling to direct PDF support, this release has it all. Read our blog to learn more.

IT 142

How to build a data team

Christophe Blefari

My personal collection of the best resources to bootstrap a data team and get inspired from what others are doing.

Building 130

More Trending


OutputModes in Apache Spark Structured Streaming - complementary notes

Waitingforcode

I wrote a blog post about OutputModes 6 (yes!) years ago and after reading it a few times, I realized it was not good enough to be a quick refresher. For that reason you can read about OutputModes for the second time here. Hopefully, this one will be a good try!

IT 130

Reading and Processing JSON with Rust vs Python.

Confessions of a Data Guy

Have you ever wondered about being explicit in your code vs being vague? I think about this a lot as I’m writing code on a daily basis. I’ve found I like being explicit and verbose when writing code, rather than being vague in what I’m doing most of the time. When it comes to debugging […]

Python 100

A Notebook is all I want or Don't

Data Engineering Weekly

The tweet received strong reactions on LinkedIn and Twitter. To clarify, I quoted it as a Notebook-style development, but it is not exactly a Notebook. There is a lot of context missing in that tweet, so I decided to write a blog about it. People have reservations about using tools like Jupyter Notebook for production pipelines, and for good reason.


Snowflake Cortex LLM Functions Moves to General Availability with New LLMs, Improved Retrieval and Enhanced AI Safety

Snowflake

Snowflake Cortex, a fully managed service that enables access to industry-leading large language models (LLMs), is now generally available. You can use these LLMs in select regions directly via LLM Functions on Cortex so you can bring generative AI securely to your governed data. Your team can focus on building AI applications, while we handle model optimization and GPU infrastructure to deliver cost-effective performance.


Navigating the Future: Generative AI, Application Analytics, and Data

Generative AI is upending the way product developers & end-users alike are interacting with data. Despite the potential of AI, many are left with questions about the future of product development: How will AI impact my business and contribute to its success? What can product managers and developers expect in the future with the widespread adoption of AI?


Top 10 Startups in India – Everyone Should Know

Knowledge Hut

As of the beginning of January 2022, India has recognized more than 61,000 startups, thus having the 3rd largest startup ecosystem after the US and China. The government of India has an initiative called Startup India, whose sole purpose is to bring about startup culture and build an ecosystem for entrepreneurship and innovation. As a result, the startup ecosystem in India has emerged as a major growth engine for the country in the past few years and aims to become a global tech powerhouse.


A Roadmap to Machine Learning Algorithm Selection

KDnuggets

The goal of this article is to help demystify the process of selecting the proper machine learning algorithm, concentrating on "traditional" algorithms and offering some guidelines for choosing the best one for your application.


Robinhood Response to Receipt of Wells Notice from the U.S. Securities and Exchange Commission

Robinhood

Robinhood Markets, Inc. (Nasdaq: HOOD) today announced that RHC has received a Wells Notice from the SEC Staff. Earlier today we announced Robinhood Crypto (RHC) has received a Wells Notice from the U.S. Securities and Exchange Commission staff indicating they will recommend that the Commission file an enforcement action. “After years of good faith attempts to work with the SEC for regulatory clarity including our well-known attempt to ‘come in and register,’ we are disappointed that the agency has


The right words in the right place

Tweag

tl;dr: You may not believe it, but Nix documentation is getting better. Nixpkgs and NixOS still need more time. This is a retrospective of my and many other people’s work on documentation in the Nix ecosystem between October 2022 and March 2024.


Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enhance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.


We’ll See You at the Gartner Data and Analytics Summit

Cloudera

The Gartner Data and Analytics Summit in London is quickly approaching on May 13th to 15th, and the Cloudera team is ready to hit the show floor! The theme of this year’s summit, “Generating Value Together: Creating Synergies between Data, Analytics & AI,” could not have come at a better time as we push forward on our AI and analytics journey together.

Banking 83

Meet the 2024 Snowflake Startup Challenge Finalists

Snowflake

The 2024 Snowflake Startup Challenge began with over 900 applications from startups Powered by Snowflake in more than 100 countries. Our judges narrowed that long list of contenders down to 10, and after much deliberation, they’ve now pared it down to the final three. We are pleased to announce that BigGeo, Scientific Financial Systems and SignalFlare.ai by Extropy360 will advance to the Snowflake Startup Challenge finale and compete for the opportunity to receive a share of up to $1 million in

Media 103

What is Project in Project Management? Types, Importance and Examples

Knowledge Hut

In the dynamic business environment of current times, existing business organizations aggressively seek to upgrade or change their practices, and startups begin with the best practices of the processes. Both need the route of the Project to accomplish their objective. So, what is a project in this dynamic business environment? Projects are, in short, vehicles of change.

Project 97

Free AI Courses from NVIDIA: For All Levels

KDnuggets

Want to build cool AI applications? Start learning AI today with these free courses from NVIDIA.

Building 134

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.


Pushing the Boundaries of Innovation with Data and AI: Announcing the 2024 Finalists of the Databricks Data Team Transformation Award

databricks

The Data Team Awards celebrates enterprise data teams' essential role in helping businesses across sectors face their most pressing challenges. With more than.

Data 89

5 Things to do When Evaluating ELT/ETL Tools

Towards Data Science

A list to make evaluating ELT/ETL tools a bit less daunting. We’ve all been there: you’ve attended (many!) meetings with sales reps from all of the SaaS data integration tooling companies and are granted 14-day access to try their wares. Now you have to decide what sorts of things to test in order to figure out definitively if the tool is the right commitment for you and the team.


Join us at the Iceberg Summit 2024

Cloudera

Apache Iceberg is vital to the work we do and the experience that the Cloudera platform delivers to our customers. Iceberg, a high-performance open-source format for huge analytic tables, delivers the reliability and simplicity of SQL tables to big data while allowing for multiple engines like Spark, Flink, Trino, Presto, Hive, and Impala to work with the same tables, all at the same time.


Moving Beyond MTEB and BEIR: Snowflake AI Research Joins Forces with the University of Waterloo to Evolve RAG and Retrieval Benchmarks

Snowflake

To accurately answer business questions using LLMs, companies must augment models with their data. Retrieval Augmented Generation (RAG) is a popular solution to this problem, as it integrates the organization’s factual, real-time data into the prompt for the LLM. While the adoption of RAG has increased, an open question remains: How do enterprises know how effective their system is?

Cloud 101

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.


Six Sigma Green Belt Project Examples & How to Execute?

Knowledge Hut

The Lean Six Sigma Green Belt certification is an important step in becoming a master of the lean six sigma technique and leading improvement projects for a company. LSS Green Belts identify critical areas for improvement and play a key role in executing the necessary changes, based on the ideas and abilities learned throughout LSS Yellow Belt training.

Project 98

5 Simple Steps to Automate Data Cleaning with Python

KDnuggets

Automate your data cleaning process with a practical 5-step pipeline in Python, ideal for beginners.

Python 139

The Modern Data Stack: How The Evolution of Data Architecture Led to The Data Intelligence Platform

databricks

Modern data stacks provide the necessary flexibility and efficiency for analytics and AI. Learn how the Databricks Data Intelligence Platform makes use of them.


Robinhood Reports First Quarter 2024 Results

Robinhood

Robinhood Markets, Inc. (Nasdaq: HOOD) today reported financial results for the quarter ended March 31, 2024. Read our Q1 2024 earnings press release here. Access more information at investors.robinhood.com.


How Embedded Analytics Gets You to Market Faster with a SAAS Offering

Start-ups & SMBs launching products quickly must bundle dashboards, reports, & self-service analytics into apps. Customers expect rapid value from your product (time-to-value), data security, and access to advanced capabilities. Traditional Business Intelligence (BI) tools can provide valuable data analysis capabilities, but they have a barrier to entry that can stop small and midsize businesses from capitalizing on them.


DataKitchen Training And Certification Offerings

DataKitchen

DataKitchen Training And Certification Offerings For Individual contributors with a background in Data Analytics/Science/Engineering Overall Ideas and Principles of DataOps DataOps Cookbook (200 page book over 30,000 readers, free): DataOps Certification (3 hours, online, free, signup online): DataOps Manifesto (over 30,000 signatures) One Day DataOps training (paid) Data Observability (the first step in DataOps) Ideas and Principles of Data Observability Four-part Da


Gen AI Perspectives from Industry Leaders Shaping the Future

Snowflake

From its start with efficient batch processing with data warehouses for descriptive analytics, and the inclusion of streaming data in real time to build recommendations, we find ourselves at the forefront of a new stage of evolution: generative AI (gen AI). This generative powerhouse has fueled vertical integration, giving rise to industry-specific solutions that harness the full potential of generative capabilities and unlocked the imagination of many.


How Systems Thinking Can Be Applied To Agile Transformations

Knowledge Hut

Systems thinking views a system as a set of interconnected and interdependent components, defined by its limits and more than the sum of its parts (subsystems). When one component of a system is altered, the effects frequently spread across the entire system. While the precise impact may not be foreseeable in and of itself, many behavioral patterns are impactful.

Systems 98

Avoid These 5 Common Mistakes Every Novice in AI Makes

KDnuggets

Top five mistakes made by AI beginners and practical tips to avoid them, along with an engaging "50-Day Challenge" that you cannot afford to miss.


Embedding BI: Architectural Considerations and Technical Requirements

While data platforms, artificial intelligence (AI), machine learning (ML), and programming platforms have evolved to leverage big data and streaming data, the front-end user experience has not kept up. Holding onto old BI technology while everything else moves forward is holding back organizations. Traditional Business Intelligence (BI) tools aren’t built for modern data platforms and don’t work on modern architectures.