Sat. Sep 04, 2021 - Fri. Sep 10, 2021

Value Proposition of the Cloudera Operational Database over Legacy Apache HBase Deployments

Cloudera

The CDP Operational Database (COD) builds on the foundation of existing operational database capabilities that were available with Apache HBase and/or Apache Phoenix in legacy CDH and HDP deployments. Within the context of a broader data and analytics platform implemented in the Cloudera Data Platform (CDP), COD will function as a highly scalable relational and non-relational transactional database, allowing users to leverage big data in operational applications as well as the backbone of the a

Event Sourcing Outgrows the Database

Confluent

I’ve always found event sourcing to be fascinating. We spend so much of our lives as developers saving data in database tables—doing this in a completely different way seems almost […].

Big Data 50: Companies Driving Innovation

DataKitchen

The post Big Data 50: Companies Driving Innovation first appeared on DataKitchen.

Data-Driven Performance Improvements: Basketball and actionable insights

Retail Insight

At the 1992 Olympics, the American men’s basketball team won the gold medal after years of disappointment and underperformance. For the first time at an Olympics, Team USA was comprised of professional US National Basketball Association (NBA) players, including the legendary Michael Jordan. Since this ‘Dream Team’ was formed, the USA men’s basketball team has won seven golds at the last eight Olympics, including most recently at Tokyo 2020.

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD, Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

#ClouderaLife Spotlight: Fanly Tanto, Regional Sales Director

Cloudera

Meet Fanly Tanto. Fanly is a Regional Sales Director operating out of Indonesia and the recent recipient of Channel Asia’s Women in ICT “Shining Star” Award – an award recognizing candidates with “a strong record of achievement and a consistent high performer who regularly achieves standout business results and continues to assume increased levels of seniority.”

Spark vs Hive - What's the Difference

ProjectPro

Apache Hive and Apache Spark are two popular Big Data tools for complex data processing. To use them effectively, it is essential to understand their features and capabilities. This Spark vs. Hive comparison elaborates on the two tools’ architecture, features, limitations, and key differences.

More Trending

Hello World: Join the New Rockset Developer Community

Rockset

At Rockset, we work hard to build developer tools (as well as APIs and SDKs) that allow you to easily consume semi-structured data using SQL and run sub-second queries on real-time data. You automatically get our Converged Index™, which unifies indexing, sub-second query latency on terabytes of nested data, real-time data ingestion for mere seconds in data latency, and much more.

Cloudera and NVIDIA Help IRS Fight Fraud, Safeguard Taxpayers

Cloudera

Across the federal government, agencies are struggling to identify, organize, analyze, and act on troves of data. It’s a problem that leaders are working actively to tackle, but they’re in a race against immeasurable volumes of data that are continuously being generated in stores known and unknown. At the Internal Revenue Service, decades’ worth of data exceeds even the most cutting-edge processing capabilities.

Taking Pride in Our Actions

Teradata

Corporate responsibility may have a new name but Teradata’s commitments continue to shine. Read Claire Bramley and Molly Treese’s overview of Teradata’s dedicated ESG efforts.

Welcome, KC!

Grouparoo

The promise of open source is one of community. It is about people making great things together. With that in mind, maybe it's not surprising that we first met KC Glick years ago when he contributed to the Actionhero project that is at the core of Grouparoo. Now, he's on the Grouparoo team and will be contributing throughout the stack. KC comes to us most recently from iHeart, the media company that runs all those stations we listen to.

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

See Rockset’s Rollups for Streaming Data at Kafka Summit 2021

Rockset

Event stream processing has lately become the most-requested feature among data practitioners, who are constantly pushed by their business counterparts for fresher, real-time insights to improve their operational decisions and boost the digital customer experience. But while streaming data is easy, analyzing it in real time was, until recently, too expensive and too slow.

Slowly Changing Dimensions (SCD Type 1) with Delta and Databricks

Advancing Analytics: Data Engineering

From Warehouse to Lakehouse Pt. 1: SCD Type 1 in SQL and Python. Introduction: With the move to cloud-based Data Lake platforms, there has often been criticism from the more traditional Data Warehousing community. A Data Lake, offering cheap, almost endlessly scalable storage in the cloud, is hugely appealing to a platform administrator; however, over the years that this has been promoted, some adopters have fallen victim to the infamous Data Swamp.

20 Web Scraping Projects Ideas for 2023

ProjectPro

In this article, you will find a list of interesting web scraping projects that are fun and easy to implement. The list has worthwhile web scraping projects for both beginners and intermediate professionals. The projects have been divided into categories so that you can quickly pick one as per your requirements.

Reflecting on Change.

Teradata

Change is inevitable, but you have to adapt to survive. Take a look back on the last 40 years to see how Teradata has adapted to change and not only survived, but thrived.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Apache Superset™ Now Supports Rockset

Preset

Apache Superset™ now supports Rockset as a data source. Rockset is a real-time indexing database built for the cloud that uses RocksDB for fast storage.

Why Take a Warehouse-First Approach to Analytics

RudderStack

This post explores how leveraging your data warehouse as a central, foundational source of truth unlocks higher quality, more secure, and less expensive data analytics.

Top 15 Machine Learning Use Cases in 2023

ProjectPro

The Machine Learning market is anticipated to be worth $30.6 Billion in 2024. The world is increasingly driven by the Internet of Things (IoT) and Artificially Intelligent (AI) solutions. Machine Learning plays a vital role in the design and development of such solutions. Machine learning is everywhere. We live in an era led by machine learning applications, be it the Voice Assistants on our Smartphones, the Face Unlock feature, the surge pricing on the ride-hailing apps, email filtering, and m

A View From The Round Table Of Gartner's Cool Vendors

Data Engineering Podcast

Summary: Gartner analysts are tasked with identifying promising companies each year that are making an impact in their respective categories. For businesses that are working in the data management and analytics space, they recognized the efforts of Timbr.ai, Soda Data, Nexla, and Tada. In this episode the founders and leaders of each of these organizations share their perspective on the current state of the market, and the challenges facing businesses and data professionals today.

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

Building an Open-source Ingestion Layer with Airbyte

Preset

To build an open-source community tracker, we first build an ingestion layer with Airbyte.

RudderStack Product News Vol. #012 - Call for Beta Users

RudderStack

In this update, we cover the S3 Data Lake destination, our Braze Currents source, and other new integrations.

How to Become an MLOps Engineer in 2023?

ProjectPro

In the past few years, there has been a massive increase in the demand for data-related roles. The hiring for machine learning and artificial intelligence-related roles has grown by 74% annually. People from a multitude of backgrounds are trying to break into the data industry. Most of these individuals attempt to land a job in data science or analytics.

Jellyfish: Cost-Effective Data Tiering for Uber’s Largest Storage System

Uber Engineering

Problem. Uber deploys a few storage technologies to store business data based on their application model. One such technology is called Schemaless, which enables the modeling of related entries in one single row of multiple columns, as well as …

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies who are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground-up, graduating from just getting started with data-driven development to operating

Decision Making at Netflix

Netflix Tech

Martin Tingley with Wenjing Zheng, Simon Ejdemyr, Stephanie Lane, and Colin McFarland. This introduction is the first in a multi-part series on how Netflix uses A/B tests to make decisions that continuously improve our products, so we can deliver more joy and satisfaction to our members. Subsequent posts will cover the basic statistical concepts underpinning A/B tests, the role of experimentation across Netflix, how Netflix has invested in infrastructure to support and scale experimentation, a

What is Operational Analytics?

Grouparoo

We've improved the Getting Started Experience! Check out our UI Configuration method. The steps utilizing grouparoo generate will not be replicable, as the command will be fully deprecated in v0.8.1. What is Operational Analytics? Operational analytics is the process of creating data pipelines and datasets to support business teams such as sales, marketing, and customer support.

Enabling Multi-User Fine-Grained Access Control for Cloud Storage in CDP

Cloudera

Shared Data Experience (SDX) on Cloudera Data Platform (CDP) enables centralized data access control and audit for workloads in the Enterprise Data Cloud. The public cloud (CDP-PC) editions default to using cloud storage (S3 for AWS, ADLS-gen2 for Azure). This introduces new challenges around managing data access across teams and individual users.

Micro Frontends: Deep Dive into Rendering Engine (Part 2)

Zalando Engineering

Zalando's Fashion Store has been running on top of microservices for quite some time already. This architecture has proven to be very flexible, and project Mosaic has extended it – although partially – to the frontend, allowing HTML fragments from multiple services to be stitched together, and served as a single page. Fragments in Mosaic can be seen as the first step towards a Micro Frontends architecture.

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

Data Integration: Approaches, Techniques, Tools, and Best Practices for Implementation

AltexSoft

The pace of data being created is mind-blowing. For example, Amazon receives more than 66,000 orders per hour with each order containing valuable pieces of information for analytics. Yet, dealing with continuously growing volumes of data isn’t the only challenge businesses encounter on the way to better, faster decision-making. Information often resides across countless distributed data sources, resulting in data silos.

Data Engineering Annotated Monthly – August 2021

Big Data Tools

August is usually a quiet month, with vacations taking their toll. But data engineering never stops. I’m Pasha Finkelshteyn and I will be your guide through this month’s news, my impressions of the developments, and ideas from the wider community. If you think I missed something worthwhile, ping me on Twitter and suggest a topic, link, or anything else.

Supporting Transformation with an Integrated Data Platform. Three Common Questions Answered.

Cloudera

In recent years there has been increased interest in how to safely and efficiently extend enterprise data platforms and workloads into the cloud. CDOs are under increasing pressure to reduce costs by moving data and workloads to the cloud, similar to what has happened with business applications during the last decade. Our upcoming webinar is centered on how an integrated data platform supports the data strategy and goals of becoming a data-driven company.

AWS vs GCP - Which One to Choose in 2023?

ProjectPro

Are you confused about choosing the best cloud platform for your next data engineering project? This AWS vs. GCP blog compares the two major cloud platforms to help you choose the best one. So, are you ready to explore the differences between two cloud giants, AWS vs. Google Cloud? Let’s get started!

Reimagined: Building Products with Generative AI

“Reimagined: Building Products with Generative AI” is an extensive guide for integrating generative AI into product strategy and careers featuring over 150 real-world examples, 30 case studies, and 20+ frameworks, and endorsed by over 20 leading AI and product executives, inventors, entrepreneurs, and researchers.