April, 2019

Running Your Database On Kubernetes With KubeDB

Data Engineering Podcast

Summary: Kubernetes is a driving force in the renaissance around deploying and running applications. However, managing the database layer is still a separate concern. The KubeDB project was created as a way of providing a simple mechanism for running your storage system in the same platform as your application. In this episode Tamal Saha explains how the KubeDB project got started, why you might want to run your database with Kubernetes, and how to get started.

Database 100

12 Programming Languages Walk into a Kafka Cluster…

Confluent

When it was first created, Apache Kafka® had a client API for just Scala and Java. Since then, the Kafka client API has been developed for many other programming languages, which lets you pick the language you want. This freedom of choice ultimately allows you to build an event streaming platform with the language best suited to your business needs.

3 Ways New As-a-Service Offerings Bring Choice and Flexibility to Teradata Vantage

Teradata

At Teradata, we think a lot about our customers in the cloud, and we continue to deliver on our promise of choice and flexibility by adding new as-a-service options for Teradata Vantage.

Cloud 83

Introducing SVT-AV1: a scalable open-source AV1 framework

Netflix Tech

By Andrey Norkin, Joel Sole, Kyle Swanson, Mariana Afonso, Anush Moorthy, Anne Aaron. Netflix Headquarters, Winchester Circle. Netflix headquarters circa 2014. It’s a nice building with good architecture! This was the primary home of Netflix for a number of years during the company’s growth, but at some point Netflix had outgrown its home and needed more space.

Coding 64

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Breaking Down Data Silos in Financial Services with a Centralized Data Management Platform

Cloudera

Organizations in the financial services industry rely on data to make strategic decisions, drive their businesses, and maintain a competitive edge. The Bank of England was discovering that legacy tools were no longer sufficient to satisfy the growing demands of analysts and economists. The Bank of England is the central bank of the United Kingdom, formed in 1694.

Analytics on DynamoDB: Comparing Elasticsearch, Athena and Spark

Rockset

In this blog post I compare options for real-time analytics on DynamoDB (Elasticsearch, Athena, and Spark) in terms of ease of setup, maintenance, query capability, and latency. There is limited support for SQL analytics with some of these options. I also evaluate which use cases each of them is best suited for. Developers often have a need to serve fast analytical queries over data in Amazon DynamoDB.

NoSQL 52

More Trending

Announcing Confluent Cloud for Apache Kafka as a Native Service on Google Cloud Platform

Confluent

I’m excited to announce that we’re partnering with Google Cloud to make Confluent Cloud, our fully managed offering of Apache Kafka®, available as a native offering on Google Cloud Platform (GCP). This means you will have the ability to use Confluent Cloud’s managed Apache Kafka service with familiar Google tools and processes, including integration into the Google Cloud Console and GCP Marketplace to provide a seamless sign-up experience, and integrated billing and first-line support provided

Why Smart Cities Need Intelligent Data

Teradata

In his blog, Bob McQueen defines smart cities, their challenges and opportunities, and the use of smart data management.

Data 86

Open Source: March Updates - A new Kubernetes operator & more Cloud Native Apps

Zalando Engineering

Project Highlights: A new operator has been added to Zalando’s list of Cloud Native Applications. Elasticsearch Operator is an operator for running Elasticsearch in Kubernetes with a focus on operational aspects, like safe draining and offering auto-scaling capabilities for Elasticsearch data nodes, rather than just abstracting manifest definitions. To make things even simpler for developers, we also released a new framework that helps to build Kubernetes operators in Python.

Cloud 52

Machine Learning in Production: Software Architecture

Domino Data Lab: Data Engineering

Special thanks to Addison-Wesley Professional for permission to excerpt the following "Software Architecture" chapter from the book, Machine Learning in Production. This chapter excerpt provides data scientists with insights and tradeoffs to consider when moving machine learning models to production. Also, if you’re interested in learning about how Domino provides an API endpoint for your model, check out this video tutorial on the Domino Support site.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

What customer centric corporate culture really means and why it is so important

Cloudera

All organizations, big or small, have a unique corporate culture that has been nurtured and mastered over the years. A company’s culture is its basic personality and the essence of how employees interact and work. It is the sum of company beliefs, ethics, expectations, goals, value and mission. The company culture is normally where brand promises are either kept or broken.

Index Your Big Data With Pilosa For Faster Analytics

Data Engineering Podcast

Summary: Database indexes are critical to ensure fast lookups of your data, but they are inherently tied to the database engine. Pilosa is rewriting that equation by providing a flexible, scalable, performant engine for building an index of your data to enable high-speed aggregate analysis. In this episode Seebs explains how Pilosa fits in the broader data landscape, how it is architected, and how you can start using it for your own analysis.

Big Data 100

KSQL: What’s New in 5.2

Confluent

KSQL enables you to write streaming applications expressed purely in SQL. There’s a ton of great new features in 5.2, many of which are a result of requests and support from the community—we use GitHub to track these, and I’ve indicated in each point below the corresponding issue. If you have suggestions for new features, please do be sure to search our GitHub issues page and upvote, or create a new issue as appropriate.

Food 95

How to Analyze Data at Speed and Scale Using Pervasive Data Intelligence

Teradata

Chris Twogood explains why large companies that utilize data need Pervasive Data Intelligence in order to leverage all of their data, all of the time.

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies who are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground-up, graduating from just getting started with data-driven development to operating

How to set an ideal thread pool size

Zalando Engineering

We all know that thread creation in Java is not free. The actual overhead varies across platforms, but thread creation takes time, introducing latency into request processing, and requires some processing activity by the JVM and OS. This is where the Thread Pool comes to the rescue. The thread pool reuses previously created threads to execute current tasks and offers a solution to the problem of thread cycle overhead and resource thrashing.

Java 45

How We Structure our dbt Projects

dbt Developer Hub

As the maintainers of dbt, and analytics consultants, at Fishtown Analytics (now dbt Labs) we build a lot of dbt projects. Over time, we’ve developed internal conventions on how we structure them. This article does not seek to instruct you on how to design a final model for your stakeholders; it won’t cover whether you should denormalize everything into one wide master table, or have many tables that need to be joined together in the BI layer.

Project 40

Secondary Indexes For Analytics On DynamoDB

Rockset

In this post I explore how to support analytical queries without encountering prohibitive scan costs, by leveraging secondary indexes in DynamoDB. I also evaluate the pros and cons of this approach in contrast to extracting data to another system like Athena, Spark or Elastic. Rockset recently added support for DynamoDB - which basically means you can run fast SQL on DynamoDB tables without any ETL.

NoSQL 40

Announcing the General Availability of Cloudera Flow Management and Cloudera Edge Management

Cloudera

Last month at Strata, San Francisco, we made an announcement about two upcoming products: Cloudera Flow Management and Cloudera Edge Management. Today, we are super excited to announce that both products are generally available for use. While Cloudera Flow Management has been eagerly awaited by our Cloudera customers for use on their existing Cloudera platform clusters, Cloudera Edge Management has generated equal buzz across the industry for the possibilities that it brings to enterp

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

Monitoring Data Replication in Multi-Datacenter Apache Kafka Deployments

Confluent

Enterprises run modern data systems and services across multiple cloud providers, private clouds and on-prem multi-datacenter deployments. Instead of having many point-to-point connections between sites, the Confluent Platform provides an integrated event streaming architecture with frictionless data replication between sites. Applications can publish streams of data to a self-hosted on-prem cluster, replicate them to another on-prem cluster or to different cloud providers, load them into data s

Kafka 86

How U.S. Bank Uses A.I. and Machine Learning to Deeply Personalize Your Banking Experience

Teradata

Katherine Knowles-Marchione explains how U.S. Bank is using AI to improve and personalize the banking experience.

Banking 85

Learning DevOps as a Software Engineer

Zalando Engineering

At Zalando the teams are autonomous and involved in the entire software development process - from gathering stakeholder requirements to design, implementation, testing and deployment. For me, this was one of the greatest challenges/opportunities of joining Zalando and it allowed me to grow on so many dimensions of software development, one of these being DevOps.

Python at Netflix

Netflix Tech

By Pythonistas at Netflix, coordinated by Amjith Ramanujam and edited by Ellen Livengood. As many of us prepare to go to PyCon, we wanted to share a sampling of how Python is used at Netflix. We use Python through the full content lifecycle, from deciding which content to fund all the way to operating the CDN that serves the final video to 148 million members.

Python 111

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.

Serverless Data Pipelines On DataCoral

Data Engineering Podcast

Summary: How much time do you spend maintaining your data pipeline? How much end user value does that provide? Raghu Murthy founded DataCoral as a way to abstract the low level details of ETL so that you can focus on the actual problem that you are trying to solve. In this episode he explains his motivation for building the DataCoral platform, how it is leveraging serverless computing, the challenges of delivering software as a service to customer environments, and the architecture that he has de

Intel and Cloudera collaborate to bring improved performance to customers with Optane DC Persistent Memory

Cloudera

Cloudera and Intel have a long history of innovation, driving big data analytics and machine learning into the enterprise with unparalleled performance and security. We are pleased to build upon that direction with our collaboration on Intel® Optane DC persistent memory. Available to customers running 2nd Generation Intel® Xeon® Scalable processors, Intel Optane DC persistent memory can significantly enhance the performance of real-time and streaming applications.

NoSQL 49

Reshaping Entire Industries with IoT and Confluent Cloud

Confluent

While the current hype around the Internet of Things (IoT) focuses on smart “things”—smart homes, smart cars, smart watches—the first known IoT device was a simple Coca-Cola vending machine at Carnegie Mellon University in Pittsburgh. Students in the 1980s, tired of long walks to an empty machine, installed a board that tracked the machine’s sensors to determine whether the machine was stocked and the bottles were cold.

Food 83

The Eight Functions You Should Consider When Choosing a Self-Service Analytics Platform

Teradata

This blog discusses the functions one should consider when choosing a self-service analytics platform.

71

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.

Developing Zalando APIs

Zalando Engineering

How Zalando software engineers develop internal and external APIs. Imagine a distributed system consisting of 8,000+ active service applications, developed and operated by 300+ delivery teams in six tech hubs. 1,200+ software engineers use various technologies to implement business needs and are responsible end-to-end for those components. A pretty complex system of people and software.

Scala 40

How to Use AI and Video Analytics to Give Your Retail Business a Competitive Edge

Teradata

Peter Mackenzie explains the advancements in AI and video analytics in the retail sector.

Retail 58

Optimizing Kafka Streams Applications

Confluent

With the release of Apache Kafka® 2.1.0, Kafka Streams introduced the processor topology optimization framework at the Kafka Streams DSL layer. This framework opens the door for various optimization techniques from the existing data stream management system (DSMS) and data stream processing literature. In what follows, we provide some context around how a processor topology was generated inside Kafka Streams before 2.1, with a focus on stateful operations like aggregations and joins.

Kafka 90

Putting Events in Their Place with Dynamic Routing

Confluent

Event-driven architecture means just that: It’s all about the events. In a microservices architecture, events drive microservice actions. No event, no shoes, no service. In the most basic scenario, microservices that need to take action on a common stream of events all listen to that stream. In the Apache Kafka® world, this means that each of those microservice client applications subscribes to a common Kafka topic.

Kafka 108

Driving Business Impact for PMs

Speaker: Jon Harmer, Product Manager for Google Cloud

Move from feature factory to customer outcomes and drive impact in your business! This session will provide you with a comprehensive set of tools to help you develop impactful products by shifting from output-based thinking to outcome-based thinking. You will deepen your understanding of your customers and their needs as well as identifying and de-risking the different kinds of hypotheses built into your roadmap.