September 2018

Building A Knowledge Graph From Public Data At Enigma With Chris Groskopf - Episode 50

Data Engineering Podcast

Summary: There are countless sources of data that are publicly available for use. Unfortunately, combining those sources and making them useful in aggregate is a time-consuming and challenging process. The team at Enigma builds a knowledge graph for use in your own data projects. In this episode Chris Groskopf explains the platform they have built to consume large varieties and volumes of public data and construct a graph to serve to their customers.

Themes and Conferences per Pacoid, Episode 1

Domino Data Lab: Data Engineering

Introduction: New Monthly Series! Welcome to a new monthly series! I’ll summarize highlights from recent industry conferences, new open source projects, interesting research, great examples, amazing people, etc. – all pointed at how to level up your organization’s data science practices.

A new era of SQL-development, fueled by a modern data warehouse

Cloudera

SQL development is not a new concept. However, as the data warehousing world shifts into a fast-paced, digital, and agile era, the demands to quickly generate reports and help guide data-driven decisions are constantly increasing. This puts new pressures on the people working behind the scenes to prepare and serve data in a consumable way to a growing audience with various levels of access credentials and technical expertise.

The Journey to Connecting Retail

Zalando Engineering

Digitizing brick & mortar fashion stores through Connected Retail. Everything started back in 2015, when Zalando was already successful as an online fashion retailer in Europe. However, a B2B problem was identified that needed to be tackled: brick-and-mortar fashion stores need a way to increase their sales. Seeing the need to connect offline with online in order to help merchants solve this problem, I joined Zalando as a Product Manager in early 2016 at the newly established Helsinki

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD, Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Recap of Hadoop News for August 2018

ProjectPro

News on Hadoop - August 2018. Apache Hadoop: A Tech Skill That Can Still Prove Lucrative. Dice.com, August 2, 2018. In 2017, Gartner announced that organizations were spending close to $800 million on Hadoop distributions, even though only 14% of companies reported that they were relying on Hadoop technology. However, several studies have revealed that adoption of and spending on Hadoop technology continued to rise through last year. Dice analysis demonstrates that jobs that intersect with Ha

A Primer On Enterprise Data Curation with Todd Walter - Episode 49

Data Engineering Podcast

Summary: As your data needs scale across an organization, the need for a carefully considered approach to collection, storage, organization, and access becomes increasingly critical. In this episode Todd Walter shares his considerable experience in data curation to clarify the many aspects that are necessary for a successful platform for your business.

Keep Your Data And Query It Too Using Chaos Search with Thomas Hazel and Pete Cheslock - Episode 47

Data Engineering Podcast

Summary: Elasticsearch is a powerful tool for storing and analyzing data, but when you use it for logs and other time-oriented information, keeping all of your history can become problematic. Chaos Search was started to make it easy for you to keep all of your data and make it usable in S3, so that you can have the best of both worlds. In this episode the CTO, Thomas Hazel, and VP of Product, Pete Cheslock, describe how they have built a platform to let you keep all of your history, save money, a

An Agile Approach To Master Data Management with Mark Marinelli - Episode 46

Data Engineering Podcast

Summary: With the proliferation of data sources that give a more comprehensive view of the information critical to your business, it is even more important to have a canonical view of the entities that you care about. Is customer number 342 in your ERP the same as Bob Smith on Twitter? Using master data management to build a data catalog helps you answer these questions reliably and simplifies the process of building your business intelligence reports.

Taking out the threat from the inside

Cloudera

The worst thing about an inside job is that once it’s detected, it’s usually too late. Early detection is critical to prevent considerable damage arising out of insider threats to the business. But that’s easier said than done! Whether it’s a rogue trader in a bank or brokerage or someone illegally sharing company intellectual property or intelligence, illegal insider actions put enterprises at risk of losing millions.

And the winners are…. Congratulations to the Sixth Annual Data Impact Awards winners

Cloudera

It’s a big week for us, as many Clouderans descend on New York for the Strata Data Conference. The week is typically filled with exciting announcements from Cloudera and many partners and others in the data management, machine learning and analytics industry. Last night we kicked it off with the sixth annual Data Impact Awards Celebration. These awards recognize organizations that transform complex data into actionable insights and illustrate impact to technology, science, health, lifestyle, and

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Cloudera spotlights partner success at Strata Data with Partner Impact Awards

Cloudera

At Strata Data, it appeared that artificial intelligence, machine learning, and the promise of game-changing insights from big data were at the forefront of everyone’s mind. Cloudera aimed to demystify the “how” in the AI and big data equation at Strata Data through helpful sessions, anticipated keynotes, and new product announcements to alleviate the mystery associated with leveraging this revolutionary technology.

Cloudera Data Warehouse – A Partner Perspective

Cloudera

Among the many reasons that a majority of large enterprises have adopted Cloudera Data Warehouse as their modern analytic platform of choice is the incredible ecosystem of partners that have emerged over recent years. In this new blog series, we will take a closer look at some of the most innovative partners, and how the Cloudera platform is helping them deliver groundbreaking solutions to our customers.

Building an Open Data Processing Pipeline for IoT

Cloudera

Authors: David Bericat, Global Technical Lead, Internet of Things, Red Hat and Jonathan Cooper-Ellis, Solutions Architect, Cloudera. Last week Cloudera introduced an open end-to-end architecture for IoT and the different components needed to help satisfy today’s enterprise needs regarding operational technology (OT), information technology (IT), data analytics and machine learning (ML), along with modern and traditional application development, deployment, and integration.

Take Customer Experience Back to the Future with Data

Cloudera

Delivering a positive and memorable customer experience is the cornerstone of nearly every organization. Failure to do so negatively impacts a company’s bottom line and reputation. Each year, companies invest millions of dollars in programs and solutions that aim to improve the customer experience and provide valuable customer insights, but what if, for the answer, they only had to look back to the future?

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Boosting enterprise machine learning with automated feature engineering

Cloudera

Machine learning. The very name suggests there’s little involvement required from actual people. It’s a bit surprising to note, then, that perhaps the most limiting factor in data science and machine learning today is people. People add complexity. People add the risk of error. And people add a lot of time. We’ll always need people to come up with the overarching prediction problems to solve and to make the ultimate choices to solve them, but there is a lot of data science work now that

Shop the Look with Deep Learning

Zalando Engineering

Retrieving fashion products based on a query image. Have you ever seen a picture on Instagram and thought, “Oh, wow! I want these shoes”? Or been inspired by your favourite fashion blogger and looked for similar products (for example, on Zalando)? Visual search for fashion, the task of identifying fashion articles in an image and finding them in an online store, has been the subject of an ever-growing body of scientific literature over the last few years (see for example [1-11]).

Zalando Strengthens its InnerSource Strategy

Zalando Engineering

Zalando is known for its commitment to the open source world. Many of our engineers are active contributors to open source projects like PostgreSQL or Kubernetes. The Zalando tech department currently consists of more than 2,000 employees that manage over 200 delivery teams and virtual teams. Zalando engineers are from 77 nations and based in various locations across Europe, which makes us super international but also quite distributed.

An End-to-End Open & Modular Architecture for IoT

Cloudera

While the Internet of Things (IoT) represents a significant opportunity, IoT architectures are often rigid, complex to implement, costly, and create a multitude of challenges for organizations. First of all, in order to effectively pull together an end-to-end architecture for IoT, organizations must manage multiple vendor solutions, validate that they work together, integrate them to ensure the right functionality, and provide for future enhancement compatibility.

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

AML: Past, Present and Future – Part III

Cloudera

This is the third installment in a three-part series. The first installment provides a short background on anti-money laundering. The second installment examines common AML problems faced by financial institutions today. In this installment, we introduce an approach that carries AML well into the future. Part III: The future is now. Given what we know about current anti-money laundering systems, if we wanted to build one from scratch today, we might come up with the following requirements.

Heralding a new era in GDPR compliance with Accenture and Cloudera

Cloudera

The General Data Protection Regulation (GDPR), in force since May 25, strengthens and unifies data protection laws for individuals within the European Union (EU), making personal data privacy a fundamental right for all. While companies have traditionally relied on time-consuming manual processes to achieve compliance, Accenture and Cloudera are harnessing advances in technology to simplify compliance.

Delivering a Shared Multidisciplinary Analytics Experience Anywhere With SDX and Altus

Cloudera

Woodworking is one of my passions and I often use wooden pallets as my source material. Regardless of what I build—whether a shelf, chair, or bookcase—I always use the same things: Wood, tools, and a plan that shows dimensions and steps to put all the bits together. The other day, it struck me how similar this is to how organizations digitally transform and become data-driven.

Visual Creation and Exploration at Zalando Research

Zalando Engineering

Adversarial texture distribution learning as a tool of artistic expression. Deep learning is progressing fast these days. Besides advances that were expected to happen sooner or later (e.g. accurate face and speech recognition), there are some new developments that would have seemed like a pipe dream years ago: neural networks can now generate realistic images just by looking at a few examples of their properties.

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.

Altus Data Warehouse

Cloudera

We are proud to announce the general availability of Cloudera Altus Data Warehouse, the only cloud data warehousing service that brings the warehouse to the data. Cloudera’s modern data warehouse runs wherever it makes the most sense for your business – on-premises, public cloud, hybrid cloud, or even multi-cloud. Modern data warehousing for the cloud.

AML: Past, Present and Future – Part II

Cloudera

This is the second installment in a three-part series. The first installment provides a short background on anti-money laundering. In this installment, we examine common AML problems faced by financial institutions today. The third installment introduces an approach that carries AML into the future. Part II: Current Challenges in AML. There are several key areas in the field of anti-money laundering (AML) that rely heavily on technology.

AML: Past, Present and Future – Part I

Cloudera

This is the first installment in a three-part series. It provides a short background on anti-money laundering for the layperson. AML professionals may wish to skip this installment and go directly to the second and third parts. The second installment examines common AML problems faced by financial institutions today. The third installment introduces an approach that carries AML into the future.
