November, 2020

A Data Scientist in Engineering Wonderland

Team Data Science

As a data scientist, I have always felt a missing link between developing my models and putting them into production. Yes, I can create a pipeline, write a model, get results, and interpret them, but if I cannot scale it, all of this will just sit in my Jupyter notebooks. This thought led me to my data engineering adventure. I am confident that learning data engineering will make me a better data scientist.

How to Pull Data from an API, Using AWS Lambda

Start Data Engineering

Introduction: If you are looking for a simple, cheap data pipeline to pull small amounts of data from a stable API and store it in cloud storage, then serverless functions are a good choice. This post aims to answer questions like the one shown below: My company does not have the budget to purchase a tool like Fivetran. What should I use to pull data from an API?

Analysing historical and live data with ksqlDB and Elastic Cloud

Confluent

Building data pipelines isn’t always straightforward. The gap between the shiny “hello world” examples of demos and the gritty reality of messy data and imperfect formats is sometimes all too […].

Streaming Data Integration Without The Code at Equalum

Data Engineering Podcast

Summary: The first stage of every good pipeline is to perform data integration. With the increasing pace of change and the need for up-to-date analytics, the need to integrate that data in near real time is growing. With the improvements and increased variety of options for streaming data engines and improved tools for change data capture, it is possible for data teams to make that goal a reality.

Get Better Network Graphs & Save Analysts Time

Many organizations today are unlocking the power of their data by using graph databases to feed downstream analytics, enhance visualizations, and more. Yet, when different graph nodes represent the same entity, graphs get messy. Watch this essential video with Senzing CEO Jeff Jonas on how adding entity resolution to a graph database condenses network graphs to improve analytics and save your analysts time.

Veterans Day: What Service Means to Clouderan Vets

Cloudera

Around the world, a number of countries celebrate November 11 as a day to give thanks and recognition for their veterans. Originally designated to honor the end of World War I (Armistice Day and Remembrance Day), in some countries it is now used to pay respect to all veterans (Veterans Day). Year after year, we use this time to express our support and appreciation to those who have served in the military.

How to Make the Most of Big Data Analytics in Your Business

Teradata

Big data's growth and its impact on business are undeniable. But how do you make the most of your data analytics to create real business value? Find out more.

More Trending

5 things you should know about Real-Time Analytics

A Cloud Guru: Data Engineering

Running analytics on real-time data is a challenge many data engineers are facing today. But not all analytics can be done in real time! Many are dependent on the volume of the data and the processing requirements. Even logic conditions are becoming a bottleneck. For example, think about join operations on huge tables with more […]

Digital Transformation in Style: How Boden Innovates Retail Using Apache Kafka

Confluent

As a clothing retailer with more than 1.5 million customers worldwide, Boden is always looking to capitalise on business moments to drive sales. For example, when the Duchess of Cambridge […].

Keeping A Bigeye On The Data Quality Market

Data Engineering Podcast

Summary: One of the oldest aphorisms about data is "garbage in, garbage out", which is why the current boom in data quality solutions is no surprise. With the growth in projects, platforms, and services that aim to help you establish and maintain control of the health and reliability of your data pipelines, it can be overwhelming to stay up to date with how they all compare.

Fraud Detection using Deep Learning

Cloudera

One of the many areas where machine learning has made a large difference for enterprise business is in the ability to make accurate predictions in the realm of fraud detection. Knowing that a transaction is fraudulent is a critical requirement for financial services companies, but knowing that a transaction flagged as fraudulent by a rules-based system is actually valid can be equally important.

Understanding User Needs and Satisfying Them

Speaker: Scott Sehlhorst

We know we want to create products which our customers find to be valuable. Whether we label it as customer-centric or product-led depends on how long we've been doing product management. There are three challenges we face when doing this. The obvious challenge is figuring out what our users need; the non-obvious challenges are in creating a shared understanding of those needs and in sensing if what we're doing is meeting those needs.

Risk-Based Wealth Management: What the Insurance Industry Gets Wrong

Teradata

Product-centric processes degrade customer experience. Insurers must insulate consumers from internal & regulatory-driven controls by placing them in the center of the customer experience.

The Journey Begins

Team Data Science

Week 1: 10/9/20 - 10/16/20. In my quest to further improve my overall data science skills, I pulled the trigger on October 9th, 2020, and enrolled in a Data Engineering boot camp led by Andreas Kretz. First, a little bit about myself. I have a background in Aerospace Engineering and have been in the industry for close to 15 years now. A little more than a year ago, I decided to pivot to Machine Learning and Data Science.

Building a Search Engine for Afterpay’s Shop Directory

Afterpay Tech

Photo by Markus Winkler on Unsplash. By Jose Picado, Qiao Wang, and Yi Li. Context: Our Shop Directory is used by our consumers to discover stores, brands, and products. It is incredibly valuable for our retail partners: almost 20 million referrals per month in the July-September 2020 quarter. The Shop Directory, available on the Web and our mobile app, contains nearly 64,000 stores, and each store sells thousands of products.

IBM and Confluent Announce Strategic Partnership

Confluent

I’m excited to announce a new strategic partnership with IBM. As part of this partnership, IBM will be reselling Confluent Platform, enabling its customers to leverage their existing IBM relationships […].

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

Building A Cost Effective Data Catalog With Tree Schema

Data Engineering Podcast

Summary: A data catalog is a critical piece of infrastructure for any organization that wants to build analytics products, whether internal or external. While there are a number of platforms available for building that catalog, many of them are either difficult to deploy and integrate, or expensive to use at scale. In this episode Grant Seward explains how he built Tree Schema to be an easy-to-use and cost-effective option for organizations to build their data catalogs.

Expediting SQL Workers means Expediting your Business

Cloudera

Two of the more painful things in your everyday life as an analyst or SQL worker are not getting easy access to data when you need it, or not having easy to use, useful tools available to you that don’t get in your way! As one of my dear customers, a data worker in Pharma, said to me: “I really don’t care about bells and whistles, I just want to get my task done.

Boost Your Customer Experience with Better Payment Conversions

Teradata

With digital payments on the rise, payment processing has become more complex. Fortunately, advanced data technologies can create a better customer experience via streamlined payment processes.

Branding Yourself

Team Data Science

Week 2: 10/16/20 - 10/23/20. Week 2 of the course consists of Modules 3 & 4. If you have not read my first blog, go here. Module 3 focuses on creating a professional LinkedIn profile. Your LinkedIn profile is the world's access to you and how you want to be seen professionally. Below is a screenshot. So here, I have a professionally taken photograph, what I am interested in below, and the 'About' section that summarizes me in a professional sense.

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Web Scraping Using R.!

Data Science Blog: Data Engineering

In this blog, I’ll show you how to web scrape using R. What is R? R is a programming language and environment built for statistical analysis, graphical representation, and reporting. R is mostly preferred by statisticians, data miners, and software programmers who want to develop statistical software. R is also available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form.

Use Cases and Architectures for HTTP and REST APIs with Apache Kafka

Confluent

This blog post presents the use cases and architectures of REST APIs and Confluent REST Proxy, and explores a new management API and improved integrations into Confluent Server and Confluent […].

Add Version Control To Your Data Lake With LakeFS

Data Engineering Podcast

Summary: Data lakes are gaining popularity due to their flexibility and reduced cost of storage. Along with the benefits, there are some additional complexities to consider, including how to safely integrate new data sources or test out changes to existing pipelines. To address these challenges, the team at Treeverse created LakeFS to introduce version control capabilities to your storage layer.

2020 Data Impact Award Winner Spotlight: Experian

Cloudera

This year’s Data Impact Awards were like none other that we’ve ever hosted. While everyone attended from the comfort of their own homes (and timezones), we were still able to celebrate the fantastic achievements of our customers. From all corners of the globe, our customers have delivered incredible amounts of innovation in the enterprise, while overcoming many of the challenges and disruptions 2020 has brought.

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.

Connect Teradata Vantage to Salesforce Data With Azure Data Factory

Teradata

This "how-to" guide will help you to connect Teradata Vantage using the Native Object Store feature to query Salesforce data sourced by Microsoft Azure Data Factory.

Grouparoo Raises $3M Seed Round

Grouparoo

We are excited to announce that Grouparoo has raised $3M in seed funding to make SaaS integrations easier for engineering. This round was led by Eniac Ventures and Fuel Capital. We’re also honored and humbled to have great participants in the round including Hack VC, Liquid2, SCM Advisors, Stacy Brown-Philpot, J Zac Stein, Meka Asonye, Jonathan Grant, and others with experience that will be helpful in our journey.

NLP Heroes, Pinot, Data Testing, and More: Top 10 Links From Across the Web

Data Council

Here's our November 2020 roundup of good reads and podcast episodes that might be relevant for your career in data: 1. Heroes of NLP: Quoc Le (Deeplearning.ai) NLP researcher Quoc Le was recently Andrew Ng’s guest as part of the ‘Heroes of NLP’ video series. Their discussion covered Le’s impressive journey, from growing up in Vietnam and developing his first basic chatbot in high school to becoming Google Brain’s first intern, and everything that followed.

How Real-Time Stream Processing Safely Scales with ksqlDB, Animated

Confluent

Software engineering memes are in vogue, and nothing is more fashionable than joking about how complicated distributed systems can be. Despite the ribbing, many people adopt them. Why? Distributed systems […].

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Liquidity Monitoring: Depth

Ripple Engineering

In our last liquidity monitoring post, we introduced the concept of dislocation as a way to measure the price competitiveness of an XRP-fiat pair. In this post, we introduce the companion depth metric and combine both metrics into a data visualization for assessing liquidity performance. Depth: Dislocation tells us how competitive an exchange’s XRP prices are, but it ignores the important quantity component of liquidity.

Cloudera’s Pivot to a Virtual Internship Program

Cloudera

Typically, running smooth and successful internship programs requires in-person interactions with high touchpoints. From onboarding and regular meetings to coffee chats and welcome events to meet the team, it takes a lot to integrate a new intern. They’re not only new to the organization but new to the workforce, after all. Yet, with most tech companies going fully remote, Early Talent teams had to consider their options.

How Tesla is Redefining the Auto Industry

Teradata

New players like Tesla are changing the automotive industry into a software-driven paradigm which has made data management & analysis at scale a critical capability for OEMs.

Power BI Template App for SalesForce

FreshBI

So, what is a Power BI Template App? A Power BI Template App is a published Power BI solution that can be used by any company that has the data platform for which the Template App was created. Wouldn’t it be nice to pick your entire Power BI solution off the shelf, one crafted for your specific business needs and your specific data structure? Power BI Template Apps are designed to be such an out-of-the-box solution, and this blog post presents an example of one for Salesforce.

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Speaker: David Bard, Principal at VP Product Coaching

In the fast-paced world of digital innovation, success is often accompanied by a multitude of challenges - like the pitfalls lurking at every turn, threatening to derail the most promising projects. But fret not, this webinar is your key to effective product development! Join us for an enlightening session to empower you to lead your team to greater heights.