Fast String Processing with Polars — Scam Emails Dataset

Clean, process and tokenise texts in milliseconds using in-built Polars string expressions

Antons Tocilins-Ruberts
Towards Data Science
10 min read · May 28, 2023

Photo by Stephen Phillips - Hostreviews.co.uk on Unsplash

Introduction

With the large-scale adoption of Large Language Models (LLMs), it might seem that we’re past the stage where we had to manually clean and process text data. Unfortunately…
