Fast String Processing with Polars — Scam Emails Dataset
Clean, process and tokenise texts in milliseconds using in-built Polars string expressions
10 min read · May 28, 2023
Introduction
With the large-scale adoption of Large Language Models (LLMs), it might seem that we’re past the stage where we had to manually clean and process text data. Unfortunately…