Sat. Mar 02, 2024


Data News — Week 24.09

Christophe Blefari

Mistral (credits)

Hello all, this is the Data News. This week's edition might be lighter than usual in terms of comments, as I'm working on a Data News-related project that is taking up a bit of my time and will probably lead to a series of articles. Before I forget: I appeared on The Joe Reis Show, where Joe and I chatted about teaching data engineering, why it is hard, and how generative AI will change education forever.


Data Dirtiness Score

Towards Data Science

New method to measure tabular dataset quality

This article, the first in a series on data cleaning practices involving Large Language Models (LLMs), focuses on quantifying the cleanliness or dirtiness of a dataset.

Photo by Fabrizio Conti on Unsplash

Starting with the Why

This article introduces a concept for evaluating the dirtiness of a dataset, a topic that presents challenges due to the lack of a tangible score or loss function related to data cleaning.