“Use your data wisely” – Data-Centric NLP in the e-commerce domain
Many recent NLP publications focus on increasing model capacity and tuning model architecture. “Data-Centric AI” suggests putting more effort into preparing high-quality data instead.
In this talk, we show how to leverage the massive amounts of data available on e-commerce platforms such as Allegro to train models for the Named Entity Recognition task. Such models enable us to find useful information in human-written texts such as offer descriptions, allowing automatic enrichment of parameter values for the products being sold.
Paweł Olszewski is a research engineer at Allegro, where he builds deep learning models that work on unstructured data. Paweł is also a Ph.D. candidate in applied machine learning at the Warsaw University of Life Sciences. Prior to that, he completed his Master’s in Mathematics at the University of Warsaw.
Paweł gained his professional experience at Samsung R&D and TCL Research Europe. His research interests include deep learning, natural language processing, and machine learning in medicine.