Synerise BaseModel: A Foundation Model for Behavioral Event Data
The complexity of industry-grade event-based data lakes grows with each passing hour. Companies actively gather behavioral information on their customers, recording many types of events, such as clicks, likes, page views, card transactions, add-to-basket actions, and purchases. The Synerise BaseModel platform was proposed in response to this challenge. The primary focus of BaseModel is to produce Universal Behavioral Representations (UBRs): large vectors that encapsulate the behavioral patterns of each user. In contrast to aggregated features or averaged embeddings, UBRs do not lose knowledge about individual events. They are based on Cleora and EMDE, award-winning algorithms developed at Synerise, and make it possible to process real-life datasets composed of billions of events in record time.
In this talk, I will give an overview of our work at Synerise, covering the inception, development, and capabilities of the BaseModel platform. The presentation will introduce BaseModel and its core objective of generating Universal Behavioral Representations, and will then turn to the technical aspects of the model. Specifically, I will explain the underlying mechanisms of EMDE and Cleora, the award-winning algorithms that power BaseModel.
Bio
Michał Daniluk is a Staff AI Research Engineer at Synerise and a PhD candidate at the Warsaw University of Technology. He graduated with distinction in Machine Learning from University College London. His expertise covers the development and study of machine learning models for processing the complex structure of multimodal web-scale data. His research interests include graph representation learning, recommendation systems, behavioral user representations, and natural language processing. Michał has won prestigious competitions, including the RecSys Twitter Challenge, KDD Cup OGB-LSC, and the WSDM Booking.com Data Challenge.