Research Projects


Explainable Deep Clustering for Financial Customer Profiling

For more details, check out my note on Notion: [link]


What began as an entry in the NH Investment & Securities Big Data Competition, on the topic of Advanced Customer Profiling and Personalized Investment Portfolio Curation, my teammates and I extended into an academic research initiative using high-dimensional cross-sectional data from the Korea Institute of Public Finance.

Effective customer segmentation, and the communication of its findings to non-experts, are pressing tasks in the financial services sector, with the potential for widespread application. This study employs a three-stage dimension-reduction and clustering technique to segment a large, high-dimensional dataset, emphasizing explainability and intuitive visualization. We present the high-dimensional data and feature set using novel network-based visualization methods and identify the optimal configuration of the multi-stage process. Finally, we derive an investment portfolio for each segment, demonstrating an expert-system application in financial investment advisory and underscoring the importance of explainable segmentation.
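The shape of such a multi-stage pipeline can be sketched as standardize → reduce → cluster. The concrete stages, hyperparameters, and the network-based visualization in the paper differ, so the choices below (PCA via SVD, k-means with farthest-point initialization) are illustrative assumptions, not the published configuration:

```python
import numpy as np

def three_stage_segment(X, n_components=2, k=3, iters=50, seed=0):
    """Toy three-stage pipeline: standardize -> PCA -> k-means.
    Illustrative only; the stages in the actual study differ."""
    rng = np.random.default_rng(seed)

    # Stage 1: standardize each feature to zero mean, unit variance
    Z = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-9)

    # Stage 2: linear dimension reduction via truncated SVD (PCA)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    low = Z @ Vt[:n_components].T          # (n_samples, n_components)

    # Stage 3: k-means in the reduced space, farthest-point init
    centers = [low[rng.integers(len(low))]]
    for _ in range(1, k):
        d2 = ((low[:, None, :] - np.asarray(centers)[None, :, :]) ** 2).sum(-1).min(axis=1)
        centers.append(low[d2.argmax()])
    centers = np.asarray(centers, dtype=float)

    labels = np.zeros(len(low), dtype=int)
    for _ in range(iters):
        d = ((low[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = low[labels == j].mean(axis=0)
    return labels, low
```

Keeping each stage a separate, inspectable step is what makes the segmentation explainable: the reduced coordinates can be plotted and each cluster traced back to the original features.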

This paper is published in EAAI Vol. 128. Using these findings, I also presented a poster at IE Frontier, an internal research poster competition of the KAIST ISE Department. I have also applied this framework to clustering financial securities, including stocks, and wrote a paper on Stock Deep Clustering and its application to the Fama-French Factor Model.


Integrating Multi-Modality in All-round LLM-based Recommender System

For more details, check out my note on Notion: [link]


I have investigated state-of-the-art (SOTA) LLM-based recommender systems, with a particular focus on my mentor’s project, A-LLMRec (by Sein Kim at KAIST DSAIL). I understand how to construct an efficient LLM framework for downstream recommendation tasks without fine-tuning the LLM, by stably blending pretrained CF-RecSys embeddings with natural-language embeddings. I have gained insight into creating joint collaborative item-text embeddings using an autoencoder while avoiding over-smoothed representations. Additionally, I have devised an alignment network that robustly aligns item embeddings from the CF-based RecSys in the token space of the LLM. Furthermore, I have acquired knowledge of designing LLM prompts that incorporate modality information and integrate collaborative knowledge with recommendation instructions, all without fine-tuning the LLM.
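As a minimal illustration of the alignment idea, one can fit a map that projects frozen CF item embeddings into the token-embedding space so that each item lands near the embedding of its text description. The sketch below uses a closed-form linear least-squares fit; the actual alignment network is a learned, nonlinear model trained with additional matching and reconstruction objectives:

```python
import numpy as np

def fit_alignment(cf_emb, txt_emb):
    """Fit a linear map W minimizing ||cf_emb @ W - txt_emb||^2.
    cf_emb: (n_items, d_cf) frozen CF item embeddings.
    txt_emb: (n_items, d_tok) text-description embeddings in token space.
    A linear stand-in for the learned alignment network."""
    W, *_ = np.linalg.lstsq(cf_emb, txt_emb, rcond=None)
    return W

def align(cf_emb, W):
    # CF item embeddings expressed in the LLM's token space,
    # usable as soft-prompt inputs without fine-tuning the LLM.
    return cf_emb @ W
```

The key property is that the LLM itself stays frozen: only the small alignment map is trained, and the aligned vectors are consumed as soft prompts.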

Building upon the findings of this study, I have enhanced the framework to seamlessly incorporate multi-modal data such as item images. Instead of a single item encoder trained with a matching loss against an item text-description encoder, I implemented a cross-attention mechanism and contrastive learning to effectively integrate the modalities of the item embeddings and metadata embeddings. This new integrated item encoder produces better embeddings for the soft prompt in LLM recommendation tasks.
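A rough numpy sketch of the two ingredients, under assumed shapes (the real encoder, projection heads, and training loop are more involved): cross-attention lets each item embedding query its own metadata embeddings (e.g., text and image), and an InfoNCE-style contrastive loss pulls each fused representation toward its own target while pushing it away from other items':

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(items, meta):
    """items: (n, d) CF item embeddings, used as queries.
    meta: (n, m, d) per-item metadata embeddings (e.g., text + image),
    used as keys and values. Returns fused (n, d) representations."""
    d = items.shape[-1]
    scores = np.einsum('nd,nmd->nm', items, meta) / np.sqrt(d)
    attn = softmax(scores, axis=-1)              # (n, m) modality weights
    return np.einsum('nm,nmd->nd', attn, meta)   # attention-weighted metadata

def info_nce(fused, targets, tau=0.1):
    """InfoNCE contrastive loss: the i-th fused embedding should match
    the i-th target; every other target in the batch is a negative."""
    f = fused / np.linalg.norm(fused, axis=1, keepdims=True)
    t = targets / np.linalg.norm(targets, axis=1, keepdims=True)
    logits = f @ t.T / tau                       # (n, n) cosine similarities
    return -np.mean(np.log(softmax(logits, axis=1).diagonal()))
```

In this sketch the attention weights also expose which modality dominates each fused item, which keeps the integration step inspectable.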

Throughout this research, I also wrote a Survey on Modern Recommender Systems: Collaborative Filtering to LLM-based Recommender System.