
Patrick John Chia

I am a founding engineer at a stealth startup in the AI x Gaming space, working on developing the cognitive architecture of agents using local, offline language models. Previously, I was at Coveo working with Dr. Jacopo Tagliabue at the intersection of NLP, Information Retrieval and e-Commerce.

I completed my M.Eng at Imperial College, London and spent my final year at MIT working closely with Dr. Ferran Alet in the Learning and Intelligent Systems (LIS) Group.

My interests revolve around better understanding today's Artificial Intelligence (AI) and bridging the gap toward more human-like AI.

I am also an active contributor to various open source initiatives (Models, MLOps, RecList) and you can find my work at various academic and industrial CS/AI venues (ICML, ACL, TheWebConf, SIGIR, BerlinBuzzwords) too.

Last update: September 2024.

Projects | Research


FashionCLIP

Update: To date (Sep 2024), FashionCLIP has amassed more than 50M downloads on Hugging Face!

FashionCLIP is a CLIP-like model developed to produce generalizable, multimodal product representations for the fashion domain. We achieve this by fine-tuning a pretrained CLIP model on an <image, caption> dataset derived from a high-quality fashion catalog containing over 800K products. We study whether such fine-tuning is sufficient to produce product representations that are zero-shot transferable to entirely new datasets and tasks.

Our results demonstrate strong zero-shot classification on a wide range of fashion datasets, highlighting the merits of domain-specific fine-tuning. Furthermore, we study the compositional and grounding capabilities of CLIP-like models by creating synthetic products (see above “Nike Dress”) and testing FashionCLIP retrieval/classification on them. Our model is hosted on Hugging Face. Check it out!
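For readers new to the idea, zero-shot classification with a CLIP-like model boils down to comparing an image embedding against the embeddings of candidate label prompts and picking the closest one. The snippet below is a minimal sketch of that comparison only, not the FashionCLIP code: the vectors are hand-written toy values standing in for encoder outputs, and the helper names `cosine` and `zero_shot_classify`, as well as the `temperature` value, are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(image_emb, label_embs, temperature=0.01):
    """Softmax over temperature-scaled cosine similarities.

    Returns the best-matching label and the full probability dict.
    """
    sims = {label: cosine(image_emb, emb) for label, emb in label_embs.items()}
    max_sim = max(sims.values())  # subtract the max for numerical stability
    exps = {label: math.exp((s - max_sim) / temperature) for label, s in sims.items()}
    total = sum(exps.values())
    probs = {label: e / total for label, e in exps.items()}
    return max(probs, key=probs.get), probs

# Toy 3-d embeddings (assumed values; a real model would produce these
# from its text and image encoders).
label_embs = {
    "a photo of a dress": [0.9, 0.1, 0.0],
    "a photo of sneakers": [0.1, 0.9, 0.1],
    "a photo of a handbag": [0.0, 0.2, 0.9],
}
image_emb = [0.8, 0.2, 0.1]

best, probs = zero_shot_classify(image_emb, label_embs)
print(best)
```

No labelled training examples for the target classes are needed: swapping in a new set of label prompts is enough to classify against a new taxonomy.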


GradREC

GradREC is a zero-shot approach toward a new class of recommendation – comparative recs – which answers queries of the form: “Can I have something darker/longer/warmer?”.

We achieve this by framing comparative recommendations as latent space traversal – a key component of our approach is leveraging the latent space learnt by FashionCLIP, a CLIP-like model fine-tuned for fashion concepts.

Like other self-supervised approaches for learning representations such as the seminal Word2Vec, we postulated that the contrastive learning approach employed by CLIP-like models should also encode certain concepts (e.g. analogies) despite being unsupervised/self-supervised.

We demonstrate that it is possible to traverse the latent space in a manner whereby we can discover products that vary along a certain attribute/dimension of interest (e.g. shoes of increasing/decreasing heel height).
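One simple way to picture such a traversal, assuming the attribute of interest corresponds to a direction vector in the embedding space, is to step from a seed product along that direction and snap to the nearest catalog item after every step. The sketch below illustrates that loop on toy 2-d vectors. It is not the GradREC implementation: the catalog, the `direction` vector, and the function names are hypothetical, and how the direction is obtained is left out entirely.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def traverse(seed, direction, catalog, steps=3, step_size=0.5):
    """Walk from `seed` along `direction`, snapping to the nearest
    not-yet-visited catalog item (by cosine similarity) at each step."""
    direction = normalize(direction)
    current = normalize(catalog[seed])
    path = [seed]
    for _ in range(steps):
        target = normalize([c + step_size * d for c, d in zip(current, direction)])
        candidates = {k: v for k, v in catalog.items() if k not in path}
        if not candidates:
            break
        nearest = max(candidates, key=lambda k: dot(target, normalize(candidates[k])))
        path.append(nearest)
        current = normalize(catalog[nearest])
    return path

# Toy catalog (assumed values): the second coordinate loosely plays the
# role of "heel height".
catalog = {
    "flat": [1.0, 0.0],
    "low_heel": [0.9, 0.4],
    "mid_heel": [0.7, 0.7],
    "high_heel": [0.4, 0.9],
    "slipper": [1.0, -0.3],
}
direction = [0.0, 1.0]  # assumed "increasing heel height" direction

path = traverse("flat", direction, catalog)
print(path)
```

Reversing the sign of `direction` would walk the catalog the other way, which is what a query like "something lower" would ask for.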


RecList 🚀

We built RecList as an open source library for behavioral testing of recommender systems (RSs). As pervasive as RSs are in daily life, insufficient attention has been paid to understanding their performance and behavior beyond point-wise metrics.

We designed RecList to be modular, with an easy-to-extend interface for custom use cases, so that behavioral testing can scale.
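To make "beyond point-wise metrics" concrete, here is a minimal sketch of one kind of behavioral check: computing hit rate separately for each slice of the test set instead of reporting a single aggregate. It does not use the RecList API. The function `hit_rate_by_slice`, the slice labels, and the data are all hypothetical.

```python
from collections import defaultdict

def hit_rate_by_slice(predictions, targets, slices, k=3):
    """Hit rate @k computed separately for each slice label."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for preds, target, slice_label in zip(predictions, targets, slices):
        counts[slice_label] += 1
        hits[slice_label] += int(target in preds[:k])
    return {s: hits[s] / counts[s] for s in counts}

# Toy test set (assumed data): one ranked list and one held-out target per case.
predictions = [["a", "b", "c"], ["d", "e", "f"], ["a", "c", "e"], ["b", "d", "f"]]
targets = ["a", "x", "e", "z"]
slices = ["popular", "niche", "popular", "niche"]

per_slice = hit_rate_by_slice(predictions, targets, slices)
overall = sum(t in p[:3] for p, t in zip(predictions, targets)) / len(targets)

print(overall)
print(per_slice)
```

In this toy data the aggregate hit rate looks middling, while the per-slice view shows the model succeeding on every "popular" case and failing on every "niche" one, which is the kind of gap a single number hides.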

RecList has since garnered > 300 🌟 on GitHub and has been used as part of the evaluation loop in the EvalRS CIKM 2022 Data Challenge.


Waldo

WALDO is a Third Year Industrial Group Project done in collaboration with IBM. It is a deep-learning-enabled assisted living device meant to perform Makaton Sign recognition. I was part of the team that focused on the Machine Learning research, development and deployment.

At the core of WALDO is its deep neural network architecture based on a C3D network concatenated with an LSTM. This allows the model to learn the spatio-temporal features required for accurate gesture recognition.

Particularly challenging was having the model run on a Jetson Nano edge device. This required investigating which parts of the model could be compressed while achieving a good trade-off between accuracy and performance. For more information, do check out the GitHub repo!

Publications