Ongoing

Topical Classification of Wikipedia Images
Developing a hierarchical classifier for Wikipedia images that categorizes them solely by topic. A collaboration with Wikimedia Research!

Past

A data-driven Political Compass
Analyzed a dataset of ~180 million quotations from news media from 2008 to 2020 to infer a new Political Compass based on quotes from US politicians.
topic-classification sentiment-analysis dimensionality-reduction data-story

Training an AI Assistant for STEM Education
Fine-tuned LLMs and reward models with Constitutional AI to build a chatbot specialized in STEM education at EPFL.
nlp large-language-models reward-model constitutional-ai

Noise2Noise Denoising Autoencoders
A denoising autoencoder trained only on noisy images, implemented both with PyTorch and from scratch.
deep-learning autoencoders pytorch

OpenRitardi - Open Train Delays
Democratizing access to train delay data in Italy by building a web app that lets users easily visualize and analyze delays and connectivity.
data-visualization d3.js web-development

Disambiguating Voynich Manuscript transliterations
Fine-tuned word embedding models to disambiguate transliterations of the Voynich Manuscript, a mysterious medieval document written in a completely unknown script.
nlp word2vec fasttext

Detecting Hateful Users on Twitter
Developed graph machine learning models to detect hateful users on a Twitter retweet graph.
graph-machine-learning graphsage node2vec twitter

Solving the N-Queens problem with the Metropolis algorithm
Implemented a Metropolis–Hastings Markov chain Monte Carlo algorithm for sampling a solution to the N-queens problem for any board size.
markov-chain-monte-carlo simulated-annealing maths
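
The Metropolis approach to N-queens described in the last project can be illustrated with a short sketch. This is not the project's actual code: the permutation state (one queen per row and column), the row-swap proposal, the diagonal-conflict energy, the fixed inverse temperature `beta`, and the function names `diagonal_conflicts` and `metropolis_n_queens` are all assumptions chosen for illustration.

```python
import math
import random


def diagonal_conflicts(cols):
    """Count pairs of queens attacking each other along a diagonal.

    cols[r] is the column of the queen in row r. Because cols is a
    permutation, row and column conflicts cannot occur.
    """
    n = len(cols)
    return sum(
        1
        for r1 in range(n)
        for r2 in range(r1 + 1, n)
        if abs(cols[r1] - cols[r2]) == r2 - r1
    )


def metropolis_n_queens(n, beta=2.0, max_steps=200_000, seed=0):
    """Run a Metropolis chain over permutations until a solution is hit.

    Assumed design: the energy is the number of diagonal conflicts and
    the target distribution is proportional to exp(-beta * energy).
    Returns the final placement and its energy (0 means a valid solution).
    """
    rng = random.Random(seed)
    cols = list(range(n))
    rng.shuffle(cols)
    energy = diagonal_conflicts(cols)

    for _ in range(max_steps):
        if energy == 0:
            break
        # Symmetric proposal: swap the columns of two distinct rows.
        i, j = rng.sample(range(n), 2)
        cols[i], cols[j] = cols[j], cols[i]
        new_energy = diagonal_conflicts(cols)
        delta = new_energy - energy
        # Metropolis rule: always accept downhill moves, accept uphill
        # moves with probability exp(-beta * delta).
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            energy = new_energy
        else:
            cols[i], cols[j] = cols[j], cols[i]  # reject: undo the swap
    return cols, energy


if __name__ == "__main__":
    placement, conflicts = metropolis_n_queens(8)
    print(placement, conflicts)
```

Because the swap proposal is symmetric, the Hastings correction term cancels and the acceptance rule reduces to the plain Metropolis ratio. Gradually increasing `beta` over the run would turn the same loop into simulated annealing; recomputing the full energy at every step is kept here for clarity and would likely need an incremental update for large boards.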