Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
course_projects
Coin Counter: Deep Learning for Robust Multi-Class Coin Detection and Classification
This project explores multiple deep learning models (CNNs, ResNet50, and Vision Transformers) for classifying coins in images. We use advanced segmentation techniques, custom over-segmentation filtering, and data augmentation to improve generalization. Developed for the EPFL IAPR course, this system demonstrates competitive accuracy and robustness for automated coin recognition tasks.
Distributed Movie Recommendation Pipelines with Apache Spark
This project builds a full-scale movie recommendation system using Apache Spark, incorporating data analytics and keyword-based filtering. Implemented on the MovieLens dataset, the system supports efficient data preprocessing, incremental rating updates, and personalized movie recommendations through LSH and predictive models.
NameCoin on Peerster: A Blockchain-Based Decentralized DNS Implementation
This project explores the design and implementation of a decentralized DNS system using blockchain and a gossip-based peer protocol. The system supports secure domain registration, updates, transfers, and resolution with robust anti-entropy synchronization and Proof-of-Work consensus. The project evaluates network resilience, consensus reliability, and mining efficiency.
Multimodal Modeling of Entrepreneurial Teams: Predicting Opportunity Generation from Audio and Text
This project combines personality and emotion detection with machine learning to predict the number of ideas generated by entrepreneurial teams. We process multimodal data (transcripts and audio) to extract MBTI traits, emotional profiles, and speaker features, and use these to model team-level idea generation. This work advances behavioral modeling in early-stage startup evaluation.
Command-Collecting Robot: Embedded Systems Project for Restaurant Automation
A robotic system developed for real-time order collection in a restaurant environment, combining line-following, object detection, and GUI-based control. The robot uses image processing, multithreading, and Bluetooth communication to collect orders from tables and transmit them for analysis and optimization.
journal
Speeding Up Graph Similarity Matching with Efficient Tensor Ops
The graph similarity algorithm for matching image-text graph pairs was too slow, particularly in the pairwise comparison step.
Reducing Padding Overhead with Sequence Bucketing
Group similar-length samples to minimize VRAM waste and stabilize throughput in NLP tasks.
Resolving OOM in PPO/GRPO with Large Models
PPO and GRPO training with models >7B caused OOM errors on A100 GPUs due to multiple full model replicas. This post details optimization strategies to fix it.
Speeding Up Distributed Training with vLLM, Flash Attention, and Checkpoint Resuming
Improving distributed training speed using vLLM, Flash Attention, LoRA, gradient checkpointing, and stable checkpoint recovery across multi-node systems.
Scaling Data Mining with API Efficiency Under TPM Limits
Efficiently mining structured text or graphs using GPT-4 APIs while staying under 2M TPM.
Fixing Mixed Precision Underutilization for Speed Gains
Correctly configuring AMP and autocast led to 2× faster training on NVIDIA GPUs.
Speeding Up Evaluation with Cached Tokenization
Avoiding redundant tokenizer calls accelerated validation by up to 3× during fine-tuning.
publications
Generative Approaches to Kinetic Parameter Inference in Metabolic Networks via Latent Space Exploration
Published in bioRxiv, 2025
We present a novel generative framework that leverages latent space exploration to generate dynamic metabolic models with targeted properties. This work introduces a new approach to controllably infer kinetic parameters in large-scale biological systems using pretrained neural network generators such as REKINDLE and RENAISSANCE.
research_projects
Unified Graph-Based Matching: Text-Image Cross-Modal Retrieval using Scene Graph Alignment for Remote Sensing Applications
We introduce a unified framework for cross-modal retrieval in remote sensing, aligning scene graphs from satellite imagery and text descriptions. Using the STAR dataset, our method encodes graph structures from both modalities and aligns them via contrastive learning. We evaluate multiple similarity strategies—node, edge, global, and hybrid—and propose a benchmark protocol for retrieval. This work opens up new directions for structured vision-language understanding in geospatial domains.
Prompting Beyond Retrieval with GRAD: A Generative Retrieval-Aligned Demonstrator for Robust Few-Shot Reasoning
This project was conducted at DLab, EPFL. We propose GRAD: a generative, retrieval-free demonstration generator for LLMs. GRAD tailors concise, input-specific prompts to improve multi-step reasoning under strict token limits. Unlike RAG, GRAD requires no external retrieval and adapts across out-of-distribution (OOD) domains. Trained only on math data, it generalizes to OOD tasks in physics, chemistry, and CS. It enables scalable, low-cost few-shot learning in resource-constrained settings. This work has been submitted to EMNLP 2025. The code repository will be made public upon acceptance.
Kinetic Parameter Inference in Metabolic Networks via Latent Space Exploration
We present a novel framework to interpret and control the latent spaces of generative neural network models for kinetic metabolic modeling. By perturbing structured latent spaces learned via REKINDLE or RENAISSANCE, our method generates new dynamic models with targeted properties such as specific response times, regulatory bottlenecks, or alternative physiologies, unlocking deeper insight and reusability across metabolic contexts.
GemmaEdu: Enhancing Scientific Learning via Fine-Tuned Language Models and RAG
We developed an educational chatbot built on the quantized Gemma 2 7B model, optimized with Direct Preference Optimization (DPO) and enhanced with Retrieval-Augmented Generation (RAG). By leveraging fine-tuning on student-generated preference data and incorporating relevant external documents, our system significantly improves accuracy in answering STEM multiple-choice questions, outperforming baseline models like Mistral and Llama2.
From Novice to Expert: Dimensionality Reduction and Policy Distillation in Reinforcement Learning for Motor Control
This project investigates how to accelerate motor skill acquisition in reinforcement learning using curriculum-based learning, dimensionality reduction, and policy distillation. Using the MyoSuite Baoding balls task, we explore how expert policies can be transferred to novice agents via PCA-reduced feature and action spaces, offering an efficient alternative to prolonged training times.
Learning-Based Multi-Robot Lane Navigation: Scalable Trajectory Prediction using Neural Networks
This project was conducted at DISAL, EPFL. We explore trajectory generation for multi-robot navigation using neural networks. We propose a scalable alternative to Webots simulation by training models using graph neural networks, reinforcement learning, and imitation learning. The final approach produces accurate trajectories in a lane-based environment, balancing precision and efficiency in robotic control.
work
AI Research Intern — AXA Group Operations
I led applied research and prototyping efforts in multimodal AI, focusing on cross-modal representation learning, graph-based embeddings, and neural search systems. I developed scalable pipelines to generate scene graphs from satellite imagery and knowledge graphs from textual data, and to align their graph embeddings in a shared representation space.
Machine Learning Intern - Pixalione
Developed a machine learning pipeline to forecast daily ad spend on Google Ads based on client-specific campaign data. Deployed a web backend for dynamic budget strategy adjustment, automated alerts, and integration with Azure Cloud infrastructure.
Student Assistant — EPFL
During my studies, I served as a teaching assistant for multiple courses, assisting in lectures, labs, and tutorials.