Structured Representations for Fine-Grained Text-to-Image Retrieval in Remote Sensing
This thesis introduces a multimodal framework for fine-grained text-to-image retrieval in remote sensing that combines global and structured representations through scene graphs and semantic region embeddings. It addresses a key limitation of CLIP-style global embeddings, which compress an entire scene into a single vector, by explicitly capturing the spatial, semantic, and relational details critical for fine-grained reasoning.
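
As a minimal sketch of how combining global and region-level representations might score a text-image pair at retrieval time, the snippet below fuses a global cosine similarity with the best-matching semantic-region similarity. The fusion weight `alpha`, the function names, and the use of precomputed embeddings are illustrative assumptions, not the thesis's actual method; a scene-graph relational term is omitted here for brevity.

```python
import numpy as np

def cosine_sim(query: np.ndarray, candidates: np.ndarray) -> np.ndarray:
    """Cosine similarity between a query vector and each row of a matrix."""
    query = query / np.linalg.norm(query)
    candidates = candidates / np.linalg.norm(candidates, axis=-1, keepdims=True)
    return candidates @ query

def retrieval_score(text_emb: np.ndarray,
                    global_img_emb: np.ndarray,
                    region_embs: np.ndarray,
                    alpha: float = 0.5) -> float:
    """Late-fusion score: weighted sum of the global image-text similarity
    and the best-matching region similarity (alpha is a hypothetical weight)."""
    global_term = float(cosine_sim(text_emb, global_img_emb[None, :])[0])
    region_term = float(cosine_sim(text_emb, region_embs).max())
    return alpha * global_term + (1 - alpha) * region_term

# Toy usage: rank two images for one text query with random embeddings.
rng = np.random.default_rng(0)
text = rng.normal(size=512)
images = [
    (rng.normal(size=512), rng.normal(size=(5, 512))),  # (global, regions)
    (rng.normal(size=512), rng.normal(size=(8, 512))),
]
scores = [retrieval_score(text, g, r) for g, r in images]
print(np.argsort(scores)[::-1])  # image indices, best match first
```

The region `max` term lets a query such as "two small boats near a pier" match an image whose global embedding is dominated by water, which is one way structured representations can recover fine-grained detail that a single global vector averages away.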
