Course Projects

Distributed Movie Recommendation Pipelines with Apache Spark

This project builds a full-scale movie recommendation system on Apache Spark, combining data analytics with keyword-based filtering. Implemented on the MovieLens dataset, the system supports efficient data preprocessing, incremental rating updates, and personalized movie recommendations through locality-sensitive hashing (LSH) and predictive models.

NameCoin on Peerster: A Blockchain-Based Decentralized DNS Implementation

This project explores the design and implementation of a decentralized DNS system built on a blockchain and a gossip-based peer-to-peer protocol. The system supports secure domain registration, updates, transfers, and resolution, with robust anti-entropy synchronization and Proof-of-Work consensus. The project evaluates network resilience, consensus reliability, and mining efficiency.

Download Report

Multimodal Modeling of Entrepreneurial Teams: Predicting Opportunity Generation from Audio and Text

This project combines personality and emotion detection with machine learning to predict the number of ideas generated by entrepreneurial teams. We process multimodal data (transcripts and audio) to extract MBTI traits, emotional profiles, and speaker features, and use these to model team-level idea generation. The work contributes to behavioral modeling for early-stage startup evaluation.

Download Report