Multimodal Modeling of Entrepreneurial Teams: Predicting Opportunity Generation from Audio and Text

Problem Statement & Motivation

Startup teams are often evaluated without rigorous, data-driven methods. This project quantifies a team's creative potential, defined as the number of new ideas generated, by applying machine learning to multimodal data (audio and text). The goal is to build predictive models from behavioral signals observed during conversations: emotion, personality, confidence, and team dynamics.

Challenges include:

Processing noisy and subjective emotional speech data.
Inferring MBTI personalities from informal, multilingual text.
Avoiding bias in psychological profiling for real-world decision-making.

Our Method

We construct a full ML pipeline that spans emotion detection, personality extraction, and idea prediction:

1. Emotion Recognition from Audio

Dataset: Thorsten German emotion dataset.
Preprocessing: silence trimming, segmentation (frame=2048, hop=512), MFCC + ZCR + RMS extraction.
Model: LSTM-based neural network that classifies 7 emotions, reaching 85% accuracy.
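
To make the framing step concrete, here is a minimal sketch of per-frame ZCR and RMS extraction using the stated frame and hop sizes. It uses only the Python standard library; MFCC extraction is left out because it would normally come from an audio library. The function names, the 16 kHz sample rate, and the synthetic sine input are illustrative assumptions, not the project's actual code.

```python
import math

FRAME_LENGTH = 2048  # samples per frame (from the preprocessing settings)
HOP_LENGTH = 512     # samples between consecutive frame starts


def frame_signal(samples, frame_length=FRAME_LENGTH, hop_length=HOP_LENGTH):
    """Split a list of samples into overlapping frames, without padding."""
    if len(samples) < frame_length:
        return []
    n_frames = 1 + (len(samples) - frame_length) // hop_length
    return [
        samples[i * hop_length : i * hop_length + frame_length]
        for i in range(n_frames)
    ]


def rms(frame):
    """Root-mean-square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / len(frame)


def frame_features(samples):
    """Return one (zcr, rms) pair per frame."""
    return [(zero_crossing_rate(f), rms(f)) for f in frame_signal(samples)]


# Hypothetical input: one second of a 440 Hz sine at an assumed 16 kHz rate.
SAMPLE_RATE = 16000
sine = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
features = frame_features(sine)
```

Each frame yields one feature pair, so the resulting sequence is the kind of time-ordered input an LSTM can consume once the MFCC coefficients are appended to every frame.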

2. Personality Prediction from Text

Dataset: Twisty MBTI tweets (German).
Preprocessing: tokenization, TF-IDF, LSA, sentiment analysis, topic modeling.
Models: Logistic Regression, SVMs, BERT, and MLPs; both binary and multiclass (16 MBTI types).
Output: 4 probabilities per speaker, one for each MBTI dimension (e.g., Extrovert vs. Introvert).
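
As an illustration of how the four binary outputs relate to the 16 MBTI types, the sketch below combines one probability per dimension into a distribution over full types. It assumes the four dimensions are independent, which is a simplification introduced here and not a claim about the project's models. The function names and the example probabilities are hypothetical.

```python
from itertools import product

# Each MBTI dimension as (first letter, second letter), in standard order.
AXES = [("E", "I"), ("S", "N"), ("T", "F"), ("J", "P")]


def type_distribution(p_first):
    """Map [P(E), P(S), P(T), P(J)] to a probability for each of 16 types.

    Assumes the four dimensions are statistically independent.
    """
    if len(p_first) != len(AXES):
        raise ValueError("expected one probability per MBTI dimension")
    dist = {}
    for choice in product((0, 1), repeat=len(AXES)):
        label = ""
        prob = 1.0
        for (first, second), p, c in zip(AXES, p_first, choice):
            label += first if c == 0 else second
            prob *= p if c == 0 else 1.0 - p
        dist[label] = prob
    return dist


def most_likely_type(p_first):
    """Return the type label with the highest joint probability."""
    dist = type_distribution(p_first)
    return max(dist, key=dist.get)


# Hypothetical speaker: P(E)=0.8, P(S)=0.3, P(T)=0.6, P(J)=0.55.
speaker_probs = [0.8, 0.3, 0.6, 0.55]
speaker_dist = type_distribution(speaker_probs)
speaker_type = most_likely_type(speaker_probs)
```

Keeping the four probabilities as features, instead of collapsing them to a single label, preserves the model's uncertainty for the downstream idea-generation models.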

3. Idea Generation Modeling

Features:
- Speaker: MBTI, emotion, confidence.
- Team: cohesion, expertise breadth/depth, speaker count.
- Meeting: duration.
Models:
- Classification: SVM, KNN, Logistic Regression.
- Regression: Kernel Ridge.
Outcome: predicted number of ideas per speaker (Model A) and per team (Model B).
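
For readers unfamiliar with Kernel Ridge regression, the following is a minimal standard-library sketch of the technique: it solves (K + lam * I) * alpha = y on the training set and predicts with a kernel-weighted sum. The RBF kernel, the hyperparameter values, and the toy feature vectors are assumptions made for illustration; the project's actual kernel, features, and settings are not specified here.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_kernel_ridge(X, y, lam=1.0, gamma=0.5):
    """Return dual coefficients alpha solving (K + lam * I) alpha = y."""
    n = len(X)
    K = [
        [rbf_kernel(X[i], X[j], gamma) + (lam if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return solve_linear_system(K, y)


def predict_kernel_ridge(X_train, alpha, x_new, gamma=0.5):
    """Predict a target as a kernel-weighted sum over training points."""
    return sum(a * rbf_kernel(xt, x_new, gamma) for a, xt in zip(alpha, X_train))


# Hypothetical toy data: [confidence score, P(Extrovert)] -> idea count.
X_toy = [[0.2, 0.3], [0.5, 0.6], [0.7, 0.4], [0.9, 0.8]]
y_toy = [1.0, 2.0, 2.0, 4.0]
alpha_toy = fit_kernel_ridge(X_toy, y_toy, lam=0.1, gamma=0.5)
estimate = predict_kernel_ridge(X_toy, alpha_toy, [0.6, 0.5], gamma=0.5)
```

The regularization strength lam trades training fit against smoothness: a value near zero interpolates the training targets, while a large value shrinks predictions toward zero.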

Lessons Learned

Combining audio and text modalities provides richer signals for behavioral modeling.
MBTI prediction is harder and more subjective than emotion recognition.
Real-world data adds variability; diverse training datasets and probabilistic outputs are crucial for handling it.

Ethical Considerations

Bias Risk: Personality/emotion labels are subjective and can reflect cultural stereotypes.
Privacy: Data anonymization was enforced.
Mitigation: Probabilistic labeling, data balancing, and multimodal validation were applied.

Oussama Gabouj