Multimodal Modeling of Entrepreneurial Teams: Predicting Opportunity Generation from Audio and Text
Problem Statement & Motivation
Startup teams are often evaluated without rigorous, data-driven methods. This project quantifies a team's creative potential, defined as the number of new ideas it generates, by applying machine learning to multimodal data (audio and text). The goal is to build predictive models from behavioral signals observed during conversations: emotion, personality, confidence, and team dynamics.
Challenges include:
- Processing noisy and subjective emotional speech data.
- Inferring MBTI personalities from informal, multilingual text.
- Avoiding bias in psychological profiling for real-world decision-making.
Our Method
We construct a full ML pipeline that spans emotion detection, personality extraction, and idea prediction:
1. Emotion Recognition from Audio
- Dataset: Thorsten German emotion dataset.
- Preprocessing: silence trimming, segmentation (frame=2048, hop=512), MFCC + ZCR + RMS extraction.
- Model: LSTM-based neural network trained to classify 7 emotions with 85% accuracy.
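The framing step and two of the three per-frame features above (ZCR and RMS) can be sketched with NumPy alone. This is a minimal illustration, not the project's code: the function names are hypothetical, the frame and hop values are taken from the preprocessing bullet and assumed to be in samples, and MFCC extraction is left out because it would normally come from an audio library.

```python
import numpy as np

FRAME = 2048  # frame length from the preprocessing step (assumed to be in samples)
HOP = 512     # hop length from the preprocessing step (assumed to be in samples)


def frame_signal(y, frame=FRAME, hop=HOP):
    """Split a 1-D signal into overlapping frames.

    Signals shorter than one frame are zero-padded to a single frame;
    samples past the last full frame are dropped.
    """
    y = np.asarray(y, dtype=float)
    if len(y) < frame:
        y = np.pad(y, (0, frame - len(y)))
    n_frames = 1 + (len(y) - frame) // hop
    idx = np.arange(frame)[None, :] + hop * np.arange(n_frames)[:, None]
    return y[idx]


def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose sign differs, per frame."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)


def rms_energy(frames):
    """Root-mean-square amplitude per frame."""
    return np.sqrt(np.mean(frames ** 2, axis=1))


def extract_features(y):
    """Return an (n_frames, 2) array with columns [ZCR, RMS]."""
    frames = frame_signal(y)
    return np.stack([zero_crossing_rate(frames), rms_energy(frames)], axis=1)


# Synthetic test tone (not project data): 1 s of a 440 Hz sine at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
features = extract_features(np.sin(2 * np.pi * 440 * t))
```

The resulting sequence of per-frame feature vectors is the kind of input a sequence model such as an LSTM consumes; in a full pipeline the MFCC columns would be concatenated alongside these two.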
2. Personality Prediction from Text
- Dataset: Twisty MBTI tweets (German).
- Preprocessing: tokenization, TF-IDF, LSA, sentiment analysis, topic modeling.
- Models: Logistic Regression, SVMs, BERT, and MLPs; both binary and multiclass (16 MBTI types).
- Output: 4 personality probabilities per speaker (e.g., Extrovert vs. Introvert).
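To relate the four binary outputs to the 16-type multiclass view, the sketch below combines per-axis probabilities into a distribution over MBTI types. It assumes the four axes are statistically independent, which is a simplification introduced here for illustration and not something the project states; the function names and the example probabilities are hypothetical.

```python
from itertools import product

# Standard MBTI axes, first letter listed first.
AXES = [("E", "I"), ("S", "N"), ("T", "F"), ("J", "P")]


def type_distribution(p_first):
    """Map four per-axis probabilities to a distribution over 16 types.

    p_first holds P(E), P(S), P(T), P(J). Axes are treated as
    independent (an assumption made for this sketch only).
    """
    if len(p_first) != len(AXES):
        raise ValueError("expected one probability per MBTI axis")
    dist = {}
    for choice in product((0, 1), repeat=len(AXES)):
        label = "".join(AXES[i][c] for i, c in enumerate(choice))
        prob = 1.0
        for i, c in enumerate(choice):
            prob *= p_first[i] if c == 0 else 1.0 - p_first[i]
        dist[label] = prob
    return dist


def most_likely_type(p_first):
    """Return the type label with the highest joint probability."""
    dist = type_distribution(p_first)
    return max(dist, key=dist.get)


# Made-up probabilities for one speaker: P(E), P(S), P(T), P(J).
example = [0.8, 0.3, 0.6, 0.4]
dist = type_distribution(example)
best = most_likely_type(example)
```

Keeping the four probabilities rather than only the single best label preserves uncertainty for downstream models, which matches the probabilistic output described above.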
3. Idea Generation Modeling
- Features:
  - Speaker: MBTI, emotion, confidence.
  - Team: cohesion, expertise breadth/depth, speaker count.
  - Meeting: duration.
- Models:
  - Classification: SVM, KNN, Logistic Regression.
  - Regression: Kernel Ridge.
- Outcome: Predict the number of ideas per speaker (Model A) and per team (Model B).
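As an illustration of the regression option, the sketch below implements kernel ridge regression in closed form with NumPy: it solves (K + lambda I) alpha = y and predicts with k(x, X) alpha. The RBF kernel, the hyperparameter values, the class name, and the toy data are all assumptions made for this example; the source does not state which kernel or settings were used.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))


class KernelRidgeSketch:
    """Closed-form kernel ridge regression (hypothetical helper class)."""

    def __init__(self, lam=1.0, gamma=0.5):
        self.lam = lam      # ridge penalty (assumed value)
        self.gamma = gamma  # RBF width parameter (assumed value)

    def fit(self, X, y):
        self.X_ = np.asarray(X, dtype=float)
        K = rbf_kernel(self.X_, self.X_, self.gamma)
        # Solve (K + lam * I) alpha = y for the dual coefficients.
        self.alpha_ = np.linalg.solve(
            K + self.lam * np.eye(len(K)), np.asarray(y, dtype=float)
        )
        return self

    def predict(self, X):
        K_new = rbf_kernel(np.asarray(X, dtype=float), self.X_, self.gamma)
        return K_new @ self.alpha_


# Toy data, not project results: two made-up features per speaker
# and a made-up idea count as the target.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y_train = np.array([1.0, 2.0, 3.0, 4.0])

model = KernelRidgeSketch(lam=1e-8, gamma=1.0).fit(X_train, y_train)
train_pred = model.predict(X_train)
```

With a near-zero penalty the model almost interpolates the training targets; a larger penalty trades training fit for smoother predictions, which is usually the safer choice on small behavioral datasets.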
Lessons Learned
- Combining audio and text modalities provides richer signals for behavioral modeling.
- MBTI prediction is harder and more subjective than emotion recognition.
- Real-world data adds variability; dataset diversity and probabilistic outputs are crucial.
Ethical Considerations
- Bias Risk: Personality/emotion labels are subjective and can reflect cultural stereotypes.
- Privacy: Data anonymization was enforced.
- Mitigation: Probabilistic labeling, data balancing, and multimodal validation were applied.