Multimodal Modeling of Entrepreneurial Teams: Predicting Opportunity Generation from Audio and Text

Problem Statement & Motivation

Startup teams are often evaluated without rigorous, data-driven methods. This project quantifies a team’s creative potential, defined as the number of new ideas generated, by applying machine learning to multimodal data (audio and text). The goal is to build predictive models from behavioral signals observed during conversations: emotion, personality, confidence, and team dynamics.

Challenges include:

  • Processing noisy and subjective emotional speech data.
  • Inferring MBTI personalities from informal, multilingual text.
  • Avoiding bias in psychological profiling for real-world decision-making.

Our Method

We construct a full ML pipeline that spans emotion detection, personality extraction, and idea prediction:

1. Emotion Recognition from Audio

  • Dataset: Thorsten German emotion dataset.
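  • Preprocessing: silence trimming, segmentation into overlapping frames (frame length 2048 samples, hop length 512 samples), and extraction of MFCC, zero-crossing rate (ZCR), and root-mean-square (RMS) energy features.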
  • Model: LSTM-based neural network trained to classify 7 emotions with 85% accuracy.
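
The framing step above can be illustrated with a minimal standard-library sketch. It computes only ZCR and RMS per frame, using the stated frame length and hop length; MFCC extraction is left out because it would normally come from an audio library. The function names, the 16 kHz sampling rate, and the synthetic test tone are assumptions for illustration, not the project's actual code.

```python
import math

FRAME_LENGTH = 2048  # samples per frame, as stated in the preprocessing step
HOP_LENGTH = 512     # samples between consecutive frame starts


def frame_signal(samples, frame_length=FRAME_LENGTH, hop_length=HOP_LENGTH):
    """Split a 1-D list of samples into overlapping frames (no padding)."""
    last_start = len(samples) - frame_length
    return [samples[start:start + frame_length]
            for start in range(0, last_start + 1, hop_length)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root-mean-square amplitude of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def frame_features(samples):
    """Return one (zcr, rms) pair per frame, i.e. a sequence for a sequence model."""
    return [(zero_crossing_rate(f), rms_energy(f)) for f in frame_signal(samples)]


# Synthetic 440 Hz tone, one second long, at an ASSUMED 16 kHz sampling rate.
SAMPLE_RATE = 16000
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
features = frame_features(tone)
print(len(features), features[0])
```

Each recording becomes a variable-length sequence of per-frame feature vectors, which is the kind of input an LSTM consumes; in a fuller pipeline the MFCC coefficients would be appended to each pair.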

2. Personality Prediction from Text

  • Dataset: Twisty MBTI tweets (German).
  • Preprocessing: tokenization, TF-IDF, LSA, sentiment analysis, topic modeling.
  • Models: Logistic Regression, SVMs, BERT, and MLPs, trained in both binary (one classifier per MBTI dimension) and multiclass (16 MBTI types) settings.
  • Output: four probabilities per speaker, one per MBTI dimension (e.g., Extrovert vs. Introvert).
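
To show how the per-dimension output relates to the 16 MBTI types, here is a small sketch that expands four probabilities into a distribution over all types. It assumes the four dimensions are statistically independent, which is a simplification and not something the project states; the function name and the example probabilities are hypothetical.

```python
from itertools import product

# Each MBTI dimension as (letter when the probability is high, letter otherwise).
DIMENSIONS = [("E", "I"), ("S", "N"), ("T", "F"), ("J", "P")]


def type_distribution(dim_probs):
    """Expand four per-dimension probabilities into a distribution over 16 types.

    dim_probs[i] is P(first letter of DIMENSIONS[i]).
    ASSUMPTION: dimensions are treated as independent, so probabilities multiply.
    """
    if len(dim_probs) != len(DIMENSIONS):
        raise ValueError("expected one probability per MBTI dimension")
    dist = {}
    for choice in product((0, 1), repeat=len(DIMENSIONS)):
        label = "".join(DIMENSIONS[i][c] for i, c in enumerate(choice))
        prob = 1.0
        for i, c in enumerate(choice):
            prob *= dim_probs[i] if c == 0 else 1.0 - dim_probs[i]
        dist[label] = prob
    return dist


# Hypothetical classifier outputs: P(E), P(S), P(T), P(J) for one speaker.
speaker = [0.8, 0.3, 0.6, 0.4]
dist = type_distribution(speaker)
most_likely = max(dist, key=dist.get)
print(most_likely, round(dist[most_likely], 4))
```

Keeping the four probabilities rather than a single hard label preserves the classifier's uncertainty for the downstream idea-generation models.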

3. Idea Generation Modeling

  • Features:
    • Speaker: MBTI, emotion, confidence.
    • Team: cohesion, expertise breadth/depth, speaker count.
    • Meeting: duration.
  • Models:
    • Classification: SVM, KNN, Logistic Regression.
    • Regression: Kernel Ridge.
  • Outcome: predict the number of ideas per speaker (Model A) and per team (Model B).
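
As a rough illustration of the regression variant, the sketch below fits kernel ridge regression on a few team-level feature rows using only the standard library. The source names Kernel Ridge but not the kernel or hyperparameters, so the RBF kernel, `alpha`, `gamma`, the feature layout, and all numbers are assumptions made up for this example.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) similarity between two feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_kernel_ridge(X, y, alpha=1e-3, gamma=0.5):
    """Return dual coefficients c solving (K + alpha * I) c = y."""
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], gamma) + (alpha if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear_system(K, y)


def predict(X_train, coeffs, x_new, gamma=0.5):
    """Prediction is a kernel-weighted sum over the training rows."""
    return sum(c * rbf_kernel(x, x_new, gamma) for c, x in zip(coeffs, X_train))


# HYPOTHETICAL team rows: [cohesion, expertise breadth, speaker count, duration (h)]
X = [[0.2, 1.0, 3, 0.5], [0.5, 2.0, 4, 1.0], [0.8, 3.0, 5, 1.0], [0.9, 4.0, 4, 1.5]]
y = [2.0, 5.0, 8.0, 9.0]  # made-up idea counts per team
coeffs = fit_kernel_ridge(X, y)
fitted = [predict(X, coeffs, row) for row in X]
print([round(v, 2) for v in fitted])
```

With a small `alpha` the model nearly interpolates the training rows; a larger value trades training fit for smoother predictions. In practice the features would likely need scaling first, since an RBF kernel is sensitive to differences in feature magnitude.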

Lessons Learned

  • Combining audio and text modalities provides richer signals for behavioral modeling.
  • MBTI prediction is harder and more subjective than emotion recognition.
  • Real-world data adds variability, so diverse training data and probabilistic (rather than hard-label) outputs are crucial.

Ethical Considerations

  • Bias Risk: Personality/emotion labels are subjective and can reflect cultural stereotypes.
  • Privacy: Data anonymization was enforced.
  • Mitigation: Probabilistic labeling, data balancing, and multimodal validation were applied.