Understanding Human Signals — Intelligent Emotion Recognition for Psychotherapy
Multimodal emotion recognition designed for clinical psychotherapy. Hermyon fuses face, voice, text, and posture signals through Mamdani fuzzy inference for transparent, explainable emotional assessment.
Try Hermyon AI directly in your browser. Register with your email for a 60-second trial session.
Three complementary modalities capture the full spectrum of emotional expression in therapy sessions.
Facial expression analysis with 7 basic emotion categories (happy, sad, angry, fearful, surprised, disgusted, neutral). Real-time detection of micro-expressions and emotional transitions.
Speech emotion recognition across 8 categories. Analyzes prosody, pitch, energy, speaking rate, and pause patterns to capture emotional state from the acoustic signal.
Fine-grained text emotion analysis using the GoEmotions taxonomy. Maps patient speech transcripts to 28 categories (27 emotions plus neutral) for deep emotional profiling.
Mamdani-style fuzzy inference system for transparent, interpretable multimodal emotion fusion.
Classic Mamdani-style fuzzy system with linguistically interpretable rules. Each rule maps input modality confidence levels to fused emotional output.
Input and output variables defined with trapezoidal membership functions (low, medium, high), providing smooth transitions and robust handling of uncertainty.
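As a rough illustration of how trapezoidal membership functions turn a crisp confidence score into linguistic degrees, the sketch below fuzzifies a value in [0, 1] into low, medium, and high. The breakpoints and the names `trapezoid`, `TERMS`, and `fuzzify` are assumptions chosen for this example, not Hermyon's actual parameters.

```python
# Hypothetical breakpoints (a, b, c, d) for each linguistic term on [0, 1].
# a and d are the feet of the trapezoid, b and c are its shoulders.
TERMS = {
    "low": (0.0, 0.0, 0.2, 0.4),
    "medium": (0.2, 0.4, 0.6, 0.8),
    "high": (0.6, 0.8, 1.0, 1.0),
}


def trapezoid(x, a, b, c, d):
    """Return the membership degree of x in the trapezoid (a, b, c, d)."""
    if b <= x <= c:
        return 1.0  # flat top, also covers shoulders at the domain edges
    if a < x < b:
        return (x - a) / (b - a)  # rising edge
    if c < x < d:
        return (d - x) / (d - c)  # falling edge
    return 0.0


def fuzzify(confidence):
    """Map a crisp confidence to its degree of membership in every term."""
    return {name: trapezoid(confidence, *params) for name, params in TERMS.items()}


if __name__ == "__main__":
    print(fuzzify(0.3))
```

Because neighbouring trapezoids overlap, a confidence of 0.3 belongs partly to low and partly to medium rather than flipping abruptly between them, which is the smooth transition described above.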
A carefully designed rule base of 14+ rules covering agreement, disagreement, and mixed-signal scenarios across all three modalities.
In clinical settings, transparency is critical. Fuzzy inference provides full reasoning traces — clinicians can see exactly why Hermyon reached a particular emotional assessment. Every rule activation, membership degree, and defuzzified output is fully traceable and explainable.
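To make the idea of a reasoning trace concrete, here is a minimal Mamdani-style sketch for a single emotion: rule strength is the minimum of the antecedent memberships, each output set is clipped at that strength, the clipped sets are combined with max, and the result is defuzzified by centroid. The three rules, the membership breakpoints, and the `infer` function are illustrative assumptions; the real rule base is larger and is not reproduced here.

```python
# Hypothetical trapezoid breakpoints (a, b, c, d) shared by inputs and output.
TERMS = {
    "low": (0.0, 0.0, 0.2, 0.4),
    "medium": (0.2, 0.4, 0.6, 0.8),
    "high": (0.6, 0.8, 1.0, 1.0),
}

# Hypothetical rules: (name, antecedent per modality, consequent term).
RULES = [
    ("R1 agreement", {"face": "high", "voice": "high", "text": "high"}, "high"),
    ("R2 face-voice conflict", {"face": "high", "voice": "low"}, "medium"),
    ("R3 all weak", {"face": "low", "voice": "low", "text": "low"}, "low"),
]


def membership(x, term):
    """Degree to which x belongs to the named trapezoidal term."""
    a, b, c, d = TERMS[term]
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


def infer(inputs, steps=100):
    """Return (fused score or None, trace of rule activations)."""
    trace = []
    for name, antecedent, consequent in RULES:
        # AND is modelled as min over the antecedent memberships.
        strength = min(membership(inputs[var], term) for var, term in antecedent.items())
        trace.append({"rule": name, "strength": strength, "then": consequent})

    # Clip each consequent at its rule strength, aggregate with max,
    # then take the centroid over a discretized output universe.
    numerator = denominator = 0.0
    for i in range(steps + 1):
        y = i / steps
        mu = max(min(t["strength"], membership(y, t["then"])) for t in trace)
        numerator += y * mu
        denominator += mu

    fused = numerator / denominator if denominator > 0 else None
    return fused, trace


if __name__ == "__main__":
    fused, trace = infer({"face": 0.9, "voice": 0.1, "text": 0.5})
    print(fused)
    for step in trace:
        print(step)
```

The returned trace lists every rule with its firing strength, so a reader can see which rule drove the fused value. With only three rules, some inputs fire nothing and `infer` returns `None`, which is one reason a fuller rule base covering mixed-signal cases is needed.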
Understanding what patients talk about and how they feel about it.
Automatic keyword and keyphrase extraction from patient speech using KeyBERT. Identifies the core topics and themes discussed during sessions.
Topic modeling with BERTopic groups related themes into coherent clusters, revealing the underlying structure of patient discourse.
Links detected topics to their associated emotions, building a rich map of which subjects trigger which emotional responses in each patient.
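One simple way to picture the topic-emotion map is as a table of how often each emotion co-occurs with each topic. The sketch below assumes upstream components have already labelled each utterance with a topic and a dominant emotion; the function name `build_topic_emotion_map` and the sample labels are hypothetical.

```python
from collections import Counter, defaultdict


def build_topic_emotion_map(utterances):
    """Turn (topic, emotion) pairs into per-topic emotion proportions."""
    counts = defaultdict(Counter)
    for topic, emotion in utterances:
        counts[topic][emotion] += 1

    topic_map = {}
    for topic, emotion_counts in counts.items():
        total = sum(emotion_counts.values())
        topic_map[topic] = {
            emotion: n / total for emotion, n in emotion_counts.items()
        }
    return topic_map


if __name__ == "__main__":
    # Made-up labels standing in for topic-model and emotion-model output.
    session = [
        ("work", "nervousness"),
        ("work", "nervousness"),
        ("work", "annoyance"),
        ("family", "gratitude"),
    ]
    for topic, emotions in build_topic_emotion_map(session).items():
        print(topic, emotions)
```

Proportions rather than raw counts keep topics comparable even when one subject dominates the session.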
Track emotional evolution across therapy sessions to identify patterns, progress, and regression.
Track emotional trajectories within individual sessions. Identify emotional peaks, valleys, and transitions to understand the therapeutic process.
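Peaks and valleys within a session can be sketched as local extrema of a score sampled over time. The example assumes a signed valence value per time step derived from the fused output; that representation and the `find_extrema` helper are assumptions for illustration, and a production system would likely smooth the signal first.

```python
def find_extrema(values):
    """Return (peak_indices, valley_indices) for strict local extrema.

    Endpoints and flat plateaus are ignored in this simplified version.
    """
    peaks, valleys = [], []
    for i in range(1, len(values) - 1):
        previous, current, following = values[i - 1], values[i], values[i + 1]
        if current > previous and current > following:
            peaks.append(i)
        elif current < previous and current < following:
            valleys.append(i)
    return peaks, valleys


if __name__ == "__main__":
    # Hypothetical valence samples from one session, in time order.
    valence = [0.1, 0.4, 0.2, -0.3, 0.0, 0.5, 0.3]
    print(find_extrema(valence))
```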
Cross-session analysis reveals long-term emotional patterns. Monitor treatment effectiveness and identify breakthrough or regression moments.
Automatic detection of recurring emotional patterns, topic sensitivities, and emotional volatility trends across the patient timeline.
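As one possible reading of an emotional volatility trend, the sketch below measures volatility within a session as the standard deviation of step-to-step valence changes, then fits a least-squares slope across sessions. Both definitions, and the names `session_volatility` and `trend_slope`, are assumptions made for this example rather than the metrics Hermyon uses.

```python
from statistics import pstdev


def session_volatility(valence):
    """Standard deviation of consecutive changes in a session's valence."""
    changes = [b - a for a, b in zip(valence, valence[1:])]
    return pstdev(changes) if changes else 0.0


def trend_slope(series):
    """Least-squares slope of the series against its index (0, 1, 2, ...)."""
    n = len(series)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    covariance = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(series))
    variance = sum((i - mean_x) ** 2 for i in range(n))
    return covariance / variance


if __name__ == "__main__":
    # Hypothetical valence traces for three consecutive sessions.
    sessions = [
        [0.0, 0.1, 0.0, 0.1],
        [0.0, 0.3, -0.1, 0.2],
        [0.0, 0.6, -0.5, 0.4],
    ]
    volatility = [session_volatility(s) for s in sessions]
    print(volatility, trend_slope(volatility))
```

A positive slope suggests sessions are becoming more volatile over time, while a negative slope suggests stabilization.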
A web-based real-time interface designed for clinicians.
Live emotion visualization during therapy sessions. Face, voice, and text emotion streams displayed alongside the fused output and confidence scores.
Automated post-session reports with emotional summary, topic-emotion maps, key moments, and fuzzy rule activation traces for full transparency.
Comprehensive patient profiles with full session history, emotional baselines, topic sensitivities, and longitudinal trend analysis.
Fine-grained emotion taxonomy for deep clinical understanding.
| Category | Emotions |
|---|---|
| Positive | Admiration, Amusement, Approval, Caring, Desire, Excitement, Gratitude, Joy, Love, Optimism, Pride, Relief |
| Negative | Anger, Annoyance, Disappointment, Disapproval, Disgust, Embarrassment, Fear, Grief, Nervousness, Remorse, Sadness |
| Ambiguous | Confusion, Curiosity, Realization, Surprise |
| Neutral | Neutral |
Input streams (video, audio, text transcript) are processed by modality-specific models (CNN/ViT for face, Wav2Vec2 for voice, GoEmotions-RoBERTa for text). Confidence vectors feed into the Mamdani fuzzy inference engine, which produces fused emotion assessments with full reasoning traces. Results flow to the clinical dashboard and longitudinal storage for cross-session analysis.