Fig.2 Training Diagram.
Fig.3 Run-time Diagram.
The code for this paper is publicly available here.
| Primary Emotion (A) | Reference Emotion (B) | Mixed Effects (A+B) |
|---|---|---|
| Surprise | Happy | Delight |
| Surprise | Angry | Outrage |
| Surprise | Sad | Disappointment |

(A) Mixed Emotion Evaluation (All Speech Samples are Synthesized from Text)

In this section, listeners can hear how the characteristics of other emotions ('Angry', 'Happy' or 'Sad') are introduced into 'Surprise'.

| Only Surprise | Only Angry | Mixing Surprise with Angry | Only Sad | Mixing Surprise with Sad | Only Happy | Mixing Surprise with Happy |
|---|---|---|---|---|---|---|

(B) Secondary Emotion Evaluation (All Speech Samples are Synthesized from Text)

In this section, listeners can hear how the mixed emotions resemble the secondary emotions described in psychology ('Outrage', 'Disappointment', or 'Delight').

| Surprise | Outrage | Disappointment | Delight |
|---|---|---|---|

(C) Controllability (All Speech Samples are Synthesized from Text)

In this section, we show the controllability of the proposed framework, for example, by adjusting the percentage of each emotion in the mixed emotional effect.

| 100% Surprise + 0% Angry | 100% Surprise + 30% Angry | 100% Surprise + 60% Angry | 100% Surprise + 90% Angry |
|---|---|---|---|
| 100% Surprise + 0% Sad | 100% Surprise + 30% Sad | 100% Surprise + 60% Sad | 100% Surprise + 90% Sad |
| 100% Surprise + 0% Happy | 100% Surprise + 30% Happy | 100% Surprise + 60% Happy | 100% Surprise + 90% Happy |
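
The grid of sample conditions above can be enumerated programmatically. The following is a minimal sketch, not the authors' code: the names `control_grid` and `label` are hypothetical, and it only builds the percentage settings listed in the table. How those percentages are consumed by the synthesizer is not described on this page.

```python
from typing import Dict, List, Sequence


def control_grid(
    primary: str,
    references: Sequence[str],
    levels: Sequence[int] = (0, 30, 60, 90),
) -> List[List[Dict[str, int]]]:
    """Build one row of mixes per reference emotion.

    The primary emotion is held at 100% while the reference emotion
    is swept over `levels` (percentages). Hypothetical helper.
    """
    return [
        [{primary: 100, ref: level} for level in levels]
        for ref in references
    ]


def label(mix: Dict[str, int]) -> str:
    """Format a mix in the same style as the table cells."""
    return " + ".join(f"{pct}% {name}" for name, pct in mix.items())


grid = control_grid("Surprise", ["Angry", "Sad", "Happy"])
for row in grid:
    print(" | ".join(label(mix) for mix in row))
```

Each printed row corresponds to one row of the controllability table, with the primary emotion fixed and the reference emotion increasing from left to right.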

(D) Ablation Study

In this section, we show how the relative scheme improves the emotional intelligibility of the synthesized speech.

| Synthesized Angry (Proposed w/o Relative Scheme) | Synthesized Angry (Proposed w/ Relative Scheme) | Reference Angry (Ground Truth) |
|---|---|---|
| Synthesized Surprise (Proposed w/o Relative Scheme) | Synthesized Surprise (Proposed w/ Relative Scheme) | Reference Surprise (Ground Truth) |
| Synthesized Sad (Proposed w/o Relative Scheme) | Synthesized Sad (Proposed w/ Relative Scheme) | Reference Sad (Ground Truth) |
| Synthesized Happy (Proposed w/o Relative Scheme) | Synthesized Happy (Proposed w/ Relative Scheme) | Reference Happy (Ground Truth) |

(E) Further Investigation I: Bittersweet? Both Happy and Sad

In this section, we synthesize a mixed feeling of Happy and Sad. (All speech samples are synthesized from text.)

| Synthesized Happy | Synthesized Sad | Mixing 100% Happy with 100% Sad | Mixing 90% Happy with 90% Sad | Mixing 80% Happy with 80% Sad | Mixing 70% Happy with 70% Sad |
|---|---|---|---|---|---|
| Synthesized Happy | Synthesized Sad | Mixing 100% Sad with 100% Happy | Mixing 90% Sad with 90% Happy | Mixing 80% Sad with 80% Happy | Mixing 70% Sad with 70% Happy |

(E) Further Investigation II: Emotion Transition

In this section, we build an emotion transition system that gradually shifts the emotional state from one emotion to another. (All speech samples are synthesized from text.)

(1) Angry <---> Surprise

| 100% Angry | 80% Angry with 20% Surprise | 60% Angry with 40% Surprise | 60% Surprise with 40% Angry | 80% Surprise with 20% Angry | 100% Surprise |
|---|---|---|---|---|---|

(2) Sad <---> Angry

| 100% Sad | 80% Sad with 20% Angry | 60% Sad with 40% Angry | 60% Angry with 40% Sad | 80% Angry with 20% Sad | 100% Angry |
|---|---|---|---|---|---|

(3) Sad <---> Happy

| 100% Sad | 80% Sad with 20% Happy | 60% Sad with 40% Happy | 60% Happy with 40% Sad | 80% Happy with 20% Sad | 100% Happy |
|---|---|---|---|---|---|
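
The transition steps listed for each emotion pair follow a simple complementary schedule, which can be sketched as below. This is an illustrative assumption rather than the authors' implementation: `transition_schedule` and `describe` are hypothetical helpers that only reproduce the percentage labels shown above, not the synthesis itself.

```python
from typing import Dict, List, Sequence


def transition_schedule(
    source: str,
    target: str,
    steps: Sequence[int] = (100, 80, 60, 40, 20, 0),
) -> List[Dict[str, int]]:
    """Return mixes that move from 100% `source` to 100% `target`.

    At every step the two percentages sum to 100. Hypothetical helper.
    """
    return [{source: s, target: 100 - s} for s in steps]


def describe(mix: Dict[str, int]) -> str:
    """Label a mix with the dominant emotion first, omitting 0% entries."""
    parts = sorted(
        ((pct, name) for name, pct in mix.items() if pct > 0),
        reverse=True,
    )
    return " with ".join(f"{pct}% {name}" for pct, name in parts)


for mix in transition_schedule("Angry", "Surprise"):
    print(describe(mix))
```

Calling `transition_schedule("Sad", "Angry")` or `transition_schedule("Sad", "Happy")` yields the schedules for the other two pairs in the same way.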

[2] R. Plutchik, “The nature of emotions: Human emotions have deep evolutionary roots, a fact that may explain their complexity and provide tools for clinical practice,” American Scientist, vol. 89, no. 4, pp. 344–350, 2001.