Back to site
ProsodyAI Docs
LangChain Integration

Building Agents

Create emotion-aware AI agents with LangChain

Building Emotion-Aware Agents

This guide shows how to build AI agents that understand and respond to emotional context.

Basic Agent Setup

import { ProsodyEmotionTool, ProsodyPredictionTool } from '@prosody/langchain';
import { ChatOpenAI } from '@langchain/openai';
import { AgentExecutor, createToolCallingAgent } from 'langchain/agents';
import { ChatPromptTemplate } from '@langchain/core/prompts';

// Prosody tools shared by the agent and its executor. The emotion
// analyzer is scoped to the contact-center vertical; the prediction
// tool uses account defaults.
const emotionTool = new ProsodyEmotionTool({
  apiKey: process.env.PROSODY_API_KEY,
  vertical: 'contact_center',
});

const predictionTool = new ProsodyPredictionTool({
  apiKey: process.env.PROSODY_API_KEY,
});

const tools = [emotionTool, predictionTool];

// Chat model driving the agent.
const llm = new ChatOpenAI({
  model: 'gpt-4',
  temperature: 0.7,
});

// System prompt: analyze emotion first, then adapt tone per emotion.
const prompt = ChatPromptTemplate.fromMessages([
  ['system', `You are an empathetic customer service agent.

When handling customer interactions:
1. Always analyze the customer's emotional state first
2. Acknowledge their feelings before addressing the issue
3. Adapt your tone based on their emotional state
4. If escalation risk is high, prioritize de-escalation

Tone guidelines by emotion:
- Frustrated/Angry: Be calm, apologetic, and solution-focused
- Confused: Be patient and provide clear explanations
- Satisfied: Be friendly and confirm resolution
- Anxious: Be reassuring and provide clear next steps`],
  ['human', '{input}'],
  ['placeholder', '{agent_scratchpad}'],
]);

// Wire tools + model + prompt into an executable agent.
const agent = createToolCallingAgent({ llm, tools, prompt });

const executor = new AgentExecutor({ agent, tools, verbose: true });

Agent Patterns

Reactive Agent

Responds to the customer's emotional state in real time:

// Reactive pattern: ask the agent to analyze the caller's emotion from
// audio, then produce a matching response.
async function handleCustomerInteraction(audioUrl: string) {
  const task = `A customer is on the line. Analyze their emotional state 
from this audio: ${audioUrl}
Then provide an appropriate response to help them.`;

  const { output } = await executor.invoke({ input: task });
  return output;
}

Proactive Agent

Uses predictions to prevent issues:

// Proactive pattern: poll session predictions and surface intervention
// or retention strategies before problems materialize.
async function monitorConversation(sessionId: string) {
  const task = `Check the predictions for session ${sessionId}.
If escalation risk is above 50%, suggest intervention strategies.
If churn risk is high, identify retention opportunities.`;

  const { output } = await executor.invoke({ input: task });
  return output;
}

Multi-Modal Agent

Handles both voice and text:

// Multi-modal variant: voice prosody plus written-text sentiment.
// NOTE(review): `textSentimentTool` is not defined in this snippet —
// presumably constructed elsewhere; confirm before running.
const multiModalPrompt = ChatPromptTemplate.fromMessages([
  ['system', `You are a multi-modal customer service agent.
    
You can analyze:
- Voice audio for prosodic emotional signals
- Text messages for written sentiment

Always combine insights from both modalities when available.`],
  ['human', '{input}'],
  ['placeholder', '{agent_scratchpad}'],
]);

const multiModalAgent = createToolCallingAgent({
  llm,
  tools: [emotionTool, predictionTool, textSentimentTool],
  prompt: multiModalPrompt,
});

Conversation Memory

Track emotional state across turns:

import { BufferMemory } from 'langchain/memory';
import { ConversationChain } from 'langchain/chains';

// Standard LangChain buffer memory: keeps the running conversation under
// the 'chat_history' key and returns message objects (not a flattened
// string) so chat-style prompts can consume them directly.
const memory = new BufferMemory({
  memoryKey: 'chat_history',
  returnMessages: true,
});

/**
 * BufferMemory subclass that additionally records the emotion reported
 * on each saved turn, so callers can reason about the conversation's
 * emotional trajectory via {@link getEmotionTrend}.
 */
class EmotionAwareMemory extends BufferMemory {
  // One entry per turn that carried a usable emotion signal.
  emotionHistory: Array<{
    turn: number;
    emotion: string;
    valence: number;
  }> = [];

  /**
   * Persist the turn as usual, then record its emotion/valence.
   * Guards on both fields: the original truthy check on `output.emotion`
   * alone could push an `undefined` valence, which made the averages in
   * getEmotionTrend() evaluate to NaN ('stable' for every call).
   */
  async saveContext(
    input: Record<string, unknown>,
    output: Record<string, unknown>,
  ): Promise<void> {
    await super.saveContext(input, output);

    if (typeof output.emotion === 'string' && typeof output.valence === 'number') {
      this.emotionHistory.push({
        turn: this.emotionHistory.length + 1,
        emotion: output.emotion,
        valence: output.valence,
      });
    }
  }

  /**
   * Classify the valence trajectory over the last (up to) three recorded
   * turns by comparing their mean valence to the earliest of them.
   *
   * @returns 'improving' | 'declining' | 'stable', or 'insufficient_data'
   *          when fewer than two turns have been recorded.
   */
  getEmotionTrend(): string {
    if (this.emotionHistory.length < 2) return 'insufficient_data';

    const recent = this.emotionHistory.slice(-3);
    const avgValence =
      recent.reduce((sum, entry) => sum + entry.valence, 0) / recent.length;
    const firstValence = recent[0].valence;

    // 0.2 is the dead-band: smaller swings are reported as 'stable'.
    if (avgValence > firstValence + 0.2) return 'improving';
    if (avgValence < firstValence - 0.2) return 'declining';
    return 'stable';
  }
}

Agent with Session Management

import { ProsodySessionTool, ProsodyEmotionTool } from '@prosody/langchain';

/**
 * Stateful wrapper that pairs one Prosody session with a tool-calling
 * agent: each utterance is scored for emotion, then answered in context.
 */
class ConversationAgent {
  private readonly sessions: ProsodySessionTool;
  private readonly emotions: ProsodyEmotionTool;
  private readonly runner: AgentExecutor;
  private activeSessionId?: string;

  constructor(apiKey: string) {
    this.sessions = new ProsodySessionTool({ apiKey, vertical: 'contact_center' });
    this.emotions = new ProsodyEmotionTool({ apiKey, vertical: 'contact_center' });

    // Same tool set is given to the agent and to its executor.
    const tools = [this.emotions, this.sessions];
    const agent = createToolCallingAgent({
      llm: new ChatOpenAI({ model: 'gpt-4' }),
      tools,
      prompt: this.buildPrompt(),
    });
    this.runner = new AgentExecutor({ agent, tools });
  }

  /** Open a Prosody session; subsequent utterances attach to it. */
  async startConversation(metadata?: Record<string, unknown>) {
    const session = await this.sessions.invoke({
      action: 'create',
      metadata,
    });
    this.activeSessionId = session.sessionId;
    return session;
  }

  /**
   * Score one utterance inside the active session, then ask the agent
   * for a response that fits the detected emotion and escalation risk.
   *
   * @throws Error when startConversation() has not been called.
   */
  async processUtterance(audio: Buffer, speakerId: string) {
    if (!this.activeSessionId) {
      throw new Error('No active session');
    }

    // Attaching the utterance returns its emotion analysis.
    const emotionResult = await this.sessions.invoke({
      action: 'add_utterance',
      sessionId: this.activeSessionId,
      audio: audio.toString('base64'),
      speakerId,
    });

    const agentResponse = await this.runner.invoke({
      input: `The ${speakerId} just spoke. Their emotion is ${emotionResult.emotion} 
with ${emotionResult.metrics?.escalationRisk} escalation risk.
Provide an appropriate response.`,
    });

    return {
      emotion: emotionResult,
      response: agentResponse.output,
    };
  }

  /** Close the session and return its summary; null when none is open. */
  async endConversation() {
    if (!this.activeSessionId) return null;

    const summary = await this.sessions.invoke({
      action: 'end',
      sessionId: this.activeSessionId,
    });
    this.activeSessionId = undefined;
    return summary;
  }

  // Minimal system prompt steering the agent toward de-escalation.
  private buildPrompt() {
    return ChatPromptTemplate.fromMessages([
      ['system', `You are managing a customer service conversation.
Track the emotional trajectory and adapt your responses.
If you detect escalation risk, prioritize de-escalation.`],
      ['human', '{input}'],
      ['placeholder', '{agent_scratchpad}'],
    ]);
  }
}

// Usage: drive a two-turn customer conversation end to end.
const agent = new ConversationAgent(process.env.PROSODY_API_KEY!);

await agent.startConversation({ callId: 'abc123' });

const firstTurn = await agent.processUtterance(customerAudio1, 'customer');
console.log('Customer emotion:', firstTurn.emotion);
console.log('Agent response:', firstTurn.response);

const secondTurn = await agent.processUtterance(customerAudio2, 'customer');
// Continue conversation...

const summary = await agent.endConversation();
console.log('Conversation summary:', summary);
from prosody_langchain import ProsodySessionTool, ProsodyEmotionTool
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate

class ConversationAgent:
    """Pairs one Prosody session with a LangChain tool-calling agent.

    Each utterance is scored for emotion, then the agent is asked for a
    response that fits the detected emotional state.
    """

    def __init__(self, api_key: str):
        # Session and emotion tools share the contact-center vertical.
        self.session_tool = ProsodySessionTool(
            api_key=api_key,
            vertical="contact_center"
        )
        self.emotion_tool = ProsodyEmotionTool(
            api_key=api_key,
            vertical="contact_center"
        )
        self.session_id = None

        chat_model = ChatOpenAI(model="gpt-4")
        prompt_template = ChatPromptTemplate.from_messages([
            ("system", "You are managing a customer service conversation..."),
            ("human", "{input}"),
            ("placeholder", "{agent_scratchpad}"),
        ])

        tool_agent = create_tool_calling_agent(
            llm=chat_model,
            tools=[self.emotion_tool, self.session_tool],
            prompt=prompt_template
        )
        self.executor = AgentExecutor(
            agent=tool_agent,
            tools=[self.emotion_tool, self.session_tool]
        )

    async def start_conversation(self, metadata=None):
        """Create a Prosody session and remember its id for later turns."""
        session = await self.session_tool.ainvoke({
            "action": "create",
            "metadata": metadata
        })
        self.session_id = session["session_id"]
        return session

    async def process_utterance(self, audio: bytes, speaker_id: str):
        """Attach one utterance to the session and get the agent's reply.

        Raises:
            ValueError: if no session has been started.
        """
        if not self.session_id:
            raise ValueError("No active session")

        import base64  # local import keeps this doc snippet self-contained

        # Adding the utterance returns its emotion analysis.
        emotion_result = await self.session_tool.ainvoke({
            "action": "add_utterance",
            "session_id": self.session_id,
            "audio": base64.b64encode(audio).decode(),
            "speaker_id": speaker_id
        })

        agent_response = await self.executor.ainvoke({
            "input": f"Customer emotion: {emotion_result['emotion']}. Respond appropriately."
        })

        return {
            "emotion": emotion_result,
            "response": agent_response["output"]
        }

# Usage
agent = ConversationAgent(os.environ["PROSODY_API_KEY"])
await agent.start_conversation({"call_id": "abc123"})
result = await agent.process_utterance(audio_bytes, "customer")

Real-Time Coaching Agent

An agent that provides real-time coaching to human agents:

/**
 * Reviews a human agent's reply against the customer's measured emotion
 * and asks the LLM agent for structured coaching feedback.
 *
 * NOTE(review): `executor` is never initialized here and
 * `parseCoachingResponse` is not defined in this class — both appear to
 * be elided from the example; confirm against the full sample before use.
 */
class CoachingAgent {
  private executor: AgentExecutor;
  
  /**
   * Build a coaching prompt from the latest exchange and return the
   * parsed advice.
   *
   * @param customerEmotion     latest emotion analysis for the customer
   * @param agentResponse       the human agent's most recent reply
   * @param conversationHistory prior turns; only the last 5 are included
   *                            in the prompt
   * @returns structured coaching advice parsed from the LLM output
   */
  async provideCoaching(
    customerEmotion: EmotionResult,
    agentResponse: string,
    conversationHistory: string[]
  ): Promise<CoachingAdvice> {
    const response = await this.executor.invoke({
      input: `Analyze this customer service interaction:

Customer Emotion: ${customerEmotion.emotion} (${customerEmotion.confidence} confidence)
Escalation Risk: ${customerEmotion.metrics?.escalationRisk}
Valence: ${customerEmotion.valence}

Agent's Response: "${agentResponse}"

Recent History:
${conversationHistory.slice(-5).join('\n')}

Provide coaching feedback:
1. Was the agent's response appropriate for the customer's emotional state?
2. What could they do better?
3. Specific phrases to use or avoid?`,
    });
    
    return this.parseCoachingResponse(response.output);
  }
}

For production deployments, consider using streaming responses for lower-latency coaching suggestions.