ProsodyAI Docs
TypeScript SDK

Analyze Audio

Single-utterance audio analysis

The analyze method processes a single audio segment and returns emotion predictions.

Basic Usage

const result = await client.analyze({
  audio: audioBuffer,
});

console.log(result.emotion);     // Primary emotion
console.log(result.confidence);  // Prediction confidence
console.log(result.valence);     // Emotional valence (-1 to 1)
console.log(result.arousal);     // Emotional arousal (0 to 1)
console.log(result.dominance);   // Emotional dominance (0 to 1)

Request Options

AnalyzeRequest

interface AnalyzeRequest {
  // Audio input (one of the following)
  audio?: Buffer | Blob | ArrayBuffer;
  audioBase64?: string;
  audioUrl?: string;
  
  // Configuration
  vertical?: Vertical;
  features?: Feature[];
  
  // Optional metadata
  transcript?: string;
  speakerId?: string;
  sessionId?: string;
  metadata?: Record<string, unknown>;
}

Audio Input Formats

// From a file (Node.js)
import fs from 'fs';

const audioBuffer = fs.readFileSync('audio.wav');
const result = await client.analyze({ audio: audioBuffer });

// From a file input (browser)
const input = document.getElementById('audio-input') as HTMLInputElement;
const file = input.files![0];
const result = await client.analyze({ audio: file });

// From MediaRecorder
mediaRecorder.ondataavailable = async (e) => {
  const result = await client.analyze({ audio: e.data });
};

// From a base64-encoded string
const audioBase64 = 'UklGRiQAAABXQVZFZm10IBAAAA...';
const result = await client.analyze({ audioBase64 });

// From a URL
const result = await client.analyze({
  audioUrl: 'https://storage.example.com/audio/call-123.wav',
});

Supported Audio Formats

Format   Extension   Notes
WAV      .wav        Recommended, uncompressed
MP3      .mp3        Lossy compression
OGG      .ogg        Vorbis codec
WebM     .webm       Browser recording format
FLAC     .flac       Lossless compression

For best results, use a 16 kHz sample rate, 16-bit depth, and mono audio.
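
If your source audio doesn't match this profile, one option is to convert it with ffmpeg before uploading. A minimal sketch; the helper below is illustrative, not part of the SDK:

```typescript
import { spawnSync } from 'node:child_process';

// Build ffmpeg arguments that downmix to mono (-ac 1), resample to
// 16 kHz (-ar 16000), and encode 16-bit PCM WAV (-c:a pcm_s16le) --
// the profile recommended above.
function buildFfmpegArgs(input: string, output: string): string[] {
  return ['-i', input, '-ac', '1', '-ar', '16000', '-c:a', 'pcm_s16le', output];
}

function convertForAnalysis(input: string, output: string): void {
  const { status } = spawnSync('ffmpeg', buildFfmpegArgs(input, output));
  if (status !== 0) throw new Error(`ffmpeg exited with status ${status}`);
}
```

This requires ffmpeg on the PATH; in the browser, re-encoding is typically done server-side instead.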

Features

Specify which features to include in the response:

const result = await client.analyze({
  audio: audioBuffer,
  features: ['emotion', 'prosody', 'vad', 'vertical'],
});

Feature      Description
emotion      Base emotion classification
prosody      Prosodic features (pitch, energy, rhythm)
vad          Valence-Arousal-Dominance scores
vertical     Vertical-specific state and metrics
prediction   Forward predictions (requires session)
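
Since prediction only works inside a session, it can be useful to reject such requests client-side before they hit the API. A sketch, assuming the Feature union from this page; the validator itself is hypothetical, not part of the SDK:

```typescript
type Feature = 'emotion' | 'prosody' | 'vad' | 'vertical' | 'prediction';

// 'prediction' requires an active session, so a request carrying it
// without a sessionId can be rejected up front with a clear message.
function validateFeatures(features: Feature[], sessionId?: string): void {
  if (features.includes('prediction') && !sessionId) {
    throw new Error("'prediction' requires a sessionId");
  }
}
```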

Vertical-Specific Analysis

const result = await client.analyze({
  audio: audioBuffer,
  vertical: 'contact_center',
  features: ['emotion', 'vertical'],
});

// Vertical-specific fields
console.log(result.state);           // "frustrated"
console.log(result.escalationRisk);  // "high"
console.log(result.sentimentScore);  // -0.6
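
Fields like these can drive routing logic in an application. The rule below is a hypothetical example (the threshold and alert condition are assumptions, not SDK behavior):

```typescript
// Hypothetical routing rule: treat high escalation risk or strongly
// negative sentiment as a supervisor-alert condition.
function shouldAlertSupervisor(escalationRisk: string, sentimentScore: number): boolean {
  return escalationRisk === 'high' || sentimentScore <= -0.5;
}
```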

Response

AnalyzeResponse

interface AnalyzeResponse {
  // Base emotion
  emotion: Emotion;
  confidence: number;
  emotionProbabilities: Record<Emotion, number>;
  
  // VAD scores
  valence: number;   // -1 (negative) to 1 (positive)
  arousal: number;   // 0 (calm) to 1 (excited)
  dominance: number; // 0 (submissive) to 1 (dominant)
  
  // Prosodic features (if requested)
  prosody?: {
    pitch: PitchFeatures;
    energy: EnergyFeatures;
    rhythm: RhythmFeatures;
    voiceQuality: VoiceQualityFeatures;
  };
  
  // Vertical-specific (if requested)
  state?: string;
  metrics?: Record<string, unknown>;
  
  // Metadata
  duration: number;
  processedAt: string;
}
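
emotionProbabilities maps every emotion to its probability, and the top-level emotion field should correspond to the largest entry. A small helper to rank them, e.g. for showing the top few candidates in a UI (the helper is illustrative, not part of the SDK):

```typescript
// Return [emotion, probability] pairs sorted by descending probability.
function rankEmotions(probs: Record<string, number>): [string, number][] {
  return Object.entries(probs).sort((a, b) => b[1] - a[1]);
}
```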

Emotion Types

type Emotion =
  | 'neutral'
  | 'happy'
  | 'sad'
  | 'angry'
  | 'fearful'
  | 'disgusted'
  | 'surprised'
  | 'contempt'
  | 'anxious'
  | 'confused'
  | 'excited'
  | 'amused'
  | 'content';
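
Because Emotion is a closed union, TypeScript can check that a switch handles every member. The bucketing below is a rough display grouping of my own, not an SDK mapping; the numeric valence score in the response is the authoritative signal:

```typescript
type Emotion =
  | 'neutral' | 'happy' | 'sad' | 'angry' | 'fearful' | 'disgusted'
  | 'surprised' | 'contempt' | 'anxious' | 'confused' | 'excited'
  | 'amused' | 'content';

// Exhaustive switch: adding a new Emotion member makes this fail to compile
// until the new case is handled.
function valenceBucket(e: Emotion): 'positive' | 'negative' | 'neutral' {
  switch (e) {
    case 'happy': case 'excited': case 'amused': case 'content':
      return 'positive';
    case 'sad': case 'angry': case 'fearful': case 'disgusted':
    case 'contempt': case 'anxious':
      return 'negative';
    case 'neutral': case 'surprised': case 'confused':
      return 'neutral';
  }
}
```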

Examples

With Transcript

Providing a transcript improves accuracy:

const result = await client.analyze({
  audio: audioBuffer,
  transcript: "I've been waiting for 30 minutes and nobody has helped me!",
  vertical: 'contact_center',
});

Healthcare Vertical

const result = await client.analyze({
  audio: patientAudio,
  vertical: 'healthcare',
  features: ['emotion', 'vertical'],
});

console.log(result.state);                    // "anxious"
console.log(result.metrics.depressionMarkers); // 0.3
console.log(result.metrics.clinicalAttention); // "monitor"

Sales Vertical

const result = await client.analyze({
  audio: prospectAudio,
  vertical: 'sales',
  features: ['emotion', 'vertical'],
});

console.log(result.state);                    // "skeptical"
console.log(result.metrics.buyingIntent);     // 0.25
console.log(result.metrics.recommendedAction); // "provide_evidence"

Error Handling

import { ProsodyError, ValidationError, AudioFormatError } from '@prosody/sdk';

try {
  const result = await client.analyze({ audio: audioBuffer });
} catch (error) {
  if (error instanceof ValidationError) {
    console.error('Invalid request:', error.details);
  } else if (error instanceof AudioFormatError) {
    console.error('Unsupported audio format:', error.format);
  } else if (error instanceof ProsodyError) {
    console.error('API error:', error.message);
  }
}
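
These error classes can also drive a retry policy: validation and format errors are permanent, while transient network failures may be worth retrying with backoff. A sketch under those assumptions; the helper names and the retryable/permanent split are illustrative, not SDK behavior:

```typescript
// Delay (ms) before retry `attempt` (0-based): 500 ms doubling, capped at 8 s.
function backoffMs(attempt: number): number {
  return Math.min(500 * 2 ** attempt, 8000);
}

// Run `run`, retrying up to maxAttempts times when isRetryable says
// the failure is transient; permanent errors are rethrown immediately.
async function analyzeWithRetry<T>(
  run: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await run();
    } catch (err) {
      if (attempt + 1 >= maxAttempts || !isRetryable(err)) throw err;
      await new Promise((r) => setTimeout(r, backoffMs(attempt)));
    }
  }
}
```

Usage might look like `analyzeWithRetry(() => client.analyze({ audio }), (e) => !(e instanceof ValidationError || e instanceof AudioFormatError))`.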