ProsodyAI Docs
TypeScript SDK

Fine-Tuning

Train custom models on your data

Fine-tune ProsodyAI models on your domain-specific data for improved accuracy.

Fine-tuning is available on Pro and Enterprise plans.

Overview

Fine-tuning allows you to:

  • Improve accuracy on your specific audio characteristics
  • Add custom emotional states
  • Optimize for your vertical's terminology
  • Train on labeled conversation outcomes

Creating a Fine-Tune Job

const job = await client.fineTune.create({
  name: 'customer-support-v1',
  description: 'Fine-tuned on Q1 2025 call data',
  baseModel: 'prosody-ssm-base',
  vertical: 'contact_center',
  config: {
    epochs: 10,
    learningRate: 1e-4,
    batchSize: 32,
  },
});

console.log(job.id);      // "ft-abc123"
console.log(job.status);  // "pending"

Uploading Training Data

Data Format

Training data should be in JSONL format (one JSON object per line), with audio references and labels:

{"audio_url": "gs://bucket/audio1.wav", "emotion": "frustrated", "outcome": {"csat": 2, "escalated": true}}
{"audio_url": "gs://bucket/audio2.wav", "emotion": "satisfied", "outcome": {"csat": 5, "escalated": false}}
{"audio_url": "gs://bucket/audio3.wav", "emotion": "confused", "transcript": "I don't understand..."}

Upload Methods

// Method 1: upload a local JSONL file directly
import fs from 'fs';

const trainingData = fs.readFileSync('training.jsonl');

await client.fineTune.uploadData(job.id, {
  data: trainingData,
  format: 'jsonl',
});

// Method 2: import from a Google Cloud Storage bucket
await client.fineTune.uploadData(job.id, {
  source: 'gcs',
  bucket: 'your-bucket',
  path: 'training-data/',
});

// Method 3: stream samples one at a time
const upload = client.fineTune.createUploadStream(job.id);

for await (const sample of dataSource) {
  upload.write(JSON.stringify(sample) + '\n');
}

await upload.end();

Training Data Schema

interface TrainingDataPoint {
  // Audio source (one required)
  audio_url?: string;
  audio_base64?: string;
  
  // Labels (at least one required)
  emotion?: Emotion;
  valence?: number;
  arousal?: number;
  dominance?: number;
  state?: string;
  
  // Optional context
  transcript?: string;
  speaker_id?: string;
  
  // Session-level outcomes
  outcome?: {
    csat?: number;
    escalated?: boolean;
    resolved?: boolean;
    deal_closed?: boolean;
    // Custom outcome fields
    [key: string]: unknown;
  };
  
  // Metadata
  metadata?: Record<string, unknown>;
}
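A quick client-side check of the schema's stated constraints can catch bad rows before upload. This is a sketch, not an SDK API: it uses a trimmed version of `TrainingDataPoint` and reads "one required" as exactly one audio source.

```typescript
// Client-side validation sketch (not an SDK API).
interface Row {
  audio_url?: string;
  audio_base64?: string;
  emotion?: string;
  valence?: number;
  arousal?: number;
  dominance?: number;
  state?: string;
}

function validateRow(row: Row): string[] {
  const errors: string[] = [];

  // Audio source: exactly one of audio_url or audio_base64
  const sources = [row.audio_url, row.audio_base64].filter((v) => v !== undefined);
  if (sources.length !== 1) {
    errors.push('exactly one of audio_url or audio_base64 is required');
  }

  // Labels: at least one must be present
  const labels = [row.emotion, row.valence, row.arousal, row.dominance, row.state];
  if (!labels.some((v) => v !== undefined)) {
    errors.push('at least one label is required');
  }

  return errors;
}
```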

Starting Training

// Start training
await client.fineTune.start(job.id);

// Monitor progress
const status = await client.fineTune.getStatus(job.id);

console.log(status.status);           // "running"
console.log(status.progress);         // 0.45
console.log(status.currentEpoch);     // 5
console.log(status.metrics.loss);     // 0.234
console.log(status.metrics.accuracy); // 0.89

Monitoring Training

Polling

// Helper used below
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitForCompletion(jobId: string) {
  while (true) {
    const status = await client.fineTune.getStatus(jobId);

    console.log(`Status: ${status.status}, Progress: ${(status.progress * 100).toFixed(0)}%`);

    if (status.status === 'completed') {
      return status;
    }

    if (status.status === 'failed') {
      throw new Error(`Training failed: ${status.error}`);
    }

    await sleep(30_000); // Poll every 30 seconds
  }
}

Webhooks

await client.webhooks.create({
  url: 'https://your-app.com/webhooks/fine-tune',
  events: ['fine_tune.started', 'fine_tune.completed', 'fine_tune.failed'],
});
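On the receiving side, your endpoint branches on the event type. The payload shape below (`type`, `jobId`, `error`) is an assumption for illustration; the actual webhook fields may differ.

```typescript
// Hypothetical webhook payload shape; real fields may differ.
interface FineTuneEvent {
  type: 'fine_tune.started' | 'fine_tune.completed' | 'fine_tune.failed';
  jobId: string;
  error?: string;
}

// Route each subscribed event to a human-readable summary
function describeEvent(event: FineTuneEvent): string {
  switch (event.type) {
    case 'fine_tune.started':
      return `Job ${event.jobId} started training`;
    case 'fine_tune.completed':
      return `Job ${event.jobId} completed`;
    case 'fine_tune.failed':
      return `Job ${event.jobId} failed: ${event.error ?? 'unknown error'}`;
  }
}
```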

Using Fine-Tuned Models

Once training completes, use your model:

// Get the fine-tuned model ID
const job = await client.fineTune.get(jobId);
const modelId = job.outputModel; // "ft-model-abc123"

// Use in analysis
const result = await client.analyze({
  audio: audioBuffer,
  model: modelId,
  vertical: 'contact_center',
});

Setting as Default

// Set as default for your organization
await client.models.setDefault(modelId);

// Now all requests use this model by default
const result = await client.analyze({ audio: audioBuffer });

Training Configuration

interface FineTuneConfig {
  // Training parameters
  epochs?: number;              // Default: 10
  learningRate?: number;        // Default: 1e-4
  batchSize?: number;           // Default: 32
  warmupSteps?: number;         // Default: 100
  
  // Data handling
  validationSplit?: number;     // Default: 0.1
  shuffleSeed?: number;         // For reproducibility
  
  // Regularization
  dropout?: number;             // Default: 0.1
  weightDecay?: number;         // Default: 0.01
  
  // Early stopping
  earlyStoppingPatience?: number;  // Default: 3
  earlyStoppingMetric?: string;    // Default: "val_loss"
  
  // Advanced
  freezeLayers?: string[];      // Freeze specific layers
  customLossWeights?: {
    emotion?: number;
    vad?: number;
    outcome?: number;
  };
}
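The defaults above are applied when a field is omitted. A client-side sketch of the same merge (illustrative only; the `withDefaults` helper is not part of the SDK):

```typescript
// Illustrative merge of the documented defaults with a partial config.
interface Config {
  epochs: number;
  learningRate: number;
  batchSize: number;
  warmupSteps: number;
  validationSplit: number;
  dropout: number;
  weightDecay: number;
  earlyStoppingPatience: number;
  earlyStoppingMetric: string;
}

const DEFAULTS: Config = {
  epochs: 10,
  learningRate: 1e-4,
  batchSize: 32,
  warmupSteps: 100,
  validationSplit: 0.1,
  dropout: 0.1,
  weightDecay: 0.01,
  earlyStoppingPatience: 3,
  earlyStoppingMetric: 'val_loss',
};

function withDefaults(overrides: Partial<Config>): Config {
  // The later spread wins, so provided fields override the defaults
  return { ...DEFAULTS, ...overrides };
}

const config = withDefaults({ epochs: 20, learningRate: 5e-5 });
```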

Active Learning Integration

Enable continuous improvement from production:

// Enable feedback collection
await client.fineTune.enableFeedbackLoop(modelId, {
  // Automatically include low-confidence predictions
  confidenceThreshold: 0.4,
  
  // Include samples where outcome contradicts prediction
  includeContradictions: true,
  
  // Sample rate for high-confidence predictions
  sampleRate: 0.01,
});

// Submit manual corrections
await client.feedback.submit({
  predictionId: 'pred-123',
  correctedEmotion: 'frustrated',
  correctedState: 'angry',
  notes: 'Customer was clearly angry, not just frustrated',
});
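The feedback loop's selection rules can be pictured roughly as follows. This is a sketch of the documented options, not the service's actual implementation; the function and parameter names are assumptions.

```typescript
// Sketch of feedback-loop sample selection (assumed logic, not the SDK's).
interface LoopOptions {
  confidenceThreshold: number;
  includeContradictions: boolean;
  sampleRate: number;
}

function shouldCollect(
  confidence: number,
  contradictsOutcome: boolean,
  opts: LoopOptions,
  rand: () => number = Math.random,
): boolean {
  if (confidence < opts.confidenceThreshold) return true; // low-confidence prediction
  if (opts.includeContradictions && contradictsOutcome) return true; // outcome disagrees
  return rand() < opts.sampleRate; // random sample of the rest
}
```

Injecting `rand` keeps the sampling branch deterministic in tests while defaulting to `Math.random` in production.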

Best Practices

Data Quality Guidelines:

  • Aim for at least 1,000 samples (500 is the bare minimum for basic fine-tuning)
  • Balance classes to avoid bias
  • Include edge cases and difficult examples
  • Verify label quality before training

Sample Size Recommendations

Use Case             Minimum Samples   Recommended
Basic fine-tuning    500               2,000+
Custom states        100 per state     500+ per state
Outcome prediction   1,000 sessions    5,000+ sessions

Evaluation

// Run evaluation on held-out data
const evaluation = await client.fineTune.evaluate(modelId, {
  testData: 'gs://bucket/test-data.jsonl',
});

console.log(evaluation.accuracy);           // 0.92
console.log(evaluation.f1Score);            // 0.89
console.log(evaluation.confusionMatrix);    // [[...], [...]]
console.log(evaluation.perClassMetrics);    // { happy: {...}, sad: {...} }
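If you want to recompute per-class metrics yourself from the returned confusion matrix, the arithmetic looks like this. It assumes rows are true classes and columns are predicted classes; verify the orientation against your own results before relying on it.

```typescript
// Per-class precision/recall/F1 from a confusion matrix.
// Assumes rows = true class, columns = predicted class.
function perClassMetrics(matrix: number[][]): { precision: number; recall: number; f1: number }[] {
  return matrix.map((row, i) => {
    const tp = matrix[i][i];
    const fn = row.reduce((sum, v) => sum + v, 0) - tp;       // missed positives
    const fp = matrix.reduce((sum, r) => sum + r[i], 0) - tp; // false alarms
    const precision = tp + fp > 0 ? tp / (tp + fp) : 0;
    const recall = tp + fn > 0 ? tp / (tp + fn) : 0;
    const f1 = precision + recall > 0 ? (2 * precision * recall) / (precision + recall) : 0;
    return { precision, recall, f1 };
  });
}

const metrics = perClassMetrics([
  [8, 2], // true class 0: 8 correct, 2 predicted as class 1
  [1, 9], // true class 1: 1 predicted as class 0, 9 correct
]);
```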