fast vc service

VC/VC++ 2025-07-27

A real-time voice conversion service based on Seed-VC, providing WebSocket-based voice conversion with support for the PCM and Opus audio formats.

Features are continuously being updated. Stay tuned for our latest developments...

Fast-VC-Service aims to provide a high-performance, real-time streaming voice conversion cloud service designed for production environments. Built on the Seed-VC model, it supports the WebSocket protocol and the PCM and Opus audio encodings.

Core Features | Quick Start | Performance | Version Updates | TODO | Acknowledgements

Core Features

  • Real-time Conversion: Low-latency streaming voice conversion based on Seed-VC
  • WebSocket API: Support for PCM and OPUS audio formats
  • Performance Monitoring: Comprehensive real-time performance metrics and statistics
  • High Concurrency: Multi-Worker concurrent processing, supporting production environments
  • Easy Deployment: Simple configuration, one-click startup

Quick Start

One-click Installation

# Clone project
git clone --recursive https://github.com/Leroll/fast-vc-service.git
cd fast-vc-service

# Configure environment
cp .env.example .env

# Install dependencies (Poetry recommended)
poetry install

# Start service
fast-vc serve

Quick Testing

# WebSocket real-time voice conversion
python examples/websocket/ws_client.py \
    --source-wav-path "wavs/sources/low-pitched-male-24k.wav" \
    --encoding PCM

For a detailed installation and usage guide, refer to the Quick Start documentation.
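A streaming client sends audio to the service in short chunks rather than as a whole file. As a rough illustration of that step only, the sketch below splits raw PCM bytes into fixed-duration chunks. It is not the project's client code: the function name `split_pcm_chunks`, the 16-bit mono layout, the 24 kHz sample rate, and the 500 ms chunk length are all assumptions chosen for the example.

```python
from typing import List


def split_pcm_chunks(pcm: bytes, sample_rate: int, chunk_ms: int,
                     sample_width: int = 2, channels: int = 1) -> List[bytes]:
    """Split raw PCM bytes into chunks of chunk_ms milliseconds each.

    Hypothetical helper for illustration; the last chunk may be shorter.
    """
    if sample_rate <= 0 or chunk_ms <= 0:
        raise ValueError("sample_rate and chunk_ms must be positive")
    # One frame holds one sample for every channel.
    frame_bytes = sample_width * channels
    frames_per_chunk = sample_rate * chunk_ms // 1000
    chunk_bytes = frames_per_chunk * frame_bytes
    if chunk_bytes == 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    return [pcm[i:i + chunk_bytes] for i in range(0, len(pcm), chunk_bytes)]


if __name__ == "__main__":
    # 1.2 seconds of silence: 24000 samples/s * 2 bytes * 1.2 s (assumed format).
    audio = bytes(57600)
    for index, chunk in enumerate(split_pcm_chunks(audio, 24000, 500)):
        print(index, len(chunk))
```

Each chunk would then be sent as one WebSocket message; the message framing and any handshake are defined by the service and are not shown here.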

Performance

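GPU    | Concurrency | Workers | Chunk time | First Token Latency | End-to-End Latency | Avg Chunk Latency | Avg RTF | Median RTF | P95 RTF
-------|-------------|---------|------------|---------------------|--------------------|-------------------|---------|------------|--------
4090D  | 1           | 6       | 500        | 136.0               | 143.0              | 105.0             | 0.21    | 0.22       | 0.24
4090D  | 12          | 12      | 500        | 140.1               | 256.6              | 216.6             | 0.44    | 0.45       | 0.51
1080TI | 1           | 6       | 500        | 157.0               | 272.0              | 252.2             | 0.50    | 0.51       | 0.61
1080TI | 3           | 6       | 500        | 154.3               | 261.3              | 304.9             | 0.61    | 0.62       | 0.73
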
  • Time unit: milliseconds (ms)
  • View the detailed test reports:
    • Performance-Report_4090D
    • Performance-Report_1080ti
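
RTF (real-time factor) is conventionally the processing time divided by the duration of the audio processed, so values below 1 mean the service runs faster than real time. The table's figures are consistent with dividing the average chunk latency by the 500 ms chunk time (for example, 105.0 / 500 = 0.21). The sketch below shows how such summary statistics could be computed from per-chunk latencies. It is an illustration, not the project's benchmark code: the function names and the nearest-rank percentile method are assumptions, and the sample latencies are made up.

```python
import math
import statistics
from typing import Dict, List


def rtf(latency_ms: float, chunk_ms: float) -> float:
    """Real-time factor: processing time divided by audio duration."""
    return latency_ms / chunk_ms


def percentile_nearest_rank(values: List[float], pct: float) -> float:
    """Nearest-rank percentile (an assumed method; others interpolate)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_rtf(latencies_ms: List[float], chunk_ms: float) -> Dict[str, float]:
    """Return the average, median, and P95 RTF for a list of chunk latencies."""
    rtfs = [rtf(latency, chunk_ms) for latency in latencies_ms]
    return {
        "avg": statistics.mean(rtfs),
        "median": statistics.median(rtfs),
        "p95": percentile_nearest_rank(rtfs, 95),
    }


if __name__ == "__main__":
    # Made-up per-chunk latencies in milliseconds, for illustration only.
    print(summarize_rtf([100.0, 110.0, 120.0, 105.0], chunk_ms=500))
```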

Version Updates

2025-07-02 - v0.1.3: Added Process and Instance Level Concurrency Monitoring

  • Added PID record to logs for easier instance tracking
  • Added instance concurrency monitoring feature for real-time concurrency viewing
  • Optimized performance analysis interface to reduce impact on real-time performance

2025-06-26 - v0.1.2: Persistent Storage Optimization

  • Optimized session persistent storage module with asynchronous processing
  • Separated time-consuming timeline statistical analysis module to improve response speed
  • Optimized timeline recording mechanism to reduce storage overhead

2025-06-19 - v0.1.1: First Packet Performance Optimization

  • Added performance monitoring API endpoint /tools/performance-report for real-time performance metrics
  • Enhanced timing logs for better performance bottleneck analysis
  • Mitigated delay issue caused by first audio packet model invocation
Historical Versions

2025-06-15 - v0.1.0: Basic Service Framework

Completed the core framework of the real-time voice conversion service based on Seed-VC, including WebSocket streaming inference, performance monitoring, and multi-format audio support.

  • Real-time streaming voice conversion service
  • WebSocket API support for PCM and Opus formats
  • Complete performance monitoring and statistics system
  • Flexible configuration management and environment variable support
  • Multi-Worker concurrent processing capability
  • Concurrent performance testing framework

TODO

  • tag - v0.2 - Improve inference efficiency, reduce RTF - v2025-xx
    • Optimize timeline_lognize, add delay items for same events
    • Add SLOW tags in logs for monitoring receive interval, send interval, and VC-E2E latency
    • Optimize session tool's file naming
    • Add adaptive pitch extraction functionality with corresponding toggle switch
    • Change VAD to use ONNX-GPU to improve inference speed
    • Complete support for seed-vc V2.0 model
    • Explore solutions to reduce model inference latency (e.g., new model architectures, quantization, etc.)
    • Use torchaudio to directly read reference audio to GPU, eliminating transfer steps
    • Fix file_vc issue with the last block
    • Create Docker image, AutoDL image

Acknowledgements

  • Seed-VC - Provides the powerful underlying voice conversion model
  • RVC - Provides the basic streaming voice conversion pipeline

Download Source Code

Clone the project from the command line:

git clone https://github.com/Leroll/fast-vc-service.git