TTS API Server Implementation #141

arkohut · 2025-03-09T15:43:36Z

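Main features

This PR implements a FastAPI-based text-to-speech service that provides voice management, speech synthesis, and voice cloning. It exposes three main groups of endpoints:

1. Voice management endpoints

GET /voices: list the registered voices
POST /register_voice: register a new voice with the system (requires reference audio and its text)

2. Text-to-speech endpoint

POST /tts: synthesize speech with a registered voice
- Parameters: speaker (name of the voice to use), tts_text (text to synthesize)

3. Instant cloning endpoint

POST /clone: one-off voice-cloning synthesis
- Parameters: prompt_wav (reference audio), prompt_text (reference text), tts_text (target text)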
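
To make the difference between the registered-voice path and the one-off cloning path concrete, here is a minimal sketch in plain Python. It is not the PR's implementation: `Voice`, `VoiceRegistry`, `prompt_for_tts`, and `prompt_for_clone` are hypothetical names, and the in-memory dictionary is an assumption, since the description does not say how registered voices are stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    """Reference audio plus its transcript (hypothetical structure)."""
    prompt_wav: bytes
    prompt_text: str


class VoiceRegistry:
    """Hypothetical in-memory store; the PR's real storage is not shown."""

    def __init__(self) -> None:
        self._voices: dict[str, Voice] = {}

    def register(self, speaker_name: str, prompt_wav: bytes, prompt_text: str) -> None:
        # Role of POST /register_voice: keep the prompt for later reuse.
        self._voices[speaker_name] = Voice(prompt_wav, prompt_text)

    def names(self) -> list[str]:
        # Role of GET /voices: report the registered voice names.
        return sorted(self._voices)

    def get(self, speaker: str) -> Voice:
        # Lookup that a /tts handler would need before synthesis.
        try:
            return self._voices[speaker]
        except KeyError:
            raise KeyError(f"unknown speaker: {speaker}") from None


def prompt_for_tts(registry: VoiceRegistry, speaker: str) -> Voice:
    # /tts: the caller sends only a name; the prompt comes from the registry.
    return registry.get(speaker)


def prompt_for_clone(prompt_wav: bytes, prompt_text: str) -> Voice:
    # /clone: the caller sends the prompt with the request; nothing is stored.
    return Voice(prompt_wav, prompt_text)
```

In this sketch, `/tts` fails for a name that was never registered, while `/clone` never touches the registry, which matches the "one-off" wording above.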

Usage examples

# List the available voices
curl -X GET http://localhost:8000/voices

# Register a new voice
curl -X POST -F "speaker_name=my_voice" -F "prompt_text=Hello world" -F "[email protected]" http://localhost:8000/register_voice

# Synthesize with a registered voice
curl -X POST -F "tts_text=欢迎使用语音合成服务" -F "speaker=my_voice" http://localhost:8000/tts --output output.wav

# Instant cloning synthesis
curl -X POST -F "tts_text=这是即时克隆的语音" -F "prompt_text=我是参考文本" -F "[email protected]" -F "speaker_name=temporary" http://localhost:8000/clone --output clone.wav
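
For callers who prefer Python over curl, the same multipart form requests can be built with the standard library alone. This is a sketch under stated assumptions: the helper names (`encode_multipart`, `build_request`, `tts_request`, `clone_request`) and the placeholder filename `prompt.wav` are hypothetical, and the form fields follow the parameter lists in the PR description (the curl example for `/clone` also sends `speaker_name`, which is omitted here).

```python
import urllib.request
import uuid

BASE_URL = "http://localhost:8000"  # same host and port as the curl examples


def encode_multipart(fields, files=None):
    """Encode text fields and {name: (filename, bytes)} files as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode("utf-8"),
        ]
    for name, (filename, data) in (files or {}).items():
        parts += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode(),
            b"Content-Type: application/octet-stream",
            b"",
            data,
        ]
    parts += [f"--{boundary}--".encode(), b""]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(path, fields, files=None):
    """Build (but do not send) a POST request for one of the endpoints."""
    body, content_type = encode_multipart(fields, files)
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


def tts_request(speaker, tts_text):
    # Equivalent of the curl call to /tts with a registered voice.
    return build_request("/tts", {"speaker": speaker, "tts_text": tts_text})


def clone_request(prompt_wav, prompt_text, tts_text):
    # Equivalent of the curl call to /clone; "prompt.wav" is a placeholder name.
    return build_request(
        "/clone",
        {"prompt_text": prompt_text, "tts_text": tts_text},
        {"prompt_wav": ("prompt.wav", prompt_wav)},
    )
```

Passing one of these requests to `urllib.request.urlopen` and writing the response bytes to a file would correspond to curl's `--output`, assuming the server returns the audio in the response body as the curl examples suggest.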


arkohut · 2025-04-25T16:08:05Z

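I have already used this PR in a real project, and the results are excellent. I used this API to produce a video with Chinese voice-over; you can see the result here:
https://www.bilibili.com/video/BV1vkdnYcEPh

I hope this PR can be merged so that more developers can use this feature. If there are any questions or suggestions, I am happy to discuss them and make improvements.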

arkohut added 2 commits March 9, 2025 23:26

feat: Add FastAPI server for Step-Audio TTS with voice registration and cloning

4482493

rename file server.py -> tts_api_server.py

5480302
