语音转换工具类
License
—
Deps
3
Install Size
—
Vulns
✓ 0
Published
Feb 26, 2026
$ dotnet add package TJC.Cyclops.Speech

Cyclops.Speech是Cyclops.Framework框架中的语音处理组件,提供语音识别、语音合成、语音分析等功能。该组件集成了多种语音服务API,并提供统一的接口,支持本地处理和云服务两种模式,适用于语音交互、内容转换和音频分析等应用场景。
Install-Package Cyclops.Speech
在应用程序启动时进行配置:
// Program.cs (or Startup.cs): register Cyclops.Speech during application startup.
using Cyclops.Speech;
var builder = WebApplication.CreateBuilder(args);
// Register and configure the Cyclops.Speech services.
builder.Services.AddCyclopsSpeech(opts => {
// Processing mode: Cloud or Local.
opts.DefaultMode = SpeechMode.Cloud;
// Directory for intermediate audio files.
opts.TempFilePath = Path.Combine(AppContext.BaseDirectory, "Temp");
// Per-provider cloud credentials, read from app configuration.
opts.CloudServices = new Dictionary<SpeechProvider, CloudSpeechConfig> {
[SpeechProvider.Azure] = new CloudSpeechConfig {
ApiKey = builder.Configuration["Speech:Azure:ApiKey"],
Region = builder.Configuration["Speech:Azure:Region"]
},
[SpeechProvider.Baidu] = new CloudSpeechConfig {
ApiKey = builder.Configuration["Speech:Baidu:ApiKey"],
SecretKey = builder.Configuration["Speech:Baidu:SecretKey"]
}
};
// On-device model path, used when the Local mode is selected.
opts.LocalModelPath = builder.Configuration["Speech:Local:ModelPath"];
// Default audio format and sample rate.
opts.AudioFormat = AudioFormat.Wav;
opts.SampleRate = 16000;
// Emit diagnostic logs from the component.
opts.EnableLogging = true;
});
// ...
using Cyclops.Speech.Models;
using Cyclops.Speech.Services;
using Microsoft.Extensions.DependencyInjection;
// Resolve the speech-recognition service from the DI container.
var recognizer = serviceProvider.GetRequiredService<ISpeechRecognitionService>();
// One-shot recognition of a complete audio file.
var fileResult = await recognizer.RecognizeFromFileAsync(
    filePath: "sample.wav",
    options: new SpeechRecognitionOptions {
        Language = "zh-CN",
        EnablePunctuation = true,
        Provider = SpeechProvider.Azure
    }
);
Console.WriteLine($"识别结果: {fileResult.Text}");
Console.WriteLine($"置信度: {fileResult.Confidence}");
// Streaming recognition with partial/final result callbacks.
await using (var audioStream = new FileStream("sample.wav", FileMode.Open))
{
    await recognizer.RecognizeFromStreamAsync(
        audioStream: audioStream,
        options: new SpeechRecognitionOptions {
            Language = "zh-CN",
            EnableContinuousRecognition = false
        },
        onPartialResult: partial => Console.WriteLine($"部分结果: {partial.Text}"),
        onFinalResult: final => Console.WriteLine($"最终结果: {final.Text}")
    );
}
// Recognition with automatic language detection and speaker diarization.
var diarized = await recognizer.RecognizeFromFileAsync(
    filePath: "multilingual.wav",
    options: new SpeechRecognitionOptions {
        AutoDetectLanguage = true,
        EnableSpeakerDiarization = true
    }
);
Console.WriteLine($"检测到的语言: {diarized.DetectedLanguage}");
foreach (var segment in diarized.Segments)
{
    Console.WriteLine($"说话人 {segment.SpeakerId}: {segment.Text}");
}
using Cyclops.Speech.Models;
using Cyclops.Speech.Services;
using Microsoft.Extensions.DependencyInjection;
// Resolve the text-to-speech service from the DI container.
var synthesizer = serviceProvider.GetRequiredService<ISpeechSynthesisService>();
// Plain text-to-speech with an explicit voice and neutral speed/pitch.
var ttsResult = await synthesizer.SynthesizeAsync(
    text: "您好,这是一个语音合成示例。Cyclops.Speech组件可以将文本转换为自然流畅的语音。",
    options: new SpeechSynthesisOptions {
        VoiceId = "zh-CN-XiaoxiaoNeural", // Azure neural voice id
        Speed = 1.0,
        Pitch = 1.0
    }
);
// Persist the synthesized audio to disk.
await using (var output = new FileStream("synthesized_audio.wav", FileMode.Create))
{
    await ttsResult.AudioStream.CopyToAsync(output);
}
Console.WriteLine($"语音合成完成,音频长度: {ttsResult.AudioDuration.TotalSeconds} 秒");
// SSML-driven synthesis for fine-grained prosody control (rate, pauses, emphasis).
var ssmlText = @"
<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='zh-CN'>
<voice name='zh-CN-YunxiNeural'>
<prosody rate='0.9'>这是一段语速较慢的文本。</prosody>
<break time='500ms'/>
<prosody rate='1.2'>这是一段语速较快的文本。</prosody>
<emphasis level='strong'>这是需要强调的内容。</emphasis>
</voice>
</speak>";
var ssmlAudio = await synthesizer.SynthesizeSsmlAsync(
    ssml: ssmlText,
    provider: SpeechProvider.Azure
);
// Play the synthesized audio directly through the service.
await synthesizer.PlaySynthesizedAudioAsync(ssmlAudio.AudioStream);
// Enumerate the voices available for a given locale and provider.
var voices = await synthesizer.GetAvailableVoicesAsync(
    locale: "zh-CN",
    provider: SpeechProvider.Azure
);
Console.WriteLine("可用的语音列表:");
foreach (var voice in voices)
{
    Console.WriteLine($"- {voice.Name} ({voice.Gender}) - {voice.Locale}");
}
using Cyclops.Speech.Models;
using Cyclops.Speech.Services;
using Microsoft.Extensions.DependencyInjection;
// Resolve the speech-analysis service from the DI container.
var analyzer = serviceProvider.GetRequiredService<ISpeechAnalysisService>();
// Run a full offline analysis pass over an audio file.
var analysis = await analyzer.AnalyzeAudioAsync(
    filePath: "sample.wav",
    options: new SpeechAnalysisOptions {
        AnalyzeEmotion = true,
        AnalyzeQuality = true,
        DetectSpeakers = true
    }
);
// Basic stream facts.
Console.WriteLine($"音频时长: {analysis.Duration.TotalSeconds} 秒");
Console.WriteLine($"采样率: {analysis.SampleRate} Hz");
Console.WriteLine($"声道数: {analysis.Channels}");
Console.WriteLine($"检测到的说话人数: {analysis.SpeakerCount}");
// Emotion analysis is only populated when AnalyzeEmotion was requested.
if (analysis.EmotionAnalysis != null)
{
    Console.WriteLine("情绪分析结果:");
    foreach (var emotion in analysis.EmotionAnalysis.DominantEmotions)
    {
        Console.WriteLine($"- {emotion.Type}: {emotion.Score:P2}");
    }
}
// Signal-quality metrics, likewise optional.
if (analysis.QualityAnalysis != null)
{
    Console.WriteLine($"语音质量评分: {analysis.QualityAnalysis.OverallScore:P2}");
    Console.WriteLine($"信噪比: {analysis.QualityAnalysis.SignalToNoiseRatio} dB");
    Console.WriteLine($"噪声水平: {analysis.QualityAnalysis.NoiseLevel}");
}
// Real-time analysis for 10 seconds, with per-segment and voice-activity callbacks.
await analyzer.AnalyzeRealtimeAsync(
    onSegmentAnalyzed: segment =>
    {
        Console.WriteLine($"语音段时长: {segment.Duration.TotalMilliseconds} ms");
        Console.WriteLine($"语音段能量: {segment.Energy}");
        if (segment.Emotion != null)
        {
            Console.WriteLine($"检测到的情绪: {segment.Emotion.Type} ({segment.Emotion.Score:P2})");
        }
    },
    onSpeechDetected: () => Console.WriteLine("检测到语音开始"),
    onSpeechEnded: () => Console.WriteLine("检测到语音结束"),
    duration: TimeSpan.FromSeconds(10)
);
using Cyclops.Speech.Services;
using Microsoft.Extensions.DependencyInjection;
// Resolve the low-level audio-processing service from the DI container.
var audioProcessor = serviceProvider.GetRequiredService<IAudioProcessingService>();
// Transcode MP3 to 16 kHz mono 16-bit WAV.
await audioProcessor.ConvertAudioFormatAsync(
    inputFilePath: "input.mp3",
    outputFilePath: "output.wav",
    options: new AudioConversionOptions {
        SampleRate = 16000,
        Channels = 1,
        BitsPerSample = 16
    }
);
// Cut a 30-second clip starting at the 10-second mark.
await audioProcessor.ClipAudioAsync(
    inputFilePath: "long_audio.wav",
    outputFilePath: "clipped_audio.wav",
    startTime: TimeSpan.FromSeconds(10),
    duration: TimeSpan.FromSeconds(30)
);
// Apply medium-strength noise reduction.
await audioProcessor.RemoveNoiseAsync(
    inputFilePath: "noisy_audio.wav",
    outputFilePath: "denoised_audio.wav",
    noiseReductionLevel: NoiseReductionLevel.Medium
);
// Concatenate several files into one.
await audioProcessor.MergeAudioFilesAsync(
    inputFilePaths: new List<string> { "audio1.wav", "audio2.wav", "audio3.wav" },
    outputFilePath: "merged_audio.wav"
);
// Adjust playback parameters: volume +50%, speed +20%, pitch +10%.
await audioProcessor.AdjustAudioAsync(
    inputFilePath: "original.wav",
    outputFilePath: "adjusted.wav",
    volume: 1.5f,
    speed: 1.2f,
    pitch: 1.1f
);
// Extract numeric features for downstream analysis.
var features = await audioProcessor.ExtractAudioFeaturesAsync("sample.wav");
Console.WriteLine("音频特征:");
Console.WriteLine($"平均能量: {features.AverageEnergy}");
Console.WriteLine($"频谱中心: {features.SpectralCentroid}");
Console.WriteLine($"过零率: {features.ZeroCrossingRate}");
Console.WriteLine($"带宽: {features.Bandwidth}");
## 贡献者
- yswenli
## 许可证
保留所有权利