Making an AI Voice Cry
Most TTS gives you a voice. Doubao claims to give you emotions. I ran a systematic experiment (30 audio samples across stock voices, cloned voices, and 3 emotion control methods) to find out what actually works. Listen for yourself.
Complete integration guide for Volcengine Seed ASR bigmodel: the submit-then-poll REST API for Chinese speech-to-text. Covers console setup, the two-step async flow, a fallback chain pattern, cost tracking, and 8 gotchas about v2 vs v3 API differences that will save you a day of debugging.
Complete integration guide for Volcengine Doubao Seed-ICL 2.0: voice cloning, natural language emotion control via context_texts, per-sentence multi-call synthesis, NDJSON response parsing, and 7 gotchas that will save you hours of debugging.
Finding a voice for an AI companion means solving two problems: making it sound human, and making it sound like it feels. Mio v2 splits these between a custom LLM (the screenwriter) and Hume EVI (the actor), a division of labor that might be the future of emotional AI voice.
© Xingfan Xia 2024 - 2026 · CC BY-NC 4.0