feat(audio): add AsrConfig support to enhance speech transcription #138
qyou commented Dec 5, 2025
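- Added several configuration options to the AsrConfig class, including enable_ddc, enable_itn, and enable_punc, all defaulting to true
- Set the default value of the userLanguage field to "common"
- Set roomMode in RoomConfig to default to an empty string and added the Builder default annotation
- Updated the TranscriptionsUpdateEventData model to support the asrConfig parameter
- Extended the WebSocket client test cases to verify event handling with and without asrConfig
- Updated the example code to demonstrate passing an ASR configuration for real-time transcription
- Fixed the missing serialization version UID in the test utility class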
Walkthrough
Added ASR (Automatic Speech Recognition) configuration support to WebSocket audio transcriptions by introducing new fields to `AsrConfig` with builder defaults, adding `asrConfig` to `TranscriptionsUpdateEventData`, and updating test coverage. Also applied `@Builder.Default` annotations to existing model fields for consistency.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10–15 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 0
🧹 Nitpick comments (1)
api/src/main/java/com/coze/openapi/client/websocket/event/model/AsrConfig.java (1)
25-35: Consider using primitive `boolean` instead of the `Boolean` wrapper.
All three enable flags (`enableDdc`, `enableItn`, `enablePunc`) use the `Boolean` wrapper type with defaults of `true`. If tri-state logic (true/false/null) is not required, consider using primitive `boolean` types instead. This would:
- Prevent potential `NullPointerException` issues
- Make the API contract clearer (always present, never null)
- Align with typical boolean flag patterns
However, if the API specification requires nullable booleans for backward compatibility or future extensibility, the current implementation is fine.
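The trade-off in this nitpick can be sketched in plain Java. The classes below (`WrapperFlags`, `PrimitiveFlags`, `BooleanFlagDemo`) are hypothetical stand-ins and not the SDK's `AsrConfig`. They only illustrate that reading a `Boolean` field that is `null` through auto-unboxing throws a `NullPointerException`, whereas a primitive `boolean` always holds a value.

```java
// Hypothetical demo classes; none of these names exist in the SDK.
final class BooleanFlagDemo {
    // Returns true if auto-unboxing the wrapper field throws a NullPointerException.
    static boolean unboxingThrows(WrapperFlags flags) {
        try {
            boolean value = flags.enablePunc; // auto-unboxing happens here
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Null-safe read that treats an unset flag as enabled (assumed default of true).
    static boolean isPuncEnabled(WrapperFlags flags) {
        return flags.enablePunc == null || flags.enablePunc;
    }

    public static void main(String[] args) {
        WrapperFlags unset = new WrapperFlags();
        System.out.println("unboxing throws: " + unboxingThrows(unset));
        System.out.println("null-safe read: " + isPuncEnabled(unset));
        System.out.println("primitive read: " + new PrimitiveFlags().enablePunc);
    }
}

final class WrapperFlags {
    Boolean enablePunc; // wrapper type: stays null until something assigns it
}

final class PrimitiveFlags {
    boolean enablePunc = true; // primitive type: can never be null
}
```

If callers need to distinguish "not specified" from `false`, the wrapper type with a null-safe accessor such as `isPuncEnabled` keeps that third state; otherwise the primitive removes the null case entirely.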
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- api/src/main/java/com/coze/openapi/client/audio/rooms/model/RoomConfig.java (1 hunks)
- api/src/main/java/com/coze/openapi/client/websocket/event/model/AsrConfig.java (1 hunks)
- api/src/main/java/com/coze/openapi/client/websocket/event/model/TranscriptionsUpdateEventData.java (1 hunks)
- api/src/test/java/com/coze/openapi/service/service/websocket/audio/transcriptions/WebsocketAudioTranscriptionsClientTest.java (4 hunks)
- api/src/test/java/com/coze/openapi/utils/Utils.java (1 hunks)
- example/src/main/java/example/websocket/audio/transcriptions/WebsocketTranscriptionsExample.java (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
api/src/test/java/com/coze/openapi/service/service/websocket/audio/transcriptions/WebsocketAudioTranscriptionsClientTest.java (1)
- api/src/main/java/com/coze/openapi/client/websocket/event/EventType.java (1): `EventType` (3-94)
api/src/main/java/com/coze/openapi/client/websocket/event/model/AsrConfig.java (1)
- api/src/main/java/com/coze/openapi/service/service/CozeAPI.java (1): `Builder` (150-310)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: test (Java 17 on Windows)
- GitHub Check: test (Java 17 on macOS)
- GitHub Check: test (Java 17 on Ubuntu)
- GitHub Check: test (Java 8 on macOS)
- GitHub Check: test (Java 11 on Windows)
🔇 Additional comments (11)
api/src/test/java/com/coze/openapi/utils/Utils.java (1)
13-14: LGTM!
Adding `serialVersionUID` to the anonymous `HashMap` subclass is a good practice to suppress serialization warnings. The value `1L` is appropriate for internal test utilities where serialization version control is not critical.
api/src/main/java/com/coze/openapi/client/audio/rooms/model/RoomConfig.java (1)
23-24: LGTM!
Adding `@Builder.Default` ensures the builder pattern respects the default value of `""` for `roomMode`. Without this annotation, the builder would set `null` when the field is not explicitly provided, which would break the intended default behavior.
api/src/main/java/com/coze/openapi/client/websocket/event/model/AsrConfig.java (1)
22-23: LGTM!
Adding `@Builder.Default` with `"common"` as the default value for `userLanguage` ensures consistent builder behavior and provides a sensible language-agnostic default.
api/src/main/java/com/coze/openapi/client/websocket/event/model/TranscriptionsUpdateEventData.java (2)
16-17: LGTM!
Adding the optional `asrConfig` field with `@JsonProperty("asr_config")` properly extends the event data model to support ASR configuration while maintaining backward compatibility (field can be null).
19-21: LGTM!
The convenience constructor provides a simpler API for cases where `asrConfig` is not needed, supporting backward compatibility. It complements the all-args constructor generated by `@AllArgsConstructor` without conflict.
example/src/main/java/example/websocket/audio/transcriptions/WebsocketTranscriptionsExample.java (2)
123-128: LGTM!
The `AsrConfig` construction properly demonstrates the builder pattern and shows how to override defaults. The enable flags (`enableDdc`, `enableItn`, `enablePunc`) will automatically use their default `true` values.
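The builder-default behavior described in these comments can be illustrated with a hand-written builder. This is a minimal sketch, not Lombok's generated code and not the SDK's `AsrConfig`: `AsrConfigSketch` and `BuilderDefaultDemo` are hypothetical names, and `"en"` is only a placeholder value. The point is that defaults stored in the builder survive `build()` unless a setter overrides them, which is the effect `@Builder.Default` is used for in the PR.

```java
// Hypothetical sketch of builder defaults; not the SDK's AsrConfig or Lombok output.
final class BuilderDefaultDemo {
    public static void main(String[] args) {
        AsrConfigSketch defaults = AsrConfigSketch.builder().build();
        AsrConfigSketch custom = AsrConfigSketch.builder().userLanguage("en").build();
        System.out.println(defaults.userLanguage + " " + defaults.enablePunc);
        System.out.println(custom.userLanguage + " " + custom.enablePunc);
    }
}

final class AsrConfigSketch {
    final String userLanguage;
    final Boolean enableDdc;
    final Boolean enableItn;
    final Boolean enablePunc;

    private AsrConfigSketch(Builder b) {
        this.userLanguage = b.userLanguage;
        this.enableDdc = b.enableDdc;
        this.enableItn = b.enableItn;
        this.enablePunc = b.enablePunc;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        // Defaults live in the builder, so build() returns them unless overridden.
        private String userLanguage = "common";
        private Boolean enableDdc = true;
        private Boolean enableItn = true;
        private Boolean enablePunc = true;

        Builder userLanguage(String value) { this.userLanguage = value; return this; }
        Builder enableDdc(Boolean value) { this.enableDdc = value; return this; }
        Builder enableItn(Boolean value) { this.enableItn = value; return this; }
        Builder enablePunc(Boolean value) { this.enablePunc = value; return this; }

        AsrConfigSketch build() {
            return new AsrConfigSketch(this);
        }
    }
}
```

With Lombok, a plain field initializer is ignored by the generated builder unless the field carries `@Builder.Default`, which is why the annotation matters for `roomMode` and `userLanguage`.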
130-130: LGTM!
The `transcriptionsUpdate` call correctly demonstrates passing both `inputAudio` and `asrConfig` using the all-args constructor generated by Lombok's `@AllArgsConstructor`.
api/src/test/java/com/coze/openapi/service/service/websocket/audio/transcriptions/WebsocketAudioTranscriptionsClientTest.java (4)
123-128: LGTM!
Adding the null check for `asrConfig` ensures backward compatibility is tested, verifying that events without `asr_config` in the JSON payload correctly deserialize with `asrConfig` as null.
130-187: LGTM!
The new test comprehensively validates `asrConfig` deserialization, covering all fields including the list of hot words, string context, language setting, and boolean enable flags. This provides excellent test coverage for the new functionality.
355-360: LGTM!
The test correctly demonstrates using the builder pattern to include `asrConfig` in the transcriptions update, validating the happy path with ASR configuration.
368-383: LGTM!
The new test validates the backward-compatible path where `asrConfig` is not provided, using the single-argument constructor. This ensures both usage patterns are supported and tested.
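The two usage patterns mentioned above can be sketched without Lombok or Jackson. This is an illustration under stated assumptions: `UpdateDataSketch` and `ConstructorDemo` are hypothetical, and both fields use `String` as a stand-in for the real input-audio and `AsrConfig` types. The single-argument constructor delegates to the two-argument one and leaves `asrConfig` as `null`, so older call sites keep compiling.

```java
// Hypothetical stand-in for TranscriptionsUpdateEventData; types simplified to String.
final class ConstructorDemo {
    public static void main(String[] args) {
        UpdateDataSketch legacy = new UpdateDataSketch("audio-settings");
        UpdateDataSketch withAsr = new UpdateDataSketch("audio-settings", "asr-settings");
        System.out.println(legacy.hasAsrConfig());
        System.out.println(withAsr.hasAsrConfig());
    }
}

final class UpdateDataSketch {
    final String inputAudio; // stand-in for the real input audio type
    final String asrConfig;  // stand-in for AsrConfig; null means "not provided"

    // Backward-compatible path: callers that do not know about asrConfig.
    UpdateDataSketch(String inputAudio) {
        this(inputAudio, null);
    }

    // Full path: equivalent in shape to an all-args constructor.
    UpdateDataSketch(String inputAudio, String asrConfig) {
        this.inputAudio = inputAudio;
        this.asrConfig = asrConfig;
    }

    boolean hasAsrConfig() {
        return asrConfig != null;
    }
}
```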