@ypatz ypatz commented Jan 12, 2026

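1. The leftover frames should be audio_sample[-m:], or torch.empty(0) when nothing is left over.
2. Also, regarding the init_cache code in scama: the data returned by audio_sample_list = load_audio_text_image_video(...) should be a tensor on the CPU. If cache["prev_samples"] is initialized on a different device, torch.cat((cache["prev_samples"], audio_sample_list[0])) raises an error.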
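
For reviewers, here is a minimal sketch of the tail handling in point 1. It uses plain Python lists so it runs without torch; `split_stream`, `buffer`, and `stride` are hypothetical names for illustration and are not taken from the repository. It only shows why the tail is `[-m:]` rather than `[:-m]`, and why the empty case needs its own branch.

```python
def split_stream(buffer, stride):
    """Split `buffer` into full chunks of `stride` samples plus a leftover tail.

    Hypothetical helper: the real models operate on torch tensors and keep
    the tail in cache["prev_samples"].
    """
    n = len(buffer) // stride  # number of complete chunks
    m = len(buffer) % stride   # samples left over after the last full chunk
    chunks = [buffer[i * stride:(i + 1) * stride] for i in range(n)]
    # buffer[-0:] is the same as buffer[0:] and would return the whole buffer,
    # so an empty remainder needs an explicit empty value
    # (torch.empty(0) in the tensor version).
    tail = buffer[-m:] if m > 0 else []
    return chunks, tail


samples = list(range(10))
chunks, tail = split_stream(samples, 4)
print(chunks)         # full chunks handed to the model
print(tail)           # leftover samples carried into the next call
print(samples[:-2])   # [:-m] keeps the head of the buffer, not the tail
```

With 10 samples and a stride of 4, the two samples that do not fill a chunk are the last two, which is what `[-m:]` selects.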

@gemini-code-assist
Contributor

Summary of Changes

Hello @ypatz, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

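This pull request fixes a key bug in how the streaming speech recognition models handle prev_samples. By correcting the slicing logic, it ensures that the tail of each audio chunk is carried over and used when processing a continuous audio stream, which improves the accuracy and continuity of streaming recognition. It also makes a small improvement to how prev_samples is initialized.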

Highlights

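  • Fix prev_samples slicing: Corrects how prev_samples is sliced in the streaming recognition models so that the tail samples of each audio chunk are retained for the next processing cycle.
  • Simplify prev_samples initialization: In the scama model, removes the explicit device argument when prev_samples is initialized, which simplifies initialization.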

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

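This pull request fixes a bug in prev_samples handling in the streaming recognition models (fsmn_vad_streaming, paraformer_streaming, and scama). Previously, prev_samples was sliced incorrectly, which could cause data loss or broken context when later audio chunks were processed. With the updated slicing logic, the correct tail of the current audio chunk is kept and used as prev_samples on the next call. In addition, the init_cache function in scama/model.py has been updated so that the device placement of prev_samples matches the other streaming models, avoiding a potential device mismatch error. These changes improve the correctness and robustness of the streaming inference logic.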
