[Docs] Update DSK-V3.1 Docs #6347

Open

Linboyan-trc wants to merge 4 commits into PaddlePaddle:develop from Linboyan-trc:02_DSK_Docs

Linboyan-trc commented Feb 4, 2026

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

Modifications

Add best-practice guidance docs for Deepseek-V3.1.

Usage or Command

Accuracy Tests

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

Linboyan-trc added 4 commits, February 4, 2026 17:04

- add Deepseek-V3.1 Docs
- add Deepseek-V3.1 Docs (bd20540)
- delete annotation (d05ab91)
- Merge remote-tracking branch 'origin/develop' into 02_DSK_Docs (c3e5325)

Linboyan-trc temporarily deployed to Metax_ci with GitHub Actions (Inactive), February 4, 2026 09:33

CLAassistant commented Feb 4, 2026

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

paddle-bot bot commented Feb 4, 2026

Thanks for your contribution!

paddle-bot bot added the contributor label

chang-wenbin reviewed

docs/zh/best_practices/Deepseek-V3.1-Terminus.md

+              #### 2.2.3 Chunked Prefill
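+              **Principle:** Uses a chunking strategy that splits Prefill-stage requests into smaller subtasks and batches them together with Decode requests. This better balances compute-intensive (Prefill) and memory-access-intensive (Decode) operations, improves GPU resource utilization, and reduces the compute and GPU memory needed by a single Prefill, which lowers peak GPU memory usage and helps avoid out-of-memory errors. For details, see [Chunked Prefill](../features/chunked_prefill.md)
+              - **Parameter:** `--enable-chunked-prefill`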
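
As a reading aid for the passage above, here is a minimal Python sketch of the chunking idea: a long prompt is split into fixed-size token ranges so that each scheduling step carries only a bounded slice of prefill work. It is not FastDeploy's scheduler; `split_prefill` and its parameters are hypothetical names, and the chunk size of 384 in the usage line is only an illustrative value.

```python
def split_prefill(prompt_len: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return half-open (start, end) token ranges that cover the prompt.

    Hypothetical helper for illustration only; it is not part of FastDeploy.
    """
    if prompt_len < 0:
        raise ValueError("prompt_len must be non-negative")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Every chunk except possibly the last holds exactly chunk_size tokens.
    return [
        (start, min(start + chunk_size, prompt_len))
        for start in range(0, prompt_len, chunk_size)
    ]


if __name__ == "__main__":
    # A 1000-token prompt with a 384-token budget per step needs three steps.
    print(split_prefill(1000, 384))  # [(0, 384), (384, 768), (768, 1000)]
```

In a real engine each of these ranges would share a batch with in-flight decode requests; the sketch shows only the splitting step.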

Collaborator

chang-wenbin Feb 5, 2026

This flag no longer works. Please verify the documentation content and make sure every parameter is actually usable.

Collaborator

chang-wenbin Feb 5, 2026

DeepSeek needs this parameter turned off. Please find out how it is currently disabled and state that in the document.

docs/zh/best_practices/Deepseek-V3.1-Terminus.md


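		- Related configuration:

		`--max-num-batched-tokens`: limits the maximum number of tokens per chunk. In multimodal scenarios each chunk is rounded up to keep images intact, so the actual total number of tokens per inference step can exceed this value. The recommended setting is 384.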

Collaborator

chang-wenbin Feb 5, 2026

Where does this recommended value come from?

docs/zh/best_practices/Deepseek-V3.1-Terminus.md

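+              **How to enable:**
+              Starting from version 2.2 (including the develop branch), Prefix Caching is enabled by default.
+              For version 2.1 and earlier, it must be enabled manually. `--enable-prefix-caching` enables prefix caching, and `--swap-space` additionally enables a CPU cache on top of the GPU cache. Its size is given in GB and should be adjusted to the actual machine. The suggested value is `(total machine memory - model size) * 20%`. If the service fails to start because, for example, other programs are occupying memory, try reducing the value of `--swap-space`.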
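
The suggested formula can be worked through with concrete numbers. The sketch below assumes a hypothetical machine with 2048 GB of host memory and a model occupying 700 GB; both figures and the helper name `suggested_swap_space_gb` are illustrative and do not come from the PR.

```python
def suggested_swap_space_gb(
    total_memory_gb: float, model_size_gb: float, fraction: float = 0.20
) -> float:
    """Apply the (total machine memory - model size) * 20% rule of thumb.

    Hypothetical helper for illustration only; it is not part of FastDeploy.
    """
    headroom_gb = total_memory_gb - model_size_gb
    if headroom_gb <= 0:
        raise ValueError("model size must be smaller than total machine memory")
    return headroom_gb * fraction


if __name__ == "__main__":
    # Assumed example figures: 2048 GB host memory, 700 GB model.
    value = suggested_swap_space_gb(2048, 700)
    print(f"--swap-space {value:.0f}")
```

The result is a starting point only. As the text notes, the value should be lowered if the service fails to start.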

Collaborator

chang-wenbin Feb 5, 2026

This feature needs to be verified.

docs/zh/best_practices/Deepseek-V3.1-Terminus.md

+              > **Maximum number of sequences**
+              - **Parameter:** `--max-num-seqs`
+              - **Description:** Controls the maximum number of sequences the service can handle; supports 1 to 256.

Collaborator

chang-wenbin Feb 5, 2026

Is the current maximum still 256? This needs to be confirmed.

docs/zh/best_practices/Deepseek-V3.1-Terminus.md

+                     --quantization wint4 &
+              ```
+              Where:
+              - `--quantization`: quantization strategy; options:

Collaborator

chang-wenbin Feb 5, 2026

There are other options as well; it would be best to spell them out more clearly.

docs/zh/best_practices/Deepseek-V3.1-Terminus.md

+              ### 2.1 Basics: Starting the Service
+              Start the service with the following command
+              ```bash
+              python -m fastdeploy.entrypoints.openai.api_server \

Collaborator

chang-wenbin Feb 5, 2026

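DeepSeek requires some environment variables to be set, and the document needs to list them.
The purpose of best_practices is to let users deploy a model simply and quickly, with good performance and accuracy.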
