add activation checkpoint for batch_split #3108

Merged
copybara-service[bot] merged 1 commit into main from qinwen/add_checkpoint
Feb 7, 2026
suexu1025 (Collaborator) commented Feb 6, 2026

Description

  • add activation checkpoint for batch split

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov bot commented Feb 7, 2026

Codecov Report

❌ Patch coverage is 66.66667% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/MaxText/layers/deepseek_batchsplit.py | 0.00% | 1 Missing ⚠️ |

NuojCheng (Collaborator) left a comment

LGTM but I suggest adding "attention_out" to other attention modules as well for consistency, e.g. attention_op.py, attention_mla.py.
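
For readers unfamiliar with named activation checkpoints, the following is a minimal JAX sketch of the general technique, not the code from this PR. It tags an attention output with the name "attention_out" via `jax.ad_checkpoint.checkpoint_name` and then rematerializes everything except that tensor using `jax.checkpoint` with the `save_only_these_names` policy. The functions `toy_attention` and `block`, the parameter names, and the shapes are all hypothetical; how MaxText maps "attention_out" to its own remat configuration is not shown in this thread.

```python
import jax
import jax.numpy as jnp
from jax.ad_checkpoint import checkpoint_name


def toy_attention(q, k, v):
    """Hypothetical single-head attention; not MaxText's implementation."""
    scores = jnp.einsum("bqd,bkd->bqk", q, k) / jnp.sqrt(q.shape[-1])
    weights = jax.nn.softmax(scores, axis=-1)
    out = jnp.einsum("bqk,bkd->bqd", weights, v)
    # Tag the output so a checkpoint policy can refer to it by name.
    return checkpoint_name(out, "attention_out")


def block(params, x):
    """Hypothetical layer: projections, attention, output projection, scalar loss."""
    q = x @ params["wq"]
    k = x @ params["wk"]
    v = x @ params["wv"]
    attn = toy_attention(q, k, v)
    return jnp.sum((attn @ params["wo"]) ** 2)


# Save only the tensor named "attention_out" for the backward pass;
# other intermediates inside `block` are recomputed when needed.
policy = jax.checkpoint_policies.save_only_these_names("attention_out")
remat_block = jax.checkpoint(block, policy=policy)

keys = jax.random.split(jax.random.PRNGKey(0), 5)
dim = 8  # illustrative size only
params = {
    name: jax.random.normal(key, (dim, dim)) * 0.1
    for name, key in zip(["wq", "wk", "wv", "wo"], keys[:4])
}
x = jax.random.normal(keys[4], (2, 4, dim))

# Rematerialization changes memory/compute trade-offs, not the gradients.
grads_ref = jax.grad(block)(params, x)
grads_remat = jax.grad(remat_block)(params, x)
```

Giving every attention module the same name means a single policy entry can cover all of them, which is the consistency benefit the suggestion above points at.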

suexu1025 (Collaborator, Author) commented

> LGTM but I suggest adding "attention_out" to other attention modules as well for consistency, e.g. attention_op.py, attention_mla.py.

Just updated: added attention_out to both of the other attention files so that it is consistently available in the config.

suexu1025 requested a review from NuojCheng February 7, 2026 01:12
update

clean up

add attention_out for attention, attention_mla
RissyRan (Collaborator) left a comment

Shall we reuse context for attention out?

copybara-service bot merged commit 95ef3e1 into main Feb 7, 2026
34 checks passed
copybara-service bot deleted the qinwen/add_checkpoint branch February 7, 2026 01:59
4 participants