Add pyproject.toml with legacy build backend to keep most logic in setup.py#7033
Open
stas00
reviewed
Feb 14, 2025
Collaborator
@mrwyattii we just went through some of this with arctic training. If it’s helpful @loadams let’s discuss on slack a bit. There’s a ton that’s currently happening in setup.py, this could be a big lift? But I agree, needs to happen!
Collaborator
Author
Edit: this is no longer correct with latest changes. The current problem is that the logic inside setup.py aside from the call to
This change is required to successfully build fp_quantizer extension on ROCm. --------- Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: Logan Adams <loadams@microsoft.com>
Fix #7029 - Add Chinese blog for deepspeed windows - Fix format in README.md Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
Adding compile support for AIO library on AMD GPUs. --------- Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
Update CUDA compute capability for cross compile according to wiki page. https://en.wikipedia.org/wiki/CUDA#GPUs_supported --------- Signed-off-by: Hongwei <hongweichen@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
…ently, so we aren't seeing cupy installed. Signed-off-by: Logan Adams <loadams@microsoft.com>
Propagate API change. Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
- add zero2 test - minor fix with transformer version update & ds master merge. Signed-off-by: inkcherry <mingzhi.liu@intel.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
bf16 with moe refresh optimizer state from bf16 ckpt will raise IndexError: list index out of range Signed-off-by: shaomin <wukon1992@gmail.com> Co-authored-by: shaomin <wukon1992@gmail.com> Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
**Auto-generated PR to update version.txt after a DeepSpeed release** Released version - 0.16.4 Author - @loadams Co-authored-by: loadams <loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
@jeffra and I fixed this many years ago, so bringing this doc to a correct state. --------- Signed-off-by: Stas Bekman <stas@stason.org> Signed-off-by: Logan Adams <loadams@microsoft.com>
Description This PR includes Tecorigin SDAA accelerator support. With this PR, DeepSpeed supports SDAA as backend for training tasks. --------- Signed-off-by: siqi <siqi@tecorigin.com> Co-authored-by: siqi <siqi@tecorigin.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
Keeps lines within PEP 8 length limits. Enhances readability with a single, concise expression. Preserves original functionality. --------- Signed-off-by: Shaik Raza Sikander <srsikander@habana.ai> Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: Max Kovalenko <mkovalenko@habana.ai> Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: shaomin <wukon1992@gmail.com> Signed-off-by: Stas Bekman <stas@stason.org> Signed-off-by: siqi <siqi@tecorigin.com> Signed-off-by: Logan Adams <loadams@microsoft.com> Signed-off-by: Wei Wu <wuwei211x@gmail.com> Signed-off-by: ShellyNR <shelly.nahir@live.biu.ac.il> Signed-off-by: Lai, Yejing <yejing.lai@intel.com> Signed-off-by: Hongwei <hongweichen@microsoft.com> Signed-off-by: Liang Cheng <astarxp777@gmail.com> Signed-off-by: A-transformer <astarxp777@gmail.com> Co-authored-by: Raza Sikander <srsikander@habana.ai> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Max Kovalenko <mkovalenko@habana.ai> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: inkcherry <mingzhi.liu@intel.com> Co-authored-by: wukong1992 <wukong1992@users.noreply.github.com> Co-authored-by: shaomin <wukon1992@gmail.com> Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com> Co-authored-by: loadams <loadams@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: siqi654321 <siqi202311@163.com> Co-authored-by: siqi <siqi@tecorigin.com> Co-authored-by: Wei Wu <45323446+U-rara@users.noreply.github.com> Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com> Co-authored-by: Shelly Nahir <73890534+ShellyNR@users.noreply.github.com> Co-authored-by: snahir <snahir@habana.ai> Co-authored-by: Yejing-Lai <yejing.lai@intel.com> Co-authored-by: A-transformer <astarxp777@gmail.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
Unpin transformers version for all workflows except `nv-torch-latest-v100` as this still has a tolerance issue with some quantization tests. Signed-off-by: Logan Adams <loadams@microsoft.com>
Resolves #6997 This PR conditionally quotes environment variable values—only wrapping those containing special characters (like parentheses) that could trigger bash errors. Safe values remain unquoted. --------- Signed-off-by: Saurabh <saurabhkoshatwar1996@gmail.com> Signed-off-by: Saurabh Koshatwar <saurabhkoshatwar1996@gmail.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
Correct the BACKWARD_PREFETCH_SUBMIT mismatch FORWARD_PREFETCH_SUBMIT = 'forward_prefetch_submit' --------- Signed-off-by: Shaik Raza Sikander <srsikander@habana.ai> Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: Max Kovalenko <mkovalenko@habana.ai> Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: shaomin <wukon1992@gmail.com> Signed-off-by: Stas Bekman <stas@stason.org> Signed-off-by: siqi <siqi@tecorigin.com> Signed-off-by: Logan Adams <loadams@microsoft.com> Signed-off-by: Wei Wu <wuwei211x@gmail.com> Signed-off-by: ShellyNR <shelly.nahir@live.biu.ac.il> Signed-off-by: Lai, Yejing <yejing.lai@intel.com> Signed-off-by: Hongwei <hongweichen@microsoft.com> Signed-off-by: A-transformer <astarxp777@gmail.com> Co-authored-by: Raza Sikander <srsikander@habana.ai> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Max Kovalenko <mkovalenko@habana.ai> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: inkcherry <mingzhi.liu@intel.com> Co-authored-by: wukong1992 <wukong1992@users.noreply.github.com> Co-authored-by: shaomin <wukon1992@gmail.com> Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com> Co-authored-by: loadams <loadams@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: siqi654321 <siqi202311@163.com> Co-authored-by: siqi <siqi@tecorigin.com> Co-authored-by: Wei Wu <45323446+U-rara@users.noreply.github.com> Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com> Co-authored-by: Shelly Nahir <73890534+ShellyNR@users.noreply.github.com> Co-authored-by: snahir <snahir@habana.ai> Co-authored-by: Yejing-Lai <yejing.lai@intel.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
…Tests (#7146) Enhancing CI/nightly coverage for the Gaudi2 device. Tests added: test_autotp_training.py, test_ulysses.py, test_linear::TestLoRALinear and test_linear::TestBasicLinear, test_ctx::TestEngine. These provide coverage for the model_parallelism and linear features. The tests are stable: 10/10 runs pass. The new tests are expected to increase CI time by 3-4 minutes and nightly job time by 15 minutes. Signed-off-by: Shaik Raza Sikander <srsikander@habana.ai> Signed-off-by: Logan Adams <loadams@microsoft.com>
Changes from huggingface/transformers#36654 in transformers cause issues with the torch 2.5 version we were using. This just updated us to use a newer version. --------- Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
@tjruwase Don't merge yet, I will leave a comment when it is ready for merge. Thank you. --------- Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: Logan Adams <loadams@microsoft.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Logan Adams <loadams@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
) This PR is a continuation of the efforts to improve DeepSpeed performance when using PyTorch compile.
Dynamo breaks the graph because `flat_tensor.requires_grad = False`:
* Is a side-effecting operation on tensor metadata
* Occurs in a context where Dynamo expects static tensor properties for tracing

The `flat_tensor.requires_grad = False` assignment is redundant and can be safely removed because:
* The `_allgather_params()` function is already decorated with `@torch.no_grad()`, which ensures the desired property
* `flat_tensor` is created using `torch.empty()`, which sets `requires_grad=False` by default.

--------- Signed-off-by: Max Kovalenko <mkovalenko@habana.ai> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
ZeRO3 requires explicit cleaning in tests when reusing the environment. This PR adds `destroy` calls to the tests to free memory and avoid potential errors due to memory leaks. Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: c8ef <c8ef@outlook.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: Hongwei <hongweichen@microsoft.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>
705edb3 to
31ec2b7
Compare
sfc-gh-mwyatt
approved these changes
Mar 25, 2025
agronholm
reviewed
Apr 9, 2025
Comment on lines +4 to +6
    "setuptools>=64",
    "torch",
    "wheel"
If you depend on setuptools 70.1 or later, you won't need wheel.
Suggested change
-    "setuptools>=64",
-    "torch",
-    "wheel"
+    "setuptools>=70.1",
+    "torch"
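To make the reviewer's point concrete, here is a minimal sketch of the rule being applied to the build requirements. The helper names `min_setuptools_version` and `wheel_is_redundant` are hypothetical and not part of the PR; the 70.1 threshold is taken from the review comment above. The parser is deliberately simplified and only understands the `setuptools>=X.Y` form used here, not full requirement specifiers.

```python
import re

# Threshold taken from the review comment: with setuptools 70.1 or later,
# listing "wheel" as a build requirement is no longer needed.
WHEEL_FREE_SETUPTOOLS = (70, 1)


def min_setuptools_version(requires):
    """Return the minimum setuptools version in a build `requires` list.

    Simplified on purpose: only the `setuptools>=X.Y` form is recognised.
    Returns a tuple of ints, or None if no such entry is present.
    """
    for req in requires:
        match = re.fullmatch(r"\s*setuptools\s*>=\s*(\d+(?:\.\d+)*)\s*", req)
        if match:
            return tuple(int(part) for part in match.group(1).split("."))
    return None


def wheel_is_redundant(requires):
    """True if the setuptools floor is high enough that "wheel" can be dropped."""
    minimum = min_setuptools_version(requires)
    return minimum is not None and minimum >= WHEEL_FREE_SETUPTOOLS


# The two variants of the requirement list from the suggested change.
before = ["setuptools>=64", "torch", "wheel"]
after = ["setuptools>=70.1", "torch"]

print(wheel_is_redundant(before))  # False: the floor of 64 is below 70.1
print(wheel_is_redundant(after))   # True: "wheel" can be left out
```

Raising the floor to 70.1 trades compatibility with older setuptools releases for a shorter requirement list, so whether to accept the suggestion depends on which setuptools versions the project still needs to build with.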
Successfully built deepspeed-0.16.5+1d869d1f-cp311-cp311-win_amd64.whl
`--no-build-isolation`
`python -m build` - Successfully built deepspeed-0.16.5+1d869d1f.tar.gz and deepspeed-0.16.5+unknown-py3-none-any.whl

The main goal of this effort is to become compliant with the coming changes to pip in 25.1 listed here, which will break editable installs. Future PRs will fully move from `setup.py` to `pyproject.toml`.

Fixes: #7031
MII equivalent PR: deepspeedai/DeepSpeed-MII#555
DS-Kernels equivalent PR: deepspeedai/DeepSpeed-Kernels#20