
Add examples of AutoTP #998

Merged
tohtana merged 11 commits into deepspeedai:master from tohtana:tohtana/custom_auto_tp on Feb 7, 2026

Conversation

@tohtana (Contributor) commented Jan 22, 2026

This PR adds examples of AutoTP training, including custom partitioning patterns.

Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
tohtana requested a review from tjruwase as a code owner on January 22, 2026 00:13
tohtana added a commit to deepspeedai/DeepSpeed that referenced this pull request Jan 31, 2026
This PR introduces a flexible, configuration-driven API for AutoTP
(Automatic Tensor Parallelism) that allows users to define custom layer
partitioning patterns for training.
@inkcherry @delock 

## Motivation

Previously, AutoTP relied on hardcoded layer detection logic that was
difficult to customize for new model architectures. This PR enables:

1. **Custom models**: Users can define exact regex patterns to match
their model's parameter names
2. **Fused layers**: Support for fused QKV, gate_up_proj, and other
packed weight matrices with unequal sub-parameter sizes (e.g., GQA with
different Q/K/V dimensions)
3. **Extensibility**: Easy to add new model presets or customize
existing ones

Here is an example of a config including custom partitioning patterns:

```json
{
    "tensor_parallel": {
        "autotp_size": 4,
        "partition_config": {
            "use_default_specs": false,
            "layer_specs": [
                {
                    "patterns": [".*\\.o_proj\\.weight$", ".*\\.down_proj\\.weight$"],
                    "partition_type": "row"
                },
                {
                    "patterns": [".*\\.[qkv]_proj\\.weight$"],
                    "partition_type": "column"
                },
                {
                    "patterns": [".*\\.gate_up_proj\\.weight$"],
                    "partition_type": "column",
                    "shape": [2, -1],
                    "partition_dim": 0
                }
            ]
        }
    }
}
```
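
To make the fused ``gate_up_proj`` spec concrete: ``"shape": [2, -1]`` views the fused weight as two sub-matrices (the gate and up projections) stacked along ``partition_dim``, each of which is sharded separately before the local pieces are re-fused. Below is a minimal sketch of that splitting semantics; the helper name and mechanics are illustrative, not DeepSpeed's internal code.

```python
import torch

# Illustrative helper (not DeepSpeed's internal code): shard a fused weight
# described by a spec like {"shape": [2, -1], "partition_dim": 0}.
def shard_fused_weight(weight: torch.Tensor, num_fused: int,
                       partition_dim: int, tp_rank: int,
                       tp_size: int) -> torch.Tensor:
    # View the fused matrix as num_fused stacked sub-weights
    # (e.g., gate_proj and up_proj for gate_up_proj).
    sub_weights = torch.chunk(weight, num_fused, dim=partition_dim)
    # Shard each sub-weight independently, then re-fuse the local shards so
    # every rank keeps the fused [gate_shard; up_shard] layout.
    local_shards = [torch.chunk(w, tp_size, dim=partition_dim)[tp_rank]
                    for w in sub_weights]
    return torch.cat(local_shards, dim=partition_dim)

# Example: hidden=8, intermediate=16, fused weight [2*16, 8], tp_size=4.
fused = torch.randn(32, 8)
local = shard_fused_weight(fused, num_fused=2, partition_dim=0,
                           tp_rank=0, tp_size=4)
print(local.shape)  # torch.Size([8, 8]): 4 gate rows + 4 up rows per rank
```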

Refer to the
[document](https://github.com/tohtana/DeepSpeed/blob/tohtana/autotp_custom_patterns/docs/code-docs/source/training.rst)
for more details (including preset models and how to define partitioning
for fused models).
We also opened a new
[PR](deepspeedai/DeepSpeedExamples#998) that demonstrates
the usage.


## Simplified initialization step

AutoTP previously required calling ``set_autotp_mode(training=True)``
and ``deepspeed.tp_model_init`` before ``deepspeed.initialize``. Now all
the necessary configuration can be included in the DeepSpeed config.
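
For example, assuming the config above is saved as ``ds_config.json`` (with the usual batch-size and optimizer settings alongside the ``tensor_parallel`` section), initialization reduces to a single call. A minimal sketch; the model checkpoint is just a placeholder:

```python
import json

import deepspeed
import torch
from transformers import AutoModelForCausalLM

# The config carries the tensor_parallel section shown above, so no
# set_autotp_mode / tp_model_init calls are needed on this path.
with open("ds_config.json") as f:
    ds_config = json.load(f)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",  # placeholder checkpoint
    torch_dtype=torch.bfloat16,
)

# deepspeed.initialize reads the tensor_parallel config and sets up the
# TP groups itself.
engine, optimizer, _, _ = deepspeed.initialize(model=model, config=ds_config)
```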

We still support the traditional initialization path for backward
compatibility. If you use both (i.e., call ``set_autotp_mode(training=True)``
and ``deepspeed.tp_model_init`` while also passing the config to
``deepspeed.initialize``), the settings are merged at initialization;
conflicting settings raise an error.

---------

Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
tohtana and others added 2 commits February 1, 2026 10:18
Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
tohtana added a commit to deepspeedai/DeepSpeed that referenced this pull request Feb 7, 2026
The current code has the following issues:
- `use_default_specs: false` doesn't work
- Injection by the traditional pattern runs even when custom patterns
are set
- `mpu` needs to be passed to `deepspeed.initialize` (the HF integration
doesn't pass `mpu`)

This PR fixes the AutoTP setup to respect `use_default_specs: false` and
to disable the traditional injection path when custom patterns are enabled.
Also, when `mpu` is not passed, we now create a TP group during
initialization.


With these changes, the [related
tests](https://github.com/deepspeedai/DeepSpeed/tree/master/tests/unit/model_parallelism)
pass, and [all AutoTP
examples](https://github.com/tohtana/DeepSpeedExamples/tree/tohtana/custom_auto_tp/training/tensor_parallel)
in DeepSpeedExamples now work
([PR](deepspeedai/DeepSpeedExamples#998)).

---------

Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
tohtana merged commit ece52bc into deepspeedai:master on Feb 7, 2026
2 checks passed
