
Support New Model

Please make sure you have installed and initialized pre-commit before adding any new commits. Refer to PRE_COMMIT for more details.

Cache Acceleration

To support cache acceleration for a new model, register its BlockAdapter in caching/block_adapters/adapter.py and import it with the _safe_import function in caching/block_adapters/__init__.py. For example:

@BlockAdapterRegister.register("QwenImage")
def qwenimage_adapter(pipe, **kwargs) -> BlockAdapter:
  try:
    from diffusers import QwenImageTransformer2DModel
  except ImportError:
    QwenImageTransformer2DModel = None  # requires diffusers>=0.35.2

  _relaxed_assert(pipe.transformer, QwenImageTransformer2DModel)

  return BlockAdapter(
    pipe=pipe,
    transformer=pipe.transformer,
    blocks=pipe.transformer.transformer_blocks,
    forward_pattern=ForwardPattern.Pattern_1,
    check_forward_pattern=True,
    has_separate_cfg=True,
    **kwargs,
  )

# In caching/block_adapters/__init__.py:
qwenimage_adapter = _safe_import(".adapters", "qwenimage_adapter")
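
The registration pattern above can be sketched in isolation. The following is a minimal, self-contained illustration and not the library's actual implementation: `AdapterRegistry` and `safe_import` are hypothetical stand-ins for `BlockAdapterRegister` and `_safe_import`, and the prefix-based lookup is an assumption about how a pipeline class name might be resolved to a registered adapter.

```python
import importlib
from typing import Any, Callable, Dict, Optional


class AdapterRegistry:
    """Hypothetical stand-in for BlockAdapterRegister: maps a model name to a factory."""

    _adapters: Dict[str, Callable[..., Any]] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            cls._adapters[name] = func  # store the factory under the model name
            return func

        return decorator

    @classmethod
    def get(cls, pipe_class_name: str) -> Optional[Callable[..., Any]]:
        # Assumption: resolve by prefix, so "QwenImagePipeline" matches "QwenImage".
        for name, factory in cls._adapters.items():
            if pipe_class_name.startswith(name):
                return factory
        return None


def safe_import(module: str, name: str, package: Optional[str] = None) -> Any:
    """Return attribute `name` from `module`, or None if it cannot be imported."""
    try:
        return getattr(importlib.import_module(module, package=package), name)
    except (ImportError, AttributeError):
        return None


@AdapterRegistry.register("QwenImage")
def qwenimage_adapter(pipe: Any, **kwargs: Any) -> Dict[str, Any]:
    # A real adapter would return a BlockAdapter; a dict keeps the sketch runnable.
    return {"pipe": pipe, **kwargs}
```

Guarding the import this way means that a model whose dependencies are missing (for example, an older diffusers release) simply resolves to `None` instead of breaking the import of the whole package.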

Context Parallelism

To support context parallelism for a new model, register its ContextParallelismPlanner in context_parallelism and import it with the _safe_import function in cp_planners.py. For example:

  • Step 1: Implement the FluxContextParallelismPlanner (the FLUX.1 CP planner) in flux.py.
  • Step 2: Import it with the _safe_import function in planners.py.
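
The two steps above can be sketched as a class-based planner that registers itself and is later looked up from the transformer's class name. This is a hedged illustration only: `PlannerBase`, `PlannerRegistry`, `FluxPlannerSketch`, the `apply` signature, and the contents of the returned plan are all hypothetical, and the dummy `FluxTransformer2DModel` merely stands in for the real diffusers class.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type


class PlannerBase(ABC):
    """Hypothetical planner interface; the real base class may differ."""

    @abstractmethod
    def apply(self, transformer: object) -> Dict[str, str]:
        """Return a plan describing how inputs and outputs are sharded."""


class PlannerRegistry:
    """Hypothetical registry keyed by a model-name prefix."""

    _planners: Dict[str, Type[PlannerBase]] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[Type[PlannerBase]], Type[PlannerBase]]:
        def decorator(planner_cls: Type[PlannerBase]) -> Type[PlannerBase]:
            if not issubclass(planner_cls, PlannerBase):
                raise TypeError(f"{planner_cls.__name__} must subclass PlannerBase")
            cls._planners[name] = planner_cls
            return planner_cls

        return decorator

    @classmethod
    def get_planner(cls, transformer: object) -> Type[PlannerBase]:
        cls_name = type(transformer).__name__
        for name, planner_cls in cls._planners.items():
            if cls_name.startswith(name):
                return planner_cls
        raise ValueError(f"No planner registered for {cls_name}")


# Step 1 (would live in flux.py): implement and register the planner.
@PlannerRegistry.register("Flux")
class FluxPlannerSketch(PlannerBase):
    def apply(self, transformer: object) -> Dict[str, str]:
        # Placeholder plan: split the sequence on input, gather it on output.
        return {"hidden_states": "split_sequence", "proj_out": "gather_sequence"}


class FluxTransformer2DModel:
    """Dummy stand-in so the lookup can be exercised without diffusers."""


planner = PlannerRegistry.get_planner(FluxTransformer2DModel())()
plan = planner.apply(FluxTransformer2DModel())
```

Step 2 would then amount to a guarded import of the planner module so that the decorator runs and the registration takes effect.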

Tensor Parallelism

To support tensor parallelism for a new model, register its TensorParallelismPlanner in tensor_parallelism and import it with the _safe_import function in planners.py. For example:

  • Step 1: Implement the FluxTensorParallelismPlanner (the FLUX.1 TP planner) in flux.py.
  • Step 2: Import it with the _safe_import function in planners.py.
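
As a rough illustration of what a tensor parallelism planner has to decide, the sketch below builds a per-block sharding plan as a plain dictionary. Everything here is an assumption made for illustration: `build_tp_plan` is a hypothetical helper, the module names follow a common attention naming style and have not been checked against FLUX.1, and the `"colwise"` / `"rowwise"` labels only echo the column-wise and row-wise styles found in PyTorch's tensor parallel API. A real planner may structure its plan differently.

```python
from typing import Dict, List


def build_tp_plan(
    num_blocks: int,
    column_modules: List[str],
    row_modules: List[str],
    blocks_name: str = "transformer_blocks",
) -> Dict[str, str]:
    """Map each block's submodules to a sharding style (hypothetical helper)."""
    plan: Dict[str, str] = {}
    for i in range(num_blocks):
        prefix = f"{blocks_name}.{i}"
        # Projections whose output features are split across ranks.
        for module in column_modules:
            plan[f"{prefix}.{module}"] = "colwise"
        # Projections whose input features are split, with a reduce afterwards.
        for module in row_modules:
            plan[f"{prefix}.{module}"] = "rowwise"
    return plan


# Hypothetical module names for a two-block transformer.
tp_plan = build_tp_plan(
    num_blocks=2,
    column_modules=["attn.to_q", "attn.to_k", "attn.to_v"],
    row_modules=["attn.to_out.0"],
)
```

Pairing column-wise input projections with a row-wise output projection is a common choice because it keeps the intermediate activations sharded and needs only one reduction per attention block.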

Text Encoder Parallelism

To support text encoder tensor parallelism for a new model, register its TextEncoderTensorParallelismPlanner in text_encoders and import it with the _safe_import function in planners.py.

Auto Encoder (VAE) Parallelism

To support auto encoder (VAE) data parallelism for a new model, register its AutoEncoderDateParallelismPlanner in autoencoders and import it with the _safe_import function in planners.py.

Examples and Tests

Once acceleration support for the new model is complete, add the model to the Examples and run the necessary tests.