# MATE

MATE (MUSA AI Tensor Engine) is a centralized library for Generative AI workloads on MUSA. It provides high-performance attention and GEMM operators, and compatibility wrappers for CUDA-oriented Python APIs.
## Features

- High-performance attention and GEMM operators for MUSA
- Compatibility wrappers for `flash_attn` and `deep_gemm`
- CLI tools for environment checks, configuration inspection, and replay

## Documentation

- CLI documentation: docs/mate_cli.md
- FlashAttention compatibility summary: docs/flash_attention.md
- FlashAttention wrapper: wrappers/flash-attention/README.md
- DeepGEMM wrapper: wrappers/deep_gemm/README.md
## Requirements

| Component | Requirement |
|---|---|
| MUSA Toolkit | 4.3.6 or later |
| TorchMUSA | 2.7 or later |
| Architecture | Pinghu (MP31) |
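Before running workloads, it can be useful to confirm from Python that the MUSA backend is actually usable. The sketch below assumes TorchMUSA's documented behavior of registering a `torch.musa` namespace (with `is_available()`) once `torch_musa` is imported; it is an illustration, not part of MATE's API:

```python
import importlib.util

def musa_available() -> bool:
    """Report whether the MUSA backend can be used from Python.

    Assumes torch_musa registers a `torch.musa` namespace with
    `is_available()` as a side effect of being imported.
    """
    if importlib.util.find_spec("torch_musa") is None:
        return False  # torch_musa is not installed at all
    import torch
    import torch_musa  # noqa: F401  (side effect: registers the musa backend)
    return torch.musa.is_available()

print(musa_available())
```

On a machine without TorchMUSA this simply reports `False` instead of raising, which makes it safe to use as a guard in shared scripts.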
## Installation

```bash
git clone https://github.com/MooreThreads/mate.git --recursive
cd mate
bash build.sh
```

## Project Layout

| Path | Purpose |
|---|---|
| `mate/` | Core Python package and public APIs |
| `wrappers/` | Compatibility wrapper packages for existing Python ecosystems |
| `docs/` | Markdown docs and Sphinx sources |
| `tests/` | Correctness and integration tests |
| `benchmarks/` | Performance and benchmarking scripts |
## Command-Line Interface

MATE provides a command-line interface for configuration, debugging, diagnostics, and replay.
| Command | Purpose |
|---|---|
| `mate check` | Validate the runtime environment |
| `mate show-config` | Display installation and runtime configuration |
| `mate env` | Show relevant environment variables |
| `mate replay --dir PATH` | Replay API calls from Level 10 dumps |
| `mate list-dumps PATH` | List recorded dump directories |
Example:

```bash
mate check
mate show-config
mate env
mate replay --dir mate_dumps/
mate list-dumps mate_dumps/
```

See docs/mate_cli.md for full CLI documentation.
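In CI or launch scripts it can be convenient to gate a job on `mate check`. A hypothetical Python wrapper is sketched below; the `mate check` subcommand comes from the table above, while treating a zero exit code as "environment OK" is the usual CLI convention and an assumption here:

```python
import shutil
import subprocess

def mate_environment_ok():
    """Run `mate check` and report success; return None when the CLI is absent.

    Interpreting exit code 0 as a healthy environment is an assumption
    based on common CLI conventions, not MATE's documentation.
    """
    if shutil.which("mate") is None:
        return None  # MATE is not installed on this machine
    result = subprocess.run(["mate", "check"], capture_output=True, text=True)
    return result.returncode == 0

status = mate_environment_ok()
print({None: "mate CLI not found", True: "check passed", False: "check failed"}[status])
```

Returning `None` rather than raising keeps the helper usable on development machines where MATE is not installed.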
## Wrappers

MATE uses the packages under `wrappers/` as a compatibility layer for CUDA-oriented software stacks on MUSA. These wrappers preserve familiar package names and high-level APIs while routing execution to MATE operators and kernels on MUSA, which helps existing integrations migrate with smaller code changes.
| Wrapper | Package | Import Path | Purpose | Documentation |
|---|---|---|---|---|
| `wrappers/flash-attention` | `mate-flash-attention` | `flash_attn` | FlashAttention-compatible APIs on top of MATE attention operators on MUSA | wrapper README, compatibility summary |
| `wrappers/deep_gemm` | `mate-deep_gemm` | `deep_gemm` | DeepGEMM-compatible APIs on top of MATE GEMM operators on MUSA | wrapper README |
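Because the wrapper keeps the upstream `flash_attn` import path, existing call sites need no changes. A hedged sketch of guarded usage follows; `flash_attn_func` is upstream FlashAttention's public entry point, and which of its arguments are supported on MUSA is described in the compatibility summary:

```python
# The `flash_attn` import path resolves to mate-flash-attention on MUSA and to
# upstream FlashAttention on CUDA; the calling code is identical either way.
try:
    from flash_attn import flash_attn_func
    HAVE_FLASH_ATTN = True
except ImportError:
    HAVE_FLASH_ATTN = False

def fused_attention(q, k, v, causal=True):
    """Dispatch to flash_attn_func when available; raise a clear error otherwise."""
    if not HAVE_FLASH_ATTN:
        raise RuntimeError("install mate-flash-attention (or flash-attn) first")
    return flash_attn_func(q, k, v, causal=causal)

print("flash_attn available:", HAVE_FLASH_ATTN)
```

The try/except guard lets the same module be imported on machines without either backend, deferring the failure to the first actual attention call.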
## Building the Documentation

After installing `mate`, build the Sphinx docs with:

```bash
pip install sphinx furo
cd docs
make html
```

## Acknowledgements

MATE is inspired by FlashInfer, FlashAttention, CUTLASS, FlashMLA, and DeepGEMM.