Are MQA and GQA in development? #727
Open
Description
Hi Experts,
Several recent models use MQA (Multi-Query Attention) or GQA (Grouped-Query Attention). From the issue list, I noticed that some users have already asked about support for these two algorithms, and it has been quite a long time since then. May I ask whether there is a plan to support them, and when the code will be merged?
Models currently using MQA or GQA:
- Llama2 (GQA)
- ChatGLM2-6B
- Falcon
- SantaCoder, StarCoder
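
For anyone unfamiliar with the two schemes, here is a minimal sketch of how they differ from standard multi-head attention: several query heads share one key/value head. The function names are hypothetical and not taken from this repository, and the contiguous grouping of query heads is an assumption (it is the common convention, but an implementation could group heads differently).

```python
def kv_head_for_query_head(q_head, num_q_heads, num_kv_heads):
    """Return the index of the K/V head that query head `q_head` reads from.

    Assumes query heads are grouped contiguously (a convention, not a rule).
    """
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be divisible by num_kv_heads")
    group_size = num_q_heads // num_kv_heads  # query heads per shared K/V head
    return q_head // group_size


def head_mapping(num_q_heads, num_kv_heads):
    """List the K/V head used by each query head, in query-head order."""
    return [
        kv_head_for_query_head(h, num_q_heads, num_kv_heads)
        for h in range(num_q_heads)
    ]


print(head_mapping(8, 8))  # MHA: one K/V head per query head -> [0, 1, 2, 3, 4, 5, 6, 7]
print(head_mapping(8, 2))  # GQA: two groups of four          -> [0, 0, 0, 0, 1, 1, 1, 1]
print(head_mapping(8, 1))  # MQA: a single shared K/V head    -> [0, 0, 0, 0, 0, 0, 0, 0]
```

So MQA is the special case of GQA with one K/V head, and standard multi-head attention is the case where the number of K/V heads equals the number of query heads.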
Any comments will be appreciated.