pyannote/speaker-diarization-3.0 runs slower than pyannote/speaker-diarization@2.1 #499

Closed
@kaihe-stori

Description

] + ["pyannote.audio @ git+https://github.com/pyannote/pyannote-audio@db24eb6c60a26804b1f07a6c2b39055716beb852"],

Currently pyannote.audio is pinned to 3.0.0, but that release has been reported to run slower because the embedding model runs on the CPU. Release 3.0.1 fixed this by replacing onnxruntime with onnxruntime-gpu.

It makes sense for whisperX to update pyannote.audio to 3.0.1. However, this conflicts with faster_whisper over onnxruntime, as discussed here. Until that is resolved on the faster_whisper side, installing both leaves onnxruntime in CPU mode, so diarization stays slow.
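To see whether an environment is affected, it may help to list which onnxruntime distributions pip has installed. The sketch below uses only the standard library; the helper name `installed_onnxruntime_variants` is my own, not part of whisperX or pyannote.audio.

```python
from importlib import metadata


def installed_onnxruntime_variants(candidates=("onnxruntime", "onnxruntime-gpu")):
    """Return {distribution name: version} for each candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is not installed; skip it.
            pass
    return found


if __name__ == "__main__":
    variants = installed_onnxruntime_variants()
    print(variants)
    if "onnxruntime" in variants and "onnxruntime-gpu" in variants:
        # Assumption based on the behaviour described in this issue.
        print("Both CPU and GPU builds are installed; inference may still run on CPU.")
```

If both names show up, that matches the conflict described above.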

My current workaround is to run the following commands after installation:

pip install pyannote.audio==3.0.1
pip uninstall onnxruntime
pip install --force-reinstall onnxruntime-gpu
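After running these commands, a quick way to confirm that the GPU build is the one being imported is to ask onnxruntime which execution providers it offers. This is a minimal sketch: `cuda_provider_available` is a hypothetical helper, while `get_available_providers()` and the `CUDAExecutionProvider` name come from onnxruntime itself.

```python
def cuda_provider_available(providers):
    """Return True if the CUDA execution provider is in the given list."""
    return "CUDAExecutionProvider" in providers


try:
    import onnxruntime as ort
except ImportError:
    # onnxruntime is not installed in this environment.
    ort = None

if ort is not None:
    providers = ort.get_available_providers()
    print("Available providers:", providers)
    if cuda_provider_available(providers):
        print("onnxruntime can use the GPU.")
    else:
        print("onnxruntime is CPU-only here; the embedding model will likely run on CPU.")
```

Seeing only `CPUExecutionProvider` suggests the CPU-only package is still the one in use.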

Alternatively, use the old 2.1 model:

model = whisperx.DiarizationPipeline(model_name='pyannote/speaker-diarization@2.1', use_auth_token=YOUR_AUTH_TOKEN, device='cuda')
