mtmd : (WIP) add ultravox audio input #13623
OK, somehow it works magically, though the code is still nowhere near finished. Tested using the first 6 seconds of https://www.youtube.com/watch?v=vP4iY1TtS3s
#define MINIAUDIO_IMPLEMENTATION
#define MA_NO_ENCODING
#define MA_NO_DEVICE_IO
#define MA_NO_RESOURCE_MANAGER
#define MA_NO_NODE_GRAPH
#define MA_NO_ENGINE
#define MA_NO_GENERATION
#define MA_API static
#include "miniaudio.h"
@ggerganov I initially used dr_wav.h here, but I struggled to write my own resampling algorithm to downsample/upsample audio to 16 kHz. I ended up using miniaudio.h here, which provides decoding for WAV, MP3, FLAC, etc. and also comes with resampling built in.
However, the caveat is that this single-header library is 3 MB of code, and most of its components are disabled at compile time, as you can see here.
What do you think about keeping this lib? I think the other components could be useful for TTS, as they would allow us to play the generated audio without an external command.
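For context on the resampling step mentioned above, here is a minimal sketch of what converting mono float PCM to 16 kHz involves. It is not the code in this PR and not how miniaudio resamples internally: `resample_linear` is a hypothetical helper using plain linear interpolation with no low-pass filter, so downsampling with it can alias. It only illustrates why a ready-made resampler is attractive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of miniaudio or llama.cpp): resample mono
// float PCM from src_rate to dst_rate using linear interpolation.
// No anti-aliasing filter is applied, so quality is poor when downsampling.
static std::vector<float> resample_linear(const std::vector<float> & in, int src_rate, int dst_rate) {
    assert(src_rate > 0 && dst_rate > 0);
    if (in.empty()) {
        return {};
    }
    // number of output samples covering the same duration as the input
    const size_t n_out = (size_t) ((double) in.size() * dst_rate / src_rate);
    // distance between consecutive output samples, measured in input samples
    const double step = (double) src_rate / dst_rate;

    std::vector<float> out(n_out);
    for (size_t i = 0; i < n_out; i++) {
        const double pos = i * step;
        size_t i0 = (size_t) pos;
        if (i0 >= in.size()) {
            i0 = in.size() - 1; // guard against rounding at the very end
        }
        // clamp the right neighbour so the last sample is simply repeated
        const size_t i1   = i0 + 1 < in.size() ? i0 + 1 : in.size() - 1;
        const float  frac = (float) (pos - (double) i0);
        out[i] = in[i0] * (1.0f - frac) + in[i1] * frac;
    }
    return out;
}
```

For example, `resample_linear(samples, 48000, 16000)` would turn one second of 48 kHz audio into 16000 samples. A production-quality resampler would additionally low-pass filter the signal before decimating.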
Supersedes #12745
TODO: write up in more detail:
- miniaudio.h: why is it compiled in as a separate lib (but statically linked)?
- image to media or buffer
- mtmd_ API