added format check for method read_data in rawread#70
added format check for method read_data in rawread#70Charliechen1 wants to merge 2 commits intobeetbox:mainfrom
Conversation
sampsyo
left a comment
There was a problem hiding this comment.
Interesting! Thank you for the contribution—this looks promising.
Is it possible to provide a sample audio file where this fix is necessary? It would be great to have that documented here so we can come back to it if something goes wrong in the future.
I'd also love to hear more detail about why padding with the byte 0xFF is the right thing. I admit I'm not too knowledgable about the sample format, but I would have guessed that 0x00 (null) would have been the right padding byte.
Finally, I made some nitpicky Python style comments inline.
Thanks again!
audioread/rawread.py
Outdated
|
|
||
| remainder = len(data) % old_width | ||
| if remainder != 0 : | ||
| data = data + PATCH_BYTE*(old_width-remainder) |
There was a problem hiding this comment.
Here are some very low-level style (i.e., PEP8) comments:
- Please remove the whitespace on the blank line.
- Please remove the space between the 0 and the : in the
ifstatement. - Please add spaces around the binary operators
*and-.
audioread/rawread.py
Outdated
|
|
||
| # Produce two-byte (16-bit) output samples. | ||
| TARGET_WIDTH = 2 | ||
| PATCH_BYTE = b'\xff' |
There was a problem hiding this comment.
The comment above doesn't apply to this constant. So please add a blank line above it and, ideally, write a brief sentence explaining what this is useful for.
There was a problem hiding this comment.
Really thanks to your advice, I am a fresh graduate, so there must be lots of things to learn lol. The suggestions are quite helpful to me.
|
Here for your reference: |
|
Hmm; perhaps! But on the other hand, another reasonable (silent) fix might be to round down instead of up—that is, to drop the last (partial) sample if it exists. Would that make sense to you? |
|
It should works. |
|
OK, great! Want to give it a try and see if it works on the file you have? |
|
Sure~ |
The python audioop.lin2lin will complain if the length of data can not be divided by old_width, and it's not that convenient to check the length of the audio before using the model, especially when a large batches of audio files are used in some machine learning tasks. Therefore, I have made some patch for the input audio data if the length is not to the satisfaction. Thank you for taking my suggestion into consideration, and the project is truly intensive for me. 👍