fix(renderer): add timestamp synchronization with recording pipeline #167
Conversation
Add reference to GLRecordingPipeline for synchronized timestamp updates during rendering. When available, uses the pipeline's synchronized timestamp instead of direct camera timestamp for better video/audio alignment.
Summary of Changes
Hello @anonfaded, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request primarily focuses on improving the synchronization between video and audio streams during recording, aiming for better alignment in the final output. Beyond synchronization, I've also significantly enhanced the overall robustness and reliability of the recording pipeline, making it more resilient to various operational challenges and ensuring recorded files are consistently playable.
Highlights
- **Video/Audio Timestamp Synchronization:** I've implemented a new mechanism within `GLRecordingPipeline` to generate synchronized timestamps for both video and audio frames. This ensures better alignment in the recorded output. The `GLWatermarkRenderer` now utilizes these synchronized timestamps for setting presentation times, moving away from direct camera timestamps (a minimal sketch of this idea follows this list).
- **Enhanced Recording Pipeline Robustness:** I've introduced comprehensive error handling and fallback strategies for the `MediaMuxer` and `MediaCodec`. This includes new `emergencyDrainEncoders` and `emergencyFinalizeMuxer` methods, designed to prevent data loss and ensure file playability even in unexpected scenarios during recording shutdown.
- **Improved Encoder Configuration and Fallbacks:** The video encoder initialization now includes logic to configure encoders with essential and industry-standard settings. This also incorporates a codec fallback mechanism (e.g., attempting H.264 if HEVC fails) to maximize device compatibility and recording reliability.
- **Refined Muxer Start-up Logic:** I've modified the muxer initialization to explicitly wait for both video and audio encoder output formats to be available before starting the muxer. This addresses common issues related to format changes and ensures correct track addition, preventing potential duration corruption.
- **Synchronized Audio/Video Recording Start/Stop:** The recording start process has been adjusted to give the audio encoder a brief moment to initialize, and the stop process now prioritizes stopping audio synchronously with video. This further contributes to improved audio/video alignment in the final recording.
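To make the first highlight concrete, here is a minimal sketch of how a pipeline-owned synchronized timestamp could work. The field names and arithmetic are assumptions for illustration, not this PR's actual code; the only grounded details are the units (nanoseconds in, microseconds out) and the lazy initialization on the first frame, both discussed in the review below.

```java
// Hypothetical sketch, not the PR's implementation. Grounded details:
// getSynchronizedVideoTimestamp() takes the camera timestamp in nanoseconds,
// returns microseconds, and initializes itself on the first frame.
private long firstCameraTimestampNanos = -1;
private long recordingBaseTimeUs;

public long getSynchronizedVideoTimestamp(long cameraTimestampNanos) {
    if (firstCameraTimestampNanos < 0) {
        initializeVideoTimestamp(cameraTimestampNanos);
    }
    // Offset each frame from the first frame so video shares the same
    // microsecond timebase as the audio encoder.
    long elapsedUs = (cameraTimestampNanos - firstCameraTimestampNanos) / 1000L;
    return recordingBaseTimeUs + elapsedUs;
}

private void initializeVideoTimestamp(long cameraTimestampNanos) {
    firstCameraTimestampNanos = cameraTimestampNanos;
    recordingBaseTimeUs = System.nanoTime() / 1000L; // shared A/V time base
}
```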
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes

[^1]: Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.
Code Review
The pull request introduces synchronized timestamp updates and robustness improvements. However, there are a few critical issues that need attention: a regression that removes the video segment splitting logic from the render loop, an incorrect constant used for encoder configuration, and a bug in the timestamp calculation fallback logic. Please address these points for a successful merge.
```java
// Set complexity for better encoding (API 21+)
if (android.os.Build.VERSION.SDK_INT >= android.os.Build.VERSION_CODES.LOLLIPOP) {
    try {
        format.setInteger(MediaFormat.KEY_COMPLEXITY, MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_VBR);
```
There appears to be a copy-paste error in this block. `MediaFormat.KEY_COMPLEXITY` is being set with `MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_VBR`, which is a constant intended for `KEY_BITRATE_MODE`. This is incorrect and will likely lead to unexpected encoding behavior or errors.
Since complexity values are highly encoder-specific, it's safer to remove this configuration unless you query the supported values from `MediaCodecInfo.EncoderCapabilities`.
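As a hedged sketch of that query-first approach (assuming `codecInfo` and `mimeType` are available from the encoder-selection step; neither name comes from this PR):

```java
// Query the encoder's supported complexity range before setting KEY_COMPLEXITY.
// "codecInfo" and "mimeType" are assumed to come from encoder selection.
android.util.Range<Integer> complexityRange = codecInfo
        .getCapabilitiesForType(mimeType)
        .getEncoderCapabilities()
        .getComplexityRange();
// Many encoders report [0, 0], meaning complexity is not tunable at all.
if (complexityRange.getUpper() > complexityRange.getLower()) {
    format.setInteger(MediaFormat.KEY_COMPLEXITY,
            complexityRange.clamp(5)); // clamp an arbitrary mid value into range
}
```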
```java
if (glRenderer != null) {
    glRenderer.renderFrame();
}

// Only drain encoder if rendering was successful
if (renderSuccess) {
    // Drain encoded frame data
    drainEncoder();
```
The logic for video segment splitting (`if (shouldSplitSegment()) { rolloverSegment(); }`) was present in the previous version of `renderRunnable` but has been removed in this change. This is a critical regression that will cause the video splitting feature to stop working.
The check should be restored in the render loop, likely after `drainEncoder()`:
```java
drainEncoder();
// Check if we need to split the segment due to size
if (shouldSplitSegment()) {
    Log.d(TAG, "Size limit reached, rolling over segment");
    rolloverSegment();
}
```

```java
long cameraTimestamp = cameraSurfaceTexture.getTimestamp();
long synchronizedTimestamp = 0;

// Get synchronized timestamp from recording pipeline if available
if (recordingPipeline != null) {
    synchronizedTimestamp = recordingPipeline.getSynchronizedVideoTimestamp(cameraTimestamp);
} else {
    // Fallback to camera timestamp if no recording pipeline
    synchronizedTimestamp = cameraTimestamp;
}

EGLExt.eglPresentationTimeANDROID(eglDisplay, eglSurface, synchronizedTimestamp * 1000L); // Convert to nanoseconds
```
There's a bug in the fallback logic for the presentation timestamp. `cameraSurfaceTexture.getTimestamp()` returns a value in nanoseconds, while `recordingPipeline.getSynchronizedVideoTimestamp()` returns microseconds. In the `else` block, `synchronizedTimestamp` is assigned the nanosecond value from `cameraTimestamp`. Then `synchronizedTimestamp * 1000L` is passed to `eglPresentationTimeANDROID`, which incorrectly scales the already-nanosecond timestamp.
This will result in incorrect presentation times when the recording pipeline is not available. The fallback should use the camera timestamp directly, without scaling:
```java
long cameraTimestamp = cameraSurfaceTexture.getTimestamp();
long presentationTimeNanos;
// Get synchronized timestamp from recording pipeline if available
if (recordingPipeline != null) {
    // Convert synchronized timestamp (microseconds) to nanoseconds
    presentationTimeNanos = recordingPipeline.getSynchronizedVideoTimestamp(cameraTimestamp) * 1000L;
} else {
    // Fallback to camera timestamp (already in nanoseconds)
    presentationTimeNanos = cameraTimestamp;
}
EGLExt.eglPresentationTimeANDROID(eglDisplay, eglSurface, presentationTimeNanos);
```

```java
try {
    Thread.sleep(50);
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
```
Using `Thread.sleep()` to wait for the audio encoder to initialize is unreliable and can lead to race conditions, especially on slower devices or under heavy load. This can result in synchronization issues or failures.
A more robust approach would be to use a proper synchronization mechanism, such as a `CountDownLatch`, to signal when the audio encoder has produced its output format change and is ready.
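A minimal sketch of that latch-based handshake (all names here, such as `audioEncoderReadyLatch` and `onAudioOutputFormatChanged`, are illustrative placeholders, not code from this PR):

```java
// Illustrative sketch only; field and method names are hypothetical.
private final java.util.concurrent.CountDownLatch audioEncoderReadyLatch =
        new java.util.concurrent.CountDownLatch(1);

// Called from the audio drain loop on MediaCodec.INFO_OUTPUT_FORMAT_CHANGED:
private void onAudioOutputFormatChanged(MediaFormat format) {
    audioTrackFormat = format;          // stash the format for the muxer
    audioEncoderReadyLatch.countDown(); // signal "audio encoder is ready"
}

// In the recording start path, instead of Thread.sleep(50):
try {
    if (!audioEncoderReadyLatch.await(500, java.util.concurrent.TimeUnit.MILLISECONDS)) {
        Log.w(TAG, "Audio encoder not ready within timeout; starting anyway");
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
```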
```java
/**
 * Initializes video timestamp synchronization when the first video frame arrives.
 * This should be called from GLWatermarkRenderer when setting presentation time.
 */
public void initializeVideoTimestamp(long cameraTimestampNanos) {
```
The Javadoc for `initializeVideoTimestamp` states it should be called from `GLWatermarkRenderer`, but it's actually a helper method called by `getSynchronizedVideoTimestamp`. To improve encapsulation and prevent potential misuse, this method should be `private`.
Additionally, the Javadoc should be updated to reflect its actual usage:
```java
/**
 * This is called internally by getSynchronizedVideoTimestamp.
 */
private void initializeVideoTimestamp(long cameraTimestampNanos) {
```

```java
    }
}

// -------------- Fix Start for this method(emergencyDrainEncoders)-----------
```
- Fix timestamp conversion in GLWatermarkRenderer by properly handling nanoseconds
- Make initializeVideoTimestamp private as it's only used internally
- Remove incorrect encoder complexity configuration
- Add segment splitting logic when size limit is reached