Open
Description
Thanks to the nodegit maintainers for this excellent library!
This issue was debugged with the assistance of Cursor and Opus 4.5.
Current Behavior
DiffLine.rawContent() returns a JavaScript string, but the underlying libgit2 git_diff_line.content is a raw byte pointer (const char *) that is not NUL-terminated and may hold content in a non-UTF-8 encoding (e.g., GBK, GB18030).
The current implementation in lib/diff_line.js:
var _rawContent = DiffLine.prototype.content; // Save original native method
DiffLine.prototype.content = function() {
// ...
this._cache.content = Buffer.from(this.rawContent())
.slice(0, this.contentLen())
.toString("utf8");
return this._cache.content;
};
DiffLine.prototype.rawContent = function() {
return _rawContent.call(this); // Calls native binding
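};
The problem is that _rawContent (the native binding) already converts the const char * to a JavaScript string, presumably using v8::String::NewFromUtf8() or similar, which assumes UTF-8 encoding. If so, any bytes that are not valid UTF-8 are lost during that conversion, and Buffer.from(this.rawContent()) in content() cannot recover them.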
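The data loss can be reproduced in plain Node.js without nodegit. This is only a sketch: Buffer#toString("utf8") stands in for the native binding's conversion, which is an assumption about how the binding behaves, not something confirmed from its source.

```javascript
// Bytes for "中文" encoded as GBK; this sequence is not valid UTF-8.
const original = Buffer.from([0xd6, 0xd0, 0xce, 0xc4]);

// Stand-in for the native binding (assumption): decode the bytes as UTF-8.
// Invalid sequences are replaced with U+FFFD.
const asString = original.toString("utf8");

// What content() does next: re-encode the string as UTF-8 bytes.
const roundTripped = Buffer.from(asString);

console.log(asString.includes("\uFFFD"));   // true
console.log(roundTripped.equals(original)); // false
```

Since U+FFFD re-encodes to three bytes in UTF-8, byte offsets shift as well, so slicing the re-encoded buffer with contentLen() may no longer line up with the end of the original line.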
Expected Behavior
rawContent() should return a Buffer containing the original bytes, allowing users to detect and decode the encoding themselves:
DiffLine.prototype.rawContent = function() {
// Return Buffer instead of string
return _rawContent.call(this); // Should return Buffer
};
DiffLine.prototype.content = function() {
// ... existing implementation
return this.rawContent()
.slice(0, this.contentLen())
.toString("utf8");
};
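With a Buffer in hand, a caller could handle decoding on their own side. The sketch below is one possible approach, not a proposed nodegit API: decodeLine is a hypothetical helper, the gb18030 fallback is an arbitrary choice for illustration, and it assumes a Node.js build with full ICU so that TextDecoder supports that encoding.

```javascript
// Hypothetical helper (not part of nodegit): decode a line's raw bytes,
// trying strict UTF-8 first and falling back to a caller-chosen encoding.
function decodeLine(bytes, fallbackEncoding = "gb18030") {
  try {
    // fatal: true makes the decoder throw on invalid UTF-8
    // instead of silently inserting U+FFFD.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (err) {
    // Assumption: the fallback encoding is available in this Node.js build.
    return new TextDecoder(fallbackEncoding).decode(bytes);
  }
}

// Valid UTF-8 passes through unchanged.
console.log(decodeLine(Buffer.from("hello", "utf8"))); // hello

// GBK bytes for "中文" fail strict UTF-8 decoding and take the fallback path.
console.log(decodeLine(Buffer.from([0xd6, 0xd0, 0xce, 0xc4])));
```

If rawContent() returned a Buffer as proposed, usage might look like decodeLine(line.rawContent().slice(0, line.contentLen())).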