Fix DataInputStream.readUTF() panic on non-UTF-8 (EUC-KR) input #162
Open
mirusu400 wants to merge 1 commit into
Pull request overview
Fixes a crash in java.io.DataInputStream.readUTF() when input bytes are not valid UTF-8 (observed with EUC-KR-encoded WIPI game data), by avoiding an unconditional unwrap() and adding an EUC-KR decode fallback.
Changes:
- Replace `String::from_utf8(...).unwrap()` with a match that falls back to `encoding_rs::EUC_KR` decoding on UTF-8 errors.
- Update the in-code TODO/comment to reflect the new fallback behavior.
Comment on lines +188 to +191

    // TODO handle modified utf-8 (EUC-KR fallback)
    let string = match RustString::from_utf8(buf) {
        Ok(x) => x,
        Err(e) => {

Comment on lines +192 to +194

            let bytes = e.into_bytes();
            let (decoded, _, _) = encoding_rs::EUC_KR.decode(&bytes);
            decoded.into_owned()

Comment on lines +189 to +196

    let string = match RustString::from_utf8(buf) {
        Ok(x) => x,
        Err(e) => {
            let bytes = e.into_bytes();
            let (decoded, _, _) = encoding_rs::EUC_KR.decode(&bytes);
            decoded.into_owned()
        }
    };
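Games such as [KTF] 타워크래프트 (Tower Craft) crash as shown below. The bytes [186, 243, 32, 189, 189, 183, 212] are "빈 슬롯" ("empty slot") when decoded as EUC-KR, but they are being force-decoded as UTF-8, which causes the error.

The official documentation says the format is modified UTF-8, but I suspect it is really just UTF-8 with an EUC-KR fallback. Since there is no function other than readUTF for reading strings, I did not think simply throwing an exception would solve the problem, so I added fallback logic for now. I have confirmed that it works, so I am submitting this PR.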
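
For reference, here is a minimal std-only sketch of the fallback pattern described above. The names `decode_with_fallback` and `fallback` are hypothetical and exist only for illustration; in this PR the fallback step is `encoding_rs::EUC_KR.decode`, which is replaced here by a caller-supplied closure so the snippet needs no external crate.

```rust
/// Try strict UTF-8 first; on failure, hand the untouched bytes to `fallback`.
/// Hypothetical helper for illustration, not the function changed in this PR.
fn decode_with_fallback<F>(buf: Vec<u8>, fallback: F) -> String
where
    F: FnOnce(&[u8]) -> String,
{
    match String::from_utf8(buf) {
        // Valid UTF-8: the buffer is reused as the String's storage.
        Ok(s) => s,
        // Invalid UTF-8: `FromUtf8Error::into_bytes` gives back the original
        // buffer, so nothing is lost before the fallback decoder runs.
        Err(e) => fallback(&e.into_bytes()),
    }
}
```

With the bytes from the crash report, `String::from_utf8` fails immediately because 186 (0xBA) is a continuation byte and cannot start a UTF-8 sequence, so the closure receives all seven bytes. Passing `|b| encoding_rs::EUC_KR.decode(b).0.into_owned()` as the closure would mirror the behavior of the change in this PR, assuming the `encoding_rs` crate is available.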