-
Notifications
You must be signed in to change notification settings - Fork 4k
GH-48721: [C++] Add test for file creation with UTF-8 filenames #48722
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
| // Test that file operations work with valid UTF-8 filenames. | ||
| // On Windows, PlatformFilename::FromString() converts UTF-8 strings to wide strings. | ||
| // On Unix, filenames are treated as opaque byte strings. | ||
| std::string utf8_file_name = "test_file_한국어_😀.txt"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
한국어 is "Korean" in Korean FYI .. :-)..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the emoticon Korean too? :)
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks contributing this @HyukjinKwon . Looks good in general, just one comment.
Also, can you please trim down the PR description?
|
I made the PR description shorter. Hopefully this one is easier to follow. |
|
Ah, it's not the related test failure. |
Rationale for this change
4937d9f (ARROW-5102) added the TODO comment requesting a test with valid UTF-8 filenames. Later, the UTF-8 to UTF-16 conversion logic on Windows was introduced in commit eb23ea9 (ARROW-5648) which should fix the issue.
Essentially we should add a test for:
arrow/cpp/src/arrow/util/io_util.cc
Lines 143 to 149 in 727106f
StringToNative()). This test complements existingFileNameWideCharConversionRangeExceptiontest (invalid UTF-8).What changes are included in this PR?
This PR adds the test described above.
Are these changes tested?
Unittest was added.
Are there any user-facing changes?
No, test-only.