I'm using yt-dlp to download music. It keeps failing when it tries to create a file, and the same problem happens when I try to touch a file with the same name, so I don't think it's a yt-dlp problem:
$ touch 'Summer - Hisaishi Joe | 오보에 oboe 피아노 piano 커버영상(cover) | 여름에 듣기 좋은 노래 | 전북예술문화원 방구석 콘서트 '
touch: cannot touch 'Summer - Hisaishi Joe | 오보에 oboe 피아노 piano 커버영상(cover) | 여름에 듣기 좋은 노래 | 전북예술문화원 방구석 콘서트 ': Operation not supported
I think this might be a ZFS issue, because it works fine on my hard drive. I know I have write permissions, and I don't think it's the length, because trying to create a file with a longer name results in a different error, 'File name too long'. But I'm also not sure what the issue is, since I can make a file with either half of the name, so I don't think it's an invalid character.
Last edited by Bobthebadguy (2023-10-10 22:42:44)
Mod note: moving to AUR Issues
I'd be more concerned about the pipes and the trailing space.
Try
touch '오보에.test'
touch '오보에.test'
Works fine. Removing all the pipes and the trailing space still does not work.
What I don't understand is that this works fine:
$ touch 'Summer - Hisaishi Joe | 오보에 oboe 피아노 piano 커버영상(cover)'
$ touch '(cover) 여름에 듣기 좋은 노래 전북예술문화원 방구석 콘서트'
To be clear, the problem creating the file only happens on my SSD with ZFS, not on the drive with ext4.
Last edited by Bobthebadguy (2023-10-11 20:51:18)
The original attempt is 159 bytes, the segments are 89 and 77 bytes…
Can you try:
touch 'Summer - 오보에 피아노 커버영상 여름에 듣기 좋은 노래 전북예술문화원 방구석 콘서트' # 116 bytes
touch 'Summer - 오보에 피아노 커버영상 여름에 듣기 좋은 노래 전북예술문화원' # 96 bytes
The longer one fails, but I was under the impression that ZFS had a maximum file name length of 255.
touch 'abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz'
That's 105 bytes, but it's also only ASCII.
Maybe ZFS uses UTF-16?
https://unix.stackexchange.com/question … les-e-g-fi
Edit: though your original name is < 255 bytes in UTF-16.
Last edited by seth (2023-10-11 21:02:21)
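For anyone who wants to check the UTF-8 versus UTF-16 question on their own names, here is a minimal Python sketch. `encoded_sizes` is a hypothetical helper made up for illustration, and the counts it prints cover the name only, so numbers obtained another way (for example through a shell pipeline that appends a newline) may differ slightly.

```python
def encoded_sizes(name: str) -> dict:
    """Return the byte length of `name` in UTF-8 and UTF-16."""
    return {
        "utf-8": len(name.encode("utf-8")),
        # "utf-16-le" avoids the 2-byte byte-order mark that "utf-16" prepends.
        "utf-16": len(name.encode("utf-16-le")),
    }


ascii_name = "abcdefghijklmnopqrstuvwxyz" * 4
hangul_name = "오보에"

# ASCII takes 1 byte per character in UTF-8 and 2 in UTF-16.
print(encoded_sizes(ascii_name))
# Precomposed Hangul syllables take 3 bytes each in UTF-8 and 2 in UTF-16.
print(encoded_sizes(hangul_name))
```

So for Hangul-heavy names UTF-16 is actually the shorter encoding, which matches the edit above.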
If the file name is too long, how come touch doesn't say so, but it does when the file name is even longer?
Last edited by Bobthebadguy (2023-10-11 21:24:37)
Because it gets a bogus error from the FS. Did you try the ASCII name?
Do you use "utf8only"?
utf8only is on. I'm not sure what you mean by the ASCII name.
Edit: Never mind. It works fine.
touch 'abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz'
Last edited by Bobthebadguy (2023-10-11 21:28:57)
My best but completely uninformed guess is that ZFS uses some encoding less efficient than UTF-8 to store the filename and the CJK glyphs take more than 3 bytes.
touch 'αβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζ' # 201 bytes in UTF-8
That last one worked.
It looks like the solution for yt-dlp is either to use the --trim-filenames option, which isn't great because it counts characters and I would need to limit it to 84, or to use the output template from here, which I need to set to -o "%(title).134B.%(ext)s". But this makes even less sense, because the 84-character file name is 142 bytes. Alternatively, I can just download this with a different file name, but this is all leaving me a little confused.
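A byte limit rather than a character limit, which is what the post above is after, can be sketched in plain Python as follows. This only illustrates the general idea and is not yt-dlp's actual code; `truncate_utf8` is a hypothetical helper.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut `text` so that its UTF-8 encoding is at most `max_bytes` long."""
    encoded = text.encode("utf-8")[:max_bytes]
    # errors="ignore" drops a multi-byte character that the slice cut in half,
    # so the result is always valid UTF-8 (and may be a little under the limit).
    return encoded.decode("utf-8", errors="ignore")


title = "Summer - Hisaishi Joe | 오보에 oboe 피아노 piano"
short = truncate_utf8(title, 29)
print(short, len(short.encode("utf-8")))
```

Note that the budget here is measured on the bytes as given; if the filesystem rewrites the name before storing it, the stored form can still be longer.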
But this makes even less sense because the 84 character file name is 142 bytes
The only way I can make sense of this is if ZFS doesn't store/encode filenames in either UTF-8 or UTF-16 and CJK takes up 4 bytes or more per glyph.
Can you try:
touch 'αβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζ' # 301 bytes in UTF-8 but only 157 in latin1
That returns
touch: cannot touch 'αβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζαβcδεφγηιθκλμνοπχρστυvωξψζ': File name too long
instead of 'Operation not supported'.
It would be very interesting to strace the two failing touch invocations to see which call fails in each case.
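Short of a full strace, a rough way to see which error code comes back in each case is to attempt the create from Python and print the errno name. This is only a sketch: `create_errno` is a hypothetical helper, it does not necessarily issue the same system call as touch, and the temporary directory is a stand-in that would need to be replaced by a directory on the ZFS dataset.

```python
import errno
import os
import tempfile
from typing import Optional


def create_errno(directory: str, name: str) -> Optional[str]:
    """Try to create `name` in `directory`; return the errno name on failure."""
    path = os.path.join(directory, name)
    try:
        # O_EXCL makes sure we never open (and later delete) an existing file.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    os.remove(path)  # clean up the probe file
    return None


# Stand-in directory; point this at the ZFS dataset to reproduce the problem.
with tempfile.TemporaryDirectory() as tmp:
    print(create_errno(tmp, "short.test"))
    print(create_errno(tmp, "a" * 300))
```

'File name too long' corresponds to ENAMETOOLONG, while 'Operation not supported' would show up as EOPNOTSUPP or ENOTSUP.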
Well, I think I figured it out. If my understanding is right, this is because normalization is set to formD, so names get decomposed, which increases their length. This is why
Summer - Hisaishi Joe | 오보에 oboe 피아노 piano 커버영상(cover) | 여름에 듣기 좋은 노래 | 전북예술문화원 .mkv
works, being 253 bytes once decomposed, while the full file name is too long. I used this site to decompose the names. Changing normalization requires creating a new dataset, so I'll just ignore this problem with one file instead.
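The effect of decomposition can also be checked locally instead of with a website. The sketch below uses Python's standard `unicodedata` module; `utf8_len` and `fits_after_nfd` are hypothetical helpers, and the 255-byte limit is an assumption taken from the figure mentioned earlier in the thread.

```python
import unicodedata
from typing import Optional

NAME_MAX = 255  # assumed per-name byte limit, as mentioned earlier in the thread


def utf8_len(name: str, form: Optional[str] = None) -> int:
    """UTF-8 byte length of `name`, optionally after Unicode normalization."""
    if form is not None:
        name = unicodedata.normalize(form, name)
    return len(name.encode("utf-8"))


def fits_after_nfd(name: str, limit: int = NAME_MAX) -> bool:
    """True if the decomposed (NFD) form of `name` fits within `limit` bytes."""
    return utf8_len(name, "NFD") <= limit


sample = "오보에"
print(utf8_len(sample, "NFC"))  # precomposed syllables, 3 bytes each
print(utf8_len(sample, "NFD"))  # each syllable split into 2 or 3 jamo, 3 bytes per jamo
print(fits_after_nfd(sample))
```

Each Hangul syllable decomposes into two or three conjoining jamo of 3 bytes each, so a Korean name can double or triple in size under NFD, while ASCII text is unchanged.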