I am confused about the VFAT mount options regarding file names. The kernel docs suggest using the utf8 option instead of iocharset=utf8, and when using iocharset=utf8 anyway, one gets a kernel warning:
utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
So it seems the iocharset is also used for case mapping (although that warning appears to be wrong as well, I can still access files using different casing).
The default VFAT mount options nowadays are apparently:
rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro
If the iocharset were used for case mapping, I'd expect non-ASCII file names such as "Ä" and "ä" to be treated as different names and not case-mapped. But again, apparently that is not the case: with these default options I can create a file "ä" and then open it via "Ä". So it is a complete mystery what the iocharset does and what is actually used for case mapping. Maybe the codepage? Or some Unicode case mapping, but then which version?
According to this interesting blog post, other case-insensitive file systems like NTFS store the case-mapping data within the file system or reference the Unicode version used, but this is clearly not the case for VFAT...
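For anyone who wants to repeat this kind of test, here is a minimal Python sketch of the experiment. It creates a file under one name and checks whether a second spelling resolves to the same file. `names_alias` is a hypothetical helper name of mine, and `/mnt` in the usage note is just an assumed mount point.

```python
import os


def names_alias(directory: str, created: str, probed: str) -> bool:
    """Create `created` in `directory` and report whether `probed` resolves to it.

    The test file is removed again before returning.
    """
    path_created = os.path.join(directory, created)
    path_probed = os.path.join(directory, probed)
    # "x" fails if the file already exists, so existing data is never touched.
    with open(path_created, "x"):
        pass
    try:
        if not os.path.exists(path_probed):
            return False
        # samefile compares device and inode, not the spelling of the path.
        return os.path.samefile(path_created, path_probed)
    finally:
        os.remove(path_created)
```

On a mounted VFAT volume this would be called as `names_alias("/mnt", "ä", "Ä")`. Running it once with a short name and once with a name longer than eight characters separates the two cases.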
Offline
https://man.archlinux.org/man/mount.8#M … ns_for_fat
https://man.archlinux.org/man/mount.8#M … s_for_vfat
You have iocharset=ascii; the "utf8" there means something different
Offline
https://man.archlinux.org/man/mount.8#M … ns_for_fat
https://man.archlinux.org/man/mount.8#M … s_for_vfat
You have iocharset=ascii; the "utf8" there means something different
I appreciate the feedback, but I'm not sure what you are trying to tell me. From what I can tell you are wrong, the "utf8" there does not mean something different, but overrides the iocharset, at least to some extent. When I disable the utf8-option, but leave the iocharset=ascii, non-ascii file names are displayed as "?" (as expected). So clearly the utf8 option influences what the iocharset is supposed to do according to those man pages, i.e. facilitate the conversion between the on-disk 16 bit unicode and (unix/linux) 8-bit file names.
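To make the "?" observation concrete, here is a rough Python model of the conversion the man page describes. This is only a sketch: Python's codecs stand in for the kernel's NLS tables, and `to_userspace` is a made-up name, not anything from the fat module.

```python
def to_userspace(on_disk_utf16le: bytes, charset: str) -> bytes:
    """Convert a UTF-16LE long name into the 8-bit form user space would see.

    Characters the target charset cannot represent are replaced by "?",
    which mirrors what I see with iocharset=ascii and no utf8 option.
    """
    name = on_disk_utf16le.decode("utf-16-le")
    return name.encode(charset, errors="replace")


stored = "ä".encode("utf-16-le")         # the name as 16-bit Unicode
print(to_userspace(stored, "ascii"))      # no "ä" in ASCII
print(to_userspace(stored, "iso8859-1"))  # single byte
print(to_userspace(stored, "utf-8"))      # two bytes
```

In this model, the utf8 option would correspond to always picking "utf-8" as the target, whatever the iocharset says. That is my reading of the observed behavior, not something I have confirmed in the source.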
Edit: And as I also said, the kernel docs clearly recommend using the utf8 option as a replacement for iocharset=utf8, so those options are obviously closely related.
Last edited by zse (2025-07-25 21:23:09)
Offline
"at least to some extent"
https://bbs.archlinux.org/viewtopic.php?id=306137 - but indeed both variants are case insensitive, despite the warning. Probably a bug in the fat module?
Edit: might hinge on
check=s|r|n
    Case sensitivity checking setting.
    s: strict, case sensitive
    r: relaxed, case insensitive
    n: normal, default setting, currently case insensitive
nocase
    This was deprecated for vfat. Use shortname=win95 instead.
shortname=lower|win95|winnt|mixed
    Shortname display/create setting.
Edit #2:
plus this actually only deals w/ long names, but afaiu you tried "ä"?
Try "ääääääääääääääää" vs "Ääääääääääääääää"
Last edited by seth (2025-07-25 22:06:56)
Offline
Edit #2:
plus this actually only deals w/ long names, but afaiu you tried "ä"?
Try "ääääääääääääääää" vs "Ääääääääääääääää"
Hm, you are on to something: those long names are indeed not matched up. However, they are matched with neither iocharset=utf8 nor the plain utf8 option, so that does not explain why one would prefer one over the other... what a mess this is
Edit: just tested what happens with the mount options "-o utf8,iocharset=iso8859-1" since that charset includes the ä, but they are still not matched up, so I just don't know. Is FAT only supposed to be case insensitive for file names that fit in 8.3? Surely not, right?
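One model that would at least be consistent with these results is a byte-wise comparison that only folds ASCII letters. The Python sketch below illustrates that hypothesis. I have not checked it against the fat module, and the function names are mine.

```python
def fold_ascii_only(raw: bytes) -> bytes:
    """Lowercase the bytes A-Z and leave every other byte untouched."""
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in raw)


def same_name_ascii_fold(a: str, b: str) -> bool:
    """Compare two names as UTF-8 byte strings with ASCII-only folding."""
    return fold_ascii_only(a.encode("utf-8")) == fold_ascii_only(b.encode("utf-8"))


print(same_name_ascii_fold("README", "readme"))
print(same_name_ascii_fold("Ä" + "ä" * 15, "ä" * 16))
```

Under this assumption "Ä" (bytes C3 84) and "ä" (bytes C3 A4) can never compare equal, because neither byte is an ASCII letter, while plain ASCII names still match regardless of case.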
Last edited by zse (2025-07-25 22:25:40)
Offline
iocharset is from fat, utf8 is a vfat option - they do similar things, but differently.
The warning you're getting is from the fat module which is included by vfat, but you're not mounting fat (try "-t msdos") but vfat.
I cannot even mount a loop device (image) as msdos, though, to check whether iocharset behaves differently there.
The only case handling concern is the old (fat only) 8.3 dos filenames - those are strictly case insensitive and there're various ways to address that (shortname and strategies) when representing them (eg. you can prevent collisions so that "touch Ä" fails if there's a file "ä" because you're trying to address the same file differently)
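A simplified Python sketch of why "ä" and "Ä" could end up addressing the same 8.3 entry: both are upper-cased and encoded in the codepage before they are stored. This leaves out most of the real short-name rules (illegal characters, padding, numeric tails), and `short_name_key` is a hypothetical helper, not kernel code.

```python
from typing import Optional, Tuple


def short_name_key(name: str, codepage: str = "cp437") -> Optional[Tuple[bytes, bytes]]:
    """Return the upper-cased (base, extension) bytes if `name` fits 8.3, else None."""
    base, dot, ext = name.rpartition(".")
    if not dot:  # no dot at all: the whole name is the base
        base, ext = ext, ""
    if " " in name or "." in base:
        return None
    try:
        base_raw = base.upper().encode(codepage)
        ext_raw = ext.upper().encode(codepage)
    except UnicodeEncodeError:
        return None  # not representable in the codepage, so no plain 8.3 form
    if not 1 <= len(base_raw) <= 8 or len(ext_raw) > 3:
        return None
    return base_raw, ext_raw


print(short_name_key("ä"), short_name_key("Ä"))  # same key for both spellings
print(short_name_key("ä" * 16))                  # too long for 8.3
```

In codepage 437 the upper-case "Ä" is the single byte 0x8E, so both spellings collapse to the same key, while the 16-character name has no 8.3 form and has to be handled as a long name.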
The long names can be case sensitive but be aware that win32 is case insensitive, so collisions there might get you into trouble on windows.
If this isn't purely academic, leave it at the defaults, don't try to use msdos as mount type and for OS compatibility avoid case-sensitive filename collisions.
Offline
Yeah, I don't really have an issue, just morbid curiosity
The man pages don't really make a lot of sense though, as the description of the iocharset option from the fat section says "Character set to use for converting between 8 bit characters and 16 bit Unicode characters", but only vfat has Unicode characters, the 8.3 names are encoded in the codepage encoding, so that option should be rather exclusive to the vfat section as well.
Then I suppose no one really knows what these different options actually do precisely. Very weird, oh well.
Offline
If you have a FAT32 block device, can you mount it with "-t msdos -o defaults,iocharset=utf8" and check its behavior?
Offline
# mount -t msdos /dev/vdb1 /mnt
# mount | grep msdos
/dev/vdb1 on /mnt type msdos (rw,relatime,fmask=0022,dmask=0022,codepage=437,errors=remount-ro)
# umount /mnt
# mount -t msdos -o defaults,iocharset=utf8 /dev/vdb1 /mnt
mount: /mnt: fsconfig system call failed: msdos: Unknown parameter 'iocharset'.
dmesg(1) may have more information after failed mount system call.
Edit: I also tested on Windows, there the file names "äääääääää" with any capitalization are all treated as the same file, so Linux does not accurately emulate the Windows case mapping, just fyi.
Last edited by zse (2025-07-26 17:57:53)
Offline
there the file names "äääääääää" with any capitalization are all treated as the same file
be aware that win32 is case insensitive, so collisions there might get you into trouble on windows … for OS compatibility avoid case-sensitive filename collisions.
The problem isn't the Windows kernel; rather, the entire Windows userspace is not case sensitive.
https://man.archlinux.org/man/mount.8#M … ns_for_fat - iocharset should™ not be vfat specific, but
what a mess this is
and everything but vfat has fallen out of relevance decades ago - so this has probably never been encountered by anyone for a very long time…
Offline
everything but vfat has fallen out of relevance decades ago - so this has probably never been encountered by anyone for a very long time…
Right, but at least vfat is still relevant, and it would be kind of nice to know how to get maximum Windows interop compatibility for it under Linux, that's why I was looking for the mount options that are best for that.
And for that, my original question of what the iocharset and utf8 options do precisely (for vfat) and how they interact still remains quite unclear. Alas.
Offline
long filenames are stored on fat as utf-16, there's no compatibility concern. Windows operates on utf-16 anyway.
Both utf8 approaches deal w/ converting long names between your locale (your system will hopefully use utf-8, the de facto standard for decades) and utf-16
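As a small illustration of that conversion, the Python sketch below takes a name as user space sees it and reports its size in UTF-8 bytes, in UTF-16 code units, and in long-name directory entries, assuming the usual VFAT layout of 13 UTF-16 code units per long-name entry. `lfn_layout` is a made-up helper name.

```python
import math
from typing import Tuple

UNITS_PER_LFN_ENTRY = 13  # UTF-16 code units held by one long-name directory entry


def lfn_layout(name: str) -> Tuple[int, int, int]:
    """Return (utf8_bytes, utf16_units, lfn_entries) for a file name."""
    utf8_bytes = len(name.encode("utf-8"))
    utf16_units = len(name.encode("utf-16-le")) // 2
    entries = math.ceil(utf16_units / UNITS_PER_LFN_ENTRY)
    return utf8_bytes, utf16_units, entries


print(lfn_layout("ä" * 16))
```

The on-disk form is the same whichever locale encoding the name came from. Only the 8-bit side of the conversion depends on the mount options.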
Leave it as the defaults, this is only ever relevant when dealing w/ old data that was written differently.
This has nothing to do w/ the main compatibility concerns:
1. windows userspace for the most part is case insensitive, but there're no such restrictions on either fat (long names) or ntfs
2. illegal pathname characters differ between windows and linux, see eg https://stackoverflow.com/questions/197 … tory-names
You'll have to avoid 1 manually (all the restrictions deal w/ the 8.3 names which are case insensitive on fat) but I'm pretty sure posix-only filenames are denied for vfat (for the various ntfs implementations you might get different behavior and options)
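For concern 2, a name can be checked before it is copied over. The Python sketch below follows the Windows naming conventions as I know them (reserved characters, control characters, trailing space or dot, device names). It is not what the vfat driver itself enforces, and `windows_name_problems` is a hypothetical helper.

```python
import re
from typing import List

_RESERVED_CHARS = set('<>:"/\\|?*')
_DEVICE_NAMES = re.compile(
    r"^(CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])(\..*)?$", re.IGNORECASE
)


def windows_name_problems(name: str) -> List[str]:
    """List the reasons why `name` would be troublesome as a Windows file name."""
    problems = []
    bad = sorted({c for c in name if c in _RESERVED_CHARS or ord(c) < 32})
    if bad:
        problems.append(f"reserved characters: {bad!r}")
    if name.endswith((" ", ".")):
        problems.append("ends with a space or dot")
    if _DEVICE_NAMES.match(name):
        problems.append("reserved device name")
    return problems


print(windows_name_problems("ab:c"))
print(windows_name_problems("readme.txt"))
```

An empty list means the name passed these particular checks. It says nothing about case collisions, which are concern 1.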
Maximum compatibility: zip -0 into an 8.3 ascii file.
Offline
long filenames are stored on fat as utf-16, there's no compatibility concern. Windows operates on utf-16 anyway.
Both utf8 approaches deal w/ converting long names between your locale (your system will hopefully use utf-8, the de facto standard for decades) and utf-16
Leave it as the defaults, this is only ever relevant when dealing w/ old data that was written differently.
This has nothing to do w/ the main compatibility concerns:
If the man pages are correct, and the iocharset is only used for utf-16 to local charset conversion, you are right. However, the kernel warning suggests that the iocharset is also used for case insensitivity, in which case it does have something to do with the following concerns.
Edit: And if you are right, I see no logical explanation how iocharset=utf8 could be different from the utf8 option.
You'll have to avoid 1 manually (all the restrictions deal w/ the 8.3 names which are case insensitive on fat) but I'm pretty sure posix-only filenames are denied for vfat (for the various ntfs implementations you might get different behavior and options)
Linux clearly implements some form of case insensitivity for vfat per default (even for long names, at least when they are ASCII only), I presume to prevent users from creating files that cannot be read by Windows. But since the default case mapping in Linux is different from the one used by Windows, I was thinking there may be some different mount options that would make it the same. After all, why bother implementing insensitivity otherwise. And for NTFS there even is an option to prevent users from creating Windows reserved file names, so it seems at least some Linux kernel devs care for protecting users from those potential issues automatically.
Last edited by zse (Yesterday 11:12:07)
Offline
Warning:
https://github.com/torvalds/linux/blob/ … de.c#L1574
Caveat:
https://github.com/torvalds/linux/blob/ … de.c#L1786
Difference:
https://github.com/torvalds/linux/blob/ … fat.c#L599
https://github.com/torvalds/linux/blob/ … fat.c#L517
Linux clearly implements some form of case insensitivity for vfat per default (even for long names, at least when they are ASCII only)
Yup, seems so (likely because the utf8 conversion is inert)
But since the default case mapping in Linux is different from the one used by Windows, I was thinking there may be some different mount options that would make it the same.
See the "shortname=lower|win95|winnt|mixed" option, case mapping should™ not be a thing on long names and the files just maintain their original CamelCase?
And for NTFS there even is an option to prevent users from creating Windows reserved file names, so it seems at least some Linux kernel devs care for protecting users from those potential issues automatically.
I'm pretty sure posix-only filenames are denied for vfat
touch 'ab:c' # this is supposed to simply fail on vfat
https://en.wikipedia.org/wiki/File_Allocation_Table
It doesn't perfectly apply, but https://imgs.xkcd.com/comics/dependency.png
All of this is working around a poorly designed not-filesystem from the seventies and 237¼ kB 8" floppies to allow the use of umlauts and sentence_long_file.names - it's ugly.
Offline
See the "shortname=lower|win95|winnt|mixed" option, case mapping should™ not be a thing on long names and the files just maintain their original CamelCase?
For vfat long names, both Windows and Linux always store and preserve the case as given from user space exactly when creating new files, that is not the issue. But the case mapping is still important for determining whether there is already an existing file that has the same (modulo case mapping) name. If the case mapping algorithms are not the same, one may allow you to create a file that would be illegal under the other algorithm.
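To illustrate the point, here is a Python sketch that groups names by a pluggable case-folding function. Python's `str.casefold` is only a stand-in for whatever table Windows actually uses, and the ASCII-only fold is my assumption about a weaker algorithm. Both are illustrations, not the real implementations.

```python
from collections import defaultdict
from typing import Callable, Hashable, Iterable, List


def find_case_collisions(
    names: Iterable[str], fold: Callable[[str], Hashable] = str.casefold
) -> List[List[str]]:
    """Group names that become identical under `fold`; return groups of two or more."""
    groups = defaultdict(list)
    for name in names:
        groups[fold(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


names = ["ä" * 9, "Ä" + "ä" * 8, "readme"]
# Unicode-aware folding: the two umlaut names collide.
print(find_case_collisions(names))
# ASCII-only folding (bytes.lower touches only A-Z): no collision is detected.
print(find_case_collisions(names, fold=lambda s: s.encode("utf-8").lower()))
```

A directory that looks collision-free under the weaker fold can therefore still contain names that the stronger fold treats as the same file, which is exactly the mismatch I am worried about.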
All of this is working around a poorly designed not-filesystem from the seventies and 237¼ kB 8" floppies to allow the use of umlauts and sentence_long_file.names - it's ugly.
No argument from me there.
Offline
The 8.3 names are strictly case insensitive, the long names are strictly case sensitive.
There's no legality issue, this is fine w/ the filesystem. It's just that (parts of, though I guess "most") windows won't be able to differentiate/access them.
https://wiki.gentoo.org/wiki/FAT/en#Case_sensitivity has some more details/explanations (notably 8.3 case resolution)
The utf8 function uses https://elixir.bootlin.com/linux/v6.15. … ase.c#L132; the other variant seems to mostly special-case ":" … *shrug*
Offline