On today's Byte Size, I want to look at why DOS has 8.3 file name constraints, and how
we transitioned to long file names.
We're used to assigning pretty much whatever names we like to files in our current operating
systems.
If we're feeling crazy, we can even include spaces and special characters in our Word
documents, spreadsheets, even executables!
But this hasn't always been the case, and indeed, even today, I still tend to save my
files adhering to the limited conventions of Microsoft DOS.
I mean, who uses spaces in file names??
I'd rather burn my own feet than follow such an off-the-chain practice!
So, it's clear we're talking limitations here.
Limitations of early hardware, and the software which ran on it.
As these wares have evolved, our ability to have more has increased.
But of course, we need to delve into the underlying reasons, so for that we begin with FAT.
The directory and FAT are a team in locating files.
The directory tells you what the names of your files are, and the FAT tells you where
the file is located.
The FAT immediately follows the DOS Boot Record on a disk - it always begins on DOS sector
number 1.
DOS then stores two copies of the FAT right next to each other, one for backup purposes,
and the FAT's size will vary according to how large the partition is.
The FAT is then immediately followed by the root directory.
Following this root, data is then stored on the disk.
Sub directory entries are also stored in this data region, whose paths and locations on
disk can all be traced from the root directory, providing the branching effect we're familiar
with.
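To make that layout a bit more concrete, here's a minimal sketch that works out which sector each region starts on. The figures plugged in are the ones usually quoted for a 360KB floppy (1 reserved sector, 2 FATs of 2 sectors each, 112 root entries); those numbers and the `fat_layout` name are my own assumptions for illustration.

```python
def fat_layout(reserved_sectors, fat_count, sectors_per_fat,
               root_entries, bytes_per_sector=512):
    """Return the first sector of each region on a FAT12/FAT16 volume."""
    fat_start = reserved_sectors                      # FAT follows the boot record
    root_start = fat_start + fat_count * sectors_per_fat
    root_bytes = root_entries * 32                    # 32 bytes per directory entry
    root_sectors = (root_bytes + bytes_per_sector - 1) // bytes_per_sector
    data_start = root_start + root_sectors            # file data comes last
    return {"fat": fat_start, "root_dir": root_start, "data": data_start}


# Assumed 360KB floppy parameters (illustrative only)
layout = fat_layout(reserved_sectors=1, fat_count=2,
                    sectors_per_fat=2, root_entries=112)
print(layout)  # {'fat': 1, 'root_dir': 5, 'data': 12}
```

With one boot sector in front, the FAT lands on sector 1, just as described.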
Within the root directory and each subsequent directory table we find directory entries,
each of which contains 32 bytes of information about its related file.
Versions of 86-DOS before 0.42 used entries of only 16 bytes, which supported files
no larger than 16MB and lacked a last modification date. The move to 32 bytes made room
for more of the crucial information about our file.
At the end of each entry we have the file size, which consumes 4 bytes.
We then have the starting cluster number, which in conjunction with FAT allows us to
determine where the file resides on disk.
2 bytes are then reserved for the time of last modification.
Another 2 bytes for the date.
10 bytes are then set aside for additional information and future expansion. This includes the creation
and access dates in DOS 7 onwards, but was also used for storing a password hash under
systems like DR DOS and Novell DOS.
1 byte for attributes is next - allowing for 8 attribute
bits, such as hidden, read only and archive. It can also be used to specify that the entry
points to a directory, rather than a file.
We are then left with 3 bytes for the file extension and 8 bytes for its name, right
at the start of each entry.
That gives us the 8.3 filename restriction DOS is known and loved for.
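As a rough sketch of that layout, here's how you might unpack one 32 byte entry in Python. The field order follows the description above, read from the start of the entry; the names `ENTRY` and `parse_entry`, the sample entry, and the choice to decode names as plain ASCII are my own assumptions (real disks can contain code page characters).

```python
import struct

# name(8) ext(3) attributes(1) reserved(10) time(2) date(2) cluster(2) size(4)
ENTRY = struct.Struct("<8s3sB10sHHHI")

ATTR_READ_ONLY = 0x01
ATTR_HIDDEN = 0x02
ATTR_SYSTEM = 0x04
ATTR_VOLUME_LABEL = 0x08
ATTR_DIRECTORY = 0x10
ATTR_ARCHIVE = 0x20


def parse_entry(raw):
    """Decode one 32 byte FAT directory entry into a dictionary."""
    name, ext, attr, _reserved, time, date, cluster, size = ENTRY.unpack(raw)
    name = name.decode("ascii").rstrip()   # names are space padded to 8
    ext = ext.decode("ascii").rstrip()     # extensions are space padded to 3
    return {
        "filename": f"{name}.{ext}" if ext else name,
        "is_directory": bool(attr & ATTR_DIRECTORY),
        "hidden": bool(attr & ATTR_HIDDEN),
        "read_only": bool(attr & ATTR_READ_ONLY),
        "start_cluster": cluster,
        "size": size,
    }


# A made-up entry, purely for illustration
sample = ENTRY.pack(b"README  ", b"TXT", ATTR_ARCHIVE, bytes(10), 0, 0, 2, 1234)
entry = parse_entry(sample)
print(ENTRY.size, entry["filename"], entry["size"])  # 32 README.TXT 1234
```

Note that the dot isn't stored anywhere: the 8 and the 3 simply sit side by side, and the dot is added back when the name is displayed.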
The absolute directory path was also limited to 66 characters, enforcing a sub folder depth
of 32 folders.
There were also restrictions on legal characters for the filename, many of which remain in
place today.
These include the asterisk and question mark symbols, which are used as search operators,
and although lowercase characters could be used, they were stored as uppercase in both
the FAT12 and FAT16 file systems.
In the early days of computing this 8.3 restriction seemed more than enough. It was a reasonable
settlement between disk space and descriptiveness.
After all, who would want to type out lengthy file names to open and manipulate files and
directories?
Like many characteristics of DOS, its origins really lie in the CP/M operating system,
but in a DOS sense it began with BASIC-86, an implementation of Microsoft BASIC created
for the Seattle Computer Products 8086 computer kit in 1979.
This implementation provided a standalone disk based language system and incorporated
an 8 bit FAT file system developed by Marc McDonald & Bill Gates, but SCP wanted an operating
system more like Digital Research's CP/M.
Tim Paterson was put on task to develop QDOS, which would evolve into 86-DOS. Part of this
involved extending the 8 bit FAT to a 12 bit FAT, and also increasing the 9 character limit
under BASIC-86 to 11 characters, in line with the 8.3 filenames supported under CP/M.
Microsoft then bought the rights from SCP and the rest is history.
Why CP/M enforced an 8.3 limit isn't recorded in specific detail, but fitting directory
entries into a nice tidy 32 bytes probably played a large part.
Also, many operating systems around at the time of its development, such as Intel ISIS,
used a 6.3 naming convention, mainly because earlier machines like the DEC PDP-10 utilised
36 bit words and 6 bit characters.
This allowed the 3 character extension to fit neatly into half a word, and the file
name to reside in a whole word.
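The arithmetic is easy to check: 6 characters at 6 bits each is exactly 36 bits, and 3 characters is 18. Here's a small sketch using DEC's SIXBIT code, where each character's value is its ASCII code minus 32; the `sixbit_word` helper is my own name, and this only illustrates the packing, not how any particular system laid out its directory.

```python
def sixbit_word(text):
    """Pack up to six characters into one 36 bit integer, space padded."""
    if len(text) > 6:
        raise ValueError("only six characters fit in a 36 bit word")
    word = 0
    for ch in text.upper().ljust(6):
        code = ord(ch) - 32            # SIXBIT covers ASCII 32..95
        if not 0 <= code < 64:
            raise ValueError(f"{ch!r} has no SIXBIT code")
        word = (word << 6) | code      # append 6 more bits
    return word


name_word = sixbit_word("FILNAM")      # a full word for the name
ext_half = sixbit_word("TXT") >> 18    # the extension fits in the top half word

print(name_word < 2**36, ext_half < 2**18)  # True True
```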
Given time had moved on and they weren't working to the same hardware limitations, expanding
on this by 2 probably seemed a good step in the right direction for Digital
Research, but this was still less than the 14 character limit on Unix.
Anyway, this is really speculation, so let's move on...
Although we're not limited to it, the 3 character extension still serves us well today.
But in the world of graphical environments, where all files can be quickly opened with
a double click, users demanded more description.
Microsoft implemented Long File Names in 1993 with Windows NT 3.1 and the NT file system,
otherwise known as NTFS. Don't get these icon titles confused with filenames, though;
these are just shortcut descriptions which link to the original file.
But to ensure backwards compatibility with existing DOS setups, some clever implementation
would be required for Windows 95.
This compatibility came with the introduction of Virtual FAT.
Like many things, IBM's OS/2 was already ahead of this change through the use of Extended
Attributes, which it stored in a separate hidden file.
VFAT's goal was to allow backwards compatibility with the traditional directory setup of FAT
by placing additional entries into the directory before each normal file entry.
These additional entries are marked with the volume label, system, hidden and read only
attributes, which is not a combination expected under usual DOS environments, and they are therefore
ignored.
Each of these entries can contain up to 13 characters by using various fields of the
original 32 byte entry size.
20 of these entries can be chained before each file, allowing a maximum file name length
of 255 characters.
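Here's a minimal sketch of how those chained entries fit together. It assumes the commonly documented VFAT layout, with 5 characters at bytes 1-10, 6 at bytes 14-25 and 2 at bytes 28-31, stored as UTF-16; the helper names are mine, and the checksum byte that ties each piece to its short name is left out for brevity.

```python
LFN_ATTR = 0x0F  # read only | hidden | system | volume label


def make_lfn_entry(seq, piece, last=False):
    """Build one 32 byte long-name entry holding up to 13 characters."""
    units = piece.encode("utf-16-le")
    if len(piece) < 13:
        units += b"\x00\x00"               # terminator after the final character
        units = units.ljust(26, b"\xff")   # unused slots are filled with 0xFFFF
    entry = bytearray(32)
    entry[0] = seq | (0x40 if last else 0)  # 0x40 flags the final piece
    entry[1:11] = units[0:10]
    entry[11] = LFN_ATTR
    entry[14:26] = units[10:22]
    entry[28:32] = units[22:26]
    return bytes(entry)


def lfn_chars(entry):
    """Pull the 13 character slots out of one long-name entry."""
    if len(entry) != 32 or entry[11] != LFN_ATTR:
        raise ValueError("not a long file name entry")
    return (entry[1:11] + entry[14:26] + entry[28:32]).decode("utf-16-le")


def join_lfn(entries):
    """Reassemble the long name from its chained entries."""
    ordered = sorted(entries, key=lambda e: e[0] & 0x1F)
    name = "".join(lfn_chars(e) for e in ordered)
    return name.split("\x00", 1)[0]        # drop terminator and padding


name = "A long file name.txt"
# On disk the last piece comes first, directly before the 8.3 entry
entries = [make_lfn_entry(2, name[13:], last=True),
           make_lfn_entry(1, name[:13])]
print(join_lfn(entries))  # A long file name.txt
```

20 entries of 13 characters gives 260 slots, so the 255 character limit sits comfortably inside the chain.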
Under DOS these files will be visible by the first 6 characters of their name, followed
by a tilde symbol and an integer to avoid duplication. If more than 9 files would end up with the
same short name, the last 3 characters are used for the tilde and a two digit number instead.
Extended characters allowed in Long File Names, such as plus, comma, semicolon and
equals, are converted to underscores, and spaces are simply ignored.
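Those rules can be sketched in a few lines. This is a simplified illustration rather than the exact routine Windows uses: the `short_name` function is hypothetical, it assumes the name really does need an alias, and it ignores edge cases such as names that begin with a dot.

```python
REPLACED = "+,;=[]"  # legal in long names, but not in 8.3 names


def _clean(part):
    """Uppercase, drop spaces, and swap unsupported characters for '_'."""
    part = part.upper().replace(" ", "")
    return "".join("_" if ch in REPLACED else ch for ch in part)


def short_name(long_name, existing=()):
    """Return an 8.3 alias for long_name that is not already in existing."""
    base, dot, ext = long_name.rpartition(".")
    if not dot:                          # no extension at all
        base, ext = long_name, ""
    base = _clean(base).replace(".", "")
    ext = _clean(ext)[:3]
    taken = {name.upper() for name in existing}
    for n in range(1, 100):
        suffix = f"~{n}"
        alias = base[:8 - len(suffix)] + suffix   # 6 chars + ~1, 5 chars + ~10
        if ext:
            alias += "." + ext
        if alias not in taken:
            return alias
    raise ValueError("no free alias found")


print(short_name("Summer holiday photos.txt"))   # SUMMER~1.TXT
print(short_name("My+budget 2024.xlsx"))         # MY_BUD~1.XLS
```

Pass in the names already in the directory and the number climbs until a free alias turns up.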
If you choose to fire up a command prompt in Windows 95 or 98 then you'll be able to
see both long and short file names. However, if the file name or directory has a space
or special character in it, you'll need to reference its shortened 8.3 filename to navigate successfully...
or if you're feeling fancy, you can just use quotation marks!
Under Windows, VFAT simply collects the data from the hidden entries first to present the
long file name to the user, in a completely hidden process.
Early versions of Windows 95 included a utility called LFNBK.EXE, which stripped long file names
from the VFAT volume and stored them in a data file called LFNBK.DAT, in case you ever
wanted to revert back to the good old days! You can also get drivers such as DOSLFN,
which implement long file name support in any version of DOS.
There were also programs such as 4DOS which replaced the default COMMAND.COM interpreter
and allowed for additional file descriptions of up to 511 characters in length.
Of course nowadays, you can use long file names via
a DOS command prompt - even spaces!... but of course, NTFS is now pretty standard, although
FAT does remain with us in the guise of exFAT... buuuttt, it's quite a different beast to the
FAT we know and love, and sadly lacks support for the traditional 8.3 format filenames.
Thanks for watching!
Click a video, subscribe, support me, or do other things, of your choosing.
In any case, have a great evening!