Files, Directories and inodes

UNIX and Linux

What is the difference between UNIX and Linux? It's difficult to give an exact answer, but basically UNIX is an operating system that is owned and maintained by a few major vendors, including IBM, Oracle/SUN and HP. The different flavours of UNIX are very similar and are supposed to be largely compatible with each other. For example, it is easy to move between SUN Solaris and IBM AIX. However, the source code is proprietary and is not generally available.
Linux can be considered an open source version of UNIX. Linux comes in many more flavours than UNIX, and many of them are free. Most of the following pages apply to both operating systems, and I'll highlight the differences as I know them. As always, do not assume that anything discussed here applies directly to your environment. Test it first.

Files

In general, everything in UNIX and Linux is a file, although some of these files are a bit special. Most files, or regular files as they are called, just contain normal data like text, programs or images. Special files include:

  • Directories: files that contain lists of other files
  • IO files (device files): used for input and output devices and held in the /dev directory
  • Links: pointer files used to make a given file or directory visible in a different part of the file tree
  • Domain sockets: communication sockets that provide inter-process networking
  • Named pipes: a bit like sockets, but they provide a simpler way to do inter-process communication
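
The file types listed above can be told apart from the shell. The following is a minimal sketch, assuming a POSIX-style shell with mktemp available; the file_type function and the scratch file names are made up for illustration and are not part of any standard tool.

```shell
# Build a scratch directory holding one example of several file types.
demo=$(mktemp -d)
echo "hello" > "$demo/regular.txt"
mkdir "$demo/subdir"
ln -s regular.txt "$demo/link"
mkfifo "$demo/pipe"

# Hypothetical helper: report the type of a path using the standard
# test operators. Links are checked first because the other tests
# follow a link to its target.
file_type() {
    if   [ -L "$1" ]; then echo "link"
    elif [ -d "$1" ]; then echo "directory"
    elif [ -p "$1" ]; then echo "named pipe"
    elif [ -S "$1" ]; then echo "socket"
    elif [ -b "$1" ] || [ -c "$1" ]; then echo "device"
    elif [ -f "$1" ]; then echo "regular file"
    else echo "unknown"
    fi
}

for f in "$demo"/*; do
    printf '%s: %s\n' "${f##*/}" "$(file_type "$f")"
done

# The first character of each 'ls -l' line shows the same information:
# - regular, d directory, l link, p pipe, s socket, c or b device.
ls -l "$demo"

rm -rf "$demo"
```

Running file_type against an entry in /dev, such as /dev/null, should report a device on most systems.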

File names can be up to 255 characters long on most file systems, but are best kept reasonably short to make them manageable. If you use conventional extensions like .txt or .sh, everyone will know what a file is for, although UNIX itself does not require them. File names are case sensitive, and can include special characters. It's considered good UNIX practice to use lower case. Avoid using command names for file names as this will cause confusion. For the same reason, it is best to avoid wild card characters like * or ? in file names.

inodes

We normally think of the file system as being like a tree, with a single 'root' directory that has directories attached to it, then subdirectories attached to those, eventually terminating with lots of 'leaves' or files. UNIX file systems are usually drawn like a tree, but in fact a file system is just a collection of pieces of data with links between them. The data is identified by inodes. An inode is a bit like a label that is attached to a file. An inode has a number that is unique within its partition, and also contains:

  • The file owner
  • The file type as categorised above
  • The access permissions on the file
  • The date and time of last update and last access (traditional UNIX inodes do not record a creation time, although some newer file systems do)
  • The date and time of last change to inode information
  • The number of links to this file
  • The file size
  • The physical location of the data.

You can display the inode number for a file with the ls command, usually by using the -i switch.

ls -i

When you create a file system on a partition, part of the partition space is typically set aside for a fixed number of inodes. Inode numbers are unique within that partition, but the same inode number can exist on different partitions. Every time you create a file, the file system assigns a free inode to it. The inode does not contain the file name or its directory; these are held in the directory files. The file system builds up a tree picture of the file structure by piecing together the file names and inode numbers stored in the directories.
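
Because names live in directories and not in the inode, two names can point at the same inode. The sketch below illustrates this with a hard link, assuming a POSIX-style shell with mktemp and awk available; inode_of and links_of are hypothetical helper names introduced only for this example.

```shell
demo=$(mktemp -d)
echo "some data" > "$demo/original"

# A hard link is a second directory entry pointing at the same inode.
ln "$demo/original" "$demo/alias"

# Hypothetical helper: the inode number is the first field of 'ls -i'.
inode_of() {
    ls -di "$1" | awk '{print $1}'
}

# Hypothetical helper: the link count is the second field of 'ls -l'.
links_of() {
    ls -ld "$1" | awk '{print $2}'
}

echo "original: inode $(inode_of "$demo/original"), links $(links_of "$demo/original")"
echo "alias:    inode $(inode_of "$demo/alias"), links $(links_of "$demo/alias")"

# Removing one name only lowers the link count; the data stays
# reachable through the remaining name.
rm "$demo/original"
cat "$demo/alias"
echo "alias now has $(links_of "$demo/alias") link(s)"

rm -rf "$demo"
```

Both names should report the same inode number and a link count of 2 until one of them is removed. On systems with GNU df, df -i reports how many inodes a partition has in total and how many are still free.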

Directories

Files are organised into directories to make it easier to find things, and to keep common files in a central location. The top-level directory is the root directory, denoted by a / (forward slash). A number of sub-directories exist below the root. The common or standard directories are shown in the table below, but be aware that the names of some of these directories can be changed.

Directory     Content
/bin          Program files used by the system, the system administrator and users.
/boot         Contains the files and information needed to start the system up. Newer versions of Linux use the GRand Unified Bootloader (GRUB), and the GRUB files will be in here.
/dev          Contains the 'special files' that describe all the peripheral hardware like VDUs, keyboards, disks, tapes, etc.
/etc          Contains the system configuration files.
/home         Contains the 'home' or personal directories for most users. This directory name is actually heavily site dependent, though home is the default.
/initrd       Used by some Linux versions to hold boot information.
/lib          Library files, including files for all kinds of programs needed by the system and the users.
/lost+found   Used for files that were saved during failures.
/mnt          Mount point for external file systems.
/net          Mount point for remote file systems.
/proc         Contains system resource information.
/root         The home directory for the administrative user. Not the same as /, which is the root directory.
/sbin         Contains system programs.
/tmp          Temporary system space and general scratchpad.
/usr          Contains user programs, libraries and documentation.
/var          Contains files that are variable in size, like mailboxes, and temporary user files like downloads, log files, print spools, etc.

You can add your own directories to the root, but it's best to add them as subdirectories under the appropriate main directory. Directory names are case sensitive, must be unique within one directory and can contain almost any character.

There are lots of GUI style file managers available for Linux that have the same look and feel as Windows Explorer. Popular products include Nautilus and Konqueror, while Midnight Commander offers a similar view in a text terminal. The UNIX Command section discusses the line commands to display files and directories.

Paths

Say you develop a program called 'space' in your /usr/storage/bin directory that checks how full all your disks and partitions are, and reports the results back to your terminal. To execute that program you might have to type /usr/storage/bin/space, which is a pain because it requires extra typing and you have to remember where the program is. To avoid this, UNIX stores a list of program locations in the PATH variable. To see which directories are on your PATH, type:

> env | grep PATH
  or, in most shells,
> echo $PATH

And the result will look like

/usr/local/bin:/usr/storage/bin:/usr/bin:/usr/sbin/:/bin

UNIX searches these directories in order when a command is entered, and executes the first match it finds. So if you type 'space' on a command line and there is a space program in /usr/local/bin, then that version of space will be executed, not the version in /usr/storage/bin.
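
The search order can be tried out safely with two scratch directories standing in for /usr/local/bin and /usr/storage/bin. This is a minimal sketch, assuming a POSIX-style shell with mktemp available; the resolve function and the directory names are hypothetical and exist only for this illustration.

```shell
# Two scratch directories, each holding its own executable 'space'.
demo=$(mktemp -d)
mkdir "$demo/local" "$demo/storage"
printf '#!/bin/sh\necho "space from local"\n'   > "$demo/local/space"
printf '#!/bin/sh\necho "space from storage"\n' > "$demo/storage/space"
chmod +x "$demo/local/space" "$demo/storage/space"

# Hypothetical helper: show which file the shell would run for command $2
# if PATH were set to $1. The parentheses run it in a subshell, so the
# caller's PATH is left untouched.
resolve() {
    ( PATH=$1; command -v "$2" )
}

# With 'local' listed first, that copy wins.
resolve "$demo/local:$demo/storage" space

# Reversing the order changes which copy is found.
resolve "$demo/storage:$demo/local" space

rm -rf "$demo"
```

The first call should print the path under local and the second the path under storage, because only the order of the directories changed.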

Each user can have a different PATH variable. The 'which' command displays the full path name of the program that would be run for a command:

> which space
/usr/local/bin/space

This shows that the wrong program is being called: the copy in /usr/local/bin rather than the one in /usr/storage/bin.
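
When the wrong program is being picked up, it can help to see every copy on the PATH and not just the first. The following is a rough sketch, assuming a POSIX-style shell; all_matches is a hypothetical function written for this example, and it does not account for aliases, shell functions or built-in commands.

```shell
# Hypothetical helper: print every executable file called $1 found
# along PATH, in the order the shell would search. The body runs in a
# subshell so the change to IFS does not leak into the caller.
all_matches() (
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -z "$dir" ] && dir=.
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            echo "$dir/$1"
        fi
    done
)

# Example: list every 'ls' on the current PATH.
all_matches ls
```

The first line printed is the copy that would actually run; any later lines are copies hidden behind it.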