Tuesday, October 7, 2014

Disk Metadata : Superblock, Directory and Inodes

Metadata
Filesystem blocks are user for 2 purposes : To store User data and Metadata
  • User data - stores actual data contained in files
  • Metadata - stores file system structural information such as superblock, inodes, directories
Metadata describes the structure of the file system. Most common metadata structure are superblock, inode and directories.

Superblocks
Every FS has a superblock which contains info about filesystems such as :
  • File system type
  • Size
  • Status
  • Information about other metadata structures
    • For filesystems with 1k blocksizes, a backup superblock can be found at block 8193
    • For filesystems with 2k blocksizes, at block 16384
    • For 4k blocksizes, at block 32768.
List backup superblocks:
# dumpe2fs /dev/hda3 | grep -i superblock

If Superblock  is corrupted, restore with backup :
# e2fsck -f -b 8193 /dev/sda3
 
Inode
 
 An inode is a data structure on a Linux Unix FS which stores stores basic information about a regular file, directory, or other file system objects.  
Inode data structure contain information about FS objects such as :
 => File type (executable, block, special etc)
=> Permissions (read, write etc)
=> Owner
=> Group
=> File Size
=> File access, change and modification time (remember UNIX or Linux never stores file creation time, this is favorite question asked in UNIX/Linux sys admin job interview)
=> File deletion time
=> Number of links (soft/hard)
=> Extended attribute such as append only or no one can delete file including root user (immutability)
=> Access Control List (ACLs) 

Whenever a user or a program needs access to a file, the operating system first searches for the exact and unique inode (inode number), in a table called as an inode table. In fact the program or the user who needs access to a file, reaches the file with the help of the inode number found from the inode table.

EXT2 FS contains 15 Block pointers. 1st 12 blocks --> Direct Block pointers which can point to 12 blocks (Each block is 4KB in size). So it can address 48 k data. 

Indirect block pointers --> whenever the size of the data goes above 48k(by considering the block size as 4k), the 13th pointer in the inode will point to the very next block after the data(adjacent block after 48k of data), which inturn will point to the next block address where data is to be copied.
Now as we have took our block size as 4K, the indirect block pointer, can point to 1024 blocks containing data(by taking the size of a block pointer as 4bytes, one 4K block can point to 1024 blocks because 4 bytes * 1024 = 4K). 
which means an indirect block pointer can address, upto 4MB of data(4bytes of block pointer in 4K block, can point and address 1024 number of 4K blocks which makes the data size of 4M)

Double indirect block pointer --> if file size is above 4MB+ 48k, inode will start using double indirect points. 14th pointer will point to the loc of 4MB+48K block. This block contains pointers to 1024 Indirect pointers. So 4MB * 1024 = 4GB can be addressed.

Tripple indirect pointers --> 1024 Double indirect pointers, 4GB  *1024 = 4TB data

Directory
A directory also has an inode structure. Inode structure of a directory just consists of Name to Inode mapping of files and directories in that directory. 

1 comment: