
Hard Links and Soft/Symbolic Links on Linux

Much has been written (and asked) on the topic of hard links and soft links (a.k.a. symbolic links) on Linux. I have read several of those explanations more than once, yet I still end up confusing the two, specifically the differences between them. So here's my own post on the topic, in the hope that I will stop getting confused.

Our setup

Let's create a file and write a line into it:

$ echo "Hello, I am file1" > file1

Next, we create a hard link using the ln command:

$ ln file1 file1-hlink

Now, let's create a soft link using ln -s:

$ ln -s file1 file1-slink

At this stage, if we use the cat command to display the contents of each of the above, we will see the same line of text:

$ cat file1
Hello, I am file1

$ cat file1-hlink
Hello, I am file1

$ cat file1-slink
Hello, I am file1

Investigation: Inodes

One of the key differences between soft links and hard links is how they are represented in the filesystem. If we run ls with the -i switch, it shows the inode number of each of the above files:

$ ls -il

15481123719144131 -rw-rw-rw- 2 asaha asaha 18 Nov  9 13:52 file1
15481123719144131 -rw-rw-rw- 2 asaha asaha 18 Nov  9 13:52 file1-hlink
29836347531381846 lrwxrwxrwx 1 asaha asaha  5 Nov  9 13:54 file1-slink -> file1

We can see that:

  • The hard link file1-hlink has the same inode number as the original file itself (file1)
  • The soft link file1-slink has a different inode number

This tells us two things straightaway:

When we create a soft link, it is equivalent to creating a new file with its own filename. In the filesystem, it is a separate file, with the special property that its content is the path to the real file, file1.

Graphically:

            Soft link ->   FILE CONTENTS -> Path of original file -> FILE CONTENTS -> "Hello, I am file1"

A hard link, on the other hand, is another name for the original file's data. It exists on the filesystem, but only as another reference, or link, to the same inode. Let's explore what that means. If we execute ls with the -l (lowercase L) switch, the second column gives the link count of a file:

$ ls -l file1
-rw-rw-rw- 2 asaha asaha 18 Nov  9 13:52 file1

We created one hard link above, so the link count is now 2. If we create another hard link, the link count will be 3:

$ ln file1 file1-hlink-2
$ ls -l file1
-rw-rw-rw- 3 asaha asaha 18 Nov  9 13:52 file1

Graphically:

file1-hlink  -----> FILE CONTENTS ("Hello, I am file1") <------ file1
                         /|\ 
                          |
                     file1-hlink-2
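
The inode and link-count observations above can also be checked from a script. The following is only a sketch: it recreates the setup in a scratch directory and assumes GNU coreutils stat (the -c format option), which is what most Linux systems ship. The variable names are mine.

```shell
#!/bin/sh
# Sketch: compare inode numbers and link counts in a scratch directory.
# Assumes GNU coreutils `stat` (the -c option).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "Hello, I am file1" > file1
ln file1 file1-hlink        # hard link: a second name for the same inode
ln -s file1 file1-slink     # soft link: a new file holding the path "file1"

inode_file=$(stat -c '%i' file1)
inode_hlink=$(stat -c '%i' file1-hlink)
inode_slink=$(stat -c '%i' file1-slink)   # stat does not follow symlinks unless given -L
nlink_file=$(stat -c '%h' file1)

echo "file1:       inode=$inode_file links=$nlink_file"
echo "file1-hlink: inode=$inode_hlink"
echo "file1-slink: inode=$inode_slink"
```

The first two inode numbers should match, the third should differ, and the link count of file1 should be 2.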

Perhaps this post here best describes how hard links differ from soft links.

Investigation: Size of hard links and soft links

Let's go back to the earlier output of ls -il:

$ ls -il

15481123719144131 -rw-rw-rw- 2 asaha asaha 18 Nov  9 13:52 file1
15481123719144131 -rw-rw-rw- 2 asaha asaha 18 Nov  9 13:52 file1-hlink
29836347531381846 lrwxrwxrwx 1 asaha asaha  5 Nov  9 13:54 file1-slink -> file1

The sixth column of the output above shows the size of each file in bytes. We see 18 as the size of the original file, file1, and of the hard link, file1-hlink: the 17 characters of "Hello, I am file1" plus a newline character. This doesn't mean that each hard link takes up 18 bytes on disk. Each link is effectively just a directory entry pointing to the same data.

What are the five bytes in file1-slink? The readlink command will help us:

$ readlink file1-slink
file1

It is the "relative" path to the original file. Unlike a hard link, a soft link actually takes up some space of its own.
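
To convince ourselves that the size of a soft link is just the length of the path it stores, we can compare the two directly. This is a small sketch in a scratch directory, again assuming GNU coreutils stat; the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: the size of a soft link equals the length of the path it stores.
# Assumes GNU coreutils `stat` (the -c option).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "Hello, I am file1" > file1
ln -s file1 file1-slink

target=$(readlink file1-slink)            # the path stored inside the link
slink_size=$(stat -c '%s' file1-slink)    # size of the link itself, not of file1
file_size=$(stat -c '%s' file1)

echo "target=$target (${#target} characters)"
echo "link size=$slink_size, file size=$file_size"
```

Had we created the link with an absolute path instead, the stored path, and therefore the size of the link, would be correspondingly longer.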

Investigation: Deleting the original file

What happens to each kind of link when we delete the original file? From the graphics above, we expect the symbolic link to become a "dangling" link, so we will lose access to the file contents through it. In the case of a hard link, the contents will continue to be accessible, since all we are doing is deleting one of the links. It doesn't matter that it is the original file: the other links continue to exist and point to the data.

Let's validate our theory:

$ rm file1

$ ls -lrt
total 0
-rw-rw-rw- 2 asaha asaha 18 Nov  9 13:52 file1-hlink-2
-rw-rw-rw- 2 asaha asaha 18 Nov  9 13:52 file1-hlink
lrwxrwxrwx 1 asaha asaha  5 Nov  9 13:54 file1-slink -> file1

We deleted the original file above. The link count of file1-hlink and file1-hlink-2 has decreased by 1 and is now 2.

If we try to display the contents of a hard link:

$ cat file1-hlink
Hello, I am file1

For the soft link though:

$ cat file1-slink
cat: file1-slink: No such file or directory

What the above error really says is: "I am trying to look for a file, file1, but it doesn't exist." This also means that we can create a new file at that path, and the soft link will happily point to it:

$ echo "Hello, I am a different file1" > file1
$ cat file1-slink
Hello, I am a different file1
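
A script can detect a dangling soft link without parsing the error from cat. The sketch below relies on the standard test operators: -L examines the link itself, while -e follows the link to its target. The state and contents variables are only there for illustration.

```shell
#!/bin/sh
# Sketch: detect a dangling soft link, then watch it resolve again.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "Hello, I am file1" > file1
ln -s file1 file1-slink
rm file1

# -L tests the link itself; -e follows the link and tests its target.
if [ -L file1-slink ] && [ ! -e file1-slink ]; then
    state="dangling"
else
    state="ok"
fi
echo "file1-slink is $state"

# Recreating the path makes the same link resolve again, to new contents.
echo "Hello, I am a different file1" > file1
contents=$(cat file1-slink)
echo "$contents"
```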

I wonder what kind of security risk this may pose. Maybe we need symbolic links with checksums?

Investigation: Modifying original file contents

What happens if we modify the original file contents? As long as the file is modified in place, the changes will be visible through both types of links.
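
Here is a quick sketch to check this, along with one case worth knowing about: a tool that "modifies" a file by writing a new file and renaming it over the old one creates a new inode. The file name file1.tmp is hypothetical and only stands in for such a tool's temporary file.

```shell
#!/bin/sh
# Sketch: in-place modification versus replacement by rename.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "Hello, I am file1" > file1
ln file1 file1-hlink
ln -s file1 file1-slink

# Appending writes to the existing inode, so every name sees the change.
echo "A second line" >> file1
hlink_lines=$(wc -l < file1-hlink)
slink_lines=$(wc -l < file1-slink)
echo "after append: hard link has $hlink_lines lines, soft link has $slink_lines lines"

# Replacing file1 via rename gives the name file1 a new inode: the hard link
# keeps the old data, while the soft link follows the path to the new file.
echo "Replaced" > file1.tmp
mv file1.tmp file1
hlink_after=$(wc -l < file1-hlink)
slink_after=$(cat file1-slink)
echo "after rename: hard link has $hlink_after lines, soft link reads: $slink_after"
```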

Investigation: Directories and Links

We cannot create hard links to directories. This link is a good resource to learn why. Soft links don't have such a limitation.
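
We can try this out for ourselves. In the sketch below, dir1 and the link names are made up for the example; ln is expected to refuse the hard link and accept the soft link.

```shell
#!/bin/sh
# Sketch: ln refuses to hard link a directory but will soft link one.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir dir1
touch dir1/inside

if ln dir1 dir1-hlink 2>/dev/null; then
    hard_result="created"
else
    hard_result="refused"
fi

ln -s dir1 dir1-slink
# Paths through the soft link resolve to the real directory's contents.
if [ -e dir1-slink/inside ]; then
    soft_result="works"
else
    soft_result="broken"
fi

echo "hard link to directory: $hard_result"
echo "soft link to directory: $soft_result"
```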

Mildly related to this topic is the number of "default" links for a directory on Linux:

$ ls -lrta
total 12
drwxr-xr-x 6 ubuntu ubuntu 4096 Nov  9 05:38 ..
drwxrwxr-x 2 ubuntu ubuntu 4096 Nov  9 05:41 dir2
drwxrwxr-x 3 ubuntu ubuntu 4096 Nov  9 05:41 .

The above is a listing of a directory that contains one sub-directory, dir2. Note the . and .. entries? The . is a hard link to the current directory, and .. is a hard link to the parent directory. Every directory has these two entries by default. So where do a directory's two default links come from (see dir2 above)?

  • The first is the directory's own name, the entry for it in its parent directory
  • The second is the . entry inside the directory itself

Each sub-directory adds one more link through its .. entry, which is why the current directory (.) above, with its single sub-directory dir2, has a link count of 3.
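
The effect of a sub-directory on its parent's link count can be observed with a short sketch. It assumes GNU coreutils stat and a filesystem that keeps directory link counts in the traditional way (ext4 and tmpfs do; some filesystems, btrfs for example, report 1 for every directory). The directory names are illustrative.

```shell
#!/bin/sh
# Sketch: a new sub-directory adds one link to its parent via its ".." entry.
# Assumes GNU coreutils `stat` and a filesystem with traditional directory link counts.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir parent
links_before=$(stat -c '%h' parent)   # typically 2: the name "parent" plus parent/.

mkdir parent/dir2
links_after=$(stat -c '%h' parent)    # typically 3: parent/dir2/.. also refers to parent

echo "before: $links_before, after: $links_after"
```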

Miscellaneous

Is it a symbolic link or a hard link?

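As a program, how do I know whether a file is a "regular" file, a symbolic link or a hard link? The answer lies in the data returned by the stat() family of system calls, specifically the st_mode field as described here. To detect a symbolic link we need lstat(), since stat() follows the link and reports on its target. A hard link, on the other hand, is indistinguishable from the "original" regular file; the st_nlink field only tells us how many names the file has.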
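
From the shell, the same information is available through the test operators and stat. The classify function below is a hypothetical helper written for this post, not a standard command, and it assumes GNU coreutils stat for the link count.

```shell
#!/bin/sh
# Sketch: classify a path as a soft link, a multiply-linked file, or a plain file.
# `classify` is a hypothetical helper; assumes GNU coreutils `stat` (the -c option).
classify() {
    # -L must be checked first: -f follows soft links and would report on the target.
    if [ -L "$1" ]; then
        echo "symbolic link"
    elif [ -f "$1" ] && [ "$(stat -c '%h' "$1")" -gt 1 ]; then
        echo "regular file with more than one hard link"
    elif [ -f "$1" ]; then
        echo "regular file"
    else
        echo "other"
    fi
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "no extra links" > plain
echo "Hello, I am file1" > file1
ln file1 file1-hlink
ln -s file1 file1-slink

for f in plain file1 file1-hlink file1-slink; do
    echo "$f: $(classify "$f")"
done
```

Note that file1 and file1-hlink get the same answer: neither of them is "the" hard link, they are simply two names for one file.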

Links and Filesystem Boundaries

A hard link, since it points to the same inode, cannot span a filesystem boundary. That is, we cannot create a hard link to a file that resides in a different filesystem. Soft links have no such limitation.

Using links to solve a problem

What are links useful for? One reason to use links is to avoid duplicating data across multiple files. Let's say we have a bunch of files lying around in our filesystem and we want to keep only a single copy of any duplicated data, replacing the other copies with links. Since hard links cannot span filesystems, symbolic links may seem more attractive. However, one caveat to keep in mind with symbolic links is that if we accidentally delete the original file, we lose the data. So it depends on the use case.
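
As a rough sketch of the hard link variant of this idea, the snippet below replaces one duplicate with a hard link. The file names are hypothetical, and it assumes both files live on the same filesystem. Keep in mind that after linking, the two names share everything: a later in-place change made through one name is visible through the other.

```shell
#!/bin/sh
# Sketch: replace a byte-identical duplicate with a hard link.
# Assumes both files are on the same filesystem; file names are illustrative.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "same data" > report.txt
cp report.txt report-copy.txt     # an independent duplicate with its own inode

# cmp -s is silent and succeeds only if the contents are identical.
# The -ef check skips the pair if they are already the same file.
if cmp -s report.txt report-copy.txt && [ ! report.txt -ef report-copy.txt ]; then
    ln -f report.txt report-copy.txt   # replace the duplicate with a hard link
fi

links=$(stat -c '%h' report.txt)       # GNU coreutils stat
echo "report.txt now has $links links"
```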


© Amit Saha. Built using Pelican. Customised theme based on the one by Giulio Fidente on github.