Concurrency safe file access in Go

January 18, 2021

Using a file for persistent storage (and not a database - datastore/object store) sounds like an academic exercise. For me, it brings back memories of writing a structure in C (programming language) to a file to simulate a student record database. However, there may be situations where you may just get by using it especially when you just want to run a single copy of your application.

Let’s see a pattern for implementing concurrency safe file access in Go using sync.Mutex.

Our data structure
A reference implementation
- Writing to the file
- Reading from file
Notes
Learn more

Our data structure

Let’s consider the following struct type:

type fileStore struct {
	Mu    sync.Mutex
	Store map[string]string `json:"store"`
}

Our data store here is the Store map object. This is what we will read from the file and write to the file. At any point of time our application is running, the Store object must be strongly consistent with what’s inside the file. Of course, the application may be killed or crash between updating the map and writing to file which doesn’t help with our consistency goal, but let’s ignore that for now.

The sync.Mutex object, Mu is our guard rail here. It is our mechanism here to ensure that:

Only one goroutine is ever reading or writing to Store object
Only one goroutine is ever populating the Store object from the file
Only one goroutine is ever persisting the Store object to the file

You have already realized that I am going to be persisting the map object as a JSON encoded object in the file. Of course, you could choose any other object type to persist, any other encoding mechanism instead of JSON, or a combination of the two. The implementation will vary then, but hopefully the ideas are transferable.

A reference implementation

We will create a new package to encapsulate all the file operations:

package filestore

// imports


// Our data structure that we will persist is guarded with
// a Mutex object
type fileStore struct {
	Mu    sync.Mutex
	Store map[string]string `json:"store"`
}

// FileStoreConfig encapsulates the DataFileName and
// the fileStore object
var FileStoreConfig struct {
	DataFileName string
	Fs           fileStore
}

Having defined the above types, we can write an Init() function which will take the file path as a parameter, create the file if it doesn’t exist and returns a FileStoreConfig object which the rest of the application can use:

// Init creates the file if it doesn't exist and initializes the  FileStoreConfig
// to be used in the rest of the application
func Init(dataFileName string) error {
	_, err := os.Stat(dataFileName)

	if err != nil {
		_, err := os.Create(dataFileName)
		if err != nil {
			return err
		}
	}
	FileStoreConfig.Fs = fileStore{Mu: sync.Mutex{}, Store: make(map[string]string)}
	FileStoreConfig.DataFileName = dataFileName
	return nil
}

The Init() function is called at application startup.

Once that’s done, the rest of the application then interacts with two methods implemented for the fileStore object:

Write()
Read()

For the purpose of this reference implementation, we will consider another struct type:

type myDataType struct {
    Key string
    Value string
}

This type will encapsulate a single data item we may want to read or write from the map, Store.

Writing to the file

The Write() function is called with an object of myDataType which contains both the key and value to store in the map and looks like this:

func (j *fileStore) Write(data myDataType) error {
	j.Mu.Lock()
	defer j.Mu.Unlock()

	err := j.ReadFromFile()
	if err != nil {
		return err
	}
	j.Store[data.Key] = data.Value
	return j.WriteToFile()
}

The Write() function updates the Store map object and then uses a helper function, writeToFile() which looks like this:

func (j *fileStore) WriteToFile() error {
	var f *os.File
	jsonData, err := json.Marshal(j.Store)
	if err != nil {
		return err
	}
	f, err = os.Create(FileStoreConfig.DataFileName)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.Write(jsonData)
	return err
}

The WriteToFile() function essentially writes the Store object to disk. It’s worth noting here that the os.Create() function will truncate the file if it exists. This is exactly what we want here as we are writing the complete Store object every time.

You can see that by using the sync.Mutex object:

Only one goroutine is ever reading or writing to Store object
Only one goroutine is ever persisting the Store object to the file

If we have another goroutine attempting to get a lock to write to the file, it will block till the previous goroutine has released the lock.

Reading from file

The Read() function accepts a key, id whose value we are interested in. Recall that our key object here is a map[string]string

func (j *fileStore) Read(id string) (string, error) {
	j.Mu.Lock()
	defer j.Mu.Unlock()

	err := j.ReadFromFile()
	if err != nil {
		return "", err
	}

	data := j.Store[id]
	delete(j.Store, id)
	j.WriteToFile()

	return data, nil
}

The helper function, ReadFromFile(), like its counterpart, WriteToFile() reads the complete file and overwrites the current Store object in memory:

func (j *fileStore) ReadFromFile() error {

	f, err := os.Open(FileStoreConfig.DataFileName)
	if err != nil {
		return err
	}
	jsonData, err := ioutil.ReadAll(f)
	if err != nil {
		log.Fatal(err)
	}
	if len(jsonData) != 0 {
		return json.Unmarshal(jsonData, &j.Store)
	}
	return nil
}

Notes

It’s worth noting though that in the WriteToFile() function, truncation operation removes the old file before creating the new one. Hence if the application is killed in the window between the removal and the creation, we have lost all the data. So we can improve upon this by using a rename() operation instead.

Learn more

An introduction to sync.Mutex
Tailscale - An unlikely database migration
Beware of copying mutexes in Go
For an alternative to using sync.Mutex, see this post