Checksums are a popular data integrity verification method that is widely used in computer networks, file systems, and other software applications. This guide will teach you about checksums and how to use them to detect file changes in the Go programming language.
What is a Checksum?
A checksum is a short, fixed-size string or number generated from a block of data using a hashing algorithm. It acts like a digital fingerprint of the data. The primary purpose of a checksum is to detect changes or errors in the data.
How does it work?
A hashing function (such as SHA256, MD5, or CRC32) processes the original data and produces a checksum value determined entirely by the data’s content. If the content changes in any way, the checksum will, with overwhelming probability, change as well. To detect changes, compare the checksum recorded for the original data with the checksum of the data in its current state; if they differ, the data has changed.
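As a small illustration of the idea above, the sketch below hashes two strings that differ by a single character and compares the results. The helper name checksumHex is hypothetical and exists only for this example; it is not part of the program built later in this guide.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// checksumHex returns the SHA-256 checksum of data as a hex string.
// (Hypothetical helper, used only for this illustration.)
func checksumHex(data []byte) string {
	sum := sha256.Sum256(data)
	return fmt.Sprintf("%x", sum)
}

func main() {
	original := checksumHex([]byte("Hello There!"))
	modified := checksumHex([]byte("Hello There?"))

	fmt.Println("original:", original)
	fmt.Println("modified:", modified)
	// A one-character change yields a different checksum.
	fmt.Println("match:", original == modified)
}
```

Hashing the same input twice always gives the same checksum, which is what makes the comparison meaningful.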
Implementing checksums to detect file changes with Go
To follow along with this guide, you’ll need:
- Basic programming knowledge
- Go installed
- A code editor (VS Code, Zed, WebStorm, etc.)
Implementation Details
We will implement the checksum check exactly as described earlier. We’ll use it to detect changes in a text file and, after each check, save the file’s current checksum to a JSON file so we can retrieve it on the next run. Every time we run the program, it will report whether the file has changed since the previous run.
Creating our project directory
Create a directory for the project (we’ll name it checksum-golang for this tutorial):
mkdir checksum-golang && cd checksum-golang
Next, we’ll create our main.go file inside it. All of our implementation will be written here:
touch main.go
Create a text file and add some content to it:
touch file.txt && echo 'Hello There!' > file.txt
Now, in our main.go file, we’ll add the package declaration, the imports, and some useful structs:
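package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"os"
)

// Checksum is the structure persisted to checksum.json.
type Checksum struct {
	Hash []byte `json:"hash"`
}

// Changelog holds the file's current hash and whether it differs from the stored one.
type Changelog struct {
	Hash []byte `json:"hash"`
	Diff int // 1 if the file changed since the last check, 0 otherwise
}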
We need to implement utility functions for generating and comparing hashes:
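// createSHA256Hash returns the SHA256 hash of data.
func createSHA256Hash(data []byte) []byte {
	hasher := sha256.New()
	hasher.Write(data)
	return hasher.Sum(nil)
}

// compareHash reports whether x and y are equal, using a constant-time comparison.
func compareHash(x, y []byte) bool {
	if len(x) != len(y) {
		return false
	}
	return subtle.ConstantTimeCompare(x, y) == 1
}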
createSHA256Hash generates the SHA256 hashes we’ll use throughout this tutorial. compareHash compares two hashes with subtle.ConstantTimeCompare, whose running time does not depend on the contents of its inputs. This guards against timing attacks and is a safer alternative to a direct comparison.
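To see how the comparison behaves, here is a minimal standalone sketch. The function equalHashes is a hypothetical stand-in for compareHash that omits the explicit length check, since subtle.ConstantTimeCompare already returns 0 when the slice lengths differ.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// equalHashes mirrors compareHash without the explicit length check.
// (Hypothetical name, used only for this illustration.)
func equalHashes(x, y []byte) bool {
	return subtle.ConstantTimeCompare(x, y) == 1
}

func main() {
	a := sha256.Sum256([]byte("Hello There!"))
	b := sha256.Sum256([]byte("Hello World!"))

	fmt.Println(equalHashes(a[:], a[:]))   // true: identical hashes
	fmt.Println(equalHashes(a[:], b[:]))   // false: different content
	fmt.Println(equalHashes(a[:], a[:16])) // false: different lengths
}
```

Keeping the length check, as the tutorial code does, is harmless; it simply makes the intent explicit.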
// getChecksum loads the stored checksum from checksum.json,
// creating the file if it does not exist yet.
func getChecksum() (*Checksum, error) {
	checksumFd, err := os.OpenFile("checksum.json", os.O_RDWR|os.O_CREATE, 0644)
	if err != nil {
		return nil, err
	}
	defer checksumFd.Close()
	checkSumFC, err := io.ReadAll(checksumFd)
	if err != nil {
		return nil, err
	}
	var checksum Checksum
	// An empty file means no checksum has been stored yet.
	if len(checkSumFC) > 0 {
		err = json.Unmarshal(checkSumFC, &checksum)
		if err != nil {
			return nil, err
		}
	}
	return &checksum, nil
}

// generateChecksum creates a new Checksum from the given content.
func generateChecksum(content []byte) *Checksum {
	checksumHash := createSHA256Hash(content)
	return &Checksum{Hash: checksumHash}
}

// scanFileForChanges hashes content and sets Diff to 1 if the hash
// differs from a previously stored, non-empty checksum.
func scanFileForChanges(content []byte, checksum *Checksum) *Changelog {
	newChecksumHash := createSHA256Hash(content)
	changeLog := &Changelog{}
	changeLog.Hash = newChecksumHash
	if checksum == nil {
		return changeLog
	}
	if len(checksum.Hash) != 0 && !compareHash(newChecksumHash, checksum.Hash) {
		changeLog.Diff = 1
	}
	return changeLog
}
// updateChecksum writes the hash from changelog to checksum.json.
func updateChecksum(changelog *Changelog) error {
	checksum := &Checksum{Hash: changelog.Hash}
	cJson, err := json.Marshal(checksum)
	if err != nil {
		return err
	}
	return os.WriteFile("checksum.json", cJson, 0644)
}
getChecksum loads our stored checksum from checksum.json, generateChecksum creates a new checksum from the file content, scanFileForChanges checks for changes and returns a Changelog, and updateChecksum updates checksum.json with the checksum of the file in its current state.
And finally we put it all together in our main function:
func main() {
	// Load the previously stored checksum (empty on the first run).
	checksum, err := getChecksum()
	if err != nil {
		log.Fatal(err)
	}
	txtFd, err := os.Open("file.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer txtFd.Close()
	txtContent, err := io.ReadAll(txtFd)
	if err != nil {
		log.Fatal(err)
	}
	// Compare the file's current hash with the stored one, then save the new hash.
	changelog := scanFileForChanges(txtContent, checksum)
	err = updateChecksum(changelog)
	if err != nil {
		log.Fatal(err)
	}
	if changelog.Diff > 0 {
		fmt.Println("==> File changes detected <==")
	} else {
		fmt.Println("==> No file changes detected <==")
	}
}
In main, we retrieve the stored checksum using getChecksum, then open our text file and read all of its contents into txtContent. The file is scanned for changes, and the checksum of the file in its current state is saved. Finally, we use changelog to determine whether the file was changed (i.e., when changelog.Diff > 0).
Running main.go the first time, we should get:
go run main.go
# ==> No file changes detected <==
And if we check the directory, we should see that checksum.json has been created:
ls
# checksum.json file.txt main.go
cat checksum.json
# {"hash":"2eAQ5osjnMrNZQw5f91ckRB2Oscu2bGdHjIXGbt5ffg="}
Let’s edit file.txt and run the code again:
echo 'Hello World!' > file.txt
This replaces “Hello There!” with “Hello World!” (you can also edit the file from your code editor instead).
go run main.go
# ==> File changes detected <==
cat checksum.json
# {"hash":"A7ogTlDRJuRnTABeBNguhMITZngK8fQ71Uo3gWtqs0A="}
The file change was detected and our checksum was updated successfully.
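The hash values in checksum.json do not look like the usual hex digests because encoding/json encodes []byte fields as base64 strings. As a minimal sketch, the hypothetical helper hexFromJSON below decodes a checksum.json-style document and returns the hash as hex. A short three-byte value is used instead of a real SHA256 hash to keep the output easy to verify.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
	"log"
)

// Checksum matches the struct used in the tutorial.
type Checksum struct {
	Hash []byte `json:"hash"`
}

// hexFromJSON decodes a checksum.json-style document and returns the hash as hex.
// (Hypothetical helper, used only for this illustration.)
func hexFromJSON(raw []byte) (string, error) {
	var c Checksum
	if err := json.Unmarshal(raw, &c); err != nil {
		return "", err
	}
	return hex.EncodeToString(c.Hash), nil
}

func main() {
	// json.Marshal writes the []byte field as a base64 string.
	out, err := json.Marshal(Checksum{Hash: []byte("abc")})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out)) // {"hash":"YWJj"}

	// json.Unmarshal reverses the base64 encoding transparently.
	h, err := hexFromJSON(out)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(h) // 616263
}
```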
Conclusion
In this guide, we wrote a program that detects file changes using checksums in Go, using only packages from the standard library (os, crypto, io, etc.). What we’ve learnt in this guide can be helpful when building backup systems, monitoring configuration files, and securing application assets. All the code written in this tutorial is available here.