Dynamic Image Compression with Go

This article is part of my learning how to use Go in a production web environment. It’s written in the style of a tutorial, but has plenty of my own notes about my experience. View the complete code here.

Since launching an online shop/ordering system for a couple of local pubs last summer, the websites have continued to be used for getting craft beer and other such delights into the hands of customers around the area. As the publicans have expanded their range and as customer numbers have mounted up, one element has been bugging me.

The solution (which runs on Django) doesn’t do anything fancy with images which are uploaded to the product catalog, and it doesn’t sit behind a CDN, which (you’ve guessed it) leads to some huge pixel-punching powerhouses of the JPEG variety being used for thumbnails. Some pages have dozens of these 4000x4000px images and it’s therefore an absolute snail.

Side note: I’ll likely write up a separate piece on how I went about creating the Django app.

So as lockdown continues I thought it would be an interesting mini-project to build out a sidecar application for compressing these images. I’ve not built anything for production in Go, so I thought it would be a nice low-risk project to kick off with. Sure, I could’ve just rolled in a Django plugin to compress the images, but then what else would I do with my Saturday?!

The Brief

Build an image proxy server which sits behind Nginx, alongside Django, to serve static product images in JPEG, PNG, or GIF format. Compress these on demand and save the result to disk so subsequent requests are served straight from the filesystem. Also, learn how to Go.

App Design

The basic logic of the app is as follows (in pseudo-code):

var size = get_param(size)
var file = get_param(file)

if file_exists(file):
  optimised_name = file + size
  if file_exists(optimised_name):
    return serve_file(optimised_name)
  else:
    optimised = optimise_file_for_size(file, size)
    save_to_disk_with_name(optimised, optimised_name)
    return serve_file(optimised_name)
else:
  return 404

Essentially the app should be given a filename and size, then execute the above control flow.

Some additional considerations:

  1. We should limit the allowed sizes, otherwise someone could easily DDoS the server by iterating from 1 to ∞ on every file and almost certainly cause an explosion in a datacenter near you
  2. We should limit the files that are allowed to be compressed

The Code

Okay, let’s get cracking. First up, create a file called main.go and add the following:

package main

import (
  "fmt"
  "net/http"
)

func main() {
  http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Hello, world!")
  })

  http.ListenAndServe(":9990", nil)
}

Kicking things off, we create a super-simple web server which listens on port 9990. When you navigate to the root you’ll see the hello world message. I’m impressed both by the standard library here and by the formatter beautifying my code every time I hit save.

Next up, we need to be able to pass in some params to this route. I’d like to use the format: /filename.jpg?s=200 wherein the s param represents the maximum width of the image to be output. So let’s take a look at how we can interact with both the URL and the file system.

http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
  // Build the filepath
  path := "./uploads/" + filepath.Base(r.URL.Path)

  if _, err := os.Stat(path); err == nil {
    // The file does exist
  } else {
    // The file doesn't exist
    http.NotFound(w, r)
  }
})

Here we’re pulling the filename from the URL by simply grabbing the base name. Interestingly, removing the fmt call and adding the filepath.Base call caused the formatter to update the imports automatically, so the top of my file now looks like this:

import (
  "net/http"
  "os"
  "path/filepath"
)

I genuinely don’t know what I’m going to do with the dozens of hours I used to spend trying to compile, seeing an error, then researching a package import. This is a handy feature for sure.

The stdlib has a handy method for all the complexity of serving files which we can inject here to show that it’s working:

http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
  // Build the filepath
  path := "./uploads/" + filepath.Base(r.URL.Path)

  if _, err := os.Stat(path); err == nil {
    // The file does exist
    http.ServeFile(w, r, path)
  } else {
    // The file doesn't exist
    http.NotFound(w, r)
  }
})

How splendid! This now serves the file directly if it exists, and fires back a 404 if not. Lovely, we’ve just reinvented a reverse proxy.

Back to the matter at hand! Let’s see how to access the URL params:

if _, err := os.Stat(path); err == nil {
  // The file does exist
  // Check the size param
  query := r.URL.Query()
  s, ok := query["s"]

  if !ok {
    // No size param, serve the default file
    http.ServeFile(w, r, path)
  } else {
    // We have a size param, now in variable s
  }
} else {
  // The file doesn't exist
  http.NotFound(w, r)
}

Now before you admonish me for reversing the order of good/bad conditions with this second block, take a deep breath, make a brew.

Here we use the r param from above, and parse the query string from the URL. Note the comma-ok idiom: indexing the query map returns two values – the value itself (here s) and a boolean (here ok) which tells us whether the key was present at all. It’s a close cousin of the value-and-error pair returned throughout the stdlib. Interestingly, because it’s perfectly valid to send multiple values under one key – i.e. /myImage.png?s=100&s=200 – Go returns a slice into the value s. In our context multiple values don’t make sense, but it’s one to remember when we come to use the value.

If the param is missing, we simply serve the original file unchanged. Before we dive into using this param in earnest, we should validate the input. As mentioned above, we want to restrict which sizes the app will operate on, so we should define them with the following code, placed outside of the main function:

// AllowedSizes defines which sizes the app will serve.
var AllowedSizes = map[int]bool{
  100: true,
  200: true,
  600: true,
}

I initially used a slice for this, but it turns out that the stdlib doesn’t include a contains function, so using a map will allow us to have a super simple lookup ability.

Moving back to the inside of the else block from above, we’ll use the following to check if the param passed in is allowed:

// We have a size param, now in variable s
targetSize, _ := strconv.Atoi(s[0])

if _, exists := AllowedSizes[targetSize]; exists {
  // ...magic?
}

As noted above s is a slice. We’ll use the first value here and pass this into a cheeky string converter. Again, the method will tell us if there was an error, and we’ll just ignore this by using _ as the name.

Once we’ve done this, we just use a lookup in the map defined above, and if we find the size we’ll progress. Within this if block we’ll take a look on disk for the optimised file as the first action:

// Check disk to see if the resized image exists
ext := filepath.Ext(path)
resizedpath := strings.Replace(path, ext, fmt.Sprintf("_s%d%s", targetSize, ext), -1)

if _, err := os.Stat(resizedpath); err == nil {
  // Serve the target file
  http.ServeFile(w, r, resizedpath)
} else {
  // Generate the file
}

First up we’ll use the filepath module again, and extract the file extension. We can then build up a file path for the optimised, resized image in the format filename_s200.png for a PNG resized to 200px in width. We do a simple find/replace for the extension and inject the size param into the name.

Using the same os.Stat method from above, we check if it exists i.e. the URL has been requested before. If it does, we can just serve the file. Huzzah! If not, we need to actually generate the file.

This is a good point to note that our lovely main function is getting a little…fat. So let’s write the ideal code which uses a resizing function in the else block above, then we’ll write that function:

if _, err := resizeFile(path, resizedpath, targetSize); err != nil {
  // Something went wrong, serve the original path and bail out –
  // without the return we'd try to write a second response below
  http.ServeFile(w, r, path)
  return
}

// Serve it
http.ServeFile(w, r, resizedpath)

We’ll use the idiom of a tuple return for our new function, and pass it the original file path, the resized file path, and the target size. If something does happen to go wrong with our method, we can just return the original file – it’s better to have a safe failure mode which is non-optimal, rather than a broken one.

If things go well with the resize the file will now exist on disk, so the last line just serves the resized path.

Let’s write our magic method.

I’m a big believer in writing your method’s documentation before implementing it so as to force you to think about the ideal API and any edge cases. It’s easier to figure this out in plain English than it is to code, debug, and refactor:

/*
Resize a file and save the result to disk. Maintains
aspect ratio when resizing.

source string The existing file path
target string The file path to write the output to
size int The max width of the output

string The output path
err An error which occurred when processing
*/
func resizeFile(source string, target string, size int) (string, error) {
  // Size matters
}

The method will take the source and target paths and the target size of the image, along with returning the target path and error if one should occur.

Before writing the main logic we know that 3 main things need to happen here:

  1. We need to load the image into memory
  2. We need to resize it
  3. We need to save it in the correct format

Go has some stdlib code for accessing the file system and interacting with basic types which means (1) and (3) are covered. But the resizing aspect is non-standard.

After some searching I came across a well-referenced library – https://github.com/nfnt/resize. Although it’s no longer actively maintained, our use-case is tiny, so I’m going to use it.

First up, loading the file into memory:

file, err := os.Open(source)
if err != nil {
  return target, err
}

// Make sure the stream is closed, even if decoding fails below
defer file.Close()

// Decode the file
img, _, err := image.Decode(file)
if err != nil {
  return target, err
}

Here we’re opening a file stream and decoding the contents into a variable named img – using the standard image module. Note that image.Decode handles any format whose decoder has been registered, which importing the jpeg, png, and gif packages (needed for encoding later anyway) takes care of. If we encounter any errors we bomb out and return the error. And remember kids: always close your file streams.

Next, we use the 3rd-party library from above (by running go get with the URL first):

// Resize it
m := resize.Resize(uint(size), 0, img, resize.Bicubic)

We convert the int to a uint, and pass in other basic params, including an interpolation method – bicubic in this instance. Passing 0 for the height tells the library to preserve the aspect ratio. Oh, you don’t like single letter variable names? K.

Next we want to create the image on disk. The first step is to create the empty file:

// Create a new, empty file
out, err := os.Create(target)
if err != nil {
  return target, err
}

// Make sure the stream is closed once we're done here
defer out.Close()

Note: we use defer, which will execute the call just before our function returns – whichever return path it takes.

Finally, we need to write the contents to disk and return successfully:

// Save the data to disk
switch ftype := filepath.Ext(source); strings.ToLower(ftype) {
case ".jpg", ".jpeg":
  err = jpeg.Encode(out, m, nil)
case ".png":
  err = png.Encode(out, m)
case ".gif":
  err = gif.Encode(out, m, nil)
}

return target, err

Because the formats encode their data in different ways, we need to switch based on the filetype. There’s probably a better way of doing this. Once we’re done, we just need to return.

Kablamo!

We now have a working little app. It’s certainly not perfect – if you give it malformed files it will likely crash. But hey, it works beautifully.

The code, alongside deployment instructions for an Ubuntu box, is available on GitHub.

Results from Production

Let’s take a look at how this impacted the site it was deployed on. Bear in mind that absolutely no time has been spent in optimising the site, and it’s fairly lightweight in most regards.

Running the site through GTMetrix we see the following results:

Before

After

Bandwidth Implications

Alongside the obvious enhancement from a user’s perspective, looking at outbound bandwidth from the server tells an interesting story. Peak usage has dropped from 2.5Mb/sec to about 100kb/sec.

Here’s the bandwidth graph. Can you tell when GoSquash was turned on?

The lesson here is that you shouldn’t optimise right away – you’ll deprive yourself of the joy of doing it later!

