GoCenter is Fast – How Does It Do That?

 

UPDATE: As of May 1, 2021, the GoCenter central repository has been sunset and all of its features deprecated. For more information on the sunsetting of the centers, read the deprecation blog post.

 

I recently wrote a blog post saying “seeing is believing” (which is also your chance to score an awesome Go shirt) about the difference in speed between downloading Go modules from a version control system and downloading them from GoCenter. That post drew a lot of positive comments from the Gopher community. We also got some questions on “how” it is faster, including on Twitter.

To see why GoCenter is faster, let’s look at how the Go client gets modules and dive a little deeper into what happens under the covers when you go get modules (pun most definitely intended). We’ll compare what happens with and without setting the GOPROXY variable.

TL;DR Using GoCenter speeds up the download of your Go modules because the Go client downloads plain files over HTTP instead of cloning repositories, makes fewer HTTP calls, and doesn’t have to recreate the module on the client side.

Getting your imports

The first difference starts with the imports of libraries. Every app written with Go makes use of libraries. Those libraries could be the standard libs, like “fmt” or “io”, or ones that come from somewhere else. If you do not set the GOPROXY variable, and your package is in any of the predefined version control systems, the Go client will know how to resolve the URL to get the module without any problems. If your package isn’t in any of those systems, the Go client will perform a dynamic HTML check and look for a meta tag that tells it how and where to get the module. This approach needs HTTP redirects and HTML parsing. When you point your GOPROXY variable to GoCenter, the Go client sends one request to GoCenter to get the module information. Our first win is not needing complicated HTML and HTTP handling, but rather relying on one simple HTTP request. This especially adds up when you have a lot of Go modules that “live” outside the predefined version control systems.
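To make the meta tag lookup a bit more concrete, here is a minimal sketch of the kind of parsing the client has to do for a module outside the well-known hosts. It assumes the page carries a go-import meta tag whose content has the form "import-prefix vcs repo-root". The page, the example.org names and the parseGoImport helper are made up for illustration, and the regular expression is far more simplistic than what the real Go client does.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// goImport holds the three fields of a go-import meta tag.
type goImport struct {
	Prefix   string // import path prefix the tag applies to
	VCS      string // version control system, e.g. "git"
	RepoRoot string // URL of the repository
}

// metaRE is a deliberately simplified matcher: it only recognises the
// attribute order name, then content, with double quotes.
var metaRE = regexp.MustCompile(`<meta\s+name="go-import"\s+content="([^"]*)"`)

// parseGoImport extracts the first go-import meta tag from an HTML page.
func parseGoImport(page string) (goImport, error) {
	m := metaRE.FindStringSubmatch(page)
	if m == nil {
		return goImport{}, fmt.Errorf("no go-import meta tag found")
	}
	fields := strings.Fields(m[1])
	if len(fields) != 3 {
		return goImport{}, fmt.Errorf("go-import content has %d fields, want 3", len(fields))
	}
	return goImport{Prefix: fields[0], VCS: fields[1], RepoRoot: fields[2]}, nil
}

func main() {
	// Hypothetical page, as it might be served for
	// https://example.org/pkg/foo?go-get=1
	page := `<html><head>
<meta name="go-import" content="example.org/pkg/foo git https://code.example.org/r/foo">
</head></html>`

	imp, err := parseGoImport(page)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("prefix=%s vcs=%s repo=%s\n", imp.Prefix, imp.VCS, imp.RepoRoot)
}
```

Even in this stripped-down form, the client needs a page fetch plus parsing before it knows where the code lives. That is the work a single request to a proxy avoids.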

Getting the sources

The next step in the process is to get the module to your machine. Go modules are packaged in zip files and have a few additional metadata files, which we’ll get to later. If you do not set the GOPROXY variable, the Go client will perform a “git clone” and create a “bare” repository in $GOPATH/pkg/mod/cache/vcs. When you set the GOPROXY variable to GoCenter, the Go client will send an HTTP GET request and download the zip file. There we have the second win: downloading a file is generally a lot faster than performing a git clone. And because Go modules are mostly text files, which compress really well, you’ll save a lot of bandwidth (and time) on top of that.
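As a rough sketch, the URL for that GET request can be built as shown below. It assumes the proxy follows the GOPROXY protocol layout of proxy/module/@v/version.zip, with every upper-case letter in the path written as “!” followed by the lower-case letter. The host proxy.example.com is a placeholder, and zipURL and escape are illustrative helpers, not part of the Go toolchain. Unlike the real client, this sketch does no validation of the module path.

```go
package main

import (
	"fmt"
	"strings"
)

// escape applies the proxy protocol's case encoding: every ASCII
// upper-case letter becomes '!' followed by its lower-case form.
func escape(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			r += 'a' - 'A'
		}
		b.WriteRune(r)
	}
	return b.String()
}

// zipURL returns the URL a client would GET to download the zip
// for one version of a module from the given proxy.
func zipURL(proxy, module, version string) string {
	return fmt.Sprintf("%s/%s/@v/%s.zip",
		strings.TrimSuffix(proxy, "/"), escape(module), escape(version))
}

func main() {
	// proxy.example.com is a placeholder, not a real proxy.
	fmt.Println(zipURL("https://proxy.example.com", "github.com/BurntSushi/toml", "v0.3.1"))
}
```

The point is that the request is a plain, cacheable file download, with no version control protocol involved.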

Constructing the module

Next up is constructing the module so you can use it in your Go apps. As I said earlier, Go modules are zips and have some additional files. These are files like .info (containing the version number and commit time) and .mod (the module file for the module you downloaded). The Go client needs those files and the actual sources laid out in a specific way on your machine. Without the GOPROXY variable set, the Go client will have to generate these files itself and place them with the sources in the $GOPATH/pkg/mod/cache/download folder. It will also have to move the files to $GOPATH/pkg/mod/<vcs>/<module>. With the GOPROXY variable set, all the files were already downloaded in the previous step to $GOPATH/pkg/mod/cache/download. The Go client simply extracts the zip and places the source files, without the metadata files, in $GOPATH/pkg/mod/<vcs>/<module>. BOOM! Unzipping a file is a lot faster than getting a copy of a bare git repository, calculating the checksums, and generating the files.
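To give an idea of how small these metadata files are, here is a minimal sketch that reads a .info file. It assumes the file is a JSON object with a Version and a Time field, which is how the GOPROXY protocol describes it. The sample content and the parseInfo helper are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// info mirrors the two fields of a .info file.
type info struct {
	Version string    // version string, e.g. "v1.2.3"
	Time    time.Time // commit time
}

// parseInfo decodes the JSON content of a .info file.
func parseInfo(data []byte) (info, error) {
	var i info
	err := json.Unmarshal(data, &i)
	return i, err
}

func main() {
	// Hypothetical content of a v1.2.3.info file.
	data := []byte(`{"Version":"v1.2.3","Time":"2019-05-01T12:00:00Z"}`)

	i, err := parseInfo(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(i.Version, i.Time.Format(time.RFC3339))
}
```

With a proxy, the client only has to read a file like this. Without one, it has to work out the same information from the cloned repository.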

Summarizing the wins

Looking at the three wins, when you do not set the GOPROXY variable, the steps the Go client takes are:

  • Find the location and protocol to get the sources, which might need additional HTTP and HTML handling (see Getting your imports)
  • Perform a “git clone” and create a “bare” repository in $GOPATH/pkg/mod/cache/vcs
  • Generate the zip, calculate checksums, and generate metadata files and place them in $GOPATH/pkg/mod/cache/download
  • Place the source files, without the metadata files, in $GOPATH/pkg/mod/<vcs>/<module>

With the GOPROXY set, the Go client has to do a lot less:

  • Download the .mod, .info and .zip for a module and place them in $GOPATH/pkg/mod/cache/download (only 3 HTTP GET operations)
  • Unzip the sources to $GOPATH/pkg/mod/<vcs>/<module>
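
The two steps above map onto a handful of paths on disk, which can be sketched as follows. This assumes the layout used by current Go toolchains, where the three downloaded files sit in an @v folder under the download cache and the unzipped sources go into a directory named after the module and version. The helpers downloadFiles and sourceDir are illustrative, the module name is hypothetical, and the sketch assumes a module path without upper-case letters so that no case encoding is needed.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// downloadFiles lists where the three files fetched from the proxy
// land inside the download cache for one module version.
func downloadFiles(gopath, module, version string) []string {
	dir := filepath.Join(gopath, "pkg", "mod", "cache", "download",
		filepath.FromSlash(module), "@v")
	var files []string
	for _, ext := range []string{".info", ".mod", ".zip"} {
		files = append(files, filepath.Join(dir, version+ext))
	}
	return files
}

// sourceDir is where the unzipped sources end up.
func sourceDir(gopath, module, version string) string {
	return filepath.Join(gopath, "pkg", "mod", filepath.FromSlash(module)+"@"+version)
}

func main() {
	// Hypothetical GOPATH, module and version.
	gopath, module, version := "/home/gopher/go", "example.org/pkg/foo", "v1.0.0"

	for _, f := range downloadFiles(gopath, module, version) {
		fmt.Println("download:", f)
	}
	fmt.Println("sources: ", sourceDir(gopath, module, version))
}
```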

So with GoCenter, you remove the HTTP and HTML handling to find your modules, the “expensive” cloning of source code and the generation of metadata files.

Why don’t you take GoCenter for a test drive?

To see for yourself, you can run “go get -v ./...” with and without the GOPROXY set and see the difference in both your console and your disk (“$GOPATH/pkg/mod/cache”).
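
If you would rather script that comparison, a minimal sketch could look like the one below. It makes a few assumptions that are not part of the original instructions: it uses GOPROXY=direct for the run without a proxy (since Go 1.13 an unset GOPROXY falls back to a default proxy), it gives each run a fresh temporary GOPATH so nothing is served from an existing module cache, and it expects the proxy URL as the first command-line argument. The helper names withEnv and timeGoGet are made up.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
	"time"
)

// withEnv returns a copy of env in which key is set to value,
// replacing any existing entry for key.
func withEnv(env []string, key, value string) []string {
	out := make([]string, 0, len(env)+1)
	for _, kv := range env {
		if !strings.HasPrefix(kv, key+"=") {
			out = append(out, kv)
		}
	}
	return append(out, key+"="+value)
}

// timeGoGet runs "go get -v ./..." in the current directory with the
// given GOPROXY value and a fresh, empty GOPATH. It returns how long the
// command took and the GOPATH it used, which is left on disk so the
// module cache can be inspected afterwards.
func timeGoGet(goproxy string) (time.Duration, string, error) {
	gopath, err := os.MkdirTemp("", "gopath-")
	if err != nil {
		return 0, "", err
	}
	cmd := exec.Command("go", "get", "-v", "./...")
	cmd.Env = withEnv(withEnv(os.Environ(), "GOPROXY", goproxy), "GOPATH", gopath)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr

	start := time.Now()
	err = cmd.Run()
	return time.Since(start), gopath, err
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: compare <proxy-url>  (run from inside a Go module)")
		return
	}
	for _, proxy := range []string{"direct", os.Args[1]} {
		d, gopath, err := timeGoGet(proxy)
		if err != nil {
			fmt.Printf("GOPROXY=%s failed: %v\n", proxy, err)
			continue
		}
		fmt.Printf("GOPROXY=%s took %v (cache in %s)\n",
			proxy, d, filepath(gopath))
	}
}

// filepath returns the module cache location inside a GOPATH.
func filepath(gopath string) string {
	return gopath + string(os.PathSeparator) + "pkg" + string(os.PathSeparator) + "mod" + string(os.PathSeparator) + "cache"
}
```

Build it outside your project and run the binary from inside the module you want to measure. A single run is only a rough indication, because network conditions vary between runs.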

For questions or comments, feel free to leave a message here or drop me a note on Twitter!