Naming is Hard: The Quest for the Right Name for “Go Module Repository”


TL;DR: There is a lot of confusion about what to call the systems that store Go modules: cache, proxy, registry, or repository. After much deliberation, we have agreed upon “Go Module Repository”.

Names Matter!

You know the drill: naming is the single biggest reason for procrastination in software engineering, perhaps after expense reports. Coming up with a name for something can be extremely hard, and picking the right name from a set of proposed suggestions is only a little bit easier. Names also stick: what we pick in the beginning will be with us for years, so let’s try to pick the correct term for the systems that store Go modules. With JFrog Artifactory, The Athens Project, and GoCenter already released, and the whole vision of a “Go mirrors ecosystem” laid out by Google, it’s perfect timing (or maybe just a tiny bit too late?) to decide what to call these services.

This blog post reflects the discussion among The Athens Project maintainers and the JFrog Artifactory and GoCenter vendors and contributors. We hope that future service providers will agree with our arguments on the terminology. So, let’s get started.

So what is the Go Modules cache/proxy/registry/repository?

Is it a mirror?

“Go Module Mirrors” is the term Russ Cox used in his latest blog post to describe the systems that store Go modules. Maven used “mirror” back in 2005 in its `settings.xml` schema to describe a remote location of Maven dependencies. Thanks to the popularity of Maven, we all understand what that means, but we feel that the term is misleading. A “mirror” is an exact and up-to-date copy that contains all the content of “the original”. But what is the original? Go doesn’t have a canonical central repository which a “mirror” can, well, mirror.

What’s more, organizational repositories have no reason to contain everything from the upstream! That would be like having the entire GitHub in your $GOPATH. Why? Let’s just store what we need, bringing modules in automatically when needed.

Or a cache?

Another term we could use is “cache”. After all, those tools cache Go modules, don’t they? They do store modules like a cache would, but “cache” implies behavior that these tools don’t and shouldn’t have.

We hear things like “if it doesn’t work, clear your browser’s cache” and “delete your Maven ~/.m2 cache” all the time. Ouch! A cache is ephemeral and easily purgeable without any consequences except speed. Athens, JFrog Artifactory, and GoCenter cannot be ephemeral and are not easily purgeable by design. All of them promise immutability, which is impossible to achieve with an ephemeral cache.

Maybe a proxy?

This one was coined by the Go team in its naming of the `GOPROXY` environment variable, which defines a repository to fetch the modules from. Doesn’t `GOPROXY` actually define a proxy?

“Proxy” is technically the correct term for these systems, but it has other connotations in various technical communities which we think don’t fit the systems we’re talking about here.

Here is what comes to mind when we think of a proxy:

Proxies usually redirect to an upstream, subject to an include/exclude list. That’s only a small part of a repository’s functionality. Are we implying that the variable `GOPROXY` is ill-named? No! A Go module “proxy” is the lowest common denominator for the functionality required by `GOPROXY`, but this term is wrong for systems like Athens, JFrog Artifactory and GoCenter. These projects have a rich set of features on top of their proxying functionality, like redundant storage, RBAC, metadata, search and so on. We encourage you to take a look!
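For the curious, that “lowest common denominator” is small: the `go` command asks whatever `GOPROXY` points at for a handful of URLs per module, such as `<module>/@v/list` and `<module>/@v/<version>.info`, `.mod` and `.zip`, with uppercase letters in the path written as `!` plus the lowercase letter. The sketch below builds such URLs. The helper names and the `repo.example.com` host are our own assumptions, and the escaping is simplified: it skips the path validation that the real implementation in `golang.org/x/mod/module` performs.

```go
package main

import (
	"fmt"
	"strings"
)

// escape applies the case encoding used in proxy URLs: every ASCII uppercase
// letter becomes '!' followed by its lowercase form. Simplified sketch only;
// it does not validate the module path.
func escape(s string) string {
	var b strings.Builder
	for _, r := range s {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// listURL returns the URL that lists the known versions of a module.
func listURL(base, module string) string {
	return strings.TrimSuffix(base, "/") + "/" + escape(module) + "/@v/list"
}

// versionURL returns the URL of one file for a module version.
// ext is "info", "mod" or "zip".
func versionURL(base, module, version, ext string) string {
	return strings.TrimSuffix(base, "/") + "/" + escape(module) +
		"/@v/" + escape(version) + "." + ext
}

func main() {
	base := "https://repo.example.com" // hypothetical GOPROXY value
	module := "github.com/BurntSushi/toml"

	fmt.Println(listURL(base, module))
	for _, ext := range []string{"info", "mod", "zip"} {
		fmt.Println(versionURL(base, module, "v0.3.1", ext))
	}
}
```

Anything that answers these requests can sit behind `GOPROXY`. Everything else a repository does, such as storage guarantees, access control and search, lives on top of that.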

Or perhaps a registry?

“Registry” is another extremely popular term to describe a source and a target for artifacts. npm uses this term, for example. But what are the properties of the npm registry? It’s the single definitive source of packages, with no upstream. Athens, JFrog Artifactory and GoCenter have (or at least can have and probably should have) an upstream, and modules come from all over. So “registry” won’t work.

The key differentiator between these systems and registries is federation. Registries are a single origin and database for code. Athens, JFrog Artifactory, and GoCenter are databases for code that comes from a multitude of other sources – GitHub, GitLab, other version control systems, or even other Athens and Artifactory instances.

It’s a Repository!

So, that leaves us with the term in the title of this blog post. We believe that Athens, JFrog Artifactory and GoCenter are Repositories of Go Modules. They have the basic functionality of a proxy plus more; they cache modules but don’t purge or expire the cache entries; and they have more logic than that, too. “Repository” is a well-defined term that explains the functionality of those tools well enough. We feel that it’s the right term. Russ uses the term “repositories” as well when talking about Athens and JFrog Artifactory, even in the official Go Modules documentation! Looks like we are on the right track.

So there you have it, long live Go Module Repositories!