How go.mod files work in GoCenter Go modules repository

UPDATE: As of May 1, 2021 – GoCenter central repository has been sunset and all features deprecated. For more information on the sunsetting of the centers read the deprecation blog post

Every science fiction fan knows that you can’t fix the past. Try to, and it wrecks the balance of events that led to the present. That’s also what we learned while making GoCenter. While trying to correct an early Go modules design choice and improve certainty, we accidentally made things harder for the Go community. So, like every fictional time traveler, we had to retrace our steps to undo the change.

We learned some tough lessons on our GoCenter journey. By telling that story, we can explain our original reasoning, share what we learned, and suggest better practices for creating Go modules for the future.

Background

Go Modules were introduced in Go 1.11 to add package versioning and dependency management to the Go ecosystem. Using Go Modules, Go developers can declare project requirements in their go.mod files and specify which versions of which packages are part of their builds.

When developers started adopting Go Modules in their projects, they found that many of their dependencies had not opted in and provided no go.mod file describing their requirements. When Go encounters a module in that state, it automatically creates a go.mod file with no requirements for that module. This missing requirements information had to be addressed for Go Modules to achieve reproducible builds. Go Modules handles it by recording any transitive dependencies that no go.mod file in the dependency tree declares in the go.mod file of the user’s module, marked with the // indirect comment. While this helps a single module achieve reproducible builds, it cannot prevent inconsistent behavior between different modules. In other words, two modules can depend on the same direct dependency yet resolve different indirect dependencies simply because they were created at different points in time.
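To make the // indirect marker concrete, here is a minimal sketch in Go. The go.mod content, module paths, and versions are made up for illustration, and the scanner is a naive line-based helper written for this example, not the parser used by the Go toolchain.

```go
package main

import (
	"fmt"
	"strings"
)

// exampleGoMod is a hypothetical go.mod; the module paths and versions
// are invented for illustration only.
const exampleGoMod = `module example.com/app

go 1.12

require (
	example.com/direct v1.2.0
	example.com/transitive v0.3.1 // indirect
)
`

// indirectRequirements returns the "path version" pairs that carry the
// "// indirect" comment. It is a naive line scanner, not a full parser.
func indirectRequirements(goMod string) []string {
	var out []string
	for _, line := range strings.Split(goMod, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasSuffix(line, "// indirect") {
			continue
		}
		line = strings.TrimSpace(strings.TrimSuffix(line, "// indirect"))
		// Also accept the single-line form: require path version // indirect
		line = strings.TrimSpace(strings.TrimPrefix(line, "require "))
		out = append(out, line)
	}
	return out
}

func main() {
	for _, r := range indirectRequirements(exampleGoMod) {
		fmt.Println("indirect:", r)
	}
}
```

Here example.com/transitive is not imported by the main module itself. It is listed only because the direct dependency does not declare it in a go.mod file of its own.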

Early this year, JFrog presented GoCenter, the public central Go Modules repository. In GoCenter, Go Modules are immutable and always available. The Go community can use GoCenter to resolve module requirements and bring the CI/CD pipelines of their Go projects to the next level.

When we started converting projects to Go Modules to be served by GoCenter, we ran into the same missing go.mod file issues that developers faced. We understood how to use the Go Modules tools to achieve reproducible builds, but we were not satisfied with the potential for inconsistent behavior between modules. So we decided to look for a way to have every Go module served by GoCenter contain a go.mod file that described its requirements.

Unhappy With History

To make sure that the go.mod requirements match the source code of the module, Go Modules provides the go mod tidy command. It adds any dependency that the source code uses but that is missing from the list of requirements in the go.mod file, which makes it useful for building the requirements list when converting a project to Go Modules.

Using that command, our first approach to having every module in GoCenter declare its requirements in a go.mod file was to tidy all modules that did not contain one.

The first issue with that approach is that, when no version information about the dependencies is available (there is more on this in our next approach), go mod tidy points the require statements to the latest available version of each dependency. This can create broken Go Modules, since we don’t know whether those versions are compatible with each other. The problem is easier to see once we consider that pointing to the latest version of the dependencies can create relationships between modules that could never have existed in time. In other words, by tidying the modules we could end up with a module depending on another one that was created in its future. The diagram below illustrates that:

This issue gets even worse when we consider circular references. By tidying the modules that are members of a loop, we could make a module not resolvable at all, since it would depend on a newer version of itself and cause the Go client to panic. In the example below, module A@v1.0.0 would never be resolvable according to the tidy-produced relationship, since it would require A@v1.1.0 as a transitive dependency.
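The time paradox can be sketched in a few lines of Go. The types, module names, and publication dates below are all hypothetical and chosen only to mirror the A@v1.0.0 and A@v1.1.0 scenario. The check flags any requirement whose target was published after the version that requires it.

```go
package main

import (
	"fmt"
	"time"
)

// release is a hypothetical record of a module version and its publish date.
type release struct {
	Module, Version string
	Published       time.Time
}

// requirement is an edge produced by tidying: From requires To.
type requirement struct {
	From, To release
}

// paradoxes returns the requirements whose target was published after the
// version that requires it, so it could not have existed at that time.
func paradoxes(reqs []requirement) []requirement {
	var out []requirement
	for _, r := range reqs {
		if r.To.Published.After(r.From.Published) {
			out = append(out, r)
		}
	}
	return out
}

// date is a small helper that builds a UTC midnight timestamp.
func date(y int, m time.Month, d int) time.Time {
	return time.Date(y, m, d, 0, 0, 0, 0, time.UTC)
}

func main() {
	// Invented dates: A@v1.0.0 predates B@v1.0.0, which predates A@v1.1.0.
	a100 := release{Module: "A", Version: "v1.0.0", Published: date(2017, 1, 10)}
	b100 := release{Module: "B", Version: "v1.0.0", Published: date(2017, 3, 5)}
	a110 := release{Module: "A", Version: "v1.1.0", Published: date(2018, 6, 1)}

	reqs := []requirement{
		{From: a100, To: b100}, // tidying A today picks the latest B
		{From: b100, To: a110}, // tidying B today picks the latest A
	}
	for _, p := range paradoxes(reqs) {
		fmt.Printf("%s@%s requires %s@%s, which did not exist yet\n",
			p.From.Module, p.From.Version, p.To.Module, p.To.Version)
	}
}
```

With these dates both edges are flagged, and following them from A@v1.0.0 leads through B@v1.0.0 back to A@v1.1.0, a newer version of the module we started from.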

At this point we realized that it is very hard to fix the past. Taking time into account while tidying the modules would have added unwanted complexity to the process, so we decided that this approach was not feasible.

Trying to Fix the Past

Before Go Modules were established, the Go community used other dependency management tools to describe their project requirements. Examples of those tools are dep, glide, and govendor. Those tools provided descriptor files where the authors described their intent in terms of which other projects and versions were required as part of their builds.

To facilitate and speed up the adoption of Go Modules, Go provides the go mod init command. This command can parse the dependency descriptor files used by those other dependency management tools and create a go.mod file with require statements that describe the dependencies declared there.

After we realized that we could not fix the past completely, we decided that we should at least fulfill the author’s intent declared in other dependency management tools when converting projects to Go Modules to be served by GoCenter. If there was no provided go.mod file in the module, we would run go mod init to create and populate it with dependencies described by the authors in another dependency descriptor file supported by the command. This would fix part of the past where we had enough information available to avoid the issues related to the previous approach.

The first issue caused by this approach is related to the verification process for Go Modules. Go Modules introduced the go.sum file which contains cryptographic checksums for both the go.mod file and the zip archive containing the module’s packages. These checksums are used to validate future downloads of the requirements and detect unexpected changes in the content. By running go mod init (or even go mod tidy, but we did not realize that at the time) on the projects and generating a GoCenter version of the go.mod file, we would be introducing checksum mismatches that would happen depending on the source used to resolve dependencies.
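The mismatch is easy to reproduce with a short Go sketch. The function below follows our understanding of how the "h1:" hash recorded for a go.mod file is computed (a SHA-256 over a one-line summary of the file, base64-encoded). Treat that as an assumption and check the Go toolchain sources before relying on it. The module paths are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// goModHash computes an "h1:"-style hash for the content of a go.mod file.
// Assumed scheme: SHA-256 of the line "<hex sha256 of content>  go.mod\n",
// encoded with standard base64.
func goModHash(content []byte) string {
	fileSum := sha256.Sum256(content)
	summary := fmt.Sprintf("%x  %s\n", fileSum[:], "go.mod")
	total := sha256.Sum256([]byte(summary))
	return "h1:" + base64.StdEncoding.EncodeToString(total[:])
}

func main() {
	// What the Go client synthesizes when the module has no go.mod in VCS.
	fromVCS := []byte("module example.com/lib\n")
	// A hypothetical go.mod generated by a repository for the same version.
	generated := []byte("module example.com/lib\n\nrequire example.com/dep v1.0.0\n")

	a, b := goModHash(fromVCS), goModHash(generated)
	fmt.Println("from VCS: ", a)
	fmt.Println("generated:", b)
	fmt.Println("match:", a == b)
}
```

Any difference in the bytes of the go.mod file, even an added require line with good intent behind it, produces a different hash, so a go.sum entry written against one source fails verification against the other.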

Checksum mismatches should never be something users expect, even when there is good intent behind them. Having “expected” checksum mismatch scenarios like this can lead users to ignore the verification process completely, which can cause big problems when a real threat appears.

Besides that, having two versions of a go.mod file, one with requirements provided by GoCenter and another with no requirements synthesized by Go when resolving the module from VCS, could completely change the dependency tree resolved for a module.

These two issues work against the interoperability we want between Go module mirrors and make it hard for users to switch between different public Go Modules repositories. Because of this, we decided to revert this feature.

Setting Things Right

Once we understood that changing anything in the past could have big undesired consequences, we decided to focus on providing an experience that was in sync with the behavior users already knew.

To achieve that, for modules that do not have a go.mod file, GoCenter will serve one with no requirements. This is the same behavior provided by the Go client when resolving modules from VCS. For modules that already have a go.mod file, GoCenter will serve them as is. This approach removes all the pain caused by our attempts to fix the past while improving the interoperability between GoCenter and other public Go Modules registries.
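The serving rule above can be sketched as follows. The function names are hypothetical and this is not GoCenter’s actual code. It assumes that a “no requirements” go.mod consists of a single module line, which is our understanding of what the Go client synthesizes.

```go
package main

import "fmt"

// syntheticGoMod returns a go.mod that declares only the module path and
// no requirements. The single-line format is an assumption for this sketch.
func syntheticGoMod(modulePath string) []byte {
	return []byte("module " + modulePath + "\n")
}

// serveGoMod returns the author's go.mod untouched when one exists, and a
// synthetic "no requirements" go.mod otherwise.
func serveGoMod(modulePath string, original []byte, hasGoMod bool) []byte {
	if hasGoMod {
		return original
	}
	return syntheticGoMod(modulePath)
}

func main() {
	authored := []byte("module example.com/lib\n\nrequire example.com/dep v1.0.0\n")

	fmt.Printf("%q\n", serveGoMod("example.com/lib", authored, true))
	fmt.Printf("%q\n", serveGoMod("example.com/legacy", nil, false))
}
```

Because the bytes served are either the author’s own or the same ones the Go client would generate, checksums stay consistent regardless of where the module is resolved from.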

When processing and validating a module, GoCenter still uses go mod commands like go mod tidy and go mod graph to discover all the requirements that need to be available for a module to be fully resolvable, but we do not use the results of those commands to change go.mod files served by GoCenter.

Because we had to make this change after GoCenter went live, we needed to clean up several go.mod files generated by GoCenter and replace them with their “no requirements” version. You can get more information about it here.

Facing the Future

We learned some lessons by trying to fix the past:

Since we cannot travel back in time, it’s hard to recreate the context in which things happened;
Because of that, it is hard to avoid running into paradoxes that produce unpredictable results (like a module with a dependency that was created in the future);
And that makes it really hard to predict all the consequences of a change.

To avoid having to go through all that again in the future, we suggest these practices for authors of open source Go projects, to improve how the community discovers and consumes Go dependencies.

Adopt Go Modules

From Go 1.13, Go Modules will be the default dependency management tool and the previous GOPATH method will be deprecated. Project authors should move from other tools to Go Modules. If some of your dependencies are not converted to Go Modules yet, ask the authors to adopt it. Providing a go.mod file with the right list of requirements is the only way for us to achieve reproducible builds in any context or scenario where our modules are being used.

Both the Go authors and other members of the community, like GoCenter, have planned several other features around Go Modules. You will not be able to benefit from those unless you get on board with Go Modules.

Avoid Using Pseudo-versions

Commit hashes are a VCS concept, not a dependency management one. They cause confusion and make it harder to follow the progression of versions over time. Commit hash pseudo-versions were introduced to bring Go Modules support to untagged projects, and they should be used only as a fallback mechanism. If the dependencies you need have release tags, use those tags in your require statements. If they do not, ask the authors to start tagging their releases. That brings us to the next item.

Tag Releases Following Semantic Versioning Rules

While you can still use commit hash pseudo-versions with Go Modules, you should always create semantically valid tags for your releases. Besides the module compatibility benefits provided by semantic versioning, tagging your releases makes it easier for users and services like GoCenter to detect new versions of your modules. This can reduce the time it takes for the community to become aware of your changes and fixes.
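To illustrate the difference between a release tag and a pseudo-version, here is a small Go sketch that classifies version strings. Both regular expressions are simplified patterns written for this example and are assumptions, not the exact rules used by the Go toolchain. The commit hashes and timestamps are invented.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// semverTag is a simplified pattern for full vMAJOR.MINOR.PATCH tags
	// with optional pre-release and build metadata.
	semverTag = regexp.MustCompile(`^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?(\+[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?$`)

	// pseudoVersion is a simplified pattern for versions that end in a
	// 14-digit timestamp and a 12-character commit hash prefix.
	pseudoVersion = regexp.MustCompile(`^v\d+\.\d+\.\d+-(?:[0-9A-Za-z.-]*\.)?\d{14}-[0-9a-f]{12}(?:\+incompatible)?$`)
)

// classify reports whether v looks like a pseudo-version, a release tag,
// or neither. Pseudo-versions are checked first because they are also
// syntactically valid semantic versions.
func classify(v string) string {
	switch {
	case pseudoVersion.MatchString(v):
		return "pseudo-version"
	case semverTag.MatchString(v):
		return "release tag"
	default:
		return "not a valid tag"
	}
}

func main() {
	for _, v := range []string{
		"v1.2.3",
		"v2.0.0-rc.1",
		"v0.0.0-20190101000000-abcdefabcdef",
		"v1.2.4-0.20190101000000-abcdefabcdef",
		"1.2.3",
		"release-1.0",
	} {
		fmt.Printf("%-40s %s\n", v, classify(v))
	}
}
```

A require statement that points at a "release tag" value tells readers which release is in use, while a pseudo-version only identifies a commit and a point in time.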

Avoid Using Replace Statements

In your journey to adopt Go Modules, you might be tempted to use replace statements to avoid fixing import statements spread all over your source code, especially if you are moving away from vendoring your dependencies. While this is a valid and supported technique, replace statements only take effect in the main module. They are ignored when your module is consumed as a dependency, which leaves your module broken for those users.

It might be a boring and painful process, but fixing those imports and removing the replace statements is the only way to make your module consumable as a dependency by all users and without requiring any additional steps.
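A quick way to spot leftover replace statements before publishing is to scan the go.mod file for them. The sketch below is a naive line scanner written for this example, with a made-up go.mod as input. It handles the single-line and parenthesized block forms only and is not a substitute for a real go.mod parser.

```go
package main

import (
	"fmt"
	"strings"
)

// exampleGoMod is a hypothetical go.mod containing replace directives.
const exampleGoMod = `module example.com/app

require example.com/lib v1.4.0

replace example.com/lib => ../lib

replace (
	example.com/old => example.com/new v1.0.0
)
`

// replaceDirectives returns the "old => new" part of every replace
// directive found in goMod, in order of appearance.
func replaceDirectives(goMod string) []string {
	var out []string
	inBlock := false
	for _, line := range strings.Split(goMod, "\n") {
		// Drop trailing comments and surrounding whitespace.
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i]
		}
		line = strings.TrimSpace(line)
		switch {
		case line == "":
			continue
		case inBlock && line == ")":
			inBlock = false
		case inBlock:
			out = append(out, line)
		case line == "replace (":
			inBlock = true
		case strings.HasPrefix(line, "replace "):
			out = append(out, strings.TrimSpace(strings.TrimPrefix(line, "replace ")))
		}
	}
	return out
}

func main() {
	for _, r := range replaceDirectives(exampleGoMod) {
		fmt.Println("replace found:", r)
	}
}
```

In this example the local path ../lib exists only on the author’s machine, so anyone who depends on example.com/app would resolve example.com/lib v1.4.0 from its original location instead.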

Wrapping Up

With these simple insights from our GoCenter journey, we hope we can illuminate past steps, and make the road forward with Go modules a lot less bumpy for the entire Go community.
