What kind of storage solution does Artifactory implement?


Artifactory implements checksum-based storage in order to store artifacts in a resource-efficient way.


When a file is uploaded to Artifactory, its SHA-1 checksum is calculated and the file is renamed to that checksum. It is then stored in the configured filestore, in a directory named after the first two characters of the checksum. For example, a file whose checksum is "ac3f5e56…" is stored in directory "ac", a file whose checksum is "dfe12a4b…" is stored in directory "df", and so forth. The example below shows the "d4" directory, which contains two files whose checksums begin with "d4".
[Image: the "d4" filestore directory containing two files whose names begin with "d4"]
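
The layout described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Artifactory's actual code: the function names `sha1_of` and `store`, the file name `example.jar`, and the temporary filestore directory are all hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha1_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store(filestore: Path, upload: Path) -> Path:
    """Copy `upload` to <filestore>/<first two checksum chars>/<checksum>."""
    checksum = sha1_of(upload)
    target_dir = filestore / checksum[:2]
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / checksum
    if not target.exists():  # identical content is already in the filestore
        shutil.copyfile(upload, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        upload = root / "example.jar"  # hypothetical uploaded file
        upload.write_bytes(b"hello")
        stored = store(root / "filestore", upload)
        # Prints the two-character directory followed by the full checksum.
        print(stored.relative_to(root / "filestore"))
```

Note that the original file name does not appear anywhere in the filestore; only the checksum does.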

In parallel, Artifactory creates a database entry that maps the file's checksum to the path it was uploaded to in a repository. Storing binaries this way optimizes many operations in Artifactory (for example, copying or moving an artifact), since they are implemented as simple database transactions rather than by actually manipulating files.
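
To make the path-to-checksum mapping concrete, here is a minimal sketch using an in-memory SQLite database. The `nodes` table, its columns, and the function names are assumptions made for illustration and do not reflect Artifactory's real schema. The point is that copy and move touch only database rows, never the stored binary.

```python
import sqlite3


def open_index() -> sqlite3.Connection:
    """Create an in-memory index; the `nodes` table is a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (repo_path TEXT PRIMARY KEY, sha1 TEXT NOT NULL)"
    )
    return conn


def record_upload(conn: sqlite3.Connection, repo_path: str, sha1: str) -> None:
    """Map a repository path to the checksum of the uploaded file."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO nodes VALUES (?, ?)", (repo_path, sha1))


def copy_artifact(conn: sqlite3.Connection, src: str, dst: str) -> None:
    """Copy = add a second row pointing at the same checksum."""
    with conn:
        conn.execute(
            "INSERT INTO nodes (repo_path, sha1) "
            "SELECT ?, sha1 FROM nodes WHERE repo_path = ?",
            (dst, src),
        )


def move_artifact(conn: sqlite3.Connection, src: str, dst: str) -> None:
    """Move = rewrite the path on the existing row."""
    with conn:
        conn.execute(
            "UPDATE nodes SET repo_path = ? WHERE repo_path = ?", (dst, src)
        )


def filestore_location(conn: sqlite3.Connection, repo_path: str):
    """Return '<first two chars>/<checksum>' for a path, or None if unknown."""
    row = conn.execute(
        "SELECT sha1 FROM nodes WHERE repo_path = ?", (repo_path,)
    ).fetchone()
    return None if row is None else f"{row[0][:2]}/{row[0]}"


if __name__ == "__main__":
    index = open_index()
    sha1 = "aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d"  # SHA-1 of b"hello"
    record_upload(index, "libs-release/app/1.0/app.jar", sha1)
    copy_artifact(index, "libs-release/app/1.0/app.jar", "libs-archive/app.jar")
    print(filestore_location(index, "libs-archive/app.jar"))
```

Both repository paths in the example resolve to the same filestore location, so the copy costs one row rather than a second copy of the binary.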

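Because the checksum is effectively unique to a file's content, this way of storing and managing files prevents duplicates: identical files share one checksum and are stored only once, no matter how many repository paths point to them. It also guards against file corruption, since a stored file whose content no longer matches its checksum can be identified.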
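
The deduplication and integrity properties can be demonstrated with a small in-memory model. The class `ChecksumStore` and its methods are hypothetical names used only for this sketch, and the corruption is simulated by overwriting the stored bytes directly.

```python
import hashlib


class ChecksumStore:
    """Toy content-addressed store: one blob per distinct checksum."""

    def __init__(self) -> None:
        self.blobs = {}  # checksum -> stored bytes
        self.paths = {}  # repository path -> checksum

    def put(self, repo_path: str, data: bytes) -> str:
        """Store `data` under its SHA-1; identical content is kept once."""
        checksum = hashlib.sha1(data).hexdigest()
        self.blobs.setdefault(checksum, data)
        self.paths[repo_path] = checksum
        return checksum

    def verify(self, checksum: str) -> bool:
        """Recompute the SHA-1 of the stored blob and compare it to its key."""
        return hashlib.sha1(self.blobs[checksum]).hexdigest() == checksum


if __name__ == "__main__":
    store = ChecksumStore()
    first = store.put("repo-a/lib.jar", b"same bytes")
    second = store.put("repo-b/copy-of-lib.jar", b"same bytes")
    print(first == second, len(store.blobs))  # same checksum, single blob
    store.blobs[first] = b"tampered"  # simulate corruption on disk
    print(store.verify(first))  # the mismatch is detected
```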