Filestore Management In The Age of Petabytes

Artifactory 4.6 was released last week. Along with adding Google Cloud Storage to the already extensive family of storage providers, it introduces support for the most complex storage configuration needs of any company in today’s world of binaries management. This makes your filestore management much more reliable and flexible, letting you mix n’ match the storage providers and features you want in your filestore.

(Drum roll…) Filestore sharding

Sharding is the jewel in the crown of this release, giving you the best a filestore can offer in any repository manager. A sharded filestore lets you configure any number of mounts, and store files with any degree of redundancy. Redundancy of your binaries means your filestore is extremely reliable and can withstand an outage of any mount with no downtime! Scaling your filestore is no longer a long, complex process involving hours or days of data transfer. Just add as many mounts as you need to your filestore chain, and Artifactory will automatically balance your storage behind the scenes to include the new mounts. And you have full control over how your filestore behaves, with parameters to configure read behavior, write behavior, data balancing and more.
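To make the idea of redundancy across mounts concrete, here is a minimal Python sketch. It is a toy model, not Artifactory's implementation: the `Mount` and `ShardedStore` classes, the "most free space first" write policy, and the decision to reject writes when too few mounts are online are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Mount:
    """Hypothetical stand-in for one filestore mount."""
    name: str
    capacity: int                       # total bytes the mount can hold
    online: bool = True
    files: dict = field(default_factory=dict)

    @property
    def free(self) -> int:
        return self.capacity - sum(len(b) for b in self.files.values())


class ShardedStore:
    """Toy model: every file is written to `redundancy` different mounts."""

    def __init__(self, mounts, redundancy=2):
        if redundancy > len(mounts):
            raise ValueError("redundancy cannot exceed the number of mounts")
        self.mounts = list(mounts)
        self.redundancy = redundancy

    def write(self, key: str, data: bytes) -> list:
        # Assumed write policy: prefer the online mounts with the most free space.
        online = [m for m in self.mounts if m.online]
        if len(online) < self.redundancy:
            raise OSError("not enough online mounts for the requested redundancy")
        targets = sorted(online, key=lambda m: m.free, reverse=True)[: self.redundancy]
        for m in targets:
            m.files[key] = data
        return [m.name for m in targets]

    def read(self, key: str) -> bytes:
        # Any online mount that holds a copy can serve the read.
        for m in self.mounts:
            if m.online and key in m.files:
                return m.files[key]
        raise FileNotFoundError(key)

    def add_mount(self, mount: Mount) -> None:
        # Scaling out is just registering another mount; existing copies stay readable.
        self.mounts.append(mount)


if __name__ == "__main__":
    mounts = [Mount("m1", 100), Mount("m2", 100), Mount("m3", 100)]
    store = ShardedStore(mounts, redundancy=2)
    placed = store.write("lib.jar", b"binary-content")
    mounts[0].online = False            # simulate an outage of one mount
    print(placed, store.read("lib.jar"))
```

Because each file lives on two mounts in this sketch, taking any single mount offline still leaves a readable copy. Rebalancing existing files onto a newly added mount is left out to keep the example short.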

Customized filestore management

Until now, the options you had to configure your filestore pretty much amounted to the filesystem, fullDB, cachedFS, and s3. Version 4.6 introduces a whole new mechanism for filestore management which gives you the freedom to decide how your filestore behaves. The mechanism is based on “binary providers” and chains.

Mix n’ match

Using a simple XML configuration file you can implement an advanced storage solution that is tailor made for your needs. To customize the functionality you want from your filestore, you can chain a set of binary providers together. For example, you can configure a cached file system in front of a shard that will include multiple mounts of different providers, each with its own relevant chain. You can even configure read and write priority rules that will attempt to read a file from a local disk before trying to read it from a remote S3 bucket.

Here is an example of a filestore that combines a cached file system in front of S3 storage with an eventually persistent volume and retries. With S3 as the backing store, this improves performance because popular files are served from the cache, while write operations are handled smartly by the eventually persistent volume:

<config version="v1">
<chain template="s3"> <!-- This filestore is based on the default S3 chain, which uses JetS3t -->
   <provider id="cache-fs" type="cache-fs">   <!-- It first tries to read from the cache -->
       <provider id="eventual" type="eventual">   <!-- It is eventually persistent, so writes are queued and then written to persistent storage -->
           <provider id="retry" type="retry">     <!-- If a read or write fails, retry -->
               <provider id="s3" type="s3"/>      <!-- Actual storage is S3 -->
           </provider>
       </provider>
   </provider>
</chain>

<provider id="s3" type="s3">
    <identity>test</identity> <!-- Credentials and endpoint for your Amazon S3 storage -->
</provider>
</config>
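The way a read travels down such a chain can be sketched in a few lines of Python. This is only a conceptual model under stated assumptions: the `Cache`, `Retry`, and `DictStore` classes are hypothetical names that stand in for the cache-fs, retry, and S3 providers, and the eventually persistent layer is left out for brevity.

```python
class DictStore:
    """Hypothetical terminal provider, standing in for remote object storage."""

    def __init__(self, fail_first=0):
        self.blobs = {}
        self.reads = 0
        self._failures_left = fail_first    # number of simulated transient errors

    def read(self, key):
        self.reads += 1
        if self._failures_left > 0:
            self._failures_left -= 1
            raise ConnectionError("transient storage error")
        return self.blobs[key]

    def write(self, key, data):
        self.blobs[key] = data


class Retry:
    """Retries a failed read on the next provider, up to `attempts` times."""

    def __init__(self, nxt, attempts=3):
        self.nxt = nxt
        self.attempts = max(1, attempts)

    def read(self, key):
        last_error = None
        for _ in range(self.attempts):
            try:
                return self.nxt.read(key)
            except ConnectionError as err:
                last_error = err
        raise last_error

    def write(self, key, data):
        self.nxt.write(key, data)


class Cache:
    """Serves repeated reads locally and only falls through on a miss."""

    def __init__(self, nxt):
        self.nxt = nxt
        self.local = {}

    def read(self, key):
        if key not in self.local:
            self.local[key] = self.nxt.read(key)
        return self.local[key]

    def write(self, key, data):
        self.local[key] = data
        self.nxt.write(key, data)


if __name__ == "__main__":
    remote = DictStore(fail_first=1)        # first remote read fails once
    chain = Cache(Retry(remote))            # same nesting order as the XML chain
    remote.blobs["app.war"] = b"payload"
    chain.read("app.war")                   # miss: goes through retry to the remote
    chain.read("app.war")                   # hit: served from the cache
    print("remote reads:", remote.reads)
```

Each layer only knows about the next provider in line, which is what makes it possible to rearrange or swap providers without touching the others.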

From this version on, Artifactory filestore management will never be the same. With more and more enterprise cloud providers entering the scene, and advanced filestore configuration with chains and templates, Artifactory offers unprecedented freedom in how you set up your filestore, and unprecedented stability and reliability with filestore sharding.

A meeting of giants

Artifactory has supported S3 object storage since version 3.6. In the spirit of being a universal repository, not only with package managers but with the leading storage providers as well, we are in the process of adding support for additional enterprise cloud storage providers. In this version, we added support for Google Cloud Storage, letting you choose which giant to use as your filestore. Both providers offer a similar set of benefits, and your choice of which one to use is likely to be influenced by factors outside of the Artifactory domain.

What else?

While filestore took front and center in this release, life with Artifactory does not revolve solely around storage.

Docker meets

It’s not every day you add a new domain to your name card; there has to be at least one good reason. Our reason (well, one of them) is to make it easier to use Docker with Artifactory Online.

From now on, you can define as many Docker repositories as you want on your Artifactory Online server, and access them through {account_name}-{repo-key}

Bower repositories and registries

Bower is not new to Artifactory. With version 4.6, Artifactory is also a private Bower registry. This means you can register your Bower packages through remote and virtual repositories in Artifactory, and retrieve them directly from your private Git repositories.

We understand that avoiding vendor lock-in is just as important with your filestore as it is with your package manager, build tool and CI server. In this version we added Google Cloud Storage and advanced filestore management. Support for more object storage providers is coming soon.

Ready to optimize your filestore? Download Artifactory now.