Deploy Large Files Using Multi-Part Upload

For large files, Artifactory provides a fast and reliable multi-part upload mechanism through the JFrog CLI. The main advantage of multi-part upload is its retry mechanism: if an upload fails, it resumes from the point of failure, preserving all content uploaded before the failure. With the standard upload, by contrast, a failure loses all uploaded data and the upload must restart from the beginning.

The JFrog CLI automatically uses multi-part upload for large files without any user intervention, according to the values of the --min-split, --split-count, and --chunk-size settings.

  • The --min-split setting determines the minimum file size required for multi-part upload. Its default value is 200 MB.

  • The --split-count setting determines the number of parts that can be concurrently uploaded per file during a multi-part upload. Its default value is 5.

  • The --chunk-size setting determines the size, in MB, of each part that is uploaded concurrently. Its default value is 20.

You can change the values for --min-split, --split-count, and --chunk-size when you run an upload command in the CLI. For example:

jf rt upload filename.zip target-repo/ --min-split=150 --split-count=6 --chunk-size=25

Values set with --min-split, --split-count, and --chunk-size apply only to the command in which they are specified; after the command completes, these settings revert to their defaults. For more information on uploading files with the JFrog CLI, see the JFrog CLI documentation.
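
To illustrate how the settings interact, assume a hypothetical 500 MB archive named large-app.tar.gz uploaded with the defaults: the file exceeds the 200 MB --min-split threshold, so it is split into 25 parts of 20 MB each and uploaded 5 parts at a time.

jf rt upload large-app.tar.gz target-repo/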

Note

Multi-part upload requires Artifactory version 7.90.7 or later and JFrog CLI version 2.62.2 or later.

Configuring Multi-part Upload for Self-Hosted Platforms

Multi-part upload is available on Self-Hosted platforms using S3 Storage or Google Cloud Storage configured in binarystore.xml.

Note

Multi-part upload is not supported on Self-Hosted platforms with S3-Sharding or when S3 storage is configured with Client-side KMS encryption (kmsClientSideEncryptionKeyId).
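
For reference, the following is a minimal sketch of a binarystore.xml using the direct S3 template; the endpoint, bucket name, region, and filestore path are placeholder values, and your deployment's chain template may differ:

<config version="2">
    <chain template="s3-storage-v3-direct"/>
    <provider id="s3-storage-v3" type="s3-storage-v3">
        <endpoint>s3.amazonaws.com</endpoint>
        <bucketName>my-artifactory-bucket</bucketName>
        <region>us-east-1</region>
        <path>artifactory/filestore</path>
    </provider>
</config>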

To enable multi-part upload on Self-Hosted platforms, set artifactory.multipart.upload.enabled = true.
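
artifactory.multipart.upload.enabled is a system property; assuming a default installation layout, it can be added to the artifactory.system.properties file (typically $JFROG_HOME/artifactory/var/etc/artifactory/artifactory.system.properties), followed by a restart of Artifactory:

artifactory.multipart.upload.enabled = true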

Multi-part Upload with Encryption using a KMS Key

To perform multi-part upload with encryption using a KMS key, you must have permission for the kms:Decrypt and kms:GenerateDataKey actions on the key. These permissions are required because Amazon S3 must decrypt and read data from the encrypted file parts before it completes the multi-part upload.

If your IAM user or role is in the same AWS account as the KMS key, then you must have these permissions on the key policy. If your IAM user or role is in a different AWS account from the KMS key, then you must have permissions on both the key policy and your IAM user or role.
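
As a sketch, a key policy statement granting these actions might look like the following; the account ID and role name are placeholders for your own values:

{
  "Sid": "AllowArtifactoryMultipartUpload",
  "Effect": "Allow",
  "Principal": { "AWS": "arn:aws:iam::111122223333:role/artifactory-role" },
  "Action": [ "kms:Decrypt", "kms:GenerateDataKey" ],
  "Resource": "*"
}

In a key policy, "Resource": "*" refers to the KMS key the policy is attached to.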

How does it work?

The diagram below shows the sequence of the events that occur when a user performs a multi-part upload.

[Diagram: multi-part upload sequence of events]
  1. The user employs the JFrog CLI to submit an upload request.

  2. Artifactory authenticates the request.

  3. JFrog CLI splits the file into parts and sends a request to Artifactory for five pre-signed URLs for the first five parts.

    Note

    The default number of parts that can be uploaded concurrently is five, but this number can be modified with the --split-count setting in the JFrog CLI.

  4. Artifactory sends the pre-signed URLs to JFrog CLI.

  5. JFrog CLI concurrently uploads the first five parts directly to the S3 bucket.

  6. Steps 3-5 are repeated until all parts are uploaded.

  7. When all parts have been uploaded, JFrog CLI sends a ‘Complete Upload’ trigger to the S3 bucket.

  8. The S3 bucket merges the parts into a single file (this merge can take some time). While the merge is happening, JFrog CLI polls Artifactory regarding the progress of the merge.

  9. Artifactory verifies the checksum and notifies JFrog CLI that the upload has completed successfully.
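
To make the sequence concrete, consider a hypothetical 1,000 MB file uploaded with the default settings: the CLI splits it into 50 parts of 20 MB each (step 3), requests pre-signed URLs in batches of 5, and repeats steps 3-5 ten times before sending the ‘Complete Upload’ trigger (step 7).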