XRAY: Introduction to Xray’s Garbage Collection and its Configuration

Products
JFrog_Xray
Content Type
Administration_Platform
AuthorFullName__c
Elumalai Ganesan
articleNumber
000006652
FirstPublishedDate
2025-10-09T09:23:12Z
lastModifiedDate
2025-10-09
VersionNumber
1
Introduction 

In this article, we look at how Xray cleans up its data for artifacts that have been deleted or have passed their retention period.

Xray employs a Garbage Collection process to handle the cleanup of artifacts. When an artifact is deleted or exceeds its retention period, the Garbage Collection process is initiated. This process involves marking the associated Xray data for deletion by transferring the artifact's entries from the "root_files" table to the "deleted_root_files" table, while also removing any related violations. This initial step is referred to as "Shallow Deletion" (Phase 1).

The subsequent phase, known as "Deep Delete" (Phase 2), completely removes these entries from the database. By default, the Garbage Collection process runs for three minutes every two hours to delete Xray data. The time limit is needed because the Index and Persist microservices are temporarily suspended while Garbage Collection runs. Consequently, scanning of incoming artifacts is paused for those three minutes.
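The two phases described above can be pictured with a toy in-memory model. This is only an illustrative sketch: the dictionaries stand in for the "root_files" and "deleted_root_files" tables, and the function names, artifact names, and data are hypothetical, not Xray's actual schema or implementation.

```python
import time

# Toy stand-ins for the tables named in the article (hypothetical contents).
root_files = {"app.jar": "scan data", "lib.tgz": "scan data", "img.tar": "scan data"}
deleted_root_files = {}
violations = {"app.jar": ["violation-1"], "lib.tgz": []}


def shallow_delete(artifact):
    """Phase 1: move the entry to deleted_root_files and drop its violations."""
    deleted_root_files[artifact] = root_files.pop(artifact)
    violations.pop(artifact, None)


def deep_delete(max_duration_seconds):
    """Phase 2: permanently remove marked entries until the time budget is spent."""
    deadline = time.monotonic() + max_duration_seconds
    removed = 0
    while deleted_root_files and time.monotonic() < deadline:
        deleted_root_files.popitem()
        removed += 1
    return removed


shallow_delete("app.jar")
shallow_delete("lib.tgz")
removed = deep_delete(max_duration_seconds=180)
print(removed, sorted(root_files), deleted_root_files)
```

The point of the sketch is the time budget in Phase 2: anything still marked when the deadline passes simply stays in the "deleted" store until a later run.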

For instance, if there are 1,000 artifacts awaiting deletion, and within the three-minute window only 100 artifact records are processed, the remaining entries will be addressed in the following scheduled run, which occurs after a two-hour interval.
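The arithmetic behind this example can be sketched as follows. The helper names are hypothetical, and the sketch assumes a constant number of records processed per run, which in practice will vary with database load.

```python
def runs_to_clear(backlog, processed_per_run):
    """Number of scheduled GC runs needed to process the whole backlog."""
    if processed_per_run <= 0:
        raise ValueError("processed_per_run must be positive")
    return -(-backlog // processed_per_run)  # integer ceiling division


def hours_until_last_run(backlog, processed_per_run, period_minutes=120):
    """Hours between the first scheduled run and the run that clears the backlog."""
    runs = runs_to_clear(backlog, processed_per_run)
    return max(runs - 1, 0) * period_minutes / 60


print(runs_to_clear(1000, 100))         # 10 runs
print(hours_until_last_run(1000, 100))  # 18.0 hours after the first run
```

Under these assumptions, a backlog of 1,000 artifacts at 100 per run takes ten scheduled runs, with the last one starting 18 hours after the first. Idle-listener runs, if they occur, would shorten this.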

Additionally, Xray features an Idle Listener for Garbage Collection. If Xray remains idle for five seconds, with no new or active Indexing or Persisting messages in the queue, the Garbage Collection process is triggered for ten seconds, unless new artifacts are received for scanning during that time.

You can retrieve the GC configuration and the status of the last GC run using the REST APIs.

GC Configuration Sample Request and Response:
root#~ % curl -X GET -u "<username>:<Password>" "https://<arturl>/xray/api/v1/configuration/gc"

{"scheduler_enabled":true,"scheduler_period_minutes":120,"max_duration_seconds":180,"max_retry_count":3,"idle_listener_enabled":true,"idle_listener_gc_duration_seconds":10,"idle_listener_sampling_rate_seconds":5}
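As a rough illustration of what these defaults mean, the sketch below parses the sample response and estimates the share of time that scanning is paused by scheduled GC. It assumes every scheduled run uses its full max_duration_seconds (an upper bound) and ignores idle-listener runs; the variable names are illustrative.

```python
import json

# Sample response copied from the article.
sample_config = (
    '{"scheduler_enabled":true,"scheduler_period_minutes":120,'
    '"max_duration_seconds":180,"max_retry_count":3,'
    '"idle_listener_enabled":true,"idle_listener_gc_duration_seconds":10,'
    '"idle_listener_sampling_rate_seconds":5}'
)

config = json.loads(sample_config)

# Upper bound on the fraction of each scheduler period spent in scheduled GC.
period_seconds = config["scheduler_period_minutes"] * 60
paused_fraction = config["max_duration_seconds"] / period_seconds

print(f"Scheduled GC pauses scanning for at most {paused_fraction:.1%} of the time")
```

With the defaults shown (180 seconds every 120 minutes), the upper bound works out to 2.5%.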

GC Status Sample Request and Response:
root#~ % curl -X GET -u "<username>:<Password>" "https://<arturl>/xray/api/v1/gc/status"

{"is_running":false,"last_time_started":"2025-07-30T12:00:05Z","last_time_ended":"2025-07-30T12:00:05Z","last_successful_run":"2025-07-30T12:00:05Z","last_state":"succeeded"}
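A monitoring script might interpret the status response along these lines. This is a minimal sketch using only the fields visible in the sample; "succeeded" is the only last_state value shown in the article, so other values are simply treated here as "not succeeded". The helper name parse_ts is hypothetical.

```python
import json
from datetime import datetime, timezone

# Sample response copied from the article.
sample_status = (
    '{"is_running":false,"last_time_started":"2025-07-30T12:00:05Z",'
    '"last_time_ended":"2025-07-30T12:00:05Z",'
    '"last_successful_run":"2025-07-30T12:00:05Z","last_state":"succeeded"}'
)


def parse_ts(value):
    """Parse a UTC timestamp in the format used by the sample response."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


status = json.loads(sample_status)
started = parse_ts(status["last_time_started"])
ended = parse_ts(status["last_time_ended"])

duration_seconds = (ended - started).total_seconds()
last_run_ok = status["last_state"] == "succeeded"

print(f"running={status['is_running']} duration={duration_seconds}s ok={last_run_ok}")
```

In the sample, the start and end timestamps are identical, so the computed duration is zero seconds, which suggests the run had little or nothing to delete.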

Note:

You can start GC at any time using the Force GC REST API, specifying the maximum running duration. Remember that whenever GC runs, scanning of new artifacts is paused, because concurrent delete and create events could corrupt Xray data.

You can tune the GC configuration to suit your use case, for example by changing the interval between GC runs (scheduler_period_minutes) or max_duration_seconds, using the Set GC Configuration REST API.
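Before sending a change, it can help to build and sanity-check the new settings locally. The sketch below assumes the Set GC Configuration API accepts the same field names as the configuration response shown earlier; that is an assumption, so check the REST API documentation for the exact request format. The function tuned_config and the chosen values are hypothetical, and the request itself is deliberately not shown.

```python
import json

# Current settings, taken from the sample configuration response in the article.
current = {
    "scheduler_enabled": True,
    "scheduler_period_minutes": 120,
    "max_duration_seconds": 180,
    "max_retry_count": 3,
    "idle_listener_enabled": True,
    "idle_listener_gc_duration_seconds": 10,
    "idle_listener_sampling_rate_seconds": 5,
}


def tuned_config(base, **overrides):
    """Return a copy of base with overrides applied, rejecting unknown fields."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown GC setting(s): {sorted(unknown)}")
    updated = {**base, **overrides}
    # A run that lasts as long as the interval would pause scanning permanently.
    if updated["max_duration_seconds"] >= updated["scheduler_period_minutes"] * 60:
        raise ValueError("max_duration_seconds must be shorter than the scheduler period")
    return updated


# Example: run GC every hour for up to two minutes.
new_config = tuned_config(current, scheduler_period_minutes=60, max_duration_seconds=120)
payload = json.dumps(new_config)
print(payload)
```

Keep in mind that a shorter interval or a longer duration clears the backlog faster but also pauses scanning more often or for longer.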