How does Artifactory’s application lock and UI session management work after the removal of Hazelcast in Artifactory 6?

Joshua Han
2019-01-24 18:37

Summary

Q&A about the DB-based locking mechanism, which is used for UI session management and write locks.

Affected Versions

6.0 and above

Resolution

1) Are spikes in the archive logs common after disabling the Hazelcast feature, or are they a cause for concern?

Disabling Hazelcast causes additional INSERT and DELETE operations against the DB. Every application resource that needs to be locked is added to the 'distributed_locks' table and later deleted from it. See section 2.

2) What is the "Distributed_Locks" table used for?

The 'distributed_locks' table contains the application's distributed locks: any lock that must be shared between the members of an Artifactory High Availability cluster is now stored in this dedicated table. An entry added to the table means that a specific lock was acquired by a specific node and thread. As long as the entry exists in the table, the internal application lock is held. Most lock entries are deleted very shortly after they are inserted. Locks are not expected to persist, and they are also removed from the DB when the application is restarted.
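The insert-to-acquire, delete-to-release idea described above can be sketched as follows. This is a minimal illustration, not Artifactory's actual implementation: it uses an in-memory SQLite database, the column names come from this article, and the primary key on ('category', 'lock_key'), the column types, the helper names and the sample values are all assumptions made for the example.

```python
import sqlite3
import threading
import time

# Column names follow the article; the types and the primary key are
# assumptions of this sketch, not the real Artifactory schema.
SCHEMA = """
CREATE TABLE distributed_locks (
    category          TEXT    NOT NULL,
    lock_key          TEXT    NOT NULL,
    owner             TEXT    NOT NULL,
    owner_thread      INTEGER NOT NULL,
    owner_thread_name TEXT    NOT NULL,
    acquire_time      INTEGER NOT NULL,
    PRIMARY KEY (category, lock_key)
)
"""

def try_acquire(conn, category, lock_key, owner):
    """Return True if the lock row was inserted, False if the lock is already held."""
    thread = threading.current_thread()
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO distributed_locks VALUES (?, ?, ?, ?, ?, ?)",
                (category, lock_key, owner, thread.ident, thread.name,
                 int(time.time() * 1000)),
            )
        return True
    except sqlite3.IntegrityError:
        # A row with the same (category, lock_key) already exists.
        return False

def release(conn, category, lock_key, owner):
    """Delete the lock row; only rows owned by this node are removed."""
    with conn:
        conn.execute(
            "DELETE FROM distributed_locks "
            "WHERE category = ? AND lock_key = ? AND owner = ?",
            (category, lock_key, owner),
        )

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

key = "libs-release:/a/b.jar"  # hypothetical lock key
first = try_acquire(conn, "generic", key, "node-1")   # node-1 takes the lock
second = try_acquire(conn, "generic", key, "node-2")  # node-2 is refused
release(conn, "generic", key, "node-1")               # entry is deleted
third = try_acquire(conn, "generic", key, "node-2")   # node-2 can now acquire
```

Each acquire and release is one INSERT and one DELETE, which is why a busy instance produces a steady stream of short-lived rows in this table.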

A few application lock examples:

  1. Concurrent download lock – When multiple threads (clients) try to download the same file (e.g., a 10 GB file) from an Artifactory remote repository while the file is not yet cached in Artifactory, a mechanism ensures that only one thread actually fetches the resource from the remote site while the other threads wait until the file is fully cached. Once the file is cached, all threads return a response with the file content to their clients. The mechanism uses a lock (an entry added to the 'distributed_locks' table with the columns 'category', 'lock_key', 'owner', 'owner_thread', 'owner_thread_name' and 'acquire_time') to ensure that only a single thread fetches the resource from the remote server. In this way the lock helps Artifactory reduce network load.
  2. Task lock – When a task is triggered to run on multiple HA cluster members (such as a Debian metadata calculation triggered on each node after a deploy event), the first task to acquire the lock performs the metadata calculation while the other tasks may wait for it to complete. Depending on the task type, the first task might perform the work of both tasks, removing the need for the other task to run. In either case, a lock is acquired to eliminate interference or conflicts between multiple threads that are expected to work on the same resources (again, the lock is represented by a new entry for the specific operation in the 'distributed_locks' table).
  3. Artifact deployment – When deploying an artifact, the deployment thread acquires a write lock on the artifact path (by adding an entry to the 'distributed_locks' table) to eliminate conflicts.
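The concurrent download case in the list above follows a "single fetcher, everyone else waits" pattern. The sketch below illustrates that pattern only. It runs inside one process and uses a local `threading.Lock` as a stand-in for the 'distributed_locks' entry. The function names, the cache dictionary and the simulated remote fetch are all hypothetical.

```python
import threading
import time

cache = {}                     # stand-in for the Artifactory cache
fetch_count = 0                # how many times the "remote" was contacted
path_lock = threading.Lock()   # stand-in for the 'distributed_locks' entry
results = []

def fetch_from_remote(path):
    """Simulate a slow download from the remote site."""
    global fetch_count
    fetch_count += 1
    time.sleep(0.05)
    return f"content of {path}"

def download(path):
    """Serve from the cache; let only one thread populate it on a miss."""
    if path in cache:
        return cache[path]
    with path_lock:
        # Re-check after acquiring the lock: another thread may have
        # cached the file while this one was waiting.
        if path not in cache:
            cache[path] = fetch_from_remote(path)
    return cache[path]

def client(path):
    results.append(download(path))

threads = [threading.Thread(target=client, args=("remote-repo/big.bin",))
           for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All five clients receive the same content, but the remote fetch runs once. The waiting threads pay only for the wait, not for a second transfer.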

3) Will writes to the "Distributed_locks" table increase after disabling Hazelcast?

Writes to the 'distributed_locks' table start only when the 'artifactory.locking.provider.type' system property is set to 'db' (now the default), which indeed means that Hazelcast is not used for locks. In high-scale environments, we expect many entries to be added to and removed from the table.

4) Can we mark the "Distributed_locks" table as "No logging" (i.e., disable logging for it)?

Entries in the 'distributed_locks' table are not expected to persist and are removed automatically by the application when it restarts. Since redo logging exists for recovery, and these entries do not need to be recovered, we see no reason to log every lock INSERT and DELETE operation.

5) Does the Distributed_Locks table get cleared once the application restarts?

Yes, entries in the table are cleaned up automatically in each of the following scenarios:

  1. A lock is held for more than 30 minutes while the application is running (configurable).
  2. A lock entry still exists in the table while the Artifactory instance that acquired it is no longer running (a periodic application job performs this cleanup).
  3. Startup and shutdown – When an Artifactory HA instance starts or shuts down, any lock entry in the DB that belongs to this specific instance is deleted.
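The three cleanup scenarios above can be sketched as three DELETE statements. This is an illustration only, run against an in-memory SQLite table with a reduced set of columns. The function names, the way live nodes are passed in, and the sample rows are assumptions of the example and do not reflect how Artifactory implements its cleanup job.

```python
import sqlite3

MAX_LOCK_AGE_MS = 30 * 60 * 1000  # 30 minutes, the default named in the article

def cleanup_expired(conn, now_ms):
    """Scenario 1: remove locks held longer than the maximum age."""
    with conn:
        cur = conn.execute(
            "DELETE FROM distributed_locks WHERE acquire_time < ?",
            (now_ms - MAX_LOCK_AGE_MS,),
        )
    return cur.rowcount

def cleanup_dead_owners(conn, live_nodes):
    """Scenario 2: remove locks whose owning node is no longer running."""
    owners = [row[0] for row in
              conn.execute("SELECT DISTINCT owner FROM distributed_locks")]
    removed = 0
    with conn:
        for owner in owners:
            if owner not in live_nodes:
                removed += conn.execute(
                    "DELETE FROM distributed_locks WHERE owner = ?", (owner,)
                ).rowcount
    return removed

def cleanup_own_locks(conn, node_id):
    """Scenario 3: on startup or shutdown, remove this node's own locks."""
    with conn:
        cur = conn.execute(
            "DELETE FROM distributed_locks WHERE owner = ?", (node_id,)
        )
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE distributed_locks "
    "(category TEXT, lock_key TEXT, owner TEXT, acquire_time INTEGER)"
)
now = 10_000_000  # arbitrary "current time" in milliseconds
conn.executemany(
    "INSERT INTO distributed_locks VALUES (?, ?, ?, ?)",
    [
        ("generic", "a", "node-1", now - 31 * 60 * 1000),  # held too long
        ("generic", "b", "node-1", now - 1000),            # fresh, node-1
        ("generic", "c", "node-2", now - 1000),            # node-2 is down
        ("generic", "d", "node-3", now - 1000),            # fresh, node-3
    ],
)
conn.commit()

expired = cleanup_expired(conn, now)
dead = cleanup_dead_owners(conn, {"node-1", "node-3"})
own = cleanup_own_locks(conn, "node-1")
remaining = conn.execute("SELECT lock_key FROM distributed_locks").fetchall()
```

In this run each scenario removes exactly one row, and only the fresh lock held by the live node-3 remains.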

6) Why does disabling Hazelcast generate REDO logs?

See answers 1, 2, 3.

7) Will the transaction be accessed by any other session?

By default, Artifactory UI sessions are also no longer distributed using Hazelcast. They are now stored in the DB tables UI_SESSION and UI_SESSION_ATTRIBUTES. The load against these tables is expected to be smaller than the load on the locking table, and sessions are cleaned up automatically when they expire (after 30 minutes, configurable with the 'artifactory.ui.session.timeout.minutes' system property).

8) Why hasn't JFrog used the 'select for update' method, which doesn't generate logs?

The DB lock implementation is similar to the Hazelcast implementation: it essentially adds a layer on top of the locking abstraction that already existed. This approach may be more flexible at the application level, and it is easier to debug, modify, and control when needed.