ARTIFACTORY: Diagnosing and Resolving Performance Issues with Long-Running Database Queries in Artifactory

Author: Pranav Hegde
Article Number: 000006229
Source: Salesforce
First Published: 2024-11-11T08:48:56Z
Last Modified: 2024-11-11
Version: 3
Long-running database queries in Artifactory can impact system performance, slow down operations, and hinder the overall user experience. 
This article provides steps to troubleshoot and resolve issues related to long-running database queries in Artifactory, helping to ensure optimal performance and reliability.



Understanding Long-Running Queries:

In Artifactory, database queries are an essential part of daily operations, including artifact management, user authentication, and metadata retrieval.
However, if a query takes too long to execute, it can block other operations, consume excessive database resources, and even cause timeouts in Artifactory's services.


Common causes of long-running queries include:
  • Poorly optimized SQL queries
  • High database load
  • Large datasets (e.g., when managing a large number of artifacts)
  • Insufficient database resources (memory, CPU, disk space)
  • Indexing issues
 
Step 1: Identify Long-Running Queries

To troubleshoot long-running queries, first identify which queries are causing performance bottlenecks.
Method 1: Enabling debug loggers in Artifactory

This debug logger records the exact SQL statements Artifactory issues against the database, along with the time each query took to run. Note that this logback configuration is very verbose: a single REST API call can generate tens of SQL queries. Consider adding the logger to a test Artifactory instance connected to the same database type, then run the reproduced SQL against the production database so you don't have to enable debug logging in production. Add the debug logger to the logback.xml file located in the $JFROG_HOME/artifactory/var/etc/artifactory/ folder:

<appender name="jdbc" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <File>${log.dir}/artifactory-jdbc.log</File>
    <rollingPolicy class="org.jfrog.common.logging.logback.rolling.FixedWindowWithDateRollingPolicy">
      <FileNamePattern>${log.dir.archived}/artifactory-jdbc.%i.log.gz</FileNamePattern>
      <maxIndex>10</maxIndex>
    </rollingPolicy>
    <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
      <MaxFileSize>25MB</MaxFileSize>
    </triggeringPolicy>
    <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
      <layout class="org.jfrog.common.logging.logback.layout.BackTracePatternLayout">
        <pattern>%date{yyyy-MM-dd'T'HH:mm:ss.SSS, UTC}Z [jfrt ] [%-5p] [%-16X{uber-trace-id}] [%-30.30(%c{3}:%L)] [%-20.20thread] - %m%n</pattern>
      </layout>
    </encoder>
  </appender>

  <logger name="org.jfrog.storage.JdbcHelper" additivity="false">
    <level value="DEBUG"/>
    <appender-ref ref="jdbc"/>
  </logger>

NOTE: Debug loggers are resource intensive; once the issue is reproduced, remove the debug logger.
Next, check the Artifactory logs.

Artifactory logs can be used to identify slow database queries. Start by reviewing the artifactory-jdbc.log for slow query entries. You can then cross-reference these with the Artifactory microservice logs using the trace_id to pinpoint which specific user triggered the query.


Example log entry:
2024-10-31T16:08:21.804Z [jfrt ] [DEBUG] [289f317ae2ed0ba0] [o.j.s.JdbcHelper:538          ] [0ba0|art-exec-130220] - Query returned in 1.41 secs: "select distinct  n.repo as itemRepo,n.node_path as itemPath,n.node_name as itemName,n.created as itemCreated,n.modified as itemModified,n.updated as itemUpdated,n.created_by as itemCreatedBy,n.modified_by as itemModifiedBy,n.node_type as itemType,n.bin_length as itemSize,n.node_id as itemId,n.depth as itemDepth,n.sha1_actual as itemActualSha1,n.sha1_original as itemOriginalSha1,n.md5_actual as itemActualMd5,n.md5_original as itemOriginalMd5,n.sha256 as itemSha2,np.prop_value as propertyValue  from  nodes n left outer join node_props np on np.node_id = n.node_id  where ( n.repo = 'dev-sigma-npm-local' and ( exists ( select 1 from node_props npsub where npsub.node_id = n.node_id and npsub.prop_key = 'npm.name' and npsub.prop_value like '@itc/itc-graph-common-library' )) and( np.prop_key = 'npm.version')) and n.node_type = 1 "

Upon correlating the Artifactory microservice logs using the trace_id `289f317ae2ed0ba0`, the following log entry was identified:
2024-10-31T16:08:22.331Z [289f317ae2ed0ba0] [ACCEPTED DEPLOY] dev-sigma-npm-local:.npm/@itc/itc-graph-common-library/package.json  for client : _system_ / 

Method 2: Use Database Monitoring Tools

Artifactory uses an external database (e.g., MySQL, PostgreSQL, or Oracle) for data storage. Use your database's monitoring tools to identify long-running queries.
For PostgreSQL, run the following query:
SELECT pid, query, state, query_start, age(now(), query_start) AS duration
FROM pg_stat_activity
WHERE state = 'active' AND age(now(), query_start) > interval '5 minutes';

This query will list all active queries running longer than 5 minutes.
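If one of these queries is clearly stuck and is degrading the instance, PostgreSQL also lets you cancel it by pid. This is an optional step and a minimal sketch only; the pid below is a placeholder taken from the pg_stat_activity output above:

-- Cancel only the currently running statement for the given backend (pid from pg_stat_activity).
-- pg_terminate_backend(pid) ends the whole session instead; use either with care in production.
SELECT pg_cancel_backend(12345);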


Step 2: Analyze the Query
Once you've identified the long-running queries, the next step is to analyze their execution and understand why they're taking so long.
Key Areas to Check:

1. Query Complexity: Look for queries that involve complex joins, subqueries, or aggregations. Ensure that queries are well-optimized by avoiding unnecessary operations.
2. Indexes: Ensure that the necessary indexes are in place for frequently queried tables. Missing indexes can significantly slow down query performance. To check existing indexes in PostgreSQL:
SELECT * FROM pg_indexes WHERE tablename = 'your_table';

 

You can compare the index list from a lower-environment database with that of the production database connected to the respective Artifactory instance to confirm whether any indexes are missing. 
3. Database Configuration: Check the database’s configuration for resource limitations, such as maximum allowed connections, query timeouts, and buffer sizes. Ensure the database has sufficient memory, CPU, and disk space to handle the load.
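For PostgreSQL, these settings can be inspected from any SQL session. A minimal sketch (the parameter names are standard PostgreSQL settings; appropriate values depend on your sizing):

-- Key PostgreSQL resource settings that commonly affect Artifactory query performance.
SHOW max_connections;     -- maximum allowed client connections
SHOW shared_buffers;      -- memory dedicated to PostgreSQL's buffer cache
SHOW work_mem;            -- memory available to a single sort/hash operation
SHOW statement_timeout;   -- global query timeout (0 = disabled)

-- Or list them in one query from pg_settings.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('max_connections', 'shared_buffers', 'work_mem', 'statement_timeout');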

Step 3: Optimize the Query

If you've identified a specific query that’s underperforming, try the following optimization techniques:

1. Indexes: Check if there are any missing indexes for frequently queried columns and fix them. It's strongly recommended to consult JFrog Support before adding custom indexes. Avoid adding too many indexes as this can slow down insert operations.

2. Rewrite Complex Queries: Simplify complex queries by breaking them into smaller, more manageable parts. Use EXPLAIN (for PostgreSQL, MySQL) or AUTOTRACE (for Oracle) to get query execution plans and see how the database is executing the query (see the EXPLAIN sketch after this list).

3. Query Caching: Use database query caching for frequently accessed queries or result sets to reduce the load on the database.
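As a sketch of item 2, EXPLAIN can be run against a slow statement captured in artifactory-jdbc.log. The simplified query below against the nodes table is illustrative only; the repository key is a placeholder:

-- EXPLAIN ANALYZE executes the statement and reports the actual plan and timings,
-- so use it only with read-only SELECTs.
EXPLAIN (ANALYZE, BUFFERS)
SELECT n.node_id, n.node_path, n.node_name
FROM nodes n
WHERE n.repo = 'example-npm-local'   -- placeholder repository key
  AND n.node_type = 1;

-- Sequential scans on large tables or large row-estimate mismatches in the output
-- usually point to missing indexes or stale planner statistics.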

Step 4: Check Database Server Resources

Long-running queries might also be a symptom of insufficient database server resources. Check the following:

1. Memory and CPU Usage: Ensure the database server has adequate memory (RAM) and CPU resources available. Monitor server performance using tools such as top, htop (Linux), or database-specific monitoring tools like PostgreSQL's pgAdmin.

2. Disk I/O: Ensure that the disk I/O throughput is sufficient for handling database queries. Check for disk space and disk usage using tools like df -h or iostat.
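On the database side, PostgreSQL can report how much disk the Artifactory schema consumes, which helps determine whether low free space or heavy I/O on large tables is contributing. A minimal sketch (the database name 'artifactory' is an assumption; adjust it to your deployment):

-- Overall size of the Artifactory database (database name is an assumption).
SELECT pg_size_pretty(pg_database_size('artifactory')) AS database_size;

-- The ten largest tables (including their indexes) in the current database;
-- tables such as nodes and node_props typically dominate.
SELECT relname AS table_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;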

Step 5: Use Database Query Timeout

If a query is taking too long and negatively impacting system performance, consider setting query timeouts in the database or Artifactory configuration.

For PostgreSQL: You can set a query timeout in PostgreSQL's configuration (postgresql.conf):
statement_timeout = 60000  # 60 seconds
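A value in postgresql.conf applies to every client of that server. If you prefer to scope the limit, the timeout can instead be set per role or per session; a minimal sketch (the role name 'artifactory' is an assumption about the role Artifactory connects with):

-- Apply the timeout only to the role Artifactory uses (role name is an assumption).
ALTER ROLE artifactory SET statement_timeout = '60s';

-- Or set it only for the current session, e.g. while re-running a reproduced slow query.
SET statement_timeout = '60s';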

 

Step 6: Collect Thread Dumps

Thread dumps from the Artifactory JVM complement the database-side checks above in several ways:
1. Identifying Blocked Threads: Thread dumps provide a snapshot of all active threads in the system, showing their current state (e.g., waiting, running, or blocked). If a database query is running for an unusually long time, the thread dump can reveal if the thread is blocked, waiting for a database resource or lock, which could be causing the delay.

2. Resource Contention: Thread dumps allow you to observe if there are multiple threads contending for the same database resources (like locks or connections). This can highlight potential deadlocks, resource exhaustion, or contention issues that may lead to slow queries (a PostgreSQL lock-contention check is sketched after this list).

3. Correlation with Database Logs: When you capture thread dumps alongside database logs (e.g., artifactory-jdbc.log), you can correlate specific long-running queries with the threads that are executing them, helping you to pinpoint inefficiencies or areas in the codebase or database schema that need optimization.
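To complement item 2 from the database side, PostgreSQL can show which sessions are blocked and which sessions hold the locks they are waiting on at the moment the thread dump is taken. A minimal sketch (requires PostgreSQL 9.6+ for pg_blocking_pids):

-- Pair each blocked session with the session(s) currently blocking it.
SELECT blocked.pid    AS blocked_pid,
       blocked.query  AS blocked_query,
       blocking.pid   AS blocking_pid,
       blocking.query AS blocking_query
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY (pg_blocking_pids(blocked.pid));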

For more information on how to collect thread dumps, please refer to the following KB article.



Step 7: Consider Scaling the Database/Artifactory

If your database is under heavy load due to the size or frequency of queries, consider scaling your database horizontally or vertically.

1. Vertical Scaling: Upgrade the hardware (more CPU, memory, disk) of the database server to handle the increased load.

2. Horizontal Scaling: Implement database clustering or sharding if your Artifactory instance is very large and requires higher throughput.

3. Scaling the Artifactory application to handle heavy loads: By default, Artifactory is configured with a set of standard parameters, which can comfortably support up to ~200 concurrent connections. For further guidance on performance tuning and scaling, please refer to the following KB article.
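Before scaling, it can help to gauge how close the database already is to its connection ceiling (compare with the max_connections value checked in Step 2). A quick PostgreSQL sketch:

-- Current sessions grouped by state; a large number of long-lived 'active' or
-- 'idle in transaction' sessions suggests the load is outgrowing the current setup.
SELECT state, count(*) AS sessions
FROM pg_stat_activity
GROUP BY state
ORDER BY sessions DESC;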



Conclusion

Long-running database queries in Artifactory can significantly affect performance. By identifying the queries, analyzing their execution, optimizing the database schema and queries, and ensuring your server resources are adequate, you can mitigate these issues and keep your Artifactory instance running smoothly. Regular maintenance, indexing, and scaling strategies can further optimize performance as your artifact repository grows. For any persistent issues or advanced troubleshooting, consider reaching out to JFrog support for assistance.