How to troubleshoot common replication issues?

Maayan Amrani
2019-06-20 07:49

Subject: 

Troubleshooting Replication issues

Description:

Artifactory supports two types of replication: Push and Pull. Push replication synchronizes local repositories and can be triggered by events or scheduled with a cron expression. Pull replication is invoked by a remote repository and runs on a defined schedule to synchronize repositories at regular intervals. This article covers some of the common issues that may appear during replication.

Common errors:

1. Source Artifactory
[ERROR] – Unexpected EOF read on the socket
java.io.EOFException: Unexpected EOF read on the socket
    at org.apache.coyote.http11.Http11InputBuffer.fill(Http11InputBuffer.java:734)
    at org.apache.coyote.http11.Http11InputBuffer.access$300(Http11InputBuffer.java:40)
    at org.apache.coyote.http11.Http11InputBuffer$SocketInputBuffer.doRead(Http11InputBuffer.java:1084)
    at org.apache.coyote.http11.filters.IdentityInputFilter.doRead(IdentityInputFilter.java:140)
    at org.apache.coyote.http11.Http11InputBuffer.doRead(Http11InputBuffer.java:263)
    at org.apache.coyote.Request.doRead(Request.java:581)
    at org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:326)
    at org.apache.catalina.connector.InputBuffer.checkByteBufferEof(InputBuffer.java:642)
    at org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:349)
    at org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:183)
……

Possible resolution:
Increase the File Upload Max Size (MB) setting on the target Artifactory instance.

2. Source Artifactory

[http-nio-8081-exec-1] [ERROR] – Could not retrieve list
org.apache.catalina.connector.ClientAbortException: java.net.SocketTimeoutException
at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:321)
at org.apache.catalina.connector.OutputBuffer.flush(OutputBuffer.java:284)
at org.apache.catalina.connector.CoyoteOutputStream.flush(CoyoteOutputStream.java:118)
at org.springframework.session.web.http.OnCommittedResponseWrapper$SaveContextServletOutputStream.flush(OnCommittedResponseWrapper.java:458)
at org.codehaus.jackson.impl.Utf8Generator.flush(Utf8Generator.java:1091)
at org.codehaus.jackson.map.ObjectMapper.writeValue(ObjectMapper.java:1615)
at org.codehaus.jackson.impl.JsonGeneratorBase.writeObject(JsonGeneratorBase.java:314)
at org.artifactory.addon.artifact.LocalRepoFileListTreeStreamer.streamLocalFileListRecursively(LocalRepoFileListTreeStreamer.java:156)
…..

Possible resolution, related to a known issue:

Disable or increase the Tomcat connection timeout by specifying the connectionTimeout attribute on the connector in $TOMCAT_HOME/conf/server.xml (a value of -1 disables the timeout):

<Connector port="8081" sendReasonPhrase="true" connectionTimeout="-1"/>

Additional possible solution: save the file list locally (related to internal issue RTFACT-19064):

Artifactory version 6.10.1 includes a new flag (system property):
artifactory.replication.push.fullTree.saveLocally=true/false (default: false)
(The 'push' in the flag name can be ignored, since the flag applies to pull replication as well.)

When enabled, Artifactory will save the FileList to the filesystem under the Artifactory temp work directory – in a file called FullTree-[digits].json (by default, located under: $ARTIFACTORY_HOME/data/tmp/work).
This file will be deleted after the replication is done.

Without setting this flag, the default behaviour of replicating while streaming the file-list remains.

Note that this may take considerable storage space (depending on the number of replications, artifacts and folders stored).
When using this flag, it is recommended to schedule different replications at different times so that they do not run simultaneously (staggering replication schedules is advisable in general).
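
To keep an eye on the storage used by the saved file lists, the temp work directory can be inspected while replications are running. The following is a minimal Python sketch, not an Artifactory tool: the function name full_tree_usage is hypothetical, the FullTree-[digits].json naming and the data/tmp/work location are taken from the description above, and the script assumes ARTIFACTORY_HOME is set in its environment.

```python
import os
import re
from pathlib import Path

# File name pattern described above: FullTree-[digits].json
FULL_TREE_PATTERN = re.compile(r"^FullTree-\d+\.json$")


def full_tree_usage(work_dir):
    """Return (file_count, total_bytes) for saved file lists in work_dir."""
    count = 0
    total = 0
    for entry in Path(work_dir).iterdir():
        if not FULL_TREE_PATTERN.match(entry.name):
            continue
        try:
            size = entry.stat().st_size
        except FileNotFoundError:
            # The file is deleted once its replication finishes,
            # so it may disappear while we are scanning.
            continue
        count += 1
        total += size
    return count, total


if __name__ == "__main__":
    # Assumption: ARTIFACTORY_HOME points at the Artifactory home directory.
    home = os.environ.get("ARTIFACTORY_HOME")
    work_dir = Path(home, "data", "tmp", "work") if home else None
    if work_dir is None or not work_dir.is_dir():
        print("Set ARTIFACTORY_HOME to an Artifactory home directory first")
    else:
        count, total = full_tree_usage(work_dir)
        print(f"{count} saved file list(s), {total} bytes")
```

Running this periodically during a replication window gives a rough idea of how much space the saved file lists need, which can help when deciding how far apart to schedule replications.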

3. Source Artifactory

[ERROR] Error occurred while performing folder replication for ‘’: Could not retrieve remote file list for repo '' at '': HTTP/1.1 403 Forbidden

Possible resolution:

We can run the REST API call that the replication performs behind the scenes, authenticating as the user configured for the replication. For example:
$ curl -u "<ReplicationUser>:<password>" "http://artifactory_url/api/storage/libs-release-local/org/acme?list&deep=1&listFolders=1&mdTimestamps=1"

If the above call also returns a 403 response, we should check the permissions given to the replication user.
If it returns a 200 response, a different user is being used to authenticate the replication. In that case, we should check the request.log (for instance, the request.log on the target) and see which user appears in the relevant log entry. For example:

TIMESTAMP|3|REQUEST|Source_IP|USER|PUT|/PathToFile|HTTP/1.1|403|44717

Example of a correlated log entry from artifactory.log of the source instance:

TIMESTAMP [replication-consumer] [ERROR] (o.a.a.c.BasicStatusHolder:211) – Error while deploying item 'RepoKey:PathToFile on Url:http://artifactory_url/RepoKey ': Forbidden [403]

We should make sure that the user reaching the target has sufficient permissions to run the replication and that it is not ‘anonymous’ (which can happen when anonymous access is enabled).
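
The request.log check described above can be sketched in Python. This is an illustrative sketch under assumptions: the field order is inferred from the sample request.log entry shown earlier (the second field is assumed to be the request time), the names parse_request_line and forbidden_users are hypothetical, and the sample lines, including the user 'repl-user', are made up for the example.

```python
from collections import Counter, namedtuple

# Field order inferred from the sample request.log entry (an assumption).
RequestEntry = namedtuple(
    "RequestEntry",
    "timestamp request_time request_type source_ip user "
    "method path protocol status size",
)


def parse_request_line(line):
    """Split one pipe-delimited request.log line; return None if malformed."""
    parts = line.strip().split("|")
    if len(parts) != len(RequestEntry._fields):
        return None
    return RequestEntry(*parts)


def forbidden_users(lines):
    """Count 403 responses per user name."""
    counts = Counter()
    for line in lines:
        entry = parse_request_line(line)
        if entry is not None and entry.status == "403":
            counts[entry.user] += 1
    return counts


# Hypothetical sample lines in the same shape as the entry shown earlier.
sample = [
    "TIMESTAMP|3|REQUEST|Source_IP|anonymous|PUT|/PathToFile|HTTP/1.1|403|44717",
    "TIMESTAMP|5|REQUEST|Source_IP|repl-user|PUT|/PathToFile|HTTP/1.1|201|1024",
]
print(dict(forbidden_users(sample)))  # prints {'anonymous': 1}
```

If ‘anonymous’ or another unexpected user shows up in the result, the replication is not authenticating with the intended user. To use the sketch on a real log, pass an open file object for the target's request.log instead of the sample list.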

General information about replication issues:

A. In general, to troubleshoot replication issues, we can add the following logger (to the logback.xml file, located under $ARTIFACTORY_HOME/etc/) on both the source and target Artifactory instances:

<logger name="org.artifactory.addon.replication.core">
    <level value="trace"/>
</logger>

The above will add verbosity to the logs that are related to the replication process.

B. To help prevent timeout issues during replication, we can disable any timeout configurations in the proxies and set up a dedicated port and entry for the replication.