The router's health checks against the other microservices in the application can sometimes fail. When that happens, the application reports itself as unhealthy. For example, in the log line below, the router in Xray fails its health check to the analysis microservice:
2021-09-02T03:37:36.619Z [jfrou] [DEBUG] [67aa951c48980641] [healthcheck.go:65 ] [main ] - Checking health of service 'jfxana_01e4c6macc08byb8wcppf6dyz7-xrayv36-0' using URL 'http://localhost:7000/api/v1/system/readiness' returned an error: Get "http://localhost:7000/api/v1/system/readiness": context deadline exceeded
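The "context deadline exceeded" error means the readiness request did not complete before the health check's request timeout expired. You can tune several router fields in system.yaml, as shown below (defaults are shown):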
router:
  topology:
    local:
      ## Settings for checking the health of the local services
      healthCheck:
        ## Duration between health checks
        interval: 5s
        ## Health check request timeout
        requestTimeout: 5s
        ## The number of consecutive successful health checks that must occur before declaring an instance healthy
        healthyThreshold: 2
        ## The number of consecutive failed health checks that must occur before declaring an instance unhealthy
        unhealthyThreshold: 2
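As a rough sketch, an override that gives a slow readiness endpoint more headroom might look like the following. The values 15s and 3 are assumptions chosen purely for illustration, not recommended settings. Pick values based on how long the readiness request actually takes in your environment.

```yaml
router:
  topology:
    local:
      healthCheck:
        ## Illustrative value only: allow each readiness request up to 15s (default is 5s)
        requestTimeout: 15s
        ## Illustrative value only: require 3 consecutive failed checks (default is 2)
        unhealthyThreshold: 3
```

Fields that are not listed keep their defaults. Changes to system.yaml typically take effect only after the service is restarted.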
If increasing these values doesn't help, check your monitoring for the application and the database to find out why the readiness request is taking so long. You can also contact JFrog Support if you have a valid subscription that includes support.