ARTIFACTORY: What is the meaning of errors like “Invalid character 0xd83d in XML stream”?

Elina Floim
2023-03-15 09:52

Under certain circumstances, metadata calculation for some RPM packages may log errors such as the following:

2023-02-12T13:19:01.904Z [jfrt ] [ERROR] [f9686727650b5c22] [.s.PrettyPrintPackageWriter:23] [art-exec-9868       ] - Invalid character 0xd83d in XML stream
2023-02-12T13:19:01.907Z [jfrt ] [ERROR] [f9686727650b5c22] [o.j.m.y.s.YumXmlSerializer:81 ] [art-exec-9868       ] - Unexpected XML character. Failed to serialize RPM metadata of 'python35-urllib3'.

Some package descriptions contain special characters (such as emojis), and these characters can trigger the errors above during package metadata calculation. This is an example of an RPM package description containing special characters:

User-added image

Up until version 7.23.3, Artifactory used an XML serializer that accepted special characters that the YUM client does not accept. The serializer wrote these characters into the filelists.xml and primary.xml files, which caused problems when the yum client attempted to read those files from the repository. Because of this behavior, validations were added to the XML serializer (as part of RTFACT-26693) to ensure that the generated metadata is standard XML 1.0, as required.
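
To illustrate why an emoji produces a value like 0xd83d, here is a minimal Python sketch of a character check based on the XML 1.0 "Char" production. It is not Artifactory's actual validation code. It assumes the check is applied to each 16-bit UTF-16 unit separately, which the reported value 0xd83d (the first half of a surrogate pair) suggests. The function names are hypothetical.

```python
from typing import List, Tuple


def is_xml10_char(code: int) -> bool:
    """Return True if `code` falls in a range allowed by the XML 1.0 Char production."""
    return (
        code in (0x9, 0xA, 0xD)
        or 0x20 <= code <= 0xD7FF
        or 0xE000 <= code <= 0xFFFD
        or 0x10000 <= code <= 0x10FFFF
    )


def find_rejected_units(text: str) -> List[Tuple[int, str]]:
    """List (index, hex value) for every UTF-16 code unit that fails the check.

    Assumption: each 16-bit unit is judged on its own, so both halves of a
    surrogate pair (used to encode emojis) are reported as invalid.
    """
    data = text.encode("utf-16-be", errors="surrogatepass")
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    return [(i, hex(u)) for i, u in enumerate(units) if not is_xml10_char(u)]
```

For a description such as "HTTP library 👍", this sketch reports the two surrogate halves 0xd83d and 0xdc4d, the first of which matches the value in the log message above.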

To avoid errors like the above being printed to the logs, it is possible to disable the validations by adding the following system property to the $JFROG_HOME/artifactory/var/etc/artifactory/artifactory.system.properties file:
 artifactory.yum.xml.printer.mode=-
*In an HA environment, the change needs to be made on all of the nodes, followed by a rolling restart for the change to take effect.

Important note: The "old" XML serializer will not log these errors during package metadata calculation and will write the package metadata into the metadata XML files. However, these packages are still not accepted by the yum client and will not be resolved by it.
While reverting to the old serializer is possible, it is recommended to keep using the new serializer so that such packages can be identified from the errors in the logs and removed from Artifactory.
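
As a rough aid for identifying the affected packages, the following Python sketch collects package names from log lines that match the YumXmlSerializer error shown at the top of this article. The pattern is derived only from that sample line, and the function name is hypothetical. Adjust both if your log output differs.

```python
import re
from typing import Iterable, Set

# Pattern taken from the sample error line in this article (an assumption
# about the general message format, not a documented log contract).
FAILED_RPM = re.compile(r"Failed to serialize RPM metadata of '([^']+)'")


def failed_packages(lines: Iterable[str]) -> Set[str]:
    """Return the set of package names reported as failing serialization."""
    names = set()
    for line in lines:
        match = FAILED_RPM.search(line)
        if match:
            names.add(match.group(1))
    return names
```

The function accepts any iterable of lines, so an open log file can be passed to it directly, for example `failed_packages(open(log_path, encoding="utf-8", errors="replace"))`, where `log_path` points to the Artifactory service log in your installation.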