Artifactory: How to Verify Storage Summary Differences After System Export and Import with Python

Author: David Shin
Article number: 000006199
Source type: Salesforce
First published: 2024-10-10T13:49:56Z
Last modified: 2024-10-10
Version: 1
Introduction 
After a system export and subsequent import, you may occasionally notice significant differences in the number of artifacts. These discrepancies can be surprising and can have several causes.

To pinpoint which repositories are affected, compare the storage summaries taken before the system export and after the system import.

To do this, you'll need two storage summary JSON files: one generated before the system export and another after the system import. You can obtain these files using the REST API for storage summary information or by extracting them from the support bundles.
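
As a minimal sketch of the REST API route, the snippet below requests the storage summary from the `api/storageinfo` endpoint and saves it to a JSON file. The base URL, the bearer-token authentication, and the helper names `build_storage_request` and `save_storage_summary` are assumptions made for illustration; adjust them to match your instance and credentials.

```python
import json
import urllib.request


def build_storage_request(base_url, token):
    """Build a GET request for the storage summary endpoint.

    base_url is assumed to end with the Artifactory context path,
    for example https://artifactory.example.com/artifactory
    """
    url = base_url.rstrip("/") + "/api/storageinfo"
    # Bearer-token authentication is an assumption; use whatever
    # authentication scheme your instance is configured for.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def save_storage_summary(base_url, token, out_path):
    """Download the storage summary and write it to out_path as JSON."""
    with urllib.request.urlopen(build_storage_request(base_url, token)) as resp:
        summary = json.load(resp)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(summary, f, indent=2)
    return summary


# Hypothetical usage: run once before the export and once after the import.
# save_storage_summary("https://artifactory.example.com/artifactory",
#                      "<access-token>", "storage-summary-source.json")
```

Run it against the source instance before the export and against the target instance after the import, so that the two output files line up with the file names the comparison script expects.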


Resolution 
The following Python script will help you compare the filesCount values between the two JSON files to identify any differences:
#!/usr/bin/env python3
import json


def load_json_file(file_path):
    """Load JSON data from a file."""
    with open(file_path, 'r', encoding='utf-8') as f:
        return json.load(f)


def compare_files_count(json_data_1, json_data_2):
    """Compare the `filesCount` values and find missing repos."""
    list_1 = json_data_1.get("repositoriesSummaryList", [])
    list_2 = json_data_2.get("repositoriesSummaryList", [])
    
    # Convert lists to dictionaries for quick lookup by repoKey
    list_1_dict = {item.get("repoKey"): item for item in list_1}
    list_2_dict = {item.get("repoKey"): item for item in list_2}
    
    diffs = []
    missing_in_target = []
    missing_in_source = []
    
    # Compare items in list_1 to list_2
    for repo_key_1, item_1 in list_1_dict.items():
        item_2 = list_2_dict.get(repo_key_1)
        
        if item_2 is not None:
            if item_1.get("filesCount") != item_2.get("filesCount"):
                diffs.append({
                    "repoKey": repo_key_1,
                    "Source_filesCount": item_1.get("filesCount"),
                    "Target_filesCount": item_2.get("filesCount"),
                    "difference": item_2.get("filesCount", 0) - item_1.get("filesCount", 0)
                })
        else:
            missing_in_target.append({
                "repoKey": repo_key_1,
                "repoType": item_1.get("repoType"),
                "Source_filesCount": item_1.get("filesCount"),
                "Target_filesCount": 0,
                "difference": 0 - item_1.get("filesCount", 0)
            })
    
    # Compare items in list_2 to list_1 to find missing repos in the source
    for repo_key_2, item_2 in list_2_dict.items():
        if repo_key_2 not in list_1_dict:
            missing_in_source.append({
                "repoKey": repo_key_2,
                "repoType": item_2.get("repoType"),
                "Source_filesCount": 0,
                "Target_filesCount": item_2.get("filesCount"),
                "difference": item_2.get("filesCount", 0) - 0
            })
    
    return diffs, missing_in_target, missing_in_source


def main(file_1, file_2):
    """Main function to load and compare JSON files."""
    json_data_1 = load_json_file(file_1)
    json_data_2 = load_json_file(file_2)
    
    diffs, missing_in_target, missing_in_source = compare_files_count(json_data_1, json_data_2)
    
    if diffs:
        print("Differences found:")
        for diff in diffs:
            print(diff)
    else:
        print("No differences found.")
    
    if missing_in_target:
        print("\nMissing repos in the target compared to the source (only exist in Source):")
        for repo in missing_in_target:
            print(repo)
    else:
        print("No missing repos found in the target.")
    
    if missing_in_source:
        print("\nMissing repos in the source compared to the target (only exist in Target):")
        for repo in missing_in_source:
            print(repo)
    else:
        print("No missing repos found in the source.")


if __name__ == "__main__":
    file_1 = 'storage-summary-source.json'
    file_2 = 'storage-summary-target.json'
    
    main(file_1, file_2)
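
To illustrate what the comparison reports, here is a condensed sketch of the same idea applied to small in-memory samples. The repository names and counts are made up for illustration, and `files_count_delta` is a hypothetical helper, not part of the script above. Unlike the full script, it treats a repository that is missing on one side as having a filesCount of 0.

```python
def files_count_delta(source, target):
    """Map each differing repoKey to (source_count, target_count, difference)."""
    src = {r["repoKey"]: r.get("filesCount", 0)
           for r in source.get("repositoriesSummaryList", [])}
    tgt = {r["repoKey"]: r.get("filesCount", 0)
           for r in target.get("repositoriesSummaryList", [])}
    delta = {}
    # Walk the union of keys so repos present on only one side are included.
    for key in sorted(src.keys() | tgt.keys()):
        s, t = src.get(key, 0), tgt.get(key, 0)
        if s != t:
            delta[key] = (s, t, t - s)
    return delta


# Made-up sample data using the same fields the script reads.
source = {"repositoriesSummaryList": [
    {"repoKey": "libs-release-local", "repoType": "LOCAL", "filesCount": 120},
    {"repoKey": "npm-local", "repoType": "LOCAL", "filesCount": 45},
    {"repoKey": "old-repo", "repoType": "LOCAL", "filesCount": 7},
]}
target = {"repositoriesSummaryList": [
    {"repoKey": "libs-release-local", "repoType": "LOCAL", "filesCount": 118},
    {"repoKey": "npm-local", "repoType": "LOCAL", "filesCount": 45},
    {"repoKey": "new-repo", "repoType": "LOCAL", "filesCount": 3},
]}

for key, (s, t, d) in files_count_delta(source, target).items():
    print(f"{key}: source={s} target={t} difference={d:+d}")
# libs-release-local: source=120 target=118 difference=-2
# new-repo: source=0 target=3 difference=+3
# old-repo: source=7 target=0 difference=-7
```

A negative difference means the target holds fewer files than the source, which is usually the case worth investigating after an import.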

Conclusion  
This script will help you identify differences in filesCount between the source and target storage summaries. It will also flag any repositories that are present in one environment but missing in the other, enabling you to quickly address any discrepancies that may have arisen during the system import process.