Building a Software Data Retention Strategy and Why You Need One
Every day, your developers are pushing software. Some of that software will make it to production, but many of those incremental builds will not. While you shouldn’t remove those incremental builds and old release versions haphazardly, if left unchecked, they can clog up your software repositories as well as the workflows and systems they serve.
Daily development efforts, internal policies, and regulations around data retention, privacy, and cybersecurity require organizations to maintain certain pieces of software data for defined periods, sometimes over multiple years. A carefully considered software retention strategy is essential to balance keeping repositories clean and well maintained with retaining vital software artifacts.
What are Data Retention Policies in Software Development?
At the heart of your software retention strategy will be various software retention policies. A software retention policy is a type of data retention policy that governs how long software assets used and generated in the development process must be maintained by an organization, either in operational systems or archives.
Software retention policies can fall under two categories: cleanup policies and archival policies.
- Cleanup policies dictate when to remove stale and unneeded artifacts and their metadata from storage and databases. Once removed, these assets are gone forever.
- Archival policies define when to move artifacts and their metadata from operational repositories into a dedicated long-term archive. Archived assets are retained for longer periods, typically outside your operational system.
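The distinction between the two categories can be sketched as a small data model. This is a minimal illustration, not the schema of any particular product: the `Action`, `RetentionPolicy`, and `is_reversible` names, the repository names, and the age thresholds are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CLEANUP = "cleanup"  # remove the artifact and its metadata permanently
    ARCHIVE = "archive"  # move the artifact and its metadata to long-term storage


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical policy record; field names are illustrative only."""
    name: str
    action: Action
    repository: str
    max_age_days: int

    def is_reversible(self) -> bool:
        # Archived assets can be restored later; cleaned-up assets cannot.
        return self.action is Action.ARCHIVE


# Example policies with made-up repository names and thresholds.
policies = [
    RetentionPolicy("purge-incremental-builds", Action.CLEANUP, "dev-builds", 1),
    RetentionPolicy("archive-old-releases", Action.ARCHIVE, "releases", 365),
]
```

Keeping the action explicit on each policy makes it easy to review which rules are destructive before any of them run.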
What to consider when building your software data retention policies
Every organization will have unique factors dictating how they define and implement software retention policies. Retention policies should always be built in partnership with a legal and/or audit team. Here are four key areas to consider when drafting your retention policies.
Resource capacity
Resource capacity encompasses two resource categories: people and technology.
Taking inventory of your “people resources” will help dictate how manual* your retention strategy can be – including everything from removing artifacts from systems to auditing. If you’re building a homegrown or DIY solution to implement your policies, you should consider the effort that’s needed to set up and maintain systems, perform manual cleanups, manually move artifacts into your dedicated archival space, and provide reports and data to auditing teams.
Technology considerations will include things like how much storage you have in the cloud or on your servers. You’ll need to evaluate your current tools to see how they can help automate the implementation of your retention policies. If you’re going the homegrown route, you’ll also need technical resources to maintain and secure the tools you build.
*In most cases, JFrog recommends full automation of all cleanup and archiving to prevent human error.
Regulations
It’s important to understand what regulations your organization is subject to, as this will dictate what types of software assets you need to maintain and for how long. Typical regulations to evaluate against include:
- Sarbanes-Oxley
- GDPR
- ISO
- HIPAA
- PCI
Your development processes
Every organization’s software development lifecycle (SDLC) is unique. When setting up retention policies, you need to consider how long you need to keep incremental builds and other assets available for your developers to access. For example, JFrog works with a major retailer that has all incremental builds published to one repository that is flushed at the end of each day. Your organization’s development process and SDLC will naturally influence your software retention strategy, primarily your cleanup policies.
Business requirements and data availability
You also need to consider how you want to flag assets that fall under a given policy and the options available to you. Four of the most important factors will be age, usage, version condition, and location (typically defined by the repository the asset lives in). For example, you might decide to archive Docker images from Project-A’s “Release” repository that are older than one year.
The criteria you use to implement your policies will be influenced by your asset storage structure and the metadata available for filtering and identifying packages. This will be challenging if you’re not using a solution tailored to managing software artifacts.
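As a rough sketch of how such criteria might be evaluated, the example below filters artifacts by location, age, and (optionally) usage, mirroring the "Docker images from Project-A’s Release repository older than one year" rule. The `Artifact` and `Criteria` classes and all field names are hypothetical; a real artifact manager would expose its own metadata and query mechanism, and version conditions are left out here for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Artifact:
    """Hypothetical artifact record; real metadata will differ."""
    project: str
    repository: str
    package_type: str
    created: datetime
    last_downloaded: Optional[datetime] = None


@dataclass(frozen=True)
class Criteria:
    project: str
    repository: str
    package_type: str
    min_age_days: int
    unused_for_days: Optional[int] = None  # optional usage condition

    def matches(self, artifact: Artifact, now: datetime) -> bool:
        # Location: project, repository, and package type must all match.
        location = (artifact.project, artifact.repository, artifact.package_type)
        if location != (self.project, self.repository, self.package_type):
            return False
        # Age: the artifact must be strictly older than the threshold.
        if now - artifact.created <= timedelta(days=self.min_age_days):
            return False
        # Usage: skip artifacts that were downloaded recently.
        if self.unused_for_days is not None and artifact.last_downloaded is not None:
            if now - artifact.last_downloaded <= timedelta(days=self.unused_for_days):
                return False
        return True


def select(artifacts: List[Artifact], criteria: Criteria, now: datetime) -> List[Artifact]:
    """Return the artifacts that fall under the given criteria."""
    return [a for a in artifacts if criteria.matches(a, now)]


# Archive Docker images from Project-A's "Release" repository older than one year.
archive_rule = Criteria("Project-A", "Release", "docker", min_age_days=365)
```

Passing `now` explicitly, rather than reading the clock inside `matches`, keeps the rule deterministic and easy to test against a fixed date.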
Benefits of implementing retention policies
Defining and implementing retention policies will impart a number of benefits across your development organization as well as auditing and governance functions. Implementing automation to administer the policies will take these benefits even further.
- Improves productivity – Clean and well-maintained repositories will typically outperform those that get bloated. That means faster builds and asset serving. They also help ensure that developers are working with the appropriate versions of software components, preventing errors and rework. Further, automating the running of policies will eliminate the need for developers, DevOps, or infrastructure teams to manually clean or archive assets.
- Contains costs – A growing organization typically means more software in development and production which means more cost to store, manage, and maintain all those assets. Retention policies ensure you’re not wasting resources on assets you no longer need for operations.
- Simplifies compliance – Having documented and implemented retention policies is essential for remaining compliant with regulations and internal policies. Automating your archiving approach ensures that every software asset you must maintain is preserved without the need for human intervention.
- Prevents accidental data loss – Defining what, when, and how you remove software artifacts from your repositories and systems is essential to avoid accidental deletion of important assets. Automating the cleanup and archival process eliminates human error that could lead to the wrong things being saved or deleted.
Best practices for software data retention policies
Ideally, you should implement your newly refined retention policies through your artifact management solution since it likely houses all of your software artifacts, builds, and releases. Here are some practical tips to keep in mind as you move forward.
- Automate everything. Evaluate what policy and automation features are available within your artifact manager to automate your retention policies. Manually implementing your policies takes teams away from value-add activities and can lead to further issues (e.g., deleting the wrong things).
- Leverage dry runs. It’s a good idea to test your policies before you institute them. This way you can be sure your new policies don’t have any negative or unintended consequences.
- Maintain the metadata. Always store the metadata of the software you’re archiving together with those assets in your archive location. That way, auditing teams have the full context of the software and DevOps teams will have an easier time restoring it if necessary.
- Don’t go it alone. Maintaining an archive of released software is necessary for regulatory compliance, but isn’t a value-added activity. It’s also very likely you won’t ever need to access it, at least not in the foreseeable future. You may want to consider archiving as a service. As many CIOs are looking to reduce expenditure on physical servers and resources, this approach enables you to have a scalable, secure, and resilient archive that you’ll never have to worry about managing and maintaining.
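Two of the tips above, dry runs and keeping metadata with archived assets, can be illustrated together. The sketch below assumes artifacts are plain files on a local filesystem and that metadata is stored as a JSON sidecar file next to each archived artifact; the `archive_artifact` function, the `.metadata.json` suffix, and the directory layout are assumptions for illustration, not how any specific artifact manager works.

```python
import json
import shutil
from pathlib import Path
from typing import List


def archive_artifact(src: Path, metadata: dict, archive_dir: Path,
                     dry_run: bool = True) -> List[str]:
    """Move `src` into `archive_dir` alongside a JSON metadata sidecar.

    Returns the list of planned actions. When `dry_run` is True (the
    default), nothing on disk is changed.
    """
    dest = archive_dir / src.name
    sidecar = archive_dir / (src.name + ".metadata.json")
    plan = [f"move {src} -> {dest}", f"write {sidecar}"]
    if dry_run:
        return plan

    archive_dir.mkdir(parents=True, exist_ok=True)
    # Write the metadata first so the archive never holds an artifact
    # without its context.
    sidecar.write_text(json.dumps(metadata, indent=2, sort_keys=True))
    shutil.move(str(src), str(dest))
    return plan
```

Defaulting `dry_run` to `True` means a policy has to be switched on deliberately; reviewing the returned plan first is one way to catch a rule that would move more than intended.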
Make retention easier – look to your artifact manager
If you don’t have an artifact management solution in place today, or it’s lacking the functionality to support the steps mentioned above, JFrog has you covered. With the JFrog Platform you can implement an automated retention strategy that empowers you to meet regulations while improving productivity, all with zero hassle.
Learn more about software retention and how JFrog can help you build and implement your tailored software retention strategy in this on-demand webinar.