The basics of securing GenAI and LLM development
With the rapid adoption of AI-enabled services into production applications, it’s important that organizations secure the AI/ML components entering their software supply chain. The good news is that even if you don’t have a tool built specifically for scanning models, you can still apply the same DevSecOps best practices to securing model development.
Secure models start with secure components
Many of the components that make up a model are attack vectors that organizations already focus on managing and securing. Because of this, organizations should make sure their Data Scientists and Machine Learning Engineers are equipped with the same security tools and processes their core development teams use.
Here are a few examples:
Dependencies and packages – In building the model, Data Scientists will leverage open-source frameworks and libraries such as TensorFlow and PyTorch. Providing access to those dependencies from a trusted source of truth (rather than directly from the internet), scanning them for vulnerabilities, and blocking malicious packages ensures every component used in the model is secure.
Source code – A Data Scientist or ML Engineer will typically prepare a model in Python, C++, or R. Scanning that source code with a SAST solution helps catch flaws that could compromise the security of the model.
Container images – Container images are used to deploy the model for training and to make it consumable by other developers and applications. A final scan of the container image helps ensure that what’s being deployed doesn’t introduce risk to your environment (a minimal gate sketch follows this list).
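As a rough illustration of these three checks, the sketch below wires common open-source scanners into a single pre-deployment gate. The tool choices (pip-audit, Bandit, Trivy), file paths, and image name are assumptions for the example; a platform-level solution would typically enforce the same policies centrally rather than in a script.

```python
# A minimal sketch of a pre-deployment gate covering dependencies,
# source code, and the container image. Paths, image name, and tool
# selection are placeholders, not a prescribed stack.
import subprocess
import sys

CHECKS = [
    # Scan declared dependencies for known vulnerabilities.
    ["pip-audit", "-r", "requirements.txt"],
    # Static analysis (SAST) of the model-preparation source code.
    ["bandit", "-r", "src/"],
    # Scan the built container image before it is pushed or deployed.
    ["trivy", "image", "--exit-code", "1", "my-org/model-service:latest"],
]

def main() -> int:
    for cmd in CHECKS:
        print(f"Running: {' '.join(cmd)}")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"Check failed: {' '.join(cmd)} -- blocking the pipeline")
            return 1
    print("All checks passed")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

A gate like this would normally run in CI on every build, so a vulnerable dependency, risky code pattern, or flagged image stops the pipeline before anything reaches a shared environment.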
By pairing model development tools such as MLflow, Qwak, and AWS SageMaker with a single system of record for model artifacts – like the JFrog Platform – organizations can block the deployment or use of unsafe or out-of-policy components in building new models.
Ensure AI application integrity
Once a golden model has been identified, it will likely undergo further development to expose it as a service that developers can connect to as part of an AI-enabled application. An ML Engineer will typically add libraries that allow the model to be called via an API, creating another image that goes through the traditional software development cycle. In this case, you can apply the DevSecOps best practices from the previous section, with a few additional steps, to ensure the integrity of the AI components made available to your developers.
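For example, such a service layer is often little more than a thin HTTP wrapper around the registered model. The sketch below assumes FastAPI and an MLflow pyfunc model; the registry URI and input schema are illustrative placeholders rather than a required stack.

```python
# A minimal sketch of exposing a trained model as an HTTP API.
# The registry URI and feature schema below are placeholders.
import mlflow.pyfunc
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
# Load the "golden" model from the model registry (hypothetical URI).
model = mlflow.pyfunc.load_model("models:/fraud-detector/Production")

class PredictRequest(BaseModel):
    features: list[float]

@app.post("/predict")
def predict(request: PredictRequest):
    # pyfunc models accept DataFrame/array-like input; a single row here.
    prediction = model.predict([request.features])
    return {"prediction": [float(p) for p in prediction]}
```

Because this wrapper, its added dependencies, and its container image are ordinary software components, they flow through the same scanning and promotion controls as any other service.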
Those additional steps typically include:
Artifact Signing – You can and should sign all the components that make up your new service as early as possible in the MLOps pipeline and treat them as one immutable unit as they mature across stages. This helps ensure that nothing about your application has changed as it moves toward release (a signing sketch follows this list).
Promotion / Release Blocking – As an application or service moves across the MLOps pipeline, you should automatically rescan it as part of the promotion process. This allows you to identify any issues that arise as early as possible.
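To make the signing step concrete, the sketch below hashes a model artifact and signs the digest using the cryptography library. The key handling and file path are placeholders; in practice the key would come from a managed signing service or secrets manager, and verification would run automatically at each promotion step.

```python
# A minimal sketch of signing a model artifact so it can be verified
# unchanged at every later promotion stage. Paths and key handling
# are placeholders for a proper signing service.
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def sha256_digest(path: str) -> bytes:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.digest()

private_key = Ed25519PrivateKey.generate()  # placeholder; load a managed key in practice
digest = sha256_digest("model.pkl")         # hypothetical artifact path
signature = private_key.sign(digest)

# Store the signature alongside the artifact; at each promotion step,
# recompute the digest and verify it with the matching public key.
private_key.public_key().verify(signature, digest)  # raises InvalidSignature on tampering
```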
Enable ML model security without impacting productivity
Any approach to protecting your MLOps pipelines should be executed without impacting the agility of model development. To make sure that there’s little to no disruption, you should provide standard APIs for accessing artifacts and ensure the tools and processes you leverage easily integrate with multiple ML solutions.
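As one example of such a standard interface, the sketch below logs a model artifact through MLflow’s tracking API against a central server. The tracking URI, experiment name, and logged values are placeholders; the point is that Data Scientists keep using familiar APIs while artifacts land in one governed location.

```python
# A minimal sketch of routing model artifacts through one governed
# system of record via MLflow's standard tracking API.
# The URI and logged values below are placeholders.
import mlflow

mlflow.set_tracking_uri("https://mlflow.example.com")  # hypothetical tracking server
mlflow.set_experiment("fraud-detector")

with mlflow.start_run(run_name="baseline-training"):
    mlflow.log_param("max_depth", 8)     # example hyperparameter
    mlflow.log_metric("auc", 0.91)       # example evaluation metric
    # Store the serialized model where promotion and scanning can pick it up.
    mlflow.log_artifact("model.pkl")     # hypothetical local artifact path
```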
Tools like JFrog Xray and Curation help bridge the divide between Integrated Development Environments and ML solutions. With these tools, the security policies that protect your environment are enforced transparently, without requiring Data Scientists and ML Engineers to change how they work.
There are still a few areas where traditional DevSecOps tools and practices fall short for AI/ML development, such as securing training data sets or quickly deploying containers for training during development. That said, taking the steps outlined above provides a solid foundation for securing ML model development and puts you well on your way to governing ML development across your organization.
To learn more about how JFrog is empowering organizations to adopt an MLSecOps approach, read more here, or take a tour of our platform.