As the market for AI shifts from organizations with the technical expertise to build models from scratch to organizations looking to consume models built by others, the focus moves from tools and platforms for model development alone to tools and platforms for the overall usage, consumption, and management of models. Machine Learning Model Operationalization Management, also referred to as “ML Ops”, covers the lifecycle of model development and usage, including operationalization and deployment. Here are 11 things to consider when implementing ML Ops at your organization.
- Define clear goals and objectives for your ML Ops program. Organizations are moving towards consuming models developed by others rather than building their own from scratch. When you’re using models developed by others, make sure the team sets out clear goals and objectives. This helps keep everyone on the same page and working towards the same end goal.
- Establish a dedicated ML Ops team that is responsible for managing and maintaining the ML infrastructure. This team should have a mix of data scientists, software engineers, and operations experts. If you don’t have the internal expertise needed, consider what roles you do have and how you can augment the team as needed.
- Develop a comprehensive ML pipeline that includes all relevant stages of the ML lifecycle. Model consumers will be more concerned with the quality and reliability of existing models than with earlier stages of the pipeline, such as data labeling and certain data prep tasks. Make sure you understand your needs and how best to use others’ models.
- Have Model Versioning & Iteration processes in place. Models that are being consumed will most likely be iterated and versioned as needs emerge and as the models change with new training or real-world data. Make sure you have systems in place to properly label and store different versions of models and to offer visibility into model version history.
- Use version control systems to track changes to models and configurations, and make it easy to roll back to a previous version if necessary. Remember that just because a model was updated doesn’t mean that everyone using the older model needs to move to the newest version.
- Implement robust monitoring and logging systems to keep track of model performance and troubleshoot any issues that may arise. Since the real world keeps changing and drifts away from the world captured in the training data, ML Ops solutions need to monitor model usage and results to make sure that accuracy, performance, and other measures stay within acceptable ranges.
- Ensure that models are properly secured and that sensitive data is properly protected. Models are assets that need to be protected. ML Ops solutions can provide functionality to protect models from being corrupted by tainted data, overwhelmed by denial of service attacks, attacked through adversarial means, or inappropriately accessed by unauthorized users.
- Have a system in place for Model Discovery. In many instances, model users are not in the same part of the organization, or even the same company, as the creators of a particular model. ML Ops solutions should provide model registries or catalogs for models produced within the tool ecosystem, as well as a searchable model marketplace for locating consumable models, both internally developed and third-party.
- Implement a process for continuous retraining and updating of models to ensure that they stay current and continue to perform well.
- Have a system in place around model governance. ML Ops platforms should include features for model and data provenance (tracing data changes to model changes), model access control, prioritizing model access, providing transparency into how models use data, and meeting any regulatory or compliance needs for model usage.
- Regularly evaluate and assess the performance of your ML Ops program, and make any necessary adjustments to improve efficiency and effectiveness.
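The versioning, rollback, and discovery points above can be made more concrete with a small sketch. It assumes a purely in-memory registry: `ModelRegistry`, `ModelVersion`, the consumer names, and the artifact paths are all hypothetical and do not correspond to any particular ML Ops product. The idea it illustrates is that each consumer can stay pinned to an older version while others follow the latest one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelVersion:
    """One stored version of a named model (hypothetical record layout)."""
    name: str
    version: int
    artifact_uri: str  # illustrative location of the serialized model
    notes: str = ""


class ModelRegistry:
    """In-memory stand-in for a model registry; an assumption, not a real product API."""

    def __init__(self):
        self._versions = {}  # model name -> list of ModelVersion, oldest first
        self._pins = {}      # (consumer, model name) -> pinned version number

    def register(self, name, artifact_uri, notes=""):
        """Store a new version; version numbers start at 1 and increase by 1."""
        history = self._versions.setdefault(name, [])
        entry = ModelVersion(name, len(history) + 1, artifact_uri, notes)
        history.append(entry)
        return entry

    def history(self, name):
        """Return the full version history for a model (empty if unknown)."""
        return list(self._versions.get(name, []))

    def search(self, text):
        """Very simple discovery: model names containing the given text."""
        return sorted(n for n in self._versions if text.lower() in n.lower())

    def pin(self, consumer, name, version):
        """Keep a consumer on a specific version (also how a rollback is expressed)."""
        if not 1 <= version <= len(self._versions.get(name, [])):
            raise KeyError(f"{name} has no version {version}")
        self._pins[(consumer, name)] = version

    def resolve(self, consumer, name):
        """Return the pinned version for this consumer, else the latest one."""
        history = self._versions[name]
        version = self._pins.get((consumer, name), len(history))
        return history[version - 1]


registry = ModelRegistry()
registry.register("churn-classifier", "models/churn/v1.bin", notes="initial")
registry.register("churn-classifier", "models/churn/v2.bin", notes="retrained")

# The billing team stays on version 1; marketing follows the latest version.
registry.pin("billing", "churn-classifier", 1)
print(registry.resolve("billing", "churn-classifier").version)
print(registry.resolve("marketing", "churn-classifier").version)
```

A production registry would also need persistent storage, access control, and provenance metadata, which this sketch leaves out.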
Not familiar with ML Ops? Learn more on our AI Today podcast on ML Model Management and Operations (“ML Ops”).
As organizations move their AI projects out of the lab and into production across multiple business units and functions, the processes by which models are created, operationalized, managed, governed, and versioned need to be as reliable and predictable as the processes that govern traditional application development. Considering these 11 things when implementing ML Ops at your organization will help you get the most out of your machine learning models. ML Ops is a continuous process that requires ongoing attention and care to ensure that models run smoothly and deliver the desired results.