The operationalization of a machine learning model to operate on data at the time that the user is expecting a prediction response. Real-time prediction systems often must provide a response within milliseconds or seconds to perform an action or achieve a result, such as facial recognition to unlock a device or door, conversational systems with voice assistants, fraud detection in payment systems, and other applications that operate by performing a single prediction operation on data, rather than operating at a later time with a large set of data in the case of batch prediction.