Hey there. That’s a great question. Essentially a model can be served by multiple machines to make predictions. That’s what most ML deployment tools such as SageMaker and Cortex allow you to do. However, in the case of models that are updated on the fly, things are much more complex. Indeed there is a whole litterature in the database community on how to guarantee consistency when the data (in our case a model) is being updated. The trick is that we’re doing machine learning, so it’s fine if we don’t make predictions with the latest version of the model possible. This observation has already been made before, such as in Google’s HOGWILD!. In other words eventual consistency is fine for machine learning purposes. Setting this up is probably the next big milestone for chantilly.