any suggestions on how I could avoid the 'loading' aspect of a model in a server that serves client requests to a web API endpoint? such that the model is permanently 'loaded' and only has to make predictions?
# to save compute time that is (duh)
beep bop
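One common pattern (a minimal, framework-agnostic sketch; the loader and handler names here are hypothetical, not from any specific library): load the model once at module import, i.e. at server startup, and keep it in a module-level variable. Every request handler then reuses the same in-memory object, so the per-request cost is inference only. Most web frameworks (Flask, FastAPI, Django) import your module once per worker process, so this works out of the box; FastAPI also offers explicit startup/lifespan hooks for the same purpose.

```python
import time

def _load_model():
    # Stand-in for an expensive load such as joblib.load(...) or
    # torch.load(...). The sleep simulates slow deserialization.
    time.sleep(0.1)
    return lambda x: x * 2  # dummy "model" that doubles its input

# Load ONCE at module import (i.e. server startup), NOT inside the handler.
MODEL = _load_model()

def predict_endpoint(x):
    # Request handler: inference only, no loading cost per request.
    return MODEL(x)
```

Note that with multi-process servers (e.g. gunicorn with several workers), each worker process loads its own copy, so memory use scales with the worker count; for very large models a dedicated model server (e.g. TorchServe or TensorFlow Serving) behind the API is a common alternative.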
submitted by /u/doctor_slimm