I wanted to serve a PyTorch model without it constantly hogging GPU VRAM, so that I could train other things on the same GPU at the same time.

I also wanted a solution that was agnostic to the type of model, both for loading and for inference.

So I wrote this AutoUnloadModel class, which unloads the model if it hasn't been used for some period. I used __init_subclass__ so that all the details around timers, locks, etc. stay hidden from the subclass.

I found __init_subclass__ very neat for this job, which is why I'm sharing it. Thanks!
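
For anyone who wants the gist of the trick without clicking through, here's a rough sketch of the idea (not the actual code from the repo; the load_model()/infer() hook names, the poll interval, and the default timeout are my own illustrative choices). __init_subclass__ wraps the subclass's infer() at class-creation time so every call takes a lock, lazily loads the model, and refreshes a last-used timestamp, while a background thread drops the model reference once it has been idle too long:

    import functools
    import threading
    import time

    class AutoUnloadModel:
        """Unloads an idle model after `idle_seconds` of no infer() calls."""
        idle_seconds = 300  # illustrative default

        def __init_subclass__(cls, **kwargs):
            super().__init_subclass__(**kwargs)
            original_infer = cls.infer

            # Wrap the subclass's infer(): serialize calls behind a lock,
            # lazily (re)load the model, and refresh the idle timestamp.
            @functools.wraps(original_infer)
            def guarded_infer(self, *args, **kw):
                with self._lock:
                    if self._model is None:
                        self._model = self.load_model()  # subclass hook
                    self._last_used = time.monotonic()
                    return original_infer(self, *args, **kw)

            cls.infer = guarded_infer

        def __init__(self):
            self._model = None
            self._lock = threading.Lock()
            self._last_used = time.monotonic()
            threading.Thread(target=self._watch, daemon=True).start()

        def _watch(self):
            # Background thread: drop the model once it has sat idle too long.
            while True:
                time.sleep(5)
                with self._lock:
                    idle = time.monotonic() - self._last_used
                    if self._model is not None and idle > self.idle_seconds:
                        self._model = None  # release the reference so VRAM can be reclaimed
                        # torch.cuda.empty_cache() could also be called here

        # Subclasses override these two hooks:
        def load_model(self):
            raise NotImplementedError

        def infer(self, *args, **kwargs):
            raise NotImplementedError

A subclass then only writes the two hooks and never sees the lock, the timestamp, or the watcher thread, e.g.:

    class MyClassifier(AutoUnloadModel):
        idle_seconds = 60

        def load_model(self):
            import torch
            return torch.load("model.pt").cuda().eval()  # illustrative path

        def infer(self, batch):
            return self._model(batch)  # batch assumed to already be on the GPU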