I'd like to try moving my workflow to the cloud; it's getting to be too time-consuming on my personal machine (which is really old and can barely manage Lightweight@BS=16).
I'd like to be as cost-effective as possible, so I've been thinking of using Preemptible instances/GPUs.
These are super cheap compared to their dedicated counterparts, but I need to engineer around the caveats: they can be preempted at any moment (from what I'm reading, there's maybe a 5-15% chance of that happening over an instance's lifetime), and their lifespan is capped at 24 hours, so they'll get torn down no matter what. That last part has the added benefit of protecting me from accidentally leaving an expensive resource running and getting a crazy bill at the end of the month.
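From what I've read, a preempted VM gets a brief soft-shutdown warning before it's reclaimed, which should reach my process as SIGTERM, so I'm picturing a handler roughly like this (the checkpoint path and `save_checkpoint` body are made up; in reality the path would be on the mounted bucket and the function would serialize real model state):

```python
import os
import signal
import tempfile

# Hypothetical checkpoint location; on the VM this would live on the
# FUSE-mounted bucket so it survives the instance.
CHECKPOINT_PATH = os.path.join(tempfile.gettempdir(), "model_state.ckpt")

def save_checkpoint(path: str) -> None:
    # Stand-in for real state serialization (weights, optimizer, step count).
    with open(path, "w") as f:
        f.write("model state goes here\n")

def handle_preemption(signum, frame):
    # The soft shutdown reaches the process as SIGTERM during the short
    # grace period, so flush state here while we still can.
    save_checkpoint(CHECKPOINT_PATH)
    # A real job would exit cleanly here; omitted so this demo continues.

signal.signal(signal.SIGTERM, handle_preemption)

# Simulate a preemption locally by signalling ourselves.
os.kill(os.getpid(), signal.SIGTERM)
```

The grace period is short, so this only works if the state is small enough to flush in a few seconds; the periodic checkpointing is still the main safety net.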
To work around the caveats, I was thinking of just mounting a storage bucket using FUSE.
This can be slower than a regular filesystem, because GCS objects are immutable: a small update to a large file (e.g. a 400MB model file) still requires re-uploading basically the entire object.
I figure almost all of the work happens in memory, and it should only touch the disk when writing its state every 250 iterations (which I can crank up to 10,000 or something), so there's a sweet spot where the mount won't get in the way too much. And that way, if I get preempted in the middle of the night, my state is safe in the bucket and the instance can disappear into the ether.
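Concretely, the loop I have in mind looks something like this (with a trivial stand-in for the actual training step, and a temp-dir path standing in for the bucket mount). Note it always writes the checkpoint as one whole file, which lines up with the write-whole-object model anyway:

```python
import json
import os
import tempfile

CKPT_EVERY = 250    # iterations between checkpoints; crank up to taste
TOTAL_ITERS = 1000  # small number for the local demo
# Hypothetical path; on the VM this would be under the FUSE mount.
CKPT_PATH = os.path.join(tempfile.gettempdir(), "trainer_state.json")

def train_step(state: dict) -> dict:
    # Stand-in for one in-memory training iteration.
    state["loss"] = state.get("loss", 1.0) * 0.999
    return state

def save_checkpoint(state: dict, step: int, path: str) -> None:
    # Write the whole file in one shot: a GCS object is immutable, so a
    # partial in-place update would re-upload the object regardless.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)  # atomic on a local FS; semantics differ on a FUSE mount

state: dict = {}
for step in range(1, TOTAL_ITERS + 1):
    state = train_step(state)
    if step % CKPT_EVERY == 0:
        save_checkpoint(state, step, CKPT_PATH)
```

The write-to-temp-then-rename step is there so a preemption mid-write can't leave a half-written checkpoint as the only copy, though I'd want to check how gcsfuse actually handles renames before relying on it.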
Has anyone played with this kind of approach before? Does it work, or should I just build a thing that periodically uploads to a bucket manually? I suppose I can test it locally; no need for the cloud to validate the proof of concept.
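For the local proof of concept, I'm thinking of a kill-and-resume check along these lines (file name and the "work" are made up; the point is just that a second run picks up where a "preempted" first run left off):

```python
import json
import os
import tempfile
from typing import Optional

# Hypothetical path standing in for the bucket mount.
POC_PATH = os.path.join(tempfile.gettempdir(), "poc_state.json")

def load_checkpoint(path: str) -> dict:
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}  # fresh start if no checkpoint exists

def save_checkpoint(state: dict, path: str) -> None:
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)

def run(until: int, die_at: Optional[int] = None) -> dict:
    # Resume from wherever the last checkpoint left off.
    state = load_checkpoint(POC_PATH)
    while state["step"] < until:
        state["step"] += 1
        state["total"] += state["step"]   # stand-in for real work
        save_checkpoint(state, POC_PATH)  # checkpoint every step, just for the demo
        if die_at is not None and state["step"] == die_at:
            break                         # simulate the instance vanishing
    return state

if os.path.exists(POC_PATH):
    os.remove(POC_PATH)      # clean slate for the demo
run(until=100, die_at=40)    # first run gets "preempted" at step 40
final = run(until=100)       # second run resumes from the checkpoint
```

If the resumed run ends with the same result as an uninterrupted one, the checkpoint/resume logic is sound, and the only remaining question is how the FUSE mount behaves under real I/O.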