Fly.io has GPUs now

605 points, posted by andes314 over 1 year ago

25 comments

k8svet, over 1 year ago

Does the other basic stuff work? I am shocked at how our production usage of Fly has gone. Even basic things, such as support not being able to just... look up internal platform issues. Cryptic or non-existent error messages. I'm not impressed. It feels like it's compelling to those scared of or ignorant of Kubernetes. I thought I was over Kubernetes, but Fly makes me miss it.

xena, over 1 year ago

Hi, author of the post and Fly.io devrel here in case anyone has any questions. GPUs went GA yesterday, you can experiment with them to your heart's content should the fraud algorithm machine god smile upon you. I'm mostly surprised my signal post about what the "GPUs" are didn't land well here: https://fly.io/blog/what-are-these-gpus-really/

If anyone has any questions, fire away!

niz4ts, over 1 year ago

As far as I know, Fly uses Firecracker for their VMs. I've been following Firecracker for a while now (even using it in a project), and it doesn't support GPUs out of the box (and there is no plan to support them [1]).

I'm curious to know how Fly figured out their own GPU support with Firecracker. In the past they had some very detailed technical posts on how they achieved certain things, so I'm hoping we'll see one on their GPU support in the future!

[1]: https://github.com/firecracker-microvm/firecracker/issues/1179#issuecomment-1846090947

iambateman, over 1 year ago

It’s cool to see that they can handle scaling down to zero, especially for working on experimental sites that don’t have the users to justify even modest server costs.

I would love an example of how much time a request is charged for. Obviously it will vary, but is it 2 seconds or “minimum 60 seconds per spin up”?

pgt, over 1 year ago

I was an early adopter of Fly.io. It is not production-ready. They should fix their basic features before adding new ones.

nakovet, over 1 year ago

This is about Fly but not about the GPU announcement: I wish they had an S3 replacement. They suggest a GNU Affero-licensed project, which is a dealbreaker for any business. Needing to leave Fly to store user assets was a dealbreaker for using Fly on our next project. Sad, because I love the simplicity, the value for money, and the built-in VPN.

qeternity, over 1 year ago

Who is the target market for this? Small/unproven apps that need to run some AI model, but won't/can't use hosted offerings by the literally dozens of race-to-zero startups offering OSS models?

We run plenty of our own models and hardware, so I get wanting to have control over the metal. I'm just trying to figure out who this is targeted at.

ec109685, over 1 year ago

The recipe example, or any LLM use case, seems like a very poor way of highlighting “inference at the edge”, given that the extra few hundred ms of round trip won’t matter.

holoduke, over 1 year ago

Does anybody have experience with the performance? At first glance they look quite expensive compared to, for example, Hetzner (CPU machines).

unixhero, over 1 year ago

I use the Fly.io free tier to run uptime monitoring with Uptime Kuma. It works insanely well, and I'm a really happy camper.

UncleOxidant, over 1 year ago

I don't want to deploy an app, I just want to play around with LLMs and don't want to go out and buy an expensive PC with a high-end GPU just now. Is Fly.io a good way to go? What about alternatives?

andes314, over 1 year ago

Can anyone who has used Beam.Cloud compare that service to this one?

DreamGen, over 1 year ago

Great, more competition for the price-gouging platforms like Replicate and Modal is needed. As always with these, I would be curious about the cold-start time -- are you doing anything smart about being able to start (load models into VRAM) quickly? Most platforms that I tested are completely naive in their implementation, often downloading the docker image just-in-time instead of having it ready to be deployed on multiple machines.

jimnotgym, over 1 year ago

It is a bit of an odd thing that we still call GPUs GPUs when the main use for them seems to have little to do with Graphics!

Havoc, over 1 year ago

How fast is the spin up/down on this scale to zero? If it is fast this could be pretty interesting

wslh, over 1 year ago

Interesting. We have been discussing this kind of service (offloading training) over the last several days [1] [2] [3], thinking about the opportunity to compete with top cloud services such as Google Cloud, AWS, and Azure.

[1] https://news.ycombinator.com/item?id=39353663
[2] https://news.ycombinator.com/item?id=39329764
[3] https://news.ycombinator.com/item?id=39263422

riquito, over 1 year ago

Is there any configuration to keep the machine alive for X seconds after a request has been served, instead of scaling down to zero immediately? I couldn't find it skimming the docs.

dcsan, over 1 year ago

Can Fly run Cog files like Replicate uses? It would be nice to take those prepackaged models and run them here with the same prediction API.

Maybe because it's Replicate's they might be hesitant to adopt it, but it does seem to make things a lot smoother. Even with Lambda Labs' Lambda Stack I still hit CUDA hell.

https://github.com/replicate/cog

isoprophlex, over 1 year ago

Almost twice as cheap as Modal! Very nice!

nextworddev, over 1 year ago

Somehow cheaper than AWS?

Mikejames, over 1 year ago

Does anyone know if this is PCI passthrough for a full A100, or some fancy clever vGPU thing?

bugbuddy, over 1 year ago

This is amazing and it shows that Nvidia should be the most valuable stock in the world. Every company, country, city, town, village, large enterprise, medium and small business, AI bro, Crypto bro, gamer bro, big tech, small tech, old tech, new tech, and start up want Nvidia GPUs. Nvidia GPUs will become the new green oil of the 21st century. I am all in and nothing short of a margin call will change my mind.

dvrp, over 1 year ago

too expensive

m3kw9, over 1 year ago

Having GPUs is news now?

faust201, over 1 year ago

> The speed of light is only so fast

This is the title of one of the sections. Why? I think the IT sector needs to stop using such titles.