Celery 5.0

172 points by dragonsh over 4 years ago

20 comments

sharmi over 4 years ago

Celery was never reliable for me. I used Celery as part of a Django project, and it kept running into weird issues. Celery-Flower, which is used to monitor the Celery threads, is not maintained, and that became a huge issue for me as I had no way of tracking any problem.

I have also used Celery to parallelize crawling of multiple websites. Beyond a small number of workers, Celery would go into a frozen state (my data flow was one-way, so there was no chance of deadlocking that I know of). The code felt too convoluted to dive in and understand.

I have since shifted to Dramatiq (https://dramatiq.io/). It has all the features I like in Celery (chaining of async methods with error handling when any one link in the chain fails), with super reliability. It also has Django support.

Having said that, it must be recognized that Dramatiq was built on the success of Celery and the lessons learned from it. Celery is a wonderful piece of work, still in active use in almost all Python shops. I am just thankful that we now have so many alternatives.

PS: The Django-Q mentioned in another comment looks very well structured, comprehensive and actively developed. I shall give it a try. For now, I'm very happy with Dramatiq.

bkovacev over 4 years ago

Amazing news and congrats to the devs! We rely on Celery to handle about 700k tasks daily and it has not failed us once.

However, if any of the maintainers are reading: could you please have auvipy stop deliberately closing outstanding issues / PRs that he thinks are not important, and that other maintainers then proceed to re-open?

Disclosure: I got banned just for pointing that out, so I may be slightly biased. [0]

[0] https://github.com/celery/celery/issues/4817#issuecomment-472516894

mumblemumble over 4 years ago

> Starting from now users should expect more frequent releases of major versions as we move fast and break things to bring you even better experience.

I was in the process of evaluating Celery for a project, and it was looking promising, but this sentence alone might have prompted me to pull the emergency brake.

I've had past experience with projects that have a policy of shipping frequent major releases in order to have frequent breaking changes, and it was never a fun time. It's not just that the breaking changes themselves are troublesome. It's also that projects that are overly liberal about removing features tend to become overly liberal about adding features, too. As a result, over the long run, they tend to become bloated and clunky at a faster-than-average rate.

dragonsh over 4 years ago

This is a great release; it fixes the long-standing memory leak bug [1] in celery beat for periodic tasks.

Combined with Flower [2], it provides a simple platform for building powerful data integration with real-time and scheduled long-running jobs. Apache Airflow also uses Celery for distributed execution of tasks and jobs.

[1] https://github.com/celery/celery/issues/4843 (fixed in 5.0, will be backported to 4.x).

[2] https://flower.readthedocs.io/en/latest/

aprdm over 4 years ago

This trend of wanting to move fast and break old code is exactly the opposite of what I want from my dependencies.

Look at the story of AngularJS vs React.

As a library, you're the least important part of my system. I use you to save me time so that I can focus on business logic. The moment I am constantly spending time on you is the moment I start looking for something else.

randlet over 4 years ago

For anyone who's been looking for a simpler task queue for Django, check out Django-Q [1]; it's been working very nicely for me. It doesn't have all the bells and whistles of Celery, but it's very simple to get up and running using the DB as a broker, and it comes with periodic tasks out of the box.

[1] https://django-q.readthedocs.io/en/latest/

ComodoHacker over 4 years ago

So nice of the devs to start their release notes with a one-paragraph explanation of what the product is.

A link to a more detailed description is missing, though.

spapas82 over 4 years ago

The main problem with Celery is its overuse in the Python world. It's not a bad project, but it sees a lot of negativity because it's used in places where it shouldn't be! There are a lot of tutorials suggesting that "async tasks" are required if you are interfacing with an external service (e.g. sending email). Most of these people suggest actually using Celery for that.

Well, the fact is that if you are sending a couple of emails per hour for users who are registering on your site, you don't need async tasks. Just offload them to your SMTP server or use something like SendGrid. It won't matter to the user if it takes a couple of seconds until they see the HTTP response, or if they get the email after 1 minute.

Also, even for other kinds of async tasks you can usually get away with a management command and a cron job running once per minute. Let's suppose your users need to create a report that takes 1 minute to build. Just flag it to run in your request/response cycle and run it in the management command from cron. The user will be notified when it has finished.

Or, even if you actually need that async task functionality, just start with a simpler way to run async tasks (one that isn't as complex as Celery, doesn't need RabbitMQ, and can even use the database as a broker so you won't need any external parts). You almost certainly don't need to support hundreds of async tasks per minute or complex task workflows using forks, joins, etc. I mean, if you need to use Celery you will probably know it.

sdfjkl over 4 years ago

So, uhm. What _is_ new? There's a bunch of removals and updated prerequisites, but somehow the page doesn't say what's actually new in Celery 5.0?

l-albertovich over 4 years ago

It's a bit sad that there's still no Windows support. It would obviously impact performance, and there are a few things to keep in mind with regard to worker health checks, but I don't think that justifies not supporting it (not trying to imply that they owe anyone anything, though).

A few years ago I added native Windows support to python-rq, but the maintainer couldn't accept the pull request, which was a bummer, so I just abandoned the project.

TeeWEE over 4 years ago

We use Celery heavily, but now that our system has started growing, I wish we had used a language-agnostic pub/sub message bus. I guess it's a different beast, but a lot of the same problems can be solved with it.

And if you have first-class messages instead of function calls as the data in your queue, things like decoupling into services become much easier.

In Celery the consumer and producer are very tightly coupled (the same code is needed on both sides).
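
To make the "first-class messages instead of function calls" idea concrete, here is a minimal sketch. It assumes an in-process `queue.Queue` as a stand-in for a real message bus, and the message shape (`type` plus `payload`) and the `user.registered` event name are hypothetical. The point is that the consumer only has to agree with the producer on a JSON schema, not import its code.

```python
import json
import queue
from typing import Callable, Dict, List

# A handler receives the decoded payload of one message.
Handler = Callable[[dict], str]


def make_message(kind: str, payload: dict) -> bytes:
    """Producer side: serialize plain data, with no reference to any function."""
    return json.dumps({"type": kind, "payload": payload}).encode("utf-8")


def consume(bus: "queue.Queue[bytes]", handlers: Dict[str, Handler]) -> List[str]:
    """Consumer side: decode each message and dispatch on its declared type."""
    results: List[str] = []
    while True:
        try:
            raw = bus.get_nowait()
        except queue.Empty:
            return results  # bus drained
        message = json.loads(raw.decode("utf-8"))
        handler = handlers.get(message["type"])
        if handler is None:
            # Unknown types are skipped, so producers can add events freely.
            results.append(f"ignored {message['type']}")
        else:
            results.append(handler(message["payload"]))
```

Because the messages are plain JSON, the consumer in this sketch could just as well be written in another language, which is the decoupling the comment is after.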

qwertox over 4 years ago

> Starting from now users should expect more frequent releases of major versions as we move fast and break things to bring you even better experience.

How can frequent breaking changes be a better experience?

dgroshev over 4 years ago

My main problem with Celery is its cowboy handling of distributed computing combined with poor documentation. I'm not aware of any documentation making it clear what their choices are for not-exactly-once results, for one. Failure modes aren't obvious either. Do we need to make sure that tasks are idempotent? Is there any chance they'll be retried if a worker dies suddenly? What exactly happens when a worker receives SIGKILL? How does all of the above work with their Canvas (task orchestration) stuff?

I just can't trust a product that doesn't even discuss those issues prominently in the documentation. Distributed computing is inherent to a background task queue, and it's one of the hardest problems out there, so their best-effort patchwork of retries and checks doesn't cut it for me. They seem to code and document for the happy case, which is a huge red flag.
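
On the idempotency question raised in this comment: when a queue can deliver a task more than once, one common defence is to key each task by an idempotency key and skip repeats. The sketch below is generic and does not describe Celery's behaviour; `IdempotentRunner` is a hypothetical name, and the in-memory dict stands in for a durable store shared by all workers.

```python
from typing import Callable, Dict


class IdempotentRunner:
    """Runs a task once per idempotency key, even if it is delivered again."""

    def __init__(self) -> None:
        # Assumption: a real system would keep this in a durable, shared
        # store; a dict is used here only to keep the sketch runnable.
        self._results: Dict[str, object] = {}

    def run(self, key: str, task: Callable[[], object]) -> object:
        if key in self._results:
            # Duplicate delivery: return the stored result, skip side effects.
            return self._results[key]
        result = task()
        self._results[key] = result
        return result
```

Note the remaining gap: if a worker dies after `task()` has had its side effect but before the result is recorded, a redelivery repeats the effect. Closing that gap needs the effect and the record to be committed together, which is exactly the kind of failure mode the comment wants documented.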

cylix over 4 years ago

I'm working on the infrastructure of a startup (Whova). Our backend is fully in Django, so our background tasks are run with Celery, since that's the main tool for that in the Python community and a lot of legacy code is built around it. We process millions of tasks per day for various things: CPU-bound logic, I/O (email/push notifications, third-party API calls, ...), scheduled tasks, ... These tasks are split across 20 queues processed by multiple workers on 6 dedicated machines.

My Celery experience so far has been quite awful, to say the least.

From the infrastructure point of view, Celery has been the least reliable component of our stack by far. As others have mentioned, one problem is that Celery is frequently used for what it is not meant for. And that's true for us too. If you check the documentation, deep down, Celery was originally designed for short-lived, CPU-bound tasks, but turned out to be used for long-lived tasks. And starting to process long-lived tasks is the root of many problems until you find the correct settings to make it work. Of course, these settings are either not documented, or the documentation is useless at best (it sometimes creates even more confusion). There are also very few good-quality resources online about this.

For several months, we dealt with Celery workers getting stuck and not processing tasks, Celery workers running out of memory, ... until we found the correct solution. And even now, we still have some random issues that are difficult to track down due to the poor quality of monitoring around Celery. It actually made me smile a few weeks ago when the engineering team of DoorDash released a blog article about Celery in which they mentioned several issues we encountered, including some they still have no clue about but managed to mitigate (in particular, the stuck Celery queue: they need to use -Ofair to fix the scheduling algorithm!) [1]

It's also very easy for developers to make mistakes with Celery: Celery routing in Django is messy (routing of individual tasks and scheduled tasks), adding new queues needs some coordination at deployment time until you automate it, generating too many scheduled tasks can make your workers run out of memory, ... Celery definitely requires solid training for all the engineers who will work with it. To be fair, this is very likely true of any background processing tool: it is usually a critical part of the tech stack, but there are fewer resources and less training for it.

We are still using Celery 3. A few months ago, when they released Celery 4, we looked into upgrading, but it was way more work than expected as the entire configuration syntax was broken. The testing needed to deploy that to production was not worth it, especially considering it took us months to find some tricky settings to get Celery to finally be somewhat stable, so why risk losing that? Now they are already at Celery 5.0, and they plan to release even more breaking updates: seriously, WTF! And if you try to report issues but you use Celery 3, you'll just be told to upgrade.

To be frank, I believe Celery is a good project. There aren't many alternatives in Python anyway. But they don't seem to listen to what their users need. It really seems that there is a gap between what they expect people to do with Celery and what people do with it. I understand it's hard to provide a good default configuration that suits everyone, but then provide appropriate documentation about how to tune Celery for your use case, or clearly state the intended use case and limitations. So the last thing we need is more breaking versions with more uncertainty about Celery; what we need is more documentation!

If they really go down that path, it's clear that we will eventually ditch Celery for something else. Celery, from our experience, is not production friendly unless you put major effort into it, or unless your project is fairly simple.

[1] https://doordash.engineering/2020/09/03/eliminating-task-processing-outages-with-kafka/
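
The comment does not say which settings finally stabilized its long-running tasks, so the fragment below is not a reconstruction of that configuration. It is only a sketch of options that Celery's own optimization guidance associates with long tasks and memory growth, written with the lowercase setting names used from Celery 4 onward (Celery 3.x used CELERYD_PREFETCH_MULTIPLIER, CELERY_ACKS_LATE and CELERYD_MAX_TASKS_PER_CHILD). The module name and the value 100 are assumptions.

```python
# celeryconfig.py (hypothetical settings module, Celery 4+ setting names)

# Reserve one task per worker process at a time, so a long task does not sit
# on a batch of prefetched tasks that an idle worker could have picked up.
worker_prefetch_multiplier = 1

# Acknowledge a message after the task finishes instead of when it starts,
# so the task is redelivered if its worker dies. Tasks must be idempotent.
task_acks_late = True

# Replace each worker process after this many tasks to bound memory growth.
# The value 100 is an arbitrary illustration, not a recommendation.
worker_max_tasks_per_child = 100
```

Whether any of these help depends on the workload, which is the commenter's point about needing documentation that ties settings to use cases.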

remram over 4 years ago

Is there any async/promise support yet? Can I get a promise for the result of a task and await it, rather than block or poll?
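
Whether Celery 5.0 offers this natively is not answered in this thread. As a generic workaround, a blocking wait can be moved onto a worker thread and awaited from asyncio. In the sketch below, `slow_task` is a hypothetical stand-in for any blocking call, such as waiting on a task result; nothing here is Celery's API.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def slow_task(x: int) -> int:
    """Stand-in for a blocking call, e.g. waiting on a task result."""
    return x * 2


async def main() -> int:
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=1) as pool:
        # run_in_executor returns an awaitable, so the event loop stays free
        # for other coroutines while the blocking call runs in the thread.
        return await loop.run_in_executor(pool, slow_task, 21)
```

This avoids blocking the event loop, but it still occupies one thread per outstanding wait, so it is a stopgap and not real promise support.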

didip over 4 years ago

If there is one wish I have for Celery:

Please take care of the Redis connections. Make sure that all of the client settings can be controlled, and make sure that it reconnects reliably.

avinassh over 4 years ago

Slightly related: how does Celery implement delayed and periodic tasks internally? Does it use its own (local) data store to keep track of tasks?

est over 4 years ago

Celery is too complex. Use Redis for a simpler queue.

bsenftner over 4 years ago

Is this not a case where the goal of the project is incompatible with the language used to implement it?

From what I can glean from a high-level scan of the project, Celery is a process/task queue, yet the intermingling of Celery's framework with whatever a process/task's goal happens to be looks like a non-starter for any organization that does not use Python like Bash.

This appears to be a poor architecture for a distributed programming framework. There are a good number of high-quality process/task queues already; I don't see a reason for this project at all.

heavenlyblue over 4 years ago

Don’t use Celery.

It doesn’t solve a single problem that a set of bare queue consumers can’t resolve.
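
For readers unsure what "a set of bare queue consumers" means, a minimal in-process version looks roughly like this. It is a sketch under assumptions: it uses threads and `queue.Queue` from the standard library where the comment presumably means separate processes reading from a broker, and `consumer`, `run_jobs` and `_STOP` are hypothetical names.

```python
import queue
import threading
from typing import Callable, List, Tuple

Job = Tuple[Callable, tuple]
_STOP = object()  # sentinel that tells a consumer to exit


def consumer(jobs: queue.Queue, results: list, lock: threading.Lock) -> None:
    """Pull jobs off the queue until the stop sentinel arrives."""
    while True:
        job = jobs.get()
        if job is _STOP:
            break
        func, args = job
        value = func(*args)
        with lock:
            results.append(value)


def run_jobs(job_list: List[Job], workers: int = 2) -> list:
    """Start the consumers, feed them every job, then shut them down."""
    jobs: queue.Queue = queue.Queue()
    results: list = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=consumer, args=(jobs, results, lock))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for job in job_list:
        jobs.put(job)
    for _ in threads:
        jobs.put(_STOP)  # one sentinel per consumer
    for thread in threads:
        thread.join()
    return results  # completion order, not submission order
```

The sketch leaves out retries, persistence across restarts, scheduling and result storage, so each of those would have to be built by hand on top of it.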
