Celery: A Distributed Task Queue for Django

71 points by Jasber almost 16 years ago

6 comments

anuraggoel almost 16 years ago
Looks interesting. But shouldn't a library like celery work outside the context of a web framework? I don't see a reason to call this a distributed task queue 'for Django' specifically, except for the dependencies on Django's ORM and settings definitions. Swapping out Django's ORM with SQLAlchemy (or DB-API) would make this project much more useful.

See pp (http://www.parallelpython.com/) for something similar, without the Django dependency. More parallel-processing goodies at http://wiki.python.org/moin/ParallelProcessing.
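For context on the pp suggestion above, here is a minimal sketch of submitting work to Parallel Python with no web-framework dependency at all; the crawl function and URL are placeholders, not anything from the comment:

```python
import pp

def crawl(url):
    # Stand-in for real work; pp ships the function off to a worker process.
    return len(url)

# Autodetects the number of local workers; remote ppserver nodes can be passed
# via the ppservers argument to spread jobs across machines.
job_server = pp.Server()

job = job_server.submit(crawl, ("http://example.com/",))
print(job())  # calling the job object blocks until the worker returns a result
```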
pie almost 16 years ago
Having just hacked together an ugly threaded task queue for scraping and multi-stage data processing in Django, this looks like a breath of fresh air. I need to work my way out of the self-inflicted mess I've created.

Does anyone have experience with this library or anything similar?
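For comparison, the kind of hand-rolled threaded queue described above usually boils down to something like the sketch below; this is illustrative only, not the commenter's code, and the worker count and job are placeholders:

```python
import queue
import threading

def worker(tasks):
    # Pull (func, args) pairs off the queue until a None sentinel arrives.
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        func, args = item
        try:
            func(*args)
        finally:
            tasks.task_done()

tasks = queue.Queue()
for _ in range(4):
    threading.Thread(target=worker, args=(tasks,), daemon=True).start()

tasks.put((print, ("scraping page 1",)))  # stand-in for a real scrape job
tasks.join()  # wait for all queued work to finish
```

Celery's pitch is essentially to replace this in-process plumbing with a persistent broker, retries, and workers on multiple machines.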
tdavis almost 16 years ago
beanstalkd (http://xph.us/software/beanstalkd/) also has similarities to this, and for non-Django / simpler needs, it may be better. It's basically memcached repurposed into a queue server.

A "task" would be equivalent to a script which only looks for jobs in a certain bucket (or "tube" as they're called). You can run as many clients on as many machines as you like. Obviously, since it is memory-based, you'll lose the queue in the event of a system crash.

That being said, as a rabid Django user, this is definitely going into my bookmarks!
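A rough sketch of the put/reserve cycle with beanstalkd, using the beanstalkc Python client; the choice of client library, the tube name, and the job body are assumptions, not details from the comment:

```python
import beanstalkc  # a commonly used Python client for beanstalkd; assumed here

conn = beanstalkc.Connection(host='localhost', port=11300)

# Producer side: drop a job body into a named tube.
conn.use('scrape')
conn.put('http://example.com/page-1')

# Consumer side: watch the same tube and reserve jobs as they arrive.
conn.watch('scrape')
job = conn.reserve()        # blocks until a job is available
print('got job:', job.body)
job.delete()                # acknowledge so beanstalkd removes it from the queue
```

As the comment notes, with a purely memory-backed queue, whatever is still queued at crash time is gone.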
diN0bot almost 16 years ago
I've been reading through the documentation on the celery GitHub page. I haven't been able to figure out the appropriate task breakdown. That is, I'm trying to do some crawling and ingestion, and I'm wondering if I should be pushing a dozen small tasks onto the queue every second, or pushing larger tasks (possibly with subtasks broken out, as it suggests) every minute or hour.

This sounds like a dumb question to my own ears, but I just don't have the familiarity to know the proper use case. I essentially want continuous crawling and ingestion, with the potential to spread the load across multiple servers one day.

(Presumably the ingestors would be populating local databases, with a query getting farmed out to each server+database, but I haven't figured that part out either... ummm, sounds like a task I could put into the queue as well. Are these things really nails?)

I'd be grateful if anyone can point me to some examples or provide a bit of context.
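One common answer to the granularity question is to keep tasks small and let a coarse task fan out fine-grained ones, since many small per-URL tasks spread naturally across workers. A sketch using Celery's decorator syntax (modern syntax rather than the 2009 API discussed in the thread; the broker URL and function bodies are placeholders):

```python
from celery import Celery

# Broker URL is an assumption; any broker Celery supports works here.
app = Celery('crawler', broker='amqp://localhost//')

@app.task
def fetch_and_ingest(url):
    # Download one page and write it to the local datastore.
    ...

@app.task
def crawl(seed_urls):
    # Fan out: one small task per URL, rather than one long-running crawl
    # task pinned to a single worker.
    for url in seed_urls:
        fetch_and_ingest.delay(url)
```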
amix almost 16 years ago
I have often wondered why not use a MySQL table as a "queue" (or more tables if needed). Basically, you get great performance (MySQL is really fast), great language support (a LOT of languages can add tasks via simple SQL), and things like easy backups and replication.
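The table-as-queue idea usually relies on workers claiming rows inside a transaction so two workers never grab the same task. A sketch with a DB-API driver (MySQLdb, the table layout, and the credentials are all assumptions):

```python
import MySQLdb  # any DB-API driver works; MySQLdb is assumed here

db = MySQLdb.connect(host="localhost", user="worker", passwd="secret", db="jobs")
cur = db.cursor()

# Lock one pending row so concurrent workers cannot claim the same task.
cur.execute(
    "SELECT id, payload FROM tasks "
    "WHERE status = 'pending' ORDER BY id LIMIT 1 FOR UPDATE"
)
row = cur.fetchone()
if row:
    task_id, payload = row
    cur.execute("UPDATE tasks SET status = 'running' WHERE id = %s", (task_id,))
db.commit()
```

The usual trade-off is polling overhead and lock contention as volume grows, which is where purpose-built brokers tend to pull ahead.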
mshafrir almost 16 years ago
Google App Engine needs something like this.