python - How to limit the number of concurrent HTTP requests across asynchronous Celery tasks
I am crawling many websites asynchronously using Celery tasks (+ Python requests, PhantomJS), and I use a Crawlera proxy account that has a limit of 100 concurrent requests.
I am wondering what the best way to handle this is. I know I could use a semaphore in Redis (or something else) and re-trigger the task after a random number of seconds whenever acquire() fails, roughly as in the sketch below, but I think this approach is not good.
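For reference, a minimal sketch of the semaphore-and-retry idea described above, using a shared Redis counter instead of a real semaphore object; the key name, the back-off window, and the broker URL are assumptions, not part of the original post:

```python
import random

import redis
import requests
from celery import Celery

app = Celery("crawler", broker="redis://localhost:6379/0")
r = redis.Redis()

MAX_CONCURRENT = 100          # Crawlera plan limit
SLOT_KEY = "crawlera:slots"   # counter shared by every worker


@app.task(bind=True, max_retries=None)
def crawl(self, url):
    # Try to take a slot; if the limit is already reached, release it
    # and re-queue the task after a random back-off.
    if r.incr(SLOT_KEY) > MAX_CONCURRENT:
        r.decr(SLOT_KEY)
        raise self.retry(countdown=random.randint(1, 10))
    try:
        return requests.get(url, timeout=30).text
    finally:
        # Always give the slot back, even if the request fails.
        r.decr(SLOT_KEY)
```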
What makes you think that approach is not good? It may or may not suit your situation; it depends entirely on the details of your implementation and environment.
An alternative approach is to constrain the number of simultaneous connections a single worker can make, and then constrain the number of workers/tasks so that the total number of connections never exceeds 100.
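As a rough sketch of that alternative, assuming 4 worker nodes: cap each worker at 25 concurrent tasks so the combined total can never exceed the 100-connection limit. The broker URL and the 4 x 25 split are illustrative assumptions:

```python
import requests
from celery import Celery

app = Celery("crawler", broker="redis://localhost:6379/0")

# 4 workers x 25 concurrent tasks = at most 100 simultaneous requests.
# The same limit can be set on the command line:
#   celery -A crawler worker --concurrency=25
app.conf.worker_concurrency = 25


@app.task
def crawl(url):
    # Each running task holds at most one proxy connection at a time.
    return requests.get(url, timeout=30).text
```

The advantage over the semaphore approach is that no shared state or retry logic is needed; the limit is enforced purely by how many task slots exist across the cluster.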