Handling exceptions from urllib2 and mechanize in Python
I'm a novice at exception handling. I'm using the mechanize module to scrape several websites, and my program fails because the connection is slow and the requests time out. I'd like to be able to retry a website (on a timeout, for instance) 5 times, with a 30-second delay between each try.
I looked at this Stack Overflow answer and can see how to handle various exceptions. I also see (although it looks clumsy) how I can put the try/except inside a while loop to control 5 attempts, but I don't understand how to break out of the loop, or "continue", when the connection is successful and no exception has been thrown.
    from mechanize import Browser
    import mechanize
    import time

    b = Browser()
    tried = 0
    while tried < 5:
        try:
            r = b.open('http://www.google.com/foobar')
        except (mechanize.HTTPError, mechanize.URLError) as e:
            if isinstance(e, mechanize.HTTPError):
                print(e.code)
                tried += 1
                time.sleep(30)
                if tried > 4:
                    exit()
            else:
                print(e.reason.args)
                tried += 1
                time.sleep(30)
                if tried > 4:
                    exit()
    print("how can I get here after the first successful b.open() attempt????")
I'd appreciate advice on (1) how to break out of the loop on a successful open, and (2) how to make the whole block less clumsy and more elegant.
You don't have to repeat things in the except block in either case:
    from mechanize import Browser
    import mechanize
    import time

    b = Browser()
    tried = 0
    while True:
        try:
            r = b.open('http://www.google.com/foobar')
        except (mechanize.HTTPError, mechanize.URLError) as e:
            tried += 1
            if isinstance(e, mechanize.HTTPError):
                print(e.code)
            else:
                print(e.reason.args)
            if tried > 4:
                exit()
            time.sleep(30)
            continue  # go back to the top of the loop and retry
        break  # only reached when b.open() succeeds
Also, you may be able to use

    while not r:

depending on what Browser.open returns.
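For example, here is a minimal sketch of that approach, assuming a successful b.open() returns a truthy response object (the exception names and retry limits are taken from the question; the URL is just the question's placeholder):

    import time
    import mechanize

    b = mechanize.Browser()
    r = None
    tried = 0
    while not r:  # r stays None until an open() succeeds
        try:
            r = b.open('http://www.google.com/foobar')
        except (mechanize.HTTPError, mechanize.URLError):
            tried += 1
            if tried > 4:
                raise  # give up after 5 failed attempts
            time.sleep(30)

Here the loop condition replaces the explicit break: as soon as r is set, the loop exits on its own.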
Edit: RoadieRich showed a more elegant way:

    try:
        dosomething()
        break
    except:
        ...

because an error skips over the break and jumps straight to the except block, while a successful call falls through to the break.
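A minimal sketch of that pattern, using a for loop with Python's for/else clause in place of the manual counter (the URL is the question's placeholder; the 5 attempts and 30-second delay come from the question):

    import time
    import mechanize

    b = mechanize.Browser()
    for attempt in range(5):
        try:
            r = b.open('http://www.google.com/foobar')
            break  # only reached when open() does not raise
        except (mechanize.HTTPError, mechanize.URLError) as e:
            print(e)
            time.sleep(30)
    else:
        exit()  # the else clause runs only if the loop never hit break

The else clause on the for loop gives a natural place to handle the "all retries failed" case.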