python 2.7 - Removing Duplicate Tag Content Using BeautifulSoup

August 15, 2011

I made a script that gets every h1 tag from 76 pages of a website. In the process, the program also copies one specific line, "Current Affairs January 2015", because that line is present on every page. Can I edit the code so that it is printed only once?

Here's the code:

    from bs4 import BeautifulSoup as bs
    import urllib

    for i in range(2, 77):
        url1 = "http://currentaffairs.gktoday.in/month/current-affairs-january-2015/" + "page/" + str(i)
        soup = bs(urllib.urlopen(url1))
        for link in soup.findAll('h1'):
            print link.string

Thanks in advance.

    from bs4 import BeautifulSoup as bs
    import urllib

    for i in range(2, 77):
        url1 = "http://currentaffairs.gktoday.in/month/current-affairs-january-2015/" + "page/" + str(i)
        soup = bs(urllib.urlopen(url1))
        ulinks = soup.findAll('h1')
        for index, item in enumerate(ulinks):
            # On the first page fetched (i == 2) print every heading; on later
            # pages skip the first <h1>, which is the repeated title.
            if i == 2 or index != 0:
                print(item.string)
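The code above relies on the repeated heading always being the first h1 on the page. If that is not guaranteed, one alternative is to remember which heading texts have already been printed and skip any that come up again. The following is only a sketch under assumptions: `iter_new_headings` and `sample_pages` are hypothetical names, and the hard-coded lists stand in for the `item.string` values collected from `soup.findAll('h1')` on each page.

```python
def iter_new_headings(pages):
    """Yield each heading text the first time it appears across all pages."""
    seen = set()
    for headings in pages:
        for text in headings:
            if text is None:
                # In bs4, .string is None when a tag has several children.
                continue
            text = text.strip()
            if text not in seen:
                seen.add(text)
                yield text


# Illustrative data standing in for the <h1> strings scraped from each page.
sample_pages = [
    ["Current Affairs January 2015", "Story A", "Story B"],
    ["Current Affairs January 2015", "Story C"],
]

for heading in iter_new_headings(sample_pages):
    print(heading)
```

Note that this also drops any genuine story headline that happens to appear on more than one page, so it fits best when only the page title is expected to repeat.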
