python - Regular expressions from a list previously specified
I am trying to do the following: for each article, print the month, which is located in either the 4th or 5th line. The way I am attempting this is by:

m = 'january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october', 'november', 'december'
for i in range(len(sections)):
    date = re.search(r"[m]", sections[i][1:5])
    print(date)

This is my first problem: I do not know how to search for a regular expression built from the list "m". My second problem is that I want to focus the search on lines 0-5 of each article.
Given:

>>> txt='''\
... line 1
... line 2
... line 3
... line 4
... line 5 april'''

You can get lines i through j with .splitlines()[i:j]:
>>> txt.splitlines()[0:3]
['line 1', 'line 2', 'line 3']

Now construct a pattern that finds the months. Be sure to use \b to find whole-word matches:
>>> import re
>>> months=['january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october', 'november', 'december']
>>> pat=re.compile("|".join([r"\b{}\b".format(m) for m in months]), re.M)

Then search for the pattern in the slice of target lines:
>>> pat.search("\n".join(txt.splitlines()[0:5]))
<_sre.SRE_Match object at 0x107a2a9f0>

If you want to capture the line it appears on, you might do this:
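One way to do that, as a minimal sketch rather than a definitive version: search line by line over the slice and return the line number along with the match. The helper name `find_month_line`, the `max_lines` parameter, the use of `re.IGNORECASE`, and the sample `sections` data are all assumptions made for illustration; the sketch also assumes each article is a single multi-line string.

```python
import re

MONTHS = ['january', 'february', 'march', 'april', 'may', 'june',
          'july', 'august', 'september', 'october', 'november', 'december']

# One alternation wrapped in a single pair of word boundaries.
# re.IGNORECASE is an assumption: it lets "May" and "may" both match.
MONTH_PAT = re.compile(r"\b(?:{})\b".format("|".join(MONTHS)), re.IGNORECASE)


def find_month_line(article, max_lines=5):
    """Return (line_number, month, line) for the first month name found in
    the first `max_lines` lines of `article`, or None if none is found.
    Line numbers are 1-based."""
    for lineno, line in enumerate(article.splitlines()[:max_lines], start=1):
        match = MONTH_PAT.search(line)
        if match:
            return lineno, match.group(0).lower(), line
    return None


# Hypothetical stand-in for the `sections` list from the question.
sections = [
    "line 1\nline 2\nline 3\nline 4\nline 5 april",
    "title\nauthor\nsummary\nPublished 3 May 2014\nbody",
]

for article in sections:
    print(find_month_line(article))
```

Note that "may" and "march" are also ordinary English words, so a match in running prose is not guaranteed to be a date; restricting the search to the first few lines reduces that risk but does not remove it.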