regex - Padding multiple characters with spaces - python
In Perl, the following pads punctuation symbols with spaces:

    s/([،;؛¿!"\])}»›”؟%٪°±©®।॥…])/ $1 /g;

In Python, I've tried this:

    >>> p = u'،;؛¿!"\])}»›”؟%٪°±©®।॥…'
    >>> text = u"this, sentence weird» symbols… appearing everywhere¿"
    >>> for i in p:
    ...     text = text.replace(i, ' '+i+' ')
    ...
    >>> text
    u'this, sentence weird \xbb symbols \u2026 appearing everywhere \xbf '
    >>> print text
    this, sentence weird » symbols … appearing everywhere ¿

But is there a way to use some sort of placeholder symbol, e.g. $1 in Perl, so that I can do the same in Python with one regex?
The Python version of $1 is \1, and you should use a regex substitution instead of a simple string replace:
    import re
    p = ur'([،;؛¿!"\])}»›”؟%٪°±©®।॥…])'
    text = u"this, sentence weird» symbols… appearing everywhere¿"
    print re.sub(p, ur' \1 ', text)

Outputs:
    this, sentence weird » symbols … appearing everywhere ¿
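The snippet above is Python 2 (`ur''` prefix, `print` statement). For readers on Python 3, where the `ur` prefix no longer exists and strings are Unicode by default, the same idea might look roughly like the sketch below. The names `PUNCT`, `pad_punctuation` and `pad_and_squeeze` are made up for illustration and are not part of the original answer; the second helper is an optional extra, assuming you also want runs of spaces collapsed.

```python
import re

# Character class copied from the question. The ']' is escaped so it does not
# close the class early. The ASCII comma is not in the class, so it is left alone.
PUNCT = re.compile(r'([،;؛¿!"\])}»›”؟%٪°±©®।॥…])')


def pad_punctuation(text):
    """Surround every listed symbol with one space on each side.

    r' \1 ' is Python's equivalent of Perl's ' $1 ': \1 refers to group 1.
    """
    return PUNCT.sub(r' \1 ', text)


def pad_and_squeeze(text):
    """Hypothetical helper: pad, then collapse repeated spaces and trim the ends."""
    padded = pad_punctuation(text)
    return re.sub(r' {2,}', ' ', padded).strip()


text = "this, sentence weird» symbols… appearing everywhere¿"
print(pad_punctuation(text))
print(pad_and_squeeze(text))
```

Note that a symbol which already has a space next to it ends up with two spaces on that side after padding, and a symbol at the end of the string leaves a trailing space; `pad_and_squeeze` is one possible way to tidy that up if it matters for your use case.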