python - How to regex a hashtag followed by digits in Python 2?

I want to find a hashtag followed by 1 to 6 digits in Python 2.7, but my regex doesn't match properly.

Here is an example:

chaine = "[url=http://forum.darkgyver.fr/t27142-probleme-boitier-papillon-faisceau#265132:<uid>]http://forum.darkgyver.fr/t27142-probleme-boitier-papillon-faisceau#265132[/url:<uid>]"
regex = re.compile('http://forum.darkgyver.fr/(.*)\#(\d{1-6})')
match = regex.search(chaine)
if match:
    pos1 = match.start()
    pos2 = match.end()
else:
    pos1 = -1
    pos2 = -1

print "pos1 %d" % pos1
print "pos2 %d" % pos2
url_tempo = chaine[pos1:pos2]
print "url_tempo %s" % url_tempo
pospost = pos1 + url_tempo.find('#') + 1
numpost = chaine[pospost:pos2]
print "numpost %s" % numpost

This first regex returns "no match". Perhaps the hashtag is not declared properly.

So I changed the regex as follows:

regex = re.compile('http://forum.darkgyver.fr/(.*)\#([0-9]+(:| |    |\n|\[|$))')

This matches, but at the wrong position: I get pos2=161 when it should be pos2=80.

How can I fix the regex so it matches a hashtag with 1 to 6 digits behind it?

You are attempting to extract the hashtag from the URL. From the string you have given, it would seem more logical to try to extract the digits between the # and : characters. If you had a hashtag with 7 digits, would you want all 7 digits, or would you want it not to match at all? In that case, I guess you would not want just the first 6 digits.

By using the grouping operator (), if there is a match, the hashtag can be read using group(1), which avoids the need to extract it with string slicing.

The following shows one possible way to extract the hashtag:

import re

chaine = "[url=http://forum.darkgyver.fr/t27142-probleme-boitier-papillon-faisceau#265132:<uid>]http://forum.darkgyver.fr/t27142-probleme-boitier-papillon-faisceau#265132[/url:<uid>]"

# The non-greedy .*? stops at the first '#'; group 1 captures the digits before ':'
re_hashtag = re.search(re.escape("http://forum.darkgyver.fr") + r".*?#(\d+):", chaine)

print re_hashtag.start()
print re_hashtag.end()
print re_hashtag.group(1)

This will display the following:

5
80
265132

The start position is 5 because the match begins at the http, which follows the 5 characters of [url=.

Note that I have used the re.escape() function to make sure the URL is correctly escaped. If you print the following, you will see how your initial regular expression should have been written:
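The end position of 80, rather than the 161 from your second attempt, comes from the non-greedy .*?. A greedy .* runs as far as it can and then backtracks to the last # in the string, which belongs to the second copy of the URL. As a small sketch of the difference, using the same string and a simplified trailing character class [:\[] (that simplification is my own assumption, not your exact pattern):

```python
import re

chaine = ("[url=http://forum.darkgyver.fr/t27142-probleme-boitier-papillon-faisceau#265132:<uid>]"
          "http://forum.darkgyver.fr/t27142-probleme-boitier-papillon-faisceau#265132[/url:<uid>]")

prefix = re.escape("http://forum.darkgyver.fr/")

# Greedy: .* consumes as much as possible, so the match ends after the LAST hashtag
greedy = re.search(prefix + r".*#(\d+)[:\[]", chaine)

# Non-greedy: .*? consumes as little as possible, so the match ends after the FIRST hashtag
lazy = re.search(prefix + r".*?#(\d+)[:\[]", chaine)

print(greedy.end())   # 161
print(lazy.end())     # 80
print(lazy.group(1))  # 265132
```

Both searches start at position 5; only the end position changes.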

print re.escape("http://forum.darkgyver.fr")

On Python 2.7 this gives:

http\:\/\/forum\.darkgyver\.fr
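If you do want to enforce the limit of 1 to 6 digits, note that a range quantifier is written {1,6}. With {1-6} the braces are treated as literal text, which is why your first regex never matched. A possible sketch follows; the negative lookahead (?!\d) for rejecting longer numbers and the 7-digit test URL are my own assumptions for illustration, not something from your code:

```python
import re

chaine = ("[url=http://forum.darkgyver.fr/t27142-probleme-boitier-papillon-faisceau#265132:<uid>]"
          "http://forum.darkgyver.fr/t27142-probleme-boitier-papillon-faisceau#265132[/url:<uid>]")

# {1,6} is the range quantifier; (?!\d) refuses a match if a further digit follows
pattern = re.compile(re.escape("http://forum.darkgyver.fr/") + r".*?#(\d{1,6})(?!\d)")

match = pattern.search(chaine)
numpost = match.group(1) if match else None
print(numpost)  # 265132

# Hypothetical URL with a 7-digit hashtag: it is rejected instead of being cut to 6 digits
seven = "http://forum.darkgyver.fr/t1-sujet#1234567:"
print(pattern.search(seven))  # None

# \d{1-6} is not a quantifier: it matches one digit followed by the literal text "{1-6}"
print(re.search(r"\d{1-6}", "5{1-6}") is not None)  # True
```

If you would rather keep the first 6 digits of a longer number, drop the (?!\d) part.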
