Python - XPath query on id //*[@id="page"] returns two elements
I'm trying to scrape the site ketabejam.ir. I'm using Python 3.4.1 and parsing with lxml 3.4.1.
The page is parsed with the lxml.html.fromstring method.
When I load the document in the interpreter, I run the following query to get the number of pages so that I can handle pagination:
s = doc.xpath("//*[@id='page']")
Surprisingly, the result is:
>>> len(s) == 2
True
I got the address of the element from Firebug's minimal XPath. When I choose the normal XPath instead, the query runs smoothly.
Is this a bug, or am I doing something wrong?
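Looking at the source of the page you linked, there are two elements with the id "page" in the page. One is at the top of the table, and the other is at the bottom of the table.
Firebug's "Copy Minimal XPath" option works from the id of the element. If the element has an id, it takes that id and creates an XPath in this format:
//*[@id="elementid"]
which is what you are getting.
Ideally, every HTML page should contain only one element with a particular id, since an id is supposed to be unique across the page. Firebug's minimal XPath seems to rely on that, so when the page repeats an id, the query matches every element that carries it.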
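To see the effect in isolation, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml so that it runs without extra packages, and the SAMPLE markup is a hypothetical simplification, not the real page. With lxml the equivalent query would be doc.xpath("//*[@id='page']").

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, well-formed stand-in for the page: the same id is used
# once above the table and once below it.
SAMPLE = """
<html><body>
  <div id="page"><a href="?p=2">2</a></div>
  <table id="books"><tr><td>book row</td></tr></table>
  <div id="page"><a href="?p=2">2</a></div>
</body></html>
"""

root = ET.fromstring(SAMPLE)

# An id-only query matches every element carrying that id.
matches = root.findall(".//*[@id='page']")
print(len(matches))

# Count how often each id occurs to spot the duplicates.
id_counts = Counter(el.get("id") for el in root.iter() if el.get("id"))
duplicates = {name: n for name, n in id_counts.items() if n > 1}
print(duplicates)
```

Running a check like this against a parsed page is a quick way to tell whether an id-based XPath can be trusted to return a single element.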
In this context, I think both elements return the same links, so you can use either one to continue scraping. Or, as you indicated, you can use the normal XPath for that.
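A rough sketch of the "use either one" approach follows, again with xml.etree.ElementTree standing in for lxml. The markup, the hrefs helper, and the idea that the link text holds the page number are all assumptions made for illustration. In lxml, the XPath (//*[@id='page'])[1] should select only the first match in document order.

```python
import xml.etree.ElementTree as ET

# Hypothetical pager markup, repeated above and below the table.
SAMPLE = """
<html><body>
  <div id="page"><a href="?p=1">1</a><a href="?p=2">2</a><a href="?p=3">3</a></div>
  <table><tr><td>book row</td></tr></table>
  <div id="page"><a href="?p=1">1</a><a href="?p=2">2</a><a href="?p=3">3</a></div>
</body></html>
"""

root = ET.fromstring(SAMPLE)
pagers = root.findall(".//*[@id='page']")


def hrefs(element):
    """Return the href of every link inside the given element."""
    return [a.get("href") for a in element.iter("a")]


# Take the first and last match and confirm they carry the same links.
top, bottom = pagers[0], pagers[-1]
same_links = hrefs(top) == hrefs(bottom)

# Assumption: the link text is the page number, so the largest one is
# the page count.
last_page = max(int(a.text) for a in top.iter("a"))
print(same_links, last_page)
```

Comparing the two pagers before picking one guards against the case where the top and bottom blocks differ.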