Python 2.7 - Scrapy - not returning data

I am new to Scrapy and am trying to extract tweets from https://twitter.com/abctv. I know there is an API, but this is a learning exercise. My code returns 0 tweets. My item definition:

    import scrapy


    class Tweet(scrapy.Item):
        # One field per value scraped from a tweet
        username = scrapy.Field()
        data = scrapy.Field()
        created_at = scrapy.Field()
        retweet_count = scrapy.Field()

My crawler definition:

    import scrapy
    from scrapy.selector import Selector

    # Tweet is the item class defined above; import it from the project's items module.


    class TwitterSpider(scrapy.Spider):
        name = "twitter"
        # allowed_domains takes bare domain names, not full URLs
        allowed_domains = ["twitter.com"]
        start_urls = ["https://twitter.com/mycpl"]

        def parse(self, response):
            sel = Selector(response)
            # One node per tweet container
            tweets = sel.xpath('//div[@class="content"]')
            items = []
            for tweet in tweets:
                item = Tweet()
                # XPath string tests are case-sensitive: class names must match the page exactly
                item['username'] = tweet.xpath('.//*[starts-with(@class,"username")]//text()').extract()
                item['created_at'] = tweet.xpath('.//*[starts-with(@class,"_timestamp")]//@data-time-ms').extract()
                item['retweet_count'] = tweet.xpath('.//*[starts-with(@class,"profiletweet-actioncountforpresentation")]/text()').extract()
                item['data'] = tweet.xpath('p//text()').extract()
                items.append(item)
            return items
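One thing worth checking in a spider like this is that allowed_domains holds bare domain names rather than full URLs. Below is a minimal sketch of deriving the domain from the start URLs with the standard library only. The helper name domain_of is made up for illustration and is not part of Scrapy, and this is not presented as the cause of the 0-tweet result.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7


def domain_of(url):
    """Return the bare host name of a URL (hypothetical helper, not a Scrapy API)."""
    return urlparse(url).hostname


start_urls = ["https://twitter.com/mycpl"]

# Build the domain list from the start URLs instead of pasting URLs into it.
allowed_domains = sorted({domain_of(u) for u in start_urls})
print(allowed_domains)
```

The resulting list can be assigned to the spider's allowed_domains attribute in place of a hand-written value.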

Please help me learn by explaining why I am not getting any results when I run scrapy crawl twitter -o test -t json.

Edit: I fixed the XPath code and it now works (it returns only 20 tweets, because of the infinite-scroll issue).
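Since the problem turned out to be in the XPath expressions, here is a minimal sketch of two ways an attribute test can silently match nothing: an exact @class comparison against an element that carries several classes, and a class name written in the wrong case. The markup is hypothetical and is not Twitter's real HTML. To keep it runnable without Scrapy it uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath, and has_class is a made-up helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for illustration only.
SAMPLE = """
<root>
  <div class="content">single class</div>
  <div class="content original-tweet">two classes</div>
  <span class="ProfileTweet-actionCount">mixed case</span>
</root>
"""

root = ET.fromstring(SAMPLE)

# An exact comparison only matches when the whole attribute equals "content".
exact = root.findall(".//div[@class='content']")

# Attribute values are compared case-sensitively.
lower = root.findall(".//span[@class='profiletweet-actioncount']")
mixed = root.findall(".//span[@class='ProfileTweet-actionCount']")


def has_class(element, name):
    """True if name is one of the space-separated classes on element."""
    return name in element.get("class", "").split()


# Token-based matching finds both div elements.
by_token = [d for d in root.iter("div") if has_class(d, "content")]

print(len(exact), len(lower), len(mixed), len(by_token))
```

When a selector returns an empty list, comparing it against the page source for these two cases is a quick first check.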
