Scraping Real Estate Data with Python and Displaying It on a Map!

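Hi everyone, I'm back. This time we'll write a Python crawler that scrapes real estate data for Urumqi and displays it on a map. For the map I used BDP Personal Edition, a free online data analysis and visualization tool that can import CSV or Excel data.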

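As usual, let's start with the approach: crawl the website, extract each residential community's name, address, price, and latitude/longitude, and save them to an Excel file. Then upload the Excel data to the BDP website to generate a map report.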

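This time I used the Scrapy framework. It may be overkill for this job, but I had just finished learning it and wanted some practice. Before writing any code, I still recommend analyzing the website and its data first, because good analysis saves a lot of effort later. The "Urumqi Properties, 2017 New Urumqi Properties, Urumqi Property Information - Urumqi Jiwu" site has fairly complete data. Each list page provides a LIST of properties plus a link to the next page, and each entry leads to a detail page with the property's details (name, address, price, latitude and longitude). A pipeline then saves each item to Excel, and finally BDP generates the map report. Enough talk, here's the code: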

JiwuspiderSpider.py

  # -*- coding: utf-8 -*-
  import re

  from scrapy import Spider, Request

  from jiwu.items import JiwuItem


  class JiwuspiderSpider(Spider):
      name = "jiwuspider"
      allowed_domains = ["wlmq.jiwu.com"]
      start_urls = ['http://wlmq.jiwu.com/loupan']

      def parse(self, response):
          """
          Parse the property list on each page.

          :param response: response for one list page
          :return: requests for the detail pages and for the next list page
          """
          for url in response.xpath('//a[@class="index_scale"]/@href').extract():
              # Request each detail URL from the list; parse_html handles the response.
              # urljoin leaves absolute URLs unchanged and resolves relative ones.
              yield Request(response.urljoin(url), self.parse_html)

          # If the "next page" link still exists, take its URL
          nextpage = response.xpath('//a[@class="tg-rownum-next index-icon"]/@href').extract_first()
          # extract_first() returns None when there is no next page
          if nextpage:
              yield Request(response.urljoin(nextpage), self.parse)  # call back into parse to keep going

      def parse_html(self, response):
          """
          Parse a property's detail page and build an item.

          :param response: response for one detail page
          :return: one JiwuItem per match
          """
          pattern = re.compile(r".*?lng = '(.*?)';.*?lat = '(.*?)';.*?bname = '(.*?)';.*?"
                               r"address = '(.*?)';.*?price = '(.*?)';", re.S)
          results = re.findall(pattern, response.text)
          for result in results:
              # Create the item inside the loop so nothing is yielded when there is no match
              item = JiwuItem()
              item['name'] = result[2]
              item['address'] = result[3]
              # Keep only the digits of the price; use 0 if there are none
              s = re.findall(r'(\d+)', result[4])
              item['price'] = s[0] if s else 0
              item['lng'] = result[0]
              item['lat'] = result[1]
              yield item
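
To see what the regular expression in parse_html is matching, here is a minimal sketch that uses only the standard library and runs the same pattern against a made-up snippet. The SAMPLE_PAGE text and every value in it are assumptions for illustration; the real detail page is only assumed to contain `lng = '...'`, `lat = '...'`, `bname = '...'`, `address = '...'` and `price = '...'` assignments in that order, as the pattern implies. The helper names `extract_price` and `parse_detail` are hypothetical.

```python
import re

# Hypothetical fragment of a detail page's inline JavaScript (values invented).
SAMPLE_PAGE = """
var lng = '87.6168';
var lat = '43.8256';
var bname = 'Sample Garden';
var address = 'No. 1 Example Road';
var price = '8500 yuan/m2';
"""

# Same pattern as in parse_html, written as raw strings.
DETAIL_PATTERN = re.compile(
    r".*?lng = '(.*?)';.*?lat = '(.*?)';.*?bname = '(.*?)';.*?"
    r"address = '(.*?)';.*?price = '(.*?)';",
    re.S,
)


def extract_price(price_text):
    """Return the first run of digits in price_text, or 0 if there is none."""
    digits = re.findall(r"\d+", price_text)
    return digits[0] if digits else 0


def parse_detail(text):
    """Return a list of dicts, one per match of DETAIL_PATTERN in text."""
    records = []
    for lng, lat, name, address, price in DETAIL_PATTERN.findall(text):
        records.append({
            "name": name,
            "address": address,
            "price": extract_price(price),
            "lng": lng,
            "lat": lat,
        })
    return records


if __name__ == "__main__":
    print(parse_detail(SAMPLE_PAGE))
```

A page without these assignments produces an empty list, which is why the spider should only yield an item when there is a match.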

items.py

  # -*- coding: utf-8 -*-

  # Define here the models for your scraped items
  #
  # See documentation in:
  # http://doc.scrapy.org/en/latest/topics/items.html

  import scrapy


  class JiwuItem(scrapy.Item):
      # define the fields for your item here like:
      name = scrapy.Field()
      price = scrapy.Field()
      address = scrapy.Field()
      lng = scrapy.Field()
      lat = scrapy.Field()
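
All five fields arrive as strings taken straight from the page. If numeric columns are wanted before the data goes onto a map, a conversion step could look like the sketch below. The `to_row` helper and its coordinate range check are assumptions of this example and are not part of the original project; it expects the price to be a digit string or 0, which is what the spider produces.

```python
def to_row(item):
    """Convert a scraped record (dict-like, string values) into typed values.

    Returns None when the coordinates are missing, malformed, or out of range.
    """
    try:
        lng = float(item["lng"])
        lat = float(item["lat"])
    except (KeyError, TypeError, ValueError):
        return None
    # Valid longitudes lie in [-180, 180] and valid latitudes in [-90, 90].
    if not (-180 <= lng <= 180 and -90 <= lat <= 90):
        return None
    return {
        "name": item.get("name", ""),
        "address": item.get("address", ""),
        "price": int(item.get("price") or 0),
        "lng": lng,
        "lat": lat,
    }
```

Records that come back as None can simply be skipped, so rows without usable coordinates never reach the map.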

pipelines.py (note that the MongoDB save call is commented out here; pick whichever storage method you prefer)

  # -*- coding: utf-8 -*-

  # Define your item pipelines here
  #
  # Don't forget to add your pipeline to the ITEM_PIPELINES setting
  # See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html
  import pymongo
  # Note: scrapy.conf is deprecated and missing from newer Scrapy releases,
  # which provide the settings through crawler.settings instead.
  from scrapy.conf import settings
  from openpyxl import workbook


  class JiwuPipeline(object):
      wb = workbook.Workbook()
      ws = wb.active
      # Header row of the Excel sheet
      ws.append(['Community name', 'Address', 'Price', 'Longitude', 'Latitude'])

      def __init__(self):
          # Read the database connection settings
          host = settings['MONGODB_URL']
          port = settings['MONGODB_PORT']
          dbname = settings['MONGODB_DBNAME']
          client = pymongo.MongoClient(host=host, port=port)

          # Select the database and the collection
          db = client[dbname]
          self.table = db[settings['MONGODB_TABLE']]

      def process_item(self, item, spider):
          jiwu = dict(item)
          # MongoDB save, commented out:
          # self.table.insert(jiwu)
          line = [item['name'], item['address'], str(item['price']), item['lng'], item['lat']]
          self.ws.append(line)
          # The workbook is re-saved after every item, so the file is always up to date
          self.wb.save('jiwu.xlsx')

          return item
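
Since BDP imports CSV as well as Excel, one alternative storage option is a CSV pipeline built on the standard library. This is a sketch under assumptions: the class name `JiwuCsvPipeline` and the file name `jiwu.csv` are made up here, and the file is opened and closed once per crawl through Scrapy's `open_spider` and `close_spider` pipeline hooks instead of being saved again for every item.

```python
import csv


class JiwuCsvPipeline(object):
    """Write each item as one CSV row (hypothetical alternative to the Excel pipeline)."""

    def __init__(self, path="jiwu.csv"):
        self.path = path
        self.file = None
        self.writer = None

    def open_spider(self, spider):
        # utf-8-sig writes a BOM, which helps spreadsheet tools detect the encoding.
        self.file = open(self.path, "w", newline="", encoding="utf-8-sig")
        self.writer = csv.writer(self.file)
        self.writer.writerow(["name", "address", "price", "lng", "lat"])

    def process_item(self, item, spider):
        self.writer.writerow([item["name"], item["address"], str(item["price"]),
                              item["lng"], item["lat"]])
        return item

    def close_spider(self, spider):
        self.file.close()
```

To use it, the class would be registered under ITEM_PIPELINES in the project's settings.py, for example as 'jiwu.pipelines.JiwuCsvPipeline' if it lives in the same pipelines.py (an assumed module path).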

The final data behind the report

The MongoDB database

The resulting map report: https://me.bdp.cn/share/index.html?shareId=sdo_b697418ff7dc4f928bb25e3ac1d52348
