11.2. scrapy commands



玄学酱 2018-01-11 13:50:00
		
neo@MacBook-Pro ~/Documents/crawler % scrapy     
Scrapy 1.4.0 - project: crawler

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  check         Check spider contracts
  crawl         Run a spider
  edit          Edit spider
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  list          List available spiders
  parse         Parse URL (using its spider) and print the results
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command
		
		

11.2.1. Create a project

			
neo@MacBook-Pro ~/Documents % scrapy startproject crawler 
New Scrapy project 'crawler', using template directory '/usr/local/lib/python3.6/site-packages/scrapy/templates/project', created in:
    /Users/neo/Documents/crawler

You can start your first spider with:
    cd crawler
    scrapy genspider example example.com
			
			

11.2.2. Create a spider

			
neo@MacBook-Pro ~/Documents/crawler % scrapy genspider netkiller netkiller.cn
Created spider 'netkiller' using template 'basic' in module:
  crawler.spiders.netkiller
			
			

11.2.3. List available spiders

			
neo@MacBook-Pro ~/Documents/crawler % scrapy list
bing
book
example
netkiller			
			
			

11.2.4. Run a spider

			
neo@MacBook-Pro ~/Documents/crawler % scrapy crawl netkiller
			
			

Write the crawl results to a JSON file:

			
neo@MacBook-Pro ~/Documents/crawler % scrapy crawl netkiller -o output.json					
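
The resulting file can be read back with the Python standard library. This is a minimal sketch, assuming output.json holds a single JSON array of items, which is what the JSON feed exporter writes to a fresh file. The helper name load_items and the field names url and title are hypothetical and depend on what your spider yields.

```python
import json


def load_items(path):
    """Read a Scrapy JSON feed (one JSON array of items) into a list of dicts."""
    with open(path, encoding="utf-8") as feed:
        return json.load(feed)


# Example usage (assumes output.json exists in the current directory):
#
#     for item in load_items("output.json"):
#         # "url" and "title" are assumed field names.
#         print(item.get("url"), item.get("title"))
```

In Scrapy 1.4.0 the -o option appends to an existing file, so running the crawl twice against the same output.json is likely to leave a file that no longer parses as JSON. Deleting the file before each run, or writing JSON Lines instead (-o output.jl), avoids this.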
			
		





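Original source: Netkiller series notes (Netkiller 系列 手札)
Author: 陈景峯
To reprint this article, please contact the author, and be sure to credit the original source and author and include this notice.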
