Scrapy documentation
This documentation contains everything you need to know about Scrapy.
Getting help
Having trouble? We’d like to help!
Basic concepts
- Command line tool: Learn about the command-line tool used to manage your Scrapy project.
- Items: Define the data you want to scrape.
- Spiders: Write the rules to crawl your websites.
- Selectors: Extract data from web pages using XPath or CSS expressions.
- Scrapy shell: Test your extraction code in an interactive environment.
- Item Loaders: Populate your items with the extracted data.
- Item Pipeline: Post-process and store your scraped data.
- Feed exports: Output your scraped data using different formats and storages.
- Link Extractors: Convenient classes to extract links to follow from pages.
Built-in services
- Logging: Understand the simple logging facility provided by Scrapy.
- Stats Collection: Collect statistics about your crawl.
- Sending e-mail: Send email notifications when certain events occur.
- Telnet Console: Inspect a running crawler using a built-in Python console.
- Web Service: Monitor and control a crawler using a web service.
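Each built-in service is switched on or tuned through project settings. A sketch of a `settings.py` fragment; the setting names come from Scrapy, but the values here are illustrative:

```python
# settings.py (fragment) -- illustrative values, not defaults you must change

# Logging: write INFO-level messages to a file instead of stderr.
LOG_LEVEL = "INFO"
LOG_FILE = "crawl.log"

# Stats Collection: dump the collected stats when the spider closes.
STATS_DUMP = True

# Sending e-mail: SMTP endpoint used by scrapy.mail.MailSender.
MAIL_HOST = "localhost"
MAIL_FROM = "scrapy@localhost"

# Telnet Console: enabled by default; pick the port range it may bind.
TELNETCONSOLE_ENABLED = True
TELNETCONSOLE_PORT = [6023, 6073]
```

With these in place, `scrapy crawl <spider>` logs to `crawl.log` and prints the stats summary at shutdown, while the telnet console lets you inspect the running crawler from a separate terminal.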