Scrapy is finally 1.0!!!
Jun 29, 2015 • 2 minutes to read • Last Updated: Oct 25, 2017
When I started using Scrapy I was totally blown away by how easily you could write a scraper and start scraping within minutes. If you started scraping recently, like me, you are probably familiar with Scrapy v0.24. What is great about this version is that, after more than 7 years of testing, Pablo and his team have finally shipped their first production release.
So here are some points about the new release that you should be excited about if you are an existing Scrapy developer. Some of these code snippets can be found in the official release notes.
Python dictionaries instead of scrapy.Item
You know how you needed to import scrapy.Item in your custom item classes? Well, in v1.0 you can forget about it all! Here's how to return your item without importing scrapy.Item and extending it in a class of your own. Item classes still work if you prefer them; they are just no longer required.
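No more declaring a scrapy.Item subclass with its fields and filling it in inside your callback.
Instead, return or yield a plain dict from the callback in your Scrapy Spider class.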
Custom spider settings
Remember how every setting had to live in your settings.py file, which made it confusing to keep track of which settings applied when you were looking at a spider? Well, you now have the choice of writing settings in the Scrapy Spider itself, through its custom_settings class attribute. Values set there override the project-wide ones for that spider only.
Logging with the python logger
Scrapy is built on top of this marvellous thing that you may never have heard of called Twisted. Scrapy does this work so efficiently that you will rarely need to read the logs that Twisted creates. I would definitely recommend reading about Twisted in your free time. It's beautiful how someone thought about using a single-threaded, event-driven design to work around the Python GIL. Okay, back to logging: you don't need to import log from the scrapy module anymore. Instead, use the standard Python logging module. A code snippet can be found in the official release notes.
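To make the switch concrete, here is a small sketch that uses only the standard library. The logger name and the helper function are hypothetical; the old scrapy.log call appears only as a comment for comparison.

```python
import logging

# Old style (deprecated in 1.0), shown only for comparison:
#     from scrapy import log
#     log.msg("MESSAGE", log.INFO)

# New style: a regular stdlib logger.
logger = logging.getLogger("myproject.spiders")  # hypothetical logger name


def log_scraped_page(url, item_count):
    """Log a one-line summary for a scraped page."""
    logger.info("Scraped %d items from %s", item_count, url)


if __name__ == "__main__":
    # Outside of Scrapy you have to configure logging yourself.
    logging.basicConfig(level=logging.INFO)
    log_scraped_page("http://example.com/", 3)
```

Inside a spider you can skip creating a logger by hand and use self.logger, which Scrapy 1.0 provides on every Spider instance.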
There are other changes in the changelog that you can read about, but the most essential ones are covered above. Let me know how you are planning to scale with, or start using, Scrapy 1.0.
I am writing a book!
While I do appreciate you reading my blog posts, I would like to draw your attention to another project of mine. I have slowly begun to write a book on how to build web scrapers with Python. It starts with getting going in Scrapy and ends with building large-scale automated scraping systems.
If you are looking to build web scrapers at scale, or just want more anecdotes on Python, please sign up to the email list below.