Website Scraping with Python
This book is 80% complete
Last updated on 2016-09-24
About the Book
This book is the follow-up to my previous one, "XML processing and website scraping in Java". There I looked at ways and tools to process XML and HTML in Java, ran some performance comparisons, and introduced some new programming concepts to make things even better.
In this book I take a closer look at website scraping with two tools commonly used in Python nowadays: BeautifulSoup and Scrapy.
I build the sample application from the Java book again, this time in Python, use both tools for parsing, and show examples of how to export CSV files in Python.
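To give a rough idea of what the CSV export step can look like, here is a minimal sketch using only Python's standard `csv` module. The records, the field names and the `export_csv` helper are hypothetical placeholders for whatever a scraper returns; they are not taken from the book's sample application.

```python
import csv
import io

# Hypothetical records, standing in for items a scraper might yield.
records = [
    {"title": "Example Book", "price": "19.99"},
    {"title": "Another, Longer Title", "price": "35.00"},
]


def export_csv(rows, fieldnames, out):
    """Write a list of dicts to a file-like object as CSV."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()    # first row: the column names
    writer.writerows(rows)  # one row per scraped record


# An in-memory buffer keeps the sketch self-contained; a real script
# would more likely pass a file opened with open(path, "w", newline="").
buffer = io.StringIO()
export_csv(records, ["title", "price"], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Fields that contain a comma are quoted automatically by the `csv` module, so titles like the second one above survive a round trip.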
As a bonus, I compare the runtimes of the two tools, try to tune them where possible, and give a quick introduction to plotting the runtimes as charts.
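A runtime comparison of this kind can be sketched with the standard `timeit` module. The two functions below are hypothetical stand-ins for a BeautifulSoup-based and a Scrapy-based parser, and the input string is made up; the sketch only shows the measuring pattern, not real results for either tool.

```python
import timeit

# Made-up input standing in for a downloaded page.
SAMPLE = ",".join(str(n) for n in range(1000))


def parse_variant_a(text):
    """Hypothetical stand-in for the first parser: built-in split."""
    return text.split(",")


def parse_variant_b(text):
    """Hypothetical stand-in for the second parser: manual scan."""
    items, current = [], []
    for char in text:
        if char == ",":
            items.append("".join(current))
            current = []
        else:
            current.append(char)
    items.append("".join(current))
    return items


# Run each variant the same number of times and record total seconds.
timings = {
    name: timeit.timeit(lambda f=func: f(SAMPLE), number=200)
    for name, func in [("a", parse_variant_a), ("b", parse_variant_b)]
}

for name, seconds in sorted(timings.items()):
    print("variant %s: %.4f s for 200 runs" % (name, seconds))
```

The resulting `timings` dictionary is the kind of data that can then be handed to a plotting library to draw a chart.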
The book is planned to be finished in Spring 2017. Until then you can buy it at a discounted price. The final book will cost around $35.
I will write about the following topics in this book:
- Performance comparison
- Plotting in Python
- Functional programming with Python
- Parallel code execution with Python
- Sample application to gather Amazon data
- Other real-life projects (source code coming to the package soon)
- Update for Scrapy's release and Python 3 (coming soon)
The Book + Source Code for the last chapter
This bundle contains the book "Website Scraping with Python" and the source code for the example project created along with the last chapter "Extra! Extra! Read all about it!".
This is the source code for three projects from the chapters "Two real-life projects" and "Extra! Extra! Read all about it!". These chapters contain spiders created with either BeautifulSoup or Scrapy to gather information from the web.
Little Free Library: http://www.littlefreelibrary.org
Our mission is to promote literacy and the love of reading by building free book exchanges worldwide and to build a sense of community as we share skills, creativity and wisdom across generations.
The Leanpub Unconditional, No Risk, 100% Happiness Guarantee
Within 45 days of purchase you can get a 100% refund on any Leanpub purchase, in two clicks.