Quick Start

  • Install scrapy-do using pip:

    $ pip install scrapy-do
    
  • Start the daemon in the foreground:

    $ scrapy-do -n scrapy-do
    
  • Open another terminal window and store the server’s URL in the client’s configuration file so that you don’t have to type it every time:

    $ cat > ~/.scrapy-do.cfg << EOF
    > [scrapy-do]
    > url=http://localhost:7654
    > EOF
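
    To double-check what was written, the URL can be read back from the file. The following is only a minimal sketch: the helper name read_url is hypothetical, and the script works on a scratch file rather than ~/.scrapy-do.cfg so that an existing configuration is left untouched.

```shell
#!/bin/sh
# read_url (hypothetical helper): print the value of the "url" key found in
# the [scrapy-do] section of the configuration file given as $1.
read_url() {
    awk -F= '
        /^\[/ { in_section = ($0 == "[scrapy-do]") }
        in_section && $1 == "url" { print $2 }
    ' "$1"
}

# Write the same contents as in the step above to a scratch file.
cfg="$(mktemp)"
cat > "$cfg" << EOF
[scrapy-do]
url=http://localhost:7654
EOF

# Print the URL stored in the scratch file, then clean up.
read_url "$cfg"
rm -f "$cfg"
```

    Pointing read_url at ~/.scrapy-do.cfg instead of the scratch file checks the real configuration.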
    
  • Download Scrapy’s Quotesbot example project and push the code to the server:

    $ git clone https://github.com/scrapy/quotesbot.git
    $ cd quotesbot
    $ scrapy-do-cl push-project
    +----------------+
    | spiders        |
    |----------------|
    | toscrape-css   |
    | toscrape-xpath |
    +----------------+
    
  • Schedule some jobs:

    $ scrapy-do-cl schedule-job --project quotesbot \
        --spider toscrape-css --when 'every 5 to 15 minutes'
    +--------------------------------------+
    | identifier                           |
    |--------------------------------------|
    | 0a3db618-d8e1-48dc-a557-4e8d705d599c |
    +--------------------------------------+
    
    $ scrapy-do-cl schedule-job --project quotesbot --spider toscrape-css
    +--------------------------------------+
    | identifier                           |
    |--------------------------------------|
    | b3a61347-92ef-4095-bb68-0702270a52b8 |
    +--------------------------------------+
    
  • See what’s going on:

    $ scrapy-do-cl list-jobs
    +--------------------------------------+-----------+--------------+-----------+-----------------------+---------+----------------------------+------------+
    | identifier                           | project   | spider       | status    | schedule              | actor   | timestamp                  | duration   |
    |--------------------------------------+-----------+--------------+-----------+-----------------------+---------+----------------------------+------------|
    | b3a61347-92ef-4095-bb68-0702270a52b8 | quotesbot | toscrape-css | RUNNING   | now                   | USER    | 2018-01-27 08:32:19.781720 |            |
    | 0a3db618-d8e1-48dc-a557-4e8d705d599c | quotesbot | toscrape-css | SCHEDULED | every 5 to 15 minutes | USER    | 2018-01-27 08:29:24.749770 |            |
    +--------------------------------------+-----------+--------------+-----------+-----------------------+---------+----------------------------+------------+
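
    For scripting, the table can be reduced to identifier and status pairs. This is a rough sketch that assumes the column layout shown in the listing above (identifier first, status fourth); job_states is a hypothetical helper, not part of scrapy-do, and the sample rows are copied from that listing.

```shell
#!/bin/sh
# job_states (hypothetical helper): read a list-jobs table on stdin and print
# "<identifier> <status>" for every data row.
job_states() {
    awk -F'|' '
        # Data rows begin with a pipe and a space; this skips the borders,
        # and the second condition skips the header row.
        /^[[:space:]]*\| / && $2 !~ /identifier/ {
            gsub(/ /, "", $2)   # identifier column
            gsub(/ /, "", $5)   # status column
            print $2, $5
        }
    '
}

# Sample input copied from the listing above.
job_states << 'EOF'
+--------------------------------------+-----------+--------------+-----------+-----------------------+---------+----------------------------+------------+
| identifier                           | project   | spider       | status    | schedule              | actor   | timestamp                  | duration   |
|--------------------------------------+-----------+--------------+-----------+-----------------------+---------+----------------------------+------------|
| b3a61347-92ef-4095-bb68-0702270a52b8 | quotesbot | toscrape-css | RUNNING   | now                   | USER    | 2018-01-27 08:32:19.781720 |            |
| 0a3db618-d8e1-48dc-a557-4e8d705d599c | quotesbot | toscrape-css | SCHEDULED | every 5 to 15 minutes | USER    | 2018-01-27 08:29:24.749770 |            |
+--------------------------------------+-----------+--------------+-----------+-----------------------+---------+----------------------------+------------+
EOF
```

    With a running daemon, the same helper could be fed directly: scrapy-do-cl list-jobs | job_states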