Installation

The easy way

The easiest way to install Scrapy Do is using pip. You can then create a directory where you want your project data stored and just start the daemon there.

$ pip install scrapy-do
$ mkdir /home/user/my-scrapy-do-data
$ cd /home/user/my-scrapy-do-data
$ scrapy-do scrapy-do

Yup, you need to type scrapy-do twice: the first one is the name of the launcher program, and the second is the name of the Twisted daemon it is supposed to start. That's how Twisted works, don't ask me. Afterwards, you will see some new content in this directory, including the log file and the pidfile of the Scrapy Do daemon.
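
To make sure that everything went fine, you can poke the web interface. Assuming the default configuration, it should be listening on port 7654 (treat the port as an assumption and adjust it if your setup differs):

    $ curl -I http://localhost:7654/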

A systemd service

Installing Scrapy Do as a systemd service is a far better idea than the easy way described above. It’s a bit of work that should really be done by a proper Debian/Ubuntu package, but we do not have one for the time being, so I will show you how to do it “by hand.”

  • Although not strictly necessary, it's good practice to run the daemon under a separate user account. I will create one called pydaemon because I run a couple more Python daemons this way.

    $ sudo useradd -m -d /opt/pydaemon pydaemon
    
  • Make sure you have all of the following packages installed:

    $ sudo apt-get install python3 python3-dev python3-virtualenv
    $ sudo apt-get install build-essential
    
  • Switch your session to this new user account:

    $ sudo su - pydaemon
    
  • Create the virtualenv and install Scrapy Do:

    $ mkdir virtualenv
    $ cd virtualenv/
    $ python3 /usr/lib/python3/dist-packages/virtualenv.py -p /usr/bin/python3 .
    $ . ./bin/activate
    $ pip install scrapy-do
    $ cd ..
    
  • Create a bin directory and a wrapper script that will activate the virtualenv on startup (you can test it by hand; see the sketch after this list):

    $ mkdir bin
    $ cat > bin/scrapy-do << EOF
    > #!/bin/bash
    > . /opt/pydaemon/virtualenv/bin/activate
    > exec /opt/pydaemon/virtualenv/bin/scrapy-do "\${@}"
    > EOF
    $ chmod 755 bin/scrapy-do
    
  • Create a data directory and a configuration file:

    $ mkdir -p data/scrapy-do
    $ mkdir etc
    $ cat > etc/scrapy-do.conf << EOF
    > [scrapy-do]
    > project-store = /opt/pydaemon/data/scrapy-do
    > EOF
    
  • As root, create the following service file. The --nodaemon flag keeps the process in the foreground so that systemd itself can supervise it, and the empty --pidfile= tells twistd not to write a pidfile we do not need:

    # cat > /etc/systemd/system/scrapy-do.service << EOF
    > [Unit]
    > Description=Scrapy Do Service
    >
    > [Service]
    > ExecStart=/opt/pydaemon/bin/scrapy-do --nodaemon --pidfile= \
    >           scrapy-do --config /opt/pydaemon/etc/scrapy-do.conf
    > User=pydaemon
    > Group=pydaemon
    > Restart=always
    >
    > [Install]
    > WantedBy=multi-user.target
    > EOF
    
  • You can then reload the systemd configuration and let it manage the Scrapy Do daemon:

    $ sudo systemctl daemon-reload
    $ sudo systemctl start scrapy-do
    $ sudo systemctl enable scrapy-do
    
  • Finally, you should now be able to see that the daemon is running (see the notes after this list for a couple of ways to check on it):

    $ sudo systemctl status scrapy-do
    ● scrapy-do.service - Scrapy Do Service
       Loaded: loaded (/etc/systemd/system/scrapy-do.service; enabled; vendor preset: enabled)
       Active: active (running) since Sun 2017-12-10 22:42:55 UTC; 4min 23s ago
     Main PID: 27543 (scrapy-do)
    ...
    
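Before letting systemd take over, it does not hurt to test the wrapper script by hand. This is just a sanity check; scrapy-do is a twistd-style launcher, so it should print its usage when asked for help:

    $ sudo -u pydaemon /opt/pydaemon/bin/scrapy-do --help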
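
Since the service runs with --nodaemon, the daemon logs to its standard output and systemd captures that in the journal, so journalctl is the place to look when something goes wrong. You can also poke the web interface just like in the easy setup, again assuming the default port of 7654:

    $ sudo journalctl -u scrapy-do -f
    $ curl -I http://localhost:7654/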

I know it's awfully complicated. I will do some packaging work when I have a spare moment.