Installation
The easy way
The easiest way to install Scrapy Do is using pip. You can then create a
directory where you want your project data stored and just start the daemon
there.
$ pip install scrapy-do
$ mkdir /home/user/my-scrapy-do-data
$ cd /home/user/my-scrapy-do-data
$ scrapy-do scrapy-do
Yup, you need to type scrapy-do twice: the program is a thin wrapper around
Twisted's application runner, and the second scrapy-do names the application
plugin to start. After doing that, you will see some content in this
directory, including the log file and the pidfile of the Scrapy Do daemon.
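To confirm the daemon is actually alive, you can check the pidfile it dropped in the data directory. A small sketch, assuming the file is called scrapy-do.pid; look at the directory listing to see what your version actually names it:

```shell
# Report whether the process named in the pidfile is still alive.
# PIDFILE is an assumption -- adjust it to whatever file the daemon
# actually created in your data directory.
PIDFILE=scrapy-do.pid
if [ -f "${PIDFILE}" ] && kill -0 "$(cat "${PIDFILE}")" 2>/dev/null; then
    echo "scrapy-do is running (pid $(cat "${PIDFILE}"))"
else
    echo "scrapy-do is not running"
fi
```

kill -0 sends no signal at all; it only checks that the process exists and that you are allowed to signal it.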
A systemd service
Installing Scrapy Do as a systemd service is a far better idea than the easy way described above. It’s a bit of work that should really be done by a proper Debian/Ubuntu package, but we do not have one for the time being, so I will show you how to do it “by hand.”
Although not strictly necessary, it's a good practice to run the daemon under
a separate user account. I will create one called pydaemon because I run a
couple more Python daemons this way:
$ sudo useradd -m -d /opt/pydaemon pydaemon
Make sure you have all of the following packages installed:
$ sudo apt-get install python3 python3-dev python3-virtualenv
$ sudo apt-get install build-essential
Switch your session to this new user account:
$ sudo su - pydaemon
Create the virtual env and install Scrapy Do:
$ mkdir virtualenv
$ cd virtualenv/
$ python3 /usr/lib/python3/dist-packages/virtualenv.py -p /usr/bin/python3 .
$ . ./bin/activate
$ pip install scrapy-do
$ cd ..
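The virtualenv.py path above is Debian/Ubuntu specific and may not exist on newer releases. A hedged alternative is the standard-library venv module (Python 3.3+), which achieves the same thing:

```shell
# Equivalent setup using the stdlib venv module instead of the
# Debian-packaged virtualenv.py script.
python3 -m venv virtualenv
. virtualenv/bin/activate
pip install scrapy-do
```

Either way, the end result is a self-contained environment under /opt/pydaemon/virtualenv with scrapy-do installed in its bin directory.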
Create a bin directory and a wrapper script that will set up the virtualenv on startup:
$ mkdir bin
$ cat > bin/scrapy-do << EOF
> #!/bin/bash
> . /opt/pydaemon/virtualenv/bin/activate
> exec /opt/pydaemon/virtualenv/bin/scrapy-do "\${@}"
> EOF
$ chmod 755 bin/scrapy-do
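Before wiring the wrapper into systemd, it is worth a quick sanity check that the script came out right. A minimal sketch; it only inspects the file, without starting the daemon:

```shell
# Sanity-check the wrapper: the file should be executable and its
# first line should be a shebang.
WRAPPER=bin/scrapy-do
if test -x "${WRAPPER}" && head -n 1 "${WRAPPER}" | grep -q '^#!'; then
    echo "wrapper looks OK"
else
    echo "wrapper is broken"
fi
```

A common mistake here is forgetting to escape the dollar sign in the heredoc, which makes the shell expand "${@}" at creation time instead of writing it into the script.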
Create a data directory and a configuration file:
$ mkdir -p data/scrapy-do $ mkdir etc $ cat > etc/scrapy-do.conf << EOF > [scrapy-do] > project-store = /opt/pydaemon/data/scrapy-do > EOF
As root, create the following file with the following content:
# cat > /etc/systemd/system/scrapy-do.service << EOF
> [Unit]
> Description=Scrapy Do Service
>
> [Service]
> ExecStart=/opt/pydaemon/bin/scrapy-do --nodaemon --pidfile= \
>     scrapy-do --config /opt/pydaemon/etc/scrapy-do.conf
> User=pydaemon
> Group=pydaemon
> Restart=always
>
> [Install]
> WantedBy=multi-user.target
> EOF
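The unit above works as-is. If you later want to tweak it without editing the file, systemd drop-ins are a clean way to layer extra directives on top. A sketch of an optional drop-in; these particular settings are suggestions of mine, not part of the original setup:

```ini
# /etc/systemd/system/scrapy-do.service.d/override.conf
# Optional drop-in; create it with "sudo systemctl edit scrapy-do".
[Unit]
# Wait for the network before starting the daemon.
After=network.target

[Service]
# Run from the data directory and give the service a private /tmp.
WorkingDirectory=/opt/pydaemon/data/scrapy-do
PrivateTmp=true
```

Remember to run sudo systemctl daemon-reload after adding or changing a drop-in.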
You can then reload the systemd configuration and let it manage the Scrapy Do daemon:
$ sudo systemctl daemon-reload
$ sudo systemctl start scrapy-do
$ sudo systemctl enable scrapy-do
Finally, you should now be able to see that the daemon is running:
$ sudo systemctl status scrapy-do
● scrapy-do.service - Scrapy Do Service
   Loaded: loaded (/etc/systemd/system/scrapy-do.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2017-12-10 22:42:55 UTC; 4min 23s ago
 Main PID: 27543 (scrapy-do)
...
I know it's awfully complicated. I will do some packaging work when I have a spare moment.