Simple Recipe For Grabbing Live Timing Data Using Cron Jobs on a Digital Ocean Droplet

A quick recipe for grabbing live race data, in this case GPS data from WRC live maps.

Create a Remote Server on Digital Ocean

First, we need a server somewhere. A quick way to do this is to create a simple Digital Ocean droplet:

A simple Linux box with the minimum spec (a 1GB machine) should do:

You can get some free Digital Ocean credit by signing up for the Github education pack; you get a smaller amount of credit ($10) if you sign up with my affiliate link – sign up for Digital Ocean and get $10 credit – and then if you end up spending $25 or more of your own cash on that account, I get a $25 h/t.

Connect via ssh

You can log in to the droplet using a web terminal or via ssh (Digital Ocean droplet – Connect with SSH).

To set up a local ssh key to simplify login to the Droplet (from a Mac), first generate a key pair if you don't already have one:

ssh-keygen
We can now add our public key to the droplet as an ssh key.

On the local machine, get your public key:

cat ~/.ssh/

which should return something like:

ssh-rsa AAAA___RANDOM/STUFF____fZ USER@MACHINE

Add the SSH key content (ssh-rsa AAAA___RANDOM/STUFF____fZ) to the droplet with the Name (USER@MACHINE).

Alternatively, if the droplet is already created, then:

ssh-copy-id root@DROPLET.IP.ADDRESS

(Under this route, I had to reset the droplet password first.)

Now we should be able to log in without a password directly over ssh:

ssh root@DROPLET.IP.ADDRESS
Update the package lists so that we can find the required packages:

apt-get update

Install pip for Python 3:

apt install -y python3-pip

and any required packages:

pip3 install requests

Create a script to grab the data; if we are logging to a directory, make sure we create it first:

mkdir -p times/raw

For an example logging script, create the script file on the droplet using the nano editor and write a simple logger, such as the following (the url value is a placeholder for the actual live timing endpoint):

import time
import requests
import json
import os

# Placeholder URL – substitute the actual live timing JSON endpoint
url = 'https://LIVE.TIMING.URL/data.json'

# Make sure the log directory exists
d = 'times/raw'
if not os.path.exists(d):
    os.makedirs(d)

# Keep polling while at least one entry has status 'Competing'
competing = True
while competing:
    try:
        j = requests.get(url).json()
    except:
        time.sleep(30)
        continue
    competing = False
    if '_entries' in j:
        for x in j['_entries']:
            if 'status' in x and x['status']=='Competing': competing=True
    # Timestamp the raw data file
    ts = int(time.time())
    with open('{}/_{}.txt'.format(d, ts), 'w') as outfile:
        json.dump(j, outfile)
    # Don't hammer the server
    time.sleep(30)

Ctrl-X and save to get out of the nano editor.

Alternatively, if you have a pre-existing data logger script on your local machine, copy it into the droplet using scp. For example:

scp PATH/TO/SCRIPT root@DROPLET.IP.ADDRESS:/root/

The script probably doesn’t need to run all the time. Instead, we can start it at a particular time as a cron job, and run it under timeout [LIMIT] to stop it after a certain amount of time.
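As a quick illustration of how timeout behaves (a generic sketch, not specific to the logger script): it runs the given command, kills it when the time limit expires, and returns exit status 124 if the limit was hit.

```shell
# timeout runs a command and kills it when the limit expires;
# exit status 124 indicates the command was cut short
timeout 2s sleep 5 || echo "stopped by timeout (exit status $?)"
```

Here sleep 5 is stopped after two seconds, so the fallback message is printed with exit status 124.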

To simplify setting cron jobs, it makes sense to reset the server time to the local timezone. For example, to set the timezone to Sydney local time:

timedatectl set-timezone Australia/Sydney

You can list the available timezone identifiers with timedatectl list-timezones.

Then restart the machine:

/sbin/shutdown -r now

and log back in when it’s had time to reboot. Then just to make sure:

service cron restart

Check the local time of the droplet, just to make sure we know how local time in the droplet compares to our own local time:

date
To set up the cron job in the droplet:

crontab -e

Crontab entries are written in the form:

# m h dom mon dow command

To start at the top of the hour at 5am (droplet local time) on 28th July, for example, and run for 5 hours:

0 5 28 7 * timeout 5h python3

Or to start at 11.45am on 28/7 for half an hour, something like:

45 11 28 7 * timeout 30m python3 /root/

If we leave the droplet running, the datagrabber should run over the desired periods and save the data inside the droplet.

When we’re done, zip the files that we saved inside the droplet:

Install the zipper:

apt install -y zip

And zip the files:

zip -q -r times/raw

Then on local machine (type exit to close the ssh session), copy the zip file from the remote droplet back to the local host:

scp root@DROPLET.IP.ADDRESS:/root/ ./

PS by the by, to launch a really simple webserver on the remote machine, install flask (pip3 install uwsgi flask):
from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World!'

if __name__ == "__main__":

Then run with e.g. python3

PS to do: scheduling things automatically; timezone handling (python-crontab, timezonefinder).
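As a first sketch of the "scheduling things automatically" to-do, a cron line can be composed programmatically and appended to the existing crontab without opening an editor; /root/DATAGRABBER.PY below is a placeholder path for the logger script.

```shell
# Compose a cron line from its parts; /root/DATAGRABBER.PY is a placeholder path
START="45 11 28 7 *"
JOB="$START timeout 30m python3 /root/DATAGRABBER.PY"
echo "$JOB"
# To install it non-interactively, append it to any existing crontab (commented out here):
# ( crontab -l 2>/dev/null; echo "$JOB" ) | crontab -
```

This only prints the composed line; uncommenting the last command would install it alongside any existing jobs.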

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
