Periodically running python programs
Kiran Koduru • Apr 23, 2018 • 8 minutes to read
When I learned to program, one of the things I wanted to do was run tasks, jobs, functions, programs etc. on a periodic basis. On my journey with python I learned that there are multiple techniques to accomplish this. I learned that I could run a program in the background from the command line or run it as a cron job. There is also a collection of python packages that can be useful for running background tasks. I will try to walk through some of them in this post. Forewarning, this is a long post, so if you would like to put a pin in it and come back later, I totally understand. I will try to break it down into multiple posts so it’s easier to come back to. For beginners to python and Unix, I would recommend reading the whole series in order to get an idea of working with periodic tasks in python or Unix environments.
Before we begin, have you ever wondered, once you are done writing a program, how you would go about running it periodically? In my earlier days of programming, I was excited that I had written a program that worked, but I wanted to run it at regular intervals. One such program scraped data from multiple websites. I wondered how I could scrape information periodically in order to always have the freshest data. I’ll try to explain in a series of posts how you would go about everything from setting up a basic cron job to more advanced workflows in python that run periodic tasks via celery. So I hope this helps you in some way.
I need to start with a program, something that can be run as a background task. My side project needed emails to be sent out periodically, so what better way than automating the schedule for the emails to be sent?
I created a module named send_emails.py. In python, files are called modules and packages are folders containing modules. Weird, I know. So, my module emails the latest blog post to an email list, and if there aren’t any new posts published, it won’t send out any emails.
My database has 2 tables, namely user and blogpost, that I need to get data from. The user table and the blogpost table each have their own set of fields.
Just in case you are wondering, here’s what the 2 tables look like
Next is my module send_emails.py, which will send the new blog post to the email list
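A minimal sketch of the logic such a module might follow, not my actual code. The BlogPost class, its fields, the function names and the send callable are all assumptions made up for illustration, and the posts live in a plain list instead of a database so the example stays self-contained.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class BlogPost:
    # Hypothetical fields; the real blogpost table may look different.
    title: str
    published_at: datetime


def latest_unsent_post(posts, last_sent_at):
    """Return the newest post published after last_sent_at, or None."""
    new_posts = [post for post in posts if post.published_at > last_sent_at]
    if not new_posts:
        return None
    return max(new_posts, key=lambda post: post.published_at)


def send_latest_post(posts, recipients, last_sent_at, send):
    """Send the newest unsent post to every recipient.

    `send` is any callable taking (address, subject); in a real module it
    would wrap the mail library of your choice. Returns the number of
    emails sent, which is 0 when nothing new was published.
    """
    post = latest_unsent_post(posts, last_sent_at)
    if post is None:
        return 0
    for address in recipients:
        send(address, post.title)
    return len(recipients)
```

Passing the send function in as an argument keeps the "is there anything new?" decision separate from the actual mail delivery, which makes the module easy to try out without sending real emails.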
Now this is what my directory structure looks like
.
└── my_emailer
    ├── __init__.py
    ├── db_connection.py
    ├── models.py
    └── send_emails.py
Running the program in the background
The novice way of running the above program periodically would be to run it in the background. You could add a while True loop with a delay using python’s time.sleep() function. I will add a 5 second delay for sending emails. For larger intervals, you can calculate the delay you would like between emails in seconds.
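A minimal sketch of that loop. The names run_periodically and send_emails and the max_runs argument are assumptions for illustration; max_runs only exists so the loop can be stopped in a demo, and leaving it as None gives the plain while True behaviour.

```python
import time

FIVE_SECONDS = 5
ONE_DAY_SECONDS = 24 * 60 * 60  # example of a larger interval, in seconds


def send_emails():
    # Stand-in for the real email-sending code.
    print("sending emails...")


def run_periodically(task, interval_seconds, max_runs=None):
    """Call task() over and over, sleeping interval_seconds after each call.

    With max_runs=None this loops forever, like a bare `while True`.
    Returns the number of times task() was called.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        task()
        runs += 1
        time.sleep(interval_seconds)
    return runs
```

Calling run_periodically(send_emails, FIVE_SECONDS) sends emails every 5 seconds until the process is stopped. Keep in mind that the delay is added on top of however long the task itself takes, so the schedule slowly drifts.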
To run the program above on Unix, I added the & symbol when calling it so that it runs in the background.
python send_emails_bg.py &
To view the list of running programs, I used the jobs command. I can also bring it to the foreground with the fg command and finally quit the program with Ctrl + C.
user@home:~/test_project $ python send_emails.py &
[1] 19173
user@home:~/test_project $ jobs
[1]+  Running    python send_emails.py &
user@home:~/test_project $ fg
^C
Or, if I am not interested in coming back to the process, I could use nohup so that it keeps running after I close the terminal
nohup python send_emails_bg.py &
Running my program like this would probably be okay for testing something out, but I wouldn’t recommend doing this for larger programs. My process would definitely stop running if my machine was rebooted. Hence the alternative, using cron!
Setting up the cron
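Cron gives me the assurance that the program will run on schedule, whether that is daily, hourly, weekly or monthly. It wouldn’t be affected by reboots. I could either edit my user’s crontab file with the crontab -e command or edit the system-wide /etc/crontab file on my Unix environment. Entries in /etc/crontab take an extra field, right after the schedule, naming the user to run the command as.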
# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of week (0 - 6) (Sunday to Saturday;
# │ │ │ │ │                                   7 is also Sunday on some systems)
# │ │ │ │ │
# │ │ │ │ │
  0 7 * * * admin cd /my/path/to/folder && python send_emails.py >> test.log 2>&1
Now the above entry will run as the user admin at the 7th hour of the day (i.e. 7 am) daily. The command appends both stdout and stderr to the test.log file.
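To see how the five schedule fields are read, here is a minimal python sketch of the matching rule. It is not how cron is actually implemented: cron_matches and field_matches are hypothetical helper names, and the sketch only understands * and plain integers (no ranges, lists or step values). The point it illustrates is that a * in the minute field matches every minute, so an entry meant to fire once at 7 am needs 0 in the minute field and 7 in the hour field.

```python
from datetime import datetime


def field_matches(field, value):
    """Return True if a cron field ('*' or a plain integer) matches value."""
    return field == "*" or int(field) == value


def cron_matches(expression, moment):
    """Check a five-field schedule such as '0 7 * * *' against a datetime.

    Simplification: every field must match. Real cron has extra rules
    when both day of month and day of week are restricted.
    """
    minute, hour, day_of_month, month, day_of_week = expression.split()
    # cron counts Sunday as 0 (7 is also accepted on some systems), while
    # isoweekday() counts Monday as 1 and Sunday as 7.
    cron_weekday = moment.isoweekday() % 7
    if day_of_week == "7":
        day_of_week = "0"
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day_of_month, moment.day)
        and field_matches(month, moment.month)
        and field_matches(day_of_week, cron_weekday)
    )


seven_sharp = datetime(2018, 4, 23, 7, 0)
seven_thirty = datetime(2018, 4, 23, 7, 30)

print(cron_matches("0 7 * * *", seven_sharp))   # True: runs at 7:00
print(cron_matches("0 7 * * *", seven_thirty))  # False: only minute 0 matches
print(cron_matches("* 7 * * *", seven_thirty))  # True: every minute of the 7 am hour
```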
If you are looking for an implementation on Windows then you can probably find some answers here.
cron has other implementations that I haven’t looked into, like Anacron. From what I have seen, it lets you schedule tasks that still run after the next boot if the system was turned off at the scheduled time. It might be useful for someone who wants their job to run at least once even if not on schedule.
I hope this tutorial was useful for getting started with cron and running code in the background. In my next post, I will talk about how to daemonize your python code so you can run it as a service.