A Python "RSS Feed Reader" script I use in my Radio Pi system

I just updated my Radio Pi “RSS Feed” script. Here is the source code:

#!/usr/bin/python

import feedparser
import time
from subprocess import check_output
import sys

#feed_name = 'TRIBUNE'
#url = 'http://chicagotribune.feedsportal.com/c/34253/f/622872/index.rss'

feed_name = sys.argv[1]
url = sys.argv[2]

db = '/var/www/radio/data/screensaver/feeds.db'
limit = 12 * 3600 * 1000   # 12 hours, in milliseconds

#
# function to get the current time
#
current_time_millis = lambda: int(round(time.time() * 1000))
current_timestamp = current_time_millis()

def post_is_in_db(title):
    with open(db, 'r') as database:
        for line in database:
            if title in line:
                return True
    return False

# return True if the title is in the database with a timestamp older than limit (in ms)
def post_is_in_db_with_old_timestamp(title):
    with open(db, 'r') as database:
        for line in database:
            if title in line:
                # rsplit so that a '|' inside the title does not break parsing
                ts_as_string = line.rsplit('|', 1)[1]
                ts = long(ts_as_string)
                if current_timestamp - ts > limit:
                    return True
    return False

def clean_string(string):
    return string.encode('utf-8').strip()

#
# get the feed data from the url
#
feed = feedparser.parse(url)

#
# figure out which posts to print
#
posts_to_print = []
posts_to_skip = []

for post in feed.entries:
    # skip a post once its database entry is older than the limit;
    # new posts and recently stored posts are printed
    title = clean_string(post.title)
    if post_is_in_db_with_old_timestamp(title):
        posts_to_skip.append(title)
    else:
        posts_to_print.append(title)
    
#
# add all the posts we're going to print to the database with the current timestamp
# (but only if they're not already in there)
#
with open(db, 'a') as f:
    for title in posts_to_print:
        # titles were already cleaned when posts_to_print was built
        if not post_is_in_db(title):
            f.write(title + "|" + str(current_timestamp) + "\n")
    
#
# output all of the new posts
#
count = 1
blockcount = 1
for title in posts_to_print:
    if count % 5 == 1:
        print("\n" + '((( ' + feed_name + ' - ' + str(blockcount) + ' )))')
        print("-------------------\n")
        blockcount += 1
    # titles were already cleaned when posts_to_print was built
    print(title + "\n")
    count += 1

ipAddr = check_output(["hostname", "-I"])
print("\n")
print('---------------------------------')
print('IP Address: ' + ipAddr.strip())
print('---------------------------------')
print("\n")

When run from a crontab entry with a command like this:

get_feed.py NPR http://www.npr.org/rss/rss.php?id=100

this script produces output like this:

((( NPR - 1 )))
-------------------

Putin Faces Frosty Reception At G20 In Australia

The Wondrous World Of Tom Thumb Weddings

Hong Kong Democracy Leaders Barred From Traveling To Beijing

Not My Job: Ron Perlman, Who Played The Beast, Gets Quizzed On Beauty

In NPR Interview, Bill Cosby Declines To Discuss Assault Allegations


((( NPR - 2 )))
-------------------

The Good Listener: For Thanksgiving, Is There Music Everyone Can Agree On?

Fresh Air Weekend: Jon Stewart, Peter Mendelsund And A Review Of Bob Dylan

Gen. Dempsey Lands In Iraq As U.S. Presence Starts To Grow


---------------------------------
IP Address: 10.0.1.9
---------------------------------
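
The headlines above are grouped into numbered blocks of five by the counter logic in the script. As a rough sketch of the same grouping using list slices instead of counters (written for Python 3; the function name format_blocks and its block_size parameter are my own illustrative names, not part of the original script):

```python
def format_blocks(feed_name, titles, block_size=5):
    """Group titles into numbered blocks, each under a '((( NAME - N )))' header."""
    lines = []
    for start in range(0, len(titles), block_size):
        block_number = start // block_size + 1
        lines.append("")
        lines.append("((( %s - %d )))" % (feed_name, block_number))
        lines.append("-------------------")
        lines.append("")
        # each title is followed by a blank line, as in the output above
        for title in titles[start:start + block_size]:
            lines.append(title)
            lines.append("")
    return "\n".join(lines)


# hypothetical sample data: eight titles should yield two blocks
sample_titles = ["Headline %d" % n for n in range(1, 9)]
print(format_blocks("NPR", sample_titles))
```

Slicing keeps the block number and the block contents in one place, which may be easier to adjust than two separate counters if the block size ever changes.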

That code and its output depend on some database files, such as the feeds.db file that records each title with a timestamp, so the code will make a little more sense if you know what my Radio Pi project is.
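
Judging from the script, each line of feeds.db has the form title|timestamp, where the timestamp is in milliseconds since the epoch. Here is a minimal sketch of reading that format and applying the 12-hour limit, written for Python 3. The helper names parse_record and is_stale are hypothetical and are not part of the original script:

```python
import time

LIMIT_MS = 12 * 3600 * 1000  # same 12-hour window as `limit` in the script


def parse_record(line):
    """Split one 'title|timestamp' line into (title, timestamp_ms)."""
    # rsplit splits on the last '|', so a '|' inside the title is kept
    title, ts = line.rstrip("\n").rsplit("|", 1)
    return title, int(ts)


def is_stale(line, now_ms, limit_ms=LIMIT_MS):
    """Return True if the record was stored more than limit_ms ago."""
    _, ts = parse_record(line)
    return now_ms - ts > limit_ms


now = int(round(time.time() * 1000))
# made-up sample records, one stored a second ago and one just past the limit
fresh = "Some headline|%d\n" % (now - 1000)
old = "Older | headline|%d\n" % (now - LIMIT_MS - 1)
print(is_stale(fresh, now))
print(is_stale(old, now))
```

With this logic a headline keeps appearing for about twelve hours after it is first stored and is skipped after that, which matches what post_is_in_db_with_old_timestamp does in the script.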

One note: I get and print the IP address of the local system because the output from this script goes to a monitor on my Radio Pi system. It’s often helpful to know the IP address of my Raspberry Pi, because I usually don't have a keyboard or mouse attached to it.
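
The script gets the address by running hostname -I. On a system where that command is not available, one possible alternative is to ask the operating system which local address it would use for outgoing traffic. This is only a sketch in Python 3, not what the script above does. The function name local_ip_address and the fallback value are my own assumptions:

```python
import socket


def local_ip_address(fallback="127.0.0.1"):
    """Best-effort lookup of this machine's primary IPv4 address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only asks the OS
        # which local interface it would use to reach this address.
        sock.connect(("10.255.255.255", 1))
        return sock.getsockname()[0]
    except OSError:
        # no usable route (for example, no network): use the fallback
        return fallback
    finally:
        sock.close()


print("---------------------------------")
print("IP Address: " + local_ip_address())
print("---------------------------------")
```

Unlike hostname -I, which can list several addresses, this returns a single address, so it may not be a drop-in replacement on a machine with more than one network interface.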