A document from MCS 260 Fall 2021, instructor David Dumas. You can also get the notebook file.

MCS 260 Fall 2021 Worksheet 13 solution

  • Course instructor: David Dumas

Topics

This worksheet focuses on networks in general and using HTTP in Python (servers and clients).

Resources

The main resources to refer to for this worksheet are:

(Lecture videos are not linked on worksheets, but are also useful to review while working on worksheets. Video links can be found in the course Blackboard site.)

0. Flask install

In this worksheet you'll be developing some programs that use Flask, a web framework for Python. It isn't part of Python's standard library, so you'll need to install it.

Lecture 35 contains install instructions. Try them out now if you haven't done so already.

1. What's that show called again? Cephalopod puzzle?

Write a Python program showsearch.py that takes a command line argument, the search string, and looks for a TV show with a name similar to that string. The program should print a list of possible matches with info about each one, ordered from most likely match to least likely. This could be used to convert a partially-remembered name (e.g. "I remember it had the word 'foundation' in it") to detailed info like title, date, and streaming service. Here's an example of what it should look like when it is run:

$ python3 showsearch.py foundation
Foundation from Apple TV+, premiered 2021-09-24
The Foundation from SHOWTIME Showcase, premiered 2009-09-13
Better Life Foundation from YouTube, premiered 2016-06-07

For simplicity we'll assume that the search string is a single word, as in this example.

This sounds like a nearly impossible task, but there's a free and open API that does all the heavy lifting. TVMaze provides an HTTP API where the URL specifies a TV show search string, and the response contains a JSON object that is a list of possible matches. Each one has a score (indicator of likelihood of match to the search term), and lots of data about the show (genre, start date, duration of episodes, web page, description, etc.). All you need to do is fetch the data using urllib.request.urlopen(...), process the JSON, and extract the info you need to print the possible matches.

The TVMaze show search API is described at https://www.tvmaze.com/api#show-search. Here is a concrete example. When I load

http://api.tvmaze.com/search/shows?q=foundation

I'm asking TVMaze to search for shows with names similar to "foundation" (i.e. containing that word, or a similar word). The response is a big JSON object, similar to what is shown below. (I've removed some data about each show to make the text shorter.)

[
  {
    "score": 0.909483,
    "show": {
      "id": 35951,
      "url": "https://www.tvmaze.com/shows/35951/foundation",
      "name": "Foundation",
      "type": "Scripted",
      "genres": [
        "Drama",
        "Science-Fiction"
      ],
      "status": "Running",
      "averageRuntime": 55,
      "premiered": "2021-09-24",
      "officialSite": "https://tv.apple.com/us/show/foundation/umc.cmc.5983fipzqbicvrve6jdfep4x3?l=en",
      "rating": {
        "average": 7
      },
      "network": null,
      "webChannel": {
        "id": 310,
        "name": "Apple TV+",
        "country": null
      },
      "summary": "<p>When revolutionary Dr. Hari Seldon predicts the impending fall of the Empire, he and a band of loyal followers venture to the far reaches of the galaxy to establish The Foundation in an attempt to rebuild and preserve the future of civilization. Enraged by Hari's claims, the ruling Cleons – a long line of emperor clones – fear their grasp on the galaxy may be weakening as they're forced to reckon with the potential reality of losing their legacy forever.</p>",
      "updated": 1636796350,
      "_links": {
        "self": {
          "href": "https://api.tvmaze.com/shows/35951"
        },
        "previousepisode": {
          "href": "https://api.tvmaze.com/episodes/2147514"
        },
        "nextepisode": {
          "href": "https://api.tvmaze.com/episodes/2147515"
        }
      }
    }
  },
  {
    "score": 0.68217707,
    "show": {
      "id": 39012,
      "url": "https://www.tvmaze.com/shows/39012/the-foundation",
      "name": "The Foundation",
      "type": "Scripted",
      "genres": [
        "Comedy"
      ],
      "status": "Ended",
      "averageRuntime": 25,
      "premiered": "2009-09-13",
      "officialSite": "https://www.showcase.ca/thefoundation/",
      "rating": {
        "average": null
      },
      "network": {
        "id": 1492,
        "name": "SHOWTIME Showcase",
        "country": {
          "name": "United States",
          "code": "US",
          "timezone": "America/New_York"
        }
      },
      "webChannel": null,
      "summary": "<p><b>The Foundation</b> is an irreverent comedy series about an uncharitable man at the helm of a charitable organization. The series revolves around an irresponsible, corrupt man holding the reins of a powerful non-profit organization. Michael Valmont-Selkirk is an impressive hypocrite doing much wrong in the face of great righteousness. <i>The Foundation</i> wags its fat cat finger at the well-heeled philanthropists and donators who earnestly relieve their guilt in the high stakes world of philanthropy and non-profit charity fund-raising. In this world of so many worthy causes, Michael Valmont-Selkirk has but one cause dear to his heart - himself.</p>",
      "updated": 1539063511,
      "_links": {
        "self": {
          "href": "https://api.tvmaze.com/shows/39012"
        },
        "previousepisode": {
          "href": "https://api.tvmaze.com/episodes/1543181"
        }
      }
    }
  },
  {
    "score": 0.5299132,
    "show": {
      "id": 20709,
      "url": "https://www.tvmaze.com/shows/20709/better-life-foundation",
      "name": "Better Life Foundation",
      "type": "Scripted",
      "genres": [
        "Comedy"
      ],
      "status": "Ended",
      "averageRuntime": 24,
      "premiered": "2016-06-07",
      "officialSite": "https://www.youtube.com/playlist?list=PLFM6PquCUdebrv_iQmSHlQwOERwsFCPA2",
      "rating": {
        "average": null
      },
      "network": null,
      "webChannel": {
        "id": 21,
        "name": "YouTube",
        "country": null
      },
      "summary": "<p>A documentary crew follows a group of five passionate NGO workers and a reluctant volunteer, trying to make the world a better place.</p>",
      "updated": 1633462155,
      "_links": {
        "self": {
          "href": "https://api.tvmaze.com/shows/20709"
        },
        "previousepisode": {
          "href": "https://api.tvmaze.com/episodes/1513014"
        }
      }
    }
  }
]

As you can see, it is a 3-element list. Each element is a dictionary with two keys: "score", which maps to a float (larger means closer match), and "show", which maps to a dictionary of data about the show. That dictionary has its own complicated hierarchy of data.

For example, if that JSON object were loaded into a variable data, then I could get the name of the highest-score match as follows:

bestmatch = max(data, key = lambda x:x["score"])
print("The best match show name is:",bestmatch["show"]["name"])

But in your program, you'll probably want to sort by score rather than just getting the top-scored match.
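As a minimal sketch of that sorting step, the example below uses a hand-made list with the same "score" and "show" keys as the API response (trimmed to just the show name, and deliberately out of order). The variable names `ranked` and `names` are only for illustration.

```python
# A trimmed stand-in for the parsed JSON response (not real API output)
data = [
    {"score": 0.5299132, "show": {"name": "Better Life Foundation"}},
    {"score": 0.909483, "show": {"name": "Foundation"}},
    {"score": 0.68217707, "show": {"name": "The Foundation"}},
]

# sorted() returns a new list; reverse=True puts the largest score first
ranked = sorted(data, key=lambda hit: hit["score"], reverse=True)

names = [hit["show"]["name"] for hit in ranked]
print(names)
```

Calling data.sort(...) with the same arguments would sort the list in place instead of making a copy.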

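Also note that every show has both a "network" key and a "webChannel" key, but usually only one of them has a value other than null. Shows appearing on traditional TV networks have their info under "network", while those only available on streaming services have similar info under "webChannel".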

I suggest you only print matches with a score of 0.3 or higher, since lower scores are generally not good matches for the search term.

Warning

Don't make more than 2 requests to the TVMaze API per second. One way to guarantee that is to add

import time

time.sleep(0.5)

near the top of your script, so it pauses for 0.5 seconds before doing anything else.

Solution

# Content of showsearch.py

"""Use TVMaze API to find show by name"""
# MCS 260 Fall 2021 Worksheet 13 solution
import sys
import json
import urllib.request

if len(sys.argv) < 2:
    print("Usage: {} SEARCHWORD".format(sys.argv[0]))
    sys.exit(1)

term = sys.argv[1] # additional command line args are ignored.

# NOTE: The next line only supports cases where `term` is a single word
# consisting of letters A-Za-z. If `term` contains spaces or other characters
# that are special in URLs then this next request will fail.  Python's
# standard library has functions to handle "quoting" so any text can be sent
# in a URL, but we haven't covered them yet.
res = urllib.request.urlopen("https://api.tvmaze.com/search/shows?q="+term)
data = json.load(res)
res.close()

data.sort(key=lambda x:-x["score"])
for searchhit in data:
    if searchhit["score"] < 0.3:
        break
    show = searchhit["show"]
    showname = show["name"]
    # The distribution method (streaming, traditional TV) is indicated
    # by which of the keys "network" or "webChannel" has an associated
    # value that is not `None`.
    if show["network"]:
        # the show is on a traditional TV network
        where = show["network"]["name"]
    elif show["webChannel"]:
        # the show is on a streaming service
        where = show["webChannel"]["name"]
    else:
        where = "unknown source"
    startdate = show["premiered"]
    print("{} from {}, premiered {}".format(showname,where,startdate))
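The note in the solution mentions that the standard library can "quote" text so that it is safe to put in a URL. One such function is urllib.parse.quote. The sketch below shows how it might be used to build the search URL; the helper name `search_url` and the constant `API_URL` are made up for this example and are not part of the solution above.

```python
import urllib.parse

API_URL = "https://api.tvmaze.com/search/shows"

def search_url(term):
    """Build a show search URL, quoting `term` so it is safe in a query string"""
    # quote() replaces spaces and other special characters with %XX escapes
    return API_URL + "?q=" + urllib.parse.quote(term)

print(search_url("foundation"))
print(search_url("better life"))
```

With this change, a search string containing spaces could be passed to urllib.request.urlopen without breaking the URL.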

2. Startup idea API

Remember back on Worksheet 5, problem 2, when you wrote a simple script to generate and print a random startup idea (e.g. "a carpet that fires plastic darts")? (If you don't remember, take a look now. This problem is based on that one.)

Convert that into a Flask API that has the following routes (i.e. these are the resources it allows a client to GET using HTTP):

  • /startup/random/ - Returns a random startup idea as in Worksheet 5, e.g. "a carpet that fires plastic darts"
  • /startup/random/product/ - Returns a random product idea (one part of making a random startup idea), e.g. "a carpet"
  • /startup/random/feature/ - Returns a random special product feature (one part of making a random startup idea), e.g. "that fires plastic darts"

This API should return each string encoded as JSON, using flask.jsonify. Configure Flask so it listens for connections on port 3000.
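To see what a string looks like once it is encoded as JSON, here is a small sketch using the standard json module as a stand-in for flask.jsonify. This is only an approximation: flask.jsonify builds a full HTTP response with a JSON content type, and the exact whitespace of its body may differ from what json.dumps produces.

```python
import json

idea = "a carpet that fires plastic darts"

# Encoding a Python string as JSON wraps it in double quotes
encoded = json.dumps(idea)
print(encoded)

# A client gets the original string back by decoding the JSON text
decoded = json.loads(encoded)
print(decoded)
```

This is why a client of the API needs to decode the response with the json module rather than using the raw response text directly.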

Solution

"""API that provides startup ideas"""
# MCS 260 Fall 2021 Worksheet 13 solution
import flask
import random

product_ideas = [
    "a juice machine",
    "a carpet",
    "an office chair",
    "a coffee maker",
    "a haircut robot",
    "a Python course",
    "a toothbrush",
    "a pair of noise-cancelling headphones",
    "an oversized raccoon plush doll",
    "a detailed model of Boise, Idaho",
    "a university building"
]
special_features = [
    "that is controlled by a smartphone app",
    "that plays polka music",
    "made of polished titanium",
    "that also mines bitcoin",
    "scented with sandalwood and lime oil",
    "with a pleasant strawberry flavor",
    "run by an elite squad of monks trained in martial arts", 
    "without spiders",
    "with an integrated soap dispenser",
    "that is vegetarian and gluten-free",
    "enriched with vitamin D",
    "that fires plastic darts",
    "made of hexagons and confusion"
]

app = flask.Flask("Startup Idea API")

@app.route("/startup/random/")
def startup_idea():
    """Produce a random startup idea from a product and special feature"""
    return flask.jsonify(
        random.choice(product_ideas) + " " + random.choice(special_features)
    )

@app.route("/startup/random/product/")
def product_idea():
    """Produce a random product idea"""
    return flask.jsonify(
        random.choice(product_ideas)
    )

@app.route("/startup/random/feature/")
def feature_idea():
    """Produce a random feature idea"""
    return flask.jsonify(
        random.choice(special_features)
    )

app.run(port=3000)

3. Random startup idea: API client

Now, build a program that does exactly the same thing that was requested for Worksheet 5, problem 2, but which uses the API you created in problem 2 to do all the idea generation work. That is, it should use urllib.request.urlopen and json.load to fetch a startup idea when needed. As in that worksheet, it should keep retrieving and offering ideas until one is deemed acceptable by the user.

The idea of this exercise is that it shows how you might take an existing program and move part of the work to another machine, with the two parts communicating over a network. While in this class we run Flask in a way that requires the connections to come from the same machine, with a few minor changes the pair of programs written in these problems could be made to run on two different computers.

Note: You'll need to have the API server from problem 2 of this worksheet running while you test this program.

Bonus: Make this program robust to cases in which the API server from problem 2 cannot be reached or replies with an error status code (4xx or 5xx). That is, have the program print an informative error message in such cases rather than exiting with an exception.

Solution

"""
Produce startup ideas until user indicates one is acceptable
Fetch the ideas from the API in apistartupidea.py
"""
# MCS 260 Fall 2021 Worksheet 13 solution
import urllib.request
import json

url = "http://localhost:3000/startup/random/"

def startup_idea():
    """Get a random startup idea"""
    res = urllib.request.urlopen(url)
    data = json.load(res)
    res.close()
    return data # should be a string

acceptable = "n"
while acceptable != "y":
    print("Startup idea:",startup_idea())
    acceptable = input("Acceptable? (Y/N): ").lower()
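The solution above does not attempt the bonus. One possible approach is sketched below, under these assumptions: the function name `fetch_idea`, the constant `URL`, and the 5-second timeout are choices made for this example, and returning None to signal failure is just one of several reasonable designs. It relies on the fact that urllib.request.urlopen raises urllib.error.HTTPError for 4xx and 5xx responses and urllib.error.URLError when the server cannot be reached.

```python
import json
import urllib.error
import urllib.request

URL = "http://localhost:3000/startup/random/"

def fetch_idea(url=URL):
    """Return a startup idea string, or None if the API request fails"""
    try:
        with urllib.request.urlopen(url, timeout=5) as res:
            return json.load(res)
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        # HTTPError is a subclass of URLError, so it must be handled first.
        print("The API replied with error status {}".format(err.code))
    except urllib.error.URLError as err:
        # No usable answer at all, e.g. the server is not running
        print("Could not reach the API: {}".format(err.reason))
    return None
```

The main loop could then call fetch_idea() and stop asking the user for input as soon as it returns None.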

Revision history

  • 2021-11-18 Initial release