Lecture 23

CSV and JSON

MCS 275 Spring 2022
David Dumas

Lecture 23: CSV and JSON

Course bulletins:

  • Homework 8 posted.
  • Project 3 posted. Check it out. Brief discussion in Monday's lecture.
  • Project 2 solutions posted. Please read.
  • Project 2 will be graded by Monday.

Install Pillow

Next week: Manipulating images with the Python package Pillow. To prepare, please

python3 -m pip install pillow

Or substitute the correct interpreter name for your platform.

If you have trouble, check the install instructions and let us know if you don't find a solution there.

Reading and writing data files

CSV

Comma separated values. A text file format like:


    State,Capital,Population
    Kentucky,Frankfort,25527
    South Dakota,Pierre,13646

Column headings in the first row (usually).

Untyped. Up to reader to figure out string/float/int/etc.


        District,Fin-Sub,Chrgbl Fin No,PO Name,Unit Name,Property Address,County,City,ST,ZIP Code,Property Status,Ownership,FDB ID (All),AMS Locale Key (All),FDB Facility Type (All),FDB Facility Subtype (All),Building Ownership Description,Land Desc,Space Certified Indicator,Bldg Occu Date,Int Sq Ft
        Greater Boston,431120-G01,431120,BARRINGTON,MAIN OFFICE,200 MIDDLE HWY,BRISTOL,BARRINGTON,RI,02806-9998,Active,Owned,1354095,V25837,Post Office,Main Post Office,"USPS Building, Transferred","USPS Land, Transferred",Yes,6/1/1974,"6,769"
        Greater Boston,432360-G01,432360,COVENTRY,MAIN OFFICE,1550 NOOSENECK HILL RD,KENT,COVENTRY,RI,02816-9998,Active,Owned,1359450,V25859,Post Office,Administrative Post Office (APO),"USPS Building, Const. by USPS","USPS Land, Not Prev. Leased",Yes,7/1/1990,"13,764"
        Greater Boston,434480-G01,434480,HARRISVILLE,MAIN OFFICE,131 HARRISVILLE MAIN ST,PROVIDENCE,HARRISVILLE,RI,02830-9998,Active,Owned,1366338,V25891,Post Office,Administrative Post Office (APO),"USPS Building, Transferred","USPS Land, Transferred",No,12/1/1951,"1,413"
        Greater Boston,436020-G01,436020,NEWPORT,MAIN OFFICE,320 THAMES ST STE 1,NEWPORT,NEWPORT,RI,02840-9998,Active,Owned,1375017,V25919,Post Office,Main Post Office,"USPS Building, Transferred","USPS Land, Transferred",Yes,8/1/1917,"35,637"
        Greater Boston,436090-G02,436090,NORTH KINGSTOWN,MAIN OFFICE,7715 POST RD,WASHINGTON,NORTH KINGSTOWN,RI,02852-9998,Active,Owned,1375354,V25921,Post Office,Administrative Post Office (APO),"USPS Building, Not Prev. Leased","USPS Land, Not Prev. Leased",Yes,10/1/1997,"32,629"
        Greater Boston,436580-G02,436580,PASCOAG,MAIN OFFICE,35 BRIDGE WAY,PROVIDENCE,PASCOAG,RI,02859-3132,Active,Owned,1376912,V25928,Post Office,Main Post Office,"USPS Building, Transferred","USPS Land, Transferred",No,6/1/1953,"3,290"
        Greater Boston,436723-G01,436723,PAWTUCKET,CUMBERLAND BR.,2055 DIAMOND HILL RD,PROVIDENCE,CUMBERLAND,RI,02864-9998,Active,Owned,1434572,V25862,Post Office,Branch,"USPS Building, Const. by USPS","USPS Land, Not Prev. Leased",Yes,3/1/1985,"9,452"
        Greater Boston,436720-G03,436720,PAWTUCKET,DARLINGTON,30 MONTICELLO RD,PROVIDENCE,PAWTUCKET,RI,02861-3810,Active,Owned,1360314,V25864,Post Office,Station,"USPS Building, Not Prev. Leased","USPS Land, Not Prev. Leased",Yes,,"16,137"
        Greater Boston,436720-G01,436720,PAWTUCKET,MAIN OFFICE,40 MONTGOMERY ST,PROVIDENCE,PAWTUCKET,RI,02860-9998,Active,Owned,1377003,V25929,Post Office,Main Post Office,"USPS Building, Transferred","USPS Land, Transferred",Yes,9/1/1933,"49,535"
        Greater Boston,436720-G01,436720,PAWTUCKET,MAIN OFFICE,40 MONTGOMERY ST,PROVIDENCE,PAWTUCKET,RI,02860-9998,Active,Owned,1434718,V28273,Post Office,Finance Station - No Delivery,"USPS Building, Transferred","USPS Land, Transferred",Yes,9/1/1933,"49,535"
        Greater Boston,436860-G01,436860,PORTSMOUTH,MAIN OFFICE,95 CHASE RD,NEWPORT,PORTSMOUTH,RI,02871-9998,Active,Owned,1378183,V25933,Post Office,Main Post Office,"USPS Building, Const. by USPS","USPS Land, Not Prev. Leased",Yes,8/1/1990,"12,048"
        Greater Boston,437140-G07,437140,PROVIDENCE,CORLISS PK. STA & VMF,55 CORLISS ST RM 100,PROVIDENCE,PROVIDENCE,RI,02904-9722,Active,Owned,1434581,V25907,Post Office,Carrier Annex (ANX),Unknown,Unknown,Yes,7/1/1980,"39,677"
        Greater Boston,437140-G07,437140,PROVIDENCE,CORLISS PK. STA & VMF,55 CORLISS ST RM 100,PROVIDENCE,PROVIDENCE,RI,02904-9722,Active,Owned,1434685,V27719,Vehicle Maintenance,Vehicle Maintenance Facility (VMF),Unknown,Unknown,Yes,7/1/1980,"39,677"
        Greater Boston,437178-G01,437178,PROVIDENCE,EAST PROVIDENCE BR.,17 GROVE AVE,PROVIDENCE,EAST PROVIDENCE,RI,02914-4506,Active,Owned,1434641,V27459,Post Office,Finance Branch,"USPS Building, Const. by USPS","USPS Land, Not Prev. Leased",Yes,6/1/1971,"6,085"
        Greater Boston,437166-G01,437166,PROVIDENCE,JOHNSTON BRANCH,1530 ATWOOD AVE,PROVIDENCE,JOHNSTON,RI,02919-9998,Active,Owned,1434577,V25901,Post Office,Branch,"USPS Building, Not Prev. Leased","USPS Land, Not Prev. Leased",Yes,,"17,034"
        Greater Boston,437170-G01,437170,PROVIDENCE,OLNEYVILLE STA,100 HARTFORD AVE,PROVIDENCE,PROVIDENCE,RI,02909-9998,Active,Owned,1376101,V25927,Post Office,Station,"USPS Building, Transferred","USPS Land, Transferred",Yes,6/1/1967,"10,733"
        Greater Boston,437141-G08,437141,PROVIDENCE,P&DC,24 CORLISS ST RM 100,PROVIDENCE,PROVIDENCE,RI,02904-2477,Active,Owned,1434586,V25936,Post Office,Main Post Office,"USPS Building, Not Prev. Leased","USPS Land, Not Prev. Leased",No,1/1/1961,"384,113"
        Greater Boston,437141-G08,437141,PROVIDENCE,P&DC,24 CORLISS ST RM 100,PROVIDENCE,PROVIDENCE,RI,02904-2477,Active,Owned,1441076,V25935,Mail Processing,Processing and Distribution Center/Facility (PDC/PDF),"USPS Building, Not Prev. Leased","USPS Land, Not Prev. Leased",No,1/1/1961,"384,113"
        Greater Boston,437141-G08,437141,PROVIDENCE,P&DC,24 CORLISS ST RM 100,PROVIDENCE,PROVIDENCE,RI,02904-2477,Active,Owned,1444535,8938,Administrative Office,Postal Inspection Service Field Division (USPIS),"USPS Building, Not Prev. Leased","USPS Land, Not Prev. Leased",No,1/1/1961,"384,113"
        Greater Boston,438260-G07,438260,WAKEFIELD,MAIN OFFICE,551 KINGSTOWN RD,WASHINGTON,WAKEFIELD,RI,02879-9998,Active,Owned,1386177,V25967,Post Office,Main Post Office,"USPS Building, Not Prev. Leased","USPS Land, Not Prev. Leased",Yes,11/1/1997,"31,358"
        Greater Boston,438260-G01,438260,WAKEFIELD,NARRAGANSETT BR.,15 MEMORIAL SQ,WASHINGTON,NARRAGANSETT,RI,02882-3391,Active,Owned,1434584,V25916,Post Office,Finance Branch,"USPS Building, Transferred","USPS Land, Transferred",Yes,6/1/1916,"6,408"
        Greater Boston,438540-G01,438540,WARREN,MAIN OFFICE,53 CHILD ST,BRISTOL,WARREN,RI,02885-9998,Active,Owned,1386428,V25968,Post Office,Main Post Office,"USPS Building, Transferred","USPS Land, Transferred",Yes,8/1/1931,"9,484"
        Greater Boston,438680-G01,438680,WARWICK,APPONAUG STATION,3205 POST RD,KENT,WARWICK,RI,02886-7157,Active,Owned,1353208,V27496,Post Office,Finance Station,"USPS Building, Transferred","USPS Land, Transferred",Yes,10/1/1941,"6,097"
        Greater Boston,438680-G03,438680,WARWICK,POST OFFICE ANNEX-STORAGE,3205 POST RD REAR,KENT,WARWICK,RI,02886-7141,Active,Owned,,,,,"USPS Building, Prev. Leased","USPS Land, Prev. Leased",Yes,10/1/1957,"10,131"
        Greater Boston,439380-G02,439380,WEST WARWICK,MAIN OFFICE,100 WASHINGTON ST,KENT,WEST WARWICK,RI,02893-9998,Active,Owned,1386937,V25977,Post Office,Main Post Office,"USPS Building, Const. by USPS","USPS Land, Not Prev. Leased",Yes,,"17,482"
        Greater Boston,439100-G01,439100,WESTERLY,HIGH STREET STATION,5 HIGH ST,WASHINGTON,WESTERLY,RI,02891-1878,Active,Owned,1366914,V25892,Post Office,Station,"USPS Building, Transferred","USPS Land, Transferred",No,6/1/1914,"8,448"
        Greater Boston,439100-G02,439100,WESTERLY,MAIN OFFICE,110 TOM HARVEY RD,WASHINGTON,WESTERLY,RI,02891-9998,Active,Owned,1387262,V25978,Post Office,Administrative Post Office (APO),"USPS Building, Const. by USPS","USPS Land, Not Prev. Leased",Yes,12/1/1988,"14,236"
        Greater Boston,439800-G02,439800,WOONSOCKET,MAIN OFFICE,127 SOCIAL ST STE 1,PROVIDENCE,WOONSOCKET,RI,02895-9998,Active,Owned,1388287,V25983,Post Office,Administrative Post Office (APO),"USPS Building, Const. by USPS","USPS Land, Not Prev. Leased",No,3/1/1976,"34,476"
        Greater Boston,439800-G02,439800,WOONSOCKET,MAIN OFFICE,127 SOCIAL ST STE 1,PROVIDENCE,WOONSOCKET,RI,02895-9998,Active,Owned,1448921,11464,Administrative Office,Office of the Inspector General Office (OIG),"USPS Building, Const. by USPS","USPS Land, Not Prev. Leased",No,3/1/1976,"34,476"
    

Source: USPS

Reading CSV


        with open("datafile.csv","r",newline="",encoding="UTF-8") as fp:
            rdr = csv.DictReader(fp)
            rownum = 1
            for row in rdr:  # reader objects are iterable (ONCE!)
                # row is a dict like {"State": "Kentucky", ...}
                print("Row",rownum)
                rownum += 1
                for colname in row:
                    print("{}: {}".format(colname,row[colname]))
    

Writing CSV


        with open("courses.csv","w",newline="",encoding="UTF-8") as fp:
            w = csv.DictWriter(fp, fieldnames=["course","instructor"])
            # Write the column headers
            w.writeheader()
            # Now write the rows of data
            w.writerow({"course": "MCS 260",
                        "instructor": "Dumas"})
            w.writerow({"course": "MCS 275",
                        "instructor": "Dumas"}
        

Writing CSV

        
            with open("courses.csv","w",newline="",encoding="UTF-8") as fp:
                w = csv.writer(fp)
                # Write the column headers
                w.writerow(["course","instructor"])
                # Now write the rows of data
                w.writerow(["MCS 260","Dumas"])
                w.writerow(["MCS 275","Dumas"])
        

JSON

JSON stands for JavaScript object notation. It is a text-based format for typed hierarchical data.


        {
            "title": "Fighting robotic wasps",
            "author": "Paolo Cortázar",
            "year": 2026,
            "tags": ["nonfiction","self-help"],
            "checked out": true,
            "avg star rating": 4.89
        }
    

        {
            "newsFeedItemList": [
              {
                "title": "Planning and research grants available through IDOT ",
                "type": "Press Release",
                "date": "Thursday, March 03",
                "year": "2022",
                "description": "SPRINGFIELD - The Illinois Department of Transportation announced today that $10 million is available to distribute as grants for select projects and activities that help advance the state’s long-range transportation and planning goals. The deadline to apply is March 30.",
                "thumbnail": "https://www2.illinois.gov/IISNewsImages/rollupimages/DOT/IDOTlogo.jpg",
                "url": "/content/soi/illinois/en/news/press-release.24573.html",
                "altText": ""
              },
              {
                "title": "2021 Marion County Final Multiplier Announced",
                "type": "Press Release",
                "date": "Thursday, March 03",
                "year": "2022",
                "description": "SPRINGFIELD, IL, - Marion County has been issued a final property assessment equalization factor of 1.0000, according to David Harris, Director of the Illinois Department of Revenue.",
                "thumbnail": "https://www2.illinois.gov/IISNewsImages/rollupimages/DOR/DORlogosm.gif",
                "url": "/content/soi/illinois/en/news/press-release.24572.html",
                "altText": ""
              },
              {
                "title": "2021 Vermilion County Final Multiplier Announced",
                "type": "Press Release",
                "date": "Thursday, March 03",
                "year": "2022",
                "description": "SPRINGFIELD, IL, - Vermilion County has been issued a final property assessment equalization factor of 1.0000, according to David Harris, Director of the Illinois Department of Revenue.",
                "thumbnail": "https://www2.illinois.gov/IISNewsImages/rollupimages/DOR/DORlogosm.gif",
                "url": "/content/soi/illinois/en/news/press-release.24575.html",
                "altText": ""
              }
            ]
          }
    

Source: illinois.gov home page

JSON value types

  • string — must use double quotes.
  • number — float, int, other? Up to reader.
  • boolean — lower case names true, false.
  • null — like Python None.
  • array — like Python list. Brackets and commas.
  • object — like Python dict. Curly braces, colons, and commas. Keys must be strings.

Reading JSON


        with open("in.json","r",encoding="UTF-8") as fp:
            val = json.load(fp) # read from file
        
        # OR if you have a string
        val = json.loads(s)
    

The object returned can be hard to use if you don't have documentation for the layout of the file. But since it has keys and values, it is at least explorable.

Writing JSON


        val = { 
                "temperature": 451.3,
                "primes": [2,3,5,7,11],
                "awesome": True,
                "starter": "charmander"
              }
        with open("out.json","w",encoding="UTF-8") as fp:
            json.dump(val,fp) # save exactly one object to file
        
        # OR if you just want the JSON as a string
        s = json.dumps(val)
    

Key JSON features

  • Does not require data to be tabular.
  • Has excellent standardization and cross-language support.
  • Most HTTP APIs (e.g. data portals) return JSON.
  • Semi-readable and semi-writeable for humans.

Conversion table for Python → JSON

  • dictobject
  • list or tuplearray
  • int or floatnumber
  • boolboolean
  • Nonenull

References

Revision history

  • 2021-03-04 Initial publication