3 Museums, 4 APIs

Apr 9, 2020

Ave atque Vale

I wrote this out and managed to delete everything I had already written. D’oh. This is my second try.

Cast your mind back to our first meeting, in January. That day, I had received some bad news concerning the health of a friend of mine. I was pretty flat, and did a poor job in class. I told you about it, later, and said something like, ‘y’know, there’s nothing we can’t roll with in this course, as long as we’re talking’.

And then the world threw a pandemic at us. I mean, what. the. actual. ….

So the last few weeks were not as I would have wanted. I was deeply disappointed to have to cancel the heritagejam, which is what we were all driving at. But I was really pleased to see so many of you doing what you could to continue the journey in this course, when you had the energy and motivation, and even when you didn’t.

I could not have asked for a better gang of students, and I am deeply grateful for you all.

Under the most exceptional of conditions, each group has produced something worthwhile. Y’all digitized the original paper documents with your phone cameras, you got a rough-and-ready transcription done via machine learning and old-fashioned elbow grease, you cleaned it all up and arranged it in tables, and you turned it into databases with human- and machine-readable facets.

None of this was easy, what y’all accomplished.

But you did it - despite everything that was thrown at us. And I am proud of you!

I told you there was nothing we couldn’t roll with in this course.

3 Museums, 4 APIs

The connective tissue between our work at the Canadian Museum of Nature, the Canadian Museum of History, and the Canada Science and Technology Museum was ‘the land.’ We encountered a number of collections that for various reasons had yet to be digitized, and/or could not yet be accessed online. We looked at the comparative zooarchaeological collection at Nature, the archives of the Cataraqui Archaeological Research Foundation’s excavation at Fort Frontenac in Kingston, and the collection of land survey markers.

These records were transcribed using the Microsoft Azure Vision Cognitive Service (instructions here) and turned into text files. These were cleaned up and organized into tables according to Tidy Text principles (although we did not do any R-based statistical work). We then used the datasette data publishing tool to convert these tables into SQLite databases and to build a public-facing website with its own API. These were pushed online to the Heroku platform.
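To make the 'tables into SQLite' step a bit more concrete, here is a minimal sketch in Python using only the standard library. It is not the tool the class actually used; the function name, the column names, and the sample rows are all made up for illustration, and every column is simply stored as text.

```python
import csv
import io
import sqlite3


def load_csv_into_table(conn, csv_text, table_name):
    """Create table_name on conn from CSV text and return the row count.

    Hypothetical helper for illustration: the first CSV line is treated
    as the header, and every column is stored as TEXT.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines

    # Quote identifiers so headers with spaces or hyphens stay valid SQL.
    def quote(name):
        return '"' + name.replace('"', '""') + '"'

    columns = ", ".join(f"{quote(col)} TEXT" for col in header)
    placeholders = ", ".join("?" for _ in header)

    with conn:  # commit on success, roll back on error
        conn.execute(f"CREATE TABLE {quote(table_name)} ({columns})")
        conn.executemany(
            f"INSERT INTO {quote(table_name)} VALUES ({placeholders})", rows
        )
    return len(rows)


if __name__ == "__main__":
    # Made-up sample data, not taken from any of the museum collections.
    sample = "specimen id,taxon\n1,Bos taurus\n2,Sus scrofa\n"
    connection = sqlite3.connect(":memory:")
    inserted = load_csv_into_table(connection, sample, "specimens")
    print(f"Inserted {inserted} rows")
```

Pointing sqlite3.connect at a filename instead of ":memory:" would leave a .db file on disk, which is the kind of file datasette can then serve.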

The goal was also to publish paradata websites that would detail the decisions taken at each step of that process which affected the amount and kind of data that we were able to publish. Unfortunately, the world had other plans. Subsequent iterations of this course will continue the work, so stay tuned.

The Museum of Nature

This API makes available archaeofaunal evidence from Place Royale. There are two tables, with over 100 and 500 rows respectively. This group also produced a digitized guide to the abbreviations. https://place-royale-latrine.herokuapp.com/

For reference, this was all pushed online with a few more flags set than were in my original instructions:

datasette publish heroku --name place-royale-latrine All-Specimens.db Cuts-burns-toothmarks.db --about "By students in HIST4806a Museums and Digital History Carleton University Winter 2020" --source="Canadian Museum of Nature" --title="Place Royale Latrine ZIC Project #98"

The Museum of History

This API makes available some 400 record cards from the CARF excavations. http://fort-frontenac-excavation.herokuapp.com/.

Like the Nature gang, this group ran into an issue with the datasette publish heroku command on Windows machines. Both groups hit the same error on different machines, and I couldn’t reproduce it on Macs, which leads me to suspect that the issue lies in the underlying codebase.

The Science and Technology Museum

This API makes available data concerning 134 markers at the CSTM. https://cstm-demo2.herokuapp.com/

Open Object Database from the Science and Technology Museum

I built this version of the CSTM Artifact Open Data Set as a demo for the class. It contains over 100 000 items. https://cstm-demo.herokuapp.com/. More CSTM data is available here.

So what can we do with an API?

For that, I would suggest looking at the amazing work of Tim Sherratt and his GLAM-Workbench.

But, to get started, here’s a Jupyter Notebook showing how to get data out of our APIs as json. You can run it in your browser by launching it in Binder; click ‘notebooks’ once that’s launched, and look for the Datasette example.
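If you'd rather see the idea without launching the notebook, here is a rough sketch of the same kind of request in plain Python. It is not the notebook's code. It relies on two Datasette conventions: adding .json to a table's URL returns JSON, and ?_shape=array returns a plain list of row objects. The database name 'All-Specimens' is my assumption based on the .db filename in the publish command above, and the table name 'specimens' is purely hypothetical; check the site's front page for the real names.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def table_json_url(base_url, database, table, size=None):
    """Build the JSON URL for one table on a Datasette site.

    Hypothetical helper: database and table names must match what the
    site actually lists; size optionally limits rows via _size.
    """
    path = f"{base_url.rstrip('/')}/{quote(database)}/{quote(table)}.json"
    params = {"_shape": "array"}  # ask for a plain list of row objects
    if size is not None:
        params["_size"] = size
    return f"{path}?{urlencode(params)}"


def fetch_rows(url, timeout=30):
    """Download the URL and parse the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # 'specimens' is a placeholder table name, not confirmed by the site.
    url = table_json_url(
        "https://place-royale-latrine.herokuapp.com/",
        "All-Specimens",
        "specimens",
        size=5,
    )
    print(url)
    # Only works if the site is online and the names above are right:
    # rows = fetch_rows(url)
    # print(rows[0])
```

From there, each row is an ordinary Python dictionary, so you can loop over it, count things, or hand the list to whatever analysis tool you like.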

Acknowledgements

I would like to acknowledge the support and encouragement of the staff from the three museums, their willingness to experiment, and their enthusiasm for working with my students - David Pantalony, Tanya Anderson, Stacey Girling-Chrisie, Colleen McGuire, Sean Tudor, Sara McFarlane, Ryan Dodge, Fiona Smith Hale, Kristy von Moos, Brian Dawson, Shyong en Pan, Laura Smyk, and Chantal Dussault. My heartfelt apologies if I’ve missed anyone. What a wonderful thing to be able to work in this city with so many excellent colleagues!