A FactMiners' Fact Cloud for the British Library Image Collection


I was thrilled to read the announcement this week in the British Library Digital Scholarship blog that the Library has uploaded over 1 million Public Domain images to the Flickr Commons. The images were scanned from 17th-, 18th-, and 19th-century books in the Library's physical collections, and the Flickr collection makes each of them easily available for public use. Currently the metadata for each image includes the most basic source information but nothing about the image itself. In the words of project tech lead Ben O'Steen:

We may know which book, volume and page an image was drawn from, but we know nothing about a given image. Consider the image below. The title of the work may suggest the thematic subject matter of any illustrations in the book, but it doesn't suggest how colourful and arresting these images are.

See more from this book: "Historia de las Indias de Nueva-España y islas de Tierra Firme..." (1867)

We plan to launch a crowdsourcing application at the beginning of next year, to help describe what the images portray. Our intention is to use this data to train automated classifiers that will run against the whole of the content. The data from this will be as openly licensed as is sensible (given the nature of crowdsourcing) and the code, as always, will be under an open license.

Ben went on to explain, "Which brings me to the point of this release. We are looking for new, inventive ways to navigate, find and display these 'unseen illustrations'."

Well, Ben's challenge got me thinking... What would be the value of creating a FactMiners' Fact Cloud Companion to the British Library Public Domain Image Collection?

And that's when I had my latest "Eureka Moment" about why the FactMiners social-game ecosystem is such a compelling idea (at least to me and a few others at this point :-) ). First, let me briefly describe what a Fact Cloud Companion would look like for the British Library Image Collection before exploring why this is such an exciting and potentially important idea.

A FactMiners Fact Cloud for Images: What?

When Ben laments that the Library's image collection does not know anything about the content of the individual images, I believe he 'undersold' that statement by alluding to the metadata not informing us how colorful or arresting this image is. But there is a much more significant truth underlying his statement.

Images are incredible "compressed storage" of all the "facts" (verbal assertions) that we humans instantly understand when we look at them. The image Ben referenced above, of the man in ceremonial South American tribal regalia, is chock-full of "facts" like:

  • The man is wearing a mask.
  • The man is wearing a blue tunic.
  • The man is holding a long, pointed, wavy stick.
  • The man has a feathered shield in his left hand.
  • The man is standing on a fringed rug.
  • The man has a beaded bracelet on his right arm.
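Each of these statements can be read as a small subject-predicate-object assertion. As a minimal sketch of that idea (plain Python, not a graph database, and not the actual FactMiners data model), the six facts might be written down like this. The `Fact` type, the predicate wording, and the `objects_of` helper are all assumptions made up for illustration.

```python
from collections import namedtuple

# Hypothetical shape for one "image fact": a subject-predicate-object triple.
Fact = namedtuple("Fact", ["subject", "predicate", "obj"])

# The six example facts from the list above, with predicates kept close
# to the original wording. These names are illustrative only.
facts = [
    Fact("man", "is wearing", "mask"),
    Fact("man", "is wearing", "blue tunic"),
    Fact("man", "is holding", "long, pointed, wavy stick"),
    Fact("man", "has in left hand", "feathered shield"),
    Fact("man", "is standing on", "fringed rug"),
    Fact("man", "has on right arm", "beaded bracelet"),
]


def objects_of(facts, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [f.obj for f in facts if f.subject == subject and f.predicate == predicate]


# What is the man wearing, according to the recorded facts?
worn = objects_of(facts, "man", "is wearing")
print(worn)
```

In a graph database the subject and object would typically become nodes and the predicate a relationship between them, but the triple form is enough to show what a single "fact" carries.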

I've written briefly about how an Open Source graph database, like Neo4j, is an ideal technology for capturing FactMiners' Fact Clouds. So I won't belabor the point by drilling down here on these example 'image facts' to the level of graph data insertions or related queries. Suffice to say that the means are readily available to design and capture a reasonable and useful graph database of facts/assertions about what is "seen" in the "unseen illustrations" of the British Library image collection.

Rather, I want to move on quickly to the "A-ha Moment" I had about why creating a Fact Cloud Companion to the British Library Image Collection could be a Very Good Thing.

A FactMiners Fact Cloud for Images: Why?

Every time we look at an image, our brains decompress it into an "explosion of facts." By bringing image collections into the FactMiners' "serious play arena" we are, in effect, capturing that "human image decompression" process as a sharable artifact rather than letting it remain a transient individual cognitive event. In other words, every child goes through the learning process of "seeing" what's in a picture. When these "little learning machines" do a portion of that natural childhood learning activity by playing FactMiners at the British Library Image Collection, we get a truly interesting 'by-product' in the Fact Cloud Companion.


Beyond the obvious use of a Fact Cloud for folksonomy-class applications that support public and researcher access to the source collection, a FactMiners Fact Cloud Companion of the British Library Public Domain Image Collection would be an invaluable resource for that newly emerging museum and archive visitor base... robots. Well, not the fully anthropomorphized walking/talking robots, at least not just yet. I'm thinking here more of machine-learning programs, specifically those with any form of 'image vision' capability – whether by crude file/data 'input' or real-time vision sensors.

Upon entering the British Library Image Collection, our robot/machine-learning-program visitors would find a rich 'playground' in which to hone their vision capabilities. All those Fact Cloud 'facts' about what is 'seen' in the collection's previously 'unseen images' would be available at machine-thinking/learning speed to answer the litany of questions – "What's that?", "Is that a snake?", "Is that boy under the table?" – questions that a machine-learning program might use to refine its vision capabilities.
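To make that question-and-answer loop a little more concrete, here is a minimal sketch of how a learning program's guesses might be checked against a Fact Cloud. Everything in it is hypothetical: the image ids, the `fact_cloud` dictionary, and the `is_seen` and `score_guesses` functions are invented for illustration, and the sketch makes the simplifying assumption that anything not recorded for an image is not in the image, which a real, incomplete Fact Cloud could not guarantee.

```python
# Hypothetical fact cloud: image id -> set of things people reported seeing.
# The ids and contents are made up for this sketch.
fact_cloud = {
    "image-001": {"man", "mask", "shield"},
    "image-002": {"boy", "table"},
}


def is_seen(fact_cloud, image_id, thing):
    """Answer "Is that a <thing>?" from the recorded facts for one image."""
    return thing in fact_cloud.get(image_id, set())


def score_guesses(fact_cloud, guesses):
    """Return the fraction of guesses that agree with the fact cloud.

    Each guess is (image_id, thing, predicted_present).
    """
    if not guesses:
        return 0.0
    correct = sum(
        1
        for image_id, thing, predicted in guesses
        if is_seen(fact_cloud, image_id, thing) == predicted
    )
    return correct / len(guesses)


# A learner's guesses about what it "sees" in each image.
guesses = [
    ("image-001", "mask", True),    # agrees with the fact cloud
    ("image-001", "snake", True),   # disagrees: no snake was recorded
    ("image-002", "boy", True),     # agrees
    ("image-002", "snake", False),  # agrees
]

accuracy = score_guesses(fact_cloud, guesses)
print(accuracy)
```

A score like this is the kind of feedback a learning program could use to refine its guesses, with the human-contributed facts standing in as the answer key.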

So while the primary intent of the project is making these images available for Open Culture sharing and use, there may be some equally valuable side effects of this project. The British Library Image Collection and its Fact Cloud Companion could become a "go-to" stop for any vision-capable robot or machine-learning program that aspires to better understand the world it sees.

A FactMiners Fact Cloud for Images: How?

As the good folks at the British Library well know, just getting a good folksonomy social-tagging resource developed for such a huge collection is itself no small task. This is why museums and archives, like the British Library and those collaborating in the steve project, are turning to crowdsourcing methods to get the 'heavy-lifting' of these tasks done. Crowdsourcing goes hand-in-hand with gamification in this regard. If we can't pay you to help us out, at least we can make the work fun, right?


Well, you don't have to think too hard to realize that if creating a folksonomy is a big chore, then creating a useful Fact Cloud representing at least a good chunk of the 'seen' in the previously 'unseen illustrations' of the British Library Image Collection is a Way Too Big Chore. And this might be true. But I think that there is some uniquely wonderful 'harness-able labor' to be tapped in this regard.

I know we can make a really fun app where parents and older folks can help kids learn by playing while building, fact by fact, a valuable resource at the British Library, for one. A learning child is a torrent of cognitive processing. Let a stream of that raw learning energy run through the FactMiners game at the British Library Image Collection and you'd have critical mass in a Fact Cloud faster than you can say, "Danger, Will Robinson!"

And where might this lead? Well, where this all might lead, Big Picture-wise, is beyond the scope of this post. But I can see it leading to a new, previously unimagined game to add to the mix of social games available to FactMiners players... and it's a bit of a doozy. :-)

If the British Library creates a FactMiners Fact Cloud Companion to its Image Collection, and if that Fact Cloud becomes useful to robots (machine-learning programs) as a vision-learning resource, I can see where we would want to add a 'Seeing Eye Child' Robot Adoption Agency Game to the FactMiners game plug-ins. What would that game be like?


Well, as good as an Image Collection Fact Cloud might be to learn from, and as smart as a machine-learning program might be as a learner, a robot's learning to see isn't likely to be a fully automated process. So we create a game where one or more kids 'adopt' a robot/machine-learning program to help it learn. In this case, the FactMiners player would gain experience points, badges, etc. by being available for 'vision training' sessions with the adopted robot. The FactMiners player is, in effect, the referee and coach to the robot as it learns to see.

It doesn't take much imagination to see how this could lead to schools fielding teams in contests to take a 'stock' robot/machine-learning-program and train it to enter various vision recognition challenges. And when I let my imagination run with these ideas, it gets very interesting real fast. But any run, even of one's imagination, starts with a first step.

Will we get a chance to make a Fact Cloud Companion to the British Library Image Collection? I don't know. This week the British Library took a million first steps toward making their vast digital image collection available to all for free. Perhaps the first step of posting this article will lead us on a path where we will have some serious fun working with the Library to help kids who help robots learn to see and understand our world.

--Jim Salmons--
Cedar Rapids, Iowa USA

Update: An encouraging reply of exploratory interest from the good folks at the British Library Labs has juiced my motivation to further explore the potential for the 'Seeing Eye Child' Robot Adoption Agency as a FactMiners plug-in game.