New Work: MoMA Exhibition Spelunker

Photograph of Rousseau installed at MoMA
Installation photograph from The Museum Collection of Painting and Sculpture exhibition, held 20 June 1945 to 13 February 1946

On September 7, 2016, The Museum of Modern Art (MoMA) in New York cracked open a new repository on the code hosting service GitHub. In addition to the release of their popular collection dataset, the museum also made the unprecedented decision to share the exhibition history of the museum (from 1929 to 1989), which also included a ton of archival material like press releases, installation photographs, and catalogues. You can see the official version on

MoMA asked Good, Form & Spectacle to make a spelunker to showcase this data, and we were thrilled to oblige.



First Steps

As with all our previous spelunkers, the first step we take is to map out a basic plan for the information and how you’ll move around in it. This is often a basic List -> Item pattern, where List can be made up from any of the main data elements. In the case of MoMA, that was Exhibitions, Roles, People/Orgs, and Departments. From each list view, you can move to a single instance of that type of thing, and from that single instance, back out into others, like from an exhibition to the year it happened, or to an artist in that exhibition. (We also get a Python project running on Heroku, and pop our own first commits on GitHub.)
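The List -> Item pattern can be sketched in a few lines of Python. This is a toy in-memory version, not the actual spelunker code or the real MoMA schema; the field names and IDs are illustrative only.

```python
# A minimal sketch of the List -> Item pattern, with a toy in-memory
# dataset standing in for the real data (names are illustrative).
DATA = {
    "exhibitions": {
        1: {"title": "Painting in Paris", "year": 1930, "people": [10]},
    },
    "people": {
        10: {"name": "Alfred H. Barr, Jr.", "roles": ["Curator"]},
    },
}

def list_view(kind):
    """List view: every item of one type (Exhibitions, People, ...)."""
    return sorted(DATA[kind])

def item_view(kind, item_id):
    """Item view: a single instance, with links back out to other lists."""
    item = dict(DATA[kind][item_id])
    # Cross-links let you hop from an exhibition to its people, and so on.
    if kind == "exhibitions":
        item["people"] = [DATA["people"][p]["name"] for p in item["people"]]
    return item
```

From `list_view("exhibitions")` you land on `item_view("exhibitions", 1)`, whose cross-links take you back out to the people involved.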

The main reason we enjoyed making the MoMA Exhibition Spelunker so much was because the data was of a different nature to a collection dataset. As well as listing each exhibition over that period, the data also shows all the people who were involved: not only the artists but also, interestingly, museum staff and other collaborators. We’d looked a little at how metadata can represent institutional dynamics in Week 1 of the What’s in the Library? project with Wellcome Library, but this was altogether more fine-grained. Who are the actors in this institution? Can you see their influence?

First Impressions

So, once we have the basic List -> Item scaffold in place, it’s really a matter of trying to answer the questions that come up for us as we poke around all the corners of the data. One of the first simple visuals we added was the basic timeline, to show quickly and clearly what happened, and when.
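The timeline boils down to counting exhibitions per opening year. A sketch of that, with a few hand-made rows standing in for MoMA’s exhibitions data (the `opened` field name is an assumption, not the dataset’s real column):

```python
from collections import Counter

# Count exhibitions per opening year; toy rows, illustrative field names.
rows = [
    {"title": "Road to Victory", "opened": "1942-05-21"},
    {"title": "Airways to Peace", "opened": "1943-07-02"},
    {"title": "The Museum and the War", "opened": "1942-10-07"},
]

per_year = Counter(int(r["opened"][:4]) for r in rows)
timeline = sorted(per_year.items())  # [(1942, 2), (1943, 1)]
```

Drawn as bars, that list is the whole graphic: one bar per year, height equal to the count.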


Straight away, even a simple graphic like this asks questions. What’s that peak in World War II? Why the dip in the 50s? Why was 1978 such a full year?

It’s important to say at this point that a lot of what I’ll write next about what I discovered in the MoMA Spelunker is pure conjecture and presumption, based on looking at this data quite a lot. It may not be true, at all. (See Open Data, Assumptions and Naïvety below.) 

The Museum & People Dynamics

We don’t often get to see or know the dynamics and politics that go on within the walls of a museum. It can be such a designed space, all about the art or objects, that often the people who made an exhibition happen are almost entirely invisible. It was a real pleasure to work on this data from MoMA specifically because it isn’t about the art, but about the people. When we first began the project, I was keen to be able to show the art, but that desire was quickly supplanted by interest in, and display of, who was doing what.

One of the visualisations we made was about directors of departments: how long they were directors, and in some cases where people went as they moved around the museum, in careers spanning 40 years. You can see in this screenshot that we can highlight a single individual as they move around too, so you can see who skipped where and when, like John Elderfield.


It was bordering on titillating to imagine who might have worked together and how such long term tenures really did shape the museum as it is today. But, it’s also crystal clear that a little map like this can reveal nothing about why people moved around. You’ll need to look in the Archives to learn those stories.

MoMA Characters Emerge

I was enjoying the way that showing this kind of data over time helps you spot blobs and trends and gaps in the data. I was interested to try to uncover whether we could show who the key staff were that really got MoMA off the ground in the early years. By taking a role – Curator – and drawing it over time, you can quickly see who was kicking things off: in this case, the founding director, Alfred H. Barr, Jr., who curated 36 exhibitions and worked there for about 40 years.


Something interesting happens when you change how that list is sorted. We can show the same data, but ordered by who curated the most exhibitions, to get a really different picture: the curators who’ve worked on the most exhibitions, and how many they made in any given year. It’s there that you start to notice the effort of curators like William S. Lieberman, who led three different departments over his career, or Dorothy C. Miller, who worked at the museum for about 30 years, and was head of the Department of Painting and Sculpture for a relatively short time, too. Did they work together? Did they like each other?


We liked this “most appearances” view much more, so we set that as the default.

This list view is cool too, when you’re looking at artists. There’s a lot of info squished into that single list view, and the overall impression is mostly that MoMA has an incredible collection full of heavyweights (and very good relationships with collectors and other museums), and shares it with the world a lot! Here are the most exhibited artists in the 1929-1989 data… could you infer Jasper Johns blasted on to the scene at some point there?


One small design thing I introduced was to show a small ♀ symbol in big lists of people, so you could spot women quickly. This revealed a little gap in the data, where gender isn’t always noted. Since perfect is the enemy of good, rather than remove the feature, I’m trying to help by slowly making additions/corrections to a copy of the data, which MoMA is welcome to. (Want to help?)

World War II and The Responsive Museum

Something that piqued my curiosity almost immediately was how the museum operated in World War II, 1939-1945. They were exhibiting a lot, and browsing 1942 and 1943 in particular, there were clearly lots of exhibitions related to the war.

I first noticed the annual exhibitions called Useful Objects of American Design under $10.00 that opened in 1939, and repeated for a few years thereafter. I found myself wondering if this was austerity-related, but then realised I didn’t know the equivalent of $10 in today’s money! Nevertheless…

Useful Objects of American Design under $10.00

Then I noticed several others that were much more specific to war, starting to emerge in 1940, with many more in 1942/1943 and beyond, like War Comes to the People: A Story Written With The Lens; National Defense Poster Competition; Art in War: OEM Purchases from a National Competition; Wartime Housing; Road to Victory; Camouflage for Civilian Defense; United Hemisphere Poster Competition; The Museum and the War; National War Poster Competition; Art Education in Wartime; War Caricatures by Hoffmeister and Peel; Airways to Peace; Magazine Cover Competition: Women in Necessary Civilian Employment; and more.

Airways to Peace (and a cool interactive dymaxion globe!)

With a little further digging beyond our dataset, I quickly discovered that “numerous exhibitions at The Museum of Modern Art were produced in collaboration with the United States government,” as it continued to exhibit the very best in modern art, including Starry Night, which MoMA acquired in 1941, and exhibited soon after.


I simply don’t know how data like ours could possibly show that Victor D’Amico chaired the Committee on Art in American Education and Society, which was established as “art education’s answer to Fascism and its contempt for creative art. We hope to mobilise the art educators and students of America, combining all their art efforts, large and small, throughout the nation to work for victory.” It’s just one or two steps away from his basic data, and you always have to stop making new columns in data at some point, don’t you?


The committee worked throughout the year to keep art on the educational agenda. There’s a lot more background about the museum and the war available in this archival finding aid online: The Museum and the War Effort: Artistic Freedom and Reporting for “The Cause”. (It’s worth a read.)

I have to say it makes me wonder how and if museums in the USA (and elsewhere) are mobilising to produce a “vast program of art activity” in this way to combat the 45th president. Ahem. In any case, I actually removed a bunch of other “this was interesting” and “I enjoyed this” links and stuff. You can find your own way!

What did the critics say?

NATURE and social satire are the themes of two current shows here. Both are important and well worth seeing.

Jacob Deschin, on Elliott Erwitt: Improbable Photographs

Remember mashups? When publicly accessible code-level interfaces (or APIs) became a thing back in the day, the fantasy was that all manner of mashups could be made to combine and recombine data from all over into compelling new presentations.

One of our early ideas for this spelunker was to try to bring in content from the fabulous New York Times archive, a treasure trove of history around New York City and its surrounds from 1851 to the present. We knew that The New York Times regularly reviews events and exhibitions at MoMA, so it was a simple step to try to combine that with the MoMA exhibition data. Luckily, too, the MoMA team had already done a lot of the work to connect exhibitions to articles, which was hugely helpful.

Could we show what The New York Times critics said about MoMA through the twentieth century? Yes! Critics such as Edward Alden Jewell, who was watching from the start in 1929, or Jacob Deschin, who often wrote about photography exhibitions from the late 1940s to the late 1960s, also form part of the fabric of the exhibition history at MoMA.

(Note that you can’t see full articles unless you’re a subscriber. We can show first paragraphs, which is a start.)

Open Data, Assumptions & Naïvety

This work for MoMA has been interestingly different from previous spelunkers we’ve made. In other projects, we’ve made exploratory interfaces into object-level metadata, which is (arguably) simply factual, representing objects and their attributes. While there may be errors or omissions in this kind of metadata, each object is as well-described as possible. Sometimes, viewing this data in the aggregate can bring insights — like seeing instantly that prints form the largest group of things by type at the Victoria & Albert Museum — but there’s really not much ‘colour’ to it.

Part of the provocation of the spelunker concept is to challenge the notion that people know what they’re looking for when they encounter a new museum’s collection, and I’m wondering if that could be extended to museum datasets. It seems to me that “drawing” this data makes it easier to hypothesise about and ask questions of than examining a big .CSV file. You follow your instincts or an image that appeals or a person you recognise or a theme you’re into, and, I think, start to form your own opinions pretty quickly. The challenge is that this metadata can be quite rich, but, at least in our experience so far, also pretty superficial, so the picture that’s drawn for you is just a surface view. Perhaps though, it’s like a physical exhibition where you don’t read any labels (or there aren’t any), and you’re left to make your own judgements, some of which may be wrong, but all are personal.

As we were fleshing out the interface to the MoMA spelunker, I found myself making all sorts of assumptions about the institutional dynamics at the museum related to who was working when, and why they might have moved around, their areas of interest or speciality, and things like that. I’d written about some of that sort of stuff on the About page, but the kind folk at MoMA archives were good enough to let me know that some of those assumptions were just plain wrong! Maybe it was more that the data we were able to display only gives you a tiny glimpse into the actual dynamics, and it’s simply a must to explore more deeply with experts or other source material. I don’t want to draw wrong conclusions, and experts in house, who live with this information day to day, surely don’t want to express things that are wrong. Of course not. I think what I’m coming around to is that these sorts of explorers help orient viewers towards questions they’d like to answer, once they’re acclimatised to the terrain. That seems good!

More broadly speaking, we’ve now made six spelunkers at G,F&S, and it’s probably about time I had a proper think about how well they’ve worked and whether they’re useful. More on that later…

The team for this project was George Oates (design, project lead), and Phil Gyford (engineering). Thanks, Phil!

Update Splat: Speaking Gigs, Visiting Researchers and Advice

Wha! A month since the last post. Sorry about that.

  • We’ve now finished up the Wellcome Library’s What’s In The Library project, a four-week blast with the team there to explore the scope and contents of the brilliant Wellcome Library collection. It turns out that in addition to being a library about the history of medicine, its tendrils reach into all kinds of other subject areas like cakes, Jamaica, English ballads, and obelisks. We also had fun in the final week putting up small advertisements inside the building on Euston Road, which led to loads of staff looking at the project. We wrote loads of progress blog posts as we worked too – you might like to glance over our That’s A Wrap post from August 12.
  • I’m thrilled to tell you that I’ve been appointed to the Advisory Board for British Library Labs, a great initiative to help more people use more stuff from one of the greatest libraries on the planet.
  • If you’re in Australia or New Zealand, look out for Museums and The Web Asia and the NZ National Digital Forum, both in October. I’ll be delivering the closing plenary at both conferences. The working title is Assumption, Attention and Articulation. I’m enjoying spending a little client-related downtime putting together the first 45-60 minute speech I’ve given for a while – it’s a nice opportunity to look back and reflect on how far we’ve come. It’ll also be nice to return to the NZ NDF, where I last spoke about the Flickr Commons back in 2008!
  • It may have passed you by that the first visiting researcher to G,F&S, Thalia Neilson, wrote a great blog post over at The Small Museum blog about her research on another small museum, The Little Museum of Dublin, and how that relates (and will absolutely inform) our work on The Small Museum as it continues. With Thalia’s visit, and another visit in the works from G,F&S collaborator, social designer and artist, Eliza Gregory, it feels like it would be good to think about creating a Visiting Researcher programme here. I know we’d benefit from different, curious, interesting visitors! If you know of good programmes along these lines, do please leave a comment.
  • The company is about six months old now. We’ve done a ton of stuff, and worked with a bunch of people, and things feel like they’re going well – *knock on wood* – After being able to work from the fab Offset Labs out of the Moo office in Shoreditch for the last few months, we’re now out of there, and it feels like it might be time to try to find a new home. Having a space that we can manipulate will accelerate our progress on The Small Museum initiative, and, well, it’ll be nice to just spread out. I wonder if we’ll have our first office space before we’re one…
  • On a personal note, as I started this new endeavour, as far back as June/July last year, I’ve had advice and counsel from so many of my friends and colleagues. Perhaps I’m of a certain age, but, I’ve felt profoundly supported and enriched by the support I’ve received about moving forward with my own passions, moving into a new (London) network from San Francisco, and reminding myself about all the things I need to be aware of as I build this new company. I’ve been fantasizing about having a new space, and a little opening party with lots of champagne and nibbles to say thank you to all of you for your support and encouragement. Richard, Elizabeth, Bill, Harriet, Adrian, Alex, Matt, Cassie, Eric, Felix, Simon, Gio, Rachel, Anno, Chris, Chris, Chris, Jenn, Alex, Frankie, Julie, Anne, Russell, Jonathan, Utku, Tom, Tom, Tom, Tom, Lina, Kathryn, Eliza, Margarette, Nick, Breandán, Annette, Kati, Delia, Miles, Julia, Gill, Nick, Ben, and… wow. See?

Internal R&D Project #4: Two Way Street

It’s pretty late on a Friday afternoon, possibly the dumbest time to launch something, but my conspirators and I decided to only work on this, this week, so we kind of have to launch it now.

The thing is called Two Way Street, and it’s a new way to explore The British Museum collection. It’s truly a museum of the world for the world, and we think Two Way Street is fantastic for looking around. Our team was George Oates, Tom Armitage, Frankie Roberto, with a cameo data-munging appearance by computer scientist, Tom Stuart. Thanks also to Harriet Maxwell and Tom Flynn for working on the (unsuccessful) proposal to NESTA for funding, Felix Ostrowski for RDF-to-JSON advice, and Barry Norton for restarting the BM SPARQL endpoint.

Two Way Street is basically an exploded view of the catalogue. Once we’d processed the big catalogue into a format that was easier for us to work with, we built just a few simple template views on top of the catalogue. We also skewed the user experience towards learning about the acquisition history of the museum. There are some really interesting trends and people involved in the formation of the institution. The British Museum was founded in 1753, and is the world’s first public museum.

Here’s the home page, where we introduce the first of a handful of visualisations, acquisitions over time, by decade.
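The by-decade bucketing behind that visualisation is simple to sketch. These are toy acquisition years, not the British Museum’s data:

```python
from collections import Counter

# Sketch of "acquisitions over time, by decade": bucket each object's
# acquisition year into its decade. Toy years for illustration.
years = [1757, 1803, 1805, 1861, 1867, 1868, 1923]

by_decade = Counter((y // 10) * 10 for y in years)
# sorted(by_decade.items()) -> [(1750, 1), (1800, 2), (1860, 3), (1920, 1)]
```

Each `(decade, count)` pair becomes one bar on the home page graphic.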


We’re also able to display a bunch of facets we selected as interesting. You can use them as leaping-off points into the collection. There’s another subtle visualisation there to show you which facets are well-understood in the metadata.


Here, you can see a list of all the people (or institutions) who found, excavated, or collected things…


Like Chloe Sayer, who found/excavated/collected 6,296 things in the later decades of the Twentieth century…


It’s a Ruby, Elasticsearch, Heroku, AWS-y thing. We’re also making use of the British Museum’s data dump from last August, and hitting their SPARQL endpoint (possibly a bit harder than everyone is used to). I like to think we’re some kind of “cultural white hats” that might actually be able to constructively help the museum to understand and develop the infrastructure it needs to serve more external development.

There’s a little more about it all on the site’s About page, if you’d like to go and have a look. Tom’s going to follow up, too, with some thoughts on using Elasticsearch instead of a database, which we all thought was pretty cool.

Sketching and Engineering

This is a guest post from Tom Armitage, our collaborator on the V&A Spelunker. It’s our second internal R&D project, and we released it last week.

Early on in the process of making the V&A Spelunker – only a few hours in – I said to George something along the lines of “I’m really trying to focus on sketching and not engineering right now”. We ended up discussing that comment at some length, and it’s sat with me throughout the project. And it’s what I wanted to think about a little now that the Spelunker is live.

For me, the first phase of any data-related project is material exploration: exploring the dataset, finding out what’s inside it, what it affords, and what it hints at. That exploration isn’t just analytical, though: we also explore the material by sketching with it, and seeing what it can do.

The V&A Spelunker is an exploration of a dataset, but it’s also very much a sketch – or a set of sketches – to see what playing with it feels like: not just an analytical understanding of the data, but also a playful exploration of what interacting with it might be like.

Sketching is about flexibility and a lack of friction. The goal is to get thoughts into the world, to explore them, to see what ideas your hand throws up autonomously. Everything that impedes that makes the sketching less effective. Similarly, everything that makes it hard to change your mind also makes it less effective. It’s why, on paper, we so often sketch with a pencil: it’s easy to rub out and change our mind with, and it also (ideally) glides easily, giving us a range of expression and tone. On paper, we move towards ink or computer-based design as our ideas become more permanent, more locked. Those techniques are often slower to change our minds about, but they’re more robust – they can be reproduced, tweaked, published.

Code is a little different: with code, we sketch in the final medium. The sketch is code, and what we eventually produce – a final iteration, or a production product – will also be code.

As such, it’s hard to balance two ways of working with the same material. Working on the Spelunker, I had to work hard to fight the battle against premature optimisation. Donald Knuth famously described premature optimisation as “the root of all evil”. I’m not sure I’d go that far, but it’s definitely an easy pit to fall into when sketching in code.

The question I end up having to answer a lot is: “when is the right time to optimise?” Some days, even in a sketch, optimisation is the right way to go. If we want to find out how many jumpers there are in the collection – well, that’s just a single COUNT query; it doesn’t matter if it’s a little slow.
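The jumpers question really is a one-liner. Here it is against a toy table in SQLite (the schema is illustrative, not the V&A’s actual one):

```python
import sqlite3

# The "how many jumpers?" question as a single COUNT query against a
# toy table; the schema here is made up for illustration.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE objects (id INTEGER, object_type TEXT)")
db.executemany("INSERT INTO objects VALUES (?, ?)",
               [(1, "jumper"), (2, "teapot"), (3, "jumper")])

(n,) = db.execute(
    "SELECT COUNT(*) FROM objects WHERE object_type = ?", ("jumper",)
).fetchone()
# n counts the jumpers; fine for a sketch, even if the real query is slow
```

No index, no denormalisation: for a one-off question during sketching, a slow full scan is a perfectly good answer.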

I have to be doubly careful of premature optimisation when collaborating, and particularly sketching, and remember that not every question or comment is a feature request. My brain often runs off of its own accord, wondering whether I should write a large chunk of code, when really, the designer in me should be just thinking about answering that question. The software-developer part of my brain ought to kick in later, when the same question has come up a few times, or when it turns out the page to answer that question is going to see regular use.

For instance, the Date Graph is also where the performance trade-offs of the Spelunker are most obvious. By which I mean: it’s quite slow.

Why is it slow?

I’d ingested the database we’d been supplied as-is, and just built code on top of it. I stored it in a MySQL database simply because we’d been given a MySQL dump. I made absolutely no decisions: I just wanted to get to data we could explore as fast as possible.

All the V&A’s catalogue data – the exact dataset we had in the MySQL dump – is also available through their excellent public API. The API returns nicely structured JSON, too, making an object’s relationships to attributes like what it’s made of really clear. A lot of this information wasn’t readily available in the MySQL database. The materials relations, for instance, had been reduced to a single comma-separated field – rather than the one-to-many relationship to another table that would have perhaps made more sense.
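Re-expanding that flattened field into the one-to-many shape the API preserves is mechanical. A sketch, with a made-up row (the field names are illustrative, not the dump’s actual columns):

```python
# The dump flattened materials into one comma-separated string; this
# re-expands it into (object, material) pairs for a proper join table.
row = {"object": "O1234", "materials": "oak, brass, leather"}

links = [
    {"object": row["object"], "material": m.strip()}
    for m in row["materials"].split(",")
    if m.strip()
]
# links -> one row per material, ready to insert into a link table
```

With a link table like this in place, the materials facet becomes an indexed join rather than a `LIKE` query against a `LONGTEXT` blob.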

I could have queried the API to get the shape of the relationships – and if we were building a product focused around looking up a limited number of objects at a time, the API would have been a great way to build on it. But to begin with, we were interested in the shape of the entire catalogue, the bird’s-eye view. The bottleneck in using the API for this would be the 1.1 million HTTP requests – one for each item; we’d be limited by the speed of our network connection, and perhaps even get throttled by the API endpoint. Having a list of the items already, in a single place – even if it was a little less rich – was going to be the easiest way to explore the whole dataset.

The MySQL database would be fine to start sketching with, even if it wasn’t as rich as the structured JSON. It was also a little slow due to the size of some of the fields – because the materials and other facets were serialized into single fields, they were often quite large field types such as LONGTEXT, which were slow to query against. Fine for sketching, but it’s not necessarily very good for production in the long-term – and were I to work further on this dataset, I think I’d buckle and either use the API data, or request a richer dump from the original source.

I ended up doing just enough denormalizing to speed up some of the facets, but that was about it in terms of performance optimisation. It hadn’t seemed worthwhile to optimise the database until I knew the sort of questions we wanted answered.

That last sentence, really, is a better answer to the question of why it is slow.

Yes, technically, it’s because the database schema isn’t quite right yet, or because there’s a better storage platform for that shape of data.

But really, the Spelunker’s mainly slow because it began as a tool to think with, a tool to generate questions. Speed wasn’t our focus on day one of this tiny project. I focused on getting to something that’d lead to more interesting questions rather than something that was quick. We had to speed it up both for our own sanity, and so that it wouldn’t croak when we showed anybody else – both of which are good reasons to optimise.

The point the Spelunker is right now turns out to be where those two things were in fine balance. We’ve got a great tool for thinking and exploring the catalogue, and it’s thrown up exactly the sort of questions we hoped it would. We’ve also begun to hit the limits of what the sketch can do without a bit more ground work: a bit more of the engineering mindset, moving to code that resembles ink rather than pencil.

“Spelunker” suggests a caving metaphor: exploring naturally occurring holes. Perhaps mining is a better metaphor, with the balance that needs to be struck when digging your own hole in the ground. The exploration, the digging, is exciting, and for a while, you can get away without supporting the hole. And then, not too early, and ideally not too late, you need to swap into another mode: propping up the hole you’ve dug. Doing the engineering necessary to make the hole robust – and to enable future exploration. It’s a challenge to do both, but by the end, I think we struck a reasonable balance in the process of making the V&A Spelunker.

If you’re an institution thinking about making your catalogue available publicly:

  • API access and data dumps are both useful to developers depending on the type of work they’re doing. Data dumps are great for getting a big picture. They can vastly reduce traffic against your API. But a rich API is useful for integrating into existing platforms, especially if they make relatively few queries per page against your API (and if you have a suitable caching strategy in place). For instance, an API is great for supplying data about a few objects to a single page somewhere else on the internet (such as a newspaper article, or an encyclopedia page).
  • If you are going to supply flat dumps, do make sure those files are as rich as the API. Try not to flatten structure or relationships that are contained in the catalogue. That’s not just to help developers write performant software faster; it’s also to help them come to an understanding of the catalogue’s shape.
  • Also, do use the formats of your flat dump files appropriately. Make sure JSON objects are of the right type, rather than just lots of strings; use XML attributes as well as element text. If you’re going to supply raw data dumps from, say, an SQL database, make sure that table relations are preserved and suitable indexes already supplied – this might not be what your cataloguing tool automatically generates!
  • Make sure to use as many non-proprietary formats as possible. A particular database’s form of SQL is nice for developers who use that software, but most developers will be at least as happy slurping JSON/CSV/XML into their own data store of choice. You might not be saving them time by supplying a more complex format, and you’ll reach a wider potential audience with more generic formats.
  • Don’t assume that CSV is irrelevant. Although it’s not as rich or immediately useful as structured data, it’s easily manipulable by non-technical members of a team in tools such as Excel or OpenRefine. It’s also a really good first port of call for just seeing what’s supplied. If you are going to supply CSV, splitting your catalogue into many smaller files is much preferable to a single, hundreds-of-megabytes file.
  • “Explorer” type interfaces are also a helpful way for a developer to learn more about the dataset before downloading it and spinning up their own code. The V&A Query Builder, for instance, already gives a developer a feel for the shape of the data, what building queries looks like, and clicking through to the full data for a single object.
  • Documentation is always welcome, however good your data and API! In particular, explaining domain-specific terms – be they specific to your own institution, or to your cataloguing platform – is incredibly helpful; not all developers have expert knowledge of the cultural sector.
  • Have a way for developers to contact somebody who knows about the public catalogue data. This isn’t just for troubleshooting; it’s also so they can show you what they’re up to. Making your catalogue available should be a net benefit to you, and making sure you have ways to capitalize on that is important!
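The CSV-splitting advice above can be sketched in a few lines with the standard library. A `StringIO` stands in for the big catalogue file here, and the chunk size is tiny for illustration:

```python
import csv
import io

# Split one big catalogue CSV into smaller chunks, repeating the header
# row in each chunk so every "file" stands alone.
big = io.StringIO("id,title\n1,Print\n2,Teapot\n3,Robe\n")
reader = csv.reader(big)
header = next(reader)

chunks, size = [], 2  # a real chunk size might be tens of thousands of rows
for i, row in enumerate(reader):
    if i % size == 0:
        chunks.append([header])
    chunks[-1].append(row)
# chunks -> two small "files": rows 1-2 and row 3, each with the header
```

In practice you’d write each chunk out with `csv.writer` to its own numbered file; the point is just that every piece carries the header and opens comfortably in Excel or OpenRefine.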

Internal R&D Project #2: V&A Spelunker

The second of our internal R&D projects, the V&A Spelunker is a new fun way to explore the corners of the vast Victoria and Albert Museum.

This is a copy of a blog post I wrote for the Victoria & Albert Museum Digital Media blog. I thought it would be nice to pop a copy here for posterity.

As part of our ongoing research practice, we’ve made a new toy to help you explore the wondrous Victoria and Albert Museum’s catalogue, the V&A Spelunker.

Spelunking is an American word for exploring natural, wild caves. You might also say caving, or potholing here in the UK. I hope using this thing we’ve made feels a bit like exploring a dark cave with a strong torch and proper boots. It’s an interface to let you wander around a big dataset, and it’s designed to show everything all at once, and importantly, to show relationships between things. Your journey is undetermined by design, defined by use.

The V&A Spelunker’s Skeleton

In some ways, the spelunker isn’t particularly about the objects in the collection — although they’re lovely and interesting — it now seems much more about the shape of the catalogue itself. You eventually end up looking at individual things, but, the experience is mainly about tumbling across connections and fossicking about in dark corners.

The bones of the spelunker are pretty straightforward. It’s trying to help you see what’s connected to what, who is connected to where, and what is most connected to where, at a very simple level. You have the home page, which shows you a small random selection of things of the same type, like these wallpapers:

You can also look around a list of a few selected facets:

And at some point, you’ll find yourself at a giant list of any objects that match whatever filter you’ve chosen, like hand-sewn things, or all the things from Istanbul:

Just yesterday, we added another view for these lists to show you any/all images a little larger, and with no metadata. It’s a lovely way to look at all the robes or Liberty & Co. Ltd. fabrics or things in storage.

If you see something of interest, you can pull up a dumb-as-a-bag-of-hammers catalogue record view which is just that. Except that it also links through to the V&A API’s JSON response for that object, which shows you some of the juicy interconnection metadata. Here’s a favourite I stumbled on:

(Incidentally, I was thrilled but slightly frightened to see this “Water cistern with detatchable drinking cup, modelled as a chained bear clasping a cub to its breast” from Nottingham in person, in the absolutely stunning ceramics gallery.)

The Beauty of an Ordered List

If you choose one of the main facets like Artist, or Place, you’ll get to a simple ordered list of results for that facet. It’s nice because you can see a lot of information about the catalogue at a glance.

You can see that the top four artists in the catalogue are Unknown (roughly 10%), Worth (as in the House of Worth, famous French couturiers), Hammond, Harry (‘founding father of music photography‘) and the lesser-known unknown.

I was curious to learn, at a glance, that most of the collection appears to come from the United Kingdom. (I might be showing my ignorance here, but this was a surprise to me.)

Here are the most common 20 places, with UK in bold:

London 59,661
England 42,178
Paris 36,890
Britain 32,388
Great Britain 27,486
France 23,540
Italy 11,562
Staffordshire (hello, Wedgwood?) 11,007
Germany 6,666
China 5,275
Europe 5,260
Japan 4,005
Royal Leamington Spa (3,857 hand-coloured fashion plates, from the ‘Pictorial History of Female Costume from the year 1798 to 1900’) 3,859
Iran 3,411
India 3,302
Jingdezhen (known for porcelain) 3,261
United Kingdom 3,098
United States 3,045
Rome 2,961
Netherlands 2,943

Catalogue Topology

Those simple sorts of views and lists start to help you make suppositions about the collection as a whole. Perhaps you can start to poke at the stories hidden in the catalogue about the institution itself. I found myself wanting to try to illustrate some other aspects of the catalogue than just its contents, and that’s when this happened…

The Date Graph

The Date Graph has three simple inputs, all date related. The V&A sets two creation dates for each object: year_start and year_end. Each record also gets a museum_number, which, as in the case of our weird bear, looks like this:


Those last four digits there normally represent the year the object was acquired. So, we snipped out that date and drew all three dates together in a big graph.
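The snipping itself is simple. Here’s a minimal sketch of the heuristic, assuming a museum number that ends in four digits; the example number and the sanity window are illustrative assumptions, not an official V&A parsing rule:

```python
import re

def acquisition_year(museum_number):
    """Pull a plausible acquisition year from the end of a museum number.

    Per the post, the final four digits of a V&A museum number usually
    represent the year the object was acquired. This is a rough heuristic,
    not an official parser.
    """
    match = re.search(r"(\d{4})$", museum_number)
    if match:
        year = int(match.group(1))
        # Sanity window: the museum opened in the 1850s, so reject
        # four-digit suffixes that clearly aren't acquisition years.
        if 1850 <= year <= 2020:
            return year
    return None

# "C.143-1909" is a hypothetical museum number in the common pattern.
print(acquisition_year("C.143-1909"))  # 1909
```

Records whose numbers don’t fit the pattern simply fall out of the graph rather than breaking it.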

The more you look around the date graph, the more you start to see what might be patterns, like this big set of stuff all collected in the same year. Often these blocks of objects are related, like prints by the same artist, or fragments of the same sculptural feature:

Some objects in the collection have very accurate creation date ranges, while some are very broad, even hundreds of years wide. The very accurate ones are often objects that have a date on them, like coins:

It’s also interesting to see how drawing a picture like this date graph can show you small glitches in the catalogue metadata itself. Now, I don’t know enough about the collections, but perhaps this sort of tool could help staff find small errors to be corrected, errors that are practically impossible to spot when you’re looking at big spreadsheets, or records represented in text. Here’s an example, from the graph that shows objects in the 2000-2014 range… see those outliers that look as if they were acquired before they were created?
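Spotting those outliers programmatically is one simple query: find records whose acquisition year predates the start of their creation range. A sketch, assuming records are dicts with illustrative `year_start` and `acquired` fields (not the real API’s field names):

```python
def acquired_before_created(records):
    """Return records whose acquisition year predates the start of their
    creation date range -- likely metadata glitches worth a human look.
    """
    return [
        r for r in records
        if r.get("acquired") is not None
        and r.get("year_start") is not None
        and r["acquired"] < r["year_start"]
    ]

# Toy data to show the shape of the check.
records = [
    {"id": 1, "year_start": 2003, "acquired": 2005},  # fine
    {"id": 2, "year_start": 2010, "acquired": 1998},  # suspicious outlier
]
print(acquired_before_created(records))  # flags record 2 only
```

A list like that could be a handy worksheet for cataloguers, since these errors are nearly invisible in a big spreadsheet.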

Asking Different Questions

I kept finding myself wondering if the Date Graph style of view could show us answers to some questions that are specific to the internal workings of the V&A. Could we answer different sorts of questions about the institution itself?

  • When do cataloguers come and go as staff? Do they have individual styles?
  • Can we see the collecting habits of individual curators?
  • Does this idea of “completeness” of records reveal how software could change the data entry habits used to make the catalogue?
  • Do required fields in a software application affect the accuracy of “tombstone” records?

A new feature I’d like to build would be a way to add extra filters on the date graph, like “show me all the jumpers acquired by the museum that were made between 1900 and 1980”.
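That kind of combined filter is easy to sketch: narrow the record list by object category and by creation date range before drawing the graph. The field names here are illustrative assumptions, not the real dataset’s:

```python
def filter_objects(records, category=None, made_between=None):
    """Keep records matching a category and whose creation range falls
    entirely inside the requested (start, end) window.
    """
    lo, hi = made_between if made_between else (None, None)
    matches = []
    for r in records:
        if category is not None and r.get("category") != category:
            continue
        if lo is not None and not (lo <= r["year_start"] and r["year_end"] <= hi):
            continue
        matches.append(r)
    return matches

# Toy records to exercise the filter.
sample = [
    {"name": "Fair Isle jumper", "category": "Jumper",
     "year_start": 1925, "year_end": 1930},
    {"name": "Court mantua", "category": "Dress",
     "year_start": 1740, "year_end": 1745},
]
print(filter_objects(sample, category="Jumper", made_between=(1900, 1980)))
```

The same shape of query would work for acquisition years too, once they’re snipped out of the museum numbers.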

It’s a Sketch

No Title by Jamini, Roy

Even though we put in a good effort to make this, it’s still a rough sketch. Now that it’s built and hanging together, there are all sorts of things we’d like to do to improve it. If anything, it’s teased out more questions than answers for us, and that’s exactly what this sort of thing should do. My collaborator, Tom Armitage, is also going to write a post over on the Good, Form & Spectacle Work Diary about “Sketching and Engineering” in a little while too, so stay tuned for that.

We hope you enjoy poking around, and we’d love to hear of any interesting discoveries you make. Please tweet your finds to us @goodformand.

Go spelunking!


theresa-going-in by Theresa – CC BY-NC-ND