Phase 1: Sketching in code, drawing the data
About 11 months ago we started working with the web team at Wellcome Library on a project called What’s In The Library?. We were installed on site at Wellcome for a month and had been asked to explore four themes, one per week: Scope of the Catalogue, Show the Thing, Content/Context, and Scaling the Work. It was a fast-moving, exploratory month, and we thoroughly enjoyed it. You can see the work we made at whatsinthelibrary.com, and we also blogged something like once a day about what we were doing. (I love that we have this record of that now.)
For background, you could poke at the project outline from the G,F&S blog back in July 2015, or have a look at our That’s a Wrap post of Phase 1, a write-up I did of our final presentation to the staff.
It was great to see the Wellcome Library staff engage via comments on the What’s in the Library blog too, like June, who works directly with the library catalogue and surrounding metadata standards. (Emphasis mine.)
Speaking on behalf of the cataloguing team, I just wanted to say how useful we have found this work to be for data clean up. Inevitably, in a vast bibliographic database where records are drawn from a variety of external sources, errors can creep in. Thanks to the work of the project team, we have been made aware of these errors and have been able to correct most of the genuine ones by running global updates. It would be fantastic if the tools to do this continued to be available after the project has ended.
However, it should be noted that some things, such as the problematic dates, are not really errors at all. Commonly, resources present without dates and have to be recorded as ‘date unknown’ (that’s when a ‘u’ would be supplied) in order to avoid recording inaccurate information. As a rule of thumb, we feel it is better to give no information than to give the wrong information.
It is certainly true that using external data services is not the answer to everything although it can help to speed up our work enormously. Library resources are complex which means that the task of describing them is difficult and relies on cataloguer judgement more than machine processing. I hope we can bring together all of the tools and skills needed to make our collections more visible and interesting to Wellcome Library users.
I thought June’s reaction was spot on. It was great to hear that she felt there was potential in the sorts of visualisations we presented, but I could totally understand her point that “the task of describing [library resources] is difficult and relies on cataloguer judgement more than machine processing”.
There’s a very interesting challenge wrapped up in that. We were deliberately naive as we began working with the metadata, just visualising it very straightforwardly and steering clear of the intense detail that the humans have encoded in it. So often, when I ask a simple question of folks who work with this kind of data every day, their answer begins with “it depends…”, which I love. It is difficult to throw computing at this sort of subjectivity. And, I like that it is deeply subjective, even though there’s such a strong desire to work with agreed standards. One thing I continue to try to reveal is exactly that: there are humans in this work, and always have been. That simple fact implies that catalogues will inevitably be different, because we all have personal, subjective, educated ways of interpreting what’s in front of us. That’s what makes cultural heritage so special and interesting.
Phase 2: Simple performance improvements, basic editing
Anyway. We returned to the library around Christmas 2015 to continue the work. We had been very careful to describe our initial spurt of work as rough, and very much not production-ready. It was much more about sketching and exposition than making something intended for anyone other than the Wellcome team themselves. Everyone was also aware that there’s sometimes a tendency to rush “production-ising” a cool web thing, perhaps particularly in the cultural heritage sector. Even knowing that, we returned to work on making the explorer we ended up with in Week 4 a little more stable, and a little more performant. We added some monitoring and fixed some bugs that hung around after Phase 1.
Our main exploring pivots were entities that already existed in the data, but weren’t particularly surfaced in the “classic” catalogue:
- People – there are 00,000s of “actors” in the Wellcome collection. From 15th century Italian scholars to 20th century German painters to a contemporary British neurologist, surfing around the people in the system helped me to explore the beginnings of medicine as a discipline, and see subject matter experts and the things they made. No search required.
- Subjects – it was really fun to explore the Wellcome-specific subject headings. It’s quite an eclectic reflection of Wellcome’s interests in medical history, travel, and other cultures.
Frankie is going to write a little more about how we were able to construct these entities and connect them with other services like Wikipedia, LC, and VIAF, from a technical point of view.
Showing the Work
Everyone who works with a catalogue every day has great muscle memory using it, and that can make it really hard to re-imagine it from a standing start, or to witness features that new visitors don’t understand. We were keen to show some normal humans the alpha, so we did an afternoon of user research in the Viewing Room at the Wellcome Collection, luring people with the promise of free chocolate.
It was quick and cheap and hugely informative. I quite enjoyed seeing people as they found the room where the free chocolate was. It reminded me a little bit of raccoons wandering looking for tasty morsels. It’s surprising what humans will do for a) free stuff, and b) chocolate. We were able to make simple modifications that day and the next in direct response to people’s questions, and that was very satisfying.
The challenge of rigid, centralised tools
It’s difficult for the broader Wellcome team to make small modifications or corrections to the catalogue if/when they spot them, and they’re using tools they don’t specifically control, purchased from big vendors in other countries who have lots of clients and a long list of bugs to fix. This is a huge issue, and not just for Wellcome. I found myself encouraging the idea that if Wellcome was able to hire a crack team like us, they could start to assume more control over their own data, instead of off-shoring it. It’s great to see they’ve just put out a bunch of job ads looking to hire more software folks.
In any case, we wanted the site we were making (and now beginning to call “Alpha”) to provide a little flexibility. We added basic tools for staff to do editorial things like picking popular subjects so they show up on the homepage, or connecting a person to a blog post or exhibition about them. It’s not about correcting metadata yet, because Alpha is not connected to the live catalogue database, so it’s just a little way to give the web team the ability to highlight fun and timely subject matter, like Beards, Sex, and Florence Nightingale.
Phase 3: More exploring, more collections, real-time activity, “alpha”!
We left Phase 2 with a set of nice-to-have features to keep working on. It’s funny how that always seems to happen when you make software… In Phase 3, we built a set of new features:
- Collections – to show existing sets of digitised things, like a handy list of the art in the Reading Room
- Types of thing – also known as “format” although that can be a tricky term to understand, and
- New Stuff! feed – designed to illustrate all the digitisation activity at Wellcome. That turns out to be a great way to stumble on things. Thanks to long-term Wellcome collaborator, Tom Crane at Digirati, for helping us with a special hook into this feed.
Now our little alpha sketching crew had developed something which was starting to show potential for broader deployment. It was also at the point where the higher ups wanted to see that our alpha concept was actually an improvement on the existing online experience. I was a bit hesitant about trying to demonstrate this, mainly because all in all, we’d worked on our site for a total of about 6 weeks, and the main library site was a mature, long-standing product. We had to figure out a simple way to try to compare the two systems, and it seemed like it should be something about the efficiency of getting people to stuff.
One more general thing I’ve noticed in the stats for the other explorer-y projects we’ve made at Good, Form & Spectacle is that these “generous interfaces” increase people’s time on the site. What we don’t know is their state of mind as they’re exploring, but we like to think that they’re happy bouncing around looking at things that attract them, and not just looking for a search box! So, time on site is a key metric we’re using to compare the “classic” site and this new alpha. Another simple goal to try to track is about how many people see 3+ “items” using each system – that’s a work in progress. Chloe from Wellcome introduced us to Optimizely, which looks great for filtering different users into different versions of your site. We’ve set it up so you might enter the Alpha site from the existing www website to try to get more traffic into the Alpha.
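For the curious, here is a rough sketch of how those two comparisons could be worked out from a pile of session records. It is not what we actually run: the `Session` shape, its field names, and the "classic" and "alpha" labels are all assumptions made up for illustration, and a real analytics export would need mapping into something like this first.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Session:
    """One visit to either site. A hypothetical shape, not a real analytics export."""
    variant: str            # assumed labels: "classic" or "alpha"
    seconds_on_site: float  # total time spent during the visit
    items_viewed: int       # how many individual catalogue items were opened


def summarise(sessions: List[Session]) -> Dict[str, Dict[str, float]]:
    """Return mean time on site and the share of visits seeing 3+ items, per variant."""
    summary: Dict[str, Dict[str, float]] = {}
    for variant in sorted({s.variant for s in sessions}):
        group = [s for s in sessions if s.variant == variant]
        count = len(group)  # never zero: each variant comes from at least one session
        summary[variant] = {
            "sessions": count,
            "mean_seconds_on_site": sum(s.seconds_on_site for s in group) / count,
            "share_seeing_3_plus_items": sum(1 for s in group if s.items_viewed >= 3) / count,
        }
    return summary
```

Calling `summarise(sessions)` gives one small dictionary per variant, which is enough to put the classic site and the Alpha side by side. A mean is easily skewed by a few very long visits, so a median might turn out to be the fairer number.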
Finally, some questions from Jenn Phillips-Bacher
Jenn has been our key collaborator at Wellcome, and towards the end of Phase 3, we chatted about what we would each like to hear from the other. You can also have a look at Jenn’s presentation about the experience from her perspective, or follow her on Twitter @MrsAudiac.
These were Jenn’s prompts for me:
1) What’s it like to be embedded in a team?
It’s great. Getting to know everyone was a big part of this work, and to have direct access to people literally across the desk was very helpful. If there was a question about a data field or a process, we could just ask. I will say, though, I don’t think we’d set up the project particularly to analyse and/or report on what we saw in terms of the organisation and its workflows, and this is something I’ll be thinking about for future projects. I suppose it’s the blessing and curse of the consulting designer who comes up with all sorts of ideas about processes they witness but haven’t created a space to report back on observations. I know that’s a bit coy, but it’s turning into something I might try to give a talk on.
In any case, I’ve found the people I’ve met at Wellcome to be hugely welcoming, forthcoming and flexible, and that’s been a real pleasure. I’m really happy we’ve come this far with the work, and that’s largely because room has been made for it, even though it’s quite experimental.
2) What kind of data or content were you expecting?
I’ve had a little exposure to library data thanks to my job running the Open Library project a while back. We were dealing with a system that contained some 30 million records, mostly in MARC format, with lots of duplicates or empties. It was fun to see Frankie and Tom encounter library land for the first time, and, pragmatic as they both are, ask perfectly reasonable questions about why things are the way they are. That was another good part about working on site, because we were able to chat with June and Branwen, who know the catalogue backwards, and could explain its idiosyncrasies.
In terms of content, I didn’t really know what to expect. Generally speaking, the amount of available content is normally dwarfed by the available metadata in a catalogue, especially one as large as Wellcome’s. I didn’t know before I started that Wellcome has an amazing medically oriented art collection though, lovingly collected over the last 40 years or so by one person at the institution. You can get a taste for it through the digitised Art Collection view. We stumbled on loads of kooky, beautiful, old, anatomical bits and pieces as we built out the explorer. It’s a fun, unique collection.
3) How would you compare Museum vs Library generous interfaces*?
At their hearts, they’re sort of similar. They contain people, subjects, chronology, and items. A simple difference is that with libraries you end up at a book, which is a little slower to digest than a thing. Often libraries also have objects or artworks in them (like Wellcome does), and museums often hold books, so it’s often a bit of a data melange anyway.
I also enjoy that in libraries, chances are you’re describing a copy of something instead of a unique object, which is perhaps more common in, say, an art museum. I think that leads to interesting descriptive challenges to try to say that one library has a “manifestation” of this “edition” of a “work”, and another library has another “manifestation” of that same “edition” of the same “work”. And there you have a totally weird but sort of useful “conceptual entity relationship model” called Functional Requirements for Bibliographic Records (FRBR) which is trying to address this puzzle. It was also interesting to touch the surface of the Wellcome Archives in this project. We realised pretty quickly that our base interface for exploring the library was simply unsuitable for showing the more narrative descriptive style, vast hierarchies, and not necessarily single items per record that the archives data seemed to show. That’s work for another day, and I’ll be interested to try to design an explorer interface that’s specific to the nature of archives…
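To make the FRBR puzzle a bit more concrete, here is a minimal sketch of its hierarchy as nested records. In FRBR’s own vocabulary the four levels are Work, Expression, Manifestation and Item, where the Item is the individual copy a particular library holds. Everything else here is an assumption for illustration: the field names, the `holding_libraries` helper, and the idea of storing it this way at all. It is not how the Wellcome catalogue or our explorer models things.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Item:
    """A single physical or digital copy, held by one library."""
    holding_library: str
    shelfmark: str


@dataclass
class Manifestation:
    """One published embodiment, for example a particular printing."""
    publisher: str
    year: str  # kept as text so an unknown date can stay unknown
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """One realisation of the work, for example a translation."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract intellectual creation."""
    title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)


def holding_libraries(work: Work) -> Set[str]:
    """Walk down the hierarchy and collect every library holding a copy."""
    return {
        item.holding_library
        for expression in work.expressions
        for manifestation in expression.manifestations
        for item in manifestation.items
    }
```

The point of the nesting is that two libraries can each hold their own Item of the same Manifestation, so a question like “who has a copy of this work?” becomes a walk down the tree instead of a hunt for duplicate records.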
So, there you have it.
We’re really happy that what started out as a very R&D and sketching sort of project has now transformed into this Alpha. The next little while is about trying to observe how people use it, and see if there’s really really real potential to move this type of catalogue explorer into mainstream adoption.
If you’ve read all that and still possess the mettle to visit the Alpha, here’s a link, and we hope you enjoy it!
* A “generous interface” is a term coined by Mitchell Whitelaw. It’s about not forcing people to search, but helping them explore instead.