[http://www.nypl.org/collections/labs]

Ben Vershbow, New York Public Library

I had trouble distilling this. My challenges have kind of churned and mixed into the conversations we had earlier. I have tried to group them into a few categories and the challenges are around the why. Why are we doing this? There are many answers to that question depending on the project, on the discipline, on the sector that you are working with, on the communities that you are trying to engage or eventually serve. But this can come down to brass tacks: What are we doing with the data produced through these projects?

Coming from primarily a cultural heritage, digital humanities, library kind of space, but very deliberately experimenting in adjacent zones like citizen science and even journalism, I am thinking about the different reasons for doing this. Are you generating a free-standing research data set? Are you trying to return data to a corpus in order to enrich it and make it more discoverable, that round-trip question of the data? I think we are past that first “Will this even work?” or “Will anyone even do this?” phase. I think we need to start asking some very big questions about why we are doing this and how to open up a path to integrating the data back or getting it out for research and other purposes.

There are questions around the how: How are we doing this? I think this gets to questions about the team and staffing needs. What are the professional work categories that need to be created to actually support this work beyond, “Let’s try a trial phase”? I think that is something we are very much doing at NYPL, where we have demonstrated some models, we have been collaborating with various partners working in this space, and I think we have proven that there is a lot of potential here. But we are still kind of funding it and resourcing it as an additive. There isn’t a hard examination of: Wow, maybe this is what cataloging looks like, or a part of what cataloging looks like. Maybe this is what metadata teams need to do: figure out a kind of data life cycle that goes through various publicly engaged iterations and then returns for the verification, validation, and quality control paces that it needs to be put through.

Then you start to expand to: How can we leverage computational methods? How can we explore computer-human collaboration? How can we do that not just in the now but in training for more advanced computational purposes beyond this work eventually? There are just so many questions.

I think you could ask questions about new kinds of librarianship, new kinds of curatorial work, new kinds of archival work, and those need to be addressed and not just seen as additive or tacked on to largely unreconstructed institutional structures.

And very much drawing from both of those is the bigger story. How do we go beyond a very nascent “Let’s try things out, and see what kind of larger paradigm this points to,” a paradigm which may be in very faint outlines right now? Language has been thrown around about what a national crowdsourcing platform looks like, and there are platforms being built. Zooniverse already has a platform and is now building an even more robust technical platform where you can spin out projects without any technical assistance, and other platforms have emerged. When we talk about platforms from a cultural heritage space, I think there needs to be a narrative around that. I think with a lot of digital projects we’ve tried at NYPL, we’ve tried to create really compelling narratives that gesture at that why, that explain enough of the how for people to see where they’re fitting in.

I think on a project basis, a collection basis, we have seen good examples of those narratives being well done, even narratives that have just been reported. For example, when the social layer around the Shelley-Godwin Archive is built, they are already, as a sector, thinking of the national, integrated, interlinked corpora that we are trying to build. What is the big narrative around that? I think we need to figure that out. I sometimes think of this as a kind of generational process of migrating data forward, migrating knowledge to a new medium. That may be a little abstract and may not appeal to everyone, but it is a place to start.

What are we all collectively doing? What is this big public works project that we are all engaging in? Some of those bigger metanarratives might help us think more broadly, and over a longer range, about what we are building together and how to link all of those efforts together, but also about how to engage the public.

I was talking to Sharon Leon earlier today about the survey data that comes in in response to: “Why are you participating?” A lot of people articulate something very broad: “I see it as my civic duty.” That’s really interesting. Maybe we can tap into that one.

This presentation was a part of the workshop Engaging the Public: Best Practices for Crowdsourcing Across the Disciplines. See the full report here.