"On Google Scholar, the name xeno-canto appears in 5,000 papers"
How volunteering took wing: xeno-canto and the power of open data
A website with recordings of nature sounds, xeno-canto, was one of the three winners of the Dutch Data Prize 2024. Initiator Bob Planqué tells how he and his partner in the project, Willem-Pier Vellinga, made the website a showcase of FAIRness. "We have become a kind of purveyor of recordings with extensive metadata."
The winning research datasets stand out in terms of findability, accessibility, interoperability and reusability (FAIR). After xeno-canto received the Dutch Data Prize in the Life Sciences and Health category, congratulations poured in from all over the world. "That was fun," says Planqué. "Then it becomes apparent how big a field we serve with our site." This success is the result of almost two decades of work. Unlike most candidates for the Dutch Data Prize, which have been up and running for a few years at most, xeno-canto has been around since 2005.
Shoeboxes of cassette tapes
It was, and is, mainly voluntary work. Planqué and his associate Willem-Pier Vellinga work in STEM fields, but in their spare time they study bird sounds. "In those days, you still bought CDs, sometimes DVDs, with bird recordings. Many people also had a shoebox of self-recorded cassette tapes. The problem was that the more recordings you collected, the less you got out of them. Because, especially in the tropics, you are dealing with hundreds of species. Are you going to listen to all those recordings one by one to know which bird you are hearing?"
"At that time, a revolutionary product was published by another Dutchman, the now deceased Sjoerd Mayer. He programmed software that allowed you to unfold those sounds, as it were: you could scroll through them and compare sounds. But that only worked with limited collections on DVD."
Critical mass
"I had learned a bit of programming. We thought: we'll just make something for the web. Then we'll put our own recordings on there and send an email around to other bird watchers: who's in?"
That set-up proved unexpectedly successful. "Pretty soon, Mayer and two of his collaborators proved willing to add their collections. With that, we immediately had 40 per cent of the four thousand species of South America! So we were immediately relevant. And everything was available for free to anyone with an Internet connection." Xeno-canto had reached its critical mass. "People thought: now I can finally do something with that old shoebox full of tapes. That's how the ball started rolling."
List of principles
A few years later, the open nature of xeno-canto was formalised. "For a conference in Brazil, I was invited to talk about the project. I thought: maybe it would be nice to draw up a list of principles. What do we stand for? One of those principles is that we use Creative Commons licences. Recordings that are on our website are always freely shared with others. That's what we call FAIR these days."
"Xeno-canto is a digital project that's not growing as fast as the Internet"
"Another principle: there is no authority. Anyone can always participate in discussions about recordings, even those who do not provide recordings themselves. You have to be a member, because we want to know who you are. But everything can always be questioned. And as long as there is discussion about the identification of a species, the recording in question will be set aside."
People with little ego
Membership of xeno-canto is free of charge. This means that birders from less wealthy countries can also join. That's what Planqué is most proud of: "Those people can post their recordings with us and have their say. So they can make a name for themselves internationally as experts on their own fauna."
This set-up has created a very egalitarian atmosphere at xeno-canto, Planqué says. "There are no validators who have the final say on identifications. Members are people with relatively little ego, because they know that their authority might be challenged. In fact, our forum functions as a peer review, with all members being peers. Of course, mistakes will sometimes occur, but we know that our error rate is no different from that elsewhere."
Systematic looting
Twenty years of equality and openness have paid off. Xeno-canto is heavily used, not only by enthusiasts but also for scientific research. "On Google Scholar, the name xeno-canto appears in 5,000 papers," Planqué mentions with satisfaction.
A recent, fast-growing application is machine learning: algorithms learn to recognise birds automatically from the recordings on xeno-canto. "In the field of nature sounds, we have become a kind of purveyor of recordings with extensive metadata. That's why many groups around the world come to us."
"If we had collected moving images instead of sound, we wouldn't exist anymore"
So xeno-canto was already ready for the future? "In fact, it was. Our site has been accessible to software for a long time, through an API we provide. So you can systematically loot the site; we allow that. Our aim now is simply to facilitate the work of others. The better we can do that, the more success we have." The algorithms trained in this way can be found in mobile phone apps such as BirdNET. With these, you can see what you hear while walking through the forest or standing on your balcony.
Misguided frugality
Are there things that worry Planqué? "Yes. In science, it is almost impossible to get money to maintain collections like xeno-canto. That's a global problem. If you have a good idea for a collection, you usually manage to get it off the ground. But after that, the problems start. People also foolishly fail to realise that collections are often irreplaceable. They represent enormous capital and that can disappear in the blink of an eye."
That xeno-canto has existed for almost 20 years is partly due to the support of Naturalis, which pays for hosting and maintenance, although that institution's own collection management also suffers from misguided frugality among funders.
Lucky break
The second explanation for xeno-canto's survival is technological: "We have been around for so long because we cost so little. Xeno-canto is a digital project that is not growing as fast as the Internet. By the time we need more storage, for example, that storage has become correspondingly cheaper. Few projects can be run this way, so we have been lucky. If we had collected moving images instead of sound, we wouldn't exist now."
"Being nominated for the Dutch Data Prize puts you in a stronger position with funders"
Partly for this reason, Planqué strongly advises anyone with a data collection to enter the Dutch Data Prize. "If you are nominated, you are in a stronger position. The people who decide about the money are more likely to realise that your work is important."
Prize money
Of course, it's also nice to be able to take home the prize money. What will xeno-canto do with that €3,500? Planqué: "In principle, the money is intended to make the collection FAIR, but xeno-canto is already maximally open. However, we also have plans for which we are working with partners abroad. A lot of consultation can be done via Zoom, but every now and then you still need to speak to people in person. We can pay for those trips more easily with the prize money now."
What do these plans entail? "Machine learning! Together with the people at Naturalis, we are training our own algorithm. With this, we hope to be able to dive much deeper into the data in the coming years. Which sounds are very similar and which are not? What song types can be distinguished and what geographical variation is there in bird song? In this way, xeno-canto can give a huge boost to this area of science."
Text: Aad van de Wijngaart
Photos: Vera Duivenvoorden
Bob Planqué (1977)
2000 - 2004: PhD at Centre for Mathematics & Computer Science
2005 - 2006: research associate at University of Bristol
2006 - present: successively lecturer and senior lecturer at Vrije Universiteit (Mathematics Department)
2023 - present: member of Amsterdam Center for Dynamics and Computation
2005 - present: xeno-canto