This site was developed to support the Government 2.0 Taskforce, which operated from June to December 2009. The Government responded to the Government 2.0 Taskforce's report on 3 May 2010. As such, comments are now closed but you are encouraged to continue the conversation at

Recognising the volunteers: Jhempenstall is my hero – who is yours?

2009 September 29
by Nicholas Gruen

I’ve been aware for some time of the National Library’s project for digitising old Australian newspapers. But I only recently read the great story of the project told in this article by Rose Holley (pdf), who was appointed in 2007 to manage the program.

From establishing the project at the beginning of 2007 with no idea about inviting the public in to correct errors in the optical character recognition (OCR) done by machines on contract to the NLA, the project is growing into a fabulously successful venture in which unpaid volunteers from the public play a major role in correcting the errors that fancy OCR software can’t get right (though it’s much improved from an aborted attempt to digitise newspapers in 1996).

Here are some highlights from Rose Holley’s write-up.

  • In the first month of use over 200,000 lines of text was corrected in 12,000 articles; by the end of 6 months 2 million lines of text had been corrected in 100,000 articles.
  • At no point since release of beta has there been a time when text correction is not taking place. It continues 24 hours a day 7 days a week.
  • 78% of users were based in Australia but there was also a growing international community with users in the United Kingdom, United States of America, New Zealand and Canada. One of the top ten correctors was based in USA.
  • The top ten text correctors were correcting significantly more text than all other users spending up to 45 hours a week on the activity. The top corrector at the end of 6 months had corrected 101,481 lines in 2594 articles. The same correctors remained in the top five for the first 6 months.
  • No vandalism of text was detected in 6 months so no roll back to previous versions or moderation was required.

Reading about it, I’m struck by the way in which the NLA stumbled upon the idea. If they’d not got a $10 million allocation for what is undoubtedly a very worthwhile program, would the structures of the public service have been flexible enough, and would they have encouraged innovation from the ‘bottom up’ sufficiently, to have allowed something like this to emerge gradually from low-level experimentation without some imperative from above to do the project? I mention it because Wikipedia had been around for a good while before the project got going – so one imagines some people had thought of it somewhere. And what’s the best institutional arrangement to spread the skills that the NLA has acquired with this project? The NLA itself seems to be keen to spread the value of its accomplishments, posting the code it has developed, which seems a great start. But would housing less of the project within the NLA also be a good move? Might some more generic unit within government (or perhaps outside it) provide a better way of spreading those skills? I ask those questions quite naively and in full-blown enthusiasm for the achievements of the project, not by way of criticism.

But one thing I want to do here is less tentative and more specific. I want to pay tribute to the volunteers without whom, quite literally, none of this would be possible. The ten biggest contributors to the project – volunteers from outside, that is – contribute more than the nearly 1,300 other volunteers who make their own valuable contributions. (I’m one of them as of a couple of days ago, but so far I’ve only corrected a few lines!) Go and sign up yourself!

Naturally I want to pay tribute to those people out of gratitude that they’re serving the public interest. I expect you do too. Like Wikipedians, they do it for a variety of motives. For some of them it just bugs them when they can see an error! But the fact that what they are doing is of public benefit is a substantial motivator for many if not most. And when asked how the project can be made better, as Holley’s paper makes clear, many of them say things like this.

Recognize achievement - Make a point to recognize achievements one-on-one and also in group settings. We like to think we are being noticed and are making a difference. Show us how we fit into the big picture.

So that’s what I want to do here. Here is a table of the top eight contributors at the time Rose Holley published her article.

1 Jhempenstall: 101,481 lines corrected, 2594 articles

2 Cmdevine: 90,823 lines corrected, 1585 articles

3 Fwalker13: 80,437 lines corrected, 642 articles

4 Mrbh: 79,248 lines corrected, 1439 articles

5 Maurielyn: 72,129 lines corrected, 1192 articles

6 John F Hall: 59,111 lines corrected, 1632 articles

7 Jdickson2: 28,796 lines corrected, 2407 articles

8 JamesGibney: 25,106 lines corrected, 479 articles

So good on you all, you good people. Good on you, Julie Hempenstall from Bendigo, who the NLA tells me is now up to more than a quarter of a million lines. I’ve seen a list of the current top five and Julie has held her top spot, with all of the rest being stayers who were also in the top eight above. The least this Taskforce can do is to acknowledge your fantastic work. I think that one thing we (the community) should definitely do is to encourage a culture of recognition and public support and approbation for such efforts.

But of course this new world we’re in of open source endeavour is full of such people making their contribution. I wanted to invite readers to nominate other leaders in other projects who have selflessly volunteered large amounts of their time to build the public goods of Web 2.0 in Australia.

If Julie Hempenstall is my hero, who is yours?

6 Responses
  1. 2009 September 29
    Neil Henderson

    My hero is Maryantoinette Flumian (spelling might not be quite correct) – she is the deputy leader of Service Canada.
    She found out that government agencies were not finding out who had died amongst the Canadian population in a timely and accurate manner, with the result that people were being paid who were not in a position to go to the bank any more.
    So she designed a solution, which she knew government agency heads would not be comfortable with – so she took the idea to the politicians, got it funded, implemented the solution, saved the government a bundle of money and got the information in a more timely manner.
    This kernel turned into Service Canada which focuses on providing the services that citizens need irrespective of who provides them – not necessarily the same as the services government wants to provide.
    Well done Maryantoinette – I applaud your courage and success.

  2. 2009 September 29
    Brad Peterson

    I’m struck by the way in which the NLA stumbled upon the idea.

    They “stumbled” upon it by looking at Project Gutenberg, which has been doing the same thing for nearly 10 years.

  3. 2009 September 30

    Well done Rose Holley and her team and all the volunteers who make this initiative work – particularly the unnamed IT team member who actually suggested and ‘sold’ the concept of public participation.

    Librarians (in the broader sense) were some of the earliest adopters and innovators on the internet and from all of my analysis are the most active public servants online, with more blogs, networks and online initiatives than the rest of the Australian public service combined.

    Why is this the case? Librarians need an unusual mix of skills and passion. They are custodians of our nation’s knowledge, with the specific goals of preserving and sharing the best – and worst – of our culture, yet they work in a relatively low political risk environment. That makes excellent soil for supporting innovation that has difficulty taking root in other public sector environments.

    It’s no surprise to me that a disproportionate number of the online leaders in the public sector, like Rose and her team, have ties to the library and museum sector, including Bernard at Mosman City Council and Seb at the Powerhouse Museum.

    There are many examples of initiatives like Project Gutenberg in the not-for-profit and commercial sectors which government could use as inspiration for crowdsourcing and transparency initiatives.

    The challenge, for both public and private sector organisations, is getting these initiatives championed, approved, adequately funded and having internal leadership with the stamina and passion to stay the course in the face of many barriers.

    An intrinsic disadvantage in the public sector is the limit on adequate mechanisms to reward the individuals and teams who drive innovation to compensate them for their blood, sweat and tears. While the inner glow of satisfaction when serving the public good and government is a motivator, it’s not the kind of driver that sees the level of effort put into so many innovative commercial start-ups.

    Possibly a good example to draw from is the SAVE Award, the latest US government initiative – where the best cost saving initiative suggested (online) by a Federal employee will be rewarded with recognition and a meeting with the US President. The agency with the highest participation level will also be awarded – kudos for its management – thereby encouraging US Federal Agencies to support, rather than frown on, participation.

    Here are some examples that government can reflect on:

    Netflix prize – crowdsource improvements to systems, offering kudos and prize money to those who achieve significant improvements.

    Facebook translation application – crowdsource translations of content into multiple languages, using an automated comparison system to reduce errors, offering kudos for the most active translators.

    US Dept of Health and Human Services flu prevention competition – crowdsource advertising material for campaigns – in itself a way to raise public awareness. This may require modification of the Government Advertising Guidelines…

    SETI@Home – outsource the computing power required to solve major health and scientific issues through using unused cycles across millions of computers. This approach is used in a number of initiatives, such as genetic sequencing, and even if the government ran this type of project internally – across government-owned computers – it could leverage enormous computing power at low cost.

    Those are just top-of-mind ideas arrived at in a few minutes of thought. Many other examples exist that government could consider.

    As to the question in the original post – other leaders in other projects who have selflessly volunteered large amounts of their time to build the public goods of Web 2.0 in Australia.

    I’d nominate Matthew Landauer, Katherine Szuminska and the other volunteers involved in OpenAustralia who do work in this space for no financial reward whatsoever (and get little official recognition or support for their work).

    I’d also nominate Cheryl Hardy of Victoria’s eGovernment Resource Centre, managed by DIIRD, who has gone far beyond her paid job to make the Centre Australia’s best resource for aggregated information on Gov 2.0.

    Finally I’d nominate Professor Brian Fitzgerald, Tom Cochrane and the team supporting Creative Commons Australia. Web 2.0 and Gov 2.0 initiatives require modern copyright regimes to underpin their success.

  4. 2009 September 30
    simonfj

    Well, Rose is my heroine here,

    The other unsung ones are here. And don’t forget Warwick Cathro, who has been trying to break down the barriers in (the NLA at) Canberra for so long, particularly with these conferences. He’s a bit like’s Clerk of the House. Quiet, unassuming and willing to give anyone with a good idea their head, if only he could understand what the hell they’re saying.

    It helps that Rose is newish to the NLA = a Pom via NZ. Most of this progressive stuff always seems to require some new eyes moving into an established (do I hear stagnant?) culture. I can’t see how the corrections part of the newspaper digitisation project would have got a chance if she hadn’t bitten the bullet and opened it up for PLU’s.

    What Craig said about (people with a) librarians’ (training) being the leaders in the public space is quite true (like most of what he says). It’s because of the nature of their profession (and the people it attracts). They can’t help but want to open things up. The great pity is that the ones who run uni libraries aren’t as progressive as the public ones, which is why we never get the money allocated for progressive media like what youse is doing.

    They’d rather spend (our) money for a third party publisher to aggregate the journals (of their uni’s authors). They spend billions every year, rather than communicating with their global peers and doing the aggregation themselves, and adding all the other I and C stuff which we all know about.
    Oh, and by the way, they (the majority) DO insist that all authors upload their papers to the uni (open journal) web site, and then sometimes dump the lot into a National box. Bone idle marketing which reinforces a silo mentality.

    I have lots of heroes and heroines around the world who work at building National and Global subject specific groups, and breaking down the silo culture; many take the Flumian approach. It’s great, although it always worries me because it’s a bit like taking a chip out of a monolith. These progressives always get burned out so fast, and no one can compensate them enough.

    In the ‘changing a culture’ space Fred’s the best. But he has the advantage of living in a culture which is a lot more open than the Australian one – you won’t find Lifelong Learning & OER on many agendas in Oz. In the space, you, Pia and Roxanne@aph’s library are my heroes Nic. Craig and Peter Alexander are a bit alright too.

    So, as I continually say to Pia, “don’t burn out.” It’s a culture we’re trying to see changed and one can’t often point to the communication which is changing it, because not many people feel encouraged enough to talk above the radar like this. Also, there are so few open places, especially in Oz, where it could take place. It’s out of our silotic scope.

  5. 2009 October 2
    Rose Holley

    Thank you for your positive comments. I have not done this alone and would like to thank and acknowledge the whole Australian Newspapers team: Ninh Nguyen, Bronwyn Lee, Mark Raadgever, Cathy Pilgrim and most especially Kent Fitch who had the original idea of public text correction and was the lead system architect.

    My heroes are the ‘super-text correctors’: members of the public who are working in their own time, almost full time, on text correction. Julie Hempenstall is leading the field. Some of the others wish to remain anonymous so I won’t name them.

    As of early this morning our wonderful volunteers have corrected 6.23 million lines of text – quite amazing. And remember we have not yet done any serious publicity or promotion for the service, since the ‘official launch’ has not yet happened!!

    If you want to read about some other amazing people and initiatives I strongly recommend a book called “Here Comes Everybody” by Clay Shirky. Clay has been writing about the power of individual social engagement in the digital space for some time, but has recently noted that when people get together to achieve a big shared common goal the results are even more stunning. A great example of this (other than the Australian Newspapers) is the UK Guardian Expenses Scandal where the public worked flat out together to find information in digitised expenses records. Read about the lessons that were learnt – government, libraries and archives really need to wake up to opportunities like this.

    To help us wake up, an unpaid hero of mine, Liam Wyatt (VP of Wikimedia Australia), recently was the convenor of a large gathering in Canberra: the GLAM-Wiki event. People from Government, Galleries, Libraries, Archives and Museums were invited to get together with Wikimedia volunteers to discuss what we could all do for each other and things we thought we needed each other to do to enable more engagement to happen. The outcome was a list of recommendations which I hope the Gov 2.0 Taskforce is considering.

  6. 2009 October 7
    Mike Ridout

    Not sure if hero is the right word but the contributions to the Oxford English Dictionary in the late 1800s by Dr William Minor were staggering (between ten and fifteen thousand illustrative quotations apparently), especially given he was incarcerated in an asylum. I believe credit for structuring this early crowd-sourcing effort to populate the Oxford English Dictionary is due to Chenevix Trench. Simon Winchester writes about it all in ‘The Surgeon of Crowthorne’, a great read.
