Crowdsourcing Project – Summary of Crowdsourcing Literature


Reading Crowds

This post summarizes research done by Piyush Kumar, Eugenia Ortiz, Chao Xu, Ajay Roopakalu, and Peter Organisciak.

What do we know about crowdsourcing? Certainly, its historical roots in group collaboration run deep, but what of its current form? It is only in recent decades, as communications technologies became more efficient at connecting large groups of people, that we started seeing (if not yet noticing) ever larger, more dynamic creation and problem-solving by geographically dispersed people. Free software arrived, message boards arrived, wikis arrived… tech-savvy people were becoming more comfortable working together while their tools for doing so became easier to use. Yet there was still no satisfactory way of framing the big picture.

This is why, when Wired published The Rise of Crowdsourcing in 2006, the term was quickly adopted: a society was collectively finding the words to tie this phenomenon together. In the article, Jeff Howe was looking at a very particular facet, crowd-powered business, but only days after it was published, he admitted that the term had been appropriated, twisted and stretched. Crowdsourcing came to mean putting to use the wisdom of crowds that James Surowiecki had popularly argued for. It also began to be used as an umbrella for many more specific facets that observers and theorists had been noticing. Most importantly, it was a verb. Crowdsourcing is not a product, it is a means; a way to do something.

Because crowdsourcing is a tool, arguing about it is pointless without a context. The question is not "Is crowdsourcing good?" but rather "Is crowdsourcing good for this or that task?" The buzzword-free version of that question is simply: what sorts of things are good to ask large online groups to help with?

We’ve often heard crowdsourcing projects, especially the venerable Wikipedia, described as impossible in theory but possible in practice. So, as theory runs to catch up with what is being observed, my team looked at what it is saying and searched for patterns.


There are multiple ways that crowds can gather. These include:

  • Broadcast search: This type of crowdsourcing is when you broadcast a problem to a large number of people and they try to solve it. Essentially, the Internet here doesn’t change much from what you would do offline, except that you can ask more people. An example of broadcast search is contest sites (Innocentive, Crowdspring), where a party with a problem puts up a bounty for it to be solved, and the best design wins the prize. There have been different terms for this, but I find Karim Lakhani’s Broadcast Search (pdf) to be particularly apt.
  • Commons-based Peer Production: This is the term, coined by Yochai Benkler, for when people come together to create collaboratively, sidestepping traditional organizational structures.
  • Knowledge/Opinion Aggregation: When people combine what they know or experience, such as with Wikipedia. This is something truly unique to the Internet, at least on the scale that we can do it without it costing ridiculous amounts of money or bringing the project into chaos.

One consistent observation that crowdsourcing projects find is that just a small fraction of contributors make up a major fraction of contributions (Flickr Commons, Galaxy Zoo, Wikipedia, Australian Newspaper Digitisation Project). This means that, most basically, there are two types of users to think about when crowdsourcing: heavy users and casual users. The spectrum between these two is gradual, but that is what it ultimately boils down to. It’s easy to see how heavy users are assets, but the contributions of casual users, in large numbers, add up too. Also, it could be argued that if your system doesn’t cater to casual users, then there won’t be capacity to find and connect to those power users.
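To make the "small fraction, major share" observation concrete, here is a minimal sketch of how one might measure it for a project's own contribution counts. The function name `top_share` and the edit counts are hypothetical, invented purely for illustration; they are not data from any of the projects named above.

```python
def top_share(contributions, top_fraction=0.1):
    """Share of all contributions made by the most active `top_fraction` of contributors."""
    if not contributions:
        raise ValueError("contributions must not be empty")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    # Rank contributors from most to least active.
    ranked = sorted(contributions, reverse=True)
    # Always count at least one contributor as a "heavy user".
    top_n = max(1, round(len(ranked) * top_fraction))

    total = sum(ranked)
    return sum(ranked[:top_n]) / total if total else 0.0


# Made-up edit counts for ten contributors (illustrative only).
edits = [500, 120, 40, 12, 8, 6, 5, 4, 3, 2]
print(f"Top 10% of contributors made {top_share(edits, 0.1):.0%} of edits")
```

In this invented sample, one heavy user accounts for 500 of 700 edits, while the nine casual users together still supply the remaining 200, which is the "adds up too" point made above.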

How to motivate continued involvement is still being considered, and most current theory approaches the topic on a case by case basis. Considering the breadth and diversity of crowdsourcing projects, this may be an appropriate method, as it does not appear that there are all-encompassing rules for doing it right. Still, this fuzzy area is slowly growing clearer.

Most often, crowd motivation is separated into two forms: intrinsic and extrinsic. Intrinsic motivation is when one does an activity for its inherent satisfaction, such as for the fun or challenge of the activity, while extrinsic motivation is when one's actions are fueled by external rewards or pressures. These are not mutually exclusive, and multiple forms of either motivation can be present in a task.

Intrinsic motivation can be either enjoyment-based or obligation/community-based. One way enjoyment is maximized is when a person’s skill matches the challenge of a task. A task that is too difficult may cause anxiety, while one that is too easy may cause boredom (pdf). Enjoyment can also derive from accomplishing a creative task. Amabile, in proposing this link between motivation and creativity, defines a creative task as one that is heuristic (no identifiable path to a solution) rather than algorithmic (exact solutions are known), and whose outcome is novel and appropriate to the need. With obligation-based motivation, an individual is moved to act within the standards of the group within which they’re working.

The importance of extrinsic motivation, however, should not be underestimated. Money is one ever-reliable form. Projects lacking most other forms of motivation can still succeed with financial backing, albeit disingenuously. Another extrinsic motivator is the product’s utility to the contributor. For example, somebody can contribute a function to an open-source product that they themselves need. Sometimes, there are also delayed benefits to participation; Lakhani and Wolf suggest the development of skills and career advancement. Recently, the development of social graphs online has encouraged the growth in popularity of achievements, with point and badge systems that emerged from video games being applied to other types of systems. The gaming community is currently holding the discussion on achievements, but in the scope of my research, I found that achievements are nearly always a secondary motivator, strengthening the motivation of somebody already engaged rather than standing on their own.

Though it may seem obvious, the passion of the users can’t be overstated. If people like what the project is about, then they’ll participate. Galaxy Zoo recently classified its 60 millionth galaxy. Its success owes much to the interest of amateur astronomers. The Australian Newspapers Project just reported a very successful pilot of crowd tagging and text correction in a project for digitizing Australian newspapers. There, amateur genealogists – a very dedicated community – found the project and realized that they were interested in its goals, contributing en masse to it.

Social movement theory talks about “mobilization potential” (Klandermans and Oegema, Motivations and Barriers 1987): how many people could potentially contribute and how many actually do. The idea is to focus on maximizing the numbers of those mobilized in relation to the potential. Don’t think about how to catch everybody, but rather: who would be interested, and how do *they end up at the site?*

“Those that view the crowd as a cheap labour force are doomed to fail.” – Jeff Howe

Sincerity goes a long way: crowds don’t like being used. “For the common good” projects, such as charity, open-source, and education-based crowdsourcing, encourage people at least some of the way. The Library of Congress, for example, found the altruistic angle to be successful. Lakhani and Wolf’s survey of Sourceforge users, however, found that it’s not just about the intrinsic reward of feeling good about contributing; users often got something else that they needed (like money or a necessary product) out of it.

Building and engaging community is, for most projects, desirable. Howe talks about how Threadless and iStockPhoto put community first, commerce second. Consider also the example of Netflix with the Netflix Prize, a million-dollar reward for improvements to its recommendation algorithms. Netflix didn’t impose unfair, controlling rules on its participants, letting everybody keep their own intellectual property and simply license it to Netflix for the $1 million (and winners could go license it to competitors too). This may be a contributing factor to the remarkable generosity and communication that was seen among competing teams. These are but three examples, but the field is teeming with them; crowdsourcing relies on people, and thus works when participants are treated as people.

This is but a summary of what we found in reviewing the theory behind crowdsourcing. Understanding that this post touches just one facet of a sprawling area, we encourage you to post your thoughts or reactions in the comments. What do you think about crowdsourcing?