These are notes, thoughts and observations from reading “Mass Collaboration Systems on the World Wide Web” by AnHai Doan, Raghu Ramakrishnan, and Alon Y. Halevy.
First attempt to define MC:
A Mass Collaboration System enlists a mass of users to explicitly collaborate to build a long-lasting artifact that is beneficial to the whole community.
This definition does not allow for ‘implicit’ collaborations that arise as a side effect of some other primary purpose. It also excludes Mechanical Turk, which enlists people to carry out short, isolated tasks that are neither long-lasting nor part of an overall artifact (as far as the workers can tell).
Next attempt to define MC as a general problem-solving method:
An MC System enlists a mass of humans to help solve a problem defined by the system owners and in so doing addresses four fundamental challenges:
(1) How to recruit and retain users?
(2) What contributions can users make?
(3) How to combine user contributions to solve the target problem?
(4) How to evaluate users and their contributions?
Traffic lights coordinate the behavior of a mass of human drivers, but the goal is to reduce or coordinate drivers rather than to enlist more of them, so such a system is referred to as a Mass Management system instead.
The paper surveys MC systems on the Web that address these four fundamental challenges. It classifies them by the nature of collaboration, the type of target problem, and other dimensions.
Degree of manual effort. How much is asked of participants and of owners? Ratings vs Coding, for example. Do owners have to analyze aggregate data from users or the other way around?
Role of human users.
- Slaves - solve problem with divide-and-conquer approach while minimizing resources (Mechanical Turk).
- Perspective Providers - humans contribute different perspectives that, when combined, yield better solutions, such as writing reviews or aggregating bets to make predictions.
- Content Providers - humans are components in the target artifact as creators or community, so that the owner can monetize them (e.g. ads).
Humans can play multiple roles within a single MC system. Knowing these roles helps determine how to recruit. For example, in a perspective-provider system, you’d want cognitive diversity to avoid groupthink.
What users do.
- Evaluating - users evaluate items with textual comments and numeric scores (GoodReads)
- Sharing - users share items such as products, services, textual knowledge and structured knowledge (YouTube)
- Networking - users construct a social network graph that the owner can exploit to provide services; users typically cannot edit one another’s content (Facebook)
- Building Artifacts - user inputs are merged tightly, with users editing and merging one another’s inputs (Wikipedia)
- Executing Tasks - users engage in task completion exercises, such as searching for objects in images, folding proteins, cooperative debugging (crowd-sourcing parts)
Stand-alone vs Piggy-back systems. The latter build on a well-established system to solve a target problem. For example, using reCAPTCHAs that are partially known to figure out the missing pieces, or building a recommendation engine from customer purchases. Piggy-back systems don’t have the problem of recruiting users but still need to evaluate user inputs.
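To make the "partially known" idea concrete for myself, here is a minimal Python sketch. It assumes each challenge pairs one word whose answer is already known (the control) with one word that is not, and that a guess for the unknown word only counts when the control is typed correctly. The class name, the agreement threshold, and the case-insensitive matching are my own assumptions for illustration, not details from the paper or from the real reCAPTCHA service.

```python
from collections import Counter


class UnknownWord:
    """Collects guesses for one undeciphered word (hypothetical helper)."""

    def __init__(self, agreement_needed=3):
        # Assumed threshold: how many matching guesses settle the word.
        self.agreement_needed = agreement_needed
        self.answers = Counter()

    def submit(self, control_answer, control_truth, unknown_answer):
        """Record a guess only if the user typed the known word correctly."""
        if control_answer.strip().lower() != control_truth.strip().lower():
            return False  # failed the control word, so ignore the guess
        self.answers[unknown_answer.strip().lower()] += 1
        return True

    def resolved(self):
        """Return the agreed word, or None if there is no agreement yet."""
        if not self.answers:
            return None
        word, count = self.answers.most_common(1)[0]
        return word if count >= self.agreement_needed else None


word = UnknownWord(agreement_needed=2)
word.submit("morning", "morning", "upon")
word.submit("Morning", "morning", "Upon")
print(word.resolved())
```

The control word doubles as the evaluation step: the system never has to recruit anyone, but it still filters out inputs from users (or bots) who fail the part it can check.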
Stand-alone systems can be “games with a purpose”, in which users collaborate implicitly and solve a problem as a side effect of playing. For example, prediction markets let users bet on events and then aggregate those bets to make predictions, on the belief that there is ‘intuitive wisdom’ in the crowd. Massively multiplayer online games invite users to solve problems together in campaigns, with the aim of growing user communities.
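A toy Python sketch of aggregating bets into a prediction. Real prediction markets derive the estimate from trading prices; the stake-weighted average below is a simplification I chose for illustration, and the function name and the sample numbers are made up.

```python
def aggregate_bets(bets):
    """Stake-weighted average of the probabilities users bet on.

    bets: list of (probability, stake) pairs, one per user.
    """
    total_stake = sum(stake for _, stake in bets)
    if total_stake <= 0:
        raise ValueError("need at least one bet with a positive stake")
    return sum(p * stake for p, stake in bets) / total_stake


# Three hypothetical users betting on the same event.
bets = [(0.9, 10.0), (0.6, 30.0), (0.2, 10.0)]
crowd_estimate = aggregate_bets(bets)
print(round(crowd_estimate, 2))  # 0.58
```

Weighting by stake assumes that users who risk more are more confident, which is one possible reading of the ‘wisdom’ being tapped here.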
Challenges & Solutions
How to recruit and retain users?
- Require users to make contributions if we have the authority to do so
- Pay users
- Ask for volunteers
- Make users pay for service (directly or indirectly)
- Piggy-back on the user traces of well-established systems
- Instant gratification
- Enjoyable experience or necessary experience
- Establish, measure, and show fame/trust/reputation
- Competitions to establish leader boards/awards/prizes
- Provide ownership situations (user feels they ‘own’ part of the system and cultivates it)
What contributions can users make?
- How cognitively demanding are the contributions? Classify users into groups: guests, regulars, editors, admins, dictators, and give them tasks/access accordingly.
- What should be the impact of a contribution? In a knowledge-based system, flagging a typo has less impact than altering a policy that affects all pages. Maximize the total contribution of the few high-ranking users by asking them to make potentially high-impact contributions whenever possible.
- What about machine contributions? If algorithms are used for tasks, then have human users make contributions that are difficult for machines.
- The UI/UX should make it easy for users to contribute.
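The idea of classifying users into groups and handing out tasks by impact could be sketched in Python roughly as follows. The role names come from the list above; the action names and the minimum role assigned to each are hypothetical choices of mine, not something the paper prescribes.

```python
from enum import IntEnum


class Role(IntEnum):
    """User groups, ordered from least to most trusted."""
    GUEST = 0
    REGULAR = 1
    EDITOR = 2
    ADMIN = 3
    DICTATOR = 4


# Hypothetical mapping: the higher the impact, the higher the role required.
MIN_ROLE = {
    "flag_typo": Role.GUEST,
    "edit_page": Role.REGULAR,
    "merge_edits": Role.EDITOR,
    "block_user": Role.ADMIN,
    "change_policy": Role.DICTATOR,
}


def can_perform(role, action):
    """True if the role is ranked high enough for the action."""
    return role >= MIN_ROLE[action]


def allowed_actions(role):
    """All actions open to a role, sorted by name."""
    return sorted(a for a, needed in MIN_ROLE.items() if role >= needed)


print(allowed_actions(Role.REGULAR))  # ['edit_page', 'flag_typo']
```

Low-impact actions stay open to everyone, while the few high-ranking users are the only ones who can make the high-impact changes.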
How to combine user contributions?
What to do when users disagree? Use weightings, voting, the trustworthiness of users, manual dispute resolution, and so on to determine the outcome.
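One way these options might fit together, as a minimal Python sketch: each user's vote is weighted by a trust score, and a tie is left unresolved so it can go to manual dispute resolution. The function name, the default trust of 1.0 for unknown users, and the sample data are assumptions for illustration.

```python
from collections import defaultdict


def weighted_vote(votes, trust, default_trust=1.0):
    """Pick the answer with the highest total trust behind it.

    votes: dict mapping user -> answer
    trust: dict mapping user -> trust weight
    Returns None when there are no votes or the top answers tie,
    signalling that a human should resolve the dispute.
    """
    totals = defaultdict(float)
    for user, answer in votes.items():
        totals[answer] += trust.get(user, default_trust)
    if not totals:
        return None
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: escalate to manual dispute resolution
    return ranked[0][0]


votes = {"ann": "cat", "bob": "dog", "cy": "dog"}
trust = {"ann": 3.0, "bob": 1.0, "cy": 1.0}
print(weighted_vote(votes, trust))  # cat
```

With equal trust this reduces to plain majority voting, so "dog" would win above; the trust weights are what let one reliable user outweigh two unreliable ones.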
How to evaluate users and contributions?
Malicious users - blocking, detection, deterrence. Different roles for who can submit data and who can clean it up / manipulate it.
Use trusted monitors, distributed monitors, enlist ordinary users.
Punish malicious users: public shaming, private ratings (360-degree review by all users of one another).
Trends for the future:
- more generic platforms - MC systems that can be deployed easily into different domains
- more applications and structure - moving beyond data and services towards building databases and structured services (employing automatic techniques)
- more users and complex contributions - broader range, scope, appeal and enabling naive users to make more complex contributions (such as creating software without writing code)