Crowd Science Workshop: Remoteness, Fairness, and Mechanisms as Challenges of Data Supply by Humans for Automation

Daria Baidakova, Fabio Casati, Alexey Drutsa, Dmitry Ustalov

2020-12-11T08:00:00-08:00 - 2020-12-11T16:00:00-08:00
Abstract: Despite its obvious advantages, automation driven by machine learning and artificial intelligence carries pitfalls for the lives of millions of people: the disappearance of many well-established mass professions, and the consumption of labeled data produced by humans who are still managed through an outdated approach of full-time office work and pre-planned task types. Crowdsourcing can be considered an effective way to overcome these issues, since it gives task executors freedom over where, when, and on which task types they work. However, many potential participants in crowdsourcing processes hesitate to use this technology due to a series of doubts that have not been dispelled over the past decade.

This workshop brings together people studying research questions on

(a) quality and effectiveness in remote crowd work;
(b) fairness and quality of life at work, tackling issues such as fair task assignment, fair work conditions, and providing opportunities for growth; and
(c) economic mechanisms that incentivize quality and effectiveness for requesters while maintaining a high level of quality and fairness for crowd performers (also known as workers).

Because quality, fairness and opportunities for crowd workers are central to our workshop, we will invite a diverse group of crowd workers from a global public crowdsourcing platform to our panel-led discussion.

Workshop web site: https://research.yandex.com/workshops/crowd/neurips-2020

Paper submission portal: https://easychair.org/conferences/?conf=neurips2020crowd

All submissions must be in PDF format. The page limit is eight (8) pages for regular papers and four (4) pages for work-in-progress/vision papers. These limits cover the main content pages, including all figures and tables. Additional pages containing appendices, acknowledgements, funding disclosures, and references are allowed. You must format your submission using the NeurIPS 2020 LaTeX style file, which includes a “preprint” option for non-anonymous preprints posted online. The maximum file size for submissions is 50MB. Submissions that violate the NeurIPS style (e.g., by decreasing margins or font sizes) or page limits may be rejected without further review.

As an author, you are responsible for anonymizing your submission. In particular, you should not include author names, author affiliations, or acknowledgements in your submission and you should avoid providing any other identifying information.


Schedule

2020-12-11T08:00:00-08:00 - 2020-12-11T08:15:00-08:00
Introduction & Icebreakers
2020-12-11T08:15:00-08:00 - 2020-12-11T08:35:00-08:00
Data Excellence: Better Data for Better AI (by Lora Aroyo)
Lora Aroyo
2020-12-11T08:35:00-08:00 - 2020-12-11T08:45:00-08:00
Q&A with Lora Aroyo "Data Excellence: Better Data for Better AI"
2020-12-11T08:45:00-08:00 - 2020-12-11T09:00:00-08:00
A Gamified Crowdsourcing Framework for Data-Driven Co-creation of Policy Making and Social Foresight (by Andrea Tocchetti and Marco Brambilla)
Andrea Tocchetti
Over the last decades, communication between governments and citizens has become a notable problem. Governments' decisions are not always aligned with citizens' visions of the future, and achieving such alignment requires cooperation between communities and public institutions. It is therefore important to find ways to innovate governance and policymaking, developing new means of harnessing the potential of public engagement and participatory foresight in complex governance decisions. In this paper we propose a comprehensive framework that combines crowdsourcing and machine learning, aiming to improve the collective engagement and contribution of the crowd in policy-making decisions. Our approach brings together social networking, gamification, and data analysis practices to extract relevant and coordinated future visions concerning public policies. The framework is validated through two experiments with citizens and policy-making domain experts. The findings confirm the effectiveness of the framework's principles and provide useful feedback for future development.
2020-12-11T09:00:00-08:00 - 2020-12-11T09:05:00-08:00
Q&A with Andrea Tocchetti and Marco Brambilla "A Gamified Crowdsourcing Framework for Data-Driven Co-creation of Policy Making and Social Foresight"
2020-12-11T09:05:00-08:00 - 2020-12-11T09:20:00-08:00
Conversational Crowdsourcing (by Sihang Qiu, Ujwal Gadiraju, Alessandro Bozzon and Geert-Jan Houben)
Ujwal Gadiraju, Alessandro Bozzon
The trend toward remote work has fueled the growth of crowdsourcing marketplaces, where online workers select the tasks they prefer and complete them to get paid, while requesters design and publish tasks to acquire the data they need. The conventional user interface for a crowdsourcing task is a web page: users provide answers through HTML-based web elements, and the task-related information (including instructions and questions) is displayed on a single page. Although this conventional way of presenting tasks is straightforward, it can negatively affect workers' satisfaction and performance by causing problems such as boredom and fatigue. To address this challenge, we propose a novel paradigm, conversational crowdsourcing, which employs conversational interfaces to facilitate crowdsourcing task execution. With conversational crowdsourcing, workers receive task information as messages from a conversational agent and provide answers by sending messages back to the agent. In this vision paper, we introduce our recent work on using conversational crowdsourcing to improve worker performance and experience through novel human-computer interaction affordances. Our findings reveal that conversational crowdsourcing has important implications for improving worker satisfaction and the requester-worker relationship in crowdsourcing marketplaces.
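To make the interaction model concrete, here is a minimal sketch of a conversational task loop in Python: the agent sends one question per message and the worker replies in kind. The task content, answer options, and dialogue policy below are illustrative assumptions, not the interface studied in the paper.

```python
# A minimal conversational task loop (illustrative sketch only; not the
# authors' implementation). Task content and dialogue policy are hypothetical.
tasks = [
    {"instruction": "Does this tweet express a positive sentiment?",
     "item": "I love this new phone!",
     "options": ["yes", "no"]},
]

def run_conversation(task_list):
    """Present one question per message and collect a reply for each."""
    print("Agent: Hi! I have a few short tasks for you today.")
    answers = []
    for task in task_list:
        print(f"Agent: {task['instruction']}")
        print(f"Agent: \"{task['item']}\" ({' / '.join(task['options'])})")
        answer = input("Worker: ").strip().lower()
        while answer not in task["options"]:
            print(f"Agent: Sorry, please answer one of: {', '.join(task['options'])}.")
            answer = input("Worker: ").strip().lower()
        answers.append(answer)
        print("Agent: Thanks, got it!")
    return answers

if __name__ == "__main__":
    print(run_conversation(tasks))
```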
2020-12-11T09:20:00-08:00 - 2020-12-11T09:25:00-08:00
Q&A with Sihang Qiu, Ujwal Gadiraju, Alessandro Bozzon and Geert-Jan Houben "Conversational Crowdsourcing"
2020-12-11T09:25:00-08:00 - 2020-12-11T09:35:00-08:00
Coffee Break
2020-12-11T09:35:00-08:00 - 2020-12-11T09:55:00-08:00
Quality Control in Crowdsourcing (by Seid Muhie Yimam)
Seid Muhie Yimam
2020-12-11T09:55:00-08:00 - 2020-12-11T10:05:00-08:00
Q&A with Seid Muhie Yimam "Quality Control in Crowdsourcing"
2020-12-11T10:05:00-08:00 - 2020-12-11T10:20:00-08:00
What Can Crowd Computing Do for the Next Generation of AI Technology? (by Ujwal Gadiraju and Jie Yang)
Ujwal Gadiraju
The unprecedented rise in the adoption of artificial intelligence techniques and automation across several critical domains is concomitant with shortcomings of such technology with respect to robustness, usability, interpretability, and trustworthiness. Crowd computing offers a viable means to leverage human intelligence at scale for data creation, enrichment, and interpretation, demonstrating great potential to improve the performance of AI systems and increase the adoption of AI in general. Existing research and practice have mainly focused on leveraging crowd computing for training data creation. However, this perspective is rather limiting in terms of how AI can fully benefit from crowd computing. In this vision paper, we identify opportunities in crowd computing to propel better AI technology, and argue that to make such progress, fundamental problems need to be tackled from both computation and interaction standpoints. We discuss important research questions in both these themes, with an aim to shed light on the research needed to pave a future where humans and AI can work together seamlessly while benefiting from each other.
2020-12-11T10:20:00-08:00 - 2020-12-11T10:25:00-08:00
Q&A with Ujwal Gadiraju and Jie Yang "What Can Crowd Computing Do for the Next Generation of AI Technology?"
2020-12-11T10:25:00-08:00 - 2020-12-11T10:40:00-08:00
Real-Time Crowdsourcing of Health Data in a Low-Income Country: A Case Study of Human Data Supply on Malaria First-Line Treatment Policy Tracking in Nigeria (by Olubayo Adekanmbi, Wuraola Fisayo Oyewusi and Ezekiel Ogundepo)
Olubayo Adekanmbi, Wuraola Oyewusi
Malaria is one of the leading causes of morbidity and mortality in Nigeria despite various policy interventions to frontally address the menace. While a national malaria policy agenda mandates Artemisinin-based antimalarials as the first-line drugs of choice for treatment, monitoring its implementation across the various drug distribution layers has been challenging, particularly in the informal channels that account for over eighty percent of the antimalarial drug distribution value chain. The lack of sustained policy monitoring through a structured and systematic surveillance system can encourage irrational drug usage, trigger antimalarial drug resistance, and worsen the disease burden in an economy where over ninety percent of the population live below the poverty line. We explored real-time data collection by ordinary local residents, who use low-cost smartphones with an on-device app to run quick mystery shopping at drug outlets and check the recommended malaria treatment drugs in four (4) states across the country. The instant survey data is collected via guided mystery shopping, which requires the volunteer participants to answer three basic questions after a 5-10 minute in-store observation. Each submission is verified with a picture of the drug store and auto-generated location coordinates. The antimalarial policy compliance level is immediately determined and can be anonymously aggregated into a national map for onward sharing with pharmaceutical trade groups, government agencies and non-profits for immediate intervention via requisite stakeholder education. This crowdsourcing effort provides an affordable option that can be scaled up to support healthcare surveillance and effective policy compliance tracking in developing nations, where data is scarce as a result of high illiteracy and infrastructural inadequacy.
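The submission protocol described above (three answers, a photo, and auto-generated coordinates) can be sketched as a simple record plus two checks. All field names and the compliance rule below are hypothetical stand-ins, not the authors' actual schema.

```python
# Illustrative sketch of a mystery-shopping submission and its verification.
# Field names and the compliance rule are assumptions, not the real schema.
from dataclasses import dataclass

RECOMMENDED = "artemisinin-based combination therapy"

@dataclass
class Submission:
    outlet_photo: str   # path to the drug store picture
    latitude: float     # auto-generated location coordinates
    longitude: float
    answers: dict       # the three basic survey questions

def is_verified(s: Submission) -> bool:
    # A submission counts only if it carries both a photo and coordinates.
    return bool(s.outlet_photo) and s.latitude is not None and s.longitude is not None

def is_compliant(s: Submission) -> bool:
    # Compliance: was the recommended first-line drug the one offered?
    return s.answers.get("first_line_drug_offered", "").lower() == RECOMMENDED

submissions = [
    Submission("outlet_042.jpg", 6.5244, 3.3792,
               {"first_line_drug_offered": "Artemisinin-based combination therapy"}),
]
verified = [s for s in submissions if is_verified(s)]
rate = sum(is_compliant(s) for s in verified) / len(verified)
print(f"Policy compliance rate: {rate:.0%}")
```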
2020-12-11T10:40:00-08:00 - 2020-12-11T10:45:00-08:00
Q&A with Olubayo Adekanmbi, Wuraola Fisayo Oyewusi and Ezekiel Ogundepo: "Real-Time Crowdsourcing of Health Data in a Low-Income country: A case study of Human Data Supply on Malaria first-line treatment policy tracking in Nigeria"
2020-12-11T10:45:00-08:00 - 2020-12-11T11:00:00-08:00
Coffee Break
2020-12-11T11:00:00-08:00 - 2020-12-11T12:30:00-08:00
Panel Discussion "Successes and failures in crowdsourcing: experiences from work providers, performers and platforms"
2020-12-11T12:30:00-08:00 - 2020-12-11T13:00:00-08:00
Lunch Break
2020-12-11T13:00:00-08:00 - 2020-12-11T13:20:00-08:00
Modeling and Aggregation of Complex Annotations Via Annotation Distance (by Matt Lease)
Matt Lease
2020-12-11T13:20:00-08:00 - 2020-12-11T13:30:00-08:00
Q&A with Matt Lease: "Modeling and Aggregation of Complex Annotations Via Annotation Distance"
2020-12-11T13:30:00-08:00 - 2020-12-11T13:45:00-08:00
Active Learning from Crowd in Item Screening (by Evgeny Krivosheev, Burcu Sayin, Alessandro Bozzon and Zoltán Szlávik)
Evgeny Krivosheev, Burcu Sayin Günel, Alessandro Bozzon, Zoltan Szlavik
In this paper, we explore how to efficiently combine crowdsourcing and machine intelligence for the problem of document screening, where we need to screen a finite number of documents with a set of machine-learning filters. Specifically, we focus on building a set of machine learning classifiers that evaluate documents and then screen them efficiently. This is a challenging task, since the budget is limited and there are countless ways to spend it on the problem. We propose objective-aware sampling, a screening-specific multi-label active learning technique for selecting unlabeled documents for annotation. Our algorithm decides which machine filter needs more training data and how to choose unlabeled items to annotate so as to minimize the risk of overall classification errors, rather than minimizing the error of a single filter. Our results demonstrate that objective-aware sampling significantly outperforms state-of-the-art sampling strategies on multi-filter classification problems.
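As a rough illustration of the idea (an interpretation of the abstract, not the authors' exact algorithm), the sketch below targets the filter whose predictions are most uncertain over the unlabeled pool, since a document passes the screen only if it passes every filter. The scikit-learn-style predict_proba interface and the stub classifier are assumptions for the demo.

```python
# Sketch of one objective-aware sampling step (illustrative interpretation).
import numpy as np

def objective_aware_step(filters, unlabeled_pool, batch_size=10):
    """filters: dict name -> classifier exposing predict_proba (sklearn-style).
    unlabeled_pool: 2-D feature array. Returns (filter to train, item indices)."""
    uncertainties = {}
    for name, clf in filters.items():
        # Probability of the positive class for each unlabeled item.
        proba = clf.predict_proba(unlabeled_pool)[:, 1]
        # Map to [0, 1], where 1 means the filter is maximally unsure.
        uncertainties[name] = 1.0 - 2.0 * np.abs(proba - 0.5)
    # An item is screened out if ANY filter rejects it, so the filter that is
    # most uncertain over the pool contributes most to the joint error.
    target = max(uncertainties, key=lambda n: uncertainties[n].mean())
    # Query the items that filter is least sure about.
    query_idx = np.argsort(-uncertainties[target])[:batch_size]
    return target, query_idx

class StubFilter:
    """Hypothetical classifier stub so the sketch runs end to end."""
    def __init__(self, seed):
        self.rng = np.random.default_rng(seed)
    def predict_proba(self, X):
        p = self.rng.uniform(size=len(X))
        return np.column_stack([1 - p, p])

pool = np.zeros((100, 5))
name, idx = objective_aware_step({"relevance": StubFilter(0), "novelty": StubFilter(1)}, pool, 5)
print(f"Train filter '{name}' on items {idx.tolist()}")
```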
2020-12-11T13:45:00-08:00 - 2020-12-11T13:50:00-08:00
Q&A with Evgeny Krivosheev, Burcu Sayin, Alessandro Bozzon and Zoltán Szlávik: "Active Learning from Crowd in Item Screening"
2020-12-11T13:50:00-08:00 - 2020-12-11T14:05:00-08:00
Human Computation Requires and Enables a New Approach to Ethics (by Libuse Veprek, Patricia Seymour and Pietro Michelucci)
Libuše Vepřek, Pietro Michelucci
With humans increasingly serving as computational elements in distributed information processing systems, and in consideration of the profit-driven motives and potential inequities that might accompany the emerging thinking economy, we recognize the need to establish a set of related ethics to ensure the fair treatment and wellbeing of online cognitive laborers and the conscientious use of the capabilities to which they contribute. Toward this end, we first describe human-in-the-loop computing in the context of the new concerns it raises that are not addressed by traditional ethical research standards. We then describe shortcomings in the traditional approach to ethical review, as well as a dynamic approach for sustaining an ethical framework that can continue to evolve within the rapidly shifting context of disruptive new technologies.
2020-12-11T14:05:00-08:00 - 2020-12-11T14:10:00-08:00
Q&A with Libuse Veprek, Patricia Seymour and Pietro Michelucci: "Human computation requires and enables a new approach to ethics"
2020-12-11T14:10:00-08:00 - 2020-12-11T14:20:00-08:00
Coffee Break
2020-12-11T14:20:00-08:00 - 2020-12-11T14:40:00-08:00
Bias in Human-in-the-Loop Artificial Intelligence (by Gianluca Demartini)
Gianluca Demartini
2020-12-11T14:40:00-08:00 - 2020-12-11T14:50:00-08:00
Q&A with Gianluca Demartini: "Bias in Human-in-the-Loop Artificial Intelligence"
2020-12-11T14:50:00-08:00 - 2020-12-11T15:05:00-08:00
VAIDA: An Educative Benchmark Creation Paradigm using Visual Analytics for Interactively Discouraging Artifacts (by Anjana Arunkumar, Swaroop Mishra, Bhavdeep Sachdeva, Chitta Baral and Chris Bryan)
Anjana Arunkumar, Swaroop Mishra, Chitta Baral
We present VAIDA, a novel benchmark creation paradigm (BCP) for NLP. VAIDA provides real-time feedback to crowdworkers about the quality of samples as they are being created, educating them about potential artifacts and allowing them to update samples to remove those artifacts. Concurrently, VAIDA supports backend analysts in reviewing and approving submitted samples for benchmark inclusion, analyzing the overall quality of the dataset, and resampling splits to obtain and freeze the optimum state. VAIDA is domain, model, task, and metric agnostic, and constitutes a paradigm shift toward robust, validated, and dynamic benchmark creation via human-and-metric-in-the-loop workflows. We demonstrate VAIDA's effectiveness by leveraging DQI (a data quality metric) over four datasets. We further evaluate via expert review and a user study with NASA TLX. We find that VAIDA decreases the mental demand, temporal demand, effort, and frustration of crowdworkers (by 29.7%) and analysts (by 12.1%); it increases performance by 30.8% and 26%, respectively.
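The real-time feedback loop can be pictured as a metric gate on each submitted sample. The scoring function below is a hypothetical stand-in for DQI (the paper's actual metric), and the threshold and feedback wording are assumptions for illustration.

```python
# Sketch of metric-gated, real-time feedback during sample creation.
# dqi_stub is a hypothetical stand-in for the DQI metric used in the paper.
QUALITY_THRESHOLD = 0.7

def dqi_stub(sample: str) -> float:
    """Penalize very short samples and 'give-away' words a model could exploit."""
    giveaways = {"always", "never", "not"}
    words = sample.lower().split()
    if not words:
        return 0.0
    length_score = min(len(words) / 15.0, 1.0)
    artifact_penalty = sum(w in giveaways for w in words) / len(words)
    return max(length_score - artifact_penalty, 0.0)

def review(sample: str) -> bool:
    """Accept the sample, or tell the worker why it was held back."""
    score = dqi_stub(sample)
    if score >= QUALITY_THRESHOLD:
        return True  # forwarded to analysts for review and approval
    print(f"Quality {score:.2f} < {QUALITY_THRESHOLD}: rephrase to avoid "
          "give-away words or add detail, then resubmit.")
    return False

review("The answer is not always B.")  # rejected, with feedback
review("A twelve-word premise sentence that gives the model zero shortcuts today.")
```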
2020-12-11T15:05:00-08:00 - 2020-12-11T15:10:00-08:00
Q&A with Anjana Arunkumar, Swaroop Mishra, Bhavdeep Sachdeva, Chitta Baral and Chris Bryan: "VAIDA: An Educative Benchmark Creation Paradigm using Visual Analytics for Interactively Discouraging Artifacts"
2020-12-11T15:10:00-08:00 - 2020-12-11T15:30:00-08:00
Secret Invited Talk
Praveen Paritosh
2020-12-11T15:30:00-08:00 - 2020-12-11T15:40:00-08:00
Q&A
2020-12-11T15:40:00-08:00 - 2020-12-11T16:00:00-08:00
Closing