International Journal of Crowd Science, Volume 3, Issue 1



Crowdsourcing for search engines: perspectives and challenges

Mohammad Moradi
Young Researchers and Elite Club, Qazvin Branch, Islamic Azad University, Qazvin, Iran

Abstract

Purpose

As a relatively new computing paradigm, crowdsourcing has attracted enormous attention over the past decade. Its alignment with Web 2.0 principles also opens unprecedented opportunities to enhance related services and mechanisms by leveraging human intelligence and problem-solving abilities. Given the pivotal role of search engines in the Web and the information community, this paper aims to investigate the advantages and challenges of incorporating people, as intelligent agents, into search engines' workflows.

Design/methodology/approach

To highlight the role of humans in computational processes, several closely related areas are first reviewed. Then, by examining current trends in the field of crowd-powered search engines and analyzing the actual needs and requirements, the perspectives and challenges are discussed.

Findings

As research on this topic is still in its infancy, this study can serve as a roadmap for future work in the field. Accordingly, the current status and development trends are delineated through a general overview of the literature. Moreover, several recommendations for extending the applicability and efficiency of the next generation of crowd-powered search engines are presented. Awareness of the different aspects and challenges of constructing search engines of this kind can illuminate the path toward developing working systems that account for the essential considerations.

Originality/value

The present study aims to portray the big picture of crowd-powered search engines, along with the associated challenges and issues. As one of the early works to provide a comprehensive report on different aspects of the topic, it can be regarded as a reference point.

Keywords: Crowdsourcing, Information retrieval, Human-computer interaction, Search engines, Web 2.0, Human computation


Publication history

Received: 12 December 2018
Revised: 06 March 2019
Accepted: 07 March 2019
Published: 16 April 2019
Issue date: June 2019

Copyright

© The author(s)

Rights and permissions

Mohammad Moradi. Published in International Journal of Crowd Science. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode
