Distributed computing by leveraging and rewarding idling user resources from P2P networks
Introduction
Computer science applications are changing deeply: the cost of hardware components is continuously decreasing [1] while computing performance is reaching unprecedented levels. The availability of a plethora of powerful (even portable) computing devices has enabled new high performance computing tasks and new data management paradigms (e.g., the well known Big Data metaphor [2], [3], [4], [5]). Moreover, advances in computer science allow many problems to be solved quite effectively in terms of both computational time and resource usage. However, several problems still require an amount of computational resources that goes far beyond the power of a single device or a single user. To solve these kinds of problems, a new computing paradigm was born: crowdsourcing. This paradigm is based on the idea of gathering the resources needed to complete a task from the crowd, in order to parallelize its execution.
Many attempts have been made to properly define the key features of crowdsourcing systems; however, the answer to this apparently simple question is not straightforward, since there exist many different crowdsourcing systems based on different models and assumptions. If we look for the features shared among successful crowdsourcing systems (e.g., Wikipedia, Yahoo! Answers, Amazon Mechanical Turk), we can see that they all rely on some common assumptions:
- They should be able to involve project contributors;
- Each contributor should solve a specific task;
- It is mandatory to effectively evaluate single contributions;
- They should properly react to possible misconduct.
As we can see from the examples above, many systems are based on the explicit collaboration of many different people who share the will to build a long-lasting product usable by the whole community. Aside from this human-centric view, there exists a plethora of systems that leverage implicit cooperation among users (e.g., multiplayer games) or tools that are not devoted to the production of a tangible object (e.g., Amazon Mechanical Turk).
A well known open source framework that is widely used (mainly) for scientific purposes is BOINC (Berkeley Open Infrastructure for Network Computing) [6]. It allows volunteers to contribute to a wide variety of projects; their contribution is rewarded with credits used to climb a leaderboard. In recent years a new category of collaborative approaches for cryptocurrency mining has emerged, such as Bitcoin [7], built on blockchain technology. Users aiming to mine new Bitcoins contribute to solving a proof-of-work task and are rewarded with a portion of the money gathered, proportional to the effort put into the mining task. The latter approach is gaining a lot of attention from end users mainly because it allows them to earn a tangible reward.
In this paper we present Coremuniti™ (short for Community of Cores), a system inspired by the collaborative model used in BOINC that implements an ad hoc rewarding strategy similar to Bitcoin mining. Since, in principle, we do not require any specific user skills (users can join the network simply by providing their under-used computational resources), our approach can be seen as a hybrid crowd, where tasks can be solved by computer-based resources [8]. Our proposal received a grant from the Calabria region that fully funded the early start-up costs. Moreover, the EU commission awarded our project a Seal of Excellence. We are currently in the beta testing phase for commercial use.
More in detail, the novelty of our project lies in the design of a peer to peer framework able to provide services at much lower prices than centralized server farms, by exploiting idle computational resources from the users joining the network.
Our software can be used in several application scenarios, e.g., computer simulation and advanced data analysis, and it is well suited for vertical implementations of computing intensive tasks, representing a trans-disciplinary opportunity. More specifically, our approach can be a valid alternative to traditional solutions, such as buying or renting expensive dedicated servers. Furthermore, in many cases, even with powerful dedicated servers, the time needed to solve a problem is still too high, because the subtasks composing the problem are not parallelized at all. Our approach is based on a high performance Peer to Peer (P2P) network composed of computational resources shared by the users of the network itself. Each node of the network (i.e., each user in the crowd) can set the amount of resources to share. When a peer needs to execute a resource intensive task, she can request the necessary computational power from the network. The process takes a few clicks, making our software quite user friendly. In order to assess the effectiveness of our solution, we analyzed the 3D rendering scenario, which turns out to be a severe test bench for our technology. We developed a specialized plugin named Mozaiko™ (we chose this name because our approach splits a complex task into several sub-tasks that are re-assembled like mosaic tiles) that allows users to render Blender 3D models on our distributed network. The rendering process typically engages users' computers for a considerable amount of time. Our experimental analysis shows that existing solutions are slower and more expensive than our proposal. Moreover, our system does not require frequent hardware purchases, since users continuously provide (up to date) computational power. Finally, the better re-use of already powered resources could induce a beneficial systemic effect by reducing the overall energy consumption of complex task execution.
We motivate this conjecture in the future work discussion, as at this stage we are not able to generalize our early results.
P2P networks share a common goal: the resources of many users and computers can be combined to significantly increase the computing power available to users and to parallelize task execution. In "full" P2P networks, each computer communicates directly with every other, allowing better bandwidth use. However, P2P solutions have some inherent drawbacks and, in many cases, some functionality needs to be centralized. Such systems, which can be used both for data sharing [9] and distributed computation [10], are denoted as "hybrid" P2P. Coremuniti falls into the latter category and aims to build a P2P network where users can share their unexploited computing resources. In Fig. 1 we sketch a possible usage scenario for our platform when using the Mozaiko plugin. We point out, however, that our platform is general purpose and can thus be used to solve any complex, parallelizable problem.
In order to join our network, a user downloads the (platform independent) Coremuniti Server software. By running this software the user becomes a server node of our network, denoted in Fig. 1 as Node Server Agent. Our software does not interfere with other applications running on the computer, and the user can easily set the amount of CPU they want to share with the network, so Coremuniti Server can be adapted to everyone's needs. On the opposite side, users who need additional computational power, for example to complete computing intensive tasks, install the specific software denoted in Fig. 1 as Node Client Agent (i.e., Mozaiko for our case study). To start a new task, they simply issue a request to the network in order to gather the required resources. Submitting a new task costs the user a number of credits proportional to the complexity of the task itself (see Section 4). Since each node of the network can act both as a server and as a client, when submitting a new task two cases may occur:
1. the user has previously earned (a portion of) the required credits for running the task (e.g., because they acted as a server), or
2. they bought the required credits.
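The credit check at submission time can be sketched as follows. This is an illustrative fragment, not Coremuniti's actual API: the function name and parameters are our own, and the paper does not specify how credits are stored.

```python
# Hypothetical sketch of the credit check performed when a task is submitted.
# A task may start only when earned plus purchased credits cover its cost,
# which is proportional to the task's complexity.

def can_submit(earned_credits: int, purchased_credits: int, task_cost: int) -> bool:
    """Return True when the user's credit balance covers the task cost."""
    return earned_credits + purchased_credits >= task_cost

# A user who earned 30 credits as a server and bought 20 more
# can afford a task costing 45 credits.
print(can_submit(30, 20, 45))  # True
```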
In order to guarantee a high level of service, the central server is responsible for performing the subtask assignment. More in detail, to take full advantage of the capabilities of our network, we partition the (possibly huge) initial task into an adequate number of (much smaller) subtasks that can be quickly executed by the server peers. Moreover, we built an internal company network of 80 peers that can be used when the number of available public peers is not sufficient to guarantee proper execution of user tasks.
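The partitioning step just described can be sketched as follows. The function name, the even split, and the fallback threshold are illustrative assumptions; only the 80-peer internal pool comes from the text, and the production partitioner described in Section 4 is more sophisticated.

```python
# Illustrative sketch (not the production algorithm): split a task of
# `total_units` work units into roughly equal subtasks, one batch per
# available server peer, falling back to the 80-peer internal company
# pool when public peers are scarce (`min_peers` is an assumed threshold).

def partition(total_units: int, public_peers: int, internal_peers: int = 80,
              min_peers: int = 8) -> list[int]:
    peers = public_peers if public_peers >= min_peers else public_peers + internal_peers
    base, extra = divmod(total_units, peers)
    # the first `extra` peers get one extra unit so all units are covered
    return [base + 1 if i < extra else base for i in range(peers)]

chunks = partition(1000, public_peers=3)   # only 3 public peers: internal pool kicks in
print(len(chunks), sum(chunks))            # 83 1000
```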
As soon as subtasks are completed, we check their correctness and reward the participating peers. We will describe our model for subtask assignment, which guarantees efficient execution for clients and gives all peers (even those with limited computational power) the possibility of being rewarded. Interestingly, even users who do not intend to ask for task execution have the chance to earn credits that can be redeemed for cash or gadgets. The latter feature makes Coremuniti a more convenient choice than other collaborative systems such as cryptocurrency mining (as shown in the experimental evaluation section). In the early stage of development of our system, we performed a preliminary analysis by asking 500 students at the University of Calabria about their interest in joining the network. After we explained our rewarding model, all of them agreed to join and they are currently beta testing the software.
Our Contribution. To summarize, we make the following major contributions:
- We design and implement a hybrid P2P infrastructure named Coremuniti that enables collaboration among users by sharing unexploited computational resources. In particular, by running our software, users can join the Coremuniti network and either provide or request computational resources;
- We define a robust model for task partitioning, assignment and user rewarding. More in detail, users that are available for task execution are assigned a suitable set of (sub-)tasks. When the execution is completed, we reward users with credits that can later be redeemed for computational resources, cash or gifts;
- We discuss the 3D rendering case study to prove the effectiveness of our approach. We measured our performance against popular 3D rendering services. Moreover, we also compared our performance against other systems that reward users for their effort in solving computationally expensive tasks.
The rest of the paper is organized as follows. Section 2 provides a comprehensive overview of existing crowdsourcing proposals. In Section 3 we present our system architecture. In Section 4, we describe our mathematical model for subtask assignment. Section 5 is devoted to the description of a challenging use case scenario for our framework. Section 6 discusses the results we obtained with our approach. Finally, in Section 7 we draw our conclusions and discuss future work.
Related work
The word Crowdsourcing was first introduced by Howe [12] to define the process of outsourcing some jobs to the crowd. It is used for a wide group of activities, as it allows companies to get substantial benefits by solving their problems in an effective and efficient way. Indeed, crowd based solutions have been proposed for a wide range of applications, such as image tagging [13], [14], query optimization [15], [16], [17], data processing [18], [19], [20], [21], sentiment analysis [22] to
Coremuniti architecture
The goal of the Coremuniti Network is to build a reliable infrastructure that allows users to share computational resources in an easy and secure way. Our framework has to be robust against attacks from malicious users, analogously to every distributed computing system [49] or distributed storage system [50]. More in detail, we need to guarantee secure communication between clients and server, trusted software for remote execution and privacy for the intermediate computation. To this end, we
Subtask assignment and credit rewarding
In order to properly assign subtasks to resource providers we developed a mathematical model, described in this section, whose primary goals are the following:
1. it aims at minimizing the expected completion time for the overall task;
2. it takes into account resource providers’ revenue expectations.
More in detail, as explained above, when a user submits a task to the Coremuniti network, this task is split into several subtasks, much easier to solve than the initial one. Every
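Goal (1) above can be illustrated with a classic greedy scheduling sketch: assign the largest subtasks first, each to the peer expected to finish it earliest given its estimated speed. This is our own minimal illustration, not the model defined in this section, which additionally weighs providers' revenue expectations; peer speeds and subtask sizes are assumed inputs.

```python
# Greedy longest-processing-time (LPT) sketch: subtasks go to whichever
# peer currently has the earliest estimated finish time, which tends to
# minimize the overall (makespan) completion time.

import heapq

def assign(subtask_sizes: list[float], peer_speeds: list[float]) -> list[list[int]]:
    # min-heap of (estimated finish time, peer index)
    heap = [(0.0, p) for p in range(len(peer_speeds))]
    heapq.heapify(heap)
    schedule: list[list[int]] = [[] for _ in peer_speeds]
    # place the largest subtasks first to balance completion times
    for i in sorted(range(len(subtask_sizes)), key=lambda i: -subtask_sizes[i]):
        finish, p = heapq.heappop(heap)
        schedule[p].append(i)
        heapq.heappush(heap, (finish + subtask_sizes[i] / peer_speeds[p], p))
    return schedule

# A peer twice as fast absorbs three of the four subtasks.
print(assign([4, 3, 2, 1], peer_speeds=[1.0, 2.0]))  # [[0], [1, 2, 3]]
```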
Case study: Efficient and effective 3D rendering
Many real life applications require huge computing resources to execute properly. As examples we mention physical science simulation, mathematical simulation for insurance companies, biology simulation and cryptography. In this section we describe our Coremuniti™ based solution for a quite interesting scenario, i.e., 3D professional rendering. More in detail, rendering is the process of converting a graphical model into a high quality image by means of a computer program.
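Rendering parallelizes naturally because a frame can be cut into independent tiles, which is the mosaic metaphor behind the Mozaiko name. The sketch below only illustrates that tiling idea; the tile size and function are our own assumptions, not Mozaiko's actual partitioning code.

```python
# Illustrative tiling of a frame into independent rectangular subtasks.
# Each tile can be rendered by a different peer and re-assembled afterwards.

def tiles(width: int, height: int, tile: int) -> list[tuple[int, int, int, int]]:
    """Return (x0, y0, x1, y1) boxes covering a width x height frame."""
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in range(0, height, tile)
            for x in range(0, width, tile)]

boxes = tiles(1920, 1080, 256)
print(len(boxes))  # 8 columns x 5 rows = 40 tiles
```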
Experimental evaluation
In this section, we experimentally evaluate Coremuniti's performance from two standpoints: (1) we assess the efficiency and effectiveness of our approach by comparing its performance against the available state of the art solutions; (2) we compare the revenue users can obtain by joining our network against other collaborative approaches that reward users (e.g., Bitcoin mining).
We focus our attention on two different kinds of tasks: 3D rendering and matrix multiplication (implemented by
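The matrix multiplication benchmark parallelizes naturally by row blocks: each peer computes a horizontal slice of C = A · B independently, and the client concatenates the slices. This pure-Python sketch is our own illustration of that decomposition; the paper's truncated sentence does not state the actual implementation.

```python
# Each peer multiplies its assigned block of A's rows by the full matrix B;
# the partial results are concatenated to form the product C = A @ B.

def matmul_block(a_rows, b):
    """Multiply a block of rows of A by the full matrix B."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a_rows]

a = [[1, 2], [3, 4], [5, 6], [7, 8]]
b = [[1, 0], [0, 1]]
# split A's rows between two "peers", then concatenate the partial results
c = matmul_block(a[:2], b) + matmul_block(a[2:], b)
print(c)  # [[1, 2], [3, 4], [5, 6], [7, 8]]
```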
Conclusion and future work
In this paper, we proposed a hybrid peer to peer architecture for computational resource sharing. Users joining our network are able to share their unexploited computational resources and are rewarded with tangible credits. The shared computational power is used to solve difficult tasks submitted by other users. In order to guarantee the efficiency and effectiveness of the computation process, we designed a task partitioning and assignment algorithm that reduces the execution times while allowing
Acknowledgments
The authors thank the Calabria regional government that fully funded the project development and the EU commission that awarded our project with a seal of excellence.
References (65)
- et al., Mobile cloud computing: A survey, Future Gener. Comput. Syst. (2013)
- et al., Dynamic energy-aware cloudlet-based mobile cloud computing model for green computing, J. Netw. Comput. Appl. (2016)
- et al., Energy-aware task assignment for mobile cyber-enabled applications in heterogeneous cloud computing, J. Parallel Distrib. Comput. (2018)
- Trends in the cost of computing, AI Impacts website, https://aiimpacts.org/trends-in-the-cost-of-computing/. (Accessed...
- Nature. Big data. Nature, September...
- The Economist. Data, data everywhere. The Economist, Feb...
- D. Agrawal, et al., Challenges and opportunities with big data. A community white paper developed by leading researchers...
- Vinayak R. Borkar, Michael J. Carey, Chen Li, Inside “Big Data Management”: Ogres, onions, or parfaits? in:...
- BOINC website, https://boinc.berkeley.edu. (Accessed 22 February...
- S. Nakamoto, Bitcoin: A peer-to-peer electronic cash system, Freely available on the web,...
- An efficient hybrid peer-to-peer system for distributed data sharing, IEEE Trans. Comput.
- Comparing hybrid peer-to-peer systems
- Crowdsourcing: Why the Power of the Crowd is Driving the Future of Business
- CDAS: A crowdsourcing data analytics system, Proc. VLDB Endow.
- Human-powered sorts and joins, Proc. VLDB Endow.
- CrowdER: Crowdsourcing entity resolution, Proc. VLDB Endow.
- Argonaut: Macrotask crowdsourcing for complex data processing, Proc. VLDB Endow.
- Worker skill estimation in team-based tasks, Proc. VLDB Endow.
- Crowdsourcing systems on the world-wide web, Commun. ACM
- Counting with the crowd
- Conducting behavioral research on Amazon’s Mechanical Turk, Behav. Res. Methods
- Scaling up crowd-sourcing to very large datasets: A case for active learning, Proc. VLDB Endow.
Nunziato Cassavia received his Master Degree in Computer Engineering from the University of Calabria in 2012. Currently he is a research fellow and Ph.D. student at University of Calabria in ICT. His research interest includes Big Data, Distributed Computing System, Data Warehouse and Relational and NoSQL Databases.
Sergio Flesca is full professor at University of Calabria. He received a Ph.D. degree in Computer Science from University of Calabria. His research interests include databases, web and semi-structured data management, information extraction, inconsistent data management, approximate query answering, and argumentation.
Michele Ianni received his Ph.D. in Information and Communication Technologies from the University of Calabria, Italy, in 2018. Currently, he is a research fellow at University of Calabria, Italy. His main research interests include cyber security, cryptography, software vulnerability exploitation, malware analysis and trusted computing.
Elio Masciari is currently senior researcher at the Institute for High Performance Computing and Networks (ICAR-CNR) of the National Research Council of Italy. His research interests include Database Management, Semistructured Data and Big Data. He has been advisor of several master theses at the University of Calabria and at University Magna Graecia in Catanzaro. He was advisor of PhD thesis in computer engineering at University of Calabria. He has served as a member of the program committee of several international conferences. He served as a reviewer for several scientific journals of international relevance. He is author of more than 100 publications on journals and both national and international conferences. He also holds “Abilitazione Scientifica Nazionale” for Full Professor role.
Chiara Pulice received her Ph.D. in Computer and Systems Engineering from the University of Calabria, Italy, in 2015. Currently, she is a research fellow at Dartmouth College, Hanover, NH, USA. Her main research interests include data integration, inconsistent databases, social network analysis as well as data mining and machine learning, particularly for counterterrorism.