Performance evaluation of MEGADOCK protein–protein interaction prediction system implemented with distributed containers on a cloud computing environment
In Proceedings of the 2019 International Conference on Parallel and Distributed Processing Techniques & Applications (PDPTA'19)
Volume, Issue, Pages
pp. 175-181
Publication date
July 29, 2019
Publisher
Japanese:
English:
Conference name
Japanese:
English:
Parallel and Distributed Processing Techniques and Applications (PDPTA'19)
Location
Japanese:
English:
Las Vegas, Nevada
Abstract
Container-based virtualization, a lightweight virtualization technology, has begun to be introduced into large-scale parallel computing environments. In the bioinformatics field, where various dependent libraries and software tools need to be combined, container technology, which isolates the software environment and enables rapid distribution in an immediately executable format, is expected to have many benefits. In this study, we employed Docker, an implementation of Linux containers, and implemented a distributed computing environment for our original protein–protein interaction prediction system, MEGADOCK, with virtual machine instances on the Microsoft Azure cloud computing environment, and evaluated its parallel performance. Whether MEGADOCK was run directly on the virtual machines or in Docker containers on the virtual machines, the execution speed achieved was almost equal, even when the number of worker cores was increased to approximately 500. Container techniques contribute substantially to the standardization of portable and executable software environments, and thereby to improving the productivity and reproducibility of scientific research.