Representative Publication
DRPS:efficient disk-resident parameter servers for distributed machine learning
Abstract:
Parameter server (PS), as the state-of-the-art distributed framework for large-scale iterative machine learning tasks, has been extensively studied. However, existing PS-based systems often depend on in-memory implementations. With memory constraints, machine learning (ML) developers cannot train large-scale ML models in their rather small local clusters. Moreover, renting large-scale cloud servers is always economically infeasible for research teams and small companies. In this paper, we propose a disk-resident parameter server system named DRPS, which reduces the hardware requirement of large-scale machine learning tasks by storing high-dimensional models on disk. To further improve the performance of DRPS, we build an efficient index structure for parameters to reduce the disk I/O cost. Based on this index structure, we propose a novel multi-objective partitioning algorithm for the parameters. Finally, a flexible worker-selection parallel model of computation (WSP) is proposed to strike a right balance between the problem of inconsistent parameter versions (staleness) and that of inconsistent execution progress (stragglers). Extensive experiments on many typical machine learning applications with real and synthetic datasets validate the effectiveness of DRPS.
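The staleness/straggler trade-off mentioned in the abstract can be illustrated with a generic bounded-staleness check in the Stale Synchronous Parallel (SSP) style: a worker may start its next iteration only if it is at most a fixed number of iterations ahead of the slowest worker. This is a minimal sketch of that general idea, not the paper's actual WSP model; the function name and data layout are illustrative assumptions.

```python
# Illustrative sketch of bounded staleness (SSP-style), NOT the paper's WSP.
# Fast workers block when too far ahead (limiting staleness), while slow
# workers never force a full global barrier (mitigating stragglers).

def can_proceed(worker_clocks, worker_id, staleness):
    """Return True if `worker_id` may start its next iteration.

    worker_clocks: dict mapping worker id -> completed iteration count.
    staleness: maximum allowed lead over the slowest worker.
    """
    slowest = min(worker_clocks.values())
    return worker_clocks[worker_id] - slowest <= staleness

clocks = {"w0": 5, "w1": 3, "w2": 4}
print(can_proceed(clocks, "w0", staleness=1))  # w0 is 2 ahead of w1 -> False
print(can_proceed(clocks, "w2", staleness=1))  # w2 is 1 ahead of w1 -> True
```

Setting `staleness=0` recovers fully synchronous (BSP) execution, while a large bound approaches fully asynchronous execution; the paper's WSP additionally selects which workers participate, which this sketch does not model.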
Keywords:
CLC number:
Authors:
Zhen SONG;Yu GU;Zhigang WANG;Ge YU
Affiliations:
School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China; College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China
Source:
Citation:
[1] Zhen SONG, Yu GU, Zhigang WANG, Ge YU. DRPS: efficient disk-resident parameter servers for distributed machine learning [J]. Frontiers of Computer Science, 2022(04): 76-87.