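Wang Jie, Liang Jiye, Zhao Xingwang, Zheng Wenping. Overlapping Protein Complexes Detection Algorithm Based on Assortativity in PPI Networks [J]. Computer Science, 2019, 46(2): 294-305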
Overlapping Protein Complexes Detection Algorithm Based on Assortativity in PPI Networks
Received: 2018-09-26  Revised: 2018-11-19
Keywords: Protein-protein interaction network, Complex detection, Assortativity, Seed expansion method
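Funding: This work was supported by the National Natural Science Foundation of China (61876103,0) and the Key Research and Development Program of Shanxi Province (201603D111014).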
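Author; Affiliation; E-mail
Wang Jie; School of Computer and Information Technology, Shanxi University, Taiyuan 030006; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006
Liang Jiye; School of Computer and Information Technology, Shanxi University, Taiyuan 030006; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006; ljy@sxu.edu.cn
Zhao Xingwang; School of Computer and Information Technology, Shanxi University, Taiyuan 030006; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006
Zheng Wenping; School of Computer and Information Technology, Shanxi University, Taiyuan 030006; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006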
Abstract:
      Protein complexes play significant roles in biological processes, and detecting them from protein-protein interaction (PPI) networks is a challenging task in the post-genome era. Seed expansion is an effective clustering technique for detecting overlapping protein complexes in PPI networks. However, existing methods face two problems: 1) when selecting seed nodes, they usually consider only the link density among a node's direct neighbors, which does not adequately reflect the node's importance within its local neighborhood subgraph; 2) during cluster expansion, they assume that candidate nodes are independent of one another, ignoring the effect that the order in which candidates are added may have on the clustering result. To address these problems, this paper proposes an overlapping protein complex detection algorithm based on the assortativity of biological networks. The algorithm measures node importance using second-order neighborhood information to select seed nodes, and uses assortativity to add multiple candidate nodes in a batch during cluster expansion. To evaluate overlapping clustering results, a new evaluation index for overlapping complexes, F-overlap, is proposed. Comparative experiments against other complex detection algorithms on PPI datasets show that the proposed algorithm can effectively detect overlapping protein complexes.
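The two ideas named in the abstract, scoring seeds on a second-order neighborhood and adding candidates in batches, can be illustrated with a minimal sketch. The abstract does not give the paper's importance measure or its assortativity criterion, so everything below is an assumption made for illustration: the seed score is the edge density of the 2-hop neighborhood subgraph, and the batch rule (add every frontier node whose fraction of neighbors inside the cluster reaches a threshold) is a simple stand-in, not the authors' assortativity-based rule. All function names and the `threshold` and `min_size` parameters are hypothetical.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def two_hop_neighborhood(adj, node):
    """Return the node, its neighbors, and its neighbors' neighbors."""
    hood = {node} | adj[node]
    for nbr in adj[node]:
        hood |= adj[nbr]
    return hood


def density(adj, nodes):
    """Edge density of the subgraph induced by `nodes`."""
    n = len(nodes)
    if n < 2:
        return 0.0
    internal = sum(1 for u, v in combinations(nodes, 2) if v in adj[u])
    return 2.0 * internal / (n * (n - 1))


def seed_score(adj, node):
    # Assumed stand-in for the paper's importance measure:
    # density of the 2-hop neighborhood subgraph.
    return density(adj, two_hop_neighborhood(adj, node))


def expand(adj, seed, threshold):
    """Grow a cluster from `seed`, adding qualifying candidates in batches."""
    cluster = {seed} | adj[seed]  # assumption: start from the closed neighborhood
    while True:
        frontier = set().union(*(adj[u] for u in cluster)) - cluster
        # Every candidate is judged against the same cluster snapshot,
        # so the result does not depend on the order of candidates.
        batch = {c for c in frontier
                 if len(adj[c] & cluster) / len(adj[c]) >= threshold}
        if not batch:
            return cluster
        cluster |= batch


def detect(adj, threshold=0.6, min_size=3):
    """Seed-and-expand; clusters may overlap because expansion ignores coverage."""
    clusters, covered = [], set()
    # Highest score first; ties broken by node name for determinism.
    for node in sorted(adj, key=lambda n: (-seed_score(adj, n), n)):
        if node in covered:
            continue
        cluster = expand(adj, node, threshold)
        covered |= cluster
        if len(cluster) >= min_size and cluster not in clusters:
            clusters.append(cluster)
    return clusters


# Two triangles sharing node "c".
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"), ("d", "e"), ("c", "e")]
adj = build_graph(edges)
result = [sorted(c) for c in detect(adj, threshold=0.6)]
print(result)
```

On this toy graph the shared node ends up in both detected clusters, which is the overlapping behavior the abstract refers to. Lowering `threshold` to 0.5 lets both remaining nodes join the first cluster in a single batch, showing how sensitive the stand-in rule is to that parameter.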