Title: Big EHR Data: A Directed-Graph Network of Disease-Disease Interactions
Speaker: Hulin Wu, University of Texas Health Science Center at Houston, US
Time: June 30, 2017, 9:00-10:00 a.m.
Venue: Room 408, Science Building
Abstract: Based on two health care Big Data sets with sample sizes of n = 10 million and n = 50 million, respectively, we derived different types of disease-disease networks using the longitudinal information. We established both short-term and long-term directed networks, as well as a simultaneously occurring undirected network, of 1660 PheWAS disease groups. Among 2,753,940 possible directed disease pairs, we identified 646,969 significant pairs for the long-term network and 10,587 for the short-term network; each was observed in at least five patients and had a relative risk (RR) > 1 that was significant at the 0.05 level after Bonferroni correction. Among 1,376,970 possible disease pairs of simultaneous occurrence, we identified 18,137 that were observed in at least five patients and had RR > 1 with significance at the 0.05 level after Bonferroni correction. Based on these results, we define a new disease Influence Factor (IF). For the short-term network, the diseases with the highest IF are more likely to be pregnancy related, while for the long-term network they are more likely to be kidney related. More clinical implications of these findings will be discussed. I will also discuss the challenges in Big Data research and future trends.
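The pair counts quoted in the abstract follow directly from the 1660 disease groups: ordered pairs for the directed networks and unordered pairs for the simultaneous-occurrence network. The sketch below reproduces that arithmetic. It also shows a textbook relative risk and a Bonferroni threshold, both of which are assumptions for illustration: the abstract does not state the exact RR definition used or the number of tests the correction was divided by, and the function names are hypothetical.

```python
from math import comb

N_GROUPS = 1660  # PheWAS disease groups (from the abstract)
ALPHA = 0.05     # significance level (from the abstract)


def directed_pairs(n: int) -> int:
    """Number of ordered pairs (A -> B) with A != B."""
    return n * (n - 1)


def undirected_pairs(n: int) -> int:
    """Number of unordered pairs {A, B} with A != B."""
    return comb(n, 2)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold. ASSUMPTION: one test per possible pair."""
    return alpha / n_tests


def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Textbook RR: risk among exposed divided by risk among unexposed.
    ASSUMPTION: the talk may define RR differently."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


n_directed = directed_pairs(N_GROUPS)
n_undirected = undirected_pairs(N_GROUPS)
print(n_directed)    # 2753940, matches the abstract
print(n_undirected)  # 1376970, matches the abstract
print(bonferroni_threshold(ALPHA, n_directed))
```

With this many candidate pairs, the per-test threshold under the stated assumption is on the order of 1e-8, which is why very large samples are needed to detect significant pairs.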
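Bio: Professor Hulin Wu graduated from Florida State University in 1994. He is currently Chair of the Department of Biostatistics in the School of Public Health at the University of Texas Health Science Center at Houston. His research interests include big data analytics in the biomedical and health sciences, complex high-dimensional data analysis, statistical methods and theory for differential equation models, computational systems biology, and applications of bioinformatics to immunology and infectious diseases. Professor Wu has published more than 100 research papers and two monographs in biostatistics, bioinformatics, computational biology, immunology, and infectious disease prediction, and he is one of the pioneers of statistical methods for differential equation and dynamic models. Over the past decade and more, he has received a total of over US$30 million (about RMB 200 million) in research and development funding from the US National Institutes of Health (NIH) as principal investigator, making him the professor in US biostatistics with the most federally funded support for big-data-related research. His team has developed several big data analysis and prediction software packages and databases, which have been applied effectively in bioinformatics and medical research. In particular, the models and algorithms his team has developed for complex data are at the international forefront; these include a first-of-its-kind high-dimensional differential equation gene network model based on massive data, an improved distributed genetic-evolution nonlinear optimization algorithm, and Bayesian high-dimensional state-space prediction models and algorithms. His recently developed big-data-based high-dimensional dynamic network prediction model can be used for the optimization of complex management systems, dynamic system prediction, and the integration of multidimensional information resources.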