There has been abundant interest in recent years in developing improved models and algorithms for the analysis of very large graphs and networks. Research has had a dual focus: developing models that accurately characterize real-world network data, and developing algorithms for efficient inference. Modeling strategies range from elaborations of stochastic block models (SBMs), which partition the nodes of a network into communities, to latent space models (LSMs) and exponential random graph models (ERGMs). There has also been substantial emphasis on exchangeable random graph models. The literature on fitting models to enormous graphs (on the order of hundreds of thousands to millions or even billions of nodes) has so far focused on extremely simple models, assuming exchangeability or a simple SBM structure in which each node belongs to a single community. Hence, there is a lack of algorithms for scaling up inference under realistic network models to very large graphs. Such algorithms must necessarily rely on divide-and-conquer strategies that exploit large clusters of computers, conducting local computations on subsets of the graph in parallel and then combining results or communicating across computing nodes to infer global properties.
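To make the divide-and-conquer idea concrete, the following is a minimal sketch, not the project's actual method: nodes are split into blocks, a local statistic (here, within-block edge density) is computed for each block in parallel, and the local estimates are then combined into a global summary. All function names, the block-assignment scheme, and the choice of statistic are illustrative assumptions.

```python
# Illustrative divide-and-conquer sketch (not the proposed algorithm):
# partition nodes into blocks, compute a local statistic per block in
# parallel, then combine the local results into a global estimate.
from concurrent.futures import ThreadPoolExecutor

def local_density(block, edges):
    """Edge density among node pairs within a single block."""
    node_set = set(block)
    pairs = len(block) * (len(block) - 1) / 2
    if pairs == 0:
        return 0.0
    internal = sum(1 for u, v in edges if u in node_set and v in node_set)
    return internal / pairs

def combined_density(nodes, edges, n_blocks=2):
    """Average of per-block densities, with blocks processed in parallel."""
    blocks = [nodes[i::n_blocks] for i in range(n_blocks)]  # strided split
    with ThreadPoolExecutor() as pool:
        local_stats = list(pool.map(lambda b: local_density(b, edges), blocks))
    return sum(local_stats) / len(local_stats)

# Toy usage: a 6-node graph; most edges fall within a block.
nodes = list(range(6))
edges = [(0, 2), (2, 4), (1, 3), (3, 5), (0, 1)]
est = combined_density(nodes, edges, n_blocks=2)
```

In practice the combination step for realistic models is far subtler than averaging, since edges crossing block boundaries (like `(0, 1)` above) carry information that purely local computations miss; handling that communication efficiently is exactly the open problem the text describes.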
All people in the developed world are exposed to a complex mixture of chemical contaminants in the environment, through the food they eat and the air they breathe. There is currently very little understanding of how these different chemicals interact to affect the risk of developing various diseases. This project develops the key data analytic tools needed to disentangle the health effects of different chemical exposures, to more accurately estimate an individual's risk, and to identify strategies for reducing that risk.