Title: The Library of Babel: On Trying to Read My Genome
Information and Abstract:
Applied Data Science Seminar. Not long ago, information about our DNA was virtually impossible to obtain. Now, thanks to the falling cost of DNA sequencing and the growing power of bioinformatics, genetic information is undergoing a Gutenberg-scale explosion of popularity. Millions of people are paying for DNA tests from companies like 23andMe and Ancestry.com, and they are getting unprecedented amounts of information about their ancestry and hereditary diseases. For my latest book, “She Has Her Mother’s Laugh,” I got my genome sequenced and enlisted scientists at Yale and elsewhere to help me interpret it. In my talk, I’ll discuss the results of that exploration, at once enlightening and baffling.
“Toward theoretical understanding of deep learning”
Speaker: Professor Sanjeev Arora
Princeton University & Institute for Advanced Study
Tomorrow – Wednesday, April 18, 2018, 12:00-1:00pm
Location: Yale Institute for Network Science, 17 Hillhouse Avenue, 3rd floor
Abstract: This talk will be a survey of ongoing efforts and recent results to develop better theoretical understanding of deep learning, from expressiveness to optimization to generalization theory. We will see the (limited) success that has been achieved and the open questions it leads to. (My expository articles appear at http://www.offconvex.org)
Bio: Sanjeev Arora is the Charles C. Fitzmorris Professor of Computer Science at Princeton University and a Visiting Professor at the Institute for Advanced Study. He is an expert in theoretical computer science, especially theoretical ML. He has received the Packard Fellowship (1997), Simons Investigator Award (2012), Gödel Prize (2001 and 2010), ACM-Infosys Foundation Award in the Computing Sciences (now called the ACM Prize) (2012), and the Fulkerson Prize in Discrete Math (2012).
“The tool, described in Nature on 28 March, is not the first software to wield artificial intelligence (AI) instead of human skill and intuition. Yet chemists hail the development as a milestone, saying that it could speed up the process of drug discovery and make organic chemistry more efficient.
“What we have seen here is that this kind of artificial intelligence can capture this expert knowledge,” says Pablo Carbonell, who designs synthesis-predicting tools at the University of Manchester, UK, and was not involved in the work. He describes the effort as “a landmark paper”.”
Title: “Learning and Geometry for Stochastic Dynamical Systems in high dimensions”
We discuss geometry-based statistical learning techniques for performing model reduction and modeling of certain classes of stochastic high-dimensional dynamical systems. We consider two complementary settings. In the first one, we are given long trajectories of a system, e.g. from molecular dynamics, and we estimate, in a robust fashion, an effective number of degrees of freedom of the system, which may vary in the state space of the system, and a local scale where the dynamics is well-approximated by a reduced dynamics with a small number of degrees of freedom. We then use these ideas to produce an approximation to the generator of the system and obtain, via eigenfunctions of an empirical Fokker-Planck equation (constructed from data), reaction coordinates for the system that capture the large time behavior of the dynamics. We present various examples from molecular dynamics illustrating these ideas.
In the second setting we only have access to (a large number of expensive) simulators that can return short paths of the stochastic system, and we introduce a statistical learning framework for estimating local approximations to the system that can be (automatically) pieced together to form a fast global reduced model for the system, called ATLAS. ATLAS is guaranteed to be accurate (in the sense of producing stochastic paths whose distribution is close to that of paths generated by the original system) not only at small time scales, but also at large time scales, under suitable assumptions on the dynamics. We discuss applications to homogenization of rough diffusions in low and high dimensions, as well as relatively simple systems with separations of time scales, and deterministic chaotic systems in high dimensions that are well-approximated by stochastic diffusion-like equations.
Mauro Maggioni 4-10 flyer.pdf