The New England Machine Learning Day @ the Microsoft NERD Center was today. Below is the breakdown of the talks.
Best technical talk? Ryan Adams.
Funniest talk? Antonio Torralba.
Learning latent structure in documents, social networks, and more
In many applications, we face the challenge of modeling the interactions between multiple observations and hidden causes; such problems range from document retrieval, where we seek to model the underlying topics, to community detection in social networks. The (unsupervised) learning problem is to accurately estimate the model (e.g., the hidden topics, the underlying clusters, or the hidden communities in a social network) with only samples of the observed variables. In practice, many of these models are fit with local search heuristics. This talk will overview how simple and scalable linear algebra approaches provide closed-form estimation methods for a wide class of these models---including Gaussian mixture models, hidden Markov models, topic models (including latent Dirichlet allocation), and mixed membership models for communities in social networks.
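A toy sketch of why linear algebra can give closed-form estimates (my own illustration, not the talk's actual algorithms): for single-topic documents, the word-pair co-occurrence matrix is a sum of k topic outer products, so its singular values reveal the number of topics without any local search.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n_docs = 20, 2, 50_000                  # vocabulary size, topics, documents
topics = rng.dirichlet(np.ones(d), size=k)    # each row is a topic's word distribution
M2 = np.zeros((d, d))                         # empirical word-pair co-occurrence moment
for t in range(k):
    # documents from topic t contribute i.i.d. word pairs (w1, w2)
    w1 = rng.choice(d, size=n_docs // k, p=topics[t])
    w2 = rng.choice(d, size=n_docs // k, p=topics[t])
    np.add.at(M2, (w1, w2), 1.0 / n_docs)

# The population moment is 0.5 * (mu_1 mu_1^T + mu_2 mu_2^T), a rank-k matrix:
# the first k singular values dominate and the rest are sampling noise.
s = np.linalg.svd(M2, compute_uv=False)
print(s[:4])
```

The same rank argument, pushed to third-order moments (tensors), is what makes the hidden parameters themselves identifiable in closed form.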
Learning Word Meanings for Human-Robot Interaction
As robots become more powerful and autonomous, it is critical to develop ways for untrained users to quickly and easily tell them what to do. Natural language is a powerful and flexible modality for conveying complex requests, but in order for robots to effectively understand natural language commands, they must be able to acquire meaning representations that can be mapped to perceptual features in the external world. I will present approaches to learning these grounded meaning representations from a corpus of natural language sentences paired with a robot's perceptual model of the environment. The robot can use these learned models to recognize events, follow commands, ask questions, and request help.
From Sparsity to Rank, and Beyond: algebra, geometry, and convexity
Optimization problems involving sparse vectors or low-rank matrices are of great importance in applied mathematics and engineering. They provide a rich and fruitful interaction between algebraic-geometric concepts and convex optimization, with strong synergies with popular techniques like L1 and nuclear norm minimization. In this lecture we will provide a gentle introduction to this exciting research area, highlighting key algebraic-geometric ideas and surveying recent developments, including extensions to very general families of parsimonious models, such as sums of a few permutation matrices, low-rank tensors, orthogonal matrices, and atomic measures, as well as the corresponding structure-inducing norms.
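To make the sparsity/rank parallel concrete, here is a minimal sketch (my illustration, not from the lecture): the proximal operator of the L1 norm soft-thresholds a vector's entries, and the proximal operator of the nuclear norm applies the same shrinkage to a matrix's singular values, which promotes low rank.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of the L1 norm: shrink entries toward zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def singular_value_threshold(M, tau):
    """Proximal operator of the nuclear norm: soft-threshold singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(soft_threshold(s, tau)) @ Vt

# A rank-1 matrix plus small noise: thresholding zeroes the noise-level
# singular values, leaving a low-rank estimate.
rng = np.random.default_rng(0)
u, v = rng.normal(size=(5, 1)), rng.normal(size=(1, 5))
M = u @ v + 0.01 * rng.normal(size=(5, 5))
M_hat = singular_value_threshold(M, tau=0.5)
print(np.linalg.matrix_rank(M_hat))   # rank after thresholding
```

Both operators arise as the exact minimizers of a least-squares term plus the corresponding structure-inducing norm, which is why they appear inside iterative solvers for these problems.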
Toward Reliable Bayesian Nonparametric Learning
Applications of Bayesian nonparametrics increasingly involve datasets with rich hierarchical, temporal, spatial, or relational structure. While basic inference algorithms such as the Gibbs sampler are easily generalized to such models, in practice they can fail in subtle and hard-to-diagnose ways. We explore this issue via variants of a simple and popular nonparametric Bayesian model, the hierarchical Dirichlet process. By optimizing variational learning objectives in non-traditional ways, we build improved models of text, image, and social network data.
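For readers unfamiliar with the Dirichlet process underlying the HDP, a minimal sketch of its stick-breaking construction (my own illustration; `alpha` is the concentration parameter):

```python
import numpy as np

def stick_breaking(alpha, n_atoms, rng):
    """First n_atoms weights of a DP(alpha): repeatedly break off a
    Beta(1, alpha) fraction of the remaining stick."""
    betas = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas)[:-1]])
    return betas * remaining

rng = np.random.default_rng(0)
w = stick_breaking(alpha=2.0, n_atoms=100, rng=rng)
print(w.sum())   # approaches 1; smaller alpha puts mass on fewer atoms
```

The hierarchical version shares one such set of atoms across groups, which is what lets the model discover a common set of topics or clusters with group-specific weights.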
Practical Bayesian Optimization of Machine Learning Algorithms
Machine learning algorithms frequently involve careful tuning of learning parameters and model hyperparameters. Unfortunately, this tuning is often a "black art" requiring expert experience, rules of thumb, or sometimes brute-force search. There is therefore great appeal for automatic approaches that can optimize the performance of any given learning algorithm to the problem at hand. I will describe my recent work on solving this problem with Bayesian nonparametrics, using Gaussian processes. This approach of "Bayesian optimization" models the generalization performance as an unknown objective function with a GP prior. I will discuss new algorithms that account for variable cost in function evaluation and take advantage of parallelism in evaluation. These new algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms including latent Dirichlet allocation for text analysis, structured SVMs for protein motif finding, and convolutional neural networks for visual object recognition.
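A minimal sketch of the Bayesian optimization loop described above (my own illustration, with a made-up 1-D objective standing in for validation loss; the talk's algorithms additionally handle evaluation cost and parallelism, which this omits):

```python
import numpy as np
from math import erf

def rbf_kernel(a, b, ls=0.3):
    """Squared-exponential kernel on 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    """GP posterior mean and stddev at test points Xs given data (X, y)."""
    Kinv = np.linalg.inv(rbf_kernel(X, X) + noise * np.eye(len(X)))
    Ks = rbf_kernel(X, Xs)
    mu = Ks.T @ Kinv @ y
    var = np.diag(rbf_kernel(Xs, Xs) - Ks.T @ Kinv @ Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    """EI acquisition for minimization under a Gaussian posterior."""
    z = (best - mu) / sigma
    Phi = 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))
    phi = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return (best - mu) * Phi + sigma * phi

f = lambda x: np.sin(3.0 * x) + 0.5 * x   # toy "validation loss" vs. one hyperparameter
grid = np.linspace(0.0, 2.0, 200)
X = np.array([0.2, 1.8])                  # two initial evaluations
y = f(X)
for _ in range(8):                        # loop: fit GP, evaluate the argmax-EI point
    mu, sigma = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.min()))]
    X, y = np.append(X, x_next), np.append(y, f(x_next))
print(X[np.argmin(y)], y.min())           # best hyperparameter found and its loss
```

The appeal is sample efficiency: each expensive training run is chosen to maximize the expected improvement over the best result so far, rather than by grid or random search.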
Machine Learning for Complex Social Processes
From the activities of the US Patent Office or the National Institutes of Health to communications between scientists or political legislators, complex social processes---groups of people interacting with each other in order to achieve specific and sometimes contradictory goals---underlie almost all human endeavor. In order to draw thorough, data-driven conclusions about complex social processes, researchers and decision-makers need new quantitative tools for exploring, explaining, and making predictions using massive collections of interaction data. In this talk, I will discuss the development of machine learning methods for modeling interaction data. I will concentrate on exploratory analysis of communication networks---specifically, discovery and visualization of topic-specific subnetworks in email data sets. I will present a new Bayesian latent variable model of network structure and content and explain how this model can be used to analyze intra-governmental email networks.
ML for the Future: Healthcare, Energy, and the Internet
This talk covered recent applications of ML to some of society's critical domains, including healthcare, energy grid reliability, and information retrieval. Specifically:
1) Stroke risk prediction in medical patients, using ML techniques for interpretable predictive modeling.
2) Energy grid reliability in New York City, using point process models.
3) Growing a list using the Internet, using clustering techniques.
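As a minimal illustration of the modeling family in item 2 (my own toy example, not the talk's model): the simplest point process is a homogeneous Poisson process, whose event rate has a closed-form maximum-likelihood estimate.

```python
import numpy as np

# Hypothetical failure times over a 10-unit observation window.
events = np.array([0.8, 2.1, 2.5, 4.0, 5.6, 7.3, 9.9])
T = 10.0

# MLE of the Poisson rate: event count divided by window length.
lam_hat = len(events) / T
print(lam_hat)   # estimated failures per unit time

# Expected number of failures in a future window of length t is lam_hat * t;
# richer models make the rate depend on time, location, or covariates.
```

Real grid-reliability models condition the rate on covariates and history, but the estimation principle starts from this likelihood.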