Scientific General Events
Small BGK waves and Landau damping
Ron Graham, University of California, San Diego
Title: The Combinatorics of Solving Linear Equations
One of the fundamental topics in combinatorics involves deciding whether some given linear equation has solutions with all its variables lying in some restricted set, and if so, estimating how many such solutions there are. In this talk, we will describe some of the old and new results in this area, as well as discuss a number of unsolved problems.
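As a small illustration of the kind of question described above, the following sketch counts solutions of a linear equation whose variables are restricted to a finite set. The equation x + y = z and the sets used here are illustrative choices, not taken from the talk, and the helper name `count_solutions` is hypothetical; brute-force enumeration is only practical for small sets.

```python
from itertools import product


def count_solutions(coeffs, values):
    """Count ordered tuples (x_1, ..., x_k) drawn from `values`
    that satisfy c_1*x_1 + ... + c_k*x_k == 0 (brute force)."""
    values = list(values)
    return sum(
        1
        for xs in product(values, repeat=len(coeffs))
        if sum(c * x for c, x in zip(coeffs, xs)) == 0
    )


# Illustrative equation (an assumption, not from the abstract): x + y - z = 0.
schur = (1, 1, -1)
all_values = range(1, 11)      # the set {1, ..., 10}
odd_values = range(1, 11, 2)   # the odd numbers in {1, ..., 10}

print(count_solutions(schur, all_values))
print(count_solutions(schur, odd_values))
```

The odd numbers admit no solution at all, since the sum of two odd numbers is even, whereas the full interval admits many; this contrast shows how strongly the answer depends on the restricted set.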
Chris Godsil, University of Waterloo
Title: Continuous Quantum Walks on Graphs
If A is the adjacency matrix of a graph X, then the matrix exponential U(t)=exp(itA) determines what physicists term a continuous quantum walk. They ask questions such as: for which graphs are there vertices a and b and a time t such that |U(t)_{a,b}|=1? The basic problem is to relate the physical properties of the system to properties of the underlying graph, and to study this we make use of results from the theory of graph spectra, number theory, ergodic theory, and other areas. My talk will present some of the progress on this topic.
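The question about |U(t)_{a,b}|=1 can be checked numerically on small graphs. The sketch below approximates U(t)=exp(itA) with a truncated Taylor series in pure Python; the function names, the number of series terms, and the choice of the paths on two and three vertices are assumptions made for illustration, not part of the talk. A truncated series is adequate only for small matrices and moderate t.

```python
import math


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [
        [sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def walk_matrix(A, t, terms=60):
    """Approximate U(t) = exp(i t A) by the first `terms` Taylor terms."""
    n = len(A)
    M = [[1j * t * A[i][j] for j in range(n)] for i in range(n)]
    U = [[complex(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in U]
    for k in range(1, terms):
        # term becomes M^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, M)]
        U = [[U[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return U


# Adjacency matrices of the paths on 2 and 3 vertices (illustrative graphs).
P2 = [[0, 1], [1, 0]]
P3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

U2 = walk_matrix(P2, math.pi / 2)
U3 = walk_matrix(P3, math.pi / math.sqrt(2))

print(abs(U2[0][1]))  # modulus of the entry joining the two vertices of P2
print(abs(U3[0][2]))  # modulus of the entry joining the end vertices of P3
```

For these two paths both moduli come out equal to 1 up to rounding error, at t = π/2 and t = π/√2 respectively, so they are examples of graphs with the property the abstract asks about.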
Dan Drake, University of Puget Sound
Why does clustered network connectivity give rise to bistable neuronal dynamics in simulations of large networks of cortical neurons, driven by Poisson spike trains?
Forum for postdocs or graduate students applying for jobs this year or next. Gain useful information about the application process.
Absolutely continuous spectrum for random Schrödinger operators on tree-strips of finite cone-type.
Title: Prediction and Calibration Using Outputs From Multiple Simulators
Abstract: Deterministic simulators are widely used to describe physical processes in lieu of physical observations. In some cases, more than one computer simulator can be used to explore the physical system. Through the combination of field observations and simulated outputs, predictive models are developed for the real physical system. The resulting model can be used to perform sensitivity analysis for the system, solve inverse problems and make predictions. The proposed approach is Bayesian and will be illustrated through applications in predictive science at the Centre for Radiative Shock Hydrodynamics at the University of Michigan.
Title: Entangled Monte Carlo
Abstract: We propose a novel method for scalable parallelization of sequential Monte Carlo (SMC) algorithms, Entangled Monte Carlo simulation (EMC). EMC avoids the transmission of particles between nodes, and instead reconstructs them from the particle genealogy. In particular, we show that we can reduce the communication to the particle weights for each machine while efficiently maintaining implicit global coherence of the parallel simulation. We explain methods to efficiently maintain a genealogy of particles from which any particle can be reconstructed. We demonstrate, using examples from Bayesian phylogenetics, that the computational gain from parallelization using EMC significantly outweighs the cost of particle reconstruction. The timing experiments show that reconstructing particles is indeed much more efficient than transmitting them.
Title: EDF Tests for Ordered Categorical Data
Abstract: In this talk, we consider a general class of EDF (Empirical Distribution Function) tests for ordered categorical data (ordered contingency tables), that is, data in which the cells have a natural ordering, for example, letter grades on exams. Asymptotic distributions are found under the null hypothesis
H_0: each row follows the same distribution.
Asymptotic distributions under some contiguous alternatives are also found and asymptotic power of these tests can be calculated. A theorem is proved connecting the cases when parameters are known with those when parameters must be estimated.
Components of these test statistics are examined, and the first four components can be interpreted as tests aimed at specific alternatives: location, scale, skewness and kurtosis.
We compare the powers of the EDF tests with those of many competing tests, including tests derived from the Neyman-Pearson lemma. EDF tests compare favourably.
An example data set is analyzed.
Dr. Ruben Zamar
Title: Robustness and Other Things
Abstract: Data quality is typically affected by the presence of outliers and other forms of data contamination. It may also be affected by missing data, data duplication, etc. From a broad perspective I am interested in the study of the detrimental effect of poor data quality on statistical inference, and in developing appropriate alternative methods to address these problems. The purpose of this talk is to give students a broad picture of my research interests and some current research projects. "Other things" in the title refers to other related topics I am interested in, such as cluster analysis, model selection, bootstrap and data mining.
Dr. Joan Hu
Title: Statistical Analysis for Forest Fire Control
Abstract: This talk discusses statistical issues arising from forest fire control. We start with brief background information to motivate the statistical problems. Models and inference procedures are then proposed. A set of Canadian forest fire data is used throughout the talk for illustration.
This is an ongoing joint project with W. John Braun.
Jabed Hossain Tomal
Title: Ensembling Descriptor Sets using Phalanxes of Variables to Rank Activity of Compounds in QSAR Studies
Abstract: In QSAR studies, molecular descriptors are used to model the biological activity of compounds. The statistical model aims to rank rare actives early in a list of compounds. The classifier “random forest” has been found to be highly accurate in QSAR studies. To enhance its performance in terms of predictive ranking, we propose an ensemble method that groups variables together. The variables in a group (which we call a phalanx) are good to put together, whereas the variables in different groups (phalanxes) are good to ensemble. Finally, our method aggregates the phalanxes. There exist several molecular descriptor sets in QSAR studies, and a particular set might do well in ranking the activity of compounds for some assays and fail to do well for others. We have considered four assays and five descriptor sets for each. We apply the ensemble of phalanxes to each descriptor set and further ensemble across the five descriptor sets we generated. The performance of our ensemble is compared with that of random forest. Specifically, random forest was applied to each of the five descriptor sets and to the pool of descriptor sets. Using two rigorous evaluation procedures, we found our method superior to any of the random forests.
Title: Monotone Interpolation: Sampling from a Constrained Gaussian Posterior
Abstract: Gaussian process (GP) models are popular tools for non-parametric modelling and function estimation. They are commonly used in the area of computer experiments, where a finite number of function evaluations are available from a simulator and the underlying function is to be estimated using a statistical model that interpolates the given points. However, when extra information such as monotonicity of the underlying function is available, it is not straightforward to incorporate the constraints in a GP model. I will talk about the constrained posterior distribution together with a recipe to sample from it.
Title: A New Sieve Model for Extreme Values
Abstract: Although rare, extreme events leave a lasting impact on our lives and the world in general. It is therefore important to determine the potential magnitude and frequency of such events, especially when these extremes are dangerous. We focus on the case when these extreme values are heavy tailed. Extreme Value Theory provides a theoretical basis for extrapolating and making inference into these heavy tails; however, there is room for improvement in the extrapolation methods. One modification to the heavy tail is to add an upper truncation; we propose a modification which "progressively truncates" the tail with permeable filters like a sieve. The techniques are then applied to the largest Atlantic hurricanes and the largest black sea bass in Buzzard's Bay. We find that, in most cases, the sieve model provides the best fit, followed by the truncated model.
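To make the idea of an upper truncation concrete, here is a minimal sketch that samples from a Pareto distribution truncated above at a fixed bound, using the inverse-CDF method. The Pareto form, the parameter values, and the function names are assumptions chosen for illustration; the abstract does not specify the heavy-tailed family, and the sieve model itself is not reproduced here.

```python
import random


def truncated_pareto_quantile(u, x_min, alpha, upper):
    """Inverse CDF of a Pareto(x_min, alpha) distribution truncated to
    [x_min, upper]; u is a probability in [0, 1]."""
    # Probability mass the untruncated Pareto places on [x_min, upper].
    mass = 1.0 - (x_min / upper) ** alpha
    return x_min * (1.0 - u * mass) ** (-1.0 / alpha)


def sample_truncated_pareto(n, x_min, alpha, upper, seed=0):
    """Draw n values by pushing uniform variates through the quantile function."""
    rng = random.Random(seed)
    return [
        truncated_pareto_quantile(rng.random(), x_min, alpha, upper)
        for _ in range(n)
    ]


# Hypothetical parameter values, for illustration only.
draws = sample_truncated_pareto(1000, x_min=1.0, alpha=1.5, upper=50.0)
print(min(draws), max(draws))
```

Every draw lies between `x_min` and `upper`, which is the hard cutoff that a progressive, sieve-like truncation is meant to soften.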