# Fall 2007 Joint UBC/SFU Graduate Student Workshop

## Topic

**Speaker**: Derek Bingham

**Title**: Statistical Research in a Collaborative Environment

Modern statistical research is often motivated by applied problems that arise in other areas of science. Finding solutions to these applied problems leads to collaborations between statisticians and subject-specific researchers. Working in such a collaborative environment brings much benefit to both parties, but is not without challenges. This talk will relate some of my experiences working in such an environment, and how one might build successful long-term collaborative relationships.

**Title**: Confidence intervals for proportions and quantiles with application to NHANES

**Speaker**: Cindy Feng

It has been noted that the usual confidence interval for proportions does not perform well for large and small values of p. In surveys, the issue is complicated by the survey design, and questions arise about whether to use design effects, effective sample sizes, and effective degrees of freedom. The question is which of the many available confidence intervals should be recommended to end users of the U.S. National Health and Nutrition Examination Surveys (NHANES), and what cautions should be given. In addition, the issues may be different if the interval is actually being used in combination with Woodruff's method to form confidence intervals for small and large quantiles.
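As a small illustration of the first point (not taken from the talk), the sketch below compares the usual Wald interval with the Wilson score interval, one common alternative, for a small proportion. It assumes simple random sampling and ignores survey design entirely; the function names are hypothetical.

```python
import math

Z95 = 1.959963984540054  # 97.5% quantile of the standard normal


def wald_interval(successes, n, z=Z95):
    """Usual interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_interval(successes, n, z=Z95):
    """Wilson score interval, which stays inside [0, 1]."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # One success in 50 trials: the Wald lower limit falls below zero.
    print(wald_interval(1, 50))
    print(wilson_interval(1, 50))
    # Zero successes: the Wald interval collapses to a single point.
    print(wald_interval(0, 50))
    print(wilson_interval(0, 50))
```

With one success in 50 trials the Wald lower limit is negative, and with zero successes the Wald interval has zero width, which is the kind of behaviour that motivates looking at alternatives.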

**Title**: Median Loss Analysis

**Speaker**: Pen Yu

In classical statistical decision theory, Wald (1950) introduced the risk function and used it to evaluate how good estimators are. Conventionally, the risk is assumed to be finite, which means that problems involving heavy-tailed distributions such as the Cauchy distribution cannot be handled. In this talk, I will introduce the median version of the risk, called the median loss, and compare it with the risk and other domination criteria. Moreover, we will see that the estimator obtained by the median loss approach is more loss-robust than estimators based on the risk, such as the Bayes estimator.

**Title**: Statistical Monitoring of Clinical Trials with Multivariate Response or Multiple Arms Using Repeated Confidence Bands

**Speaker**: Lihui Zhao

**Coauthors**:

X. Joan Hu (Simon Fraser University), Stephen W. Lagakos (Harvard University)

Clinical trials with multivariate response or multiple arms have become increasingly common because of their potential efficiency and cost saving. Interim analyses of such studies are often guided by parametric assumptions for the underlying probability models. There are situations where it is not clear at the outset how the responses differ among the treatment groups and what kinds of differences are clinically meaningful. More flexible designs and monitoring procedures are therefore desirable. In this talk, we extend the repeated confidence bands approach (Hu and Lagakos, 1999) to studies with a multivariate target function. We use a recent AIDS clinical trial to illustrate how to apply the multivariate repeated confidence bands (MRCB) approach in practice.

**Title**: Prior Sensitivity and Cross-Validation using Sequential Monte Carlo

**Speaker**: Luke Bornn

In a Bayesian setting, adequately approximating the model of interest can be computationally expensive, taking on the order of hours or even days. Prior sensitivity analysis and cross-validation both involve repeating this approximation, potentially hundreds or thousands of times. In this talk I will demonstrate how sequential Monte Carlo methods can make prior sensitivity analysis and cross-validation feasible in situations where the distribution of interest is not available analytically, reducing computational time by an order of magnitude or more in most settings.

**Title**: The Publication Process in Statistics

**Speaker**: Paul Gustafson

Peer-reviewed academic journals are central to scientific life. Scientists of all stripes spend substantial proportions of their time reading, writing, and reviewing for journals. Based on my experiences as an author, a reviewer, an associate editor, and an editor, I will make some comments on how academic journals function, and try to offer some advice on navigating the publication process.

**Title**: Finding approximate solutions to combinatorial problems with very large datasets using BIRCH

**Speaker**: Justin Harrington

Over time the boundaries between Computer Science and Statistics have blurred, with a number of disciplines (e.g. Machine Learning) being actively researched in both schools. One technique from this shared territory is BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) (Zhang et al., 1997), a data pre-processing tool used for clustering extremely large datasets with a k-means algorithm. The advantage of this algorithm is that it generates "sufficient statistics" with only one pass of the dataset, and these values can then be used instead of the whole dataset for certain applications.

In this talk we demonstrate this algorithm's application in two fields, namely robust statistics and (if time permits) a new clustering method called Linear Grouping Analysis (Van Aelst et al., 2006).
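To make the "sufficient statistics" idea concrete, here is a minimal sketch of the clustering feature commonly associated with BIRCH: the count, the linear sum, and the sum of squares of the points in a subcluster. It is an illustration only, assuming this triple is what the abstract refers to; the class and method names are hypothetical, and the tree structure that BIRCH builds on top of these summaries is omitted.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ClusteringFeature:
    """Summary (n, linear sum, sum of squares) of a set of points."""
    n: int = 0
    linear_sum: list = field(default_factory=list)
    square_sum: float = 0.0

    def add(self, point):
        """Absorb one point; each point is visited exactly once."""
        if not self.linear_sum:
            self.linear_sum = [0.0] * len(point)
        self.n += 1
        for i, x in enumerate(point):
            self.linear_sum[i] += x
            self.square_sum += x * x

    def merge(self, other):
        """Summaries are additive, so two subclusters merge without the raw data."""
        dim = max(len(self.linear_sum), len(other.linear_sum))
        a = self.linear_sum or [0.0] * dim
        b = other.linear_sum or [0.0] * dim
        return ClusteringFeature(
            n=self.n + other.n,
            linear_sum=[x + y for x, y in zip(a, b)],
            square_sum=self.square_sum + other.square_sum,
        )

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        """Root mean squared distance of the points from the centroid."""
        c = self.centroid()
        mean_sq = self.square_sum / self.n
        return math.sqrt(max(mean_sq - sum(ci * ci for ci in c), 0.0))


if __name__ == "__main__":
    cf = ClusteringFeature()
    for p in [(0, 0), (2, 0), (0, 2), (2, 2)]:
        cf.add(p)
    print(cf.centroid(), cf.radius())
```

The centroid and radius are recovered from the summary alone, which is why such values can stand in for the full dataset in some downstream computations.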

**Title**: Designs for Computer Experiments

**Speaker**: Chunfang Lin

Latin hypercube designs have been widely adopted in conducting computer experiments. In this talk, we introduce methods for constructing a rich class of Latin hypercube designs with appealing projection and space-filling properties. The class includes many orthogonal Latin hypercube designs that are not available in the literature, as well as nearly-orthogonal Latin hypercubes and two-level orthogonal-array based orthogonal Latin hypercube designs. This is joint work with Randy R. Sitter.
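For readers unfamiliar with the basic object, the sketch below generates a plain random Latin hypercube: every factor takes each of its levels exactly once across the runs. This is not the construction from the talk and makes no attempt at orthogonality or space-filling; the function name and arguments are hypothetical.

```python
import random


def random_latin_hypercube(n_runs, n_factors, seed=None):
    """Return an n_runs x n_factors design whose columns are permutations of 0..n_runs-1."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_factors):
        levels = list(range(n_runs))
        rng.shuffle(levels)  # independent random permutation per factor
        columns.append(levels)
    # Transpose so that each row is one run of the experiment.
    return [list(row) for row in zip(*columns)]


if __name__ == "__main__":
    for run in random_latin_hypercube(5, 3, seed=1):
        print(run)
```

Projecting the design onto any single factor gives all levels exactly once, which is the one-dimensional projection property that makes these designs attractive for computer experiments.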

## Details

Welcome to the webpage of the Fall 2007 Joint UBC/SFU Graduate Student Workshop in Statistics.

Many thanks to the Pacific Institute for Mathematical Sciences (PIMS) and The IRMACS Centre for making this event possible.

**Schedule**

9.00am - 10.00am Derek Bingham

10.00am - 10.15am Coffee/Muffins

10.15am - 10.45am Cindy Feng

10.45am - 11.15am Pen Yu

11.15am - 11.30am Break

11.30am - 12.00pm Lihui Zhao

12.00pm - 12.30pm Luke Bornn

12.30pm - 1.30pm Lunch

1.30pm - 2.30pm Paul Gustafson

2.30pm - 2.45pm Break

2.45pm - 3.15pm Justin Harrington

3.15pm - 3.45pm Chunfang Lin

3.45pm - 4.00pm Break

4.00pm - 5.00pm Workshop

5.30pm Dinner

**Contacts**:

Matthew Pratola

mpratola at sfu.ca

Luke Bornn

l.bornn at stat.ubc.ca

**Scientific, Seminar**

**November 2–3, 2007**
