- 2 February 2023, 2pm. Luis Espath, University of Nottingham.
- Title: Physics-informed Spectral Learning: Wind Reconstruction
- Abstract: We propose a Physics-informed Spectral Learning framework. Within this
physics-informed type of statistical learning approach, we adaptively build a sparse set of Fourier
basis functions with corresponding coefficients by solving a sequence of minimization problems where
the set of basis functions is augmented greedily at each optimization problem. We regularize our
minimization problems with the seminorm of the fractional Sobolev space in a Tikhonov fashion. To
show the capability of this tool, we address two classical physical problems. First, we reconstruct
the velocity field of incompressible flows given a finite set of measurements of an underlying
incompressible flow. This is applied to generate the UK wind map. Second, we perform the
Helmholtz--Hodge decomposition given a finite set of measurements of an underlying flow. This is
used to detect the hurricane eye of the 1993 Storm of the Century. For this spatial approximation,
we developed the Sparse Fourier approximation based on a discrete $L^2$ projection. In the Fourier
setting, the divergence- and curl-free constraints become a finite set of linear algebraic
equations. This work is based on [1,2].
[1] Luis Espath, Dmitry Kabanov, Jonas Kiessling, and Raúl Tempone. Statistical learning for fluid
flows: Sparse Fourier divergence-free approximations. Physics of Fluids, 33(9):097108, 2021.
[2] Dmitry Kabanov, Luis Espath, Jonas Kiessling, and Raúl Tempone. Estimating divergence-free flows
via neural networks. PAMM, 21(1):e202100173, 2021.
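As a brief illustration of why the constraints become linear algebraic equations in the Fourier setting, here is a generic sketch on a periodic domain. The notation (index set $K$, coefficients $\hat{u}_k$) is assumed for this illustration and is not taken from [1,2].

```latex
% Truncated Fourier expansion of a velocity field over a finite index set K
u(x) = \sum_{k \in K} \hat{u}_k \, e^{i k \cdot x}

% Divergence acts as a multiplication by i k on each mode
\nabla \cdot u(x) = \sum_{k \in K} i \, (k \cdot \hat{u}_k) \, e^{i k \cdot x}

% Divergence-free constraint: one linear equation per retained wavenumber
\nabla \cdot u = 0 \iff k \cdot \hat{u}_k = 0 \quad \text{for all } k \in K

% Curl-free constraint in two dimensions, again one linear equation per wavenumber
\partial_1 u_2 - \partial_2 u_1 = 0 \iff k_1 \hat{u}_{k,2} - k_2 \hat{u}_{k,1} = 0 \quad \text{for all } k \in K
```

Because $K$ is finite, each constraint amounts to finitely many linear equations in the coefficients.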
- 9 February 2023, 2pm. Codina Cotar, University College London. (Cancelled)
- 23 February 2023, 2pm. Minmin Wang, University of Sussex. (Cancelled)
- 2 March 2023, 2pm. Mario Beraha, Università di Torino.
- Title: Random Measure Priors in Bayesian Frequency Recovery from Sketches
- Abstract: Consider the problem of dealing with a stream of tokens, where each token could be
an IP address, a URL, or a language n-gram. Each token takes values in a set that is too large to
store in a computer, so a compression strategy must be devised for inference. The count-min sketch
(CMS) is a randomized data structure designed for such situations. In a CMS,
data is processed by multiple random hash functions that map the tokens’ space into {1, …, J}. Since
J is smaller than the total number of possible tokens, applying the hash function leads to a loss in
the information stored. Nonetheless, it is possible to estimate the frequency of each token (i.e.,
how many times it appeared) with high accuracy, at least in probability, even for moderate values of
J and a moderate number of hash functions.
In the traditional CMS, the stream of tokens is treated as a deterministic sequence, and randomness
is introduced by considering i.i.d. random hash functions. Here, we take a statistical point of view
and propose to consider a probabilistic model for the data as well, specifically a Bayesian
nonparametric species sampling model. We provide an explicit expression for the posterior
probability of the frequency of each token for a wide class of priors, namely the Poisson-Kingman
class. However, we show that the Dirichlet process is the only prior leading to a tractable
expression. Then, we generalize our approach to more complex data streams, considering, for
instance, streams of documents made of several n-grams.
Joint work with Stefano Favaro.
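For readers unfamiliar with the data structure, the classical count-min sketch described in the first paragraph can be sketched as follows. This is a minimal illustration of the traditional (non-Bayesian) estimator only, not the method of the talk. The class name `CountMinSketch`, the parameters `width` (playing the role of J) and `depth` (the number of hash functions), and the use of salted BLAKE2b digests as stand-ins for random hash functions are all assumptions made for this example.

```python
import hashlib


class CountMinSketch:
    """Classical count-min sketch: `depth` hash functions into `width` buckets."""

    def __init__(self, width: int, depth: int) -> None:
        self.width = width  # J in the abstract (illustrative name)
        self.depth = depth  # number of hash functions (illustrative name)
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, row: int, token: str) -> int:
        # Assumption: a BLAKE2b digest salted with the row index stands in
        # for one of `depth` independent random hash functions.
        digest = hashlib.blake2b(
            token.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, token: str, count: int = 1) -> None:
        # Each token increments exactly one counter per row.
        for row in range(self.depth):
            self.table[row][self._bucket(row, token)] += count

    def estimate(self, token: str) -> int:
        # Collisions can only inflate a counter, so the minimum over rows
        # is an upper bound on the true frequency.
        return min(
            self.table[row][self._bucket(row, token)]
            for row in range(self.depth)
        )


# Usage: sketch a short stream and query one token.
sketch = CountMinSketch(width=64, depth=4)
for tok in ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.1"]:
    sketch.add(tok)
print(sketch.estimate("10.0.0.1"))  # never below the true count of 3
```

The estimate never undercounts, since hash collisions only add to a counter. The talk replaces this worst-case view with a posterior distribution over the frequency under a prior on the stream.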
- 8 March 2023, 2pm. Sam Power, University of Bristol.
- Title: Explicit convergence bounds for Metropolis Markov chains
- Abstract: Markov chain Monte Carlo (MCMC) algorithms are a widely used tool for approximate
simulation from probability measures in structured, high-dimensional spaces, with a variety of
applications. A key ingredient of their success is their ability to converge rapidly to equilibrium
at a rate which depends acceptably on the ‘difficulty’ of the sampling problem at hand, as captured
by the dimension of the problem, and the curvature and concentration properties of the target
distribution.
In this talk, I will present recent work with C. Andrieu, A. Lee and A. Wang on the convergence
analysis of Metropolis-type MCMC algorithms on $\mathbb{R}^d$. In particular, we provide a detailed study of
the Random Walk Metropolis (RWM) Markov chain with arbitrary proposal variances and in any
dimension, obtaining interpretable estimates on its convergence behaviour under suitable
assumptions. These estimates have a provably sharp dependence on the dimension of the problem, thus
providing theoretical validation for the use of these algorithms in complex settings.
Our positive results are quite generally applicable. We also study the preconditioned
Crank--Nicolson Markov chain as applied to simulation from Gaussian Process posterior models,
obtaining dimension-independent complexity bounds under suitable assumptions.
Preprint available at .
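For orientation, a textbook Random Walk Metropolis chain with an isotropic Gaussian proposal can be sketched as below. This only illustrates the algorithm whose convergence the talk analyses and is not code from the preprint. The function name `random_walk_metropolis`, its parameters, and the standard Gaussian target in the usage lines are assumptions chosen for this example.

```python
import math
import random


def random_walk_metropolis(log_density, x0, step_size, n_steps, seed=0):
    """Run RWM on R^d with proposal y = x + step_size * N(0, I).

    Returns the list of states visited and the empirical acceptance rate.
    """
    rng = random.Random(seed)
    x = list(x0)
    log_p = log_density(x)
    samples = []
    accepted = 0
    for _ in range(n_steps):
        # Symmetric Gaussian proposal centred at the current state.
        proposal = [xi + step_size * rng.gauss(0.0, 1.0) for xi in x]
        log_p_new = log_density(proposal)
        # Accept with probability min(1, pi(y) / pi(x)), on the log scale.
        # 1 - random() lies in (0, 1], which keeps log() well defined.
        log_u = math.log(1.0 - rng.random())
        if log_u < log_p_new - log_p:
            x, log_p = proposal, log_p_new
            accepted += 1
        samples.append(list(x))
    return samples, accepted / n_steps


def standard_gaussian_log_density(x):
    # Unnormalised log density of N(0, I), used as an illustrative target.
    return -0.5 * sum(xi * xi for xi in x)


# Usage: a short chain in dimension 2.
samples, rate = random_walk_metropolis(
    standard_gaussian_log_density, x0=[0.0, 0.0], step_size=1.0, n_steps=2000
)
print(f"acceptance rate: {rate:.2f}")
```

The `step_size` argument sets the proposal standard deviation, which is the quantity the abstract refers to when it mentions arbitrary proposal variances.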
- 23 March 2023, 2pm. Olga Iziumtseva, University of Nottingham.
- Title: Hilbert-valued self-intersection local times for planar Brownian motion
- Abstract: Trajectories of planar Brownian motion are very irregular curves, so the usual tools of differential
geometry are not applicable to the description of their geometric shape. One can instead use, for example,
Dynkin-renormalized self-intersection local times as geometric characteristics of planar Brownian motion. In this talk,
E. B. Dynkin's construction of self-intersection local times for planar Brownian motion is extended to Hilbert-valued
weights. In Dynkin's construction the weight is bounded and measurable. Since the weight function can describe the
properties of the medium in which the Brownian motion moves, it can be random and unbounded, depending on the properties
of the external medium. In the talk we discuss the possibility of considering Hilbert-valued weights. It turns out that
the existence of Hilbert-valued Dynkin-renormalized self-intersection local times is equivalent to the embedding of the
values of the Hilbert-valued weight into a Hilbert-Schmidt brick. Using A. A. Dorogovtsev's sufficient condition for the
embedding of compact sets into a Hilbert-Schmidt brick, stated in terms of an isonormal process, we prove the existence
of Hilbert-valued Dynkin-renormalized self-intersection local times.
- 6 April 2023, 2pm. Arindam Fadikar, Argonne National Laboratory.
- Title: Scalable Statistical Inference of Photometric Redshift via Data Subsampling
- Abstract: In this talk, we will discuss a novel data-driven statistical modeling framework that combines the uncertainties from an
ensemble of statistical models learned on smaller subsets of data carefully chosen to account for imbalances in the
input space. We demonstrate this method on a photometric redshift estimation problem in cosmology, which seeks to infer
a distribution of the redshift -- the stretching effect in observing the light of far-away galaxies -- given
multivariate color information observed for an object in the sky. Our proposed method performs balanced partitioning,
graph-based data subsampling across the partitions, and training of an ensemble of Gaussian process models.
- 20 April 2023, 2pm. Federico Girotti, University of Nottingham.
- Title: Concentration Inequalities for Output Statistics of Quantum Markov Processes
- Abstract: Concentration inequalities play a fundamental role in many different fields including, among others,
mathematical statistics, learning theory, random matrix theory and statistical physics. A rich and well-established
theory is now available to bound the fluctuations of well-behaved functions of independent random variables, and many
tools have been extended to deal with functions of weakly dependent random variables, first and foremost Markov chains.
In our talk we will present some new concentration bounds for time averages of measurement outcomes in quantum Markov
processes, a relevant class of stochastic processes arising from indirect monitoring of quantum systems. The
techniques and ideas draw on those employed for empirical averages of classical Markov chains; however, when specialized
to the classical setting, the bounds provide new concentration inequalities for empirical fluxes of classical Markov
chains and imply inverse thermodynamic uncertainty relations. The talk is based on joint work with G. Bakewell-Smith,
M. Guta and
J. P. Garrahan (https://link.springer.com/article/10.1007/s00023-023-01286-1, https://arxiv.org/abs/2210.04983).
- 27 April 2023, 2pm. Mayya Zhilova, Georgia Tech.
- Title: Accuracy of the bootstrap and the normal approximation in a high-dimensional framework
- Abstract: In this talk we will address the problem of establishing higher-order accuracy of bootstrapping procedures
and (non-)normal approximation in a multivariate or high-dimensional setting. This topic is important for numerous
problems in statistical inference and applications concerned with confidence estimation and hypothesis testing, and
involving a growing dimension of an unknown parameter or high-dimensional random data. The new results outperform or
sharpen the accuracy of the normal approximation in existing Berry–Esseen inequalities under very general conditions.
The established approximation bounds make it possible to track the dependence of the error terms on the dimension and
the sample size explicitly. We also show optimality of these results in the case of symmetrically distributed random
summands. The talk will include an overview of statistical problems where the new results lead to improvements in the
accuracy of estimation procedures.