A Procedure and Algorithm for Segmented Signal Estimation in Time Series, Image, and Higher Dimensional Data

Jeff Scargle
NASA Ames Research Center

The problem of detecting statistically significant structure
in time series data can be solved by obtaining the partition
of the observation interval with the maximum posterior
probability for a nonparametric, piecewise constant model.
The result is the most likely step-function representation
of the underlying signal. This procedure yields a denoised version of the signal with no explicit smoothing. It can
reveal structure on any scale as long as the features are statistically significant.

We have found a surprisingly simple algorithm for obtaining the optimal partition of the interval, for any data type -- including
points, binned counts, and measurements at arbitrary times with a
known error distribution. The case of point data modeled as
a piecewise constant Poisson process is demonstrated in two
modes: real-time and retrospective. The former will be used
in future NASA space observatories to trigger on transient events,
such as gamma-ray bursts, and the latter for automated processing
of large databases of astronomical time series. The same algorithm
also yields histograms in which the bins -- determined by the data and not fixed ahead of time -- are not necessarily evenly spaced.
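
One way an optimal partition of this kind can be found is by dynamic programming over the data cells. The following is a minimal sketch for point (event) data, not the implementation used in this work. It assumes an additive block fitness equal to the maximized Poisson log-likelihood, N*log(N/T), for a block with N events and length T, minus a constant per-block penalty. The function names, the penalty value of 4.0, and the choice of cell edges at midpoints between events are illustrative assumptions.

```python
import math


def block_fitness(count, width):
    # Maximized Poisson log-likelihood of `count` events in a block of
    # length `width`, up to terms that cancel when partitions are compared.
    return count * (math.log(count) - math.log(width))


def optimal_partition(times, penalty=4.0):
    """Return block edges of the best piecewise-constant rate model.

    `penalty` is a hypothetical per-block cost that discourages
    spurious change points; its value here is purely illustrative.
    """
    t = sorted(times)
    n = len(t)
    if n < 2 or any(b <= a for a, b in zip(t, t[1:])):
        raise ValueError("need at least two distinct event times")

    # One cell per event: edges at the first event, the midpoints
    # between successive events, and the last event (an assumption).
    edges = [t[0]] + [0.5 * (a + b) for a, b in zip(t, t[1:])] + [t[-1]]

    best = [0.0] * n  # best[k]: optimal total fitness for cells 0..k
    last = [0] * n    # last[k]: first cell of the final block in that optimum

    for k in range(n):
        best_val, best_r = -math.inf, 0
        # Try every possible start cell r for the block that ends at cell k.
        for r in range(k + 1):
            width = edges[k + 1] - edges[r]
            count = k - r + 1
            val = block_fitness(count, width) - penalty
            if r > 0:
                val += best[r - 1]
            if val > best_val:
                best_val, best_r = val, r
        best[k], last[k] = best_val, best_r

    # Walk backwards through `last` to recover the block start cells.
    starts = []
    k = n
    while k > 0:
        r = last[k - 1]
        starts.append(r)
        k = r
    starts.reverse()
    return [edges[r] for r in starts] + [edges[-1]]
```

The double loop over the end cell k and start cell r makes the cost quadratic in the number of events. The returned edges can be used directly as data-determined, unevenly spaced histogram bins.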

The same approach can be used to model data in 2D, 3D, and higher-dimensional data spaces. For point data, the Voronoi tessellation of the data space is an excellent data structure: the
Voronoi cells carry information about the local point density and
its gradient. The optimum partition of the data space can be approximated
by searching over the finite space of all partitions consisting of
data cells assigned to separate partition elements, or blocks.
Our 1D "Bayesian blocks" algorithm can be extended to this higher-dimensional setting to provide optimal piecewise representations of data in O(N**2) time. Results will be shown for segmentation of 2D photon maps from gamma-ray telescopes, star catalogs from the 2MASS infrared survey, and a 3D representation of the large-scale structure of the Universe based on the first data release from the Sloan Digital Sky Survey.
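
To illustrate how Voronoi cells encode local point density, the sketch below approximates the area of each point's cell in 2D and takes its reciprocal as a density estimate. It is a rough illustration under stated assumptions, not the tessellation code used in this work. The cells are approximated by assigning each pixel of a regular grid to its nearest point, and the data space is assumed to be the unit square, which clips the otherwise unbounded outer cells. The function names and the grid resolution are hypothetical.

```python
import math


def approx_voronoi_areas(points, grid=100):
    """Approximate Voronoi cell areas inside the unit square.

    Each pixel center of a grid x grid raster is assigned to its nearest
    data point; a cell's area is its pixel count times the pixel area.
    """
    counts = [0] * len(points)
    for i in range(grid):
        for j in range(grid):
            x = (i + 0.5) / grid
            y = (j + 0.5) / grid
            nearest = min(
                range(len(points)),
                key=lambda k: (points[k][0] - x) ** 2 + (points[k][1] - y) ** 2,
            )
            counts[nearest] += 1
    pixel_area = 1.0 / (grid * grid)
    return [c * pixel_area for c in counts]


def local_densities(points, grid=100):
    # One point per cell, so the density estimate is 1 / (cell area).
    areas = approx_voronoi_areas(points, grid)
    return [1.0 / a if a > 0 else math.inf for a in areas]
```

Points in a tight cluster get small cells and therefore high density estimates, while an isolated point gets a large cell and a low estimate. Merging adjacent cells into blocks is then the higher-dimensional counterpart of merging adjacent intervals in 1D.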

Collaborators include Brad Jackson of San Jose State University and Jay Norris of the NASA Goddard Space Flight Center. I am grateful for support from the NASA Applied Information Systems Research Program, and from the Woodward Fund and the Center for
Applied Mathematics and Computer Science at San Jose State.


Back to Mathematical Challenges in Astronomical Imaging