Long Range Correlations
Records, each consisting of a set of simple objects from a data space, can exhibit correlations that are non-local in that data space. A new class of mixture models is developed that captures such long range correlations.
Example data that exhibit long range correlations consist of one hundred two-dimensional point sets with three clusters, indicated by color in the point space (a). Part (b) shows the first five point sets. Each point is represented by a box whose color indicates the cluster membership of the point. The color proportions of all point sets (c) reveal that point sets are either predominantly red and green or mainly blue. Thus, the color proportions of the sets form clusters in the proportion simplex, with centers marked by a large A and a large B. Cluster A in the simplex establishes a long range correlation between the red and green clusters in the point space, which cannot be detected in the point space alone.
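The example data described above can be imitated with a small simulation. The following is only a minimal sketch under stated assumptions, not the project's model or reference implementation: the cluster means, the simplex centers for A and B, the Dirichlet concentration, and all function names are hypothetical choices made for illustration. It draws one hundred point sets, computes their color proportions, and checks how the proportions co-vary across sets.

```python
import random

# Hypothetical cluster means in the 2-D point space (illustrative values only).
CLUSTER_MEANS = {"red": (0.0, 0.0), "green": (6.0, 0.0), "blue": (3.0, 5.0)}
COLORS = list(CLUSTER_MEANS)

# Assumed centers in the proportion simplex, ordered as (red, green, blue):
# A mixes red and green, B is mainly blue.
SIMPLEX_CENTERS = {"A": (0.45, 0.45, 0.10), "B": (0.10, 0.10, 0.80)}


def sample_point_set(rng, n_points=50, concentration=30.0):
    """Draw one point set whose color proportions scatter around center A or B."""
    center = rng.choice(list(SIMPLEX_CENTERS))
    # Dirichlet draw around the chosen center via normalized gamma variates.
    gammas = [rng.gammavariate(concentration * p, 1.0)
              for p in SIMPLEX_CENTERS[center]]
    total = sum(gammas)
    weights = [g / total for g in gammas]
    points = []
    for _ in range(n_points):
        color = rng.choices(COLORS, weights=weights)[0]
        mean_x, mean_y = CLUSTER_MEANS[color]
        points.append((rng.gauss(mean_x, 1.0), rng.gauss(mean_y, 1.0), color))
    return center, points


def color_proportions(points):
    """Fraction of points per color, in the order of COLORS."""
    n = len(points)
    return [sum(1 for p in points if p[2] == c) / n for c in COLORS]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable
point_sets = [sample_point_set(rng) for _ in range(100)]
proportions = [color_proportions(points) for _, points in point_sets]
red, green, blue = (list(column) for column in zip(*proportions))

print("corr(red, green):", round(pearson(red, green), 2))
print("corr(red, blue): ", round(pearson(red, blue), 2))
```

In this sketch, the red and green proportions rise and fall together across point sets, although the red and green clusters lie far apart in the point space. That co-variation is visible only in the proportion simplex, which is the kind of long range correlation the caption describes.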
Long range correlations
In this project, we develop several long range correlation mixture models that generalize well-known mixture models and topic models. We derive variational inference methods and implement efficient inference algorithms.
Such models can be applied to a wide range of data, including two-dimensional nuclear magnetic resonance (NMR) spectra of natural compounds and metabolites, and text documents that reference geo-locations.
Road map
- Publication of a technical report on the simplest long range correlation mixture model
- Reference implementation of an inference algorithm
- Analysis of two-dimensional nuclear magnetic resonance (NMR) spectra of natural compounds and metabolites
- Analysis of text documents referencing geo-locations
Contact: Alexander Hinneburg, hinneburg@informatik.uni-halle.de