Abstract:
How do you search for a needle in a multi-dimensional haystack if you don’t know what a needle is and you don’t
know whether there is one in the haystack?
This is the daunting challenge of big data, and it requires a paradigm shift: away from hypothesis-driven searches and
towards a methodology that lets the data speak for itself. Dynamic Quantum Clustering (DQC) is a visual
methodology that accomplishes this feat. It works with big, high-dimensional data and leverages our ability to
recognize complex patterns evolving in time and space. This creates a unique view of the data that makes it possible
to solve the problem of converting information to understanding.
DQC differs from commonly used approaches in several ways:
1. It is data-agnostic, in that it doesn’t need to model the data or use domain-specific knowledge.
2. It is unbiased, in that it doesn’t assume there are structures or clusters to be found in the data.
3. It works with very dense data; i.e., data that exhibit no visible separations that could be used to define
clusters.
4. It is highly visual, so it is possible to follow all stages of the process.
These characteristics make DQC ideally suited to the task of searching for unexpected information. DQC has been
successfully applied to big, complex, noisy, real-world data drawn from a wide variety of fields. In each case the
analysis revealed unexpected structure that led to a new perspective on the data. Moreover, each analysis led to a
new way of analyzing the data with a specificity and sensitivity that couldn’t be achieved using conventional
statistical methods. This talk will present examples of the application of DQC to problems in three very different
fields:
* Genomics & Proteomics
* X-ray Nano-Chemistry
* Condensed Matter Physics