What is Correlation Architecture?
A Correlation Architecture is a distinctive approach to designing repositories for the large-scale storage, transformation and analysis of data. Data Quality Management platforms such as Aperture Data Studio use a Correlation Architecture to achieve the scalability and analytical performance they need.
Why does a Correlation Architecture help?
A correlated structure stores each distinct value only once, no matter how many times that value is used. Storing values this way means that every value is automatically indexed at load time, allowing you to ask critical questions such as:
- Where in the enterprise is there data that resembles a telephone number?
- Where else do I have this product code on any system?
- Which fields contain the same monetary amount?
- Which tables contain credit card information in the wrong field?
Because of the correlated indexing architecture, these types of questions all return an instant response, regardless of data volumes or the number of tables and columns. A correlated repository is designed for massive volumes and provides instant drill-down to data rows.
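As a rough illustration of the idea, the sketch below keeps each distinct value once and records every place it occurs, so a "where else is this value?" question becomes a single lookup. It is a toy model under stated assumptions, not how Aperture Data Studio or any other product is implemented; the class `CorrelatedStore`, its methods, and the sample tables are all hypothetical.

```python
from collections import defaultdict


class CorrelatedStore:
    """Toy value store: each distinct value is a key held once,
    mapped to every location that uses it (hypothetical design)."""

    def __init__(self):
        # value -> set of (table, column, row_number) locations
        self._locations = defaultdict(set)

    def load(self, table, rows):
        """Index every cell of `rows` (a list of dicts) as it is loaded."""
        for row_number, row in enumerate(rows):
            for column, value in row.items():
                self._locations[value].add((table, column, row_number))

    def distinct_values(self):
        """Number of distinct values held, however often each is used."""
        return len(self._locations)

    def where_is(self, value):
        """Return the sorted (table, column) pairs in which `value` occurs."""
        return sorted({(t, c) for t, c, _ in self._locations.get(value, ())})

    def rows_for(self, value, table, column):
        """Drill down to the row numbers holding `value` in one column."""
        return sorted(
            r for t, c, r in self._locations.get(value, ())
            if (t, c) == (table, column)
        )


store = CorrelatedStore()
store.load("orders", [
    {"product_code": "PX-100", "amount": "19.99"},
    {"product_code": "PX-200", "amount": "5.00"},
])
store.load("invoices", [
    {"item": "PX-100", "total": "19.99"},
])

print(store.where_is("PX-100"))  # columns that hold this product code
print(store.where_is("19.99"))   # columns that hold the same amount
print(store.rows_for("PX-100", "orders", "product_code"))
```

In this sketch the cost of a lookup depends on how many places a value occurs, not on the total number of rows loaded, which is the property the questions above rely on.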
Correlated design also means that Data Profiling and Data Discovery tasks can be completed extremely quickly, far more quickly than with conventional repository designs, which typically load a complete copy of operational data instead of single, indexed values.
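One way to see why profiling can be cheaper over single, indexed values: a pattern test needs to run once per distinct value rather than once per row. The sketch below assumes a hypothetical index mapping each distinct value to the columns that contain it; the sample data, the `PHONE_LIKE` pattern, and the function name are illustrative assumptions only.

```python
import re

# Hypothetical index: distinct value -> (table, column) locations that contain it
index = {
    "020 7946 0018": [("customers", "phone"), ("orders", "notes")],
    "PX-100": [("orders", "product_code")],
    "0161 496 0999": [("suppliers", "contact")],
}

# Deliberately loose pattern for "resembles a telephone number" (an assumption)
PHONE_LIKE = re.compile(r"^\+?\d[\d\s-]{6,}\d$")


def columns_resembling(pattern, value_index):
    """Test each distinct value once and collect the columns it appears in."""
    hits = set()
    for value, locations in value_index.items():
        if pattern.match(value):
            hits.update(locations)
    return sorted(hits)


print(columns_resembling(PHONE_LIKE, index))
```

Here a phone-like value found in `orders.notes` is surfaced without scanning that table's rows, which mirrors the "data in the wrong field" question listed earlier.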