Magnum Opus: Association Discovery Technologies

Magnum Opus incorporates a number of unique association discovery technologies.  This page outlines some of these state-of-the-art data mining techniques and provides links to the scientific papers that describe them in detail.

At the heart of Magnum Opus is the use of k-optimal (also known as top-k) association discovery techniques.  Most association discovery techniques find frequent patterns, many of which will not be interesting for a given application.  In contrast, k-optimal techniques allow the user to specify what makes an association interesting and how many (k) they wish to find.  Magnum Opus then finds the k most interesting associations according to the criteria the user selects.  The available criteria for measuring interest include lift, leverage, strength (also known as confidence), support and coverage. The following papers describe these techniques in more detail.

Webb, G. I. (2000). Efficient Search for Association Rules. In R. Ramakrishnan and S. Stolfo (Eds.), Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000) Boston, MA. New York: The Association for Computing Machinery, pages 99-107. [Abstract] [Pre-publication PDF] [Link to paper via ACM Portal]

Webb, G. I. and S. Zhang (2005). k-Optimal-Rule-Discovery. Data Mining and Knowledge Discovery 10(1). Netherlands: Springer, pages 39-79. [Abstract] [Prepublication PDF] [Link to paper via Springerlink]
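
As a rough illustration of the interest measures named above, the following Python sketch computes coverage, support, strength, lift and leverage for a rule "LHS -> RHS" over a small set of transactions, and then keeps the k highest-scoring rules.  It is not the Magnum Opus implementation: the function names (measures, top_k_rules), the max_lhs limit and the toy data are assumptions made for this example, and candidate rules are enumerated exhaustively rather than with the OPUS search described in the papers.

```python
from itertools import combinations
import heapq


def measures(transactions, lhs, rhs):
    """Return the five interest measures for the rule lhs -> rhs.

    transactions is a list of sets of items; lhs and rhs are iterables of items.
    """
    n = len(transactions)
    lhs, rhs = frozenset(lhs), frozenset(rhs)
    coverage = sum(lhs <= t for t in transactions) / n         # P(LHS)
    rhs_freq = sum(rhs <= t for t in transactions) / n         # P(RHS)
    support = sum((lhs | rhs) <= t for t in transactions) / n  # P(LHS and RHS)
    strength = support / coverage if coverage else 0.0         # P(RHS | LHS)
    lift = strength / rhs_freq if rhs_freq else 0.0
    leverage = support - coverage * rhs_freq
    return {
        "coverage": coverage,
        "support": support,
        "strength": strength,
        "lift": lift,
        "leverage": leverage,
    }


def top_k_rules(transactions, k, measure="leverage", max_lhs=2):
    """Score every rule with up to max_lhs items on the left and one item on
    the right, and return the k best as (score, lhs, rhs) tuples.

    Exhaustive enumeration, for illustration only.
    """
    items = sorted(set().union(*transactions))
    candidates = []
    for size in range(1, max_lhs + 1):
        for lhs in combinations(items, size):
            for rhs in items:
                if rhs in lhs:
                    continue
                score = measures(transactions, lhs, [rhs])[measure]
                candidates.append((score, lhs, rhs))
    return heapq.nlargest(k, candidates)


# Hypothetical toy data
baskets = [
    {"bread", "butter"},
    {"bread", "butter"},
    {"bread"},
    {"milk"},
]
m = measures(baskets, ["bread"], ["butter"])
best = top_k_rules(baskets, k=2, measure="leverage", max_lhs=1)
```

Changing the measure argument changes which rules rank highest, which is the sense in which the user defines what "interesting" means in a k-optimal search.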

In addition to association rules, Magnum Opus can also identify associations expressed as itemsets.

Webb, G.I. (2010). Self-Sufficient Itemsets: An Approach to Screening Potentially Interesting Associations Between Items. Transactions on Knowledge Discovery from Data 4. ACM, pages 3:1-3:20. [Abstract] [Pre-Publication PDF] [Link to paper via ACM Digital Library]

The speed with which Magnum Opus operates is due to the efficient OPUS search algorithm.

Webb, G. I. (1995). OPUS: An Efficient Admissible Algorithm For Unordered Search. Journal of Artificial Intelligence Research 3. Menlo Park, CA: AAAI Press, pages 431-465. [Abstract] [Link to paper via JAIR website]

Webb, G. I. (1996). Inclusive Pruning: A New Class of Pruning Rule for Unordered Search and its Application to Classification Learning. In K. Ramamohanarao (Ed.), Australian Computer Science Communications Vol. 18 (1): Proceedings of the Nineteenth Australasian Computer Science Conference (ACSC'96) Royal Melbourne Institute of Technology, Australia. Melbourne: ACS, pages 1-10. [Abstract] [PDF]

Due to the large number of potential associations considered during association discovery, there is a large risk that many associations found by conventional techniques will be spurious.  Indeed, in some applications none of the 'associations' that are found will be real associations.  Magnum Opus incorporates unique facilities for filtering the associations that are found in order to discard those that are likely to be spurious.

Webb, G.I. (2007). Discovering Significant Patterns. Machine Learning 68(1). Netherlands: Springer, pages 1-33. [Abstract] [Pre-publication PDF] [Link to paper via Springerlink]

Webb, G.I. (2006). Discovering Significant Rules. In L. Ungar, M. Craven, D. Gunopulos and T. Eliassi-Rad (Eds.), Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2006) Philadelphia, PA. New York: The Association for Computing Machinery, pages 434-443. [Abstract] [Pre-publication PDF] [Download from ACM Portal]

Webb, G.I. (2008). Layered Critical Values: A Powerful Direct-Adjustment Approach to Discovering Significant Patterns. Machine Learning 71(2-3). Netherlands: Springer, pages 307-323 [Technical Note]. [Abstract] [Pre-Publication PDF] [Link to paper via Springerlink]
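
To give a feel for how spurious associations can be screened out, the sketch below applies a generic approach: a one-sided Fisher exact test on each candidate rule, with a Bonferroni-adjusted significance level that divides alpha by the number of candidates tested.  This is offered only as an assumption-laden illustration and may differ from what Magnum Opus actually does; the papers above describe its methods.  The function names (fisher_exact_upper, filter_significant) and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_upper(a, b, c, d):
    """One-sided Fisher exact p-value for a 2x2 contingency table.

    a = records containing both LHS and RHS
    b = records containing LHS only
    c = records containing RHS only
    d = records containing neither
    Returns the probability of seeing at least a co-occurrences if LHS and
    RHS were independent, given the observed marginal totals.
    """
    n = a + b + c + d
    row = a + b  # records containing LHS
    col = a + c  # records containing RHS
    denom = comb(n, col)
    p = 0.0
    for x in range(a, min(row, col) + 1):
        p += comb(row, x) * comb(n - row, col - x) / denom
    return p


def filter_significant(rules, alpha=0.05):
    """Keep rules whose p-value passes a Bonferroni-adjusted threshold.

    rules is a list of (name, a, b, c, d) tuples; the threshold is
    alpha divided by the number of rules tested.
    """
    if not rules:
        return []
    threshold = alpha / len(rules)
    return [name for name, a, b, c, d in rules
            if fisher_exact_upper(a, b, c, d) <= threshold]


# Hypothetical counts: one strong association and one consistent with chance
candidates = [
    ("strong", 10, 0, 0, 10),
    ("chance", 5, 5, 5, 5),
]
kept = filter_significant(candidates)
```

Because the threshold shrinks as more candidates are tested, a rule must show stronger evidence to survive when the search space is large, which is the risk the paragraph above describes.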

Magnum Opus is a highly flexible tool that can be used for many forms of data mining analysis.  For example, it can be used for contrast discovery (also known as emerging pattern discovery and closely related to subgroup discovery).

Webb, G. I., S. Butler, and D. Newlands (2003). On Detecting Differences Between Groups. In P. Domingos, C. Faloutsos, T. Senator, H. Kargupta and L. Getoor (Eds.), Proceedings of The Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003) Washington, DC. New York: The Association for Computing Machinery, pages 256-265. [Abstract] [PDF] [Link to paper via ACM Portal]

Many of these technologies are explained in the tutorial on association discovery using Magnum Opus.