J. Banfield and A. Raftery, “Model-based Gaussian and Non-Gaussian Clustering”, Biometrics, vol. 49, pp. 803-821, (1993).
P.S. Bradley, O.I. Mangasarian, and W.N. Street, “Clustering via Concave Minimization”, in Advances in Neural Information Processing Systems 9, M.C. Mozer, M.I. Jordan, and T. Petsche (Eds.), pp. 368-374, MIT Press, (1997).
P. Cheeseman and J. Stutz, “Bayesian Classification (AutoClass): Theory and Results”, in [FPSU96], pp. 153-180, MIT Press, (1996).
A.P. Dempster, N.M. Laird, and D.B. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm”, Journal of the Royal Statistical Society, Series B, 39(1), pp. 1-38, (1977).
U. Fayyad, D. Haussler, and P. Stolorz, “Mining Science Data”, Communications of the ACM, 39(11), (1996).
D. Fisher, “Knowledge Acquisition via Incremental Conceptual Clustering”, Machine Learning, 2:139-172, (1987).
E. Forgy, “Cluster Analysis of Multivariate Data: Efficiency vs. Interpretability of Classifications”, Biometrics, 21:768, (1965).
Jones, “A Note on Sampling from a Tape File”, Communications of the ACM, vol. 5, (1962).
T. Zhang, R. Ramakrishnan, and M. Livny, “BIRCH: A New Data Clustering Algorithm and Its Applications”, Data Mining and Knowledge Discovery, vol. 1, No. 2, (1997).
R.M. Neal and G.E. Hinton, “A View of the EM Algorithm That Justifies Incremental, Sparse and Other Variants”, (date unknown).
B. Thiesson, C. Meek, D.M. Chickering, and D. Heckerman, “Learning Mixtures of DAG Models”, Technical Report MSR-TR-97-30, Dec. 1997, revised May 1998.
S.Z. Selim and M.A. Ismail, “K-Means-Type Algorithms: A Generalized Convergence Theorem and Characterization of Local Optimality,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. PAMI-6, No. 1, (1984).