Model-Based Clustering and Classification for Data Science
With Applications in R

Part of Cambridge Series in Statistical and Probabilistic Mathematics

  • Publication planned for: November 2019
  • availability: Not yet published - available from November 2019
  • format: Hardback
  • isbn: 9781108494205

Description
  • Cluster analysis finds groups in data automatically. Most methods have been heuristic and leave open such central questions as: how many clusters are there? Which method should I use? How should I handle outliers? Classification assigns new observations to groups given previously classified observations, and also has open questions about parameter tuning, robustness and uncertainty assessment. This book frames cluster analysis and classification in terms of statistical models, thus yielding principled estimation, testing and prediction methods, and sound answers to the central questions. It builds the basic ideas in an accessible but rigorous way, with extensive data examples and R code; describes modern approaches to high-dimensional data and networks; and explains such recent advances as Bayesian regularization, non-Gaussian model-based clustering, cluster merging, variable selection, semi-supervised and robust classification, clustering of functional data, text and images, and co-clustering. Written for advanced undergraduates in data science, as well as researchers and practitioners, it assumes basic knowledge of multivariate calculus, linear algebra, probability and statistics.

    • Extensive use of real-world examples - with data, code and color graphics - builds intuition and understanding
    • R package MBCbook available on CRAN allows replication of analyses
    • This up-to-date account by four leading researchers gives access to powerful, state-of-the-art methods

    Reviews & endorsements

    'Bouveyron, Celeux, Murphy, and Raftery pioneered the theory, computation, and application of modern model-based clustering and discriminant analysis. Here they have produced an exhaustive yet accessible text, covering both the field's state of the art as well as its intellectual development. The authors develop a unified vision of cluster analysis, rooted in the theory and computation of mixture models. Embedded R code points the way for applied readers, while graphical displays develop intuition about both model construction and the critical but often-neglected estimation process. Building on a series of running examples, the authors gradually and methodically extend their core insights into a variety of exciting data structures, including networks and functional data. This text will serve as a backbone for graduate study as well as an important reference for applied data scientists interested in working with cutting-edge tools in semi- and unsupervised machine learning.' John S. Ahlquist, University of California, San Diego

    'This book, written by authoritative experts in the field, gives a comprehensive and thorough introduction to model-based clustering and classification. The authors not only explain the statistical theory and methods, but also provide hands-on applications illustrating their use with the open-source statistical software R. The book also covers recent advances made for specific data structures (e.g. network data) or modeling strategies (e.g. variable selection techniques), making it a fantastic resource as an overview of the state of the field today.' Bettina Grün, Johannes Kepler Universität Linz, Austria

    'Four authors with diverse strengths nicely integrate their specialties to illustrate how clustering and classification methods are implemented in a wide selection of real-world applications. Their inclusion of how to use available software is an added benefit for students. The book covers foundations, challenging aspects, and some essential details of applications of clustering and classification. It is a fun and informative read!' Naisyin Wang, University of Michigan

    'This is a beautifully written book on a topic of fundamental importance in modern statistical science, by some of the leading researchers in the field. It is particularly effective in being an applied presentation - the reader will learn how to work with real data and at the same time clearly presenting the underlying statistical thinking. Fundamental statistical issues like model and variable selection are clearly covered as well as crucial issues in applied work such as outliers and ordinal data. The R code and graphics are particularly effective. The R code is there so you know how to do things, but it is presented in a way that does not disrupt the underlying narrative. This is not easy to do. The graphics are 'sophisticatedly simple' in that they convey complex messages without being too complex. For me, this is a 'must have' book.' Rob McCulloch, Arizona State University

    Product details

    • dimensions: 260 x 185 x 25 mm
    • weight: 1.1kg
    • contains: 40 b/w illus. 171 colour illus. 48 tables
  • Table of Contents

    1. Introduction
    2. Model-based clustering: basic ideas
    3. Dealing with difficulties
    4. Model-based classification
    5. Semi-supervised clustering and classification
    6. Discrete data clustering
    7. Variable selection
    8. High-dimensional data
    9. Non-Gaussian model-based clustering
    10. Network data
    11. Model-based clustering with covariates
    12. Other topics
    List of R packages
    Bibliography
    Index.

  • Resources for

    Model-Based Clustering and Classification for Data Science

    Charles Bouveyron, Gilles Celeux, T. Brendan Murphy, Adrian E. Raftery

  • Authors

    Charles Bouveyron, Université Côte d’Azur
    Charles Bouveyron is Full Professor of Statistics at Université Côte d'Azur and the Chair of Excellence in Data Science at Institut National de Recherche en Informatique et en Automatique (INRIA), Rocquencourt. He has published extensively on model-based clustering, particularly for networks and high-dimensional data.

    Gilles Celeux, Inria Saclay Île-de-France
    Gilles Celeux is Director of Research Emeritus at Institut National de Recherche en Informatique et en Automatique (INRIA), Rocquencourt. He is one of the founding researchers in model-based clustering, having published extensively in the area for thirty-five years.

    T. Brendan Murphy, University College Dublin
    T. Brendan Murphy is Full Professor in the School of Mathematics and Statistics at University College Dublin. His research interests include model-based clustering, classification, network modeling and latent variable modeling.

    Adrian E. Raftery, University of Washington
    Adrian E. Raftery is the Boeing International Professor of Statistics and Sociology at the University of Washington. He is one of the founding researchers in model-based clustering, having published in the area since 1984.
