WILEY SERIES IN PROBABILITY AND STATISTICS
Established by Walter A. Shewhart and Samuel S. Wilks
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay
Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, Jozef L. Teugels
The Wiley Series in Probability and Statistics is well established and authoritative. It covers many topics of current research interest in both pure and applied statistics and probability theory. Written by leading statisticians and institutions, the titles span both state-of-the-art developments in the field and classical methods. Reflecting the wide range of current research in statistics, the series encompasses applied, methodological and theoretical statistics, ranging from applications and new techniques made possible by advances in computerized practice to rigorous treatment of theoretical approaches. This series provides essential and invaluable reading for all statisticians, whether in academia, industry, government, or research.
A complete list of titles in this series can be found at http://www.wiley.com/go/wsps
This edition first published 2018
This work is a U.S. Government work and is in the public domain in the U.S.A.
Published 2018 by John Wiley & Sons, Inc.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Steven W. Knox to be identified as the author of this work has been asserted in accordance with law.
Registered Office(s)
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
Editorial Office
111 River Street, Hoboken, NJ 07030, USA
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging-in-Publication Data
Names: Knox, Steven W., author.
Title: Machine learning : a concise introduction / by Steven W. Knox.
Description: Hoboken, New Jersey : John Wiley & Sons, 2018. | Series: Wiley series in probability and statistics |
Identifiers: LCCN 2017058505 (print) | LCCN 2018004509 (ebook) | ISBN 9781119439073 (pdf) | ISBN 9781119438984 (epub) | ISBN 9781119439196 (cloth)
Subjects: LCSH: Machine learning.
Classification: LCC Q325.5 (ebook) | LCC Q325.5 .K568 2018 (print) | DDC 006.3/1--dc23
LC record available at https://lccn.loc.gov/2017058505
Cover image: © Verticalarray/Shutterstock
Cover design by Wiley
The goal of statistical data analysis is to extract the maximum information from the data, and to present a product that is as accurate and as useful as possible.
—David Scott, Multivariate Density Estimation: Theory, Practice and Visualization, 1992
My purpose in writing this book is to introduce the mathematically sophisticated reader to a large number of topics and techniques in the field variously known as machine learning, statistical learning, or predictive modeling. I believe that a deeper understanding of the subject as a whole will be obtained from reflection on an intuitive understanding of many techniques rather than a very detailed understanding of only one or two, and the book is structured accordingly. I have omitted many details while focusing on what I think shows “what is really going on.” For details, the reader will be directed to the relevant literature, or to the exercises, which form an integral part of the text.
No work this small on a subject this large can be self-contained. Some undergraduate-level calculus, linear algebra, and probability are assumed without reference, as are a few basic ideas from statistics. All of the techniques discussed here can, I hope, be implemented using this book and a mid-level programming language (such as C),1 and explicit implementation of many techniques using R is presented in the last chapter.
The reader may detect a coverage bias in favor of classification over regression. This is deliberate. The existing literature on the theory and practice of linear regression and many of its variants is so strong that it does not need any contribution from me. Classification, I believe, is not yet so well documented. In keeping with what has been important in my experience, loss functions are completely general and predictive modeling is stressed more than explanatory modeling.
The intended audience for these notes has an extremely diverse background in probability, ranging from one introductory undergraduate course to extensive graduate work and published research.2 In seeking a probability notation which will create the least confusion for all concerned, I arrived at the non-standard use of P(x) for both the probability of an event x and a probability mass or density function, with respect to some measure which is never stated, evaluated at a point x. My hope, which I believe has been borne out in practice, is that anyone with sufficient knowledge to find this notation confusing will have sufficient knowledge to work through that confusion.