Practical Machine Learning in R

About the Book

The book is about quickly entering the world of creating machine learning models in R. The theory is kept to a minimum, and there are examples for each of the major algorithms for classification, clustering, feature engineering and association rules.

The book is a compilation of the leaflets the authors hand out to their students during the practice labs of the Pattern Recognition and Data Mining courses at the Electrical and Computer Engineering Department of the Aristotle University of Thessaloniki.

About the Authors

Kyriakos Chatzidimitriou

Dr. Kyriakos Chatzidimitriou obtained both his doctorate and his engineering diploma from the Electrical and Computer Engineering (ECE) department of the Aristotle University of Thessaloniki (AUTH), Greece, in 2012 and 2003 respectively. He also holds a Master of Science degree from the Computer Science department of Colorado State University (CSU), USA, completed in 2006. In 2009, he received the excellence award as a PhD candidate from the Research Committee of AUTH, and in 2012, with team Mertacor, he won 1st place in the international Trading Agent Competition (TAC) Ad Auctions (AA) game. He has worked as a researcher and technical leader for European (Mobile-Age, SEAF, S-CASE, CASSANDRA, Agent Academy), national (eTHMMY) and private sector funded R&D projects, and as a software engineer in industry. He is currently a research and teaching associate at ECE, AUTH, working as a technical lead and software architect in R&D projects. He is also an adjunct instructor in the "Advanced Computer and Communication Systems" postgraduate programme, giving lectures on Software Engineering, Databases and Data Mining. In 2017, together with Dr. Andreas Symeonidis, he founded Cyclopt, a spin-off company of AUTH focused on software quality assessment and software analytics.

Themistoklis Diamantopoulos

Dr. Themistoklis Diamantopoulos works as a researcher on Data Mining and Software Engineering. He has a Diploma in Electrical and Computer Engineering from the Aristotle University of Thessaloniki and an MSc in Computer Science from the University of Edinburgh. He has worked as a research associate in the Informatics and Telematics Institute (ITI) of Centre for Research and Technology - Hellas (CERTH) and currently works at the Intelligent Systems and Software Engineering Labgroup (ISSEL) where he obtained his PhD. His research is relevant to the areas of Data Mining and Software Engineering, while his interests also include the areas of Machine Learning and Multi-Agent Systems. He has participated and still participates in multiple EU-funded projects (eCOMPASS, S-CASE, SEAF) both as a researcher and as an active software engineer, while he also holds the position of software reusability expert in the newly founded company Cyclopt, which aims at providing innovative solutions in the area of software quality as-a-service. Personal website:

Thomas Karanikiotis

Thomas Karanikiotis is an Electrical and Computer Engineer who graduated in July 2018 from the Aristotle University of Thessaloniki, Greece, with a GPA of 9.34/10.00. As an undergraduate student, he was a member of the robotics team "ARIADNE" at the Robotics and Automation laboratory of the Electronic & Computer Engineering Specialization Area at the Department of Electrical and Computer Engineering. The team was twice selected (2016-2017, 2017-2018) among the ten best teams in the world, competing against major research centers and universities, in the KUKA Innovation Award 2017 & 2018. He is currently working as a research associate and Ph.D. student at the Aristotle University of Thessaloniki under the supervision of Associate Professor Dr. Andreas Symeonidis in the area of Machine Learning and Data Analysis. His Ph.D. thesis is titled "Deep Learning Representations for Software Artefacts". As a research associate, he has been involved in a number of EU and private sector-funded projects as a researcher and software engineer. He is also a teaching assistant in the undergraduate course “Pattern Recognition” and the postgraduate course “Big Data Analytics” at ECE, AUTH.

Michail Papamichail

Michail Papamichail is an electrical and computer engineer who graduated from the Aristotle University of Thessaloniki, Greece. In the fourth year of his studies, he was selected by the University of California, Irvine for a paid internship position after a worldwide call. During the internship, he worked in the area of systems security in the Secure Systems and Software Laboratory (SSL) under Professor Michael Franz. He has also worked as an Oracle RPAS solution consultant (Veltio LLC), where his main duties involved the design and implementation of large demand forecasting systems for big retailers such as Sainsbury's in the UK. He is currently a PhD candidate and also works as a research associate under the supervision of Dr. Andreas Symeonidis, Assistant Professor in the Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, Greece. His PhD topic lies in the area of Software Analytics, focusing on Software Quality from a user-perceived perspective, Software Lifecycle Analysis and Software Maintainability. As a research associate, he works on the EU-funded project Mobile-Age and on the private sector funded R&D project CIA (Continuous Implicit Authentication). He is also a teaching assistant in the undergraduate course "Pattern Recognition" at ECE, AUTH.

Andreas Symeonidis

I am an Associate Professor with the Department of Electrical and Computer Engineering at the Aristotle University of Thessaloniki, Greece. My research interests include Software analytics, Knowledge extraction from big data repositories, Automated Software Engineering and Middleware Robotics. My work has been published in over 130 papers, book chapters, and conference publications. I am also co-author of the book “Agent Intelligence through Data Mining” (ISBN: 0-387-24352-6) and have been Project Coordinator of the S-CASE (FP7-ICT-610717) and RAPP (FP7-ICT-610947) projects and Technical lead of the H2020 project SEAF (H2020-696023). I am currently coordinating more than 10 contract R&D projects. More at:

About the Contributors

Ilias Ouzounidis

Branding Design and Consulting

Cover design:

Table of Contents

  • Part I - Introduction
    • Chapter 1 - Introduction to R
      • 1.1 The basics
      • 1.2 Vectors
      • 1.3 Matrices
      • 1.4 Data frames
      • 1.5 R Scripts
      • 1.6 Functions
      • 1.7 for loops
      • 1.8 Making decisions (if & else)
      • 1.9 Datasets and statistics
      • 1.10 Factors
      • 1.11 Challenge
    • Chapter 2 - Introduction to Machine Learning
      • 2.1 Definition
      • 2.2 Main categories and tasks
      • 2.3 Survival guide to machine learning
  • Part II - Classification
    • Chapter 3 - Classification with Decision trees
      • 3.1 Introduction
      • 3.2 Splitting Criteria and Decision Tree Construction
      • 3.3 Application with Pruning and Evaluation Metrics
      • 3.4 Exercise
    • Chapter 4 - Classification with Naive Bayes
      • 4.1 Introduction
      • 4.2 Naive Bayes Model Construction and Classification
      • 4.3 Naive Bayes Application with Evaluation
    • Chapter 5 - Classification with k-Nearest Neighbors
      • 5.1 Introduction
      • 5.2 k-Nearest Neighbors Model Construction and Classification
      • 5.3 k-Nearest Neighbors Real World Example
    • Chapter 6 - Classification with Support Vector Machines
      • 6.1 Introduction
      • 6.2 Support Vector Machines Model Construction and Classification
      • 6.3 Exercise
  • Part III - Data Processing
    • Chapter 7 - Feature Selection
      • 7.1 Introduction
      • 7.2 Filter Methods
      • 7.3 Wrapper Methods
    • Chapter 8 - Dimensionality Reduction
      • 8.1 Principal Components Analysis
  • Part IV - Clustering
    • Chapter 9 - Centroid-based Clustering and Evaluation
      • 9.1 k-Means in R
      • 9.2 k-Medoids in R
      • 9.3 Clustering Evaluation in R
      • 9.4 k-Means clustering overview
      • 9.5 k-Means clustering and evaluation in real life Application
      • 9.6 k-Medoids Application
    • Chapter 10 - Connectivity-based Clustering
      • 10.1 Hierarchical clustering in R
      • 10.2 Hierarchical Clustering Application and Evaluation
    • Chapter 11 - Density-based Clustering
      • 11.1 DBSCAN in R
      • 11.2 DBSCAN model construction
      • 11.3 DBSCAN calculation
      • 11.4 DBSCAN clustering with R
      • 11.5 Density-based Clustering Application
    • Chapter 12 - Distribution-based Clustering
      • 12.1 Theoretical background
      • 12.2 Modeling Gaussian Mixture Models using EM
      • 12.3 GMMs Clustering Application
      • 12.4 GMMs Application with Information Criteria
  • Part V - Extended Topics
    • Chapter 13 - Association Rules
  • Notes
