Hypothesis-Based Collaborative Filtering
Retrieving Like-Minded Individuals Based on the Comparison of Hypothesized Preferences
About the Book
The vast product variety and variation offered by online retailers give individuals an enormous number of options, which makes it hard to find and choose the products that provide the most utility. Consequently, consumers often settle for a product that provides sufficient utility. Beyond that, individuals tend to defer the choice altogether, which is known as the overchoice phenomenon.
Recommender systems have emerged in recent years as an effective way to help individuals find interesting products. Consumer welfare, which refers to consumers’ total satisfaction, rose by $731 million to $1.03 billion in the year 2000 owing to the increased product variety of online bookstores. This gain is 7 to 10 times larger than the consumer welfare gain from increased competition and lower prices in the book market. In other words, recommender systems are essential for increasing consumer welfare, which ultimately increases economic and social welfare.
Typically, recommender systems use the collective wisdom of individuals to expose each individual to the products that best fit their preferences, thus maximizing their utility. More precisely, the recommender system draws on the product ratings of like-minded individuals to generate recommendations. Like-minded individuals are commonly retrieved by comparing their ratings for commonly rated products. This filtering technique is referred to as collaborative filtering.
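To make the retrieval step concrete, the following is a minimal sketch of comparing two individuals on the products both have rated. It uses the Pearson correlation, a common choice in collaborative filtering; the book description does not say which measure its baselines use, so the measure, the function name `pearson_on_common`, and the sample ratings are assumptions made for illustration only.

```python
from math import sqrt


def pearson_on_common(ratings_a, ratings_b):
    """Pearson correlation over products rated by both users.

    Returns None when fewer than two products are shared or when one
    user's shared ratings have zero variance (correlation undefined).
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return None
    xs = [ratings_a[p] for p in common]
    ys = [ratings_b[p] for p in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings on a 1-5 scale, keyed by product id.
alice = {"m1": 5, "m2": 3, "m3": 1, "m4": 4}
bob = {"m1": 4, "m2": 2, "m3": 1, "m5": 5}
carol = {"m1": 1, "m2": 3, "m3": 5}
dave = {"m6": 4, "m7": 2}

sim_alice_bob = pearson_on_common(alice, bob)      # similar tastes
sim_alice_carol = pearson_on_common(alice, carol)  # opposite tastes
sim_alice_dave = pearson_on_common(alice, dave)    # no shared products
```

Note that the comparison is undefined for `alice` and `dave`, who share no rated products: the measure can only see the overlap between two rating histories.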
However, retrieving like-minded individuals based on their ratings for commonly rated products may be inappropriate, because those products are not necessarily a representative sample of the preferences of the two individuals being compared. We show why and when this is the case.
In this dissertation, we present hypothesis-based collaborative filtering (HCF) to expose individuals to the products that best fit their preferences. HCF uses machine learning algorithms to hypothesize individuals’ preferences and retrieves like-minded individuals based on the similarity of those hypothesized preferences. Machine learning extracts patterns from observations in order to generalize, which makes it well suited to hypothesizing individuals’ preferences from their product ratings. We present two frameworks for retrieving like-minded individuals: one compares the composition of hypothesized preferences, the other compares the predicted utilities individuals receive from products. Furthermore, we provide empirical evidence of the superiority of HCF over baseline collaborative filtering methods.
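The predicted-utility idea can be illustrated with a small sketch. It is not the dissertation's implementation: the table of contents points to decision tree and naïve Bayesian classifiers, whereas this sketch swaps in a deliberately simple stand-in learner (the mean rating per genre) so that it stays self-contained. The catalogue, the function names, and the sample ratings are all hypothetical.

```python
from math import sqrt

# Hypothetical catalogue: product id -> set of genre features.
CATALOGUE = {
    "m1": {"action"},
    "m2": {"action", "comedy"},
    "m3": {"drama"},
    "m4": {"comedy"},
    "m5": {"drama", "romance"},
    "m6": {"action", "romance"},
}


def learn_hypothesis(ratings):
    """Stand-in learner (assumption): mean rating per genre, with the
    user's overall mean as a fallback for genres never rated."""
    totals, counts = {}, {}
    for product, rating in ratings.items():
        for genre in CATALOGUE[product]:
            totals[genre] = totals.get(genre, 0.0) + rating
            counts[genre] = counts.get(genre, 0) + 1
    per_genre = {g: totals[g] / counts[g] for g in totals}
    overall = sum(ratings.values()) / len(ratings)
    return per_genre, overall


def predict_utility(hypothesis, product):
    """Predicted utility: average hypothesized preference over the
    product's genres."""
    per_genre, overall = hypothesis
    scores = [per_genre.get(g, overall) for g in CATALOGUE[product]]
    return sum(scores) / len(scores)


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if undefined."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def hypothesized_similarity(ratings_a, ratings_b, products):
    """Compare two users by the utilities their hypotheses predict for
    the same product set, rather than by their co-rated products."""
    hyp_a = learn_hypothesis(ratings_a)
    hyp_b = learn_hypothesis(ratings_b)
    xs = [predict_utility(hyp_a, p) for p in products]
    ys = [predict_utility(hyp_b, p) for p in products]
    return pearson(xs, ys)


products = sorted(CATALOGUE)
alice = {"m1": 5, "m3": 1}   # likes action, dislikes drama
bob = {"m2": 5, "m5": 2}     # likes action/comedy, dislikes drama
carol = {"m6": 1, "m3": 5}   # dislikes action, likes drama

sim_alice_bob = hypothesized_similarity(alice, bob, products)
sim_alice_carol = hypothesized_similarity(alice, carol, products)
```

In this toy data `alice` and `bob` have rated no product in common, yet their hypothesized preferences still yield a positive similarity, while `alice` and `carol` come out negatively correlated. This is the situation the description above motivates: the comparison no longer depends on the overlap of rating histories.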
Table of Contents
Acknowledgements
Abstract
I Setting the Scene
1 Introduction
1.1 Motivation and Thesis
1.2 Hypothesis-Based Collaborative Filtering in a Nutshell
1.3 Thesis Statement
1.3.1 Research Hypotheses
1.3.2 Research Goals
1.4 Contributions
1.5 Organization
2 Related Work
2.1 Recommender Systems
2.1.1 Formal Framework
2.1.2 Ratings
2.2 Collaborative Filtering
2.2.1 General Framework for Collaborative Filtering
2.2.2 Cold-Start Problem
2.3 Machine Learning
II Preference Modeling
3 Conceptualization and Specification of Preferences
3.1 Formalization of Preferences
3.1.1 Partial Preferences
3.2 Partial Preference Extraction from Machine Learning Models
3.2.1 Partial Preference Extraction from Decision Tree Classifier
3.2.2 Partial Preference Extraction from Naïve Bayesian Classifier
3.3 Ontological Specification of Hypothesized Preferences
3.4 Acceptance of Hypotheses
3.5 Summary
4 Domain Ontology-Boosted Decision Tree Induction
4.1 Decision Tree Induction
4.1.1 Feature Selection
4.2 SEMTREE Extension to the Decision Tree Model
4.2.1 Basic Idea
4.2.2 Injecting Concept Features to Generalize from Features
4.2.3 Classification
4.2.4 Implementation
4.3 Acceptance of Hypotheses
4.4 Summary
III Preference Similarity
5 Hypothesized Preference Similarity
5.1 Theoretical Foundation of Hypothesized Preference Similarity
5.1.1 Hypothesized Partial Preference Similarity
5.1.2 Hypothesized Semi-Partial Preference Similarity
5.2 Hypothesized Utility-Based Preference Similarity
5.2.1 Product Set for Utility Prediction
5.2.2 Correlative Predicted Utility-Based Similarity
5.2.3 Probabilistic Predicted Utility-Based Similarity
5.2.4 Probabilistic Predicted Utility-Based Semi-Partial Similarity
5.3 Hypothesis Composition-Based Preference Similarity
5.3.1 Similarity of Hypothesized Partial Preferences
5.3.2 Similarity Computation Based on Partial Preference Similarity Matrix
5.4 Summary
IV Evaluation
6 Evaluation
6.1 Experimental Setting
6.1.1 Performance Metrics
6.2 Candidates for Comparison
6.2.1 Hypothesis-Based Collaborative Filtering Candidates
6.2.2 Baseline Collaborative Filtering Candidates
6.2.3 Baseline Content Filtering Candidates
6.3 Dataset
6.4 Results and Discussion
6.4.1 Rating Prediction Accuracy
6.4.2 Relevance Filtering Quality
6.5 Information Theoretic Reflection of Hypothesized Preferences versus Product Ratings
6.6 Acceptance of Hypotheses
6.7 Summary
7 Analysis
7.1 Method
7.1.1 Grounded Theory
7.1.2 Data Collection
7.1.3 Data Analysis
7.2 Theory Development
7.2.1 Theory Concepts
7.2.2 Comparison of Recommendation Performance
7.3 Theory Consolidation
7.4 Theory Validation
7.4.1 Experimental Setting
7.4.2 Results and Discussion
7.5 Acceptance of Hypotheses
V Closing
8 Limitations
8.1 Conceptual Limitations
8.2 Technical Limitations
9 Conclusions
9.1 Acceptance of Hypotheses
9.2 Achievements of Research Goals and Thesis
9.3 Opportunities for Future Research
VI Appendix
A Tools
A.1 RECOMIZER
A.2 OMORE
A.2.1 Architecture
A.3 MOLookup
A.4 LiMo Database
A.4.1 Interlinking Movies across Web Pages
B Movie Ontology MO
C MovieLens Dataset
C.1 Genres of MovieLens
C.2 Sparse MovieLens Dataset
D Distribution of Recommendation Performance
E Comparison Between Properties and Recommendation Performance
F Comparison Between Recomm. Perform. regarding Cold-Start Behavior
G Publications
Bibliography
Curriculum Vitae