Hypothesis-Based Collaborative Filtering
Retrieving Like-Minded Individuals Based on the Comparison of Hypothesized Preferences
About the Book
The vast product variety and variation offered by online retailers give individuals an enormous number of options, which makes it hard for them to find and choose the products that provide the most utility. Consequently, consumers often have to settle for a product that provides sufficient utility. Beyond that, individuals may even defer the choice altogether, a phenomenon known as overchoice.
Recommender systems have emerged in recent years as an effective means of helping individuals find interesting products. As a result, consumer welfare increased by $731 million to $1.03 billion in the year 2000 due to the increased product variety of online bookstores. Consumer welfare refers to consumers’ total satisfaction. This gain is 7 to 10 times larger than the consumer welfare gain from increased competition and lower prices in the book market. In other words, recommender systems are essential for increasing consumer welfare, which ultimately leads to an increase in economic and social welfare.
Typically, recommender systems use the collective wisdom of individuals to expose each individual to the products that best fit their preferences, thus maximizing their utility. More precisely, the recommender system draws on the product ratings of like-minded individuals to generate recommendations. Like-minded individuals are usually retrieved by comparing their ratings for the products they have rated in common. This filtering technique is commonly referred to as collaborative filtering.
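The retrieval step described above can be illustrated with a minimal sketch. It assumes Pearson correlation as the similarity measure, which is a common choice for baseline collaborative filtering but is not confirmed here as the measure used in the book. The names `pearson_on_common` and `like_minded`, and the sample ratings, are hypothetical.

```python
from math import sqrt


def pearson_on_common(ratings_a, ratings_b):
    """Pearson correlation over the products that both users have rated.

    Each argument maps a product id to a numeric rating. Returns 0.0 when
    fewer than two products are shared or when either user's ratings on
    the shared products do not vary.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    xs = [ratings_a[p] for p in common]
    ys = [ratings_b[p] for p in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def like_minded(target, others, k=1):
    """Return the names of the k users most similar to the target, best first."""
    scored = [(pearson_on_common(target, r), name) for name, r in others.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical ratings on a 1-5 scale.
alice = {"m1": 5, "m2": 3, "m3": 1}
others = {
    "bob": {"m1": 4, "m2": 3, "m3": 2, "m4": 5},
    "carol": {"m1": 1, "m2": 3, "m3": 5},
}
print(like_minded(alice, others))
```

Note that the similarity is computed only on the products both users rated; everything else a user has rated is ignored.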
However, retrieving like-minded individuals based on their ratings for products rated in common may be inappropriate, because those products are not necessarily a representative sample of the preferences of the two individuals being compared. We show why and when this is the case.
In this dissertation, we present hypothesis-based collaborative filtering (HCF) to expose individuals to the products that best fit their preferences. HCF uses machine learning algorithms to hypothesize individuals’ preferences and retrieves like-minded individuals based on the similarity of these hypothesized preferences. Machine learning extracts patterns from observations in order to generalize from them, which makes it well suited to hypothesizing individuals’ preferences from their product ratings. We present two frameworks for retrieving like-minded individuals: one compares the composition of the hypothesized preferences, and the other compares the predicted utilities individuals receive from products. Furthermore, we provide empirical evidence of the superiority of HCF over baseline collaborative filtering methods.
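The utility-based idea can be sketched in a few lines. This is an illustration under stated assumptions, not the dissertation's implementation: a per-genre mean rating stands in for the learned model (the table of contents mentions decision tree and naïve Bayesian classifiers), Pearson correlation stands in for the similarity measure, and all function names and data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def learn_hypothesis(ratings, genres_of):
    """Toy preference hypothesis: the user's mean rating per genre.

    Assumption: this stands in for a real learned model. Returns the
    per-genre means and the user's overall mean rating as a fallback.
    """
    by_genre = defaultdict(list)
    for product, rating in ratings.items():
        for genre in genres_of[product]:
            by_genre[genre].append(rating)
    genre_means = {g: sum(v) / len(v) for g, v in by_genre.items()}
    overall = sum(ratings.values()) / len(ratings)
    return genre_means, overall


def predict_utility(hypothesis, product, genres_of):
    """Predicted utility: average of the learned means for the product's genres."""
    genre_means, overall = hypothesis
    known = [genre_means[g] for g in genres_of[product] if g in genre_means]
    return sum(known) / len(known) if known else overall


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def hypothesized_similarity(ratings_a, ratings_b, product_set, genres_of):
    """Compare two users via predicted utilities on a shared product set."""
    hyp_a = learn_hypothesis(ratings_a, genres_of)
    hyp_b = learn_hypothesis(ratings_b, genres_of)
    utilities_a = [predict_utility(hyp_a, p, genres_of) for p in product_set]
    utilities_b = [predict_utility(hyp_b, p, genres_of) for p in product_set]
    return pearson(utilities_a, utilities_b)


# Hypothetical catalogue and ratings; alice and bob share no rated product.
genres_of = {
    "m1": ["action"], "m2": ["drama"], "m3": ["action"],
    "m4": ["drama"], "m5": ["action", "drama"],
}
alice = {"m1": 5, "m2": 2}
bob = {"m3": 4, "m4": 1}
similarity = hypothesized_similarity(alice, bob, sorted(genres_of), genres_of)
print(round(similarity, 3))
```

In this toy example the two users have no rated product in common, so a comparison restricted to co-rated products has nothing to work with, while the hypothesized preferences can still be compared on any product set for which utilities can be predicted.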
Table of Contents
Acknowledgements
Abstract
I Setting the Scene
1 Introduction
1.1 Motivation and Thesis
1.2 Hypothesis-Based Collaborative Filtering in a Nutshell
1.3 Thesis Statement
1.3.1 Research Hypotheses
1.3.2 Research Goals
1.4 Contributions
1.5 Organization
2 Related Work
2.1 Recommender Systems
2.1.1 Formal Framework
2.1.2 Ratings
2.2 Collaborative Filtering
2.2.1 General Framework for Collaborative Filtering
2.2.2 Cold-Start Problem
2.3 Machine Learning
II Preference Modeling
3 Conceptualization and Specification of Preferences
3.1 Formalization of Preferences
3.1.1 Partial Preferences
3.2 Partial Preference Extraction from Machine Learning Models
3.2.1 Partial Preference Extraction from Decision Tree Classifier
3.2.2 Partial Preference Extraction from Naïve Bayesian Classifier
3.3 Ontological Specification of Hypothesized Preferences
3.4 Acceptance of Hypotheses
3.5 Summary
4 Domain Ontology-Boosted Decision Tree Induction
4.1 Decision Tree Induction
4.1.1 Feature Selection
4.2 SEMTREE Extension to the Decision Tree Model
4.2.1 Basic Idea
4.2.2 Injecting Concept Features to Generalize from Features
4.2.3 Classification
4.2.4 Implementation
4.3 Acceptance of Hypotheses
4.4 Summary
III Preference Similarity
5 Hypothesized Preference Similarity
5.1 Theoretical Foundation of Hypothesized Preference Similarity
5.1.1 Hypothesized Partial Preference Similarity
5.1.2 Hypothesized Semi-Partial Preference Similarity
5.2 Hypothesized Utility-Based Preference Similarity
5.2.1 Product Set for Utility Prediction
5.2.2 Correlative Predicted Utility-Based Similarity
5.2.3 Probabilistic Predicted Utility-Based Similarity
5.2.4 Probabilistic Predicted Utility-Based Semi-Partial Similarity
5.3 Hypothesis Composition-Based Preference Similarity
5.3.1 Similarity of Hypothesized Partial Preferences
5.3.2 Similarity Computation Based on Partial Preference Similarity Matrix
5.4 Summary
IV Evaluation
6 Evaluation
6.1 Experimental Setting
6.1.1 Performance Metrics
6.2 Candidates for Comparison
6.2.1 Hypothesis-Based Collaborative Filtering Candidates
6.2.2 Baseline Collaborative Filtering Candidates
6.2.3 Baseline Content Filtering Candidates
6.3 Dataset
6.4 Results and Discussion
6.4.1 Rating Prediction Accuracy
6.4.2 Relevance Filtering Quality
6.5 Information Theoretic Reflection of Hypothesized Preferences versus Product Ratings
6.6 Acceptance of Hypotheses
6.7 Summary
7 Analysis
7.1 Method
7.1.1 Grounded Theory
7.1.2 Data Collection
7.1.3 Data Analysis
7.2 Theory Development
7.2.1 Theory Concepts
7.2.2 Comparison of Recommendation Performance
7.3 Theory Consolidation
7.4 Theory Validation
7.4.1 Experimental Setting
7.4.2 Results and Discussion
7.5 Acceptance of Hypotheses
V Closing
8 Limitations
8.1 Conceptual Limitations
8.2 Technical Limitations
9 Conclusions
9.1 Acceptance of Hypotheses
9.2 Achievements of Research Goals and Thesis
9.3 Opportunities for Future Research
VI Appendix
A Tools
A.1 RECOMIZER
A.2 OMORE
A.2.1 Architecture
A.3 MOLookup
A.4 LiMo Database
A.4.1 Interlinking Movies across Web Pages
B Movie Ontology MO
C MovieLens Dataset
C.1 Genres of MovieLens
C.2 Sparse MovieLens Dataset
D Distribution of Recommendation Performance
E Comparison Between Properties and Recommendation Performance
F Comparison Between Recommendation Performance Regarding Cold-Start Behavior
G Publications
Bibliography
Curriculum Vitae