Hypothesis-Based Collaborative Filtering

Retrieving Like-Minded Individuals Based on the Comparison of Hypothesized Preferences

Amancio Bouza

PDF, 328 pages

About the Book

The vast variety of products offered by online retailers gives individuals an enormous number of options, which makes it hard for them to find and choose the products that provide them the most utility. Consequently, consumers often have to settle for a product that provides sufficient utility. Beyond that, individuals tend to defer the choice altogether, a behavior known as the overchoice phenomenon.

Recommender systems have emerged in recent years as an effective way to help individuals find interesting products. As a result, consumer welfare increased by $731 million to $1.03 billion in the year 2000 due to the increased product variety of online bookstores. Consumer welfare refers to consumers’ total satisfaction. This gain is 7 to 10 times larger than the consumer welfare gain from increased competition and lower prices in the book market. In other words, recommender systems are essential for increasing consumer welfare, which ultimately leads to an increase in economic and social welfare.

Typically, recommender systems use the collective wisdom of individuals to expose each individual to the products that best fit their preferences, thus maximizing their utility. More precisely, the recommender system considers the product ratings of like-minded individuals when generating recommendations. Like-minded individuals are commonly retrieved by comparing their ratings for the products they have both rated. This filtering technique is referred to as collaborative filtering.
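As a rough illustration of this comparison, the sketch below computes a Pearson correlation between two individuals' ratings, restricted to the products both have rated. Pearson correlation is one common choice for this step and is an assumption here, not necessarily the measure used in the book. The function name `pearson_on_common` and the toy ratings are hypothetical.

```python
from math import sqrt

def pearson_on_common(ratings_a, ratings_b):
    """Pearson correlation of two users' ratings over co-rated products.

    Returns None when fewer than two products are shared or when either
    user's shared ratings have zero variance (correlation is undefined).
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return None
    xs = [ratings_a[p] for p in common]
    ys = [ratings_b[p] for p in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)

# Hypothetical toy ratings on a 1-5 scale, keyed by product id.
alice = {"A": 5, "B": 3, "C": 1, "D": 4}
bob = {"A": 4, "B": 2, "C": 1, "E": 5}
carol = {"D": 2, "E": 5}

sim_alice_bob = pearson_on_common(alice, bob)      # three shared products
sim_alice_carol = pearson_on_common(alice, carol)  # only one shared product
```

In this toy data, Alice and Bob share three products and receive a high positive similarity, while Alice and Carol share only one product, so no similarity can be computed for them at all.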

However, retrieving like-minded individuals based on their ratings for commonly rated products may be inappropriate, because the commonly rated products are not necessarily a representative sample of the preferences of the two individuals being compared. We show why and when this is the case.

In this dissertation, we present hypothesis-based collaborative filtering (HCF) to expose individuals to the products that best fit their preferences. HCF uses machine learning algorithms to hypothesize individuals’ preferences and retrieves like-minded individuals based on the similarity of these hypothesized preferences. Machine learning extracts patterns that generalize from observations, which makes it well suited to hypothesizing individuals’ preferences from their product ratings. We present two frameworks for retrieving like-minded individuals: one compares the composition of the hypothesized preferences, and the other compares the predicted utilities that individuals receive from products. Furthermore, we provide empirical evidence for the superiority of HCF over baseline collaborative filtering methods.
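The utility-based idea can be sketched in a few lines, under strong simplifying assumptions. Each individual's preferences are "hypothesized" here by a deliberately trivial stand-in model (the mean rating per genre), not by the decision tree or naïve Bayesian classifiers the book covers. Every name, the genre features, and the toy data below are hypothetical. The point is only that two individuals with no commonly rated products can still be compared through the utilities their models predict for a shared product set.

```python
from math import sqrt

def fit_genre_model(ratings, genres):
    """Stand-in learner: hypothesize preferences as mean rating per genre."""
    totals, counts = {}, {}
    for product, rating in ratings.items():
        for g in genres[product]:
            totals[g] = totals.get(g, 0.0) + rating
            counts[g] = counts.get(g, 0) + 1
    overall = sum(ratings.values()) / len(ratings)
    return {g: totals[g] / counts[g] for g in totals}, overall

def predict_utility(model, product_genres):
    """Predict a product's utility as the mean of its known genre scores."""
    per_genre, overall = model
    known = [per_genre[g] for g in product_genres if g in per_genre]
    return sum(known) / len(known) if known else overall

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def hypothesized_similarity(ratings_a, ratings_b, genres):
    """Correlate the utilities both users' models predict for all products."""
    model_a = fit_genre_model(ratings_a, genres)
    model_b = fit_genre_model(ratings_b, genres)
    catalog = sorted(genres)
    ua = [predict_utility(model_a, genres[p]) for p in catalog]
    ub = [predict_utility(model_b, genres[p]) for p in catalog]
    return pearson(ua, ub)

# Hypothetical catalog and ratings; ann shares no rated product with ben or cal.
genres = {"m1": ["action"], "m2": ["action"], "m3": ["drama"],
          "m4": ["drama"], "m5": ["comedy"], "m6": ["comedy"]}
ann = {"m1": 5, "m3": 1, "m5": 3}
ben = {"m2": 5, "m4": 2, "m6": 3}
cal = {"m2": 1, "m4": 5, "m6": 3}

sim_ann_ben = hypothesized_similarity(ann, ben, genres)
sim_ann_cal = hypothesized_similarity(ann, cal, genres)
```

In this toy data, `ann` and `ben` both favor action over drama and come out as similar, while `ann` and `cal` have opposite tastes and come out as dissimilar, even though a comparison restricted to commonly rated products would have nothing to work with.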

About the Author

Amancio Bouza

Amancio received his PhD for his thesis on recommender systems, machine learning, and the Semantic Web. He has several years of experience in tech startups, IT companies, and companies across different industries as Entrepreneur, Product Manager, Product Owner, Technical Lead, and Software Engineer.

Table of Contents

Acknowledgements

Abstract

I Setting the Scene

1 Introduction

1.1 Motivation and Thesis

1.2 Hypothesis-Based Collaborative Filtering in a Nutshell

1.3 Thesis Statement

1.3.1 Research Hypotheses

1.3.2 Research Goals

1.4 Contributions

1.5 Organization

2 Related Work

2.1 Recommender Systems

2.1.1 Formal Framework

2.1.2 Ratings

2.2 Collaborative Filtering

2.2.1 General Framework for Collaborative Filtering

2.2.2 Cold-Start Problem

2.3 Machine Learning

II Preference Modeling

3 Conceptualization and Specification of Preferences

3.1 Formalization of Preferences

3.1.1 Partial Preferences

3.2 Partial Preference Extraction from Machine Learning Models

3.2.1 Partial Preference Extraction from Decision Tree Classifier

3.2.2 Partial Preference Extraction from Naïve Bayesian Classifier

3.3 Ontological Specification of Hypothesized Preferences

3.4 Acceptance of Hypotheses

3.5 Summary

4 Domain Ontology-Boosted Decision Tree Induction

4.1 Decision Tree Induction

4.1.1 Feature Selection

4.2 SEMTREE Extension to the Decision Tree Model

4.2.1 Basic Idea

4.2.2 Injecting Concept Features to Generalize from Features

4.2.3 Classification

4.2.4 Implementation

4.3 Acceptance of Hypotheses

4.4 Summary

III Preference Similarity

5 Hypothesized Preference Similarity

5.1 Theoretical Foundation of Hypothesized Preference Similarity

5.1.1 Hypothesized Partial Preference Similarity

5.1.2 Hypothesized Semi-Partial Preference Similarity

5.2 Hypothesized Utility-Based Preference Similarity

5.2.1 Product Set for Utility Prediction

5.2.2 Correlative Predicted Utility-Based Similarity

5.2.3 Probabilistic Predicted Utility-Based Similarity

5.2.4 Probabilistic Predicted Utility-Based Semi-Partial Similarity

5.3 Hypothesis Composition-Based Preference Similarity

5.3.1 Similarity of Hypothesized Partial Preferences

5.3.2 Similarity Computation Based on Partial Preference Similarity Matrix

5.4 Summary

IV Evaluation

6 Evaluation

6.1 Experimental Setting

6.1.1 Performance Metrics

6.2 Candidates for Comparison

6.2.1 Hypothesis-Based Collaborative Filtering Candidates

6.2.2 Baseline Collaborative Filtering Candidates

6.2.3 Baseline Content Filtering Candidates

6.3 Dataset

6.4 Results and Discussion

6.4.1 Rating Prediction Accuracy

6.4.2 Relevance Filtering Quality

6.5 Information Theoretic Reflection of Hypothesized Preferences versus Product Ratings

6.6 Acceptance of Hypotheses

6.7 Summary

7 Analysis

7.1 Method

7.1.1 Grounded Theory

7.1.2 Data Collection

7.1.3 Data Analysis

7.2 Theory Development

7.2.1 Theory Concepts

7.2.2 Comparison of Recommendation Performance

7.3 Theory Consolidation

7.4 Theory Validation

7.4.1 Experimental Setting

7.4.2 Results and Discussion

7.5 Acceptance of Hypotheses

V Closing

8 Limitations

8.1 Conceptual Limitations

8.2 Technical Limitations

9 Conclusions

9.1 Acceptance of Hypotheses

9.2 Achievements of Research Goals and Thesis

9.3 Opportunities for Future Research

VI Appendix

A Tools

A.1 RECOMIZER

A.2 OMORE

A.2.1 Architecture

A.3 MOLookup

A.4 LiMo Database

A.4.1 Interlinking Movies across Web Pages

B Movie Ontology MO

C MovieLens Dataset

C.1 Genres of MovieLens

C.2 Sparse MovieLens Dataset

D Distribution of Recommendation Performance

E Comparison Between Properties and Recommendation Performance

F Comparison Between Recommendation Performance regarding Cold-Start Behavior

G Publications

Bibliography

Curriculum Vitae
