MovieLens is a web-based recommender system and virtual community that recommends movies for its users to watch based on their film preferences, using collaborative filtering of members' movie ratings and movie reviews. It contains about 11 million ratings for about 8,500 movies.[1] MovieLens was created in 1997 by GroupLens Research, a research lab in the Department of Computer Science and Engineering at the University of Minnesota,[2] in order to gather research data on personalized recommendations.[3]

History

MovieLens was not the first recommender system created by GroupLens. In May 1996, GroupLens formed a commercial venture called Net Perceptions, which served clients that included E! Online and Amazon.com. E! Online used Net Perceptions' services to create the recommendation system for Moviefinder.com,[3] while Amazon.com used the company's technology to form its early recommendation engine for consumer purchases.[4]

When another movie recommendation site, eachmovie.org,[5] closed in 1997, the researchers who built it publicly released the anonymous rating data they had collected for other researchers to use. The GroupLens Research team, led by Brent Dahlen and Jon Herlocker, used this data set to jump-start a new movie recommendation site, which they chose to call MovieLens. Since its inception, MovieLens has become a highly visible research platform: its findings have been discussed in detail in a New Yorker article by Malcolm Gladwell[6] and reported in a full episode of ABC's Nightline.[7] MovieLens data has also been central to several research studies, including a collaborative study between Carnegie Mellon University, the University of Michigan, the University of Minnesota, and the University of Pittsburgh, "Using Social Psychology to Motivate Contributions to Online Communities".[8]

In the spring of 2015, a search for "movielens" produced 2,750 results in Google Books and 7,580 in Google Scholar.[9]

Recommendations

MovieLens bases its recommendations on input provided by users of the website, such as movie ratings.[2] The site uses a variety of recommendation algorithms, including collaborative filtering algorithms such as item-item,[10] user-user, and regularized SVD.[11] In addition, to address the cold-start problem for new users, MovieLens uses preference elicitation methods.[12] The system asks new users to rate how much they enjoy watching various groups of movies (for example, movies with dark humor versus romantic comedies). The preferences recorded by this survey allow the system to make initial recommendations even before the user has rated a large number of movies on the website.
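
To illustrate the item-item idea, the sketch below (in Python) predicts a user's rating of an unseen movie as a similarity-weighted average of that user's ratings for the most similar movies. It is a minimal, self-contained example under simplifying assumptions, not MovieLens' actual implementation; the toy rating matrix, function names, and neighborhood size are illustrative.

    # Minimal item-item collaborative filtering sketch (illustrative only; not
    # MovieLens' production code). Ratings sit in a dense user x item matrix
    # with 0 marking "unrated"; real systems use sparse structures.
    import numpy as np

    def item_similarities(ratings):
        """Cosine similarity between the item columns of a user x item rating matrix."""
        norms = np.linalg.norm(ratings, axis=0)
        norms[norms == 0] = 1.0                      # avoid division by zero
        normalized = ratings / norms
        return normalized.T @ normalized             # item x item similarity matrix

    def predict_rating(ratings, sims, user, item, k=20):
        """Predict a user's rating of `item` from their k most similar rated items."""
        rated = np.nonzero(ratings[user])[0]         # items this user has rated
        rated = rated[rated != item]
        if rated.size == 0:
            return float(ratings[ratings > 0].mean())  # fall back to the global mean
        neighbors = rated[np.argsort(sims[item, rated])[::-1][:k]]
        weights = sims[item, neighbors]
        if weights.sum() == 0:
            return float(ratings[ratings > 0].mean())
        return float(weights @ ratings[user, neighbors] / weights.sum())

    # Toy example: 4 users x 5 movies, ratings on a 1-5 scale (0 = unrated).
    R = np.array([[5, 4, 0, 1, 0],
                  [4, 5, 0, 2, 1],
                  [1, 0, 5, 4, 5],
                  [0, 2, 4, 5, 4]], dtype=float)
    S = item_similarities(R)
    print(round(predict_rating(R, S, user=0, item=2), 2))  # predicted rating for movie 2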

For each user, MovieLens predicts how the user will rate any given movie on the website.[13] Based on these predicted ratings, the system recommends movies that the user is likely to rate highly. The website suggests that users rate as many of the films they have watched as possible, so that the recommendations given will be more accurate, since the system will then have a better sample of the user's film tastes.[3] However, MovieLens' rating incentive approach is not always effective: researchers found that more than 20% of the movies listed in the system have so few ratings that the recommender algorithms cannot make accurate predictions about whether subscribers will like them.[8] The recommendations themselves carry no marketing value, and the large number of movie ratings collected has been offered as a "seed dataset" for other recommender systems.[1]
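
Continuing the illustrative sketch above (again, not MovieLens' own code), turning predicted ratings into recommendations amounts to ranking the movies a user has not yet rated by their predicted rating:

    def recommend(ratings, sims, user, n=10):
        """Rank the user's unrated movies by predicted rating and return up to n of them."""
        unrated = np.nonzero(ratings[user] == 0)[0]
        scores = {item: predict_rating(ratings, sims, user, item) for item in unrated}
        return sorted(scores, key=scores.get, reverse=True)[:n]

    print(recommend(R, S, user=0, n=3))  # indices of the user's top predicted movies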

In addition to movie recommendations, MovieLens also provides information on individual films, such as the list of actors and directors of each film. Users may also submit and rate tags (a form of metadata, such as "based on a book", "too long", or "campy"), which may be used to increase the accuracy of the film recommendation system.[3]

Reception

By September 1997, the website had reached over 50,000 users.[3] When the Akron Beacon Journal's Paula Schleis tried out the website, she was surprised at how accurately it recommended new films for her to watch based on her film tastes.[13]

Outside of the realm of movie recommendations, data from MovieLens has been used by Solution by Simulation to make Oscar predictions.[14]

Research

In 2004, researchers from Carnegie Mellon University, the University of Michigan, the University of Minnesota, and the University of Pittsburgh collaborated to design and test incentives, derived from the social psychology principles of social loafing and goal-setting, on MovieLens users.[8] The researchers observed that under-contribution was a problem for the community and set up a study to determine the most effective way to motivate users to rate and review more films. The study comprised two field experiments: one involved email messages that reminded users of the uniqueness of their contributions and of the benefits that follow from them; the other gave users a range of individual or group goals for contribution.

The first experiment, based on an analysis of the MovieLens community's cumulative response, found that users were more likely to contribute when they were reminded of their uniqueness, which led them to think that their contributions were not duplicates of what other users could provide. Contrary to the researchers' hypothesis, users were less likely to contribute when the benefit they receive from rating, or the benefit others receive when they rate, was made salient to them. Lastly, the researchers found no support for a relationship between uniqueness and benefit.

The second experiment found that users were also more likely to contribute when they were given specific and challenging goals and were led to believe that their contributions were needed to accomplish the group's goal. In this particular context, giving users group-level goals actually increased contributions compared to individual goals, whereas the researchers had predicted the reverse because of social loafing. The relationship between goal difficulty and user contributions, in both the group and individual conditions, gave weak evidence that beyond a certain difficulty threshold performance drops rather than plateauing as hypothesized in Locke and Latham's goal-setting theory.

Datasets

GroupLens Research, a human-computer interaction research lab at the University of Minnesota, provides the rating data sets collected from the MovieLens website for research use. The full data set contains 26,000,000 ratings and 750,000 tag applications applied to 45,000 movies by 270,000 users. It also includes tag genome data with 12 million relevance scores across 1,100 tags (last updated August 2017).[15] Many types of research have been conducted on the MovieLens data sets; for example, Liu et al. used them to test the efficiency of an improved random walk algorithm that depresses the influence of large-degree objects.[16] GroupLens publishes terms of use for the data sets and accepts download requests online.
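
As a brief sketch of how the published files are typically loaded for research, the following assumes the CSV layout of the downloadable releases (a ratings.csv with userId, movieId, rating, and timestamp columns, and a movies.csv with movieId, title, and genres columns) and an illustrative ml-latest/ directory path:

    # Illustrative only: file paths assume the data set has been unpacked into ml-latest/.
    import pandas as pd

    ratings = pd.read_csv("ml-latest/ratings.csv")   # userId, movieId, rating, timestamp
    movies = pd.read_csv("ml-latest/movies.csv")     # movieId, title, genres

    # Per-movie rating counts and means, joined back to the movie titles.
    stats = (ratings.groupby("movieId")["rating"]
             .agg(count="count", mean="mean")
             .reset_index()
             .merge(movies[["movieId", "title"]], on="movieId"))

    print(stats.sort_values("count", ascending=False).head(10))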

References

  1. ^ a b "MovieLens Database available from Technology Commercialization".
  2. ^ a b Schofield, Jack (2003-05-22). "Land of Gnod". The Guardian. London.
  3. ^ a b c d e Ojeda-Zapata, Julio (1997-09-15). "New Site Personalizes Movie Reviews". St. Paul Pioneer Press. p. 3E.
  4. ^ Booth, Michael (2005-01-30). "How do computers know so much about us?". The Denver Post. p. F01.
  5. ^ Lim, Myungeun; Kim, Juntae (2001). "Web Intelligence: Research and Development". Proceedings of the First Asia-Pacific Conference on Web Intelligence: Research and Development. Asia-Pacific Conference on Web Intelligence. Lecture Notes in Computer Science. Vol. 2198/2001. Springer Berlin/Heidelberg. pp. 438–442. doi:10.1007/3-540-45490-X_56. ISBN 978-3-540-42730-8.
  6. ^ Gladwell, Malcolm (October 4, 1999). "Annals of Marketing: The Science of the Sleeper: How the Information Age Could Blow Away the Blockbuster". New Yorker. 75 (29): 48–55. Archived from the original on December 30, 2009. Retrieved 2009-12-29.
  7. ^ Krulwich, Robert (December 10, 1999). "ABC Nightline: Soulmate". ABC.
  8. ^ a b c Beenen, Gerard; Ling, Kimberly; Wang, Xiaoqing; Chang, Klarissa; Frankowski, Dan; Resnick, Paul; Kraut, Robert E. (2004). "Using Social Psychology to Motivate Contributions to Online Communities". CommunityLab: 93–116. CiteSeerX 10.1.1.320.5540.
  9. ^ Harper, F. Maxwell; Konstan, Joseph A. (2015). "The MovieLens Datasets: History and Context". ACM Transactions on Interactive Intelligent Systems. 5 (4). http://files.grouplens.org/papers/harper-tiis2015.pdf
  10. ^ Sarwar, Badrul; Karypis, George; Konstan, Joseph; Riedl, John (2001). "Item-based collaborative filtering recommendation algorithms". Proceedings of the 10th International Conference on World Wide Web. ACM.
  11. ^ Ekstrand, Michael D. (2014). Towards Recommender Engineering Tools and Experiments for Identifying Recommender Differences (PhD dissertation). University of Minnesota.
  12. ^ Chang, Shuo; Harper, F. Maxwell; Terveen, Loren (2015). "Using Groups of Items to Bootstrap New Users in Recommender Systems". Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing. ACM.
  13. ^ a b Schleis, Paula (2000-11-13). "Site Lets Everybody be a Critic". Akron Beacon Journal. p. D2.
  14. ^ Hickey, Walt (February 18, 2016). "Do Your Oscar Predictions Stack Up? Here's What The Data Says". FiveThirtyEight. Retrieved March 8, 2016. http://fivethirtyeight.com/features/oscar-data-model-predictions-2015/
  15. ^ "GroupLens".
  16. ^ Liu, Chuang; Liu, Zhen; Zhang, Zi-Ke; Zhou, Jun-Lin; Fu, Yan; Nie, Da-Cheng (2014). "A personalized recommendation algorithm via biased random walk". 11th International Joint Conference on Computer Science and Software Engineering (JCSSE).