I Tube, You Tube, Everybody Tubes: Analyzing the World's Largest User Generated Content Video System
Meeyoung Cha, Haewoon Kwak, Pablo Rodriguez, Yong-Yeol Ahn, Sue Moon
Proc. ACM Internet Measurement Conference (IMC), San Diego, CA, October 2007
User Generated Content (UGC) is re-shaping the way people
watch video and TV, with millions of video producers and
consumers. In particular, UGC sites are creating new viewing
patterns and social interactions, empowering users to be
more creative, and developing new business opportunities.
To better understand the impact of UGC systems, we have
analyzed YouTube, the world's largest UGC VoD system.
Based on a large amount of data collected, we provide an
in-depth study of YouTube and other similar UGC systems.
In particular, we study the popularity life-cycle of videos,
the intrinsic statistical properties of requests and their
relationship with video age, and the level of content aliasing
or of illegal content in the system. We also provide insights
on the potential for more efficient UGC VoD systems (e.g.,
utilizing P2P techniques or making better use of caching).
Finally, we discuss the opportunities to leverage the latent
demand for niche videos that are not reached today due to
information filtering effects or other system scarcity distortions.
Overall, we believe that the results presented in this
paper are crucial in understanding UGC systems and can
provide valuable information to ISPs, site administrators,
and content owners with major commercial and technical
implications.
[PDF (1,343KB)]
@inproceedings{imc2007cha,
author = "Meeyoung Cha and Haewoon Kwak and Pablo Rodriguez and Yong-Yeol Ahn and Sue Moon",
title = "{I Tube, You Tube, Everybody Tubes: Analyzing the World's Largest User Generated Content Video System}",
booktitle = {ACM Internet Measurement Conference},
year = {2007},
month = {October}
}
Data
We share our traces of user-generated videos for use by the wider community.
Our traces include meta-information about videos from
YouTube
and Daum services.
We provide a snapshot of all videos in some of their video categories.
For more information on the traces,
please refer to our paper.
If you have a publication using our trace, please let us know by email at haewoon ATT an.kaist.ac.kr.
YouTube Entertainment Category
Format: url | length | views | ratings | stars
Example: /watch?v=abc|01:30|100|5|4.0
Description:
This trace provides meta-information on all the videos
in the Entertainment category.
Each line represents a single video.
The example above indicates that
the length of YouTube video
http://www.youtube.com/watch?v=abc is 1:30 or 90 seconds and
this video was viewed 100 times. Five users rated this video,
and the average rating was 4.0.
Please note that there may be empty fields in our traces.
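As an illustration of the format above, the following is a minimal parsing sketch, not an official tool for the trace. The names EntVideo, parse_length, and parse_ent_line are hypothetical. It assumes that numeric fields contain plain digits, that lengths longer than an hour (if any) are written as HH:MM:SS, and that empty fields should be read as missing values.

```python
from typing import Callable, NamedTuple, Optional


class EntVideo(NamedTuple):
    """One line of the Entertainment trace (hypothetical record type)."""
    url: str
    length_seconds: Optional[int]
    views: Optional[int]
    ratings: Optional[int]
    stars: Optional[float]


def parse_length(text: str) -> Optional[int]:
    """Convert 'MM:SS' to seconds; 'HH:MM:SS' is handled as an assumption."""
    text = text.strip()
    if not text:
        return None  # empty field
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def _optional(text: str, cast: Callable):
    """Apply cast to a non-empty field; return None for an empty one."""
    text = text.strip()
    return cast(text) if text else None


def parse_ent_line(line: str) -> EntVideo:
    """Split a 'url|length|views|ratings|stars' line into typed fields."""
    fields = line.rstrip("\n").split("|")
    fields += [""] * (5 - len(fields))  # pad if trailing fields are missing
    url, length, views, ratings, stars = fields[:5]
    return EntVideo(
        url=url.strip(),
        length_seconds=parse_length(length),
        views=_optional(views, int),
        ratings=_optional(ratings, int),
        stars=_optional(stars, float),
    )


video = parse_ent_line("/watch?v=abc|01:30|100|5|4.0")
print(video.length_seconds, video.views, video.ratings, video.stars)
```

Returning None for empty fields keeps videos with missing values in the data set instead of silently treating them as zero.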
Download YouTube Ent Trace
(collected on December 21, 2006, number of videos = 1,687,506)
YouTube Science & Technology Category
Format: url | length | views1 | ratings1 | user_id | upload_date | views2 | comments2 | favorited2 | ratings2 | stars2 | honors2 | links2 | related2
Example: watch?v=abcd1234567|01:30|100|5|mia|January 16, 2007|200|10|10|10|4.0|5|10 https://www.myspace.com/::13 https://www.blogspot.com|/watch?v=a /watch?v=b
Description:
This trace provides meta-information on all the videos
in the Science & Technology category. This category is now called "Howto & DIY."
The example above indicates that
video http://www.youtube.com/watch?v=abcd1234567, uploaded by user ID
mia, has a length of 1:30 or 90 seconds.
The views1 and ratings1 fields give the number of views and ratings
collected on January 15, 2007, which in this example are 100 and 5, respectively.
We collected information for the same set of videos again one month later.
The views2, comments2, ..., related2 fields give
the number of views, comments, favorites, ratings, stars, honors,
linking pages and their clicks, and related videos,
collected on February 14, 2007.
Please note that deleted videos appear with empty fields in our trace.
Linking pages are shown as tuples of the form
"clicks page_url", concatenated by the :: sign. The example above indicates that 10 clicks came from the myspace.com web site and 13 clicks
from the blogspot.com web site. Finally, related2 shows the list of related videos
selected by YouTube. Note that both the linking pages and the related videos in our
traces are based on the information shown on the front page of the corresponding video (i.e., there may be other linking pages and related videos).
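The links2 and related2 fields are the only nested ones in this format, so a short sketch of how they might be unpacked may help. The function names parse_links and parse_related are hypothetical, and the code assumes that related videos are separated by whitespace (as in the example) and that no field contains a literal | character.

```python
from typing import List, Tuple


def parse_links(field: str) -> List[Tuple[int, str]]:
    """Split a links2 field into (clicks, page_url) tuples.

    Tuples are separated by '::'; inside a tuple the click count and the
    URL are separated by a single space.
    """
    links = []
    for item in field.split("::"):
        item = item.strip()
        if not item:
            continue  # empty field, e.g. for a deleted video
        clicks, _, page_url = item.partition(" ")
        links.append((int(clicks), page_url.strip()))
    return links


def parse_related(field: str) -> List[str]:
    """Split a related2 field into a list of video URLs (whitespace-separated)."""
    return field.split()


line = (
    "watch?v=abcd1234567|01:30|100|5|mia|January 16, 2007|200|10|10|10|4.0|5|"
    "10 https://www.myspace.com/::13 https://www.blogspot.com|/watch?v=a /watch?v=b"
)
fields = line.split("|")

links = parse_links(fields[12])      # links2 is the 13th field
related = parse_related(fields[13])  # related2 is the 14th field
view_growth = int(fields[6]) - int(fields[2])  # views2 - views1

print(links)
print(related)
print(view_growth)
```

For the example line, this gives two linking pages with 10 and 13 clicks, two related videos, and 100 additional views between the two snapshots.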
Download YouTube Sci Trace
(collected on January 15 / February 14, 2007, number of videos
= 252,255)
Daum Food and Travel Categories
Format: video_id | upload_date | length | user_id | recommended | views
Example: /ClipView.do?clipid=994690&type=chal|06.11.03|322|80757|3|267
Description:
Each line includes the meta-information of a video.
The example above indicates that the Daum video
with URL /ClipView.do?clipid=994690&type=chal, uploaded by user 80757,
has a length of 322 seconds (or 5:22).
The views and recommended fields show the number of views and recommendations for the corresponding video,
collected on April 12, 2007, which in this example are 267 and 3, respectively.
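A Daum trace line can be read in the same spirit. The sketch below uses hypothetical names (DaumVideo, parse_daum_line, format_length) and assumes all six fields are present. The upload_date field is kept as raw text because its date layout is not stated in the description above.

```python
from typing import NamedTuple


class DaumVideo(NamedTuple):
    """One line of a Daum trace (hypothetical record type)."""
    video_id: str
    upload_date: str  # kept as raw text; the date layout is not specified
    length_seconds: int
    user_id: str
    recommended: int
    views: int


def parse_daum_line(line: str) -> DaumVideo:
    """Split 'video_id|upload_date|length|user_id|recommended|views'."""
    video_id, upload_date, length, user_id, recommended, views = (
        line.rstrip("\n").split("|")
    )
    return DaumVideo(
        video_id=video_id,
        upload_date=upload_date,
        length_seconds=int(length),
        user_id=user_id,
        recommended=int(recommended),
        views=int(views),
    )


def format_length(seconds: int) -> str:
    """Render a length in seconds as M:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


clip = parse_daum_line("/ClipView.do?clipid=994690&type=chal|06.11.03|322|80757|3|267")
print(format_length(clip.length_seconds), clip.recommended, clip.views)
```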
Download Daum Food Trace
(collected on April 3, 2007, number of videos = 1,393)
Download Daum Travel Trace
(collected on April 12, 2007, number of videos = 9,295)
Contact
Meeyoung Cha (meeyoung.cha ATT gmail.com)
Haewoon Kwak (haewoon ATT an.kaist.ac.kr)