First, the easiest case: we pass the video's ID as a string and get a Video instance back:
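    # assumption: django-syncr installs as the `syncr` package
    # (`django-syncr` is not a valid Python module name)
    from syncr.app.youtube import YoutubeSyncr

    def sync_video_id(id):
        """
        Synchronize a single video by its YouTube ID.
        Takes the YouTube ID as a string. Returns a Video instance,
        or None if the synchronization fails.
        """
        y = YoutubeSyncr()
        try:
            return y.syncVideo(id)
        except Exception:
            # nothing was assigned if syncVideo raised, so return None explicitly
            return None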
That was easy, so let's try something more complicated: synchronizing a video from its URL. We split the URL into components with Python's built-in urlparse module. Because my actual code had to stay compatible with Python 2.4, where parse_qs lives in the cgi module rather than in urlparse (the same function, just stored in a different place), the Django version also uses the cgi module. Here is how it looks step by step in the interpreter. It is fairly self-explanatory, so I won't go through it line by line:
    >>> url = 'http://www.youtube.com/watch?v=mPXxI1uyVEE&feature=rec-LGOUT-exp_fresh+div-1r-3-HM'
    >>> url
    'http://www.youtube.com/watch?v=mPXxI1uyVEE&feature=rec-LGOUT-exp_fresh+div-1r-3-HM'
    >>> import urlparse
    >>> url_data = urlparse.urlparse(url)
    >>> url_data
    ParseResult(scheme='http', netloc='www.youtube.com', path='/watch', params='', query='v=mPXxI1uyVEE&feature=rec-LGOUT-exp_fresh+div-1r-3-HM', fragment='')
    >>> query = urlparse.parse_qs(url_data[4])
    >>> query
    {'feature': ['rec-LGOUT-exp_fresh div-1r-3-HM'], 'v': ['mPXxI1uyVEE']}
    >>> id = query["v"][0]
    >>> id
    'mPXxI1uyVEE'
And this is how we do it in Django:
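    import urlparse

    try:
        from urlparse import parse_qs  # Python 2.6+
    except ImportError:
        from cgi import parse_qs  # Python 2.4 and 2.5

    # assumption: django-syncr installs as the `syncr` package
    from syncr.app.youtube import YoutubeSyncr

    def sync_video_url(url):
        """
        Synchronize a single video by its URL.
        Takes the video URL as a string. Returns a Video object,
        or None if the synchronization fails.
        """
        url_data = urlparse.urlparse(url)
        # index 4 of the parsed result is the query string
        query = parse_qs(url_data[4])
        id = query["v"][0]
        y = YoutubeSyncr()
        try:
            return y.syncVideo(id)
        except Exception:
            return None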
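The URL-parsing step can also be tried on its own, without Django or django-syncr installed. The sketch below is a hypothetical helper (`extract_video_id` is not part of the code in this post) that returns `None` instead of raising when the URL has no `v` parameter. The import fallback is an assumption added so that it runs on both Python 2 and Python 3.

```python
try:
    from urllib.parse import urlparse, parse_qs  # Python 3
except ImportError:
    from urlparse import urlparse, parse_qs  # Python 2.6+


def extract_video_id(url):
    """Return the value of the 'v' query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("v")
    if not values:
        return None
    return values[0]


print(extract_video_id("http://www.youtube.com/watch?v=mPXxI1uyVEE&feature=related"))
print(extract_video_id("http://www.youtube.com/watch"))
```

The first call prints the ID and the second prints `None`, so a caller can check the result before asking the syncr to fetch anything.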
Now that the warm-up is done, it's time for something more complex: syncing the videos of a specific user that match the given tags. First comes the core function that does the work. Because I tried to make it reusable, it uses Django's contenttypes framework to look up the target model by name:
    from django.contrib.contenttypes.models import ContentType
    from django.template.defaultfilters import slugify

    def sync_video_user(parent, model_name, user, tags):
        """
        Sync all videos of a user that match the given tags.
        Takes the parent object, the name of the model that holds videos
        (as a string), a username string and a string of tags.
        Returns the list of synchronized videos. Skips videos that already
        exist and prevents duplicate slugs.
        """
        # first we need the actual model class for the object holding the video
        model_class = ContentType.objects.get(model=model_name).model_class()
        # queryset used to look for duplicates of the video we're currently working on
        queryset = model_class._default_manager.all()
        # turn the string of tags into a properly formatted query-string fragment
        fmt_tags = parse_tags(tags)
        # build the path of the YouTube feed
        feed = get_youtube_feed_url(user, fmt_tags)
        # fetch the feed and sync every video found in it
        sync = search_youtube(feed)
        # temporary instance (note that save() is never called on it),
        # needed so the slug helpers have an object to work with
        instance = model_class()
        for vid in sync:
            # only create an object if this video id is not stored yet
            if not queryset.filter(yt_video_id=vid.video_id):
                slug = slugify(vid.title)
                # check whether the slug is free
                if not try_slug(instance, slug):
                    # if not, use unique_slugify, which we discussed some time ago
                    slug = unique_slugify(instance, slug)
                # create the new object
                new = model_class(
                    parent=parent,
                    slug=slug,
                    name=vid.title,
                    yt_video_id=vid.video_id,
                    video=vid,
                    publication_date=vid.published,
                    active=True,
                )
                new.save()
        # the docstring promises the list of synchronized videos
        return sync
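The slug handling above depends on `try_slug` and `unique_slugify`, which are not shown in this post. As a rough illustration of the idea only, here is a hypothetical pure-Python version that works against a plain set of taken slugs rather than a queryset. The name `make_unique_slug` and the numeric-suffix scheme are assumptions, not the helpers used above.

```python
def make_unique_slug(slug, taken):
    """Return slug unchanged if it is free, otherwise append -2, -3, ... until free."""
    if slug not in taken:
        return slug
    counter = 2
    candidate = "%s-%d" % (slug, counter)
    while candidate in taken:
        counter += 1
        candidate = "%s-%d" % (slug, counter)
    return candidate


taken = set(["my-video", "my-video-2"])
print(make_unique_slug("other-video", taken))  # other-video
print(make_unique_slug("my-video", taken))     # my-video-3
```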
Now that we have the main part of the code, we can look at the smaller pieces:
    import re

    def parse_tags(tags):
        """
        Parse the received string of tags into a list of tags, then
        format them as a query-string fragment for the YouTube search API.
        parse_tags(tags) -> string
        """
        # tags are separated by commas
        tags_list = [tag.strip() for tag in tags.split(',')]
        # inside a multi-word tag, every run of whitespace becomes '+'
        tags_list = [re.sub(r"\s+", "+", tag) for tag in tags_list]
        parse_string = '&category=%s&v=2' % '%2C+'.join(tags_list)
        return parse_string
Not much magic going on here. We split the string of tags on commas, strip each tag, and use a simple regex to replace every run of whitespace inside a tag with "+", so multi-word tags are allowed. We then put all the tags into the query string in the proper form.
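To make the transformation concrete, the sketch below repeats the same steps in a standalone function (`format_tags` is a hypothetical name used only for this example) and prints the result for a sample string.

```python
import re


def format_tags(tags):
    """Split on commas, trim each tag, and join inner whitespace with '+'."""
    cleaned = [re.sub(r"\s+", "+", tag.strip()) for tag in tags.split(",")]
    # '%2C' is a URL-encoded comma; the '+' after it is a URL-encoded space
    return "&category=%s&v=2" % "%2C+".join(cleaned)


print(format_tags("django,  python   web"))
# &category=django%2C+python+web&v=2
```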
The next function appends our tags to the rest of the query string, which contains the YouTube username:
    def get_youtube_feed_url(user, url):
        """
        Add the username to the YouTube feed address.
        """
        feed = '/feeds/api/videos?author=%s&alt=rss' % user
        feed += url
        return feed
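It can help to see the complete address that ends up being requested, which is the host plus this path. The sketch below rebuilds it with plain string formatting so the final shape is visible. The username `someuser` and the tag suffix are made-up sample values, and `build_feed_url` is a hypothetical name.

```python
YOUTUBE_HOST = "http://gdata.youtube.com"  # host used by search_youtube in this post


def build_feed_url(user, tag_suffix):
    """Return the full feed URL for a user plus an already formatted tag suffix."""
    path = "/feeds/api/videos?author=%s&alt=rss" % user
    return YOUTUBE_HOST + path + tag_suffix


print(build_feed_url("someuser", "&category=django%2C+python&v=2"))
# http://gdata.youtube.com/feeds/api/videos?author=someuser&alt=rss&category=django%2C+python&v=2
```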
Finally, once we have the whole query URL, we can perform the search using the YouTube API:
    import urlparse

    import feedparser

    try:
        from urlparse import parse_qs  # Python 2.6+
    except ImportError:
        from cgi import parse_qs  # Python 2.4 and 2.5

    def search_youtube(path):
        """
        Search videos on YouTube.
        If a path is given, connects to the YouTube feed and collects the
        videos matching the query. All of them are then synced.
        """
        if not path:
            return []
        YTSearchFeed = feedparser.parse("http://gdata.youtube.com" + path)
        videos = []
        for yt in YTSearchFeed.entries:
            # the only new part is here: each feed entry carries a lot of data,
            # but we only need its 'link', which stores the URL of the video
            url_data = urlparse.urlparse(yt['link'])
            query = parse_qs(url_data[4])
            videos.append(query["v"][0])
        synched = []
        for video in videos:
            try:
                sync = sync_video_id(video)
            except Exception:
                continue
            # skip videos that could not be synced
            if sync is not None:
                synched.append(sync)
        return synched
And basically we're finished.