Commit Graph

509 Commits

Author SHA1 Message Date
Yen Chi Hsuan 7b2fcbfd4e
[common] Skip TYPE=CLOSED-CAPTIONS lines in m3u8 manifests
According to [1], valid values for TYPE are AUDIO, VIDEO, SUBTITLES
and CLOSED-CAPTIONS. Such a value is found in Anvato master playlists,
though I don't use _extract_m3u8_formats() in the end.

Part of #9522.

[1] https://tools.ietf.org/html/draft-pantos-http-live-streaming-19#section-4.3.4.1
2016-05-21 13:16:28 +08:00
Yen Chi Hsuan 16da9bbc29
[common] Add _m3u8_meta_format() template
For extractors who handle m3u8 manifests by themselves. (eg., AnvatoIE)

Part of #9522
2016-05-21 13:15:28 +08:00
Yen Chi Hsuan ad96b4c8f5
[common] Extract audio formats in SMIL
Found in http://www.cbc.ca/player/play/2657631896

Closes #5156
2016-05-20 19:02:53 +08:00
Sergey M․ ed56f26039
[extractor/common] Improve name extraction for m3u8 formats 2016-05-15 03:34:35 +06:00
Sergey M․ 8a92e51c60
[extractor/common] Relax wording for creator metafield 2016-05-02 21:31:35 +06:00
Yen Chi Hsuan e9c6cdf4a1
[common] Fix format_id construction for HLS 2016-04-29 22:50:16 +08:00
Kagami Hiiragi b24d6336a7 [vlive] Add support for live videos 2016-04-29 14:22:50 +03:00
Yen Chi Hsuan d6712378e7
Merge branch 'akamai_pv' of https://github.com/remitamine/youtube-dl into remitamine-akamai_pv 2016-04-25 21:02:02 +08:00
remitamine fb72ec58ae [extractor/common] do not process f4m manifest that contain akamai playerVerificationChallenge 2016-04-25 13:37:03 +01:00
Yen Chi Hsuan 2c0d9c6217
[extractor/common] Allow empty post data 2016-04-21 13:06:06 +08:00
Sergey M․ 49caf3307f
[extractor/common] Remove irrelevant comment 2016-04-10 17:10:27 +06:00
Sergey M․ bacec0397f [extractor/common] Relax _hidden_inputs 2016-04-08 23:33:45 +06:00
Sergey M․ fb38aa8b53 [extractor/common] Support arbitrary format strings for template based identifiers in mpd manifests (Closes #9119, closes #9120) 2016-04-08 22:48:08 +06:00
Sergey M․ 7a93ab5f3f [extractor/common] Introduce music album metafields 2016-04-07 02:53:53 +06:00
Sergey M․ b507cc925b [extractor/common] Carry long line 2016-04-02 18:49:58 +06:00
Sergey M․ db8ee7ec05 [extractor/common] Fix numeric identifiers conversion in DASH URL templates 2016-04-02 18:48:05 +06:00
remitamine df634be2ed [common] prefer using mime type over ext for smil subtitle extraction
the subtitle ext for http://www.cnet.com/videos/download-amazon-prime-movies-and-tv/
is adb_xml while using the mime type it get tt(application/smptett+xml)
2016-04-01 19:47:49 +01:00
Sergey M․ 41d06b0424 [extractor/common] Improve _request_webpage
* Do not ignore data, headers and query for Requests
* Default values for headers and query switched to dicts since these are used by urllib itself
2016-03-31 22:58:38 +06:00
Sergey M․ b22ca76204 [extractor/common] Filter out unsupported encrypted media for f4m formats (Closes #8573) 2016-03-27 07:42:38 +06:00
Sergey M․ 19dbaeece3 Remove _sort_formats from _extract_*_formats methods
Now _sort_formats should be called explicitly.
_sort_formats has been added to all the necessary places in code.

Closes #8051
2016-03-27 07:03:08 +06:00
Sergey M․ 15707c7e02 [compat] Add compat_urllib_parse_urlencode and eliminate encode_dict
encode_dict functionality has been improved and moved directly into compat_urllib_parse_urlencode
All occurrences of compat_urllib_parse.urlencode throughout the codebase have been replaced by compat_urllib_parse_urlencode

Closes #8974
2016-03-26 01:46:57 +06:00
remitamine 49dea4913b Merge pull request #8513 from remitamine/dash-sort
[extractor/common] fix dash formats sorting
2016-03-15 18:39:50 +01:00
Sergey M․ 0fdbb3322b [extractor/common] Add _parse_f4m_formats routine 2016-03-13 03:16:08 +06:00
remitamine 09f572fbc0 [extractor/common] add transform_source to _download_smil and _extract_smil_formats 2016-03-11 22:37:07 +01:00
remitamine 15bf934de5 Merge pull request #8819 from remitamine/simple-webpage-requests
[extractor/common] simplify using data, headers and query params with _download_* methods
2016-03-11 18:19:43 +01:00
remitamine cdfee16818 [extractor/common] add data, headers and query params to _request_webpage 2016-03-11 18:12:50 +01:00
Yen Chi Hsuan a6c8b75904 [common] Use mimeType to determine file extensions (#8766) 2016-03-11 23:51:42 +08:00
Yen Chi Hsuan 64f08d4ff2 Merge pull request #8766 from yan12125/dash-detect-ext
Detect file extensions of DASH formats from their codecs
2016-03-11 21:40:07 +08:00
Yen Chi Hsuan af7d5a63b2 [common] Document protocol http_dash_segments 2016-03-06 17:47:07 +08:00
Yen Chi Hsuan 2def60c5f3 [common] Use codec2ext for DASH formats (#8764) 2016-03-05 18:18:39 +08:00
Yen Chi Hsuan e9c0cdd389 [jython] Introduce compat_os_name
os.name is always 'java' on Jython
2016-03-03 19:24:24 +08:00
Sergey M․ 7bcd2830dd [extractor/common] Document uploader_url 2016-03-02 23:31:24 +06:00
Sergey M․ 2bc0c46f98 [extractor/common] Document license metafield 2016-03-02 23:06:39 +06:00
Sergey M․ d77ab8e255 Add --mark-watched feature (Closes #5054) 2016-03-01 01:01:33 +06:00
Sergey M․ 9cdffeeb3f [extractor/common] Clarify rationale on media playlist detection 2016-02-27 07:01:11 +06:00
Sergey M․ fbb6edd298 [extractor/common] Properly extract audio only formats in master m3u8 playlists 2016-02-27 06:48:13 +06:00
Sergey M․ f5bdb44443 [extractor/common] Add _remove_duplicate_formats 2016-02-22 01:19:39 +06:00
remitamine cafcf657a4 add more subtitles mime types to mimetype2ext and fix the platform subtitle extraction 2016-02-20 22:02:03 +01:00
Sergey M․ 611c1dd96e [refactor] Single quotes consistency 2016-02-14 15:37:17 +06:00
Sergey M․ d800609c62 [refactor] Do not specify redundant None as second argument in dict.get() 2016-02-14 14:25:04 +06:00
Sergey M․ bb20526b64 [extractor/common] Improve base url construction 2016-02-13 00:13:56 +06:00
remitamine c349456ef6 [extractor/common] strip http urls in smil manifest 2016-02-12 17:38:48 +01:00
remitamine 81e1c4e2fc [extractor/common] remove duplicate rtmp formats in smil manifest 2016-02-11 17:58:48 +01:00
remitamine dd86780596 [extractor/common] fix dash formats sorting 2016-02-11 10:55:50 +01:00
remitamine 154c209e2d [extractor/common] improve dash format ids 2016-02-11 10:33:26 +01:00
remitamine 51e9094f4a [extractor/common] extract youtube dash formats filesize(fixes #8480) 2016-02-09 20:05:39 +01:00
remitamine d413095f7e [extractor/common] remove duplicated formats and subtiles in smil manifests 2016-02-09 17:15:41 +01:00
remitamine 6a3828fddd [common] use float conversion instead of using division from __future__ 2016-02-06 14:27:04 +01:00
remitamine 91cb6b5065 rename _parse_mpd to _parse_mpd_formats and add default value for mpd namespace 2016-02-06 14:03:48 +01:00
remitamine 0826a0b555 [common] sort dash formats 2016-02-06 06:52:48 +01:00
remitamine 255732f0d3 [common] fix segment duration calculation 2016-02-03 23:57:08 +01:00
remitamine 53c269c6fd [common] fix media_template string formating 2016-02-03 23:54:34 +01:00
remitamine 675d001633 [common] skip drm protected dash formats 2016-02-03 18:44:43 +01:00
remitamine d577c79632 [common] ignore ISO 639-2 generic codes 2016-02-03 13:24:07 +01:00
remitamine f14be22816 [common] remove duplicate reference to namespace 2016-02-02 22:02:08 +01:00
remitamine 9c74423510 [common] fix media template regex 2016-02-02 18:30:31 +01:00
remitamine 1bac34556f [common] add a generic support for mpd manifests 2016-02-02 18:09:25 +01:00
Yen Chi Hsuan 2d2fa82d17 [common] Add _extract_dash_manifest_formats 2016-01-30 22:52:23 +08:00
Yen Chi Hsuan c94678957f [common] Remove unused arguments 2016-01-30 22:45:16 +08:00
Yen Chi Hsuan 16f38a699f [common] Rename to namespace
For consistency with _parse_smil_*
2016-01-30 22:40:56 +08:00
Yen Chi Hsuan df374b5222 [common] Prefer the manifest than formats_dict in determining codecs 2016-01-30 21:42:27 +08:00
Yen Chi Hsuan 5ea1eb78f5 [common] Fix for youtube 2016-01-30 21:36:01 +08:00
Yen Chi Hsuan b323e1707d [common] Modify _parse_dash_manifest for use in Facebook 2016-01-30 21:27:43 +08:00
Yen Chi Hsuan 17b598d30c [common] _parse_dash_manifest() from youtube.py 2016-01-30 21:05:55 +08:00
Sergey M․ 350cf045d8 [extractor/common] Restrict checks when auto calculating tbr 2016-01-30 01:47:46 +06:00
remitamine a9d5f12fec Merge pull request #8328 from remitamine/hls-master-detect
[extractor/common] detect media playlist in _extract_m3u8_formats
2016-01-27 18:07:30 +01:00
remitamine 7f32e5dc35 [extractor/common] detect media playlist in _extract_m3u8_formats 2016-01-27 17:53:42 +01:00
Sergey M․ b0d21deda9 [extractor/common] Auto calculate tbr when missing 2016-01-27 21:11:17 +06:00
Yen Chi Hsuan 77f785076f [common] Keep full codec name from m3u8 manifests
See #8293. This is for consistency between YouTube and HLS formats.
2016-01-25 01:03:46 +08:00
Yen Chi Hsuan 0b26ba3fc8 [extractor/common] Allow passing more parameters to _search_json_ld 2016-01-16 20:45:36 +08:00
Sergey M․ 4ca2a3cf3c [extractor/common] Add initial support for JSON-LD metadata extraction into info_dict 2016-01-16 00:36:02 +06:00
Jakub Wilk dfb1b1468c Fix typos
Closes #8200.
2016-01-10 17:24:28 +01:00
Sergey M 3f3343cd3e Merge pull request #8061 from dstftw/introduce-chapter-and-series-fields
Introduce chapter and series fields
2016-01-03 03:54:22 +06:00
Sergey M․ 27bfd4e526 [extractor/common] Introduce number fields for chapters and series 2016-01-01 20:26:56 +06:00
Philipp Hagemeister 32f9036447 [ccc] Add language information to formats 2016-01-01 13:28:45 +01:00
Sergey M․ 7109903e61 [extractor/common] Document chapter and series fields 2015-12-31 03:10:44 +06:00
Sergey M․ 7e5edcfd33 Simplify formats accumulation for f4m/m3u8/smil formats
Now all _extract_*_formats routines return a list
2015-12-29 00:58:24 +06:00
remitamine 39d60b715a Merge pull request #7769 from remitamine/sort
[common] lower (m3u8,rtmp,rtsp) format preference only if required program is not available
2015-12-28 19:15:14 +01:00
remitamine d497a201ca [common] use specific variable for protocol preference in _sort_formats 2015-12-28 18:45:01 +01:00
remitamine 8d29e47f54 [common] simplify the use of _extract_m3u8_formats and _extract_f4m_formats 2015-12-27 15:33:39 +01:00
Sergey M․ 9b9c5355e4 Rename error_to_str to error_to_compat_str 2015-12-20 07:00:39 +06:00
Sergey M․ 7f8b271465 Properly convert errors to strings 2015-12-20 05:27:38 +06:00
Sergey M․ dd85e4d707 [extractor/common] Properly decode error string on python 2 (Closes #1354, closes #3957, closes #4037, closes #6449) 2015-12-20 02:43:50 +06:00
Sergey M․ 62d231c004 [extractor/common] Clarify duration can be float 2015-12-03 20:55:02 +06:00
Sergey M? 5c2266df4b Switch codebase to use sanitized_Request instead of
compat_urllib_request.Request

[downloader/dash] Use sanitized_Request

[downloader/http] Use sanitized_Request

[atresplayer] Use sanitized_Request

[bambuser] Use sanitized_Request

[bliptv] Use sanitized_Request

[brightcove] Use sanitized_Request

[cbs] Use sanitized_Request

[ceskatelevize] Use sanitized_Request

[collegerama] Use sanitized_Request

[extractor/common] Use sanitized_Request

[crunchyroll] Use sanitized_Request

[dailymotion] Use sanitized_Request

[dcn] Use sanitized_Request

[dramafever] Use sanitized_Request

[dumpert] Use sanitized_Request

[eitb] Use sanitized_Request

[escapist] Use sanitized_Request

[everyonesmixtape] Use sanitized_Request

[extremetube] Use sanitized_Request

[facebook] Use sanitized_Request

[fc2] Use sanitized_Request

[flickr] Use sanitized_Request

[4tube] Use sanitized_Request

[gdcvault] Use sanitized_Request

[extractor/generic] Use sanitized_Request

[hearthisat] Use sanitized_Request

[hotnewhiphop] Use sanitized_Request

[hypem] Use sanitized_Request

[iprima] Use sanitized_Request

[ivi] Use sanitized_Request

[keezmovies] Use sanitized_Request

[letv] Use sanitized_Request

[lynda] Use sanitized_Request

[metacafe] Use sanitized_Request

[minhateca] Use sanitized_Request

[miomio] Use sanitized_Request

[meovideo] Use sanitized_Request

[mofosex] Use sanitized_Request

[moniker] Use sanitized_Request

[mooshare] Use sanitized_Request

[movieclips] Use sanitized_Request

[mtv] Use sanitized_Request

[myvideo] Use sanitized_Request

[neteasemusic] Use sanitized_Request

[nfb] Use sanitized_Request

[niconico] Use sanitized_Request

[noco] Use sanitized_Request

[nosvideo] Use sanitized_Request

[novamov] Use sanitized_Request

[nowness] Use sanitized_Request

[nuvid] Use sanitized_Request

[played] Use sanitized_Request

[pluralsight] Use sanitized_Request

[pornhub] Use sanitized_Request

[pornotube] Use sanitized_Request

[primesharetv] Use sanitized_Request

[promptfile] Use sanitized_Request

[qqmusic] Use sanitized_Request

[rtve] Use sanitized_Request

[safari] Use sanitized_Request

[sandia] Use sanitized_Request

[shared] Use sanitized_Request

[sharesix] Use sanitized_Request

[sina] Use sanitized_Request

[smotri] Use sanitized_Request

[sohu] Use sanitized_Request

[spankwire] Use sanitized_Request

[sportdeutschland] Use sanitized_Request

[streamcloud] Use sanitized_Request

[streamcz] Use sanitized_Request

[tapely] Use sanitized_Request

[tube8] Use sanitized_Request

[tubitv] Use sanitized_Request

[twitch] Use sanitized_Request

[twitter] Use sanitized_Request

[udemy] Use sanitized_Request

[vbox7] Use sanitized_Request

[veoh] Use sanitized_Request

[vessel] Use sanitized_Request

[vevo] Use sanitized_Request

[viddler] Use sanitized_Request

[videomega] Use sanitized_Request

[viewvster] Use sanitized_Request

[viki] Use sanitized_Request

[vk] Use sanitized_Request

[vodlocker] Use sanitized_Request

[voicerepublic] Use sanitized_Request

[wistia] Use sanitized_Request

[xfileshare] Use sanitized_Request

[xtube] Use sanitized_Request

[xvideos] Use sanitized_Request

[yandexmusic] Use sanitized_Request

[youku] Use sanitized_Request

[youporn] Use sanitized_Request

[youtube] Use sanitized_Request

[patreon] Use sanitized_Request

[extractor/common] Remove unused import

[nfb] PEP 8
2015-11-23 21:56:23 +06:00
Sergey M․ 019839faaa [extractor/common] Use baseURL from f4m manifest for recursive manifest extraction 2015-11-21 18:01:39 +06:00
Sergey M 30eecc6a04 Merge pull request #7296 from jaimeMF/xml_attrib_unicode
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (…
2015-10-31 18:15:21 +00:00
Sergey M․ dbd82a1d4f [extractor/common] Fix m3u8 extraction on failure 2015-11-01 00:01:34 +06:00
Sergey M․ dc519b5421 [extractor/common] Make ie_key and IE_NAME return unicode string 2015-10-31 23:12:57 +06:00
Jaime Marquínez Ferrándiz 36e6f62cd0 Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (#7178)
Attributes aren't unicode objects, so they couldn't be directly used in info_dict fields (for example '--write-description' doesn't work with bytes).
2015-10-25 20:13:16 +01:00
remitamine 3711304510 [extractor/common] get the redirected m3u8_url in _extract_m3u8_formats 2015-10-24 19:01:54 +06:00
Jaime Marquínez Ferrándiz 865d1fbafc [extractor/common] Remove unused import 2015-10-24 12:39:23 +02:00
Sergey M․ 943a1e24b8 [extractor/common] Use more generic URLError in _is_valid_url 2015-10-24 16:25:04 +06:00
Sergey M․ 02835c6bf4 [extractor/common] Document repost_count 2015-10-18 09:34:54 +06:00
Sergey M․ 448ef1f31c [extractor/common] Allow angle brackets in attributes in _og_regexes (#7215) 2015-10-18 09:11:02 +06:00
Sergey M․ 7a6d76a64d [extractor/common] Require closing quote in _og_regexes (Closes #7174)
E.g. do not match `property='og:video:type'` when `og:video` is requested.
2015-10-14 20:49:39 +06:00
Sergey M․ 4180a3d8b7 [extractor/common] Allow quoteless content attribute in og regexes (Closes #7115) 2015-10-10 01:46:01 +06:00
Yen Chi Hsuan 57935b2564 [extractor/common] Allow HTML5 unquoted attribute values
Fixes #7108

HTML5 allows unquoted attribute values. See the "Unquoted attribute value
syntax" section [1] for more information

[1] http://www.w3.org/TR/html5/syntax.html
2015-10-09 14:11:00 +08:00
Sergey M․ 4bba371644 [YoutubeDL] Autocalculate ext for subtitles when missing 2015-10-04 20:42:26 +06:00
Sergey M․ e5851b963a [extractor/common] Make f4m extraction for SMIL non fatal 2015-10-01 23:04:56 +06:00
Sergey M․ 4de6131090 [extractor/common] Add fatal to _extract_f4m_formats 2015-10-01 23:03:31 +06:00
Sergey M․ 3a1341a7bc [extractor/common] Make m3u8 extraction for SMIL non fatal 2015-10-01 22:59:20 +06:00
Sergey M․ c78e48177c [extractor/common] Check validity of direct URLs 2015-10-01 22:54:54 +06:00
Sergey M․ 647eab4541 [extractor/common] Extract upload date from SMIL 2015-10-01 22:20:28 +06:00
Sergey M․ 1e5bcdec02 [extractor/common] Extract images from SMIL 2015-10-01 22:20:21 +06:00
Sergey M․ e7d8e98a9f [extractor/common] Allow float bitrates 2015-10-01 22:20:15 +06:00
Sergey M․ 8aab976bbd [extractor/common] Document release_date field 2015-09-26 21:07:54 +06:00
Sergey M․ c430802e32 [extractor/common] Add raise_geo_restricted 2015-09-22 21:50:20 +06:00
Sergey M․ 586f1cc532 [extractor/common] Skip html comment tags (Closes #6822) 2015-09-11 21:07:32 +06:00
Sergey M․ 73eb13dfc7 [extractor/common] Case insensitive inputs extraction 2015-09-11 20:43:05 +06:00
Sergey M․ be0e5dbd83 [extractor/common] Extract submit inputs 2015-09-06 07:20:47 +06:00
Sergey M․ 43e7d3c945 [extractor/common] Add raise_login_required 2015-08-26 21:24:47 +06:00
Jaime Marquínez Ferrándiz 8c97f81943 [common] Follow convention of using 'cls' in classmethods 2015-08-21 11:35:51 +02:00
Yen Chi Hsuan f738dd7b7c [common] Remove debugging codes 2015-08-21 01:43:22 +08:00
Yen Chi Hsuan 912e0b7e46 [common] Add _merge_subtitles() 2015-08-21 01:37:07 +08:00
Yen Chi Hsuan 03bc7237ad [common] _parse_smil_subtitles: accept `lang` as the subtitle language 2015-08-20 23:18:58 +08:00
Sergey M․ 5cdefc4625 [extractor/common] Add more subtitle mime types for guess when ext is missing 2015-08-20 01:02:50 +06:00
Sergey M․ ce00af8767 [extractor/common] Add default subtitles lang 2015-08-20 00:56:17 +06:00
Yen Chi Hsuan f877c6ae5a [theplatform] Use InfoExtractor._parse_smil_formats() 2015-08-19 23:11:25 +08:00
Sergey M․ e64b756943 [extractor/common] Interactive TFA code input 2015-08-15 21:55:07 +06:00
Sergey M․ 201ea3ee8e [extractor/common] Improve _hidden_inputs 2015-08-15 21:52:22 +06:00
Sergey M․ 8b9848ac56 [extractor/common] Expand meta regex 2015-08-15 15:58:30 +06:00
Sergey M․ 942acef594 [extractor/common] Extract _parse_xspf 2015-08-09 19:41:55 +06:00
Sergey M․ 98044462b1 [extractor/common] Use playlist id as default title 2015-08-09 19:18:50 +06:00
Sergey M․ e0b9d78fab [extractor/common] Clarify playlists can have description field 2015-08-09 19:09:50 +06:00
Sergey M․ 8d6765cf48 [extractor/generic] Add generic support for xspf playist extraction 2015-08-09 19:07:18 +06:00
Sergey M. d5d7bdaeb5 Merge pull request #6428 from dstftw/improve-generic-smil-support
Improve generic SMIL support
2015-08-08 05:47:33 +06:00
Sergey M․ 5b0c40da24 [extractor/common] Expand meta regex 2015-08-08 03:36:29 +06:00
Sergey M․ 17712eeb19 [extractor/common] Extract namespace parse routine 2015-08-02 01:31:17 +06:00
Sergey M․ 41c3a5a7be [extractor/common] Fix python 3 2015-08-02 01:20:49 +06:00
Sergey M․ a107193e4b [extractor/common] Extract f4m and m3u8 formats, subtitles and info 2015-08-02 01:13:21 +06:00
remitamine 799207e838 [viewster] extract the api auth token
Closes #6406.
2015-07-30 12:55:48 +02:00
Sergey M․ 864f24bd2c [extractor/common] Add _meta_regex and clarify tags field 2015-07-29 03:43:03 +06:00
Purdea Andrei 5316bf7487 Documented tags as a possible dict key 2015-07-28 18:30:42 +03:00
Sergey M․ 10952eb2cf [extractor/common] Consistent URL spelling 2015-07-23 23:37:45 +06:00
Jaime Marquínez Ferrándiz 297a564bee [youtube] Extract end_time 2015-07-23 13:20:21 +02:00
Jaime Marquínez Ferrándiz 7c80519cbf [youtube] Extract start_time
From the 't=*' in the url.
Currently youtube-dl doesn't use the value, but it was requested for the mpv plugin.
2015-07-20 21:10:28 +02:00
Sergey M․ 74fe23ec35 [extractor/common] Style 2015-07-18 16:35:28 +06:00
Yen Chi Hsuan a38436e889 [extractor/common] Add 'transform_source' parameter to _extract_f4m_formats() 2015-07-17 12:02:49 +08:00
Sergey M․ 31c746e5dc [extractor/common] Keep going in some media_url is missing 2015-07-16 01:25:33 +06:00
Sergey M․ 70f0f5a8ca [extractor/common] Recursively extract child f4m manifests 2015-07-16 01:15:15 +06:00
Sergey M․ cc357c4db8 [extractor/common] Properly handle full URLs 2015-07-16 01:14:52 +06:00
Sergey M․ 97f4aecfc1 [extractor/common] Handle malformed f4m manifests 2015-07-16 01:14:08 +06:00
Sergey M․ cf61d96df0 [extractor/common] Add _form_hidden_inputs 2015-07-14 22:38:10 +06:00
Sergey M․ f8da79f828 [extractor/common] Improve _form_hidden_inputs and rename to _hidden_inputs 2015-07-14 22:36:30 +06:00
Sergey M․ 27713812a0 [extractor/common] Add method for extracting form hidden input fields as dict 2015-07-10 21:49:09 +06:00
Yen Chi Hsuan 13af92fdc4 [common] Add 'fatal' to _extract_m3u8_formats 2015-07-06 08:39:38 +08:00
Sergey M․ 5414623791 [extractor/common] Remove superfluous line 2015-06-29 00:49:19 +06:00
Sergey M․ c342041fba [extractor/common] Use NO_DEFAULT from utils 2015-06-28 22:56:45 +06:00
Yen Chi Hsuan 621ed9f5f4 [common] Add note and errnote field for _extract_m3u8_formats 2015-06-07 16:33:22 +08:00
Sergey M․ baa43cbaf0 [extractor/common] Relax valid url check verbosity 2015-05-17 02:59:35 +06:00
Yen Chi Hsuan c1c924abfe [utils,common] Merge format_srt_time and _subtitles_timecode
format_srt_time uses a comma as the delimiter between seconds and
milliseconds while _subtitles_timecode uses a dot. All .srt examples I
found on the Internet uses a comma, so I use a comma in the merged
version. See http://matroska.org/technical/specs/subtitles/srt.html and
http://devel.aegisub.org/wiki/SubtitleFormats/SRT
2015-05-12 13:04:54 +08:00
Yen Chi Hsuan 05d5392cda [common] Ignore subtitles in m3u8 2015-05-07 18:06:22 +08:00
Sergey M․ 74f728249f [extractor/common] Fallback to empty string for (yet) missing `format_id` in `_sort_formats` (Closes #5624) 2015-05-06 21:24:24 +06:00
Jaime Marquínez Ferrándiz 2ddcd88129 Remove code that was only used by the Grooveshark extractor 2015-05-02 17:29:56 +02:00
zouhair cf0649f8b7 Typo: twice "the the" to "the" 2015-04-29 11:03:10 -04:00
Sergey M․ 3ded7bac16 [extractor/common] Add ability to specify custom field preference for `_sort_formats` 2015-04-20 21:13:31 +06:00
Jaime Marquínez Ferrándiz 08f2a92c9c InfoExtractor._search_regex: Suggest updating when the regex is not found (suggested in #5442)
Reuse the same message from ExtractorError
2015-04-17 14:55:24 +02:00
Yen Chi Hsuan c9a779695d [extractor/common] Add the encoding parameter
The QQMusic info extractor need forced encoding for correct working.
2015-04-16 17:34:54 +08:00
Sergey M․ 830d53bfae [utils] Add `video_title` for `url_result` 2015-04-12 23:11:47 +06:00
Sergey M․ e21a55abcc [extractor/common] Remove f4m section
It's now provided by `f4m_id`
2015-04-04 23:05:25 +06:00
Sergey M․ 4a34f69ea6 [extractor/common] Add subtitles timecode formatter 2015-03-13 21:38:28 +06:00
Sergey M․ f207019ce5 [extractor/common] Remove 'm3u8' from quality selection URL 2015-03-06 22:53:53 +06:00
Sergey M․ 8dc9d361c2 [extractor/common] Fix format_id when `last_media` is None and always include `m3u8_id` if present
The rationale behind `m3u8_id` was to resolve duplicates when processing several m3u8 playlists within the same media that give equal resulting `format_id`'s,
e.g. `youtube-dl http://www.rts.ch/play/tv/passe-moi-les-jumelles/video/la-fee-des-bois-mustang-les-chemins-du-vent?id=3854925 -F`
2015-03-06 22:52:50 +06:00
Philipp Hagemeister a0bb7c5593 [extractor/common] Improve m3u format IDs (#5143) 2015-03-06 10:49:42 +01:00
Sergey M․ 2f0f6578c3 [extractor/common] Assume non HTTP(S) URLs valid 2015-03-02 22:38:44 +06:00
Philipp Hagemeister 72a406e7aa [extractor/common] Pass in video_id (#5057) 2015-02-26 01:35:43 +01:00
Antti Ajanki 6f4ba54079 [extractor/common] Extract HTTP (possibly f4m) URLs from a .smil file 2015-02-24 21:22:59 +02:00
Antti Ajanki 637570326b [extractor/common] Extract the first of a seq of videos in a .smil file 2015-02-24 21:22:59 +02:00
Jaime Marquínez Ferrándiz bfc993cc91 Merge branch 'subtitles-rework'
(Closes PR #4964)
2015-02-23 17:13:03 +01:00
Sergey M․ 9fe6ef7ab2 [extractor/common] Fix preference for m3u8 quality selection URL 2015-02-23 03:30:10 +06:00
Philipp Hagemeister 8fb3ac3649 PEP8: W503 2015-02-21 14:55:13 +01:00
Philipp Hagemeister 77b2986b5b [extractor/common] Recognize Indian censorship (#5021) 2015-02-21 14:51:07 +01:00
Jaime Marquínez Ferrándiz 9868ea4936 [extractor/common] Simplify subtitles handling methods
Initially I was going to use a single method for handling both subtitles and automatic captions, that's why I used the 'list_subtitles' and the 'subtitles' variables.
2015-02-17 22:16:29 +01:00
Philipp Hagemeister fa15607773 PEP8 fixes 2015-02-17 21:46:20 +01:00
Jaime Marquínez Ferrándiz 4cd95bcbc3 [twitch:stream] Prefer the 'source' format (fixes #4972) 2015-02-17 18:57:01 +01:00
Sergey M? 4069766c52 [extractor/common] Test URLs with GET 2015-02-17 22:35:27 +06:00
Jaime Marquínez Ferrándiz 360e1ca5cc [youtube] Convert to new subtitles system
The automatic captions are stored in the 'automactic_captions' field, which is used if no normal subtitles are found for an specific language.
2015-02-16 22:47:39 +01:00
Jaime Marquínez Ferrándiz c84dd8a90d [YoutubeDL] store the subtitles to download in the 'requested_subtitles' field
We need to keep the orginal subtitles information, so that the '--load-info' option can be used to list or select the subtitles again.
We'll also be able to have a separate field for storing the automatic captions info.
2015-02-16 21:51:08 +01:00
Jaime Marquínez Ferrándiz a504ced097 Improve subtitles support
For each language the extractor builds a list with the available formats sorted (like for video formats), then YoutubeDL selects one of them using the '--sub-format' option which now allows giving the format preferences (for example 'ass/srt/best').
For each format the 'url' field can be set so that we only download the contents if needed, or if the contents needs to be processed (like in crunchyroll) the 'data' field can be used.

The reasons for this change are:
* We weren't checking that the format given with '--sub-format' was available, checking it in each extractor would be repetitive.
* It allows to easily support giving a format preference.
* The subtitles were automatically downloaded in the extractor, but I think that if you use for example the '--dump-json' option you want to finish as fast as possible.

Currently only the ted extractor has been updated, but the old system still works.
2015-02-16 21:51:03 +01:00
Philipp Hagemeister 03cd72b007 [extractor/common] Move up filesize
filesize and tbr should correlate, so it doesn't make sense to treat them differently.
2015-02-16 04:39:22 +01:00
Jaime Marquínez Ferrándiz 6ca7732d5e [extractor/common] Fix link to external documentation 2015-02-14 22:20:24 +01:00
Jaime Marquínez Ferrándiz 2d30521ab9 [youtube] Extract average rating (closes #2362) 2015-02-11 18:39:31 +01:00
Philipp Hagemeister 9650885be9 [escapist] Filter video differently (Fixes #4919) 2015-02-10 15:55:51 +01:00
Philipp Hagemeister 7e5db8c930 [options] Add --no-color 2015-02-10 04:22:10 +01:00
Philipp Hagemeister 3a5bcd0326 [extractor/common] Wrap extractor errors (Fixes #1194)
For now, we just wrap some common errors. More may follow. We do not want to catch actual programming errors in the extractors, such as 1 // 0.
2015-02-10 01:17:23 +01:00
Naglis Jonaitis 69319969de [extractor/common] Add new helper method _family_friendly_search 2015-02-08 17:39:00 +02:00
Philipp Hagemeister 1e1896f2de [extractor/common] Correct sort order.
We should look at height and width before ext_preference.
2015-02-06 15:16:45 +01:00
Sergey M․ 3900eec27c [extractor/common] Fix 2.0 manifest extraction (Closes #4830) 2015-02-06 04:29:29 +06:00
Sergey M․ 60ca389c64 [extractor/common] Prefix f4m/m3u8 entries with identifier 2015-02-05 22:16:27 +06:00
Philipp Hagemeister 9bb8e0a3f9 [wsj] Add new extractor (Fixes #4854) 2015-02-03 10:58:28 +01:00
Philipp Hagemeister 1a6373ef39 [sort_formats] Prefer bitrate over video size
720p @ 1000KB/s looks way better than 1080p @ 500KB/s
2015-02-03 10:53:07 +01:00
Philipp Hagemeister 995029a142 [nerdist] Add new extractor (Fixes #4851) 2015-02-02 23:38:35 +01:00
Philipp Hagemeister b04b885271 [extractor/common] Document all protocol values 2015-01-30 15:53:16 +01:00
Sergey M․ 96a53167fa [common] Generalize URLs' HTTP errors pre-testing 2015-01-26 00:32:31 +06:00
Philipp Hagemeister 3dee7826e7 [rtl2] PEP8, simplify, make rtmp tests run (#470) 2015-01-25 18:09:48 +01:00
Philipp Hagemeister cfb56d1af3 Add --list-thumbnails 2015-01-25 02:43:19 +01:00
Jaime Marquínez Ferrándiz e1554a407d [extractors] Use http_headers for setting the User-Agent and the Referer 2015-01-24 18:23:53 +01:00
Philipp Hagemeister 121c09c7be Merge remote-tracking branch 'Dineshs91/f4m-2.0' 2015-01-10 17:51:52 +01:00
Philipp Hagemeister 6271f1cad9 [youtube|ffmpeg] Automatically correct video with non-square pixels (Fixes #4674) 2015-01-10 05:45:51 +01:00