Hi!
I'm trying to write a script that downloads the latest newscast for me. I download the html page with wget but I need help to extract the video stream url from there.
First I download the page with wget:
wget svtplay.se/t/102534/aktuellt
Then, in the HTML file that I just downloaded, there's a link to an rtmp:// stream. It looks like this:
simon ~
> cat aktuellt | grep rtmp
<param name="flashvars" value="dynamicStreams=url:rtmp://fl11.c91005.cdn.qbrick.com/91005/_definst_/kluster/20101017/1136333-1017AKTUELLT2100-PLAY-mp4-c-v1.mp4,bitrate:850|url:rtmp://fl11.c91005.cdn.qbrick.com/91005/_definst_/kluster/20101017/1136333-1017AKTUELLT2100-PLAY-mp4-b-v1,bitrate:320&background=http://media.svt.se/download/mcc/flash/20101017/1136333-1017AKTUELLT2100-PLAY/1136333-1017AKTUELLT2100-PLAY_start_0.jpg&urlinmail=http://svtplay.se/v/2196528/aktuellt/17_10_21_00&liveStart=&length=841&noemail=true&noembed=true&autostart=true&buffertime=2.0&a=2196528&expression=full&startpos=0&expired=false&statisticsUrl=http://ld.svt.se/svt/svt/s?svt-play.Nyheter.Hela-program.17-10-21%3A00.2196528&client=svt-play&folderStructure=Aktuellt.Hela+program.Hela+program&category=Nyheter&title=17%2F10+21%3A00&broadcastDate=20101017" />
<param name="flashvars" value="dynamicStreams=url:rtmp://fl11.c91005.cdn.qbrick.com/91005/_definst_/kluster/20101017/1136333-1017AKTUELLT2100-PLAY-mp4-c-v1.mp4,bitrate:850|url:rtmp://fl11.c91005.cdn.qbrick.com/91005/_definst_/kluster/20101017/1136333-1017AKTUELLT2100-PLAY-mp4-b-v1,bitrate:320&background=http://media.svt.se/download/mcc/flash/20101017/1136333-1017AKTUELLT2100-PLAY/1136333-1017AKTUELLT2100-PLAY_start_0.jpg&urlinmail=http://svtplay.se/v/2196528/aktuellt/17_10_21_00&liveStart=&length=841&noemail=true&noembed=true&autostart=true&buffertime=2.0&a=2196528&expression=full&startpos=0&expired=false&statisticsUrl=http://ld.svt.se/svt/svt/s?svt-play.Nyheter.Hela-program.17-10-21%3A00.2196528&client=svt-play&folderStructure=Aktuellt.Hela+program.Hela+program&category=Nyheter&title=17%2F10+21%3A00&broadcastDate=20101017" />
<div class="external-player">Länk för extern spelare: <a class="external-player" href="rtmp://fl11.c91005.cdn.qbrick.com/91005/_definst_/kluster/20101017/1136333-1017AKTUELLT2100-PLAY-mp4-c-v1.mp4">Flash (rtmp)</a></div>
I'd like to extract the first rtmp:// link with sed (or awk, or whatever works; I don't know which), in this case rtmp://fl11.c91005.cdn.qbrick.com/91005/_definst_/kluster/20101017/1136333-1017AKTUELLT2100-PLAY-mp4-c-v1.mp4. The URL changes every day, so I need something that prints everything from the first occurrence of rtmp:// up to .mp4. I'm guessing it's possible, I just don't have any idea how to accomplish it.
I'd appreciate some help with this.
Simon.
sed is the wrong tool to start with. You'll want something that understands HTML to pull out the attribute first, and only then sed for the final cut. I had success using xmllint:
xmllint --html --xpath '//param[@name="flashvars"]/@value' <(curl -s svtplay.se/t/102534/aktuellt) 2>/dev/null | sed -n 's/.*url:\(rtmp:.*\.mp4\).*/\1/p'
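If you want to try the last step on its own, here is a minimal sketch that works on text from stdin instead of the live page. It swaps the sed substitution for grep -o, which is a GNU/BSD extension rather than POSIX, so treat it as an assumption about your grep. The helper name extract_rtmp and the shortened sample string are made up for illustration.

```shell
#!/bin/sh
# Sketch: print the first rtmp:// URL ending in .mp4 found on stdin.
# extract_rtmp is a made-up helper name; grep -o is a GNU/BSD extension.
extract_rtmp() {
    # [^,|"]* cannot cross a comma, pipe or double quote, so a match
    # never runs from one stream URL into the next.
    grep -o 'rtmp://[^,|"]*\.mp4' | head -n 1
}

# Shortened sample of the flashvars value shown in the first post.
sample='value="dynamicStreams=url:rtmp://fl11.c91005.cdn.qbrick.com/91005/_definst_/kluster/20101017/1136333-1017AKTUELLT2100-PLAY-mp4-c-v1.mp4,bitrate:850|url:rtmp://fl11.c91005.cdn.qbrick.com/91005/_definst_/kluster/20101017/1136333-1017AKTUELLT2100-PLAY-mp4-b-v1,bitrate:320"'

printf '%s\n' "$sample" | extract_rtmp
```

On that sample it should print only the first URL, the one ending in -c-v1.mp4, because the second stream URL has no .mp4 suffix. The head -n 1 is there in case the page yields more than one matching line.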
Using falconindy's line, the whole thing can look e.g. like this:
rtmpdump -e -r $(xmllint --html --xpath '//param[@name="flashvars"]/@value' <(curl -s svtplay.se/t/102534/aktuellt) 2>/dev/null | sed -n 's/.*url:\(rtmp:.*\.mp4\).*/\1/p') -o $(date "+%F").mp4
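One thing the one-liner does not handle is the case where no URL comes back, for example if the page layout changes; rtmpdump then gets called with an empty -r argument. A small wrapper could guard against that. This is only a sketch: download_stream and the RTMPDUMP override variable are names invented here, not part of rtmpdump.

```shell
#!/bin/sh
# Sketch: refuse to call rtmpdump when no stream URL was extracted.
# download_stream and RTMPDUMP are names made up for this example.
download_stream() {
    url=$1
    if [ -z "$url" ]; then
        echo "no rtmp:// URL found, the page layout may have changed" >&2
        return 1
    fi
    # Quote the URL and file name so each reaches rtmpdump as one argument.
    # -e resumes a partial download, as in the one-liner above.
    "${RTMPDUMP:-rtmpdump}" -e -r "$url" -o "$(date +%F).mp4"
}
```

You would call it as download_stream "$(xmllint ... | sed -n ...)" with the same pipeline as above inside the quotes. Setting RTMPDUMP to another command lets you do a dry run without touching the network.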