Wednesday, March 17, 2021

How to programmatically download an m3u8 video referenced in a blob in Python?

Note that this question differs from "How do we download a blob url video [closed]" in that it requires no human interaction with the browser.

I have the following problem:

  • I have a list of URLs. They point to HTML pages that have the same underlying structure.
  • There's an image in the middle of the page; when it's clicked, it loads a player.
  • The player's source is a blob URL that references an m3u8 playlist, though the playlist is not visible in the HTML itself (it's visible in the Network tab of Chrome).
  • The player streams a short video.

What I need to do:

  • Programmatically access each URL, get the HTML, and click the image to load the player.
  • Get the blob reference and use it to get the m3u8 playlist.
  • Download the stream as a video (bonus points for downloading it as a gif).
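
For the last step, a minimal sketch of turning a playlist into a file, assuming the m3u8 is a plain media playlist with unencrypted MPEG-TS segments (a master playlist or an encrypted stream would need extra handling). The names `segment_urls` and `download_stream` are hypothetical helpers, not part of any library.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def segment_urls(playlist_text, playlist_url):
    """Return absolute segment URLs listed in an HLS media playlist."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Lines starting with '#' are tags or comments; the rest are URIs.
        if line and not line.startswith("#"):
            # Relative URIs are resolved against the playlist's own URL.
            urls.append(urljoin(playlist_url, line))
    return urls


def download_stream(playlist_url, out_path):
    """Fetch every segment and append it to one file (assumes plain .ts segments)."""
    with urlopen(playlist_url) as resp:
        text = resp.read().decode("utf-8")
    with open(out_path, "wb") as out:
        for url in segment_urls(text, playlist_url):
            with urlopen(url) as seg:
                out.write(seg.read())
```

Concatenating MPEG-TS segments byte-for-byte usually yields a playable `.ts` file; other segment formats may not survive this treatment.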

Note that the solution would require no human interaction with the browser. API-wise, the input should be a list of URLs and the output a list of videos/gifs.
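
One way the list-in, list-out API could look once the playlist URLs are known is sketched below. It assumes the `ffmpeg` command-line tool is installed and on the PATH, since ffmpeg can read an m3u8 URL directly and write either a video or a gif. `ffmpeg_command` and `download_all` are hypothetical names, and the output naming scheme is only an illustration.

```python
import subprocess
from pathlib import Path


def ffmpeg_command(playlist_url, out_path):
    """Build the ffmpeg argument list for one playlist."""
    out_path = Path(out_path)
    cmd = ["ffmpeg", "-y", "-i", playlist_url]
    if out_path.suffix.lower() != ".gif":
        # Copy the streams without re-encoding; a gif must be re-encoded.
        cmd += ["-c", "copy"]
    cmd.append(str(out_path))
    return cmd


def download_all(playlist_urls, out_dir, suffix=".mp4"):
    """Download each playlist to out_dir and return the output paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for i, url in enumerate(playlist_urls):
        out = out_dir / f"video_{i}{suffix}"
        subprocess.run(ffmpeg_command(url, out), check=True)
        outputs.append(out)
    return outputs
```

Passing `suffix=".gif"` to `download_all` would cover the bonus case, at the cost of a re-encode and a typically larger file.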

An example page can be found here in case you want to test your solution.

My understanding is that I can use Selene to get the HTML and click on the image to start the player. However, I have no idea how to process the blob to get the m3u8 and then use that one for the actual video.
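
Since the playlist shows up in Chrome's Network tab, one possible approach is to skip the blob entirely and read the same network events from the browser's performance log. The sketch below assumes Selenium 4 with Chrome and chromedriver installed; the CSS selector for the image, the fixed wait, and the helper names `collect_performance_log` and `find_m3u8_urls` are assumptions for illustration, not something taken from the page in question.

```python
import json
import time


def find_m3u8_urls(log_entries):
    """Extract requested .m3u8 URLs from Chrome performance log entries."""
    urls = []
    for entry in log_entries:
        # Each entry carries a JSON string wrapping a DevTools event.
        message = json.loads(entry["message"])["message"]
        if message.get("method") != "Network.requestWillBeSent":
            continue
        url = message["params"]["request"]["url"]
        if ".m3u8" in url and url not in urls:
            urls.append(url)
    return urls


def collect_performance_log(page_url, click_selector, wait_seconds=5):
    """Open the page headlessly, click the player image, return the log."""
    # Imported here so the parsing helper above works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    # Ask Chrome to record DevTools network events in the performance log.
    options.set_capability("goog:loggingPrefs", {"performance": "ALL"})
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(page_url)
        driver.find_element(By.CSS_SELECTOR, click_selector).click()
        time.sleep(wait_seconds)  # crude wait for the player to request the playlist
        return driver.get_log("performance")
    finally:
        driver.quit()
```

A call such as `find_m3u8_urls(collect_performance_log(url, "img"))` would then give candidate playlist URLs for each page, where `"img"` stands in for whatever selector matches the clickable image.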

https://stackoverflow.com/questions/66683933/how-to-programmatically-download-a-m3u8-video-referenced-in-a-blob-in-python March 18, 2021 at 10:29AM
