Enhancing ABR Video Efficiency at Pinterest | by Pinterest Engineering | Pinterest Engineering Weblog | Aug, 2024
Zhihao Hong; Employees Software program Engineer | Emma Adams; Sr. Software program Engineer | Jeremy Muhia; Sr. Software program Engineer | Blossom Yin; Software program Engineer II | Melissa He; Sr. Software program Engineer |
Video content material has emerged as a well-liked format for folks to find inspirations at Pinterest. On this weblog publish, we are going to define current enhancements made to the Adaptive Bitrate (ABR) video efficiency, in addition to its optimistic impression on consumer engagement.
Phrases:
- ABR: An acronym for Adaptive Bitrate (ABR) Streaming protocol.
- HLS: HTTP stay streaming (HLS) is an ABR protocol developed by Apple and supported each stay and on-demand streaming.
- DASH: Dynamic Adaptive Streaming over HTTP (DASH) is one other ABR protocol that works equally to HLS.
ABR Streaming is a broadly adopted protocol within the trade for delivering video content material. This methodology includes encoding the content material at a number of bitrates and resolutions, leading to a number of rendition variations of the identical video. Throughout playback, gamers improve the consumer expertise by choosing the very best high quality and dynamically adjusting it primarily based on community situations.
At Pinterest, we make the most of HLS and DASH for video streaming on iOS and Android platforms, respectively. HLS is supported by way of iOS’s AVPlayer, whereas DASH is supported by Android’s ExoPlayer. Presently, HLS accounts for about 70% of video playback periods on iOS apps, and DASH accounts for round 55% of video periods on Android.
Startup latency is a key metric for evaluating video efficiency at Pinterest. Subsequently, we are going to first study the steps of beginning an ABR video. To provoke playback, purchasers should receive each the manifest recordsdata and partial media recordsdata from the CDN by way of community requests. Manifest recordsdata are sometimes small-size recordsdata however present important metadata about video streams, together with their resolutions, bitrates, and many others. In distinction, the precise video and audio content material is encoded within the media recordsdata. Downloading assets in each file sorts takes time and is the first contributor to customers’ perceived latency.
Our work goals to cut back latency in manifest loading (highlighted in Determine 1), thereby bettering total video startup efficiency. Though manifest recordsdata are small in dimension, downloading them provides a non-trivial quantity of overhead on video startup. That is significantly noticeable with HLS, the place every video comprises a number of manifests, leading to a number of community spherical journeys. As illustrated in Determine 1, after receiving video URLs from the API endpoint, gamers start the streaming course of by downloading the principle manifest playlist from the CDN. The content material inside the principle manifest supplies URLs for the rendition-level manifest playlists, prompting gamers to ship subsequent requests to obtain them. Lastly, primarily based on community situations, gamers obtain the suitable media recordsdata to begin playback. Buying manifests is a prerequisite for downloading media bytes, and decreasing this latency results in quicker loading instances and a greater viewing expertise. Within the subsequent part, we are going to share our answer to this downside.
At a excessive degree, our answer eliminates the spherical journey latency related to fetching a number of manifests by embedding them within the API response. When purchasers request Pin metadata by way of endpoints, the manifest file bytes are serialized and built-in into the response payload together with different metadata. Throughout playback, gamers can swiftly entry the manifest info domestically, enabling them to proceed on to the ultimate media obtain stage. This methodology permits gamers to bypass the method of downloading manifests, thereby decreasing video startup latency.
Now let’s dive into a number of the implementation particulars and challenges we discovered in the course of the course of.
One of many main hurdles we confronted was the overhead imposed on the API endpoint. To course of API requests, the backend is required to retrieve video manifest recordsdata, inflicting an increase within the total latency of API responses. This situation is especially outstanding for HLS, the place quite a few manifest playlists are wanted for every video, leading to a big improve in latency attributable to a number of community calls. Though an preliminary try to parallelize community calls offered some reduction, the latency regression continued.
We efficiently tackled this situation by incorporating a MemCache layer into the manifest serving course of. Memcache supplies considerably decrease latency than community calls when the cache is hit and is efficient for platforms like Pinterest, the place in style content material is persistently served to varied purchasers, leading to a excessive cache hit charge. Following the implementation of Memcache, API overhead was successfully managed.
We additionally performed iterations on backend mechanisms for retrieving manifests, evaluating fetching from CDN or fetching instantly from origins: S3. We finally landed S3 fetching because the optimum answer. In distinction to CDN fetching, S3 fetching gives higher efficiency and price effectivity.
After a number of iterations, the ultimate backend circulation seems to be like this:
AVPlayer
For the gamers’ aspect, we make the most of the AVAssetResourceLoaderDelegate APIs on iOS to customise the loading strategy of manifest recordsdata. These APIs allow purposes to include their very own logic for managing useful resource loading.
Determine 4 illustrates the manifest loading course of between AVPlayer and our code. Manifest recordsdata are delivered from the backend and saved on the consumer. When playback is requested, AVPlayer sends a number of AVAssetResourceLoadingRequests to acquire details about the manifest information. In response to those requests, we find the serialized information from the API response and provide them accordingly. The loading requests for media recordsdata are redirected to the CDN tackle as regular.
ExoPlayer
Android employs a comparable method by consolidating the loading requests at ExoPlayer. The method begins with getting the contents of the .mpd manifest file from the API. These contents are saved within the tag property of MediaItem.LocalConfiguration.
// MediaItemTag.kt
information class MediaItemTag(
val manifest: String?,
)// VideoManager.kt
val mediaItemTag = MediaItemTag(
manifest = apiResponse.dashManifest,
)
val mediaItem = MediaItem.Builder()
.setTag(mediaItemTag)
.setUri(...) // instance: "https://instance.com/some/video/url.mpd"
.construct()
After the MediaItem is about and we’re able to name put together, the notable customization begins when ExoPlayer creates a MediaSource. Particularly, Android implements the MediaSource.Factory interface, which specifies a createMediaSource method.
In createMediaSource, we will retrieve the DASH manifest that was saved within the tag property from the MediaItem and rework it into an ExoPlayer primitive utilizing DashManifestParser:
// CustomMediaSourceFactory.kt
override enjoyable createMediaSource(mediaItem: MediaItem): MediaSource
val mediaItemTag = mediaItem.localConfiguration!!.tag
val manifest = (mediaItemTag as MediaItemTag).manifestval dashManifest = DashManifestParser()
.parse(
mediaItem.localConfiguration!!.uri,
manifest.byteInputStream()
)
With the contents of the manifest obtainable in reminiscence, using DashMediaSource.Factory’s API for side-loading a manifest is the final step required:
class PinterestMediaSourceFactory(
personal val dataSourceFactory: DataSource.Manufacturing facility,
): MediaSource.Manufacturing facility override enjoyable createMediaSource(mediaItem: MediaItem): MediaSource
val dashChunkSourceFactory = DefaultDashChunkSource
.Manufacturing facility(dataSourceFactory)
val dashMediaSourceFactory = DashMediaSource
.Manufacturing facility(
/* chunkSourceFactory = */ dashChunkSourceFactory,
/* manifestDataSourceFactory = */ null,
)
return dashMediaSourceFactory.createMediaSource(
dashManifest,
mediaItem
)
Now, as an alternative of getting to first fetch the .mpd file from the CDN, ExoPlayer can skip that request and instantly start fetching media.
As of this writing, each Android and iOS platforms have absolutely applied the above answer. This has led to a big enchancment in video efficiency metrics, significantly in startup latency. Consequently, consumer engagement on Pinterest was boosted because of the quicker viewing expertise.
With the flexibility to govern the manifest loading course of, purchasers could make native changes to realize extra fine-grained video high quality management primarily based on the UI floor. By eradicating undesirable bitrate renditions from the unique manifest file earlier than offering it to the gamers, we will restrict the variety of bitrate renditions obtainable for the participant, thereby gaining extra management over playback high quality. As an illustration, on a big UI floor reminiscent of full display, high-quality video is extra preferable. By eradicating the decrease bitrate renditions from the unique manifest, we will make sure that gamers solely play high-quality video with out making ready a number of units of manifests at backend.
We wish to lengthen our honest gratitude to Liang Ma and Sterling Li for his or her vital contributions to this venture and distinctive technical management. Their experience and dedication have been instrumental in overcoming quite a few challenges and driving the venture to success. We’re deeply appreciative of their efforts and the optimistic impression they’ve had on this initiative.
To be taught extra about engineering at Pinterest, take a look at the remainder of our Engineering Weblog and go to our Pinterest Labs website. To discover and apply to open roles, go to our Careers web page.