Picture-in-picture (PIP) is a separate browser window with a video that sits outside the page.
You can minimize the tab, or even the whole browser, where the video is playing, and it stays visible in the little window. It’s handy if you’re sharing your screen and want to keep an eye on the other participants, or if you’re watching a really interesting show and someone sends you a message.
Let’s figure out how to create this window and make it work.
As of early 2022, the Picture-in-Picture specification is still a draft. Browsers implement it differently, and support leaves a lot to be desired:
only 48% of browsers support the feature.
Let’s go to a technical guide to implementation with all the pitfalls and unexpected plot twists. Enjoy 🙂
Tutorial on how to set up PiP
First, you need a video element.
<video controls src="video.mp4"></video> 
In Chrome and Safari, the PiP activation button should appear.
5 minutes and it’s done!
What about Firefox?
Unfortunately, Firefox doesn’t yet have full picture-in-picture support 🙁
To activate PiP in Firefox, every user has to open the configuration page (type about:config in the address bar), find media.videocontrols.picture-in-picture.enabled, and set it to true.
Because of the weak PiP support in Firefox, we won’t look at that browser any further.
Now you can activate web picture-in-picture in all popular browsers.
But what if this is not enough?
Could it be more convenient?
Maybe add a nice activation button?
Or automatically switch to PIP when you leave the page?
Yes, this is all possible!
Software PiP activation
To start, let’s implement the basic open/close functionality and connect the button.
Let’s say your browser supports picture-in-picture on the web. To open and close the PiP window, we need to:
- make sure the feature is supported
- make sure that there is no other PIP
- implement the cross-browser picture-in-picture activation/deactivation function.
Check support
To make sure we can open a PiP window programmatically, we need to know whether the feature is enabled in the browser and whether an opening method exists.
You can check the activation status through the document pictureInPictureEnabled property:
"pictureInPictureEnabled" in document && document.pictureInPictureEnabled 
To make sure that we can interact with the PIP window, let’s try to find a picture-in-picture activation method.
For Safari it’s webkitSetPresentationMode, for all other browsers requestPictureInPicture.
export const canPIP = (): boolean => "pictureInPictureEnabled" in document &&
 document.pictureInPictureEnabled;
const supportsOldSafariPIP = () => {
 const video = document.createElement("video");
 return (
   canPIP() &&
   video.webkitSupportsPresentationMode &&
   typeof video.webkitSetPresentationMode === "function"
 );
};
const supportsModernPIP = () => {
 const video = document.createElement("video");
 return (
   canPIP() &&
   video.requestPictureInPicture &&
   typeof video.requestPictureInPicture === "function"
 )
};
const supportsPIP = (): boolean => supportsOldSafariPIP() || supportsModernPIP();
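The same check can also be written as one function that reports which API is available. This is only a sketch: detectPipApi and its return values ("standard", "webkit", "none") are names made up for illustration, and the document and video objects are passed in as parameters so the logic can be tried outside a browser. Unlike the code above, it actually calls webkitSupportsPresentationMode with the "picture-in-picture" mode instead of only checking that the method exists.

```javascript
// Sketch: decide which PiP API (if any) a given video element exposes.
// `doc` is the page's document, `video` is an HTMLVideoElement (or a stand-in).
function detectPipApi(doc, video) {
  // Standard API: the document reports PiP as enabled and the element has the method.
  if (
    doc.pictureInPictureEnabled === true &&
    typeof video.requestPictureInPicture === "function"
  ) {
    return "standard";
  }
  // Safari fallback: the element must confirm it supports the PiP presentation mode.
  if (
    typeof video.webkitSupportsPresentationMode === "function" &&
    video.webkitSupportsPresentationMode("picture-in-picture") &&
    typeof video.webkitSetPresentationMode === "function"
  ) {
    return "webkit";
  }
  return "none";
}
```

In a page you would call it as detectPipApi(document, document.createElement("video")) and hide the PiP button when the result is "none".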
Checking for the presence of a PiP window
To find out whether a picture-in-picture window is already open, check this property of the document:
document.pictureInPictureElement
Opening and closing functions
Opening function
The standard way to open the window is the requestPictureInPicture method of the video element:
video.requestPictureInPicture();
For broader browser support, let's implement a fallback. To enter picture-in-picture in Safari, use the webkitSetPresentationMode method of the video element:
video.webkitSetPresentationMode("picture-in-picture")
Closing function
The standard closing method:
document.exitPictureInPicture()
Fallback for Safari
video.webkitSetPresentationMode("inline")
As a result, we have the functionality to open or close the PIP.
export const canPIP = () =>
 "pictureInPictureEnabled" in document &&
 document.pictureInPictureEnabled;
const isInPIP = () => Boolean(document.pictureInPictureElement);
const supportsOldSafariPIP = () => {
 const video = document.createElement("video");
 return (
   canPIP() &&
   video.webkitSupportsPresentationMode &&
   typeof video.webkitSetPresentationMode === "function"
 );
};
const supportsModernPIP = () => {
 const video = document.createElement("video");
 return (
   canPIP() &&
   video.requestPictureInPicture &&
   typeof video.requestPictureInPicture === "function"
 )
};
const supportsPIP = () =>
 supportsOldSafariPIP() || supportsModernPIP();
export const openPIP = async (video) => {
 if (isInPIP()) return;
 // A browser may expose both APIs, so use else-if to avoid opening twice
 if (supportsModernPIP())
   await video.requestPictureInPicture();
 else if (supportsOldSafariPIP())
   await video.webkitSetPresentationMode("picture-in-picture");
};
const closePIP = async (video) => {
 if (!isInPIP()) return;
 // Same order as in openPIP: standard API first, Safari fallback second
 if (supportsModernPIP())
   await document.exitPictureInPicture();
 else if (supportsOldSafariPIP())
   await video.webkitSetPresentationMode("inline");
};
Now, all we have to do is enable the button.
const disablePIP = async () => {
 await closePIP(videoElement.current).catch((error) => { /* handle error */ });
};
const enablePIP = async () => {
 await openPIP(videoElement.current).catch((error) => { /* handle error */ });
};
const handleVisibility = async () => {
 if (document.visibilityState === "visible") await disablePIP();
 else await enablePIP();
};
const togglePIP = async () => {
 if (isInPIP()) await disablePIP()
 else await enablePIP()
};
Don't forget to catch errors from asynchronous functions and connect the functionality to the button.
<button onClick={togglePIP} className={styles.Button}>
 {isPIPOn ? "Turn off PIP" : "Turn on PIP"}
</button>
See? Not so much code and the button for switching between PiP and normal mode is ready!
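The button label above depends on an isPIPOn flag. One way to keep such a flag up to date is to listen for the standard enterpictureinpicture and leavepictureinpicture events on the video element. A minimal sketch, assuming a hypothetical helper watchPipState and an onChange callback (for example, a React state setter); it covers only the standard events, not Safari's older webkitpresentationmodechanged event.

```javascript
// Sketch: keep an "is PiP on" flag in sync with the video element's PiP events.
// `onChange` is a hypothetical callback that receives true or false.
function watchPipState(video, onChange) {
  const handleEnter = () => onChange(true);
  const handleLeave = () => onChange(false);
  video.addEventListener("enterpictureinpicture", handleEnter);
  video.addEventListener("leavepictureinpicture", handleLeave);
  // Return a cleanup function so the listeners can be removed later,
  // for example when the component unmounts.
  return () => {
    video.removeEventListener("enterpictureinpicture", handleEnter);
    video.removeEventListener("leavepictureinpicture", handleLeave);
  };
}
```

The flag also changes when the user closes the PiP window with its own controls, which a flag updated only inside togglePIP would miss.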
Automatic activation of web picture-in-picture
Why do you need picture-in-picture?
To surf the Internet and watch video streams from another page!
Say you’re in a video conference in your browser and want to check something in Google Docs while still seeing the person you’re talking to, just like in Skype. You can do that with PiP. Or you want to keep watching a movie while answering an urgent message in a messenger. This is also possible if the site where you watch the movie has implemented PiP.
Let’s implement the automatic opening of the PiP window when you leave the page.
Safari has the autoPictureInPicture property. It switches to picture-in-picture mode only if the user is watching the video in fullscreen.
To activate it, set the video element’s autoPictureInPicture property to true.
if (video && "autoPictureInPicture" in video) {
  video.autoPictureInPicture = true;
}
That’s it for Safari.
Chrome and similar browsers let you open PiP without fullscreen, but the video must be visible and the page must have focus.
You can use the Page Visibility API to track page abandonment.
document.addEventListener("visibilitychange", async () => {
 if (document.visibilityState === "visible")
   await closePIP(video);
 else
   await openPIP(video);
});
Enjoy, the picture-in-picture auto-activation is ready.
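In a real app the listener usually needs error handling and a way to be removed again. Here is a slightly more defensive sketch of the same idea. The name installAutoPip and its options object are made up for this example, open and close stand for the openPIP and closePIP functions from earlier, and skipping paused videos is an assumption based on the requirement that the video be playing.

```javascript
// Sketch: open PiP when the page is hidden, close it when the page is visible again.
// `open`, `close` and `onError` are supplied by the caller (hypothetical option names).
function installAutoPip(doc, video, { open, close, onError = console.error }) {
  const handler = async () => {
    try {
      if (doc.visibilityState === "visible") {
        await close(video);
      } else if (!video.paused) {
        // Assumption: there is no point in popping out a paused video.
        await open(video);
      }
    } catch (error) {
      // The browser may refuse, for example when there was no user gesture.
      onError(error);
    }
  };
  doc.addEventListener("visibilitychange", handler);
  // Return a cleanup function that removes the listener.
  return () => doc.removeEventListener("visibilitychange", handler);
}
```

Usage could look like const removeAutoPip = installAutoPip(document, video, { open: openPIP, close: closePIP }), with removeAutoPip() called when the player is destroyed.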
PIP Controls
PiP video has the following buttons by default:
- pause (except when we pass a media stream to a video tag)
- switch back to the page
- next/previous video
Use the Media Session API to configure video switching:
navigator.mediaSession.setActionHandler('nexttrack', () => {
 // set next video src
});
navigator.mediaSession.setActionHandler('previoustrack', () => {
 // set prev video src
});
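To make the handlers above a bit more concrete, here is a sketch that switches between the entries of a playlist. The function name registerTrackHandlers and the playlist array of video URLs are assumptions for illustration; mediaSession is expected to be navigator.mediaSession. The try/catch is there because setActionHandler throws for actions the browser does not know.

```javascript
// Sketch: wire "previoustrack" and "nexttrack" to a hypothetical playlist of video URLs.
function registerTrackHandlers(mediaSession, video, playlist) {
  let index = 0;
  const show = (step) => {
    // Wrap around at both ends of the playlist.
    index = (index + step + playlist.length) % playlist.length;
    video.src = playlist[index];
    // play() may reject (for example because of autoplay policy); handle it properly in real code.
    Promise.resolve(video.play()).catch(() => {});
  };
  for (const [action, step] of [["previoustrack", -1], ["nexttrack", 1]]) {
    try {
      mediaSession.setActionHandler(action, () => show(step));
    } catch (error) {
      console.warn(`Media Session action "${action}" is not supported`);
    }
  }
}
```

In a page, check that "mediaSession" in navigator is true before calling registerTrackHandlers(navigator.mediaSession, video, playlist).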
Linking with video conferencing
Let’s say we want to make a browser-based Skype with screen sharing.
It would be nice to show the presenter’s face, and to let presenters see themselves in case, for example, their hair ends up disheveled.
JavaScript picture-in-picture would be perfect for that!
To display a WebRTC media stream in PiP, all you have to do is apply it to the video, and that’s it.
video.srcObject = await navigator.mediaDevices.getUserMedia({
 video: true,
 audio: true,
})
In this uncomplicated way, you can show the face of the screen demonstrator. And best of all, there is no need to transmit additional video of the speaker’s face, because it’s already present in the demonstration exactly where the author wishes it to be.
This not only saves traffic for all users in the video conference but also creates a more convenient interface for the demonstrator and the audience.
The same logic works with the interlocutor in an online conference.
Anything that can be displayed in the video tag can be displayed in the PiP window.
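Since PiP expects a playing video, it helps to start playback as soon as the stream is attached and to release the camera when the preview is no longer needed. A minimal sketch, where showStreamInVideo is a hypothetical helper name; muting the element is an assumption that makes sense for a self-preview, so that you don't hear your own microphone.

```javascript
// Sketch: attach a MediaStream to a video element and return a function that releases it.
function showStreamInVideo(video, stream) {
  video.srcObject = stream;
  // Assumption: this is a self-preview, so mute it to avoid hearing yourself.
  video.muted = true;
  // play() may reject under autoplay policies; handle it properly in real code.
  Promise.resolve(video.play()).catch(() => {});
  return () => {
    // Stop every track so the camera and microphone are actually released.
    stream.getTracks().forEach((track) => track.stop());
    video.srcObject = null;
  };
}
```

It can be combined with the call above: const stop = showStreamInVideo(video, await navigator.mediaDevices.getUserMedia({ video: true, audio: true })), then stop() when the call ends.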
The pitfalls
Nothing works perfectly from the first try 🙂 Here are some tips on what to do when picture in picture mode is not working.
Error: Failed to execute ‘requestPictureInPicture’.
DOMException: Failed to execute ‘requestPictureInPicture’ on ‘HTMLVideoElement’: Must be handling a user gesture if there isn’t already an element in picture-in-picture.
So either the browser has decided that we’re abusing the API, or we forgot to check whether the window is already open.
In the W3C draft, the requirements are userActivationRequired and playingRequired. This means that picture-in-picture can only be activated in response to a user interaction and only while the video is playing.
At the moment, the error typically shows up in two cases:
- (Chrome) an attempt to switch to PiP while the page is out of focus.
- (Safari) an attempt to switch to PiP without user interaction.
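When requestPictureInPicture fails, the name of the rejected DOMException gives a hint about the cause. The sketch below maps the error names that, as far as we can tell from the W3C draft, the method can reject with. The helper name describePipError and the message texts are made up for illustration, and browsers may report other errors too.

```javascript
// Sketch: turn a PiP error into a readable hint, based on the DOMException name.
function describePipError(error) {
  switch (error && error.name) {
    case "NotAllowedError":
      return "PiP needs a user gesture (click or key press) on a focused page.";
    case "InvalidStateError":
      return "The video is not ready: no metadata loaded yet, no video track, or PiP is disabled on the element.";
    case "NotSupportedError":
      return "This browser does not support picture-in-picture.";
    case "SecurityError":
      return "PiP is blocked by the page's permissions policy.";
    default:
      return "Unexpected picture-in-picture error.";
  }
}
```

It fits into the catch handlers shown earlier, for example openPIP(video).catch((error) => console.warn(describePipError(error))).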
The video in the PiP window doesn’t update
To deal with this problem in React, change the key prop whenever the media stream or src is updated.
<video controls key={/* updated key */} src="video.mp4"></video>
Video in the PiP window freezes
From time to time a video hangs. This usually happens when the video tag disappears from the page. In such a situation, you need to call the document.exitPictureInPicture() method.
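One way to apply this is to close the PiP window right before the video element is removed, but only if that element is the one currently shown in PiP. A minimal sketch with a hypothetical helper name, exitPipBeforeRemoval; in React it could be called from an effect cleanup.

```javascript
// Sketch: close the PiP window if (and only if) it currently shows this video element.
function exitPipBeforeRemoval(doc, video) {
  // Another element, or nothing at all, is in PiP: leave it alone.
  if (doc.pictureInPictureElement !== video) return false;
  // exitPictureInPicture() returns a promise; ignore failures during teardown.
  doc.exitPictureInPicture().catch(() => {});
  return true;
}
```

Call exitPipBeforeRemoval(document, video) before the element leaves the DOM.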
When starting a broadcast in another tab or application, the auto-opening PiP window doesn't work (Chrome)
This problem is related to the requestPictureInPicture error described above. When you click in the system dialog to choose a tab or window to share, our page loses focus. Without focus, the userActivationRequired condition can’t be satisfied, so you can’t open PiP right after the screen share starts.
However, it is possible to open a PiP window in advance, say, when the page loses focus:
window.addEventListener("blur", () => {
 // open PIP
})
In this case, the PiP will open before the broadcast begins.
Conclusion
Despite pretty weak browser support (only 48% as of early 2022), PiP controlled from JavaScript is quick to implement and brings an amazing user experience to web apps with video or broadcasts.
However, you should consider the fact that half of the users may never use it due to poor support.
You can test this feature out in the sandbox.
BONUS TIP
How to turn on picture-in-picture on YouTube?
- Turn on the video
- Open console. For macOS, use Option + ⌘ + J. For Windows or Linux, use Shift + CTRL + J.
- Enter this code:
document.onclick = () => {
 document.querySelector('video')?.requestPictureInPicture();
 document.onclick = null;
};
- Press Enter.
- Click on an empty spot on the page.
 