Using WebRTC in Chrome extensions

While it is possible to use a capture of the complete desktop without a Chrome extension, this type of screen sharing is hardly practical. When you are screencasting a presentation you want to focus the audience's attention on your content and not on your desktop background or instant messaging notifications.

For security reasons a website cannot silently capture individual browser tabs. The APIs for capturing tabs and application windows are only available to Chrome extensions, which need to be actively installed by the user.

What is a Chrome extension?

A Chrome extension is just a combination of HTML and JavaScript, tied together by a manifest.json file and conveniently packaged and distributed through the Chrome Web Store.

An extension consists of a background page (which runs invisibly in the background all the time) and a popup page (which is shown when you click the little icon right next to Chrome's URL bar).
In this post I will focus on the WebRTC part; you can learn about developing Chrome extensions here.

Some limitations apply

The JavaScript running in the background page cannot request access to your mic or camera. The background page is not visible to the user, so the user has no way to give consent there. Requesting a desktop capture will only work if you don't request audio.

To build an application that lets you share a presentation and talk about it at the same time, the Chrome extension and your website have to work together. We used a small content script (another piece of JavaScript, which gets injected into your website automatically).

There is a more direct way to enable communication between your website and the extension. The Chrome runtime API lets your website talk directly to your Chrome extension. For this to work, the manifest.json has to allow this communication for your website (by adding the URL to the "externally_connectable" setting).
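
As a rough sketch, the website side of that direct route could look like the code below. It assumes the manifest lists the site under "externally_connectable" and that the extension answers in a chrome.runtime.onMessageExternal listener. The helper name askExtension, the extension ID and the message shape are placeholders for illustration, not part of our extension. The runtime object is passed in as an argument so the helper can be tried outside the browser; on a real page you would pass chrome.runtime.

```javascript
/*
 * Sketch only: send one message from a web page to an extension.
 * "runtime" is chrome.runtime on a real page; "extensionId" and the
 * message shape are placeholders chosen for this example.
 */
function askExtension(runtime, extensionId, message, onReply, onError) {
    if (!runtime || typeof runtime.sendMessage !== "function") {
        /* chrome.runtime is missing: not Chrome, or the site is not allowed */
        onError(new Error("extension messaging is not available"));
        return;
    }
    runtime.sendMessage(extensionId, message, function(reply) {
        if (runtime.lastError) {
            /* e.g. the extension is not installed or did not answer */
            onError(new Error(runtime.lastError.message));
        } else {
            onReply(reply);
        }
    });
}
```

On a page this would be called as askExtension(chrome.runtime, "<your-extension-id>", { command: "shareTab" }, onReply, onError), where the command name is again made up for the example.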

Because "externally_connectable" does not allow generic wildcard URLs, this was not an option for us. Our ahoy! Conference server can be downloaded and run inside local networks, which means that the extension cannot know the URL of the website in advance.

Passing messages between scripts

Content scripts (which get injected into a website by the extension) cannot access your website's JavaScript because they run in a different sandbox. This is an important security feature, but it makes our life a bit more complicated. The only way to communicate between both scripts is to register for the "message" DOM event and send messages with the window.postMessage method (which allows cross-origin message passing).

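/* extensionPort is assumed to be the port to the background page, opened elsewhere (e.g. with chrome.runtime.connect()) */
function handleEvent(event) {
    /* only accept messages posted by the page itself */
    if (event.source !== window) {
        return;
    }
    if (event.data && event.data.type === "FROM_PAGE") {
        /* forward message from controller.js to background.js */
        if (extensionPort) {
            extensionPort.postMessage(event.data.payload);
        }
    }
}
window.addEventListener("message", handleEvent, false);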
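
The page's side of this exchange could look roughly like the sketch below. The "FROM_PAGE" envelope matches the check in the content script above; the helper names, the "FROM_EXTENSION" reply type and the payload are assumptions made up for the example. The window object is passed in explicitly so the functions can be exercised without a browser.

```javascript
/* Sketch only: helper names and the "FROM_EXTENSION" type are hypothetical. */

/* Wrap a payload in the envelope the content script looks for and post it. */
function sendToExtension(win, payload) {
    var message = { type: "FROM_PAGE", payload: payload };
    /* target our own origin so other origins never receive the message */
    win.postMessage(message, win.location.origin);
    return message;
}

/* Call onPayload for every reply the content script posts back to the page. */
function listenForExtension(win, onPayload) {
    function handleReply(event) {
        /* ignore messages coming from other windows or frames */
        if (event.source !== win) {
            return;
        }
        if (event.data && event.data.type === "FROM_EXTENSION") {
            onPayload(event.data.payload);
        }
    }
    win.addEventListener("message", handleReply, false);
    return handleReply;
}
```

In the website's own script you would call sendToExtension(window, { command: "shareTab" }) and listenForExtension(window, callback); the command name is a placeholder.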

Capturing the currently active tab

Chrome's tabCapture API works much like the usual getUserMedia() method, but hands the MediaStream of the active tab to a callback:

var captureOptions = { audio: true, video: true };

chrome.tabCapture.capture(captureOptions,
    function(stream) {
        if (stream) {
            /* it's a regular MediaStream */
        }
    }
);
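
The snippet above silently ignores a failed capture. A slightly more defensive variant might look like the sketch below. It assumes the extension has the "tabCapture" permission and that chrome.runtime.lastError describes the failure when no stream is delivered; the function names and callbacks are placeholders. The chrome object is passed in as chromeApi so the code can be tried with a stand-in outside the extension.

```javascript
/*
 * Sketch only: capture the active tab and report failures.
 * "chromeApi" is the global chrome object inside the extension.
 */
function captureActiveTab(chromeApi, onStream, onError) {
    var captureOptions = { audio: true, video: true };
    chromeApi.tabCapture.capture(captureOptions, function(stream) {
        if (stream) {
            onStream(stream);
        } else {
            /* no stream was delivered; lastError may explain why */
            var lastError = chromeApi.runtime.lastError;
            onError(new Error(lastError ? lastError.message : "tab capture failed"));
        }
    });
}

/* Release the capture once the screencast is over. */
function stopCapture(stream) {
    stream.getTracks().forEach(function(track) {
        track.stop();
    });
}
```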

Capturing an application window

Getting access to an individual application window (or the whole desktop) is pretty straightforward, too.
It just involves using Chrome's desktopCapture API to let the user select a window before calling the regular getUserMedia() method.

chrome.desktopCapture.chooseDesktopMedia(["screen", "window"],
    function(mediaSourceId) {
        if (mediaSourceId) {
            var captureOptions = {
                audio: false,
                video: {
                    mandatory: {
                        chromeMediaSource: "desktop",
                        chromeMediaSourceId: mediaSourceId
                    }
                }
            };
            navigator.webkitGetUserMedia(captureOptions,
                function(stream) {
                    /* a MediaStream with a video track, no audio */
                },
                function() {
                    /* something went wrong */
                }
            );
        } else {
            /* user aborted the window selection */
        }
    }
);
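
The prefixed webkitGetUserMedia() call is the legacy form. Newer Chrome versions also offer the promise-based navigator.mediaDevices.getUserMedia(). The sketch below shows how the second step could be written with it, on the assumption that the Chrome-specific "mandatory" constraints are accepted there as well. The function names are placeholders, and mediaDevices is passed in so the helpers can be tried without a browser.

```javascript
/* Build the Chrome-specific constraints for the source the user picked. */
function desktopConstraints(mediaSourceId) {
    return {
        audio: false,
        video: {
            mandatory: {
                chromeMediaSource: "desktop",
                chromeMediaSourceId: mediaSourceId
            }
        }
    };
}

/*
 * Sketch only: "mediaDevices" is navigator.mediaDevices in the browser.
 * Returns the promise from getUserMedia(), which resolves to a MediaStream.
 */
function openDesktopStream(mediaDevices, mediaSourceId) {
    return mediaDevices.getUserMedia(desktopConstraints(mediaSourceId));
}
```

Inside the chooseDesktopMedia() callback this would be used as openDesktopStream(navigator.mediaDevices, mediaSourceId).then(onStream, onError), with onStream and onError being your own handlers.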

Putting it all together

Once you have access to the MediaStream objects, you use the regular RTCPeerConnection API methods and transport the generated SDP offer/answer to your server.
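
In rough outline, that step could look like the sketch below. It is not the code from our extension: publishStream, acceptAnswer and sendToServer are hypothetical names, and sendToServer stands in for whatever signalling channel your server offers. ICE candidate exchange is left out to keep the example short.

```javascript
/*
 * Sketch only: publish a captured stream over a peer connection.
 * "pc" is an RTCPeerConnection, "stream" a MediaStream from one of the
 * capture calls, "sendToServer" a placeholder for your signalling channel.
 */
function publishStream(pc, stream, sendToServer) {
    stream.getTracks().forEach(function(track) {
        pc.addTrack(track, stream);
    });
    return pc.createOffer().then(function(offer) {
        return pc.setLocalDescription(offer);
    }).then(function() {
        /* localDescription now holds the SDP offer */
        sendToServer({
            type: pc.localDescription.type,
            sdp: pc.localDescription.sdp
        });
        return pc.localDescription;
    });
}

/* Apply the SDP answer that comes back from the server. */
function acceptAnswer(pc, answerSdp) {
    return pc.setRemoteDescription({ type: "answer", sdp: answerSdp });
}
```

In the browser you would create the connection with new RTCPeerConnection(configuration) and call acceptAnswer() once your server has delivered the answer.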

To get some inspiration you can find the source code for the AhoyConference Chrome extension on our GitHub page.

You can test the extension on our free video conferencing platform at www.ahoyconference.com.