WebRTC (Web Real-Time Communications) is a technology that allows Web applications and sites to capture and selectively transmit audio and/or video media streams, as well as exchange arbitrary data between browsers, without necessarily using intermediaries. The set of standards that WebRTC technology includes allows you to exchange data and conduct peer-to-peer teleconferences without the user having to install plugins or any other third-party software.

WebRTC consists of several interconnected application programming interfaces (APIs) and protocols that work together. The documentation you'll find here will help you understand the basics of WebRTC, how to set up and use a connection for data and media streaming, and much more.

Compatibility

Since WebRTC implementations are still maturing and not every browser exposes the same WebRTC functionality, we strongly recommend using Google's Adapter.js polyfill library before starting work on your code.

Adapter.js uses shims and polyfills to smooth over the differences among WebRTC implementations in the contexts that support it. Adapter.js also handles vendor prefixes and other property-naming differences, making it easier to develop against WebRTC with the most compatible results. The library is also available as an npm package.
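As a hedged illustration of wiring this up, the CDN build published by the WebRTC project can be included before any of your own WebRTC code (the file name main.js below is just a placeholder for your own script):

```html
<!-- adapter.js must load before any script that touches WebRTC APIs -->
<script src="https://webrtc.github.io/adapter/adapter-latest.js"></script>
<script src="main.js"></script>
```

When installed from npm instead (npm install webrtc-adapter), a bundler can pull it in with import "webrtc-adapter";.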

To explore the Adapter.js library further, take a look at its documentation.

WebRTC Concepts and Usage

WebRTC is multi-purpose and provides powerful multimedia capabilities for the Web, including support for audio and video conferencing, file sharing, screen capture, identity management, and interoperability with legacy telephone systems, including support for DTMF tone dialing. Connections between peers can be created without the use of special drivers or plugins, and often without intermediate services.

The connection between two peers is represented by an RTCPeerConnection interface object. Once a connection has been established and opened via the RTCPeerConnection object, media streams (MediaStream) and/or data channels (RTCDataChannel) can be added to it.
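As a minimal sketch of this flow (the helper name createPeer is illustrative and not part of the API; RTCPeerConnection and createDataChannel are the real interfaces):

```javascript
// Sketch: once a connection object exists, a data channel can be attached
// to it and used for arbitrary application data. "createPeer" is an
// illustrative helper name, not part of the WebRTC API.
function createPeer(config) {
    var pc = new RTCPeerConnection(config);     // connection to the remote peer
    var channel = pc.createDataChannel("chat"); // bidirectional data channel
    channel.onopen = function () {
        channel.send("hello");                  // arbitrary application data
    };
    return { connection: pc, channel: channel };
}
```

In a browser, config would typically carry the iceServers list; the offer/answer exchange needed to actually open the connection is covered later in this document.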

Media streams can consist of any number of tracks of media information. Tracks are represented by MediaStreamTrack interface objects and can contain one or more types of media data, including audio, video, and text (such as subtitles or chapter titles). Most streams consist of at least one audio track (and often a video track as well), and can be sent and received as live streams (real-time media) or saved to a file.

You can also use the connection between two peers to exchange arbitrary data via the RTCDataChannel interface object, which can be used to transmit service information, stock market data, game state packets, file transfers, or private data channels.
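Such arbitrary payloads are usually serialized before being handed to channel.send(); a small sketch (the helper names packGameState and unpackGameState are illustrative, not part of any API):

```javascript
// Sketch: game state packets are commonly serialized to JSON strings before
// being sent over a data channel, and parsed again on the receiving side.
function packGameState(state) {
    return JSON.stringify({ type: "game-state", data: state });
}

function unpackGameState(message) {
    var json = JSON.parse(message);
    // Ignore messages of other types
    return json.type === "game-state" ? json.data : null;
}
```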

more details and links to relevant guides and tutorials needed

WebRTC interfaces

Because WebRTC provides interfaces that work together to perform different tasks, we have divided them into categories. See the sidebar index for quick navigation.

Connection setup and management

These interfaces are used to configure, open, and manage WebRTC connections. They represent the media connections, data channels, and interfaces used to exchange information about the capabilities of each peer in order to select the best configuration for establishing a two-way multimedia connection.

  • RTCPeerConnection: Represents a WebRTC connection between the local computer and a remote peer. Used to handle data transfer between the two peers.
  • RTCSessionDescription: Represents the parameters of a session. Each RTCSessionDescription contains a type, indicating which part (offer/answer) of the negotiation process it describes, and the SDP descriptor of the session.
  • RTCIceCandidate: Represents an Interactive Connectivity Establishment (ICE) server candidate for establishing an RTCPeerConnection.
  • RTCIceTransport: Represents information about an Interactive Connectivity Establishment (ICE) transport.
  • RTCPeerConnectionIceEvent: Represents events that occur in relation to ICE candidates, typically on an RTCPeerConnection. Only one event of this type is passed: icecandidate.
  • RTCRtpSender: Controls the streaming and transmission of data from a MediaStreamTrack over an RTCPeerConnection.
  • RTCRtpReceiver: Controls the reception and decoding of data into a MediaStreamTrack over an RTCPeerConnection.
  • RTCTrackEvent: Indicates that a new incoming MediaStreamTrack has been created and an RTCRtpReceiver has been added to the RTCPeerConnection.
  • RTCCertificate: Represents a certificate used by an RTCPeerConnection.
  • RTCDataChannel: Represents a bidirectional data channel between two peers of a connection.
  • RTCDataChannelEvent: Represents events raised when an RTCDataChannel is attached to an RTCPeerConnection; the only event of this type is datachannel.
  • RTCDTMFSender: Controls the encoding and transmission of dual-tone multi-frequency (DTMF) signaling for an RTCPeerConnection.
  • RTCDTMFToneChangeEvent: Indicates an incoming DTMF tone change event. This event does not bubble and is not cancelable.
  • RTCStatsReport: Asynchronously reports statistics for a given MediaStreamTrack.
  • RTCIdentityProviderRegistrar: Registers an identity provider (IdP).
  • RTCIdentityProvider: Enables the browser to request the creation or validation of an identity assertion.
  • RTCIdentityAssertion: Represents the identity of the remote peer of the current connection. If the peer's identity has not yet been set and verified, the interface reference returns null; once set, it does not change.
  • RTCIdentityEvent: Represents an identity assertion event object raised by an identity provider (IdP). An event of an RTCPeerConnection; the only event of this type is identityresult.
  • RTCIdentityErrorEvent: Represents an error event object associated with an identity provider (IdP). An event of an RTCPeerConnection; two error events of this type are passed: idpassertionerror and idpvalidationerror.

Guides

WebRTC Architecture Overview: Beneath the API that developers use to create and use WebRTC lies a set of network protocols and connection standards. This overview is a showcase of those standards. WebRTC allows you to set up a peer-to-peer connection in the browser to transfer arbitrary data, audio, or video streams, or any combination of them. In this article, we take a look at the life of a WebRTC session, starting with the connection being established and going all the way until it terminates when it is no longer needed.

WebRTC API Overview: WebRTC consists of several interrelated application programming interfaces (APIs) and protocols that work together to support the exchange of data and media streams between two or more peers. This article presents a brief overview of each of these APIs and the purpose it serves.

WebRTC Basics: This article walks you through creating a cross-browser RTC application. By the end of it, you should have a working point-to-point data and media channel.

WebRTC Protocols: This article introduces the protocols that complement the WebRTC API. This guide describes how you can use a node-to-node connection and linked

WebRTC is an API provided by the browser that lets you organize a P2P connection and transfer data directly between browsers. There are quite a few tutorials on the Internet on writing your own video chat with WebRTC (for example, this article on Habré), but they are all limited to connecting two clients. In this article I will try to explain how to organize a connection and message exchange among three or more users using WebRTC.

The RTCPeerConnection interface is a peer-to-peer connection between two browsers. To connect three or more users, we will have to organize a mesh network (a network in which each node is connected to all other nodes).
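A mesh does not scale indefinitely: every participant keeps a connection to every other one, so the number of peer connections grows quadratically. A quick illustration:

```javascript
// Number of peer-to-peer links in a full mesh of n participants: n*(n-1)/2
function meshLinkCount(n) {
    return n * (n - 1) / 2;
}
```

Three users need only 3 links, but ten users already need 45, which is why the mesh approach suits small rooms.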
We will use the following scheme:

  • When opening the page, we check for the presence of the room ID in location.hash
  • If the room ID is not specified, generate a new one
  • We send the signaling server a message that we want to join the specified room
  • Signalling server sends a notification about a new user to other clients in this room
  • Clients already in the room send the newcomer an SDP offer
  • The newcomer responds to the offers

    0. Signaling server

    As you know, although WebRTC provides the possibility of a P2P connection between browsers, it still requires an additional transport for exchanging service messages. In this example, that transport is a WebSocket server written in Node.js using socket.io:

    var socket_io = require("socket.io");
    module.exports = function (server) {
        var users = {};
        var io = socket_io(server);
        io.on("connection", function (socket) {
            // A new user wants to join a room
            socket.on("room", function (message) {
                var json = JSON.parse(message);
                // Add the socket to the list of users
                users[json.id] = socket;
                if (socket.room !== undefined) {
                    // If the socket is already in some room, leave it
                    socket.leave(socket.room);
                }
                // Enter the requested room
                socket.room = json.room;
                socket.join(socket.room);
                // Remember the user's id so we can report it on disconnect
                socket.user_id = json.id;
                // Notify the other clients in this room about the new participant
                socket.broadcast.to(socket.room).emit("new", json.id);
            });
            // A message related to WebRTC (SDP offer, SDP answer or ICE candidate)
            socket.on("webrtc", function (message) {
                var json = JSON.parse(message);
                if (json.to !== undefined && users[json.to] !== undefined) {
                    // If the message specifies a recipient known to the server,
                    // send it only to that recipient...
                    users[json.to].emit("webrtc", message);
                } else {
                    // ...otherwise treat the message as a broadcast
                    socket.broadcast.to(socket.room).emit("webrtc", message);
                }
            });
            // Someone has disconnected
            socket.on("disconnect", function () {
                // When a client disconnects, notify the others
                socket.broadcast.to(socket.room).emit("leave", socket.user_id);
                delete users[socket.user_id];
            });
        });
    };

    1. index.html

    The source code for the page itself is quite simple. I deliberately did not pay attention to layout and other niceties, since this article is not about that. If someone wants to make it beautiful, it won't be difficult.

    <!DOCTYPE html>
    <html>
    <head>
        <meta charset="utf-8">
        <title>WebRTC Chat Demo</title>
    </head>
    <body>
        <div id="room_link"></div>
        <div>Connected to <span id="connection_num">0</span> peers</div>
        <div id="chatlog"></div>
        <input type="text" id="message">
        <button onclick="sendMessage();">Send</button>
        <script src="/socket.io/socket.io.js"></script>
        <script src="main.js"></script>
    </body>
    </html>

    2. main.js

    2.0. Getting links to the page elements and WebRTC interfaces

    var chatlog = document.getElementById("chatlog");
    var message = document.getElementById("message");
    var connection_num = document.getElementById("connection_num");
    var room_link = document.getElementById("room_link");

    We still have to use browser prefixes to access WebRTC interfaces.

    var PeerConnection = window.mozRTCPeerConnection || window.webkitRTCPeerConnection;
    var SessionDescription = window.mozRTCSessionDescription || window.RTCSessionDescription;
    var IceCandidate = window.mozRTCIceCandidate || window.RTCIceCandidate;

    2.1. Determining the room ID

    Here we need a function to generate a unique room and user identifier. We will use UUID for these purposes.

    function uuid() {
        var s4 = function () {
            return Math.floor(Math.random() * 0x10000).toString(16);
        };
        return s4() + s4() + "-" + s4() + "-" + s4() + "-" + s4() + "-" + s4() + s4() + s4();
    }

    Now let's try to extract the room identifier from the address. If one is not specified, we will generate a new one. Let's display a link to the current room on the page, and, at the same time, generate the identifier of the current user.

    var ROOM = location.hash.substr(1);
    if (!ROOM) {
        ROOM = uuid();
    }
    // Display a link to the current room on the page
    room_link.innerHTML = '<a href="#' + ROOM + '">Link to the room</a>';
    var ME = uuid();

    2.2. WebSocket

    Immediately when opening the page, we will connect to our signaling server, send a request to enter the room and specify message handlers.

    // Ask socket.io to notify the server when the page is closed
    var socket = io.connect("", {"sync disconnect on unload": true});
    socket.on("webrtc", socketReceived);
    socket.on("new", socketNewPeer);
    // Immediately send a request to join the room
    socket.emit("room", JSON.stringify({id: ME, room: ROOM}));
    // Helper function for sending addressed messages related to WebRTC
    function sendViaSocket(type, message, to) {
        socket.emit("webrtc", JSON.stringify({id: ME, to: to, type: type, data: message}));
    }

    2.3. PeerConnection settings

    Most ISPs provide Internet access through NAT. Because of this, establishing a direct connection is not so trivial. When creating a connection, we need to specify a list of STUN and TURN servers that the browser will try to use to get around NAT. We will also specify a couple of additional connection options.

    var server = {
        iceServers: [
            {url: "stun:23.21.150.121"},
            {url: "stun:stun.l.google.com:19302"},
            {url: "turn:numb.viagenie.ca", credential: "your password goes here", username: "[email protected]"}
        ]
    };
    var options = {
        optional: [
            {DtlsSrtpKeyAgreement: true}, // required for a connection between Chrome and Firefox
            {RtpDataChannels: true} // required in Firefox to use the DataChannels API
        ]
    };

    2.4. Connecting a new user

    When a new peer is added to the room, the server sends the other members of the room a new message. According to the message handlers above, the socketNewPeer function will be called for each of them.

    var peers = {};

    function socketNewPeer(data) {
        peers[data] = {candidateCache: []};
        // Create a new connection
        var pc = new PeerConnection(server, options);
        // Initialize it
        initConnection(pc, data, "offer");
        // Save the peer in the list
        peers[data].connection = pc;
        // Create a DataChannel through which messages will be exchanged
        var channel = pc.createDataChannel("mychannel", {});
        channel.owner = data;
        peers[data].channel = channel;
        // Attach the channel's event handlers
        bindEvents(channel);
        // Create the SDP offer
        pc.createOffer(function (offer) {
            pc.setLocalDescription(offer);
        });
    }

    function initConnection(pc, id, sdpType) {
        pc.onicecandidate = function (event) {
            if (event.candidate) {
                // When a new ICE candidate is discovered, add it to the list
                // for later sending
                peers[id].candidateCache.push(event.candidate);
            } else {
                // When candidate discovery is complete, the handler is called
                // again, but without a candidate.
                // In this case we first send the peer the SDP offer or SDP
                // answer (depending on the function parameter)...
                sendViaSocket(sdpType, pc.localDescription, id);
                // ...and then all previously found ICE candidates
                for (var i = 0; i < peers[id].candidateCache.length; i++) {
                    sendViaSocket("candidate", peers[id].candidateCache[i], id);
                }
            }
        };
        pc.oniceconnectionstatechange = function (event) {
            if (pc.iceConnectionState == "disconnected") {
                connection_num.innerText = parseInt(connection_num.innerText) - 1;
                delete peers[id];
            }
        };
    }

    function bindEvents(channel) {
        channel.onopen = function () {
            connection_num.innerText = parseInt(connection_num.innerText) + 1;
        };
        channel.onmessage = function (e) {
            chatlog.innerHTML += "<p>Peer says: " + e.data + "</p>";
        };
    }

    2.5. SDP offer, SDP answer, ICE candidate

    When we receive one of these messages, we call the handler corresponding to the message type.

    function socketReceived(data) {
        var json = JSON.parse(data);
        switch (json.type) {
            case "candidate":
                remoteCandidateReceived(json.id, json.data);
                break;
            case "offer":
                remoteOfferReceived(json.id, json.data);
                break;
            case "answer":
                remoteAnswerReceived(json.id, json.data);
                break;
        }
    }

    2.5.0 SDP offer

    function remoteOfferReceived(id, data) {
        createConnection(id);
        var pc = peers[id].connection;
        pc.setRemoteDescription(new SessionDescription(data));
        pc.createAnswer(function (answer) {
            pc.setLocalDescription(answer);
        });
    }

    function createConnection(id) {
        if (peers[id] === undefined) {
            peers[id] = {candidateCache: []};
            var pc = new PeerConnection(server, options);
            initConnection(pc, id, "answer");
            peers[id].connection = pc;
            pc.ondatachannel = function (e) {
                peers[id].channel = e.channel;
                peers[id].channel.owner = id;
                bindEvents(peers[id].channel);
            };
        }
    }

    2.5.1 SDP answer

    function remoteAnswerReceived(id, data) {
        var pc = peers[id].connection;
        pc.setRemoteDescription(new SessionDescription(data));
    }

    2.5.2 ICE candidate

    function remoteCandidateReceived(id, data) {
        createConnection(id);
        var pc = peers[id].connection;
        pc.addIceCandidate(new IceCandidate(data));
    }

    2.6. Sending a message

    When the Send button is clicked, the sendMessage function is called. All it does is iterate over the list of peers and try to send the specified message to each of them.
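The listing for sendMessage itself is not included above; a minimal sketch consistent with the earlier listings (it assumes the global peers map and the message and chatlog elements introduced there) might look like this:

```javascript
// Sketch of sendMessage: iterate over all known peers and try to deliver
// the message to each of them; a channel that is not open yet is skipped.
function sendMessage() {
    for (var peer in peers) {
        try {
            peers[peer].channel.send(message.value);
        } catch (e) {
            // The DataChannel may not be open yet
        }
    }
    chatlog.innerHTML += "<p>Me: " + message.value + "</p>";
    message.value = "";
}
```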

    The purpose of this article is to examine the structure and operating principle of a demo sample of peer-to-peer video chat (p2p video chat). For this purpose, we will use the multi-user peer-to-peer video chat demo webrtc.io-demo. It can be downloaded from: https://github.com/webRTC/webrtc.io-demo/tree/master/site.

    It should be noted that GitHub is a web service for the collaborative development of web projects. Developers can publish the code of their projects on it, discuss it, and communicate with each other, and some large IT companies host their official repositories there. The service is free for open-source projects.

    So, we will place the demo sample of peer-to-peer video chat downloaded from GitHub on drive C of the personal computer, in a directory created for our application, "webrtc_demo".


    Fig. 1

    As follows from the structure (Fig. 1), the peer-to-peer video chat consists of the client script script.js and the server script server.js, both implemented in JavaScript. The webrtc.io.js (CLIENT) library organizes real-time communication between browsers using the peer-to-peer ("client-client") scheme, while webrtc.io.js (CLIENT) and webrtc.io.js (SERVER) together provide duplex communication between the browser and the web server over the WebSocket protocol, using a client-server architecture.

    The webrtc.io.js (SERVER) script is included in the webrtc.io library and is located in the node_modules\webrtc.io\lib directory. The video chat interface index.html is implemented in HTML5 and CSS3. The contents of the webrtc_demo application files can be viewed using an HTML editor, for example Notepad++.

    Let's check the working principle of the video chat on the PC's file system. To run the server (server.js) on a PC, you need to install the node.js runtime environment. Node.js allows you to run JavaScript code outside of the browser. You can download node.js from http://nodejs.org/ (version v0.10.13 as of 07/15/13). On the main page of the nodejs.org website, click the download button and go to http://nodejs.org/download/. Windows users should first download win.installer (.msi), then run it on the PC and install nodejs and the npm package manager into the Program Files directory.




    Fig. 2

    Thus, node.js consists of an environment for developing and running JavaScript code, plus a set of internal modules that can be installed using the npm package manager.

    To install a module, open a command line in the application directory (for example, "webrtc_demo") and run the command: npm install module_name. During installation, the npm manager creates a node_modules folder in the directory from which the installation was performed. At run time, nodejs automatically picks up modules from the node_modules directory.

    So, after installing node.js, open the command line and install the express module into the node_modules folder of the webrtc_demo directory using the npm package manager:

    C:\webrtc_demo>npm install express

    The express module is a web framework for node.js, that is, a web platform for application development. To have global access to express, you can install it like this: npm install -g express.

    Then install the webrtc.io module:

    C:\webrtc_demo>npm install webrtc.io

    Then launch the server, server.js, from the command line:

    C:\webrtc_demo>node server.js


    Fig. 3

    That's it, the server is running successfully (Fig. 3). Now, using a web browser, you can contact the server by its IP address and load the index.html web page, from which the browser will fetch the client script code script.js and the webrtc.io.js script code, and execute them. For the peer-to-peer video chat to work (that is, to establish a connection between two browsers), you need to contact the signaling server running on node.js from two browsers that support WebRTC.

    As a result, the interface of the client part of the communication application (video chat) will open with a request for permission to access the camera and microphone (Fig. 4).



    Fig. 4

    After clicking the "Allow" button, the camera and microphone are connected for multimedia communication. In addition, you can communicate via text data through the video chat interface (Fig. 5).



    Fig. 5

    It should be noted that the server is a signaling server, designed mainly to establish connections between users' browsers. Node.js runs the server.js script that provides the WebRTC signaling.

    Today, WebRTC is the hot technology for streaming audio and video in browsers. Conservative technologies, such as HTTP streaming and Flash, are better suited to distributing recorded content (video on demand) and are significantly inferior to WebRTC for real-time and online broadcasts, i.e. where minimal video latency is needed so that viewers can see what is happening "live".

    The possibility of high-quality real-time communication comes from the WebRTC architecture itself, where the UDP protocol is used to transport video streams, which is the standard basis for transmitting video with minimal delays and is widely used in real-time communication systems.

    Low communication latency matters for online broadcasting systems, webinars, and other applications that require interactive communication between the video source and the end users, and it requires a technical solution.

    Another good reason to try WebRTC is that it is definitely a trend. Today every Android Chrome browser supports this technology, which guarantees millions of devices ready to watch a broadcast without installing any additional software or configuration.

    To test WebRTC technology in action and launch a simple online broadcast on it, we used the Flashphoner WebRTC Media & Broadcasting Server software. The feature list states the ability to broadcast WebRTC streams in one-to-many mode, as well as support for IP cameras and video surveillance systems via the RTSP protocol. In this review we will focus on web-to-web broadcasts and their features.

    Installing WebRTC Media & Broadcasting Server

    Since there was no server version for Windows, and I did not want to set up a virtual machine (for example, VMware + Linux), testing the online broadcasts on a home Windows computer was not going to work out. To save time, we decided to take an instance on cloud hosting like this:

    It was CentOS x86_64 version 6.5, with no pre-installed software, in the Amsterdam data center. Thus, all we have at our disposal is the server and ssh access to it. For those familiar with Linux console commands, installing the WebRTC server promises to be simple and painless. So, what we did:

    1. Download the archive:

    $wget https://site/download-wcs5-server.tar.gz

    2. Unpack:

    $tar -xzf download-wcs5-server.tar.gz

    3. Install:

    $cd FlashphonerWebCallServer

    During installation, enter the server IP address: XXX.XXX.XXX.XXX

    4. Activate the license:

    $cd /usr/local/FlashphonerWebCallServer/bin

    $./activation.sh

    5. Start WCS server:

    $service webcallserver start

    6. Check the log:

    $tail -f /usr/local/FlashphonerWebCallServer/logs/flashphoner_manager.log

    7. Check that the two processes are in place:

    $ps aux | grep Flashphoner

    The installation process is complete.

    Testing WebRTC online broadcasts

    Testing the broadcasts turned out to be a simple matter. In addition to the server, there is a web client, which consists of a dozen JavaScript, HTML and CSS files and was deployed by us to the /var/www/html folder during the installation stage. The only thing that had to be done was to enter the server's IP address into the flashphoner.xml config so that the web client could establish a connection with the server via HTML5 WebSockets. Let's describe the testing process.

    1. Open the test client page index.html in the Chrome browser:

    2. In order to start broadcasting, you need to click the “Start” button in the middle of the screen.
    Before you do this, you need to make sure that the webcam is connected and ready to use. There are no special requirements for the webcam; for example, we used a standard camera built into a laptop with a resolution of 1280x800.

    The Chrome browser will definitely ask for access to the camera and microphone, so that the user understands that his video will be sent to a server on the Internet, and can allow it.

    3. The interface shows a successful broadcast of the video stream from the camera to the WebRTC server. In the upper right corner, an indicator shows that the stream is going to the server; in the lower corner there is a "Stop" button to stop sending the video.

    Please note the link in the box below. It contains a unique identifier for this stream, so anyone can join the viewing. Just open this link in your browser. To copy it to the clipboard, click on the “Copy” button.

    In real applications such as webinars, lectures, online video broadcasts or interactive TV, developers will have to implement the distribution of this identifier to certain groups of viewers so that they can connect to the desired streams, but this is already the logic of the application. WebRTC Media & Broadcasting Server does not affect it, but only distributes video.

    5. The connection is established and the viewer sees the stream on the screen. Now he can send a link to someone else, stop the stream playing, or enable full-screen mode using the controls in the lower right corner.

    Results of testing WebRTC online broadcast server

    During testing, the latency seemed excellent. The ping to the data center was about 100 milliseconds, and the delay was imperceptible to the eye. From this, we can assume that the real delay is the same 100 ms, plus or minus a few tens of milliseconds of buffering time. Compared to Flash video: in such tests, Flash does not behave as well as WebRTC. If you move your hand on a similar network, the movement appears on screen only after one or two seconds.

    Regarding quality, we note that blocky compression artifacts can sometimes be seen during movement. This is consistent with the nature of the VP8 codec and its main purpose: to provide real-time video communication of acceptable quality without communication delays.

    The server is quite easy to install and configure; running it does not require any serious skills beyond knowledge of Linux at the level of an advanced user who can execute commands from the console via ssh and use a text editor. As a result, we managed to set up a one-to-many online broadcast between browsers. Connecting additional viewers to the stream also posed no problems.

    The broadcast quality turned out to be quite acceptable for webinars and online broadcasts. The only thing that raised some questions was the video resolution. The camera supports 1280x800, but the resolution in the test image is very similar to 640x480. Apparently, this issue needs to be clarified with the developers.

    Video on testing broadcast from a webcam
    via WebRTC server

    Technologies for making calls from the browser have been around for many years: Java, ActiveX, Adobe Flash... In the last few years it has become clear that plugins and third-party virtual machines are not exactly convenient (why should I install anything at all?) and, most importantly, not secure. What to do? There is a way out!

    Until recently, IP networks used several protocols for IP telephony or video: SIP, the most common protocol, the fading H.323 and MGCP, Jabber/Jingle (used in Gtalk), the semi-open Adobe RTMP* and, of course, the closed Skype. The WebRTC project, initiated by Google, is trying to revolutionize the world of IP and web telephony by making all softphones, including Skype, unnecessary. WebRTC not only implements all communication capabilities directly inside the browser, which is now installed on almost every device, but also tries to simultaneously solve a more general problem of communication between browser users (exchange of various data, screen broadcasting, collaboration with documents, and much more).

    WebRTC from the web developer's perspective

    From a web developer's point of view, WebRTC consists of two main parts:

    • control of media streams from local resources (camera, microphone, or the screen of the local computer) is implemented by the navigator.getUserMedia method, which returns a MediaStream object;
    • peer-to-peer communication between devices generating media streams, including determining communication methods and transmitting them directly: RTCPeerConnection objects (for sending and receiving audio and video streams) and RTCDataChannel (for sending and receiving data from the browser).
    What are we going to do?

    We will figure out how to organize a simple multi-user video chat between browsers based on WebRTC using web sockets. We’ll start experimenting in Chrome/Chromium, as the most advanced browsers in terms of WebRTC, although Firefox 22, released on June 24, has almost caught up with them. It must be said that the standard has not yet been adopted, and the API may change from version to version. All examples were tested in Chromium 28. For simplicity, we will not monitor the cleanliness of the code and cross-browser compatibility.

    MediaStream

    The first and simplest WebRTC component is MediaStream. It gives the browser access to media streams from the local computer's camera and microphone. In Chrome, you call the function navigator.webkitGetUserMedia() for this (since the standard is not yet finalized, all functions carry a prefix; in Firefox the same function is called navigator.mozGetUserMedia()). When it is called, the user is asked to allow access to the camera and microphone, and the call can continue only after the user gives consent. The parameters of the required media stream and two callback functions are passed as parameters to this function: the first is called if access to the camera/microphone is successfully obtained, the second in case of an error. First, let's create an HTML file rtctest1.html with a button and a <video> element:

    <!DOCTYPE html>
    <html>
    <head>
        <meta charset="utf-8">
        <title>WebRTC - first introduction</title>
        <style>
            video {
                height: 240px;
                width: 320px;
                border: 1px solid gray;
            }
        </style>
    </head>
    <body>
        <button onclick="getUserMedia_click();">getUserMedia</button>
        <video id="localVideo1" autoplay></video>
        <script>
            // the JavaScript from the following sections goes here
        </script>
    </body>
    </html>

    Microsoft CU-RTC-Web

    Microsoft would not be Microsoft if it had not immediately responded to Google's initiative by releasing its own incompatible, non-standard variant called CU-RTC-Web (html5labs.interoperabilitybridges.com/cu-rtc-web/cu-rtc-web.htm). Although the already small share of IE continues to decline, the number of Skype users gives Microsoft hope to displace Google, and it can be assumed that this standard will be used in browser versions of Skype. The Google standard is focused primarily on communication between browsers; at the same time, the bulk of voice traffic still remains on the regular telephone network, and gateways between it and IP networks are needed not only for convenience or faster adoption, but also as a means of monetization that will let more players develop them. The emergence of another standard may not only create the unpleasant need for developers to support two incompatible technologies at once, but may in the future also give the user a wider choice of functionality and available technical solutions. Wait and see.

    Enabling Local Stream

Inside the <script> tags of our HTML file, let's declare a global variable for the media stream:

var localStream = null;

    The first parameter to the getUserMedia method must specify the parameters of the requested media stream - for example, simply enable audio or video:

var streamConstraints = { "audio": true, "video": true }; // Request access to both audio and video

    Or specify additional parameters:

var streamConstraints = {
    "audio": true,
    "video": {
        "mandatory": {
            "maxWidth": "320",
            "maxHeight": "240",
            "maxFrameRate": "5"
        },
        "optional": []
    }
};

As the second parameter of the getUserMedia method, we must pass the callback function that will be called on success:

function getUserMedia_success(stream) {
    console.log("getUserMedia_success():", stream);
    localVideo1.src = URL.createObjectURL(stream); // Connect the media stream to the HTML <video> element
    localStream = stream; // and save it in a global variable for future use
}

The third parameter is a callback function, an error handler that will be called in case of an error:

function getUserMedia_error(error) {
    console.log("getUserMedia_error():", error);
}

The actual call to the getUserMedia method - a request for access to the camera and microphone - happens when the first button is pressed:

function getUserMedia_click() {
    console.log("getUserMedia_click()");
    navigator.webkitGetUserMedia(streamConstraints, getUserMedia_success, getUserMedia_error);
}
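The prefixed callback form shown here reflects the draft standard current at the time of writing. Browsers have since converged on an unprefixed, promise-based form of the same call; the sketch below is an aside (not part of the rtctest1.html demo), guarded so it can also be loaded outside a browser:

```javascript
// Modern, unprefixed form of getUserMedia (an aside; not part of rtctest1.html).
// The guard lets this sketch load even where no browser media API exists.
function requestUserMedia(constraints) {
    if (typeof navigator === "undefined" || !navigator.mediaDevices) {
        return null; // no browser media API available
    }
    return navigator.mediaDevices.getUserMedia(constraints)
        .then(function (stream) { return stream; })
        .catch(function (error) { console.log("getUserMedia error:", error); });
}

requestUserMedia({ audio: true, video: true });
```

In a browser, the returned promise resolves with the same MediaStream object that the callback version delivers to its success handler.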

    It is not possible to access a media stream from a file opened locally. If we try to do this, we will get an error:

NavigatorUserMediaError {code: 1, PERMISSION_DENIED: 1}

    Let's upload the resulting file to the server, open it in the browser and, in response to the request that appears, allow access to the camera and microphone.

In Chrome, you can choose which devices the browser may access in Settings: the "Show advanced settings" link, the "Privacy" section, the "Content settings" button. In Firefox and Opera, the device is selected from a drop-down list directly when access is granted.

Over plain HTTP, permission will be requested anew each time the media stream is accessed after a page load. Switching to HTTPS allows the request to be shown once, only the very first time the stream is accessed.

Notice the pulsating circle on the tab icon and the camera icon on the right side of the address bar:

RTCPeerConnection

RTCPeerConnection is the object that establishes and transmits media streams over the network between participants. It is also responsible for generating the media session description (SDP), obtaining information about ICE candidates for traversing NAT or firewalls (local and via STUN), and interacting with a TURN server. Each participant must have one RTCPeerConnection per connection. Media streams are transmitted over the encrypted SRTP protocol.

    TURN servers

There are three types of ICE candidates: host, srflx and relay. Host contains information obtained locally, srflx describes how the node looks to an external server (STUN), and relay contains information for proxying traffic through a TURN server. If our node is behind NAT, host candidates will contain local addresses and be useless, srflx candidates will help only with certain types of NAT, and relay candidates will be the last hope of passing traffic through an intermediate server.

    Example of an ICE candidate of type host, with address 192.168.1.37 and port udp/34022:

a=candidate:337499441 2 udp 2113937151 192.168.1.37 34022 typ host generation 0
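To make the field layout of such a line concrete, here is a small helper of our own (not part of the demo; the name and field names are illustrative) that splits an a=candidate line into its parts:

```javascript
// Hypothetical helper: parse the fields of an SDP "a=candidate" line
// like the host example above.
function parseCandidate(line) {
    // Strip the optional "a=" prefix and the "candidate:" label
    var parts = line.replace(/^a=/, "").replace(/^candidate:/, "").split(" ");
    return {
        foundation: parts[0],
        component: parseInt(parts[1], 10),   // 1 = RTP, 2 = RTCP
        transport: parts[2].toLowerCase(),   // "udp" or "tcp"
        priority: parseInt(parts[3], 10),
        address: parts[4],
        port: parseInt(parts[5], 10),
        type: parts[7]                       // "host", "srflx" or "relay"
    };
}

var c = parseCandidate(
    "a=candidate:337499441 2 udp 2113937151 192.168.1.37 34022 typ host generation 0");
// c.type is "host", c.address is "192.168.1.37", c.port is 34022
```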

    General format for specifying STUN/TURN servers:

var servers = {
    "iceServers": [
        { "url": "stun:stun.stunprotocol.org:3478" },
        { "url": "turn:user@host:port", "credential": "password" }
    ]
};

There are many public STUN servers on the Internet, and large lists of them are easy to find. Unfortunately, they solve only part of the problem. Public TURN servers, unlike STUN, practically do not exist: a TURN server passes the media streams through itself, which can heavily load both the network channel and the server. The easiest way to get a TURN server is therefore to install one yourself (obviously, you will need a public IP). Of all the servers, in my opinion, the best is rfc5766-turn-server. There is even a ready-made image of it for Amazon EC2.

With TURN, things are not yet as good as we would like, but active development is underway, and one can hope that in time WebRTC will, if not equal Skype in its ability to traverse network address translation (NAT) and firewalls, at least come noticeably closer.

RTCPeerConnection requires an additional mechanism for exchanging control information to establish a connection: although it generates this data, it does not transmit it, and delivery to the other participants must be implemented separately.


The choice of transport is left to the developer - it could even be done by hand. As soon as the necessary data has been exchanged, RTCPeerConnection establishes the media streams automatically (if possible, of course).

The offer/answer model

To establish and modify media streams, the offer/answer model (described in RFC 3264) and SDP (Session Description Protocol) are used; the SIP protocol uses them as well. In this model there are two agents: the Offerer, who generates an SDP description of the session to create a new one or modify an existing one (the Offer SDP), and the Answerer, who receives the SDP session description from the other agent and responds with its own (the Answer SDP). The specification requires a higher-level protocol (for example, SIP, or one's own protocol over web sockets, as in our case) that is responsible for transmitting the SDP between agents.

What data needs to be passed between two RTCPeerConnections so that they can successfully establish media streams:

• The first participant, initiating the connection, forms an Offer in which it transmits an SDP data structure (SIP uses the same protocol for the same purpose) describing the possible characteristics of the media stream it is about to start transmitting. This data block must be transferred to the second participant. The second participant forms an Answer with its own SDP and sends it to the first.
• Both participants perform the procedure of determining the possible ICE candidates through which the other participant can transmit media streams to them. As candidates are identified, information about them should be passed to the other participant.
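The steps above can be sketched with two fake peers standing in for RTCPeerConnection and direct function calls standing in for the signaling transport (all names here are illustrative, not real WebRTC API):

```javascript
// A schematic model of the offer/answer exchange. FakePeer only records
// what it receives; a real peer would be an RTCPeerConnection.
function FakePeer(name) {
    this.name = name;
    this.localDesc = null;
    this.remoteDesc = null;
    this.remoteCandidates = [];
}
FakePeer.prototype.createOffer = function () {
    this.localDesc = { type: "offer", sdp: "sdp-from-" + this.name };
    return this.localDesc;
};
FakePeer.prototype.createAnswer = function () {
    this.localDesc = { type: "answer", sdp: "sdp-from-" + this.name };
    return this.localDesc;
};
FakePeer.prototype.setRemoteDescription = function (desc) {
    this.remoteDesc = desc;
};
FakePeer.prototype.addIceCandidate = function (candidate) {
    this.remoteCandidates.push(candidate);
};

var caller = new FakePeer("caller"), callee = new FakePeer("callee");
callee.setRemoteDescription(caller.createOffer());   // step 1: Offer SDP
caller.setRemoteDescription(callee.createAnswer());  // step 2: Answer SDP
caller.addIceCandidate("candidate-from-callee");     // step 3: ICE candidates,
callee.addIceCandidate("candidate-from-caller");     // ...in both directions
```

In the real exchange, every one of these cross-peer calls has to travel over some transport (in this article, web sockets) rather than being a direct call.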

Forming the Offer

To generate an Offer we need two callback functions. The first, passed as the first parameter of the createOffer() method, will be called when the Offer is formed successfully. The second parameter of createOffer() is a callback function called in case of an error during its execution (provided that the local stream is already available).

Additionally, two event handlers are needed: onicecandidate, for when a new ICE candidate is determined, and onaddstream, for when a media stream is connected from the far side. Let's go back to our file and add another button to the HTML after the line with the <video> element:

<button onclick="createOffer_click();">createOffer</button>

And after the line with the <button> element, add another <video> element for the stream from the far side (for the future):

<video id="remoteVideo1" autoplay></video>
    Also at the beginning of the JavaScript code we will declare a global variable for RTCPeerConnection:

var pc1;

    When calling the RTCPeerConnection constructor, you must specify STUN/TURN servers. For more information about them, see the sidebar; as long as all participants are on the same network, they are not required.

var servers = null;

    Parameters for preparing Offer SDP

var offerConstraints = {};

The first parameter of the createOffer() method is a callback function called upon successful formation of the Offer:

function pc1_createOffer_success(desc) {
    console.log("pc1_createOffer_success(): \ndesc.sdp:\n" + desc.sdp + "desc:", desc);
    pc1.setLocalDescription(desc); // Set the generated Offer SDP on the RTCPeerConnection with the setLocalDescription method.
    // When the far side sends its Answer SDP, it will have to be set with the setRemoteDescription method.
    // Until the second side is implemented, we do nothing here
    // pc2_receivedOffer(desc);
}

The second parameter is a callback function that will be called in case of an error:

function pc1_createOffer_error(error) {
    console.log("pc1_createOffer_error(): error:", error);
}

    And let’s declare a callback function to which ICE candidates will be passed as they are determined:

function pc1_onicecandidate(event) {
    if (event.candidate) {
        console.log("pc1_onicecandidate():\n" + event.candidate.candidate.replace("\r\n", ""), event.candidate);
        // Until the second side is implemented, we do nothing here
        // pc2.addIceCandidate(new RTCIceCandidate(event.candidate));
    }
}

    And also a callback function for adding a media stream from the far side (for the future, since for now we only have one RTCPeerConnection):

function pc1_onaddstream(event) {
    console.log("pc1_onaddstream()");
    remoteVideo1.src = URL.createObjectURL(event.stream);
}

When the "createOffer" button is clicked, we create an RTCPeerConnection, set the onicecandidate and onaddstream handlers and request the formation of an Offer SDP by calling the createOffer() method:

function createOffer_click() {
    console.log("createOffer_click()");
    pc1 = new webkitRTCPeerConnection(servers); // Create the RTCPeerConnection
    pc1.onicecandidate = pc1_onicecandidate; // Callback function for processing ICE candidates
    pc1.onaddstream = pc1_onaddstream; // Callback function called when a media stream arrives from the far side. There is none yet
    pc1.addStream(localStream); // Pass in the local media stream (assuming it has already been obtained)
    pc1.createOffer( // And actually request formation of the Offer
        pc1_createOffer_success,
        pc1_createOffer_error,
        offerConstraints);
}

Let's save the file as rtctest2.html, upload it to the server, open it in a browser and watch in the console what data is generated as it runs. The second video will not appear yet, since there is only one participant. Recall that the SDP is a description of the media session parameters - the available codecs and media streams - while the ICE candidates are the possible ways of connecting to a given participant.
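As a small illustration of what the SDP describes, here is a helper of our own (not part of the demo; the name is made up) that lists the media sections ("m=" lines) of an SDP blob, e.g. to see which streams an Offer carries:

```javascript
// Hypothetical helper: return the media section names of an SDP string.
function listMediaSections(sdp) {
    return sdp.split("\r\n").filter(function (line) {
        return line.indexOf("m=") === 0;   // keep only "m=..." lines
    }).map(function (line) {
        return line.split(" ")[0].slice(2); // "audio", "video", ...
    });
}

var sections = listMediaSections(
    "v=0\r\no=- 0 0 IN IP4 127.0.0.1\r\nm=audio 9 RTP/SAVPF 111\r\nm=video 9 RTP/SAVPF 100\r\n");
// sections is ["audio", "video"]
```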

    Formation of Answer SDP and exchange of ICE candidates

Both the Offer SDP and each ICE candidate must be transferred to the other side, where, upon receiving them, the RTCPeerConnection calls setRemoteDescription for the Offer SDP and addIceCandidate for each ICE candidate received from the far side; and similarly in the opposite direction for the Answer SDP and the remote ICE candidates. The Answer SDP itself is formed much like the Offer; the difference is that the createAnswer method is called instead of createOffer, and before that the RTCPeerConnection's setRemoteDescription method is given the Offer SDP received from the caller.

Let's add another <video> element to the HTML:

<video id="remoteVideo2" autoplay></video>
    And a global variable for the second RTCPeerConnection under the declaration of the first one:

var pc2;

    Processing Offer and Answer SDP

Forming the Answer SDP is very similar to the Offer. In the callback function called upon successful formation of the Answer, we set the local description, as with the Offer, and pass the resulting Answer SDP to the first participant:

function pc2_createAnswer_success(desc) {
    pc2.setLocalDescription(desc);
    console.log("pc2_createAnswer_success()", desc.sdp);
    pc1.setRemoteDescription(desc);
}

    The callback function, called in case of an error when generating Answer, is completely similar to Offer:

function pc2_createAnswer_error(error) {
    console.log("pc2_createAnswer_error():", error);
}

    Parameters for forming Answer SDP:

var answerConstraints = { "mandatory": { "OfferToReceiveAudio": true, "OfferToReceiveVideo": true } };

    When the second participant receives the Offer, we will create an RTCPeerConnection and form an Answer in the same way as the Offer:

function pc2_receivedOffer(desc) {
    console.log("pc2_receivedOffer()", desc);
    // Create an RTCPeerConnection object for the second participant, just as for the first
    pc2 = new webkitRTCPeerConnection(servers);
    pc2.onicecandidate = pc2_onicecandidate; // Set the handler called when an ICE candidate appears
    pc2.onaddstream = pc2_onaddstream; // When a stream appears, connect it to the HTML <video>
    pc2.addStream(localStream); // Pass in the local media stream (in our example the second participant has the same one as the first)
    // Now that the second RTCPeerConnection is ready, pass it the received Offer SDP
    pc2.setRemoteDescription(new RTCSessionDescription(desc));
    // Ask the second connection to generate the data for the Answer message
    pc2.createAnswer(pc2_createAnswer_success, pc2_createAnswer_error, answerConstraints);
}

To transfer the Offer SDP from the first participant to the second in our example, let's uncomment the call line in the pc1_createOffer_success() function:

pc2_receivedOffer(desc);

To implement the processing of ICE candidates, let's uncomment, in the first participant's ICE candidate readiness handler pc1_onicecandidate(), their transfer to the second:

pc2.addIceCandidate(new RTCIceCandidate(event.candidate));

The second participant's ICE candidate readiness handler mirrors the first's:

function pc2_onicecandidate(event) {
    if (event.candidate) {
        console.log("pc2_onicecandidate():", event.candidate.candidate);
        pc1.addIceCandidate(new RTCIceCandidate(event.candidate));
    }
}

    Callback function for adding a media stream from the first participant:

function pc2_onaddstream(event) {
    console.log("pc2_onaddstream()");
    remoteVideo2.src = URL.createObjectURL(event.stream);
}

    Ending the connection

Let's add another button to the HTML:

<button onclick="btnHangupClick();">Hang Up</button>

    And a function to terminate the connection

function btnHangupClick() {
    // Disconnect the local video from the HTML element, stop the local media stream, clear the pointer
    localVideo1.src = "";
    localStream.stop();
    localStream = null;
    // For each participant, disconnect the video from the HTML element, close the connection, clear the pointer
    remoteVideo1.src = "";
    pc1.close();
    pc1 = null;
    remoteVideo2.src = "";
    pc2.close();
    pc2 = null;
}

Let's save it as rtctest3.html, upload it to the server and open it in the browser. This example implements two-way transmission of media streams between two RTCPeerConnections within the same browser tab. To organize the exchange of Offer and Answer SDP, ICE candidates and other information between participants over the network, the direct procedure calls will have to be replaced by an exchange over some kind of transport - in our case, web sockets.

    Screen broadcast

The getUserMedia function can also capture the screen and stream it as a MediaStream if the following parameters are specified:

var mediaStreamConstraints = {
    audio: false,
    video: { mandatory: { chromeMediaSource: "screen" }, optional: [] }
};

    To successfully access the screen, several conditions must be met:

• enable the screen capture flag for getUserMedia() in chrome://flags;
• the page must be loaded over HTTPS (SSL origin);
    • the audio stream should not be requested;
    • Multiple requests should not be executed in one browser tab.
    Libraries for WebRTC

Although WebRTC is not yet finished, several libraries based on it have already appeared. JsSIP is designed for creating browser-based softphones that work with SIP switches such as Asterisk and Kamailio. PeerJS makes it easier to create P2P networks for data exchange, and Holla reduces the amount of development required for P2P communications from browsers.

    Node.js and socket.io

    In order to organize the exchange of SDP and ICE candidates between two RTCPeerConnections via the network, we use Node.js with the socket.io module.

Installing the latest stable version of Node.js on Debian/Ubuntu goes as follows:

$ sudo apt-get install python-software-properties python g++ make
$ sudo add-apt-repository ppa:chris-lea/node.js
$ sudo apt-get update
$ sudo apt-get install nodejs

Installation on other operating systems is covered in the official documentation.

    Let's check:

$ echo 'var sys = require("util"); sys.puts("Test message");' > nodetest1.js
$ nodejs nodetest1.js

    Using npm (Node Package Manager) we will install socket.io and the additional express module:

    $ npm install socket.io express

    Let's test it by creating a nodetest2.js file for the server side:

$ nano nodetest2.js

var app = require("express")()
  , server = require("http").createServer(app)
  , io = require("socket.io").listen(server);

server.listen(80); // If port 80 is free

app.get("/", function (req, res) { // When the root page is requested
    res.sendfile(__dirname + "/nodetest2.html"); // send the HTML file
});

io.sockets.on("connection", function (socket) { // When a client connects
    socket.emit("server event", { hello: "world" }); // send it a message
    socket.on("client event", function (data) { // and declare a handler for messages from the client
        console.log(data);
    });
});

    And nodetest2.html for the client side:

$ nano nodetest2.html

<script src="/socket.io/socket.io.js"></script>
<script>
    var socket = io.connect("/"); // Websocket server URL (the root of the server the page was loaded from)
    socket.on("server event", function (data) {
        console.log(data);
        socket.emit("client event", { "name": "value" });
    });
</script>

    Let's start the server:

    $ sudo nodejs nodetest2.js

and open the page http://localhost:80 (if running locally on port 80) in a browser. If everything works, we will see in the browser's JavaScript console the exchange of events between the browser and the server on connection.

Exchanging information between RTCPeerConnections via web sockets

Client part

Let's save our main example (rtctest3.html) under the new name rtctest4.html and include the socket.io library in the <head> element:

<script src="/socket.io/socket.io.js"></script>

    And at the beginning of the JavaScript script - connecting to websockets:

var socket = io.connect("http://localhost");

    Let's replace the direct call to the functions of another participant by sending him a message via web sockets:

function pc1_createOffer_success(desc) {
    ...
    // pc2_receivedOffer(desc);
    socket.emit("offer", desc);
    ...
}

function pc2_createAnswer_success(desc) {
    ...
    // pc1.setRemoteDescription(desc);
    socket.emit("answer", desc);
}

function pc1_onicecandidate(event) {
    ...
    // pc2.addIceCandidate(new RTCIceCandidate(event.candidate));
    socket.emit("ice1", event.candidate);
    ...
}

function pc2_onicecandidate(event) {
    ...
    // pc1.addIceCandidate(new RTCIceCandidate(event.candidate));
    socket.emit("ice2", event.candidate);
    ...
}

In the btnHangupClick() function, instead of directly calling the functions of the second participant, we send a message via web sockets:

function btnHangupClick() {
    ...
    // remoteVideo2.src = ""; pc2.close(); pc2 = null;
    socket.emit("hangup", {});
}

    And add message receiving handlers:

socket.on("offer", function (data) {
    console.log('socket.on("offer"):', data);
    pc2_receivedOffer(data);
});
socket.on("answer", function (data) {
    console.log('socket.on("answer"):', data);
    pc1.setRemoteDescription(new RTCSessionDescription(data));
});
socket.on("ice1", function (data) {
    console.log('socket.on("ice1"):', data);
    pc2.addIceCandidate(new RTCIceCandidate(data));
});
socket.on("ice2", function (data) {
    console.log('socket.on("ice2"):', data);
    pc1.addIceCandidate(new RTCIceCandidate(data));
});
socket.on("hangup", function (data) {
    console.log('socket.on("hangup"):', data);
    remoteVideo2.src = "";
    pc2.close();
    pc2 = null;
});

    Server part

On the server side, let's save the nodetest2.js file under the new name rtctest4.js, and inside the io.sockets.on("connection", function (socket) { ... }) function add the receiving and sending of client messages:

socket.on("offer", function (data) {
    // When the "offer" message is received - and since there is only one
    // client connection in this example - we send the message back through
    // the same socket
    socket.emit("offer", data);
    // If the message had to be relayed over all connections
    // except the sender's:
    // socket.broadcast.emit("offer", data);
});
socket.on("answer", function (data) { socket.emit("answer", data); });
socket.on("ice1", function (data) { socket.emit("ice1", data); });
socket.on("ice2", function (data) { socket.emit("ice2", data); });
socket.on("hangup", function (data) { socket.emit("hangup", data); });
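The per-message handlers all follow one pattern, so they can also be registered in a loop. The sketch below (a compact equivalent, with a minimal stub standing in for the real socket.io socket purely for illustration) shows the wiring:

```javascript
// Register echo-relay handlers for all signaling messages on a
// socket.io-style object (anything with on/emit).
function registerRelay(socket) {
    ["offer", "answer", "ice1", "ice2", "hangup"].forEach(function (name) {
        socket.on(name, function (data) {
            // Echo back on the same socket; with several clients you would
            // use socket.broadcast.emit(name, data) instead
            socket.emit(name, data);
        });
    });
}

// Minimal stub (illustrative only) to show the wiring outside socket.io:
function StubSocket() { this.handlers = {}; this.sent = []; }
StubSocket.prototype.on = function (name, fn) { this.handlers[name] = fn; };
StubSocket.prototype.emit = function (name, data) { this.sent.push([name, data]); };

var s = new StubSocket();
registerRelay(s);
s.handlers["offer"]({ sdp: "..." }); // simulate an incoming "offer" message
// s.sent now contains [["offer", { sdp: "..." }]]
```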

    In addition, let's change the name of the HTML file:

// res.sendfile(__dirname + "/nodetest2.html"); // Send the HTML file
res.sendfile(__dirname + "/rtctest4.html");

    Starting the server:

$ sudo nodejs rtctest4.js

Although the code of both clients runs within the same browser tab, all interaction between the participants in our example is carried out entirely over the network, so "separating" the participants presents no particular difficulty. But what we did was also very simple - these technologies are good precisely because they are easy to use, even if that ease is sometimes deceptive. In particular, let's not forget that without STUN/TURN servers our example will not work in the presence of address translation and firewalls.

    Conclusion

The resulting example is very rough, but if we generalize the event handlers slightly so that they do not differ between the caller and the called party, replace the two objects pc1 and pc2 with an array of RTCPeerConnections and implement the dynamic creation and removal of elements, we will get a perfectly usable video chat. There are no special WebRTC-specific obstacles here, and an example of a simple video chat for several participants (along with the code of all examples in this article) is on the disk that comes with the magazine. Quite a few good examples can also be found on the Internet; in particular, the following were used in preparing this article: simpl.info getUserMedia, simpl.info RTCPeerConnection, WebRTC Reference App.

It can be assumed that very soon, thanks to WebRTC, there will be a revolution not only in our understanding of voice and video communications, but also in how we perceive the Internet as a whole. WebRTC is positioned not only as a technology for browser-to-browser calls, but as a general real-time communication technology. The video communication we have discussed is only a small part of its possible uses. There are already examples of screencasting and collaborative work, and even a browser-based P2P content delivery network built on RTCDataChannel.

