Video Chat Stack Options

Michael Sawan edited this page Feb 5, 2024 · 1 revision

Introduction

Because it is a telehealth app, our senior design project needs to facilitate video conversations between doctors and their patients. We therefore evaluated several technologies for adding video chat to the app. Below, we describe each option and highlight its key features, advantages, and disadvantages. Although WebRTC and WebSockets would have given us the most flexibility and been most conducive to learning how these technologies work, we decided to go with Agora.io because of its simplicity and generous free tier limits.

Options

1. WebRTC over WebSockets

  • Description: WebRTC (Web Real-Time Communication) is an open-source project that provides web browsers and mobile applications with real-time communication via simple APIs. It allows video, voice, and generic data to be sent between peers, so developers can build powerful voice- and video-communication solutions. WebSocket is a protocol that provides a full-duplex communication channel over a single, persistent connection between a client and a server. The low latency and persistent connection of WebSockets make them well suited to setting up a video chat session. Combining the two requires using WebRTC on the clients and creating a WebSocket server that relays the connection setup (signaling) messages needed to establish the video link between the host and the recipient.
  • Pros:
    • Optimized Real-Time Communication: WebRTC is specifically designed for real-time media applications, providing high-quality video and audio streaming. This ensures an excellent user experience in our video chat component.
    • Low Latency: WebSockets offer a persistent, low-latency connection that is ideal for sending real-time control messages, chat, or updates during a video call, enhancing the interactivity of our senior design project.
    • Scalability: Using WebSockets for non-media related data (like text chat, file sharing, or control signals) offloads some tasks from the WebRTC layer, potentially improving scalability and performance of the video chat.
    • Compatibility: Both technologies work in most modern browsers without external dependencies.
  • Cons:
    • Complexity in Implementation: Integrating WebRTC with WebSockets adds complexity to the development process. We need to handle two different technologies, each with its own set of APIs and quirks.
    • Handling Disconnections: Both WebRTC and WebSockets require robust handling of network disconnections and reconnections, which can add to the development complexity.
    • Security Considerations: Both technologies require careful attention to security. WebSockets need secure handling to prevent attacks like message interception or spoofing.
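The role of the WebSocket server in this option can be illustrated with a minimal sketch of a signaling relay. It is transport-agnostic on purpose: each peer is represented by a `send` callable, which in a real server would wrap a WebSocket connection. The class name `SignalingRoom`, the message type names, and the `from` field are illustrative assumptions, not part of WebRTC or of our project's code.

```python
import json
from typing import Callable, Dict

# Hypothetical message types for this sketch; WebRTC does not prescribe a
# signaling format, so these names are an assumption.
RELAYED_TYPES = {"offer", "answer", "ice-candidate"}


class SignalingRoom:
    """Relays signaling messages between the two peers of a one-to-one call."""

    def __init__(self) -> None:
        # Maps a peer id to a callable that delivers one text frame to that peer.
        self.peers: Dict[str, Callable[[str], None]] = {}

    def join(self, peer_id: str, send: Callable[[str], None]) -> bool:
        """Register a peer. A one-to-one call accepts at most two peers."""
        if peer_id not in self.peers and len(self.peers) >= 2:
            return False
        self.peers[peer_id] = send
        return True

    def leave(self, peer_id: str) -> None:
        """Forget a peer, for example after its connection drops."""
        self.peers.pop(peer_id, None)

    def handle(self, sender_id: str, raw: str) -> int:
        """Forward one message to the other peer; return the delivery count."""
        try:
            message = json.loads(raw)
        except json.JSONDecodeError:
            return 0  # ignore malformed frames
        if not isinstance(message, dict) or message.get("type") not in RELAYED_TYPES:
            return 0  # ignore anything that is not a known signaling message
        if sender_id not in self.peers:
            return 0  # ignore senders that never joined
        # The server stamps the sender, so a client cannot spoof this field.
        message["from"] = sender_id
        delivered = 0
        for peer_id, send in self.peers.items():
            if peer_id != sender_id:
                send(json.dumps(message))
                delivered += 1
        return delivered


# Usage: lists stand in for WebSocket connections.
doctor_inbox: list = []
patient_inbox: list = []
room = SignalingRoom()
room.join("doctor", doctor_inbox.append)
room.join("patient", patient_inbox.append)
room.handle("doctor", json.dumps({"type": "offer", "sdp": "<SDP from the client>"}))
print(patient_inbox)
```

The server never touches audio or video: it only passes the offer, answer, and ICE candidate messages along, after which the media flows through WebRTC. Disconnection handling and authentication, both listed in the cons above, are deliberately left out of this sketch.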

2. Twilio

  • Description: Founded in 2008, Twilio is a cloud communications platform that provides a suite of APIs and services for developers to build complex communication systems, including video chat applications. With Twilio's Video API, developers can create customized video chat experiences with features like one-to-one calling, group calling, and screen sharing. The API supports WebRTC and handles intricate aspects like signaling, network traversal, and media optimization. This means developers can focus on building unique features and user interfaces without worrying about the underlying complexities of real-time video communication.
  • Pros:
    • Ease of Use and Integration: Twilio provides comprehensive APIs and SDKs for various programming languages, making it relatively straightforward to integrate video chat into our telehealth app.
    • Advanced Features: Twilio supports advanced video chat functionalities such as screen sharing, group calls, and recording, allowing for the creation of feature-rich applications.
    • Global Reach and Reliability: With a strong global presence, Twilio offers reliable connectivity and high-quality video streaming across different geographical locations, which is crucial if we would like to have our application run reliably anywhere in the country.
  • Cons:
    • Cost Concerns: Twilio operates on a pay-as-you-go pricing model, which can become costly depending on the app’s usage patterns and the number of users. The free tier for Twilio is also rather limited, and may not suffice outside of our development testing.
    • Potential for Vendor Lock-in: Integrating deeply with Twilio’s services might lead to vendor lock-in, making it challenging to switch providers in the future without significant redevelopment.
    • Inconsistent User Experience Across Devices: Twilio's performance can vary depending on the end user's device capabilities and operating system. Twilio's Video Chat SDK, which is resource-intensive, might not perform optimally on older or less powerful devices.

3. Agora

  • Description: Founded in 2014, Agora.io provides a platform for real-time communication that focuses on delivering high-quality, low-latency video and voice calls. Agora's technology is based on a proprietary real-time communication (RTC) network, which ensures reliable and scalable performance across global locations. Additionally, Agora.io includes features like adaptive bitrate streaming, automatic noise cancellation, and echo reduction, enhancing the overall user experience in video chats.
  • Pros:
    • Ultra-Low Latency: Agora specializes in ultra-low latency streaming, which is crucial for our telehealth app: we do not want doctors or patients missing anything that the other might say.
    • Strong Network Optimization: Agora's intelligent network optimization ensures stable performance even in varying network conditions, which is vital for maintaining call quality.
    • Advanced Features: Like Twilio, Agora supports advanced video chat functionalities such as screen sharing, echo reduction, and group calls, which can improve the telehealth experience.
    • Generous Free Tier: Agora provides a very generous free tier where the first 10,000 minutes of video chat each month are free. This is more than sufficient for our testing and demonstrations.
  • Cons:
    • Dependence on External Service: Relying on Agora’s infrastructure means dependence on their service for critical application functionality, which can pose risks in terms of service stability and data privacy.
    • Potential Data Compliance Issues: Depending on Cigna's data collection policy, it may be problematic that Agora retains a lot of data from the video chat, even if we are not using the recording functionality.
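To make the free tier claim concrete, here is a rough usage estimate. The 10,000-minute figure comes from the text above; the assumption that usage is counted per participant per minute, the 30-day month, and the example call volume are ours and should be checked against Agora's current pricing page.

```python
FREE_MINUTES_PER_MONTH = 10_000  # free tier figure quoted on this page


def estimate_monthly_minutes(calls_per_day: int,
                             minutes_per_call: int,
                             participants_per_call: int = 2,
                             days_per_month: int = 30) -> int:
    """Estimate billed minutes, assuming each participant's minutes count."""
    return calls_per_day * minutes_per_call * participants_per_call * days_per_month


def free_tier_headroom(used_minutes: int) -> int:
    """Minutes left in the free tier; a negative value means it is exceeded."""
    return FREE_MINUTES_PER_MONTH - used_minutes


# Hypothetical demo load: five 15-minute doctor-patient calls per day.
used = estimate_monthly_minutes(calls_per_day=5, minutes_per_call=15)
print(used, free_tier_headroom(used))  # 4500 5500
```

Under these assumptions, a testing and demonstration workload stays well inside the free tier, while roughly 20 such calls per day would exceed it.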

4. Daily.co

  • Description: Daily.co is a startup that offers a powerful and user-friendly API for video and audio calls. It is especially focused on ease of integration, offering pre-built UI components as well as low-level APIs for more advanced users. The API supports a wide range of functionalities, including one-on-one calls, group calls, screen sharing, and call recording. It also puts more focus on security than its competitors, offering encrypted video streams to protect privacy and data.
  • Pros:
    • Ease of Integration: Daily.co provides a simple API and SDKs that allow for quick integration of video chat features into apps, which is especially beneficial for us if we are trying to rapidly prototype.
    • Pre-Built UI Components: The platform offers pre-built user interface components, which can significantly speed up development time and reduce the effort needed for designing and testing the UI from scratch.
    • Generous Free Tier: Like Agora, Daily.co provides a generous free tier where the first 10,000 minutes of video chat each month are free.
    • Security Focus: With encrypted video streams, Daily.co ensures a high level of security and privacy for communications, which is crucial for user trust and compliance with data protection regulations.
  • Cons:
    • Reliability Concerns as a Startup: As a relatively new player in the market, there may be concerns regarding the long-term stability and reliability of Daily.co, both in terms of business continuity and technical performance.
    • Dependence on Third-Party Service: As with Agora and Twilio, relying on Daily.co’s infrastructure means depending on their service for a critical application feature, which can pose risks to service availability and data control.

Conclusion

We looked at various technologies for adding video chat to our senior design project. Each had unique features, advantages, and disadvantages. When selecting a technology, we considered three main factors: (1) cost, (2) reliability, and (3) ease of integration. We initially thought of using WebSockets and WebRTC because that stack would be completely free: it can be run locally, which is sufficient for testing. However, adding further features became complicated with that stack, so we decided to use Agora instead. Agora, as explained above, is very easy to integrate, has a generous free tier (unlike Twilio), and is a more established product (unlike Daily.co).

References

https://webrtc.org/

https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API

https://www.twilio.com/en-us/company

https://www.agora.io/en/about-us/

https://www.daily.co/company/

Arvind Kasiliya