This repository has been archived by the owner on Feb 21, 2022. It is now read-only.

Protocol Overview

Alexander Thoukydides edited this page Dec 26, 2017 · 4 revisions

SkyBell HD Doorbell Protocol Overview

⚠️ The following description is based on sniffing the (encrypted) packets sent and received by a SkyBell HD doorbell and its companion iPhone app. There may be significant errors and omissions.

SkyBell HD uses three main protocols between the doorbell, its cloud services hosted on AWS (Amazon Web Services), and the SkyBell HD phone app:

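  • CoAP (Constrained Application Protocol), presumably over DTLS (Datagram Transport Layer Security):
    • This is used for signalling between the doorbell and the cloud: periodic status messages, event notifications, and call setup and termination.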
  • HTTPS (HTTP Secure):
    • This is used by the doorbell to send avatar images and recorded video to the cloud.
    • It is also used by the app to communicate with the cloud, including to download previously recorded videos. See details of the HTTPS requests.
  • SRTP (Secure Real-time Transport Protocol):
    • This is used only to transfer the video and audio for Watch Live calls between the doorbell and app (proxied via the cloud). The cloud does not record this stream. See details of the SRTP streams.

(Other protocols such as ARP, DHCP and NTP are also used for their usual purposes.)

All communication between the doorbell and app is routed via the cloud services. The app never connects directly to the doorbell.
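The routing described above can be summarised as a small lookup table. The following is only an illustrative sketch: the `Leg` type, the `LEGS` table, and `protocols_between` are hypothetical names introduced here, and the purposes are paraphrased from this page rather than taken from any SkyBell specification.

```python
from typing import List, NamedTuple


class Leg(NamedTuple):
    """One observed communication path (hypothetical model, not a SkyBell API)."""
    endpoint_a: str
    endpoint_b: str
    protocol: str
    purpose: str


# Every leg has the cloud at one end; there is no doorbell <-> app entry.
LEGS: List[Leg] = [
    Leg("doorbell", "cloud", "CoAP", "notifications and call control"),
    Leg("doorbell", "cloud", "HTTPS", "avatar image and recorded video upload"),
    Leg("app", "cloud", "HTTPS", "cloud requests and recorded video download"),
    Leg("doorbell", "cloud", "SRTP", "Watch Live video and audio"),
    Leg("app", "cloud", "SRTP", "Watch Live video and audio"),
]


def protocols_between(a: str, b: str) -> List[str]:
    """Return the sorted protocol names used directly between two endpoints."""
    return sorted(
        {leg.protocol for leg in LEGS if {leg.endpoint_a, leg.endpoint_b} == {a, b}}
    )
```

Looking up `protocols_between("doorbell", "app")` yields an empty list, which mirrors the observation that the app never connects directly to the doorbell.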

Idle

When the doorbell is idle it periodically:

  • Sends messages over CoAP to SkyBell's cloud services, approximately every 30 seconds. These are presumably to indicate the IP address at which the doorbell is reachable. Additional CoAP exchanges of unknown purpose occur occasionally, typically every one or two hours.
  • Queries an NTP server to update its clock, approximately every 340 seconds. Each request is to a different server.
  • Uploads a new avatar image via HTTPS once per hour. Each upload involves a short request followed by a longer request to a different server. Presumably the first request asks where the image should be pushed.
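To get a feel for how much idle traffic these intervals imply, the schedule can be simulated. This is a rough sketch only: the event names and the `idle_timeline` helper are hypothetical, the intervals are the approximate values observed above, and it assumes (without evidence) that each timer first fires one full interval after the start.

```python
from typing import List, Tuple

# Approximate intervals in seconds, as observed above (not exact values).
IDLE_INTERVALS_S = {
    "coap_status": 30,      # periodic CoAP message to the cloud
    "ntp_query": 340,       # clock update from an NTP server
    "avatar_upload": 3600,  # hourly avatar image upload over HTTPS
}


def idle_timeline(duration_s: int) -> List[Tuple[int, str]]:
    """List (time, event) pairs that fall within the first duration_s seconds."""
    events = []
    for name, interval in IDLE_INTERVALS_S.items():
        # Assumption: the first event occurs one interval after the start.
        for t in range(interval, duration_s + 1, interval):
            events.append((t, name))
    return sorted(events)


def count_events(duration_s: int) -> dict:
    """Count how many times each idle event occurs within duration_s seconds."""
    counts = {name: 0 for name in IDLE_INTERVALS_S}
    for _, name in idle_timeline(duration_s):
        counts[name] += 1
    return counts
```

Under these assumptions, `count_events(3600)` gives the number of each kind of exchange in one idle hour. The occasional CoAP exchanges of unknown purpose are not modelled.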

Button Press or Motion Event

When the doorbell button is pressed (or motion is detected):

  1. The doorbell sends a notification over CoAP to SkyBell's cloud services, which forward the notification via Apple Push Notification service (APNs) or Google Cloud Messaging (GCM) as appropriate. Meanwhile, the doorbell begins to record a video locally.
  2. When the recording ends the doorbell pushes it to the cloud for storage via an HTTPS request.
  3. The app can subsequently poll the list of recordings and download videos for playback, all over HTTPS.
    ┌────────────┐        ┌─────┐          ┌──────────┐     ┌─────────────┐
    │ SkyBell HD │        │ AWS │          │ APNs/GCM │     │ iOS/Android │
    └─────┬──────┘        └──┬──┘          └────┬─────┘     └──────┬──────┘
          │                  │                  │                  │
1. <Button pressed>          │                  │                  │
          │                  │                  │                  │
        ┌─┴─┐Notification    │                  │                  │
        │ R │(CoAP)          │Notification      │                  │
        │ e ├───────────────►│(HTTPS)           │Notification      │
        │ c │                ├─────────────────►│(APNs/GCM)        │
        │ o │                │                  ├─────────────────►│
        │ r │                │                  │                  │
        │ d │                │                  │                  │
        │   │                │                  │                  │
        │ v │                │                  │                  │
        │ i │                │                  │                  │
        │ d │                │                  │                  │
        │ e │                │                  │                  │
        │ o │                │                  │                  │
        └─┬─┘                │                  │                  │
2.        │Push video        │                  │                  │
          │(HTTPS)           │                  │                  │
          ├─────────────────►│                  │                  │
          │                  │                  │                  │
3.        │                  │                  │           <App launched>
          │                  │                  │                  │
          │                  │                  │   Poll activities│
          │                  │                  │           (HTTPS)│
          │                  │◄────────────────────────────────────┤
          │                  │                  │    Download video│
          │                  │                  │           (HTTPS)│
          │                  │◄────────────────────────────────────┤
          │                  │                  │                  │
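The sequence in the diagram above can also be written down as an ordered list of messages, which may be convenient when matching it against a packet capture. This is a minimal sketch under stated assumptions: the `Message` type, the `button_press_flow` function, and the platform keys are hypothetical names, and the contents of each message are not modelled because they are encrypted and unknown.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Message:
    """One arrow in the sequence diagram (hypothetical model)."""
    step: int
    sender: str
    receiver: str
    protocol: str
    description: str


def button_press_flow(platform: str) -> List[Message]:
    """Return the observed message order for a button press or motion event."""
    # Assumption: iOS apps are notified via APNs and Android apps via GCM.
    push_service = {"ios": "APNs", "android": "GCM"}[platform]
    return [
        Message(1, "doorbell", "cloud", "CoAP", "notification"),
        Message(1, "cloud", push_service, "HTTPS", "notification"),
        Message(1, push_service, "app", push_service, "notification"),
        Message(2, "doorbell", "cloud", "HTTPS", "push video"),
        Message(3, "app", "cloud", "HTTPS", "poll activities"),
        Message(3, "app", "cloud", "HTTPS", "download video"),
    ]
```

For example, `button_press_flow("ios")` lists six messages, none of which passes directly between the doorbell and the app.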

Watch Live

If Watch Live is requested from the app then:

  1. The app requests a call by sending an HTTPS request to the SkyBell cloud. The cloud selects SRTP parameters and provides them to the app via the HTTPS response and to the doorbell via CoAP.
  2. The doorbell and app then start the SRTP streams, which are transparently proxied via the cloud services. The doorbell also begins to record a video locally.
  3. Either the doorbell or app can terminate the call, via CoAP or HTTPS respectively. The doorbell then pushes the recorded video to the cloud via HTTPS.
    ┌────────────┐                  ┌─────┐                  ┌─────────────┐
    │ SkyBell HD │                  │ AWS │                  │ iOS/Android │
    └─────┬──────┘                  └──┬──┘                  └──────┬──────┘
          │                            │                            │
1.        │                            │                       <Watch Live>
          │                            │                            │
          │                            │                Request call│
          │              Request call  │                     (HTTPS)│
          │                    (CoAP)┌─┴─┐◄─────────────────────────┤
          │◄─────────────────────────┤   │                          │
          │                          │   │                          │
2.      ┌─┴─┐Video & audio           │ S │             Video & audio│
        │ R │(SRTP)                  │ R │                    (SRTP)│
        │ e ├───────────────────────►│ T ├─────────────────────────►│
        │ c │───────────────────────►│ P ├─────────────────────────►│
        │ o │◄───────────────────────┤   │◄─────────────────────────┤
        │ r │                        │ p │                          │
        │ d │                        │ r │                          │
        │   │                        │ o │                          │
3.      │ v │                        │ x │            Terminate call│
        │ i │          Terminate call│ y │                   (HTTPS)│
        │ d │                  (CoAP)│   │◄─────────────────────────┤
        │ e │◄───────────────────────┤   │                          │
        │ o │                        └─┬─┘                          │
        └─┬─┘                          │                            │
          │Push video                  │                            │
          │(HTTPS)                     │                            │
          ├───────────────────────────►│                            │

A similar sequence occurs if a call is initiated following a button press, except that the doorbell continues its existing video recording.
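The Watch Live sequence can be viewed as a small state machine. The sketch below is purely illustrative: the state names, event names, and `run_call` helper are hypothetical, and it models only the call started from the app, not the variant that follows a button press.

```python
from enum import Enum, auto


class CallState(Enum):
    IDLE = auto()
    REQUESTED = auto()   # call requested via HTTPS and relayed to the doorbell via CoAP
    STREAMING = auto()   # SRTP streams running through the cloud proxy
    TERMINATED = auto()  # call ended; recording not yet uploaded


# (current state, event) -> next state
TRANSITIONS = {
    (CallState.IDLE, "app_requests_call"): CallState.REQUESTED,
    (CallState.REQUESTED, "srtp_streams_started"): CallState.STREAMING,
    (CallState.STREAMING, "app_terminates_call"): CallState.TERMINATED,       # via HTTPS
    (CallState.STREAMING, "doorbell_terminates_call"): CallState.TERMINATED,  # via CoAP
    (CallState.TERMINATED, "doorbell_pushes_video"): CallState.IDLE,          # via HTTPS
}


def run_call(events, state=CallState.IDLE):
    """Apply events in order and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not expected in state {state.name}")
        state = TRANSITIONS[key]
    return state
```

Feeding the events `app_requests_call`, `srtp_streams_started`, `app_terminates_call`, and `doorbell_pushes_video` in that order returns the model to `CallState.IDLE`, while an out-of-order event raises `ValueError`.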