
Documentation

Welcome to the WearableIntelligenceSystem (WIS) documentation wiki! Here, we provide an overview of what the system is, what it can do, how it works, how you can contribute, etc. If you're keen to start and just want a quick overview, go ahead and read the Getting Started section below.

This is mainly written for developers and researchers. If you just want to use the WIS as a user, check out the README, which describes how to get it set up.

If you read through this and can't find an answer to your question, please create an issue.

Getting Started

The Wearable Intelligence System is a smart glasses home page and suite of powerful applications for users. But, you already know that from the README.

As a developer, the WIS represents a better way to develop applications for smart glasses. There are so many things to get right in a successful smart glasses application (including phone connection, voice transcription, sensor access, voice control, UI, edge AI, etc.) that building even simple apps for smart glasses can be difficult and time-consuming. The WIS has done a lot of the tough legwork for smart glasses development, allowing you to focus on building your app. It's also a step towards an egocentric operating system (OS), an OS which needs to work very differently because of the form factor, interface, and use cases that wearable applications demand.

Check out the main README for a high-level view of what the system is, what it's trying to accomplish, and how it helps users.

Ok, if you're here, you've already read the README. You're a developer, industry user, start-up, or other power user who wants to upgrade or modify the system in some way. If so, you or your team will need some basic knowledge of Android and Android Studio. If you've never used Android/Android Studio before, we recommend you run through this tutorial first.

Finally, if you upgrade the system in any way, please consider making a pull request, so everyone else can benefit from that change, too.

Abbreviations

These abbreviations will be used everywhere, so read them twice.

WIS - Wearable Intelligence System
ASP - Android Smart Phone
ASG - Android Smart Glasses
GLBOX - GNU/Linux 'Single Board Computer'/Laptop (just a regular old computer server)

Required Devices/Hardware + Abbreviations

You'll need at least two pieces of hardware to run the system:

  1. ASP - Android Smart Phone, currently supporting Android 9+
  2. ASG - Android Smart Glasses, currently supporting Android 5.1 (version update coming soon)
Officially Supported Smart Glasses

Vuzix Blade - Supported
Nreal Light - In development
Vuzix Shield - Coming Soon
Epson Moverio BT200 - Previously supported, may still work

If your device isn't supported, create an issue and we may be able to support it.

Install

User / Industry / Researcher Install

Please follow the instructions in the README for how to get the system up and running. Note that the consumer-facing application uses a server hosted by Emex Labs. If you want to use your own server, you'll need to follow the developer setup and install instructions.

Developer Install

If you're a developer and you want to build the system on your own machine, follow the instructions here. Remember that there are 3 main hardware components to this system, and each has its own build process.

To install the system, you have three options:

  1. USER INSTALL - Install the pre-built APKs on your ASP and ASG, and use the Emex Labs public GLBOX.
  2. DEVELOPER INSTALL - Build your own APKs for the ASP and ASG, and use the Emex Labs public GLBOX.
  3. DEVELOPER+ INSTALL - Build your own APKs for the ASP and ASG, and set up your own GLBOX.
1. User Install

Head on back to the README Install section for instructions on how to install without any modifications to the application.

2. Developer Install
  1. Clone this repo:
git clone [email protected]:emexlabs/WearableIntelligenceSystem.git #clone main repo
git submodule update --init --recursive #clone submodules
  2. Set up and build the ASG app with Android Studio.
  3. Set up and build the ASP app with Android Studio.
  4. Start the ASG app on the ASG and the ASP app on the ASP, then follow the WiFi hotspot section of the README Install section to get the system running.
3. Developer+ Install
  1. Clone this repo:
git clone [email protected]:emexlabs/WearableIntelligenceSystem.git #clone main repo
git submodule update --init --recursive #clone submodules
  2. Set up and build the ASG app with Android Studio.
  3. Set up and build the ASP app with Android Studio.
  4. Modify the ASP application at comms/RestServerComms.java and change the variable serverUrl to point to your GLBOX domain name or public IP address (a sketch follows this list).
  5. Install and run the GLBOX.
  6. Start the ASG app on the ASG and the ASP app on the ASP, then follow the WiFi hotspot section of the README Install section to get the system running.
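
As a rough illustration of step 4, the change amounts to pointing one field at your own server. A minimal sketch, assuming serverUrl is a simple string field (the URL below is a placeholder; check comms/RestServerComms.java for the actual declaration):

public class RestServerComms {
    // Point this at your own GLBOX domain name or public IP
    // (placeholder URL; see the actual field in comms/RestServerComms.java)
    private String serverUrl = "https://your-glbox.example.com";
}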

Architecture

The system is architected such that the ASG is treated as only an input/output device. The ASG receives human input and sensor input and sends all of that data immediately to the ASP. The ASP then processes and saves the data. Outputs from data processing (commands, insights, events, etc.) are then passed from the ASP to the ASG, and the ASG displays that info to the user. Thus, the ASG runs as little computation as possible.

There is also a cloud server. Due to data bandwidth limits (e.g., a 2 GB/month ISP data plan), streaming video and audio to a backend all the time is not possible. The backend handles functionality that is best suited to run on the cloud (mostly third-party API calls that require an external key) and will be expanded in the future to include a web interface.

Comms/Events/Data

To remain modular and decoupled in such a large and fast-changing system, we use JSON IPC between the ASP and ASG.

Data and function calls are passed around the application on an event bus. Right now, we are using RxJava as the event bus, with our own custom parsing, and all event keys can be found in comms/MessageTypes.java.

Instead of calling functions directly, which requires passing many objects around and becomes too complex in a system this big, we only pass around the "dataObservable" RxJava object, which handles sending data and triggering messages anywhere in the app. These events are multicast, so multiple systems can respond to the same message.

  • Soon, we'll move from RxJava to Android EventBus
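
To make the pattern concrete, here is a minimal, self-contained sketch of the pub/sub flow described above. The message-type key and values are illustrative placeholders, not the actual constants from comms/MessageTypes.java, and the repo's RxJava version may differ:

import io.reactivex.rxjava3.subjects.PublishSubject;
import org.json.JSONObject;

public class EventBusSketch {
    // the single multicast observable shared by all subsystems
    public static final PublishSubject<JSONObject> dataObservable = PublishSubject.create();

    public static void main(String[] args) throws Exception {
        // one subsystem subscribes and filters for the message type it cares about
        dataObservable
            .filter(msg -> "FINAL_TRANSCRIPT".equals(msg.optString("MESSAGE_TYPE_LOCAL")))
            .subscribe(msg -> System.out.println("got transcript: " + msg.optString("transcript")));

        // another subsystem posts an event instead of calling a function directly
        JSONObject event = new JSONObject();
        event.put("MESSAGE_TYPE_LOCAL", "FINAL_TRANSCRIPT");
        event.put("transcript", "hello world");
        dataObservable.onNext(event);
    }
}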

Codebase Overview

ASP
  • MainActivity.java - first class that is run on app launch, in charge of the UI and launching the WearableAiAspService.java background service
  • WearableAiAspService.java - where connection to the ASG starts and all processing happens. This launches connections, moves data around, runs processing, and stays alive in the background.
  • ASGRepresentative.java - a system that communicates with the ASG
  • GLBOXRepresentative.java - a system that communicates with the GLBOX
  • comms/ - handle all communications like WiFi and Bluetooth
  • comms/AudioSystem.java - handles decrypting encrypted audio
  • speechrecognition/ - handles transcribing speech
  • voicecommand/ - handle the voice command parsing and generating tasks
  • nlp/ - handles natural language processing (NLP) tasks
  • facialrecognition/ - runs processing and saving of facial recognition data
  • ui/ - all the interactive UI fragments live here
  • database/ - handles saving and recall of data
  • utils/ - a number of utils to support file saving, common functions, etc.
ASG
  • MainActivity.java - first class that is run on app launch, in charge of the UI and launching the WearableAiAspService.java background service
  • WearableAiAspService.java - where connection to the ASP starts, and data is passed from the ASP to the ASG UI at MainActivity.java
  • comms/ - handle all communications like WiFi and Bluetooth
  • sensors/ - handle all sensors like Microphone, EEG, etc.
  • utils/ - a number of utils to support file saving, common functions, etc.
GLBOX
  • main_webserver.py - main system which runs the web server
  • api/ - all of the REST API endpoints
  • main_tools.py - a helper class that contains top-level functionality that classes in api/ can call; it utilizes lower-level functions in utils/
  • utils/ - a number of supporting functions, classes, and API keys

Voice Commands

For a list of all the voice commands, see the README Voice Commands section.

The voice command system runs entirely on the ASP. Audio is received from the ASG and transcribed on the ASP (see Speech Recognition). The voice transcripts are then sent to the Voice Command system on the ASP for processing and firing off commands.

  • voicecommand/VoiceCommandServer.java is the top level system which receives transcripts, parses wake words, parses commands, and runs voice commands.
  • voicecommand/VoiceCommand.java is the interface that all voice commands implement.
  • The other files in voicecommand/ implement individual voice commands.
Add a new voice command
  1. Copy one of the existing voice commands to a new file name and class name (a hedged sketch follows this list).
  2. Modify the wakeWords list, commandList, commandName, and run functions to perform the function you wish. Remember, the system uses an RxJava event bus, so don't write the business logic in the command; just send off an event/message on the event bus so the appropriate subsystem can run the actual functionality. If that isn't clear, go reread Architecture.
  3. Add your new command to the list of live commands in the voicecommand/VoiceCommand.java voiceCommand list.
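
Below is a hedged sketch of what a new command might look like. The field and method names are assumptions based on the steps above, not the repo's actual VoiceCommand interface, so mirror whichever existing command you copied:

import java.util.Arrays;
import java.util.List;

// Hypothetical example command; the shape is illustrative only
public class FlashlightVoiceCommand /* implements VoiceCommand */ {
    public final String commandName = "flashlight";
    public final List<String> wakeWords = Arrays.asList("hey computer");
    public final List<String> commandList = Arrays.asList("flashlight on", "flashlight off");

    public void run(String transcript) {
        // no business logic here -- just post an event on the RxJava bus so
        // the subsystem that owns the flashlight reacts (see Architecture)
        // dataObservable.onNext(buildEvent("FLASHLIGHT_TOGGLE", transcript));
    }
}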

Speech Recognition

We use Vosk for automatic speech recognition (ASR) because it is accurate, runs locally on Android, and is almost completely open source. The audio data is streamed from the ASG microphone (a connected Bluetooth SCO microphone) to the ASP, where it's transcribed by Vosk.

Audio streaming from ASG - android_smart_glasses/.../AudioSystem.java
Audio receiving on ASP - android_smart_phone/.../comms/AudioSystem.java
Vosk Speech recognition system - android_smart_phone/.../speechrecognition/SpeechRecVosk.java
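
For orientation, feeding audio into Vosk looks roughly like the sketch below, using the public org.vosk Java bindings; the model path argument and 16 kHz mono PCM format are assumptions to verify against SpeechRecVosk.java:

import org.vosk.Model;
import org.vosk.Recognizer;

public class VoskSketch {
    public static void transcribeChunk(byte[] pcmChunk, String modelPath) throws Exception {
        Model model = new Model(modelPath);                   // e.g. an unpacked vosk-model-small-en-us-0.15
        Recognizer recognizer = new Recognizer(model, 16000.0f);
        recognizer.acceptWaveForm(pcmChunk, pcmChunk.length); // stream one chunk of audio
        System.out.println(recognizer.getFinalResult());      // JSON result with a "text" field
    }
}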

Vosk Model

We use the main Vosk Android model, vosk-model-small-en-us-0.15, included as a dependency in the app.

However, we have successfully tested both vosk-model-en-us-0.22 and vosk-model-en-us-0.22-lgraph. The problem with vosk-model-en-us-0.22 is that its size pushes the build time to ~10 minutes, which is unreasonable, and it lags on all but the most powerful ASPs (older chipsets can't keep up). For now we will use the standard model, with a future interest in upgrading the model, especially for far-field, conversational voice recognition.

In case it wasn't clear - all speech recognition runs locally, with no audio streamed over the internet.

ASP - Android Smart Phone

This app runs on any Android 9+ smart phone. We recommend significant computing power for the ASP, something like a Snapdragon 855+ or better, and something that supports WiFi sharing.

Build and Install - ASP

Main Application

Open, build, and run the app in main/ from Android Studio, just like any other Android app.

Mediapipe AAR

Not currently working with the latest master since we switched to Gradle. The MediaPipe library is how we run AI/ML on the edge. Since we just moved to Gradle/Android Studio and MediaPipe builds with Bazel, we still need to either convert the MediaPipe system to an AAR we build in Bazel and import into the main Gradle app, or deprecate the MediaPipe system. Below is how to build the MediaPipe system:

  1. Follow these instructions to set up Bazel and MediaPipe: https://google.github.io/mediapipe/getting_started/android.html (including the external link on that page on how to install MediaPipe)
  2. Change the SDK and NDK in ./main/WORKSPACE to point to your own Android SDK install (if you don't have one, install Android Studio and download an SDK and NDK)
  3. Run this command:
bazel build -c opt --config=android_arm64 --java_runtime_version=1.8 --noincremental_dexing --verbose_failures mediapipe/examples/android/src/java/com/google/mediapipe/apps/wearableai:wearableai
  4. You have now built the application!
  5. For subsequent builds where you don't change anything in the WORKSPACE file, use the following command for a faster build:
bazel build -c opt --config=android_arm64 --java_runtime_version=1.8 --noincremental_dexing --verbose_failures --fetch=false mediapipe/examples/android/src/java/com/google/mediapipe/apps/wearableai:wearableai

Autociter - temporary note

For now, to add references to your database, update the CSV in the ASP application assets at assets/wearable_referencer_references.csv. This will also be moved to a user-facing UI shortly.

Edge AIML - Wearable AI Pipeline - Mediapipe

This is how we run a number of different machine learning models, on the edge, all at the same time, using the ASP's GPU. This system makes it easier to add new models and makes it possible to cascade models that run inference on the output of models higher in the perception pipeline.

To do so, we use Google MediaPipe, which is a way to define intelligence graphs ("perception pipelines") that take input and process it by creating a flow of data between machine learning models and hard-coded functions known as "Calculators". This app is built on Google MediaPipe even though ./main/ is not currently tracking the Google MediaPipe repo. In the future, if we want to pull in new work from the main MediaPipe repository, we will set things up again to track Google MediaPipe.
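
For orientation, driving a compiled MediaPipe graph from Android code typically looks like the sketch below. This follows Google's MediaPipe Android examples rather than this repo's exact wiring; the graph asset name and stream names are placeholders:

import android.content.Context;
import com.google.mediapipe.components.FrameProcessor;
import com.google.mediapipe.glutil.EglManager;

public class PipelineSketch {
    private FrameProcessor processor;

    public void setup(Context context) {
        EglManager eglManager = new EglManager(null);
        processor = new FrameProcessor(
                context,
                eglManager.getNativeContext(),
                "wearableai_graph.binarypb", // placeholder compiled graph asset
                "input_video",               // placeholder input stream name
                "output_video");             // placeholder output stream name
    }
}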

Places / Scene Classification

This is a WIP: detect/classify/recognize the scene/location/place from POV video.

Keras-VGG16-places365/ is the Places365 system converted to a TensorFlow Lite model for our WearableAI graph, which currently runs on the ASP.

ASG - Android Smart Glasses

This app runs on Android Smart Glasses (ASG). It's designed to be able to work on any pair of Android smart glasses. See Officially Supported Smart Glasses for officially supported hardware.

Build and Install - ASG

android_smart_glasses/main is the main Android application to run on the ASG.

Open android_smart_glasses/main in Android Studio, set up ASG USB debugging (see below), then use Android Studio to edit, build, and run the app on the glasses.

Android Smart Glasses USB Debugging

In order to install the application from Android Studio to your ASG, you need to enable USB debugging on the ASG.

This setup depends on your hardware. There is usually a similar method for all Android devices, with slight differences between hardware.

Vuzix Blade
  1. Go to "Settings" -> "System" -> "About" -> "Device Info" -> (swipe forward ten times)
  2. Go to "Settings" -> "System" -> "Dev Options" -> "USB Debugging" -> (turn this on)

GLBOX - GNU Linux Box

There is already a publicly accessible GLBOX running at https://wis.emexwearables.com/api. You only need to follow these steps if you want to use your own server.

This is simply a cloud server running Flask (a Python HTTP server) with Flask-RESTful for API handling.

There are two main things:

  • main_webserver.py - the main web server, which runs functions for the ASP
  • (DEPRECATED) main_socket.py - this is being deprecated, but it contains a lot of the fundamental system that we are moving to the web server

Build and Install - GLBOX

  1. Clone this repo:
git clone [email protected]:emexlabs/WearableIntelligenceSystem.git #clone main repo
git submodule update --init --recursive #clone submodules
  2. cd to gnu_linux_box/backend
  3. Make and activate a virtualenv. Example: python3 -m virtualenv venv && source venv/bin/activate
  4. Install requirements (pip3 install -r requirements.txt)
  5. Install the spaCy NLP artifact: python3 -m spacy download en_core_web_sm
  6. Make the file ./utils/wolfram_api_key.txt and paste your Wolfram One App ID in there. If you don't have a Wolfram One App ID, make one here. This is, right now, required for natural language queries.
  7. (IGNORE UNLESS DEVELOPING WITH GOOGLE TRANSLATE OR GOOGLE SPEECH TO TEXT) Set up GCP to allow Speech-to-Text and Translation, make credentials for it, and place your JSON credentials at ./utils/creds.json. This is used for speech-to-text and translation services.
  8. Set up Microsoft Azure to allow the Bing Search API, get the key, and paste just the key string into ./utils/azure_key.txt. This is needed, for now, for Visual Search.
  9. Run main_webserver.py

If you follow the steps above, follow the steps in the main README (connect the ASG to the ASP hotspot), and run the Android apps on the ASG and ASP, you will be running the WIS on a local server. Follow the Deploy steps below to deploy it to the internet.

Deploy GLBOX

This setup is arbitrary; do what works best for your stack, whether that be Nginx, Apache, etc. However, a ready-to-go setup is provided in this repo, and the steps are:

  1. Setup a cloud connected Linux box (tested on Ubuntu 18 LTS AWS EC2) with a domain name or static IP that you can SSH into
  2. Install Nginx: sudo apt-get install nginx
  3. Enable Nginx: sudo systemctl start nginx && sudo systemctl enable nginx
  4. Clone the repo at /var/www/html and ensure permissions are properly set for /var/www/html: https://askubuntu.com/questions/767504/permissions-problems-with-var-www-html-and-my-own-home-directory-for-a-website
  5. Add the two .conf files at gnu_linux_box/backend/deploy to /etc/nginx/sites-available and activate them with:
sudo rm /etc/nginx/sites-enabled/default
sudo ln /etc/nginx/sites-available/wis_backend.conf /etc/nginx/sites-enabled/
sudo ln /etc/nginx/sites-available/wis_ssl.conf /etc/nginx/sites-enabled/
sudo systemctl restart nginx
  6. Set up the backend to run by setting up a virtualenv and installing requirements (follow the Build and Install - GLBOX steps above)
  7. Copy the .service file from gnu_linux_box/backend/deploy to /etc/systemd/system
  8. Enable the service file with sudo systemctl start wis_gunicorn && sudo systemctl enable wis_gunicorn

Testing Backend

Visual Search

Change the URL and image location as required:

 (echo -n '{"image": "'; base64 ~/Pictures/grp.jpg; echo '"}') |
curl -H "Content-Type: application/json" -d @-  http://localhost:5000/visual_search_search
Text-based queries

Change the URL and query text as required:

curl -X POST -F "query=who's the president of the us" http://127.0.0.1:5000/natural_language_query -vvv
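
The same query can also be sent from plain Java, which can be handy when debugging the ASP side. This is a minimal sketch assuming the Flask endpoint accepts an ordinary urlencoded form POST in addition to the multipart form curl sends:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class NlqTest {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://127.0.0.1:5000/natural_language_query");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        String body = "query=" + URLEncoder.encode("who's the president of the us", "UTF-8");
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes(StandardCharsets.UTF_8)); // send the form field
        }
        System.out.println("HTTP " + conn.getResponseCode()); // expect 200 on success
    }
}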

Demonstrate / Screen Share / Screen Mirror

To show your ASG or ASP screen to others (e.g., over video chat with "share screen"), you can open a window on your computer that mirrors the ASG or ASP display. The steps to do so are:

  1. Install scrcpy: https://github.com/Genymobile/scrcpy
  2. Run scrcpy

Technical References / Technical Acknowledgements

Authors / Contributors