Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
71 changes: 71 additions & 0 deletions docs/system/angel_system_plus_speech_QA.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
Running Angel System with Speech QA
======================

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For RST files, these section demoninators are going to want to be the same character length as the section title. This will be a warning that is output by a Sphinx build. This comment applies to various sections below this.

The following are the instructions to run the angel system to track tasks whilst also interacting with GPT-4 using your voice for Question Answering (QA).

At a high level, you must:

1) start up the Angel ARUI on your HoloLens
2) start the tmuxinator script which runs the angel system and the QA nodes, for example for medical task R18:

.. code-block:: bash

tmuxinator start demos/medical/Kitware-R18-qa

Note that for the QA portion, you'll need an OpenAI API key, which you export in the tmuxinator QA window after tmuxinator starts.

.. code-block:: bash

export OPENAI_API_KEY=your_openapi_key
export OPENAI_ORG_ID=your-openai-org-id

3) Run the Angel System ASR server. Installation and running are described below.

Angel System ASR Server
======================

Git-cloning the separate repo
---------------------
For now, this repo lives separately from the angel_system. Clone it by running:

.. code-block:: bash

git clone git@github.com:ColumbiaNLP/angel-system-speech-processor.git

Installing dependencies with apt
----------------------

.. code-block:: bash

sudo apt update && apt install -y sox ffmpeg

Create conda environment
----------------------

.. code-block:: bash

conda env create -f speech_server.yml
conda activate speech_server

Running the Server
----------------------

The server can then be instantiated with:

.. code-block:: bash

export CUDA_VISIBLE_DEVICES=4; python speech_server.py

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically, this should be equivalent to running CUDA_VISIBLE_DEVICES=4 python speech_server.py, but if what you have works, it works.


(Note: you may need to remove the "export" command above, for example if you only have one GPU, so device 4 does not exist.)

Running the Client (Not necessary when running a tmuxinator config)
--------------------

Ensure the server is actively running on the server machine.
Also ensure the client is connected to a microphone peripheral.
This script will indicate when recording has begun. Otherwise, you can
optionally pass in a prerecorded file using the `-f/--file` flag.

.. code-block:: bash

python speech_client.py --asr/--vd