From b45e90276cd74d933ea33cff36f12e0edfd6fe1e Mon Sep 17 00:00:00 2001
From: cameron-a-johnson
Date: Mon, 17 Jun 2024 17:05:56 -0400
Subject: [PATCH 1/3] New doc for QA running

---
 docs/system/angel_system_plus_speech_QA.rst | 71 +++++++++++++++++++++
 1 file changed, 71 insertions(+)
 create mode 100644 docs/system/angel_system_plus_speech_QA.rst

diff --git a/docs/system/angel_system_plus_speech_QA.rst b/docs/system/angel_system_plus_speech_QA.rst
new file mode 100644
index 000000000..1e1fa8521
--- /dev/null
+++ b/docs/system/angel_system_plus_speech_QA.rst
@@ -0,0 +1,71 @@
+Running Angel System with Speech QA
+======================
+The following are the instructions to run the angel system to track tasks whilst also interacting with GPT-4 using your voice for Question Answering (QA).
+
+At a high level, you must:
+
+1) start up the Angel ARUI on your HoloLens
+2) start the tmuxinator script which runs the angel system and the QA nodes, for example for medical task R18:
+
+.. code-block:: bash
+
+   tmuxinator start demos/medical/Kitware-R18-qa
+
+Note that for the QA portion, you'll need an OpenAI API key, which you export in the tmuxinator QA window after tmuxinator starts.
+
+.. code-block:: bash
+
+   export OPENAI_API_KEY=your_openai_key
+   export OPENAI_ORG_ID=your-openai-org-id
+
+3) Run the Angel System ASR server. Installation and running are described below.
+
+Angel System ASR Server
+======================
+
+Installing dependencies with apt
+----------------------
+
+.. code-block:: bash
+
+   sudo apt update && sudo apt install -y sox ffmpeg
+
+Running the Server
+----------------------
+
+Create conda environment
+----------------------
+
+.. code-block:: bash
+
+   conda env create -f speech_server.yml
+   conda activate speech_server
+
+The server can then be instantiated with:
+
+.. code-block:: bash
+
+   export CUDA_VISIBLE_DEVICES=4; python speech_server.py
+
+(Note: you may need to remove the "export" command above, for example if you only have one GPU, so device 4 does not exist.)
+
+Running the Client
+--------------------
+
+Create conda environment
+--------------------
+
+.. code-block:: bash
+
+   conda env create -f speech_client.yml
+   conda activate speech_client
+
+Ensure the server is actively running on the server machine.
+Also ensure the client is connected to a microphone peripheral.
+This script will indicate when recording has begun. Otherwise, you can
+optionally pass in a prerecorded file using the `-f/--file` flag.
+
+.. code-block:: bash
+
+   python speech_client.py --asr/--vd
+

From f644f556529332bbd04547cfddde03f68b268781 Mon Sep 17 00:00:00 2001
From: cameron-a-johnson
Date: Mon, 17 Jun 2024 17:10:50 -0400
Subject: [PATCH 2/3] add the step of cloning the repo

---
 docs/system/angel_system_plus_speech_QA.rst | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/docs/system/angel_system_plus_speech_QA.rst b/docs/system/angel_system_plus_speech_QA.rst
index 1e1fa8521..c029c8587 100644
--- a/docs/system/angel_system_plus_speech_QA.rst
+++ b/docs/system/angel_system_plus_speech_QA.rst
@@ -23,6 +23,14 @@ Note that for the QA portion, you'll need an OpenAI API key, which you export in
 Angel System ASR Server
 ======================
 
+Git-cloning the separate repo
+---------------------
+For now, this repo lives separately from the angel_system. Clone it by running:
+
+.. code-block:: bash
+
+   git clone git@github.com:ColumbiaNLP/angel-system-speech-processor.git
+
 Installing dependencies with apt
 ----------------------
 

From 4caa42d27294405b2f82bdd0d010946c80a85a19 Mon Sep 17 00:00:00 2001
From: cameron-a-johnson
Date: Tue, 18 Jun 2024 17:25:29 -0400
Subject: [PATCH 3/3] Refining docs

---
 docs/system/angel_system_plus_speech_QA.rst | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/docs/system/angel_system_plus_speech_QA.rst b/docs/system/angel_system_plus_speech_QA.rst
index c029c8587..a385163e0 100644
--- a/docs/system/angel_system_plus_speech_QA.rst
+++ b/docs/system/angel_system_plus_speech_QA.rst
@@ -38,9 +38,6 @@ Installing dependencies with apt
 
    sudo apt update && sudo apt install -y sox ffmpeg
 
-Running the Server
-----------------------
-
 Create conda environment
 ----------------------
 
@@ -49,6 +46,9 @@ Create conda environment
    conda env create -f speech_server.yml
    conda activate speech_server
 
+Running the Server
+----------------------
+
 The server can then be instantiated with:
 
 .. code-block:: bash
@@ -57,17 +57,9 @@ The server can then be instantiated with:
 
 (Note: you may need to remove the "export" command above, for example if you only have one GPU, so device 4 does not exist.)
 
-Running the Client
+Running the Client (Not necessary when running a tmuxinator config)
 --------------------
 
-Create conda environment
---------------------
-
-.. code-block:: bash
-
-   conda env create -f speech_client.yml
-   conda activate speech_client
-
 Ensure the server is actively running on the server machine.
 Also ensure the client is connected to a microphone peripheral.
 This script will indicate when recording has begun. Otherwise, you can