From e4edef17fbee3ae115a5803891faafc24d783fdb Mon Sep 17 00:00:00 2001
From: Lukas Schaefer
Date: Tue, 29 Apr 2025 20:11:19 -0400
Subject: [PATCH 1/3] docs for text to speech

Signed-off-by: Lukas Schaefer
---
 admin_manual/ai/app_assistant.rst                   | 7 +++++++
 admin_manual/ai/overview.rst                        | 1 +
 .../app_upgrade_guide/upgrade_to_32.rst             | 2 +-
 developer_manual/digging_deeper/task_processing.rst | 5 +++++
 4 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/admin_manual/ai/app_assistant.rst b/admin_manual/ai/app_assistant.rst
index 89796d6f16b..a5e5656e92a 100644
--- a/admin_manual/ai/app_assistant.rst
+++ b/admin_manual/ai/app_assistant.rst
@@ -111,6 +111,13 @@ In order to make use of our AI agent feature, offering the execution of actions
 
 You will also need a text processing provider as specified above (ie. *llm2* or *integration_openai*).
 
+Text-To-Speech
+~~~~~~~~~~~~~~
+
+In order to make use of Text-To-Speech, you will need an app that provides a Text-To-Speech backend:
+
+* *integration_openai* - Integrates with the OpenAI API to provide AI functionality from OpenAI servers (Customer support available upon request; see :ref:`AI as a Service`)
+
 Configuration
 -------------
 
diff --git a/admin_manual/ai/overview.rst b/admin_manual/ai/overview.rst
index cf1e2be875d..2cf812f0dc8 100644
--- a/admin_manual/ai/overview.rst
+++ b/admin_manual/ai/overview.rst
@@ -66,6 +66,7 @@ Nextcloud uses modularity to separate raw AI functionality from the Graphical Us
     "Context Chat","`Nextcloud Assistant Context Chat `_","Yellow","Yes","Yes","No","Yes"
     "","`Nextcloud Assistant Context Chat (Backend) `_","Yellow","Yes","Yes","No","Yes"
     "Context Agent","`Nextcloud Context Agent `_","Green","Yes","Yes","Yes","Yes"
+    "Text To Speech","`Open AI Text To Speech `_","Red","No","No","No","No"
 
 
 Ethical AI Rating
diff --git a/developer_manual/app_publishing_maintenance/app_upgrade_guide/upgrade_to_32.rst b/developer_manual/app_publishing_maintenance/app_upgrade_guide/upgrade_to_32.rst
index d960f5bb1f3..2a0f1b61864 100644
--- a/developer_manual/app_publishing_maintenance/app_upgrade_guide/upgrade_to_32.rst
+++ b/developer_manual/app_publishing_maintenance/app_upgrade_guide/upgrade_to_32.rst
@@ -36,7 +36,7 @@ Back-end changes
 Added APIs
 ^^^^^^^^^^
 
-- TBD
+- New service ``OCP\TaskProcessing\TextToSpeech`` to convert text to speech.
 
 Changed APIs
 ^^^^^^^^^^^^
diff --git a/developer_manual/digging_deeper/task_processing.rst b/developer_manual/digging_deeper/task_processing.rst
index d6267dcba96..7d3e30f5af2 100644
--- a/developer_manual/digging_deeper/task_processing.rst
+++ b/developer_manual/digging_deeper/task_processing.rst
@@ -116,6 +116,11 @@ The following built-in task types are available:
      * ``input``: ``Text``
    * Output shape:
      * ``output``: ``Text``
+ * ``'core:text2speech'``: This task type is for generating images from text prompts. It is implemented by ``\OCP\TaskProcessing\TaskTypes\TextToSpeech``
+   * Input shape:
+     * ``input``: ``Text``
+   * Output shape:
+     * ``speech``: ``Audio``
 
 Task types can be disabled in the AI admin settings so they are not available for the Assistant or other apps even if they are implemented. All implemented Task types are enabled by default.
 

From 332130b31b6bb5cbcb38c30a2d96d9dcdd7171d8 Mon Sep 17 00:00:00 2001
From: Joas Schilling <213943+nickvergessen@users.noreply.github.com>
Date: Wed, 30 Apr 2025 08:42:28 +0200
Subject: [PATCH 2/3] Update developer_manual/app_publishing_maintenance/app_upgrade_guide/upgrade_to_32.rst

Signed-off-by: Joas Schilling <213943+nickvergessen@users.noreply.github.com>
---
 .../app_upgrade_guide/upgrade_to_32.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/developer_manual/app_publishing_maintenance/app_upgrade_guide/upgrade_to_32.rst b/developer_manual/app_publishing_maintenance/app_upgrade_guide/upgrade_to_32.rst
index 2a0f1b61864..a73b0d8ee02 100644
--- a/developer_manual/app_publishing_maintenance/app_upgrade_guide/upgrade_to_32.rst
+++ b/developer_manual/app_publishing_maintenance/app_upgrade_guide/upgrade_to_32.rst
@@ -36,7 +36,7 @@ Back-end changes
 Added APIs
 ^^^^^^^^^^
 
-- New service ``OCP\TaskProcessing\TextToSpeech`` to convert text to speech.
+- New task processing task type ``OCP\TaskProcessing\TextToSpeech`` to convert text to speech.
 
 Changed APIs
 ^^^^^^^^^^^^

From b9d1bae0c8a1d7280dc9f7dd8c8bffa048d4945c Mon Sep 17 00:00:00 2001
From: Joas Schilling <213943+nickvergessen@users.noreply.github.com>
Date: Wed, 30 Apr 2025 08:42:34 +0200
Subject: [PATCH 3/3] Update developer_manual/digging_deeper/task_processing.rst

Co-authored-by: Anupam Kumar
Signed-off-by: Joas Schilling <213943+nickvergessen@users.noreply.github.com>
---
 developer_manual/digging_deeper/task_processing.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/developer_manual/digging_deeper/task_processing.rst b/developer_manual/digging_deeper/task_processing.rst
index 7d3e30f5af2..a52395ac782 100644
--- a/developer_manual/digging_deeper/task_processing.rst
+++ b/developer_manual/digging_deeper/task_processing.rst
@@ -116,7 +116,7 @@ The following built-in task types are available:
      * ``input``: ``Text``
    * Output shape:
      * ``output``: ``Text``
- * ``'core:text2speech'``: This task type is for generating images from text prompts. It is implemented by ``\OCP\TaskProcessing\TaskTypes\TextToSpeech``
+ * ``'core:text2speech'``: This task type is for generating speech from text prompts. It is implemented by ``\OCP\TaskProcessing\TaskTypes\TextToSpeech``
    * Input shape:
      * ``input``: ``Text``
    * Output shape: