Use the following documentation to get started with the NVIDIA RAG Blueprint.
- [Deploy With Helm Chart](#deploy-with-helm-chart)
- [Data Ingestion](#data-ingestion)

## Obtain API Keys

### Obtain NVIDIA API Key

You need a single API key to access NIM services, pull models on-prem, or access models hosted in the NVIDIA API Catalog.
Use one of the following methods to generate an API key:

- Option 1: Sign in to the [NVIDIA Build](https://build.nvidia.com/explore/discover?signin=true) portal with your email.
  - Click any [model](https://build.nvidia.com/meta/llama-3_1-70b-instruct), then click **Get API Key**, and finally click **Generate Key**.

- Option 2: Sign in to the [NVIDIA NGC](https://ngc.nvidia.com/) portal with your email.
  - Select your organization from the dropdown menu after logging in. You must select an organization that has NVIDIA AI Enterprise (NVAIE) enabled.
  - Click your account in the top right, and then select **Setup**.
  - Click **Generate Personal Key**, and then click **+ Generate Personal Key** to create your API key.
  - Later, you use this key in the `NVIDIA_API_KEY` environment variable.

Finally, export your NVIDIA API key as an environment variable.

```bash
export NVIDIA_API_KEY="nvapi-..."
```

### Obtain Pinecone API Key

You also need a Pinecone API key to use the vector database. You can obtain one by signing in to the [Pinecone console](https://app.pinecone.io).

- Select the organization and project you are using, and click **API Keys** on the left side of the screen.
- Click **Create API Key**.
- Be sure to save the key after it is created, because it will not be displayed again; you can then export it as shown below.
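
As with the NVIDIA key, the Pinecone key is typically passed to the services as an environment variable. The following is a minimal sketch; the variable name `PINECONE_API_KEY` is an assumption, so verify it against the compose files or Helm values that your deployment actually reads.

```bash
# Assumed variable name -- confirm in the deployment configuration.
export PINECONE_API_KEY="<your-pinecone-api-key>"
```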

## Deploy With Docker Compose

For both the retrieval and ingestion services, all models are deployed on-prem by default.

5. If you are deploying models on-prem, ensure that you meet the [hardware requirements](../README.md#hardware-requirements).

### Start Using On-Prem Models

Use the following procedure to start all containers needed for this blueprint. This launches the ingestion services, followed by the RAG services and all of their dependent NIMs, on-prem.
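
The exact launch commands are collapsed in this view. As a rough sketch, assuming compose files under `deploy/compose/` (the file names here are assumptions; check the repository for the actual ones):

```bash
# Sketch only: compose file names are assumed, not confirmed by this page.
cd deploy/compose
docker compose -f docker-compose-ingestor-server.yaml up -d
docker compose -f docker-compose-rag-server.yaml up -d
```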
If the NIMs are deployed on a different workstation or outside the nvidia-rag docker network, update the service endpoints in the environment configuration, for example:

```
NEXT_PUBLIC_VDB_BASE_URL: "http://ingestor-server:8082/v1"
```

### Start Using NVIDIA-Hosted Models

1. Verify that you meet the [prerequisites](#prerequisites).

7. Open a web browser and access `http://localhost:8090` to use the RAG Playground. You can use the upload tab to ingest files into the server, or follow [the notebooks](../notebooks/) to understand the API usage.

## Deploy With Helm Chart

Use the following procedures to deploy this blueprint on a Kubernetes cluster by using the Helm chart. Alternatively, you can [Deploy With Docker Compose](#deploy-with-docker-compose) for a single-node deployment.

### Prerequisites

- Verify that you meet the [prerequisites](#prerequisites).
```sh
kubectl get events -n rag
```
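
If a pod is stuck or crash-looping, standard kubectl commands help narrow it down; `<pod-name>` below is a placeholder:

```sh
# Inspect a specific pod's state and recent logs.
kubectl describe pod <pod-name> -n rag
kubectl logs <pod-name> -n rag
```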


##### List Services
```sh
kubectl get svc -n rag
```

To use a custom Milvus endpoint, you need to update the `APP_VECTORSTORE_URL` environment variable.
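
For example, a minimal sketch for the Docker Compose case, assuming the variable is read at startup and that Milvus listens on its default port 19530; `<milvus-host>` is a placeholder. For Helm, set the same variable in the chart's environment values.

```sh
# Point the RAG server at an external Milvus instance (sketch).
export APP_VECTORSTORE_URL="http://<milvus-host>:19530"
```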

If you plan to customize the RAG server deployment, such as changing the LLM model, follow these steps to deploy the frontend:

- Build the new docker image with the updated model name from the docker compose file:

```
cd ../deploy/compose
```

Once the docker image has been built, push it to a docker registry.
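
A minimal sketch of the tag-and-push step; `<local-image>` is a placeholder for the image you just built, and the repository and tag should match the `<new-image-repository>` and `<new-image-tag>` values used in the next command:

```sh
# Placeholders throughout: substitute your registry, repository, and tag.
docker tag <local-image> <new-image-repository>:<new-image-tag>
docker push <new-image-repository>:<new-image-tag>
```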

- Run the following command to install the RAG server with the ingestor server and the new frontend, using the updated `<new-image-repository>` and `<new-image-tag>`:

```sh
helm install rag -n rag rag-server/ \
```

For troubleshooting issues with Helm deployment, check out the troubleshooting section.
## Data Ingestion

> [!IMPORTANT]
> Before you can use this procedure, you must deploy the blueprint by using [Deploy With Docker Compose](#deploy-with-docker-compose) or [Deploy With Helm Chart](#deploy-with-helm-chart).

1. Download and install Git LFS by following the [installation instructions](https://git-lfs.com/).

2. Initialize Git LFS in your environment.
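
   For example, with the standard Git LFS initialization command:

   ```sh
   git lfs install
   ```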

Follow the cells in the notebook to ingest the PDF files from the `data/dataset` folder into the vector store.
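
If you want a quick smoke test outside the notebook, the sketch below is one way it might look; the endpoint path and form field are assumptions extrapolated from the `ingestor-server` base URL shown earlier, not a documented API, so treat the notebooks as the source of truth.

```sh
# Hypothetical endpoint and field names -- verify against the notebooks before use.
curl -X POST "http://localhost:8082/v1/documents" \
  -F "file=@data/dataset/<file>.pdf"
```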



## Next Steps

- [Change the Inference or Embedding Model](change-model.md)