@@ -1,7 +1,6 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "09c3cdab",
"metadata": {},
@@ -33,7 +32,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3aea9407",
"metadata": {},
@@ -316,6 +314,54 @@
"\n"
]
},
{
"cell_type": "markdown",
"id": "c25c32d3",
"metadata": {},
"source": [
"#### Create S3 buckets to store workshop data\n",
"The buckets are named mmu-workshop-[ACCOUNT-ID] and mmu-workshop-tmp-[ACCOUNT-ID] and are used throughout the workshop."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8270ef6c",
"metadata": {},
"outputs": [],
"source": [
"def create_buckets():\n",
"    # Get AWS account ID\n",
"    sts = boto3.client('sts')\n",
"    account_id = sts.get_caller_identity()['Account']\n",
"    \n",
"    # Create S3 client\n",
"    s3_client = boto3.client('s3')\n",
"    \n",
"    # Define bucket names with account ID\n",
"    tmp_bucket = f\"mmu-workshop-tmp-{account_id}\"\n",
"    main_bucket = f\"mmu-workshop-{account_id}\"\n",
"    \n",
"    # Create tmp bucket if it doesn't exist\n",
"    # (S3 reports BucketAlreadyOwnedByYou when this account already owns the bucket)\n",
"    try:\n",
"        s3_client.create_bucket(Bucket=tmp_bucket)\n",
"        print(f\"Created bucket: {tmp_bucket}\")\n",
"    except (s3_client.exceptions.BucketAlreadyExists, s3_client.exceptions.BucketAlreadyOwnedByYou):\n",
"        print(f\"Bucket already exists: {tmp_bucket}\")\n",
"    \n",
"    # Create main bucket if it doesn't exist\n",
"    try:\n",
"        s3_client.create_bucket(Bucket=main_bucket)\n",
"        print(f\"Created bucket: {main_bucket}\")\n",
"    except (s3_client.exceptions.BucketAlreadyExists, s3_client.exceptions.BucketAlreadyOwnedByYou):\n",
"        print(f\"Bucket already exists: {main_bucket}\")\n",
"    \n",
"    return tmp_bucket, main_bucket\n",
"\n",
"if __name__ == \"__main__\":\n",
"    create_buckets()"
]
},
{
"cell_type": "markdown",
"id": "31e9d37a",
@@ -591,37 +637,7 @@
"source": [
"Now let's upload the same video to S3 and try video understanding through an S3 URI\n",
"\n",
"We have already created 2 buckets for you to use. Let's run the `ls` command to list them; we will use the bucket `mmu-workshop-********`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "41d289f3-5ee3-44d9-9234-f0671e4832d6",
"metadata": {},
"outputs": [],
"source": [
"!aws s3 ls"
]
},
{
"cell_type": "markdown",
"id": "7763db6c-203a-4892-9522-1b2f8bc222d2",
"metadata": {},
"source": [
"##### Copy paste the s3 bucket from above and replace it below "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b1839c23-1c6b-473a-be0e-9c510cf6e13b",
"metadata": {},
"outputs": [],
"source": [
"# ⚠️ !! change the s3 bucket name ⚠️ !! \n",
"\n",
"bucket_name =\"<Enter your bucket name here>\""
"We already created 2 buckets in step 1; here we use the one named mmu-workshop-[ACCOUNT-ID]"
]
},
{
@@ -631,7 +647,30 @@
"metadata": {},
"outputs": [],
"source": [
" ! aws s3 cp images/getting_started_imgs/the-sea.mp4 s3://$bucket_name/the-sea.mp4"
"# Get AWS account ID\n",
"sts = boto3.client('sts')\n",
"account_id = sts.get_caller_identity()['Account']\n",
"\n",
"# Create S3 client\n",
"s3_client = boto3.client('s3')\n",
"\n",
"# Define file paths\n",
"bucket_name = f\"mmu-workshop-{account_id}\"\n",
"local_file = \"images/getting_started_imgs/the-sea.mp4\"\n",
"s3_key = \"the-sea.mp4\"\n",
"\n",
"try:\n",
"    # Upload file\n",
"    print(f\"Uploading {local_file} to bucket {bucket_name}\")\n",
"    s3_client.upload_file(\n",
"        Filename=local_file,\n",
"        Bucket=bucket_name,\n",
"        Key=s3_key\n",
"    )\n",
"    print(f\"Successfully uploaded file to s3://{bucket_name}/{s3_key}\")\n",
"    \n",
"except Exception as e:\n",
"    print(f\"Error uploading file: {e}\")\n"
]
},
{
@@ -1219,7 +1258,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.7"
"version": "3.11.10"
}
},
"nbformat": 4,
32 changes: 22 additions & 10 deletions multimodal-understanding/workshop/03_MM_RAG_KB_UI.ipynb
@@ -295,10 +295,10 @@
"source": [
"### We will use 2 S3 buckets \n",
"1. **Data Source Input S3 Bucket**: This S3 bucket serves as the input for our Data Source, which will create a vector database using OpenSearch Serverless. For this we will use the pre-created bucket of the format\n",
"`mmu-workshop-<ACCOUNT_ID>-*****`\n",
"`mmu-workshop-<ACCOUNT_ID>`\n",
"\n",
"2. **Multimodal Storage Bucket**: This S3 bucket is used to write and read images extracted from multimodal documents that need to be referenced when answering image-related questions. For this we will use a pre-created bucket of the format\n",
"`mmu-workshop-tmp-<ACCOUNT_ID>-*****`"
"`mmu-workshop-tmp-<ACCOUNT_ID>`"
]
},
{
@@ -316,18 +316,30 @@
"metadata": {},
"outputs": [],
"source": [
"# Upload data to s3 to the bucket that was configured as a data source to the knowledge bas\n",
"\n",
"# Upload data to s3 to the bucket that was configured as a data source to the knowledge base\n",
"s3_client = boto3.client(\"s3\")\n",
"sts = boto3.client(\"sts\")\n",
"\n",
"def uploadDirectory(path, bucket_name, s3_path):\n",
"    for root, dirs, files in os.walk(path):\n",
"        for file in files:\n",
"            local_file_path = os.path.join(root, file)\n",
"            s3_key = os.path.join(s3_path, os.path.relpath(local_file_path, path))\n",
"            # Upload the file with the new S3 key\n",
"            s3_client.upload_file(local_file_path, bucket_name, s3_key)\n",
"            try:\n",
"                # Upload file\n",
"                local_file_path = os.path.join(root, file)\n",
"                s3_key = os.path.join(s3_path, os.path.relpath(local_file_path, path))\n",
"                # Upload the file with the new S3 key\n",
"                print(f\"Uploading {local_file_path} to bucket {bucket_name}/{s3_key}\")\n",
"                s3_client.upload_file(local_file_path, bucket_name, s3_key)\n",
"                print(f\"Successfully uploaded file to s3://{bucket_name}/{s3_key}\")\n",
"                \n",
"            except Exception as e:\n",
"                print(f\"Error uploading file: {e}\")\n",
"\n",
"uploadDirectory('kb_data', bucket_name, 'mm-data')"
"if __name__ == \"__main__\":\n",
"    # Define bucket names with account ID\n",
"    account_id = sts.get_caller_identity()['Account']\n",
"    bucket_name = f\"mmu-workshop-{account_id}\"\n",
"    uploadDirectory('kb_data', bucket_name, 'mm-data')"
]
},
{
@@ -707,7 +719,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.7"
"version": "3.11.10"
}
},
"nbformat": 4,