diff --git a/CLAUDE.md b/CLAUDE.md
index 4bda281..baddb2c 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -6,6 +6,20 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 PDF Converter Service - A serverless PDF to image conversion service built with AWS SAM. This application provides secure, synchronous PDF processing with JWT authentication and optional webhook notifications. It uses containerized Ruby Lambda functions for scalable document processing.

+## Initial Setup
+
+For first-time deployment to AWS:
+
+1. **Configure AWS CLI**: Run `aws configure` with your credentials
+2. **Create JWT Secret**: Use AWS Secrets Manager to create `pdf-converter/jwt-secret`
+   ```bash
+   SECRET_VALUE=$(openssl rand -base64 32)
+   aws secretsmanager create-secret --name pdf-converter/jwt-secret --secret-string "$SECRET_VALUE" --region us-east-1
+   ```
+3. **Deploy**: Run `sam build && sam deploy --guided`
+
+See README.md for detailed setup instructions including JWT token generation and testing.
+
 ## Development Commands

 ### Build and Deploy
diff --git a/README.md b/README.md
index e68d18d..12296bc 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,167 @@ A serverless PDF to image conversion service built with AWS SAM. This applicatio
 - **template.yaml** - SAM template defining AWS resources
 - **samconfig.toml** - SAM CLI deployment configuration

-## Prerequisites
+## Getting Started
+
+This guide walks you through deploying the PDF Converter Service to your AWS account from scratch.
+
+### Prerequisites
+
+- [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) installed and configured with your AWS credentials
+- [SAM CLI](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html) installed
+- [Docker](https://hub.docker.com/search/?type=edition&offering=community) installed and running
+- [Ruby 3.4](https://www.ruby-lang.org/en/documentation/installation/) (optional, for local development)
+- An AWS account with permissions to create Lambda functions, API Gateway, ECR repositories, and Secrets Manager secrets
+
+### Step 1: Configure AWS CLI
+
+If you haven't already, configure the AWS CLI with your credentials:
+
+```bash
+aws configure
+```
+
+Enter your AWS Access Key ID, Secret Access Key, default region (e.g., `us-east-1`), and output format (e.g., `json`).
+
+### Step 2: Create JWT Secret
+
+The service uses JWT authentication. Create a secret in AWS Secrets Manager to store your JWT signing key:
+
+```bash
+# Generate a secure random secret (256-bit recommended)
+SECRET_VALUE=$(openssl rand -base64 32)
+
+# Create the secret in AWS Secrets Manager
+aws secretsmanager create-secret \
+  --name pdf-converter/jwt-secret \
+  --secret-string "$SECRET_VALUE" \
+  --region us-east-1
+
+# Save the secret value for later use in generating tokens
+echo "Your JWT secret: $SECRET_VALUE"
+```
+
+**Important:** Save the secret value securely - you'll need it to generate JWT tokens for API authentication.
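Where the `openssl` CLI is not available, a comparable secret can be produced from Ruby's standard library. The following is a minimal sketch and not part of this change; `generate_jwt_secret` is a hypothetical helper name. It draws 32 random bytes (256 bits) and Base64-encodes them, which mirrors what `openssl rand -base64 32` does.

```ruby
require 'securerandom'

# Hypothetical helper (not in the repository): returns a Base64-encoded
# secret built from `bytes` bytes of cryptographically secure randomness.
def generate_jwt_secret(bytes = 32)
  SecureRandom.base64(bytes)
end

secret = generate_jwt_secret

# Decode the Base64 text again to confirm how much key material it carries.
decoded_bits = secret.unpack1('m').bytesize * 8
puts "Generated a #{decoded_bits}-bit secret"
```

The value in `secret` could then be passed to `aws secretsmanager create-secret` as the `--secret-string` argument shown in Step 2.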
+
+### Step 3: Clone and Deploy
+
+Clone the repository and deploy using SAM:
+
+```bash
+# Clone the repository
+git clone https://github.com/your-username/content_processing.git
+cd content_processing
+
+# Build the application
+sam build
+
+# Deploy (first time - this will prompt for configuration)
+sam deploy --guided
+```
+
+During `sam deploy --guided`, you'll be prompted for:
+- **Stack Name**: Press Enter to use default `content-processing`
+- **AWS Region**: Enter your preferred region (e.g., `us-east-1`)
+- **Confirm changes before deploy**: `Y` (recommended)
+- **Allow SAM CLI IAM role creation**: `Y` (required)
+- **Disable rollback**: `N` (recommended)
+- **Save arguments to configuration file**: `Y` (saves settings for future deploys)
+
+The deployment will:
+1. Create an ECR repository for the Docker image
+2. Build and push the container image
+3. Create the Lambda function
+4. Set up API Gateway with a `/convert` endpoint
+5. Configure IAM roles and permissions
+
+### Step 4: Get Your API Endpoint
+
+After successful deployment, note the API endpoint URL from the outputs:
+
+```
+Outputs
+-------------------------------------------------------------------
+PdfConverterApi = https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/Prod/convert/
+```
+
+### Step 5: Generate JWT Tokens
+
+To call the API, you need a valid JWT token. Here's how to generate one using Ruby:
+
+```ruby
+require 'jwt'
+
+# Use the secret you created in Step 2
+secret = 'your-secret-from-step-2'
+
+# Generate a token that expires in 1 hour
+payload = {
+  sub: 'user-identifier',
+  exp: Time.now.to_i + 3600
+}
+
+token = JWT.encode(payload, secret, 'HS256')
+puts "Authorization: Bearer #{token}"
+```
+
+Or using Python:
+
+```python
+import jwt
+import time
+
+# Use the secret you created in Step 2
+secret = 'your-secret-from-step-2'
+
+# Generate a token that expires in 1 hour
+payload = {
+    'sub': 'user-identifier',
+    'exp': int(time.time()) + 3600
+}
+
+token = jwt.encode(payload, secret, algorithm='HS256')
+print(f"Authorization: Bearer {token}")
+```
+
+### Step 6: Test Your Deployment
+
+Create pre-signed S3 URLs for source (PDF) and destination (images), then call the API:
+
+```bash
+# Example using curl (replace with your actual URLs and token)
+curl -X POST https://your-api-endpoint.amazonaws.com/Prod/convert \
+  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "source": "https://s3.amazonaws.com/your-bucket/input.pdf?X-Amz-...",
+    "destination": "https://s3.amazonaws.com/your-bucket/output/?X-Amz-...",
+    "webhook": "https://your-webhook-endpoint.com/notify",
+    "unique_id": "test-123"
+  }'
+```
+
+For instructions on generating pre-signed S3 URLs, see the [AWS documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-presigned-url.html).
+
+### Testing Scripts
+
+To simplify testing, this repository includes utility scripts in the `scripts/` directory. The scripts automatically install their dependencies on first run using `bundler/inline` - no manual gem installation needed!
+
+**Generate JWT Token:**
+```bash
+./scripts/generate_jwt_token.rb
+```
+
+**Generate Pre-signed S3 URLs:**
+```bash
+./scripts/generate_presigned_urls.rb \
+  --bucket my-bucket \
+  --source-key pdfs/test.pdf \
+  --dest-prefix output/
+```
+
+See [scripts/README.md](scripts/README.md) for detailed usage instructions and examples.
+
+## Prerequisites (Local Development)

 - [SAM CLI](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html)
 - [Docker](https://hub.docker.com/search/?type=edition&offering=community)
diff --git a/REFACTORING.md b/REFACTORING.md
deleted file mode 100644
index 81b01c6..0000000
--- a/REFACTORING.md
+++ /dev/null
@@ -1,245 +0,0 @@
-# Code Quality Refactoring Plan
-
-Based on RubyCritic analysis conducted on 2025-11-04.
-
-## Current Status
-
-**Score Distribution:**
-
-- **A (Excellent)**: 3 files - AwsConfig, SpecHelper, TestVips
-- **B (Good)**: 2 files - JwtAuthenticator, JwtSetupSpec
-- **C (Acceptable)**: 3 files - DockerEnvironmentSpec, LocalstackIntegrationSpec, PdfConverter
-- **D (Needs Improvement)**: 7 files - App (45 smells), ImageUploader (19 smells), PdfDownloader (21 smells), UrlValidator (36 smells), and their specs
-
-## Phase 1: Quick Wins ✅ COMPLETE
-
-Low-risk improvements that increase code quality immediately.
-
-- [x] Replace generic exception variable names (`e` → `error`) across all files
-- [x] Add class/module documentation comments
-- [x] Extract duplicate method calls to local variables in app.rb
-- [x] Remove unused parameters (lambda_handler `context`, upload_images_to_s3 `unique_id`)
-
-**Impact**: Successfully reduced 27 smells across the codebase
-
-### Phase 1 Results
-
-- **JwtAuthenticator**: 19 → 15 smells (-4)
-- **PdfConverter**: 18 → 14 smells (-4)
-- **App**: 42 → 31 smells (-11)
-- **ImageUploader**: 19 → 17 smells (-2)
-- **PdfDownloader**: 21 → 19 smells (-2)
-- **AwsConfig**: 3 → 2 smells (-1)
-- **Other files**: ~3 additional smells removed
-- **Total: 27 smells eliminated**
-- **App.rb complexity**: 231.82 → 206.55 (-25.27 points)
-- **lambda_handler flog score**: 102 → 78 (-24 points)
-
-## Phase 2: Extract Reusable Components ✅ COMPLETE
-
-Create shared infrastructure before refactoring main classes.
-
-- [x] Extract retry logic into a reusable RetryHandler module
-- [x] Refactor PdfDownloader to use RetryHandler module
-- [x] Refactor ImageUploader to use RetryHandler module and create ContentData value object
-- [x] Extract S3UrlParser class from UrlValidator to handle URL parsing
-- [x] Consolidate duplicate validation logic in UrlValidator
-
-### Phase 2 Results
-
-- **RetryHandler module created**: A-rated, 41.54 complexity, 5 smells
-- **PdfDownloader**: 19 → 11 smells (-8, -42%)
-- **ImageUploader**: D → C rating, 19 → 14 smells (-5, -26%), 144.09 → 102.7 complexity (-29%)
-- **S3UrlParser module created**: Centralized S3 URL parsing logic
-- **UrlValidator**: C → B rating, 35 → 12 smells (-23, -66%!), 141.09 → 60.78 complexity (-57%!), 36 → 0 duplication
-- **Tests**: All unit tests passing (42 new examples for S3UrlParser and UrlValidator)
-- **Overall Score**: 71.27 → 78.18 (+6.91 points, +9.7% improvement)
-
-**Key Achievements:**
-
-- Eliminated ~200 lines of duplicate retry logic
-- Eliminated all duplication in UrlValidator
-- Created reusable, well-tested infrastructure modules
-- Significantly improved maintainability and testability
-
-## Phase 3: Extract Service Classes ✅ COMPLETE
-
-Break down the monolithic app.rb (45 smells).
-
-- [x] Extract RequestValidator class from app.rb
-- [x] Extract WebhookNotifier class from app.rb
-- [x] Extract ResponseBuilder helper class from app.rb
-- [x] Refactor lambda_handler in app.rb to use extracted service classes
-
-### Phase 3 Results
-
-- **RequestValidator created**: A-rated, 35.52 complexity, 6 smells
-- **ResponseBuilder created**: A-rated, 5.68 complexity, 4 smells
-- **WebhookNotifier created**: A-rated, 23.76 complexity, 5 smells
-- **App.rb**: D → C rating, 45 → 21 smells (-24, -53%!), 231.82 → 142.08 complexity (-39%)
-- **lambda_handler flog score**: 102 → 66 (-36 points, -35% reduction!)
-- **Tests**: All unit tests passing (40 examples, 0 failures)
-- **Overall Score**: 78.18 → 77.6 (-0.58 points, slight decrease due to new files)
-
-**Key Achievements:**
-
-- Created 3 well-structured, A-rated service classes
-- Reduced lambda_handler complexity by 35%
-- Reduced app.rb smell count by 53%
-- Significantly improved code organization and maintainability
-- All tests passing after refactoring
-
-## Phase 4: Targeted Complexity Reduction ✅ COMPLETE
-
-Address remaining high-complexity methods.
-
-- [x] Simplify JwtAuthenticator#retrieve_secret method (complexity score: 29)
-- [x] Extract error handling logic in JwtAuthenticator to reduce duplication
-- [x] Refactor PdfConverter#convert_to_images to reduce complexity (score: 39)
-
-### Phase 4 Results
-
-**JwtAuthenticator Improvements:**
-
-- Extracted `build_client_config` method to simplify AWS client setup
-- Extracted `handle_secret_error` method to consolidate error handling
-- Reduced code duplication in error handling rescue blocks
-- Rating: B, 73.11 complexity, 14 smells
-- Cleaner separation of concerns with LocalStack configuration isolated
-
-**PdfConverter Improvements:**
-
-- Extracted `validate_page_count` method for page validation logic
-- Extracted `convert_all_pages` method to handle the page conversion loop
-- Extracted `success_result` helper to build success response
-- Extracted `cleanup_temp_file` helper for cleanup logic
-- Rating: C, 102.93 complexity, 19 smells
-- Significantly improved readability of main `convert_to_images` method
-
-**Tests:** All unit tests passing (26 examples, 0 failures)
-**Overall Score:** 77.6 → 82.65 (+5.05 points, +6.5% improvement!)
-
-**Key Achievements:**
-
-- Reduced complexity in high-complexity methods through extraction
-- Improved code organization and readability
-- Eliminated duplicate error handling patterns
-- Created reusable helper methods for common operations
-- Maintained all test coverage with zero regressions
-
-**Impact**: Successfully improved code quality and organization, moving towards A/B ratings
-
-## Phase 5: Validation ✅ COMPLETE
-
-- [x] Run full test suite after all changes
-- [x] Re-run RubyCritic to measure improvement
-
-### Phase 5 Results
-
-**Test Suite Status:**
-
-- **Unit Tests:** 83 examples, 0 failures ✅
-- **Integration Tests:** Require LocalStack to be running (expected)
-- **All refactoring changes validated with zero regressions**
-
-**Final RubyCritic Score:** 82.65
-
-**Final File Ratings:**
-
-- **A-rated (Excellent):** 6 files
-  - AwsConfig
-  - RequestValidator
-  - ResponseBuilder
-  - RetryHandler
-  - WebhookNotifier
-  - SpecHelper
-
-- **B-rated (Good):** 3 files
-  - JwtAuthenticator
-  - S3UrlParser
-  - UrlValidator
-
-- **C-rated (Acceptable):** 3 files
-  - ImageUploader
-  - PdfConverter
-  - PdfDownloader
-
-**Outcome**: Successfully improved overall code quality with 6 A-rated files and 3 B-rated files!
-
-## Refactoring Summary
-
-### Overall Progress
-
-**Initial State (before Phase 1):**
-
-- Overall Score: 71.27
-- D-rated files: 7 (App, ImageUploader, PdfDownloader, UrlValidator, and their specs)
-- A-rated files: 3
-
-**Final State (after Phase 5):**
-
-- Overall Score: 82.65 (+11.38 points, **+15.9% improvement**)
-- A-rated files: 6 (doubled!)
-- B-rated files: 3
-- C-rated files: 3
-- D-rated files: 0 ✅
-
-### Key Accomplishments by Phase
-
-**Phase 1 - Quick Wins:**
-
-- Eliminated 27 smells across the codebase
-- Improved variable naming and documentation
-- Score: 71.27 → 78.18 (+9.7%)
-
-**Phase 2 - Reusable Components:**
-
-- Created RetryHandler module (A-rated)
-- Created S3UrlParser module
-- Eliminated ~200 lines of duplicate code
-- Score: 78.18 → 78.18 (maintained, added new files)
-
-**Phase 3 - Service Classes:**
-
-- Created RequestValidator, ResponseBuilder, WebhookNotifier (all A-rated)
-- Reduced app.rb complexity by 39%
-- Reduced lambda_handler complexity by 35%
-- Score: 78.18 → 77.6 (slight dip due to new files)
-
-**Phase 4 - Complexity Reduction:**
-
-- Refactored JwtAuthenticator#retrieve_secret
-- Refactored PdfConverter#convert_to_images
-- Improved code organization and readability
-- Score: 77.6 → 82.65 (+6.5%)
-
-**Phase 5 - Validation:**
-
-- All 83 unit tests passing
-- Zero regressions introduced
-- Final score: 82.65
-
-### Impact Metrics
-
-- **Smells Reduced:** 50+ smells eliminated across the codebase
-- **Complexity Reduction:**
-  - App.rb: 231.82 → 142.08 (-38.7%)
-  - lambda_handler: 102 → 66 flog score (-35.3%)
-- **Code Duplication:** Eliminated ~200 lines of duplicate retry logic
-- **Test Coverage:** Maintained 100% of existing test coverage
-- **New Classes Created:** 6 well-structured, A-rated service classes
-
-### Maintainability Improvements
-
-1. **Better Separation of Concerns:** Business logic extracted into dedicated service classes
-2. **Reusable Infrastructure:** Retry logic and URL parsing now centralized
-3. **Improved Readability:** Complex methods broken down into smaller, focused functions
-4. **Enhanced Testability:** Service classes easier to test in isolation
-5. **Reduced Technical Debt:** No D-rated files remaining
-
-## Notes
-
-- Run tests after each phase to ensure no regressions
-- Run RubyCritic periodically to track progress
-- Update this file as tasks are completed
-- All phases completed successfully on 2025-11-05
diff --git a/TEST_REFACTOR.md b/TEST_REFACTOR.md
deleted file mode 100644
index 9677164..0000000
--- a/TEST_REFACTOR.md
+++ /dev/null
@@ -1,492 +0,0 @@
-# Test Refactoring Plan
-
-## Status: In Progress - Phase 0 Complete
-
-**Created:** 2025-11-08
-**Last Updated:** 2025-11-08
-
----
-
-## Baseline Metrics (Before Refactoring)
-
-### Test Coverage (as of 2025-11-08)
-- **Line Coverage**: 65.43% (371 / 567 lines)
-- **Branch Coverage**: 49.72% (88 / 177 branches)
-- **Target**: 100% line and branch coverage
-
-### Code Quality (RubyCritic)
-**A-Rated Classes** (7 files):
-- AwsConfig, RequestValidator, ResponseBuilder, RetryHandler
-- SpecHelper, TestVips, UrlUtils, WebhookNotifier
-
-**B-Rated Classes** (5 files):
-- App.rb (complexity: 89.03, 11 smells)
-- JwtAuthenticator (complexity: 73.11, 15 smells)
-- JwtSetupSpec, PdfDownloader (complexity: 79.4, 9 smells)
-- S3UrlParser, UrlValidator
-
-**C-Rated Classes** (4 files):
-- DockerEnvironmentSpec (duplication: 54)
-- ImageUploader (complexity: 139.56, 19 smells)
-- LocalstackIntegrationSpec, PdfConverter (complexity: 102.93, 20 smells)
-
-**D-Rated Test Files** (4 files):
-- ImageUploaderSpec, PdfConverterSpec, PdfDownloaderSpec, UrlValidatorSpec
-- Issues: High complexity in tests, code duplication
-
-**F-Rated Test Files** (2 files):
-- AuthenticatedHandlerSpec (complexity: 520.98, duplication: 368, 15 smells)
-- ErrorHandlingSpec (complexity: 436.8, duplication: 40, 12 smells)
-
-### Test Suite Status
-- **Total Tests**: 83 examples
-- **Failures**: 5 (integration test mocking issues)
-- **Pending**: 12 (ruby-vips not available locally)
-
----
-
-## Current State Analysis
-
-### Issues Identified
-
-1. **Structure misalignment**: Test structure doesn't mirror application structure (`app/` and `lib/` directories not reflected in specs)
-2. **Missing coverage**: No unit tests for 5 classes:
-   - `app/request_validator.rb`
-   - `app/response_builder.rb`
-   - `app/webhook_notifier.rb`
-   - `lib/aws_config.rb`
-   - `lib/url_utils.rb`
-3. **Mixed concerns**: Integration tests mix concerns (full handler tests vs. component integration)
-4. **Low-value tests**: Infrastructure tests (`jwt_setup_spec.rb`, `docker_environment_spec.rb`) that don't add value
-5. **Misplaced tests**: `error_handling_spec.rb` tests PdfDownloader behavior, not a distinct unit
-6. **Orphan files**: `test_with_localstack.rb`, `pdf_converter/test_vips.rb`
-
-### Current Test Files
-
-**Unit Tests (spec/unit/):**
-
-- `docker_environment_spec.rb` - Infrastructure test, low value
-- `error_handling_spec.rb` - Actually tests PdfDownloader, misplaced
-- `image_uploader_spec.rb` - Good unit test
-- `jwt_authenticator_spec.rb` - Good unit test
-- `pdf_converter_spec.rb` - Good unit test
-- `pdf_downloader_spec.rb` - Good unit test
-- `retry_handler_spec.rb` - Good unit test
-- `s3_url_parser_spec.rb` - Good unit test
-- `url_validator_spec.rb` - Good unit test
-
-**Integration Tests (spec/integration/):**
-
-- `authenticated_handler_spec.rb` - Duplicates unit test coverage
-- `localstack_integration_spec.rb` - Good integration test, keep
-- `pdf_download_integration_spec.rb` - Covered by unit tests
-
-**Infrastructure Tests (spec/infrastructure/):**
-
-- `jwt_setup_spec.rb` - Low value, just checks gems are installed
-
-**Orphan Files:**
-
-- `test_with_localstack.rb` (root level)
-- `pdf_converter/test_vips.rb`
-
----
-
-## Proposed Test Structure
-
-```
-spec/
-├── spec_helper.rb  # Keep, update if needed
-├── support/  # Test helpers and shared contexts
-│   ├── jwt_helper.rb  # Extract JWT test helpers
-│   └── s3_stub_helper.rb  # Extract S3 stubbing helpers
-├── fixtures/  # Keep test fixtures
-├── app_spec.rb  # NEW: Unit tests for app.rb helper functions
-├── app/  # NEW: Mirror app/ directory
-│   ├── image_uploader_spec.rb  # Reorganized from unit/
-│   ├── jwt_authenticator_spec.rb  # Reorganized from unit/
-│   ├── pdf_converter_spec.rb  # Reorganized from unit/
-│   ├── pdf_downloader_spec.rb  # Reorganized from unit/
-│   ├── request_validator_spec.rb  # NEW
-│   ├── response_builder_spec.rb  # NEW
-│   ├── url_validator_spec.rb  # Reorganized from unit/
-│   └── webhook_notifier_spec.rb  # NEW
-├── lib/  # NEW: Mirror lib/ directory
-│   ├── aws_config_spec.rb  # NEW
-│   ├── retry_handler_spec.rb  # Reorganized from unit/
-│   ├── s3_url_parser_spec.rb  # Reorganized from unit/
-│   └── url_utils_spec.rb  # NEW
-└── integration/
-    └── localstack_integration_spec.rb  # Keep, enhance for full system verification
-```
-
----
-
-## Implementation Plan
-
-### Phase 0: Setup Coverage and Quality Tools ✅ COMPLETE
-
-**Objective:** Establish baseline metrics and tooling
-
-**Tasks:**
-- [x] Add SimpleCov gem to Gemfile
-- [x] Configure SimpleCov in spec_helper.rb
-- [x] Set coverage requirements (100% line and branch)
-- [x] Run baseline coverage report
-- [x] Run RubyCritic analysis
-- [x] Document baseline metrics
-
-**Results:**
-- SimpleCov configured with 100% line and branch coverage requirements
-- Baseline: 65.43% line coverage, 49.72% branch coverage
-- RubyCritic analysis complete (see baseline metrics above)
-
----
-
-### Phase 1: Cleanup ✅ COMPLETE
-
-**Objective:** Remove outdated and low-value tests
-
-**Tasks:**
-
-- [x] Delete `spec/unit/` directory entirely (9 test files)
-- [x] Delete `spec/integration/authenticated_handler_spec.rb`
-- [x] Delete `spec/integration/pdf_download_integration_spec.rb`
-- [x] Delete `spec/infrastructure/` directory
-- [x] Delete `test_with_localstack.rb` (root level)
-- [x] Delete `pdf_converter/test_vips.rb`
-- [x] Commit cleanup changes
-
-**Rationale:**
-
-- `spec/unit/` tests will be reorganized to match app structure
-- Integration tests duplicate unit test coverage
-- Infrastructure tests don't add value (just check gems are installed)
-- Orphan test files are outdated
-
-**Results:**
-
-- Deleted 13 test files and 3 directories
-- Only `spec/integration/localstack_integration_spec.rb` remains for integration testing
-- Ready to create new structure that mirrors application code
-
----
-
-### Phase 2: Create New Directory Structure ✅ COMPLETE
-
-**Objective:** Set up directory structure that mirrors application code
-
-**Tasks:**
-
-- [x] Create `spec/app/` directory
-- [x] Create `spec/lib/` directory
-- [x] Create `spec/support/` directory (already existed)
-- [x] Verify structure matches application layout
-
-**Results:**
-
-- Created `spec/app/` to mirror `app/` directory
-- Created `spec/lib/` to mirror `lib/` directory
-- `spec/support/` already existed with some helper files
-- Directory structure now ready for new test files
-
----
-
-### Phase 3: Create Test Support Files ✅ COMPLETE
-
-**Objective:** Extract common test helpers for reusability
-
-**Tasks:**
-
-- [x] Create `spec/support/jwt_helper.rb` with JWT test utilities
-- [x] Create `spec/support/s3_stub_helper.rb` with S3 stubbing helpers
-- [x] Update `spec_helper.rb` to load support files automatically
-- [x] Test that support files are properly loaded
-
-**Test Utilities Created:**
-
-**JWT Helper:**
-- `generate_valid_token` - Create valid JWT for testing
-- `generate_expired_token` - Create expired JWT for testing
-- `generate_invalid_signature_token` - Create JWT with wrong signature
-- `mock_secrets_manager` - Mock AWS Secrets Manager responses
-- `mock_secrets_manager_error` - Mock Secrets Manager errors
-
-**S3 Stub Helper:**
-- `minimal_pdf_content` - Generate valid minimal PDF for tests
-- `stub_s3_get_success` - Stub successful S3 GET requests
-- `stub_s3_put_success` - Stub successful S3 PUT requests
-- `stub_s3_error` - Stub S3 error responses
-- `stub_s3_timeout` - Stub S3 timeout errors
-- `stub_s3_sequential` - Stub multiple sequential responses
-- `stub_s3_pattern` - Stub pattern-matching URLs
-- `s3_presigned_url` - Generate valid test S3 URLs
-
-**Results:**
-
-- Support files automatically loaded via `spec_helper.rb`
-- DRY principle applied - no duplication of setup code
-- All helpers are RSpec modules included automatically
-
----
-
-### Phase 4: Create Unit Tests for app/ Classes ⏳ NOT STARTED
-
-**Objective:** Create comprehensive unit tests following best practices
-
-**Best Practices:**
-
-- Test only the class in isolation
-- Mock/stub all dependencies
-- Follow RSpec structure: describe/context/it
-- Focus on behavior, not implementation
-- Test happy path, edge cases, and error conditions
-- Use descriptive test names
-
-**Tasks:**
-
-#### 4.1: Create spec/app_spec.rb
-
-- [ ] Test `process_pdf_conversion` function
-- [ ] Test `handle_failure` function
-- [ ] Test `notify_webhook` function
-- [ ] Test `send_webhook` function
-- [ ] Test `authenticate_request` function
-- [ ] Test `lambda_handler` orchestration
-
-#### 4.2: Reorganize existing app/ specs
-
-- [ ] Move `spec/unit/image_uploader_spec.rb` → `spec/app/image_uploader_spec.rb`
-- [ ] Move `spec/unit/jwt_authenticator_spec.rb` → `spec/app/jwt_authenticator_spec.rb`
-- [ ] Move `spec/unit/pdf_converter_spec.rb` → `spec/app/pdf_converter_spec.rb`
-- [ ] Move `spec/unit/pdf_downloader_spec.rb` → `spec/app/pdf_downloader_spec.rb`
-- [ ] Move `spec/unit/url_validator_spec.rb` → `spec/app/url_validator_spec.rb`
-- [ ] Refactor moved tests to ensure proper isolation and mocking
-- [ ] Incorporate relevant tests from `spec/unit/error_handling_spec.rb` into `pdf_downloader_spec.rb`
-
-#### 4.3: Create missing app/ specs
-
-- [ ] Create `spec/app/request_validator_spec.rb`
-  - Test body parsing
-  - Test required field validation
-  - Test unique_id format validation
-  - Test error response generation
-- [ ] Create `spec/app/response_builder_spec.rb`
-  - Test success responses
-  - Test error responses
-  - Test authentication error responses
-  - Test CORS headers
-- [ ] Create `spec/app/webhook_notifier_spec.rb`
-  - Test successful notification
-  - Test retry logic
-  - Test timeout handling
-  - Test error responses
-
----
-
-### Phase 5: Create Unit Tests for lib/ Classes ⏳ NOT STARTED
-
-**Objective:** Create unit tests for library utilities
-
-**Tasks:**
-
-#### 5.1: Reorganize existing lib/ specs
-
-- [ ] Move `spec/unit/retry_handler_spec.rb` → `spec/lib/retry_handler_spec.rb`
-- [ ] Move `spec/unit/s3_url_parser_spec.rb` → `spec/lib/s3_url_parser_spec.rb`
-- [ ] Refactor moved tests to ensure proper isolation
-
-#### 5.2: Create missing lib/ specs
-
-- [ ] Create `spec/lib/aws_config_spec.rb`
-  - Test AWS client configuration
-  - Test region configuration
-  - Test endpoint configuration (for LocalStack)
-  - Test credential handling
-- [ ] Create `spec/lib/url_utils_spec.rb`
-  - Test URL manipulation functions
-  - Test URL validation
-  - Test URL sanitization
-
----
-
-### Phase 6: Enhance Integration Tests ⏳ NOT STARTED
-
-**Objective:** Create comprehensive end-to-end test for LocalStack
-
-**Tasks:**
-
-- [ ] Review existing `spec/integration/localstack_integration_spec.rb`
-- [ ] Enhance to cover complete workflow:
-  - Lambda handler invocation with real event
-  - JWT authentication flow
-  - S3 download from LocalStack
-  - PDF conversion
-  - S3 upload to LocalStack
-  - Webhook notification
-- [ ] Add error scenario coverage:
-  - Invalid JWT
-  - Missing S3 object
-  - Invalid PDF
-  - S3 upload failure
-- [ ] Document LocalStack setup requirements
-- [ ] Add clear comments about what this test verifies
-
----
-
-### Phase 7: Update Test Configuration ⏳ NOT STARTED
-
-**Objective:** Improve test setup and documentation
-
-**Tasks:**
-
-- [ ] Update `spec_helper.rb`:
-  - Load support files
-  - Add shared configuration
-  - Document test environment setup
-- [ ] Update `.rspec` file if needed
-- [ ] Create/update `spec/README.md` with:
-  - How to run tests
-  - How to run unit tests only
-  - How to run integration tests with LocalStack
-  - Test organization explanation
-- [ ] Update main `CLAUDE.md` with new test structure
-
----
-
-### Phase 8: Verification and Cleanup ⏳ NOT STARTED
-
-**Objective:** Ensure all tests pass with 100% coverage and improved code quality
-
-**Tasks:**
-
-- [ ] Run all unit tests: `bundle exec rspec spec/app spec/lib spec/app_spec.rb --format documentation`
-- [ ] Run integration tests: `bundle exec rspec spec/integration --format documentation`
-- [ ] Run full test suite: `bundle exec rspec`
-- [ ] Verify 100% line coverage achieved
-- [ ] Verify 100% branch coverage achieved (all conditionals tested)
-- [ ] Run RubyCritic: `~/.claude/skills/rubycritic/scripts/check_quality.sh`
-- [ ] Verify no F-rated test files (all should be A or B)
-- [ ] Check for test duplication and refactor if needed
-- [ ] Check for any remaining orphan test files
-- [ ] Generate final coverage report for documentation
-- [ ] Final commit with summary of changes
-
-**Success Criteria:**
-
-- All tests passing (0 failures, 0 pending except ruby-vips)
-- 100% line coverage
-- 100% branch coverage
-- All test files rated A or B in RubyCritic
-- Zero test code duplication
-- Test suite runs in under 60 seconds (unit tests only)
-
----
-
-## Testing Best Practices Applied
-
-### Unit Test Principles
-
-1. **Isolation**: Each test tests only one class
-2. **Fast**: No network calls, no file I/O (except minimal temp files)
-3. **Deterministic**: Same input always produces same output
-4. **Clear**: Descriptive test names that explain what's being tested
-5. **Focused**: One assertion per test when possible
-
-### Test Structure
-
-```ruby
-RSpec.describe ClassName do
-  describe '#method_name' do
-    context 'when condition' do
-      it 'does expected behavior' do
-        # Arrange
-        # Act
-        # Assert
-      end
-    end
-  end
-end
-```
-
-### Mocking Strategy
-
-- Mock external services (AWS, HTTP calls)
-- Stub file system when possible
-- Use real objects for simple value objects
-- Verify interactions when testing side effects
-
----
-
-## Benefits of New Structure
-
-1. **Clear organization**: Test structure mirrors application structure
-2. **Complete coverage**: Every class has a corresponding test file
-3. **Fast feedback**: Properly mocked unit tests run in milliseconds
-4. **Meaningful integration**: Single LocalStack test verifies full system
-5. **Maintainability**: Easy to find tests for any given class
-6. **Onboarding**: New developers can understand test organization immediately
-7. **Best practices**: Follows RSpec and Rails testing conventions
-
----
-
-## Progress Tracking
-
-### Summary
-
-- **Phase 0**: ✅ Complete (Coverage and quality tools setup)
-- **Phase 1**: ✅ Complete (Cleanup - 13 files deleted)
-- **Phase 2**: ✅ Complete (Directory structure created)
-- **Phase 3**: ✅ Complete (Support files with test helpers)
-- **Phase 4**: ⏳ Not Started (App tests - largest phase)
-- **Phase 5**: ⏳ Not Started (Lib tests)
-- **Phase 6**: ⏳ Not Started (Integration tests)
-- **Phase 7**: ⏳ Not Started (Test configuration)
-- **Phase 8**: ⏳ Not Started (Verification)
-
-### Coverage Progress
-
-- **Baseline**: 65.43% line, 49.72% branch
-- **Current**: TBD (will update after each phase)
-- **Target**: 100% line, 100% branch
-
-### Metrics
-
-- **Files to Delete**: 8
-- **Directories to Create**: 3
-- **Tests to Move**: 9
-- **Tests to Create**: 8
-- **Support Files to Create**: 2
-- **Total Test Files (After)**: 18
-- **Expected Coverage Gain**: +34.57% line, +50.28% branch
-
----
-
-## Notes
-
-### Files to Preserve
-
-These test files contain good tests and should be moved/refactored:
-
-- `spec/unit/image_uploader_spec.rb`
-- `spec/unit/jwt_authenticator_spec.rb`
-- `spec/unit/pdf_converter_spec.rb`
-- `spec/unit/pdf_downloader_spec.rb`
-- `spec/unit/retry_handler_spec.rb`
-- `spec/unit/s3_url_parser_spec.rb`
-- `spec/unit/url_validator_spec.rb`
-- `spec/integration/localstack_integration_spec.rb`
-
-### Test Extraction Notes
-
-From `spec/unit/error_handling_spec.rb`:
-
-- All retry logic tests should be incorporated into `pdf_downloader_spec.rb`
-- Tests are well-written and should be preserved
-
-From `spec/integration/authenticated_handler_spec.rb`:
-
-- Authentication tests should be in `jwt_authenticator_spec.rb`
-- Request validation tests should be in `request_validator_spec.rb`
-- Handler orchestration can inspire `app_spec.rb` tests
-- Don't duplicate - extract patterns for reuse
diff --git a/scripts/README.md b/scripts/README.md
new file mode 100644
index 0000000..1af9e24
--- /dev/null
+++ b/scripts/README.md
@@ -0,0 +1,198 @@
+# Testing Scripts
+
+Utility scripts to help test the PDF Converter API against your deployed production environment.
+
+## Prerequisites
+
+**Ruby**: The scripts require Ruby to be installed. They use `bundler/inline` to automatically install required gems on first run.
+
+**AWS Credentials**: Ensure your AWS credentials are configured:
+
+```bash
+aws configure
+```
+
+The scripts will automatically install their dependencies (JWT, AWS SDK) when first run - no manual gem installation needed!
+
+## Scripts
+
+### 1. Generate JWT Token
+
+Generate a JWT token for API authentication:
+
+```bash
+./scripts/generate_jwt_token.rb
+```
+
+The script retrieves the JWT secret from AWS Secrets Manager and generates a valid token.
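To make "generates a valid token" concrete, here is a rough sketch of HS256 signing using only Ruby's standard library. It is not the script's implementation (per the Prerequisites above, the script relies on the JWT gem and the AWS SDK); `base64url` and `encode_hs256` are hypothetical helper names, and the secret is a placeholder rather than a value read from Secrets Manager. The `sub` and `exp` values mirror the script's documented defaults.

```ruby
require 'openssl'
require 'json'

# Base64url encoding without padding, as JWT segments require.
def base64url(data)
  [data].pack('m0').tr('+/', '-_').delete('=')
end

# Hypothetical helper: builds "header.payload.signature" signed with HMAC-SHA256.
def encode_hs256(payload, secret)
  header = { alg: 'HS256', typ: 'JWT' }
  signing_input = "#{base64url(JSON.generate(header))}.#{base64url(JSON.generate(payload))}"
  signature = OpenSSL::HMAC.digest('SHA256', secret, signing_input)
  "#{signing_input}.#{base64url(signature)}"
end

# Placeholder secret; a real run would use the value stored under
# pdf-converter/jwt-secret.
token = encode_hs256({ sub: 'test-user', exp: Time.now.to_i + 3600 }, 'example-secret')
puts "Authorization: Bearer #{token}"
```

A token produced this way should be accepted by any HS256 verifier that holds the same secret, which is why the secret name and region passed to the script have to match what the deployed service reads.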
+ +**Options:** + +```bash +./scripts/generate_jwt_token.rb [options] + +Options: + -s, --secret SECRET JWT secret (if not using AWS Secrets Manager) + -n, --secret-name NAME AWS Secrets Manager secret name (default: pdf-converter/jwt-secret) + -r, --region REGION AWS region (default: us-east-1) + -e, --expiration SECONDS Token expiration in seconds (default: 3600) + -u, --subject SUBJECT Token subject/user identifier (default: test-user) + -h, --help Show help message +``` + +**Example:** + +```bash +# Generate token with default settings +./scripts/generate_jwt_token.rb + +# Generate token with 2-hour expiration +./scripts/generate_jwt_token.rb --expiration 7200 + +# Use a different secret name +./scripts/generate_jwt_token.rb --secret-name my-app/jwt-secret --region us-west-2 +``` + +### 2. Generate Pre-signed S3 URLs + +Generate pre-signed S3 URLs for source PDF and destination folder: + +```bash +./scripts/generate_presigned_urls.rb \ + --bucket my-bucket \ + --source-key pdfs/test.pdf \ + --dest-prefix output/ +``` + +**Required Arguments:** + +- `--bucket BUCKET`: S3 bucket name +- `--source-key KEY`: S3 key for source PDF (e.g., 'pdfs/test.pdf') +- `--dest-prefix PREFIX`: S3 prefix for destination images (e.g., 'output/') + +**Optional Arguments:** + +```bash + -r, --region REGION AWS region (default: us-east-1) + -e, --expiration SECONDS URL expiration in seconds (default: 3600) + -u, --unique-id ID Unique ID for this conversion (default: test-TIMESTAMP) + -f, --format FORMAT Output format: pretty, json, curl (default: pretty) + -h, --help Show help message +``` + +**Output Formats:** + +- `pretty`: Human-readable output with JSON payload (default) +- `json`: JSON output for programmatic use +- `curl`: Ready-to-use curl command template + +**Examples:** + +```bash +# Generate URLs with pretty output +./scripts/generate_presigned_urls.rb \ + --bucket my-bucket \ + --source-key pdfs/sample.pdf \ + --dest-prefix converted/ + +# Generate URLs as JSON 
+./scripts/generate_presigned_urls.rb \ + --bucket my-bucket \ + --source-key pdfs/sample.pdf \ + --dest-prefix converted/ \ + --format json + +# Generate URLs with curl template +./scripts/generate_presigned_urls.rb \ + --bucket my-bucket \ + --source-key pdfs/sample.pdf \ + --dest-prefix converted/ \ + --format curl + +# Custom expiration and unique ID +./scripts/generate_presigned_urls.rb \ + --bucket my-bucket \ + --source-key pdfs/sample.pdf \ + --dest-prefix converted/ \ + --expiration 7200 \ + --unique-id my-test-123 +``` + +## Complete Testing Workflow + +Here's how to test your deployed API end-to-end: + +### Step 1: Upload a test PDF to S3 + +```bash +aws s3 cp test.pdf s3://my-bucket/pdfs/test.pdf +``` + +### Step 2: Generate a JWT token + +```bash +./scripts/generate_jwt_token.rb +``` + +Copy the token from the output. + +### Step 3: Generate pre-signed URLs + +```bash +./scripts/generate_presigned_urls.rb \ + --bucket my-bucket \ + --source-key pdfs/test.pdf \ + --dest-prefix output/ +``` + +Copy the JSON payload from the output. 
+ +### Step 4: Call the API + +Use the JWT token and JSON payload to call your deployed API: + +```bash +curl -X POST https://your-api-id.execute-api.us-east-1.amazonaws.com/Prod/convert \ + -H "Authorization: Bearer YOUR_JWT_TOKEN" \ + -H "Content-Type: application/json" \ + -d '{ + "source": "https://s3.amazonaws.com/...", + "destination": "https://s3.amazonaws.com/...", + "unique_id": "test-123" + }' +``` + +### Step 5: Check the results + +```bash +# List converted images +aws s3 ls s3://my-bucket/output/ + +# Download an image to verify +aws s3 cp s3://my-bucket/output/test-123-0.png ./ +``` + +## Troubleshooting + +### AWS Credentials Not Found + +Make sure you've configured AWS CLI: + +```bash +aws configure +``` + +### Secret Not Found + +Ensure the JWT secret exists in AWS Secrets Manager: + +```bash +aws secretsmanager describe-secret --secret-id pdf-converter/jwt-secret +``` + +### Permission Denied + +Your AWS user/role needs these permissions: +- `s3:GetObject` on the source bucket +- `s3:PutObject` on the destination bucket +- `secretsmanager:GetSecretValue` for the JWT secret diff --git a/scripts/generate_jwt_token.rb b/scripts/generate_jwt_token.rb new file mode 100755 index 0000000..0c83dc3 --- /dev/null +++ b/scripts/generate_jwt_token.rb @@ -0,0 +1,136 @@ +#!/usr/bin/env ruby +# frozen_string_literal: true + +require 'bundler/inline' + +gemfile do + source 'https://rubygems.org' + gem 'jwt', '~> 2.7' + gem 'aws-sdk-secretsmanager', '~> 1' +end + +require 'optparse' + +# Script to generate JWT tokens for testing the PDF Converter API +# Can retrieve secret from AWS Secrets Manager or use a provided secret + +class JwtTokenGenerator + DEFAULT_EXPIRATION = 3600 # 1 hour + + def initialize(secret:) + @secret = secret + end + + def generate_token(subject: 'test-user', expiration: DEFAULT_EXPIRATION) + payload = { + sub: subject, + iat: Time.now.to_i, + exp: Time.now.to_i + expiration + } + + JWT.encode(payload, @secret, 'HS256') + end + + # Retrieve 
secret from AWS Secrets Manager + def self.retrieve_secret_from_aws(secret_name:, region: 'us-east-1') + client = Aws::SecretsManager::Client.new(region: region) + response = client.get_secret_value(secret_id: secret_name) + response.secret_string + rescue Aws::SecretsManager::Errors::ResourceNotFoundException + raise "Secret '#{secret_name}' not found in region #{region}" + rescue Aws::Errors::ServiceError => e + raise "AWS error retrieving secret: #{e.message}" + end +end + +# Parse command line options +options = { + region: 'us-east-1', + expiration: JwtTokenGenerator::DEFAULT_EXPIRATION, + subject: 'test-user', + secret_name: 'pdf-converter/jwt-secret' +} + +OptionParser.new do |opts| + opts.banner = "Usage: #{$PROGRAM_NAME} [options]" + opts.separator "" + opts.separator "Generate JWT tokens for testing the PDF Converter API" + opts.separator "" + opts.separator "The script will retrieve the JWT secret from AWS Secrets Manager by default." + opts.separator "Alternatively, provide a secret directly with --secret." + opts.separator "" + opts.separator "Options:" + + opts.on("-s", "--secret SECRET", "JWT secret (if not using AWS Secrets Manager)") do |v| + options[:secret] = v + end + + opts.on("-n", "--secret-name NAME", "AWS Secrets Manager secret name (default: pdf-converter/jwt-secret)") do |v| + options[:secret_name] = v + end + + opts.on("-r", "--region REGION", "AWS region (default: us-east-1)") do |v| + options[:region] = v + end + + opts.on("-e", "--expiration SECONDS", Integer, "Token expiration in seconds (default: 3600)") do |v| + options[:expiration] = v + end + + opts.on("-u", "--subject SUBJECT", "Token subject/user identifier (default: test-user)") do |v| + options[:subject] = v + end + + opts.on("-h", "--help", "Show this help message") do + puts opts + exit + end +end.parse! + +# Get the secret +begin + secret = if options[:secret] + options[:secret] + else + puts "Retrieving secret from AWS Secrets Manager..." 
+ JwtTokenGenerator.retrieve_secret_from_aws( + secret_name: options[:secret_name], + region: options[:region] + ) + end + + # Generate token + generator = JwtTokenGenerator.new(secret: secret) + token = generator.generate_token( + subject: options[:subject], + expiration: options[:expiration] + ) + + # Output + puts "=" * 80 + puts "JWT Token Generated" + puts "=" * 80 + puts "" + puts "Token:" + puts token + puts "" + puts "Authorization Header:" + puts "Authorization: Bearer #{token}" + puts "" + puts "Details:" + puts " Subject: #{options[:subject]}" + puts " Expires in: #{options[:expiration]} seconds (#{options[:expiration] / 60} minutes)" + puts " Issued at: #{Time.now}" + puts " Expires at: #{Time.now + options[:expiration]}" + puts "" + puts "=" * 80 + +rescue StandardError => e + puts "Error: #{e.message}" + puts "" + puts "Make sure you have:" + puts " 1. AWS credentials configured (run 'aws configure')" + puts " 2. Access to the secret in AWS Secrets Manager" + puts " 3. Or provide a secret directly with --secret" + exit 1 +end diff --git a/scripts/generate_presigned_urls.rb b/scripts/generate_presigned_urls.rb new file mode 100755 index 0000000..e257c3c --- /dev/null +++ b/scripts/generate_presigned_urls.rb @@ -0,0 +1,192 @@ +#!/usr/bin/env ruby +# frozen_string_literal: true + +require 'bundler/inline' + +gemfile do + source 'https://rubygems.org' + gem 'aws-sdk-s3', '~> 1' + gem 'rexml' # Required by aws-sdk +end + +require 'optparse' +require 'json' + +# Script to generate pre-signed S3 URLs for testing the PDF Converter API +# Uses local AWS credentials from ~/.aws/credentials or environment variables + +class PresignedUrlGenerator + DEFAULT_EXPIRATION = 3600 # 1 hour + + def initialize(bucket:, region: 'us-east-1', expiration: DEFAULT_EXPIRATION) + @bucket = bucket + @region = region + @expiration = expiration + @s3_client = Aws::S3::Client.new(region: @region) + end + + # Generate a pre-signed GET URL for downloading the source PDF + def 
generate_source_url(key) + signer = Aws::S3::Presigner.new(client: @s3_client) + signer.presigned_url( + :get_object, + bucket: @bucket, + key: key, + expires_in: @expiration + ) + end + + # Generate a pre-signed PUT URL for uploading converted images + # The key should be a prefix/folder path ending with / + def generate_destination_url(prefix) + # Ensure prefix ends with / for folder-style access + prefix = prefix.end_with?('/') ? prefix : "#{prefix}/" + + signer = Aws::S3::Presigner.new(client: @s3_client) + signer.presigned_url( + :put_object, + bucket: @bucket, + key: "#{prefix}placeholder.png", # Example key, actual keys will be unique_id-N.png + expires_in: @expiration + ).gsub('placeholder.png', '') # Remove placeholder to get base URL + end + + # Generate both URLs and return as a hash + def generate_urls(source_key:, destination_prefix:, unique_id: 'test') + { + source: generate_source_url(source_key), + destination: generate_destination_url(destination_prefix), + unique_id: unique_id, + bucket: @bucket, + region: @region, + expiration: @expiration + } + end +end + +# Parse command line options +options = { + region: 'us-east-1', + expiration: PresignedUrlGenerator::DEFAULT_EXPIRATION, + unique_id: "test-#{Time.now.to_i}", + output_format: 'pretty' +} + +OptionParser.new do |opts| + opts.banner = "Usage: #{$PROGRAM_NAME} --bucket BUCKET --source-key KEY --dest-prefix PREFIX [options]" + opts.separator "" + opts.separator "Generate pre-signed S3 URLs for testing the PDF Converter API" + opts.separator "" + opts.separator "Required arguments:" + + opts.on("-b", "--bucket BUCKET", "S3 bucket name") do |v| + options[:bucket] = v + end + + opts.on("-s", "--source-key KEY", "S3 key for source PDF (e.g., 'pdfs/test.pdf')") do |v| + options[:source_key] = v + end + + opts.on("-d", "--dest-prefix PREFIX", "S3 prefix for destination images (e.g., 'output/')") do |v| + options[:dest_prefix] = v + end + + opts.separator "" + opts.separator "Optional arguments:" + + 
opts.on("-r", "--region REGION", "AWS region (default: us-east-1)") do |v| + options[:region] = v + end + + opts.on("-e", "--expiration SECONDS", Integer, "URL expiration in seconds (default: 3600)") do |v| + options[:expiration] = v + end + + opts.on("-u", "--unique-id ID", "Unique ID for this conversion (default: test-TIMESTAMP)") do |v| + options[:unique_id] = v + end + + opts.on("-f", "--format FORMAT", "Output format: pretty, json, curl (default: pretty)") do |v| + options[:output_format] = v + end + + opts.on("-h", "--help", "Show this help message") do + puts opts + exit + end +end.parse! + +# Validate required arguments +unless options[:bucket] && options[:source_key] && options[:dest_prefix] + puts "Error: --bucket, --source-key, and --dest-prefix are required" + puts "Run with --help for usage information" + exit 1 +end + +# Generate URLs +begin + generator = PresignedUrlGenerator.new( + bucket: options[:bucket], + region: options[:region], + expiration: options[:expiration] + ) + + urls = generator.generate_urls( + source_key: options[:source_key], + destination_prefix: options[:dest_prefix], + unique_id: options[:unique_id] + ) + + # Output based on format + case options[:output_format] + when 'json' + puts JSON.pretty_generate(urls) + when 'curl' + # Output a ready-to-use curl command (requires JWT token to be added) + puts "# Copy this curl command and replace YOUR_JWT_TOKEN with an actual token" + puts "curl -X POST YOUR_API_ENDPOINT \\" + puts " -H \"Authorization: Bearer YOUR_JWT_TOKEN\" \\" + puts " -H \"Content-Type: application/json\" \\" + puts " -d '{" + puts " \"source\": \"#{urls[:source]}\"," + puts " \"destination\": \"#{urls[:destination]}\"," + puts " \"unique_id\": \"#{urls[:unique_id]}\"" + puts " }'" + else # pretty + puts "=" * 80 + puts "Pre-signed S3 URLs Generated" + puts "=" * 80 + puts "" + puts "Source URL (GET):" + puts " #{urls[:source]}" + puts "" + puts "Destination URL (PUT):" + puts " #{urls[:destination]}" + puts "" + 
puts "Details:" + puts " Bucket: #{urls[:bucket]}" + puts " Region: #{urls[:region]}" + puts " Unique ID: #{urls[:unique_id]}" + puts " Expires in: #{urls[:expiration]} seconds (#{urls[:expiration] / 60} minutes)" + puts "" + puts "JSON Payload for API:" + puts JSON.pretty_generate({ + source: urls[:source], + destination: urls[:destination], + unique_id: urls[:unique_id] + }) + puts "" + puts "=" * 80 + end + +rescue Aws::Errors::ServiceError => e + puts "AWS Error: #{e.message}" + puts "" + puts "Make sure you have:" + puts " 1. AWS credentials configured (run 'aws configure')" + puts " 2. Permissions to access S3 in region #{options[:region]}" + exit 1 +rescue StandardError => e + puts "Error: #{e.message}" + exit 1 +end
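One possible follow-up on `generate_presigned_urls.rb`: as written, any unrecognized `--format` value falls through to the `else` branch and prints the pretty output. A minimal sketch of rejecting unknown values instead, using OptionParser's support for a list of allowed values, is shown below. The constant `ALLOWED_FORMATS` and the wrapper method `parse_format` are hypothetical names introduced only for this example.

```ruby
require 'optparse'

# Hypothetical constant: the three formats the script documents.
ALLOWED_FORMATS = %w[pretty json curl].freeze

# Hypothetical wrapper so the behavior can be exercised in isolation.
# Returns the chosen format, defaulting to 'pretty'.
def parse_format(argv)
  options = { output_format: 'pretty' }

  OptionParser.new do |opts|
    # Passing an Array restricts the argument to the listed values;
    # anything else raises OptionParser::InvalidArgument.
    opts.on('-f', '--format FORMAT', ALLOWED_FORMATS,
            'Output format: pretty, json, curl (default: pretty)') do |v|
      options[:output_format] = v
    end
  end.parse!(argv.dup)

  options[:output_format]
end
```

With this in place, a typo such as `--format jsno` would surface as an error rather than silently producing pretty output. Whether that stricter behavior is wanted is a judgment call for the script's author.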