-
-
Notifications
You must be signed in to change notification settings - Fork 349
fix: improve conversion rates by fixing metadata fetching errors #392
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Add proper error handling for network failures in fetch-metadata API - Wrap optional llms.txt fetches in try-catch to prevent crashes - Add specific error messages for better debugging - Include User-Agent headers in all fetch requests - Fix form submission error handling with proper response parsing - Replace type assertions with proper type guards - Add comprehensive test coverage for error scenarios Fixes 78% metadata fetching failure rate and 51% form submission errors 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
✨ Finishing Touches
🧪 Generate unit tests
Comment |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/web/app/api/fetch-metadata/route.ts (1)
37-198: Consider adding timeout configuration for fetch operations.While the error handling is comprehensive, there's no explicit timeout configuration for the fetch operations. Network requests could hang indefinitely without proper timeouts.
Add timeout handling using AbortController:
// Add this helper function at module level function fetchWithTimeout(url: string, options: RequestInit = {}, timeout = 10000) { const controller = new AbortController() const timeoutId = setTimeout(() => controller.abort(), timeout) return fetch(url, { ...options, signal: controller.signal }).finally(() => clearTimeout(timeoutId)) }Then update the fetch calls:
- response = await fetch(url, { + response = await fetchWithTimeout(url, { headers: { 'User-Agent': 'llmstxthub/1.0 (https://llmstxthub.com)' } - }) + }, 10000) // 10 second timeoutAnd similarly for the llms.txt and llms-full.txt fetches:
- const llmsResponse = await fetch(llmsUrl, { + const llmsResponse = await fetchWithTimeout(llmsUrl, { headers: { 'User-Agent': 'llmstxthub/1.0 (https://llmstxthub.com)' } - }) + }, 5000) // 5 second timeout for optional files
🧹 Nitpick comments (9)
apps/web/components/forms/submit-form.tsx (4)
118-120: Propagate server error details to users (at least in dev) to speed up debugging.You parse server JSON errors, but the toast shows a generic message. Consider surfacing
errorMessagein non-production to reduce support pings.- if (!response.ok) { - const errorData = await response.json().catch(() => ({ error: 'Failed to fetch metadata' })) - throw new Error(errorData.error || `Failed to fetch metadata: ${response.status}`) - } + if (!response.ok) { + const errorData = await response.json().catch(() => ({ error: 'Failed to fetch metadata' })) + throw new Error(errorData.error || `Failed to fetch metadata: ${response.status}`) + } ... - toast.error('Failed to fetch website information. Please try again.') + const baseMsg = 'Failed to fetch website information.' + const detail = process.env.NODE_ENV !== 'production' ? ` Details: ${errorMessage}` : '' + toast.error(`${baseMsg}${detail}`.trim())Also applies to: 156-160
147-147: Align llmsFullUrl type with form defaults/schema (avoid '' vs null mismatch).
defaultValues.llmsFullUrlisnull, but here you reset to''. If Zod expectsstring | null, keepnullto avoid subtle validation glitches.- llmsFullUrl: result.metadata.llmsFullUrl || '', // Only use if found in metadata, don't auto-generate + llmsFullUrl: result.metadata.llmsFullUrl ?? null, // keep null when absentIf the schema actually requires a string, we can standardize both defaults and reset to
''. Confirm with the Zod schema.
101-109: DRY: factor CSRF token lookup into a small helper.You duplicate the meta lookup in two handlers. A tiny helper improves readability and prevents drift.
// Place near top of component (client-safe) function getCsrfToken(): string | null { const el = document.querySelector('meta[name="csrf-token"]') return el instanceof HTMLMetaElement && el.content ? el.content : null }Then:
-const csrfMetaTag = document.querySelector('meta[name="csrf-token"]') -const headers: Record<string, string> = { 'Content-Type': 'application/json' } -if (csrfMetaTag instanceof HTMLMetaElement && csrfMetaTag.content) { - headers['x-csrf-token'] = csrfMetaTag.content -} else { - logger.warn('CSRF token not found in meta tag') -} +const headers: Record<string, string> = { 'Content-Type': 'application/json' } +const csrf = getCsrfToken() +if (csrf) headers['x-csrf-token'] = csrf +else logger.warn('CSRF token not found in meta tag')-const csrfMetaTag = document.querySelector('meta[name="csrf-token"]') -if (csrfMetaTag instanceof HTMLMetaElement && csrfMetaTag.content) { - formData.append('_csrf', csrfMetaTag.content) -} +const csrf = getCsrfToken() +if (csrf) formData.append('_csrf', csrf)Also applies to: 179-181
18-20: Enrich the component JSDoc per guidelines (props, returns, example).Add @param/@returns and a short example to satisfy repo standards.
/** * Submit form for adding websites to the directory. * @returns JSX.Element * @example * <SubmitForm /> */apps/web/app/api/fetch-metadata/__tests__/route-error-scenarios.test.ts (2)
63-66: Add JSON Content-Type header for consistency.Not strictly required for
request.json(), but keeps tests consistent with other POST cases and real clients.- const request = new Request('http://localhost/api/fetch-metadata', { + const request = new Request('http://localhost/api/fetch-metadata', { method: 'POST', - body: JSON.stringify({ website: 'https://nonexistent.domain' }) + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ website: 'https://nonexistent.domain' }) })
304-321: Avoid repeating the User-Agent literal; centralize in a constant.Reduces drift if the UA changes.
+// at top of file, after imports +const USER_AGENT = 'llmstxthub/1.0 (https://llmstxthub.com)' ... - expect(mockFetch).toHaveBeenNthCalledWith(1, 'https://example.com', { - headers: { - 'User-Agent': 'llmstxthub/1.0 (https://llmstxthub.com)' - } - }) + expect(mockFetch).toHaveBeenNthCalledWith(1, 'https://example.com', { + headers: { 'User-Agent': USER_AGENT } + }) ... - expect(mockFetch).toHaveBeenNthCalledWith(2, 'https://example.com/llms.txt', { - headers: { - 'User-Agent': 'llmstxthub/1.0 (https://llmstxthub.com)' - } - }) + expect(mockFetch).toHaveBeenNthCalledWith(2, 'https://example.com/llms.txt', { + headers: { 'User-Agent': USER_AGENT } + }) ... - expect(mockFetch).toHaveBeenNthCalledWith(3, 'https://example.com/llms-full.txt', { - headers: { - 'User-Agent': 'llmstxthub/1.0 (https://llmstxthub.com)' - } - }) + expect(mockFetch).toHaveBeenNthCalledWith(3, 'https://example.com/llms-full.txt', { + headers: { 'User-Agent': USER_AGENT } + })apps/web/app/api/fetch-metadata/route.ts (3)
49-63: Consider extracting thetoKeyfunction outside offetchMetadatafor better modularity.The
toKeyfunction is a pure utility that could be reused and tested independently. Consider moving it to module scope or a separate utilities file.+/** + * Convert URL to a comparable key format for duplicate detection + * @param url - The URL to convert + * @returns Object with hostname and path, or null if invalid + */ +function toKey(url: string) { + try { + const { hostname, pathname } = new URL( + normalizeUrl(url, { + stripWWW: true, + removeTrailingSlash: true, + removeQueryParameters: true + }) + ) + const path = pathname.endsWith('/') ? pathname : `${pathname}/` + return { hostname, path } + } catch { + return null + } +} + async function fetchMetadata(url: string) { try { // Check for duplicate websites const existingWebsites = await getWebsites() // Normalized URL not needed since we use toKey function below - /** - * Convert URL to a comparable key format - * @param u - The URL to convert - * @returns Object with hostname and path, or null if invalid - */ - const toKey = (u: string) => { - try { - const { hostname, pathname } = new URL( - normalizeUrl(u, { - stripWWW: true, - removeTrailingSlash: true, - removeQueryParameters: true - }) - ) - const path = pathname.endsWith('/') ? pathname : `${pathname}/` - return { hostname, path } - } catch { - return null - } - }
189-196: Consider simplifying the error re-throwing logic.The current implementation has some redundancy in error handling. The error is already being logged and the re-throwing logic could be simplified.
} catch (error) { - // Re-throw with the actual error message for better debugging - const errorMessage = error instanceof Error ? error.message : 'Failed to fetch metadata' - logger.error('Error in fetchMetadata:', { - data: { url, error: errorMessage }, - tags: { type: 'api', error_type: 'fetch_metadata_error' } - }) - throw error instanceof Error ? error : new Error(errorMessage) + // Re-throw with the actual error message for better debugging + logger.error('Error in fetchMetadata:', { + data: { url, error: error instanceof Error ? error.message : 'Failed to fetch metadata' }, + tags: { type: 'api', error_type: 'fetch_metadata_error' } + }) + throw error instanceof Error ? error : new Error('Failed to fetch metadata') }
206-247: Consider using a single object parameter for better maintainability.According to the coding guidelines, functions with more than one parameter should use a single object parameter. While
GETonly has one parameter, for consistency with the pattern and future extensibility, consider refactoring the internalfetchMetadatafunction call.Additionally, the error handling is well-implemented with proper logging and user-friendly error messages.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
apps/web/app/api/fetch-metadata/__tests__/route-error-scenarios.test.ts(1 hunks)apps/web/app/api/fetch-metadata/__tests__/route.test.ts(5 hunks)apps/web/app/api/fetch-metadata/route.ts(7 hunks)apps/web/components/forms/submit-form.tsx(4 hunks)
🧰 Additional context used
📓 Path-based instructions (10)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/functions.mdc)
**/*.{ts,tsx}: For functions with more than one parameter, use a single object parameter instead of multiple positional parameters
Name the input type for such object parameters using the pattern {FunctionName}Input
For single-parameter functions, direct positional arguments are acceptable
**/*.{ts,tsx}: Write concise, technical TypeScript code
Use functional and declarative programming patterns; avoid classes
Prefer iteration and modularization over code duplication
Use descriptive variable names with auxiliary verbs (e.g., isLoading, hasError)
Use TypeScript for all code; prefer interfaces over types
Avoid enums; use object maps instead
Use thefunctionkeyword for pure functions
Avoid unnecessary curly braces in simple conditionals; use concise syntax
**/*.{ts,tsx}: Use interfaces over types when possible in TypeScript
Avoid enums; prefer const objects (as const) instead
Use proper error handling with custom error types
Favor functional programming patterns
Avoid classes
Use pure functions where possible
Implement proper error handling
Use meaningful variable names
Keep functions small and focused
Files:
apps/web/components/forms/submit-form.tsxapps/web/app/api/fetch-metadata/route.tsapps/web/app/api/fetch-metadata/__tests__/route-error-scenarios.test.tsapps/web/app/api/fetch-metadata/__tests__/route.test.ts
**/*.tsx
📄 CodeRabbit inference engine (.cursorrules)
**/*.tsx: Use functional React components typed with TypeScript interfaces
Write declarative JSX
Use Shadcn UI, Radix UI, and Tailwind CSS for components and styling
Implement responsive, mobile-first design with Tailwind CSS
When using next/image, include explicit sizes and enable lazy loadingAdd JSDoc blocks for React components, including brief description, @param props with individual prop fields, @returns, and an @example usage
Files:
apps/web/components/forms/submit-form.tsx
apps/web/{**/*.{tsx,css},tailwind.config.{js,ts},postcss.config.{js,ts}}
📄 CodeRabbit inference engine (.cursor/rules/website.mdc)
Follow mobile-first responsive design using Tailwind CSS 4
Files:
apps/web/components/forms/submit-form.tsx
apps/web/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/website.mdc)
apps/web/**/*.{ts,tsx}: Use Shadcn UI components with Radix UI primitives sourced from packages/design-system
Optimize for Core Web Vitals (images, fonts, script strategy, code-splitting)
Use camelCase for function and variable names
Use descriptive names with auxiliary verbs (e.g., isLoading, hasError)
Files:
apps/web/components/forms/submit-form.tsxapps/web/app/api/fetch-metadata/route.tsapps/web/app/api/fetch-metadata/__tests__/route-error-scenarios.test.tsapps/web/app/api/fetch-metadata/__tests__/route.test.ts
apps/web/**
📄 CodeRabbit inference engine (.cursor/rules/website.mdc)
Use kebab-case for directories
Files:
apps/web/components/forms/submit-form.tsxapps/web/app/api/fetch-metadata/route.tsapps/web/app/api/fetch-metadata/__tests__/route-error-scenarios.test.tsapps/web/app/api/fetch-metadata/__tests__/route.test.ts
**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/jsdoc-comments.mdc)
Add JSDoc blocks for utility functions, documenting purpose, @param entries, @returns, and @throws where applicable
Files:
apps/web/app/api/fetch-metadata/route.tsapps/web/app/api/fetch-metadata/__tests__/route-error-scenarios.test.tsapps/web/app/api/fetch-metadata/__tests__/route.test.ts
apps/web/app/**/route.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/nextjs.mdc)
Use API Route Handlers (route.ts/tsx) to manage backend logic within the app structure
Files:
apps/web/app/api/fetch-metadata/route.ts
apps/web/app/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/website.mdc)
apps/web/app/**/*.{ts,tsx}: Follow Next.js 15 App Router best practices in the web app
Use React Server Components (RSC) by default; only add 'use client' when necessary
Implement type-safe Server Actions using next-safe-action
Files:
apps/web/app/api/fetch-metadata/route.tsapps/web/app/api/fetch-metadata/__tests__/route-error-scenarios.test.tsapps/web/app/api/fetch-metadata/__tests__/route.test.ts
**/*.{spec,test}.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/shared.mdc)
Add meaningful tests
Files:
apps/web/app/api/fetch-metadata/__tests__/route-error-scenarios.test.tsapps/web/app/api/fetch-metadata/__tests__/route.test.ts
{**/*.test.{ts,tsx},**/__tests__/*.{ts,tsx},*.test.{ts,tsx}}
📄 CodeRabbit inference engine (.cursor/rules/testing.mdc)
{**/*.test.{ts,tsx},**/__tests__/*.{ts,tsx},*.test.{ts,tsx}}: Place tests next to the file being tested using the .test.ts or .test.tsx extension
Alternatively, place tests in a tests directory located alongside the source files
Match the test folder structure to the source code structure
In tests, import from '@/test/test-utils' and use the custom render to get consistent providers/utilities
Structure tests with describe blocks and it cases using clear scenario names
Follow the Arrange, Act, Assert (AAA) pattern within test cases
Prefer @testing-library/user-event over fireEvent for user interactions
Follow React Testing Library query priority: getByRole, getByLabelText, getByPlaceholderText, getByText, getByDisplayValue, getByAltText, getByTitle, then getByTestId as a last resort
Mock external modules and APIs in tests using jest.mock and replace global.fetch with a jest.fn where needed
For Next.js API route tests, construct req/res using node-mocks-http createMocks
For Next.js server components, render with data from server-side props and assert rendered output
Test custom hooks with renderHook and wrap state changes in act
Always test both success and error paths for features and API calls
Focus tests on business logic, user interactions, error states, and edge cases
Use meaningful, descriptive test names
Keep tests focused and isolated; mock external dependencies as needed
Use proper cleanup in afterEach when needed
Use async/await properly for asynchronous tests
Maintain test directory structure that mirrors component/module structure
Files:
apps/web/app/api/fetch-metadata/__tests__/route-error-scenarios.test.tsapps/web/app/api/fetch-metadata/__tests__/route.test.ts
🔇 Additional comments (14)
apps/web/components/forms/submit-form.tsx (2)
101-109: Good: CSRF token guarded with a proper type check and fallback logging.Safer than asserting the element type; warning helps diagnose missing token in non-auth flows.
179-182: Good: CSRF token appended only when present and correctly typed.Keeps payload clean and avoids undefined values.
apps/web/app/api/fetch-metadata/__tests__/route.test.ts (5)
241-242: LGTM: updated expectation matches new network error message contract.Keeps tests aligned with improved server error propagation.
247-249: LGTM: mocking statusText enables precise error strings.This supports “ ” formatting.
261-262: LGTM: precise HTTP error message assertion.Matches the server’s new response shape.
379-379: LGTM: Cheerio error passthrough covered.Ensures DOM parsing failures remain observable.
401-403: LGTM: DOMPurify error passthrough covered.Maintains clarity on sanitization failures.
apps/web/app/api/fetch-metadata/__tests__/route-error-scenarios.test.ts (1)
34-57: Solid coverage for network error paths with structured logging assertions.The assertions on tags and payload are precise and valuable.
apps/web/app/api/fetch-metadata/route.ts (6)
80-96: Excellent error handling for network failures!The try-catch block properly handles network errors with detailed logging and user-friendly error messages. The User-Agent header is correctly included.
98-104: Good HTTP status checking with proper logging.The non-OK response handling is well-implemented with appropriate error logging and descriptive error messages.
106-115: Proper handling of response parsing errors.The separate try-catch for reading the response body is a good practice that helps distinguish between network and parsing errors.
145-160: Well-implemented optional llms.txt fetching.The try-catch wrapper prevents crashes while the debug-level logging is appropriate for optional file checks. Good use of the User-Agent header.
162-177: Consistent error handling for llms-full.txt.The implementation mirrors the llms.txt handling appropriately, maintaining consistency in the error handling approach.
255-296: Excellent error handling in POST endpoint.The POST handler properly validates input, checks for malicious URLs, and handles errors gracefully with detailed logging. The implementation aligns well with the PR objectives.
| // Check for llms.txt | ||
| // Check for llms.txt - but don't let it crash the whole function | ||
| let llmsExists = false | ||
| const llmsUrl = `${url}/llms.txt`.replace(/([^:]\/)\/+/g, '$1') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential issue with URL concatenation.
The current URL construction for llms.txt and llms-full.txt might not handle all edge cases correctly, especially if the base URL already has a trailing slash or path.
- const llmsUrl = `${url}/llms.txt`.replace(/([^:]\/)\/+/g, '$1')
+ const baseUrl = url.endsWith('/') ? url.slice(0, -1) : url
+ const llmsUrl = `${baseUrl}/llms.txt`- const llmsFullUrl = `${url}/llms-full.txt`.replace(/([^:]\/)\/+/g, '$1')
+ const baseUrl = url.endsWith('/') ? url.slice(0, -1) : url
+ const llmsFullUrl = `${baseUrl}/llms-full.txt`Or better yet, use the URL constructor:
- const llmsUrl = `${url}/llms.txt`.replace(/([^:]\/)\/+/g, '$1')
+ const llmsUrl = new URL('/llms.txt', url).href- const llmsFullUrl = `${url}/llms-full.txt`.replace(/([^:]\/)\/+/g, '$1')
+ const llmsFullUrl = new URL('/llms-full.txt', url).hrefAlso applies to: 164-164
🤖 Prompt for AI Agents
In apps/web/app/api/fetch-metadata/route.ts around lines 147 and 164, the manual
string concatenation with a regex to build llms.txt and llms-full.txt can fail
for edge cases (trailing slashes, paths); replace both constructions by creating
a URL object from the base (ensuring it's trimmed) and resolve the file paths
via the URL constructor (e.g., new URL('llms.txt', baseUrl).toString()) for
llms.txt and similarly for llms-full.txt so path joining is correct and robust.
| import { useEffect, useState } from 'react' | ||
| import { useForm } from 'react-hook-form' | ||
| import { toast } from 'sonner' | ||
| import { submitLlmsTxt } from '@/actions/submit-llms-xxt' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix broken import path (likely typo) for submit action.
@/actions/submit-llms-xxt looks wrong and will fail at runtime. It’s almost certainly submit-llms-txt.
Apply this diff:
-import { submitLlmsTxt } from '@/actions/submit-llms-xxt'
+import { submitLlmsTxt } from '@/actions/submit-llms-txt'Run to verify the correct file:
#!/bin/bash
# Find the action file and all import sites
fd -H -a 'submit-llms-*.ts*' | sed 's/^/FILE: /'
rg -n "submitLlmsTxt|submit-llms-xxt|submit-llms-txt" -C2🤖 Prompt for AI Agents
In apps/web/components/forms/submit-form.tsx around line 9 the import path is a
typo: change "import { submitLlmsTxt } from '@/actions/submit-llms-xxt'" to
"import { submitLlmsTxt } from '@/actions/submit-llms-txt'"; ensure the target
file exists and update any other occurrences if needed, then run the provided
shell verification (fd/rg) to confirm the action file and all import sites are
correct.
Fixes 78% metadata fetching failure rate and 51% form submission errors
Summary by CodeRabbit