rework: change workspace id #6923
base: main
**Comprehensive Code Review: Workspace ID Migration**

Claude finished @dieriba's task. I've completed a thorough review of PR #6923. This is a significant feature that enables changing workspace IDs through a multi-step migration process. Below is my detailed analysis.

**Overview**

This PR implements a comprehensive workspace ID migration system. Architecture: the migration is split into two phases.
```rust
    .await?;
check_w_id_conflict(&mut tx, &req.target_workspace_id).await?;

sqlx::query!(
```
**Critical Issue: Missing Database Schema Column**

The `workspace` table INSERT references a `parent_workspace_id` column that doesn't appear in the simplified schema documentation. This will cause a runtime error if the column doesn't exist.

```sql
INSERT INTO workspace (id, name, owner, deleted, premium, parent_workspace_id)
SELECT $1, $2, owner, deleted, premium, $3 FROM workspace WHERE id = $4
```

Please verify that:

- The `parent_workspace_id` column exists in the `workspace` table
- A migration has been added to create this column if it's new
- The column is properly indexed if it will be used in WHERE clauses (as seen in line 480)
```rust
    "v2_job_queue",
];

for table in non_auth_tables {
```
**Security Concern: SQL Injection Risk**

Using string formatting to construct SQL queries is dangerous and can lead to SQL injection:

```rust
sqlx::query(&format!(
    r#"UPDATE {} SET workspace_id = $1 WHERE workspace_id = $2"#,
    table
))
```

While `table` comes from a hardcoded array, this pattern should be avoided. Consider using a macro or match statement to generate the queries at compile time instead:

```rust
macro_rules! update_workspace_id {
    ($tx:expr, $table:literal, $target:expr, $source:expr) => {
        sqlx::query!(
            concat!("UPDATE ", $table, " SET workspace_id = $1 WHERE workspace_id = $2"),
            $target,
            $source
        )
        .execute(&mut **$tx)
        .await?;
    };
}
```

Or use individual `sqlx::query!` calls for each table to get compile-time verification.
```rust
let mut tx = user_db.begin(&authed).await?;

let migrated_count = sqlx::query_scalar!(
```
**Performance Issue: Unbounded Batch Migration**

The `migrate_jobs` function uses a single large CTE query that could potentially update millions of rows at once:

```sql
WITH batch_completed_jobs AS (
    SELECT id FROM v2_job_completed WHERE workspace_id = $1 LIMIT $2
),
batch_jobs AS (
    SELECT id FROM v2_job WHERE workspace_id = $1 LIMIT $2
)
```

Issues:

- Lock contention: This could hold locks on the `v2_job` and `v2_job_completed` tables for extended periods
- Transaction timeout: Large batches may exceed transaction timeout limits
- Memory pressure: Processing 10,000+ rows in a single transaction can cause memory issues

Recommendations:

- Add a maximum batch size limit (e.g., 1000-5000 rows)
- Consider processing jobs in smaller chunks with brief pauses between batches
- Add progress logging to help diagnose stuck migrations
- Consider using a queue-based approach for very large workspaces
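The capped, chunked approach could look roughly like this (a minimal TypeScript sketch; `migrateBatch` is a hypothetical stand-in for the backend call that migrates up to `limit` jobs and returns the number actually moved):

```typescript
const MAX_BATCH_SIZE = 5000; // hard cap per transaction
const PAUSE_MS = 100;        // brief pause between batches to ease lock contention

// Repeatedly migrate capped batches until a partial batch signals completion.
async function migrateInBatches(
  migrateBatch: (limit: number) => Promise<number>,
  requestedBatchSize: number
): Promise<number> {
  const batchSize = Math.min(requestedBatchSize, MAX_BATCH_SIZE);
  let total = 0;
  while (true) {
    const migrated = await migrateBatch(batchSize);
    total += migrated;
    console.log(`migrated ${total} jobs so far`); // progress logging
    if (migrated < batchSize) break; // last, partial batch: done
    await new Promise((resolve) => setTimeout(resolve, PAUSE_MS));
  }
  return total;
}
```

Keeping the cap server-side as well would prevent a misbehaving client from requesting an unbounded batch.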
```rust
sqlx::query!("DELETE FROM job_stats WHERE workspace_id = $1", &w_id)
    .execute(&mut *tx)
    .await?;
migrate_auth_tables(&mut tx, &req.source_workspace_id, &req.target_workspace_id).await?;
```
**Data Integrity Risk: No Validation Before Auth Migration**

The `migrate_auth_tables` function is called in `complete_workspace_migration` without verifying that all non-auth data has been successfully migrated. If job migration fails partway through, users could lose access to their workspace data.

Suggest adding validation:

```rust
// Before migrating auth tables, verify all critical data is migrated
let remaining_data = sqlx::query_scalar!(
    "SELECT COUNT(*) FROM (
        SELECT 1 FROM v2_job WHERE workspace_id = $1
        UNION ALL
        SELECT 1 FROM v2_job_completed WHERE workspace_id = $1
        UNION ALL
        SELECT 1 FROM script WHERE workspace_id = $1
        UNION ALL
        SELECT 1 FROM flow WHERE workspace_id = $1
    ) as remaining",
    source
)
.fetch_one(&mut **tx)
.await?;

if remaining_data.unwrap_or(0) > 0 {
    return Err(Error::BadRequest(
        "Cannot complete migration: source workspace still contains data".to_string()
    ));
}
```

```ts
migratingJobs = true
jobMigrationJobId = await JobService.runScriptByPath({
```
**Race Condition: Job Creation Without Workspace Verification**

The job is created in the old workspace but operates on data that may have already been partially migrated to the new workspace:

```ts
jobMigrationJobId = await JobService.runScriptByPath({
    workspace: $workspaceStore!, // old workspace
    path: hubPaths.workspaceMigrator,
    requestBody: {
        source_workspace_id: oldWorkspaceId,
        target_workspace_id: newId
    },
    skipPreprocessor: true
})
```

Issues:

- If the user's browser session is still bound to the old workspace, and auth tables have been migrated, the job creation might fail
- The job script needs to be available in the old workspace to execute

Consider creating the migration job in a system/admin workspace, or ensure the migration script is available in both workspaces.
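One way to make the second point explicit is a small preflight guard. This is an illustrative sketch only (not an existing Windmill API); `hasScript` would wrap whatever endpoint reports script availability per workspace:

```typescript
// Pick the workspace to launch the migration job from, preferring the
// source workspace, and fail fast if the migrator script is nowhere to run.
function pickMigrationWorkspace(
  hasScript: (workspaceId: string) => boolean,
  sourceId: string,
  targetId: string
): string {
  if (hasScript(sourceId)) return sourceId;
  if (hasScript(targetId)) return targetId;
  throw new Error(
    `Migration script not available in ${sourceId} or ${targetId}; aborting before job creation`
  );
}
```

Failing before job creation is cheaper than discovering mid-migration that the script cannot run.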
```ts
function startPolling() {
    if (!pollInterval) {
        pollInterval = setInterval(checkJobStatus, 1000) as any
```
**UX Issue: Hardcoded Polling Interval**

The polling interval is set to 1 second (1000ms), which may be too aggressive for long-running migration jobs:

```ts
pollInterval = setInterval(checkJobStatus, 1000) as any
```

For migrations that could take minutes or hours:

- Consider exponential backoff (start at 1s, increase to 5s, then 10s)
- Or use a fixed 2-3 second interval to reduce API load
- Add a timeout mechanism to detect stuck migrations

Example:

```ts
let pollCount = 0;
function getPollingInterval() {
    if (pollCount < 10) return 1000; // First 10s: 1s interval
    if (pollCount < 30) return 2000; // Next 40s: 2s interval
    return 5000; // After 50s: 5s interval
}
```

```ts
    }
})
sendUserToast(`Migration completed and old workspace deleted`)
window.location.href = `/workspace_settings?tab=general&workspace=${newId}`
```
**Critical UX Bug: Forced Redirect Without User Confirmation**

After migration completes, the code immediately redirects the user to the new workspace without confirmation or a chance to review:

```ts
window.location.href = `/workspace_settings?tab=general&workspace=${newId}`
```

Issues:

- If the redirect URL is malformed, users could lose access
- No chance to bookmark or save the new workspace URL
- Hard refresh loses any unsaved state in other tabs

Suggest:

- Show a success message with the new workspace URL first
- Require the user to click a button to navigate
- Or add a 5-second countdown with a cancel option:

```ts
sendUserToast(`Migration complete! Redirecting to ${newId} in 5 seconds...`)
setTimeout(() => {
    window.location.href = `/workspace_settings?tab=general&workspace=${newId}`
}, 5000)
```

```ts
let totalMigrated = 0;
const batchSize = 10000;

while (true) {
```
**Infinite Loop Risk: No Max Iteration Limit**

The migration loop has no maximum iteration count, which could run indefinitely if there's a bug:

```ts
while (true) {
    // ... migration logic
    if (migratedInBatch < batchSize) {
        break;
    }
}
```

Issues:

- If `migrated_count` is always equal to `batchSize` due to a bug, this loops forever
- No safety mechanism to detect stuck migrations
- Could consume API rate limits

Add safeguards:

```ts
const MAX_ITERATIONS = 10000; // Adjust based on expected maximum
let iterations = 0;
while (iterations < MAX_ITERATIONS) {
    iterations++;
    // ... existing logic
    if (migratedInBatch < batchSize) {
        break;
    }
}
if (iterations >= MAX_ITERATIONS) {
    log.error(colors.red(`Migration stopped: exceeded ${MAX_ITERATIONS} iterations`));
    throw new Error("Migration exceeded maximum iterations");
}
```

```rust
    .await?;
}

sqlx::query!(
```
**Data Migration Issue: `flow` Table Requires Special Handling**

The `flow` table is handled differently from other tables: it uses INSERT INTO...SELECT then DELETE instead of UPDATE. However, this approach has issues:

- Foreign key violations: If the `flow_version` or `flow_node` tables have foreign keys referencing `flow.workspace_id + flow.path`, the INSERT will fail
- Lost data risk: If the INSERT fails, the subsequent DELETE would need to be skipped, but there's no error handling for this
- Performance: INSERT+DELETE is slower than UPDATE for large tables

Consider using UPDATE like the other tables unless there's a specific reason for this approach. If there ARE constraints preventing UPDATE, they should be documented in a comment.

```rust
sqlx::query!("DELETE FROM http_trigger WHERE workspace_id = $1", &w_id)
    .execute(&mut *tx)
    .await?;
migrate_workspace_data_tables(&mut tx, &req.target_workspace_id, &source_workspace_id).await?;
```
**Logic Bug: Reverting Data to Wrong Workspace**

The revert function calls `migrate_workspace_data_tables` with reversed parameters:

```rust
migrate_workspace_data_tables(&mut tx, &req.target_workspace_id, &source_workspace_id).await?;
```

This moves data from target to source, which is correct for a revert. However, the function then deletes the target workspace entirely. This means:

- If there's NEW data created in the target workspace after the initial migration, it's lost
- Users might have created resources in the target workspace during testing

Consider either:

- Warning users that the target workspace will be deleted with all its contents
- Checking if the target workspace has any modified/new data before allowing revert
- Providing an option to preserve target workspace data
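The second option could be sketched as a guard like the following (hypothetical TypeScript; the counts would come from queries scoped to rows created after the migration started, e.g. `created_at > migration_started_at`):

```typescript
interface TargetWorkspaceActivity {
  newScripts: number;
  newFlows: number;
  newResources: number;
}

// Refuse the revert when the target workspace accumulated new content,
// since the revert path deletes the target workspace outright.
function canRevertSafely(activity: TargetWorkspaceActivity): { ok: boolean; reason?: string } {
  const total = activity.newScripts + activity.newFlows + activity.newResources;
  if (total > 0) {
    return {
      ok: false,
      reason: `target workspace has ${total} item(s) created after migration; revert would delete them`,
    };
  }
  return { ok: true };
}
```

The `reason` string gives the UI something concrete to show instead of a generic confirmation dialog.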
```sql
    workspace w
WHERE
    w.parent_workspace_id = $1
    AND NOT EXISTS (
```
**Query Optimization: Inefficient NOT EXISTS Subquery**

The query uses a correlated NOT EXISTS subquery, which can be slow on large databases:

```sql
WHERE w.parent_workspace_id = $1
AND NOT EXISTS (
    SELECT 1 FROM usr u WHERE u.workspace_id = w.id
)
```

This performs a table scan for each workspace. For better performance:

```sql
WHERE w.parent_workspace_id = $1
AND w.id NOT IN (
    SELECT DISTINCT workspace_id FROM usr
)
```

Or use a LEFT JOIN:

```sql
LEFT JOIN usr u ON u.workspace_id = w.id
WHERE w.parent_workspace_id = $1
AND u.workspace_id IS NULL
```

Both alternatives allow the query planner to build a hash table of `workspace_id`s instead of scanning for each row.
```ts
let newId = $state('')
$: validateName(newId)
$effect(() => {
```
**Svelte 5 Best Practice: `$effect` Dependency Tracking**

This `$effect` has implicit dependencies that could cause issues:

```ts
$effect(() => {
    if (!incompleteMigration && newName) {
        newId = newName.toLowerCase().replace(/\s/gi, '-')
    }
})
```

Issues:

- Runs on every component render, not just when `newName` or `incompleteMigration` change
- The reactive dependency tracking might miss changes in some edge cases

Better approach: make dependencies explicit or use `$derived`:

```ts
// Option 1: Use $derived instead
const newId = $derived(
    incompleteMigration
        ? incompleteMigration.targetWorkspaceId
        : newName.toLowerCase().replace(/\s/gi, '-')
);
```

This is more efficient and follows Svelte 5 best practices.
```ts
    validateName(newId)
})
async function validateName(id: string): Promise<void> {
```
**Performance Issue: Validation Runs on Every Effect Trigger**

The validation effect runs on every render:

```ts
$effect(() => {
    validateName(newId)
})
```

This causes excessive API calls to check workspace existence. Issues:

- No debouncing: calls the API for every keystroke
- Runs even when `newId` hasn't changed
- Could trigger rate limiting

Add debouncing:

```ts
let validationTimeout: number | null = null;
$effect(() => {
    if (validationTimeout) {
        clearTimeout(validationTimeout);
    }
    validationTimeout = setTimeout(() => {
        validateName(newId);
    }, 300) as any;
});
```

```ts
}
export let open = false
onDestroy(() => {
```
**Memory Leak: Polling Not Stopped on Component Destroy**

While `onDestroy` calls `stopPolling()`, there are edge cases where the interval might not be cleared:

- If the component unmounts during async operations
- If error handlers redirect before `onDestroy` runs

Add cleanup in error handlers too:

```ts
} catch (err: any) {
    stopPolling() // ✓ Good!
    migratingJobs = false
    // ... rest of error handling
}
```

Also consider using `$effect` cleanup instead of `onDestroy` for better Svelte 5 compatibility:

```ts
$effect(() => {
    if (migratingJobs && jobMigrationJobId) {
        startPolling();
        return () => stopPolling(); // Automatic cleanup
    }
});
```

```json
"cloneRepoToS3forGitRepoViewer_0": "hub/19825/clone_repo_and_upload_to_instance_storage",
"cloneRepoToS3forGitRepoViewer": "hub/19827/clone_repo_and_upload_to_instance_storage",
"workspaceMigrator": "hub/28067/workspace_job_migrator"
```
**Dependency: Hub Script Must Exist and Be Accessible**

The migration relies on a Hub script `hub/28067/workspace_job_migrator` that must:

- Be available and accessible to all workspaces attempting migration
- Have the correct implementation matching the expected API (`source_workspace_id`, `target_workspace_id` parameters)
- Use the CLI migrate command correctly

Critical questions to verify:

- Does this Hub script exist and is it published?
- What happens if the Hub is unavailable during migration?
- Should there be a fallback mechanism or local copy of the migration script?
- Does the script have proper error handling for partial migrations?

Consider adding validation to check if the Hub script exists before starting migration, or bundling the migration script as part of the Windmill installation for reliability.
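Such a preflight check might look like this (an illustrative sketch; `fetchScriptMetadata` is a hypothetical stand-in for whatever API resolves a Hub script path, returning `null` when the script is missing):

```typescript
// Abort before touching any data if the Hub migrator script cannot be resolved.
async function verifyMigratorAvailable(
  fetchScriptMetadata: (path: string) => Promise<{ path: string } | null>,
  migratorPath: string
): Promise<void> {
  const meta = await fetchScriptMetadata(migratorPath);
  if (meta === null) {
    throw new Error(`Hub script ${migratorPath} is unavailable; migration not started`);
  }
}
```

Running this before the first destructive step means a Hub outage surfaces as a clean error rather than a half-migrated workspace.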