@de6p
Contributor

@de6p de6p commented Jul 7, 2025

I have completely revamped the Volcano dashboard by migrating it to a modern architecture powered by Next.js, tRPC, shadcn/ui, and Turborepo. With this upgrade, the dashboard now delivers 5 to 10 times better performance compared to the previous version. The new user interface is blazing fast, visually stunning, and highly responsive. Additionally, Turborepo significantly improves the development workflow by enabling efficient monorepo management and faster build times.

@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign william-wang
You can assign the PR to them by writing /assign @william-wang in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jayesh9747
Contributor

@de6p I think you should split this into several smaller PRs so that it is easier to review.

@karanBRAVO
Member

> @de6p I think you should split this into several smaller PRs so that it is easier to review.

+1

@karanBRAVO
Member

Hello @de6p,
I have one quick question: why are we using Next.js instead of Vite.js + Express.js?

@JesseStutler
Member

> Hello @de6p, I have one quick question: why are we using Next.js instead of Vite.js + Express.js?

cc @de6p

@JesseStutler
Member

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This is a massive and impressive refactoring of the Volcano dashboard. The migration to a modern stack with Next.js, tRPC, and Turborepo is a huge step forward. The new UI looks clean and the architecture is much more scalable.

I've conducted a thorough review and identified several areas for improvement, from critical correctness issues like manual YAML parsing to maintainability and efficiency enhancements. Addressing these will help solidify the new foundation and ensure the dashboard is robust and easy to maintain.

Comment on lines 67 to 147
const parseYamlToManifest = (yamlString: string) => {
  const lines = yamlString.trim().split('\n')
  const manifest: any = {
    apiVersion: '',
    kind: '',
    metadata: { name: '' },
    spec: { containers: [] }
  }

  const requiredFields = ['apiVersion', 'kind', 'metadata', 'spec']
  const foundFields = new Set<string>()
  let currentSection = ''
  let currentSubSection = ''
  let currentContainer: any = null
  let inContainers = false

  for (const line of lines) {
    const trimmed = line.trim()
    if (!trimmed || trimmed.startsWith('#')) continue

    for (const field of requiredFields) {
      if (trimmed.startsWith(`${field}:`)) {
        foundFields.add(field)
        if (field === 'apiVersion' || field === 'kind') {
          manifest[field] = trimmed.split(':')[1].trim()
        }
        currentSection = field
        currentSubSection = ''
      }
    }

    if (trimmed.startsWith('name:') && currentSection === 'metadata') {
      manifest.metadata.name = trimmed.split(':')[1].trim()
    }

    if (currentSection === 'spec') {
      if (trimmed.startsWith('containers:')) {
        inContainers = true
        currentSubSection = 'containers'
      } else if (trimmed.startsWith('- name:') && inContainers) {
        if (currentContainer) {
          manifest.spec.containers.push(currentContainer)
        }
        currentContainer = { name: trimmed.split(':')[1].trim() }
      } else if (trimmed.startsWith('image:') && currentContainer) {
        currentContainer.image = trimmed.split(':')[1].trim()
      } else if (trimmed.startsWith('ports:') && currentContainer) {
        currentContainer.ports = []
        currentSubSection = 'ports'
      } else if (trimmed.startsWith('- containerPort:') && currentSubSection === 'ports' && currentContainer) {
        const port = parseInt(trimmed.split(':')[1].trim())
        currentContainer.ports.push({ containerPort: port })
      } else if (trimmed.startsWith('restartPolicy:') && currentSection === 'spec') {
        manifest.spec.restartPolicy = trimmed.split(':')[1].trim()
      }
    }
  }

  if (currentContainer) {
    manifest.spec.containers.push(currentContainer)
  }

  const missingFields = requiredFields.filter(field => !foundFields.has(field))
  if (missingFields.length > 0) {
    throw new Error(`Missing required fields: ${missingFields.join(', ')}`)
  }

  if (manifest.kind !== 'Pod') {
    throw new Error('Kind must be "Pod"')
  }

  if (!manifest.metadata.name) {
    throw new Error('Missing required field: metadata.name')
  }

  if (!manifest.spec.containers || manifest.spec.containers.length === 0) {
    throw new Error('Pod spec must include at least one container')
  }

  return manifest
}

critical

Manually parsing YAML using string splitting and line-by-line processing is extremely brittle and a significant security and correctness risk. It can easily fail with valid but differently formatted YAML (e.g., different indentation, comments, ordering).

Please use a robust, well-tested library like js-yaml to parse the YAML string. You will need to add it as a dependency.
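
One concrete failure of the line-splitting approach: `split(':')[1]` applied to `image: nginx:1.25` returns only `nginx`, silently dropping the tag. The block below is a minimal sketch of the alternative, assuming the YAML text has already been turned into a plain object by a real parser (for example js-yaml's `load`, which is deliberately not imported here so the snippet has no dependencies). `PodManifest`, `isRecord`, and `validatePodManifest` are hypothetical names for illustration, not code from this PR.

```typescript
// Hypothetical, heavily reduced shape; the real Pod schema is much larger.
interface PodManifest {
  apiVersion: string;
  kind: "Pod";
  metadata: { name: string };
  spec: { containers: Array<{ name: string; image?: string }> };
}

const isRecord = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// Validates an already-parsed document (assumed to come from a YAML library).
function validatePodManifest(doc: unknown): PodManifest {
  if (!isRecord(doc)) throw new Error("Manifest must be a YAML mapping");

  const missing = ["apiVersion", "kind", "metadata", "spec"].filter(
    (field) => !(field in doc),
  );
  if (missing.length > 0) {
    throw new Error(`Missing required fields: ${missing.join(", ")}`);
  }
  if (doc.kind !== "Pod") throw new Error('Kind must be "Pod"');

  const { metadata, spec } = doc;
  if (!isRecord(metadata) || typeof metadata.name !== "string" || !metadata.name) {
    throw new Error("Missing required field: metadata.name");
  }
  if (!isRecord(spec) || !Array.isArray(spec.containers) || spec.containers.length === 0) {
    throw new Error("Pod spec must include at least one container");
  }
  return doc as unknown as PodManifest;
}

// Why splitting on ":" is unsafe: everything after the second colon is lost.
const naiveImage = "image: nginx:1.25".split(":")[1]?.trim();
```

Keeping validation separate from parsing means the same checks could be reused for the Queue dialog by swapping the expected `kind`.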

Comment on lines 79 to 126
const parseYamlToManifest = (yamlString: string) => {
  const lines = yamlString.trim().split('\n')
  const manifest: any = {
    apiVersion: '',
    kind: '',
    metadata: { name: '' },
    spec: {}
  }

  const requiredFields = ['apiVersion', 'kind', 'metadata', 'spec']
  const foundFields = new Set<string>()

  for (const line of lines) {
    const trimmed = line.trim()
    if (!trimmed || trimmed.startsWith('#')) continue

    // Check for required top-level fields
    for (const field of requiredFields) {
      if (trimmed.startsWith(`${field}:`)) {
        foundFields.add(field)
        if (field === 'apiVersion' || field === 'kind') {
          manifest[field] = trimmed.split(':')[1].trim()
        }
      }
    }

    // Extract metadata.name
    if (trimmed.startsWith('name:') && foundFields.has('metadata')) {
      manifest.metadata.name = trimmed.split(':')[1].trim()
    }
  }

  // Validate required fields
  const missingFields = requiredFields.filter(field => !foundFields.has(field))
  if (missingFields.length > 0) {
    throw new Error(`Missing required fields: ${missingFields.join(', ')}`)
  }

  if (manifest.kind !== 'Queue') {
    throw new Error('Kind must be "Queue"')
  }

  if (!manifest.metadata.name) {
    throw new Error('Missing required field: metadata.name')
  }

  return manifest
}

critical

This component uses a manual YAML parser, which is brittle and unsafe. As with the pod creation dialog, this should be replaced with a dedicated YAML parsing library like js-yaml to ensure correctness and security.

Comment on lines 8 to 23
<div className="flex">
  <Sidebar />
  <main className="w-full flex-1 overflow-hidden">
    <ScrollArea className="h-[calc(100dvh)]">
      <section className="container grid items-center gap-6 pb-6 pt-12">
        <div className="flex max-w-[980px] flex-col items-start gap-2 mx-auto">
          <h1 className="text-xl font-semibold text-purple text-center">
            Dashboard
          </h1>
        </div>
      </section>
      <hr className="max-w-x" />
      <DashboardUtils />
    </ScrollArea>
  </main>
</div>

high

This page duplicates the layout structure from apps/web/src/app/(dashboard)/layout.tsx by including its own <Sidebar /> and <main> tag. This will result in a nested layout with two sidebars on the home page.

To fix this, this file should be moved to apps/web/src/app/(dashboard)/page.tsx. This will make it part of the dashboard route group, and the correct layout will be applied automatically. After moving the file, you can remove the redundant layout elements.

    <ScrollArea className="h-[calc(100dvh)]">
      <section className="container grid items-center gap-6 pb-6 pt-12">
        <div className="flex max-w-[980px] flex-col items-start gap-2 mx-auto">
          <h1 className="text-xl font-semibold text-purple text-center">
            Dashboard
          </h1>
        </div>
      </section>
      <hr className="max-w-x" />
      <DashboardUtils />
    </ScrollArea>

Comment on lines 62 to 70
const queuesQuery = trpc.queueRouter.getAllQueues.useQuery(
  undefined,
  {
    onError: (err) => {
      console.error("Error fetching queues:", err);
      setError(`Queues API error: ${err.message}`);
    },
  },
);

high

The queuesQuery is executed, but its data is never used within the component. This results in an unnecessary API call whenever the page mounts or the query refetches. Please remove this query to improve performance.

Comment on lines 7 to 20
let response = await k8sApi.listClusterCustomObject({
  group: "batch.volcano.sh",
  version: "v1alpha1",
  plural: "jobs",
  pretty: "true",
});

let filteredJobs = response.items || [];

const startIndex = (page - 1) * pageSize;
const endIndex = startIndex + pageSize;
const paginatedJobs = filteredJobs.slice(startIndex, endIndex);

high

The current implementation of server-side pagination fetches all jobs from the Kubernetes API and then slices the array. This is inefficient and will not scale well with a large number of jobs, potentially causing performance issues and high memory usage on the server.

The Kubernetes API supports pagination via the limit and continue parameters. You should leverage this for true server-side pagination. This comment also applies to fetchQueues and fetchPods in this file.
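
A minimal sketch of cursor-based paging, assuming a list function that accepts `limit` and `continue` and returns the next token in `metadata.continue`, as the Kubernetes list API does. `ListPage`, `ListFn`, `fetchPage`, and `fakeList` are hypothetical names; in the real code `fakeList` would be replaced by a call such as `k8sApi.listClusterCustomObject` with the paging parameters added, and the exact parameter names should be checked against the client version in use. Note that the API server hands out opaque continue tokens rather than page numbers, so the UI would page with next tokens instead of computing `(page - 1) * pageSize`.

```typescript
interface ListPage<T> {
  items: T[];
  metadata: { continue?: string | undefined };
}

type ListFn<T> = (opts: {
  limit: number;
  continue?: string | undefined;
}) => Promise<ListPage<T>>;

// Fetches exactly one page and returns the token needed to request the next one.
async function fetchPage<T>(
  list: ListFn<T>,
  pageSize: number,
  continueToken?: string,
): Promise<{ items: T[]; nextToken: string | undefined }> {
  const page = await list({ limit: pageSize, continue: continueToken });
  // An empty or missing continue token means there are no more results.
  const nextToken = page.metadata.continue || undefined;
  return { items: page.items, nextToken };
}

// In-memory stand-in for the API server, used only to exercise fetchPage.
// Its tokens are plain offsets; real Kubernetes tokens are opaque strings.
const allJobs = ["job-a", "job-b", "job-c", "job-d", "job-e"];
const fakeList: ListFn<string> = async ({ limit, continue: token }) => {
  const start = token ? Number(token) : 0;
  const end = start + limit;
  return {
    items: allJobs.slice(start, end),
    metadata: { continue: end < allJobs.length ? String(end) : undefined },
  };
};
```

With this shape the server only ever holds one page of items in memory, and the client passes `nextToken` back to request the following page.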


useEffect(() => {
  if (podsQuery.data) {
    console.log(podsQuery)

medium

A console.log statement was left in the code. These should be removed before merging to keep the console output clean in production.

"version": "0.1.0",
"private": true,
"scripts": {
  "dev": "next dev ",

medium

There's a trailing space in the dev script. While this doesn't cause any issues, it's best to remove it for consistency and cleanliness.

    "dev": "next dev",


useEffect(() => {
  if (queuesQuery.data) {
    console.log(queuesQuery)

medium

There's a console.log statement here that should be removed before merging.

Comment on lines 5 to 10
export const Icons = {
  dashboard: HomeIcon,
  Cloud: Cloud,
  Notepad: NotepadText,
  Waypoint: Waypoints
}

medium

The keys in the Icons object have inconsistent casing (dashboard vs. Cloud, Notepad, Waypoint). For better maintainability and predictability, it's best to use a consistent casing scheme, such as all lowercase.

Suggested change
- export const Icons = {
-   dashboard: HomeIcon,
-   Cloud: Cloud,
-   Notepad: NotepadText,
-   Waypoint: Waypoints
- }
+ export const Icons = {
+   dashboard: HomeIcon,
+   cloud: Cloud,
+   notepad: NotepadText,
+   waypoint: Waypoints
+ }

Comment on lines 3 to 9
export interface NavItem {
    title: string;
    icon?: keyof typeof Icons;
    href: string;
    disable?: boolean;
    label: string;
}

medium

The label property in the NavItem interface appears to be redundant, as its value is always the same as the title property in the navItems array. Removing it would simplify the data structure and reduce duplication.

export interface NavItem {
    title: string;
    icon?: keyof typeof Icons;
    href: string;
    disable?: boolean;
}

@JesseStutler
Member

Also, please fix the CI, @de6p. Can we build the image for testing now? I would rather not run into build errors when I try to test it.

@de6p
Contributor Author

de6p commented Jul 23, 2025

fixing it

de6p added 4 commits July 26, 2025 07:38
…ubernetes deployment configuration

Signed-off-by: Deep <[email protected]>
…ld tasks, and refactor TRPC context and router logic

Signed-off-by: Deep <[email protected]>
… process; remove outdated formatting check workflow

Signed-off-by: Deep <[email protected]>
…dency, and refactor navigation icons for consistency

Signed-off-by: Deep <[email protected]>
@de6p
Contributor Author

de6p commented Aug 19, 2025

I’ve fixed all the requested changes; we can merge it now.

@Monokaix
Member

Please check the gemini review comments and resolve the code conflict first.

@de6p
Contributor Author

de6p commented Sep 17, 2025

@Monokaix I have addressed all the comments.

@Monokaix
Member

> @Monokaix I have addressed all the comments.

Please also resolve the code conflicts :)

Signed-off-by: Deep <[email protected]>
@de6p
Contributor Author

de6p commented Sep 24, 2025

Fixed the merge conflicts.

@Monokaix
Member

https://github.com/volcano-sh/dashboard/blob/main/CONTRIBUTING.md should also be updated, since the Docker build script changed.

@Monokaix
Member

Why were test.yaml and build.yaml deleted from the workflows?

@Monokaix
Member

Is this still an architecture with a separate front end and back end?

@Monokaix
Member

There are some issues I have found so far:

  1. The filter (namespace, queue) function for jobs is lost.
  2. The "All" option for the namespace filter is lost.
  3. The dashboard logo should stay as the Volcano logo.
  4. A "create job" action should also be added.
  5. The dashboard page cannot reflect the queue's resources, because we have removed the related resources in the queue YAML.
  6. Some other actions are lost, such as delete/edit for queue/job/pod; these were already implemented in LFX term 1.

@de6p
Contributor Author

de6p commented Sep 24, 2025

> Why were test.yaml and build.yaml deleted from the workflows?

I deleted test.yaml and build.yaml because we now have a single ci.yaml workflow that runs the linting, type-checking, and build steps together. This makes our CI setup simpler to maintain and ensures all checks run consistently in one place.

@de6p
Contributor Author

de6p commented Sep 24, 2025

> Is this still an architecture with a separate front end and back end?

No, this is not a traditional front-end/back-end separation architecture. Instead, we are using Next.js with tRPC, which results in a tightly coupled setup. This means the front-end and back-end are more integrated, allowing us to share types, ensure end-to-end type safety, and speed up development. However, it differs from a separated architecture where front-end and back-end are completely decoupled services.

@Monokaix
Member

OK, please resolve the other issues.

@Monokaix
Member

It worked when I changed the Dockerfile to:

# Multi-stage build for TurboRepo Next.js app
FROM node:18-alpine AS base

# Stage 1: Install dependencies
FROM base AS deps
RUN apk add --no-cache libc6-compat
WORKDIR /app

# Copy package files
COPY package.json package-lock.json* turbo.json ./
COPY apps/web/package.json ./apps/web/
COPY packages/*/package.json ./packages/

# Install dependencies
RUN npm ci

# Stage 2: Build the application
FROM base AS builder
WORKDIR /app

# Copy package files and install dependencies again to ensure workspace links
COPY package.json package-lock.json* turbo.json ./
COPY apps/web/package.json ./apps/web/
COPY packages/*/package.json ./packages/
RUN npm ci

# Copy source code and node_modules from deps stage
COPY --from=deps /app/node_modules ./node_modules
COPY . .

# Build the web app
RUN cd apps/web && npm run build

# Stage 3: Production runtime
FROM base AS runner
WORKDIR /app

ENV NODE_ENV=production
# ENV NEXT_TELEMETRY_DISABLED=1

RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 nextjs

# Create necessary directories
RUN mkdir -p apps/web/.next/cache && \
    chown nextjs:nodejs apps/web/.next/cache

# Copy the entire standalone output to root
COPY --from=builder --chown=nextjs:nodejs /app/apps/web/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/apps/web/.next/static ./.next/static

# The public directory is often handled by the standalone output.
# If it's missing, this COPY command will fail. It's removed.

USER nextjs

EXPOSE 3000
ENV PORT=3000
ENV HOSTNAME="0.0.0.0"

# The standalone output places server.js in the root
CMD ["node", "server.js"]

@de6p
Contributor Author

de6p commented Oct 9, 2025

> There are some issues I have found so far:
>
>   1. The filter (namespace, queue) function for jobs is lost.
>   2. The "All" option for the namespace filter is lost.
>   3. The dashboard logo should stay as the Volcano logo.
>   4. A "create job" action should also be added.
>   5. The dashboard page cannot reflect the queue's resources, because we have removed the related resources in the queue YAML.
>   6. Some other actions are lost, such as delete/edit for queue/job/pod; these were already implemented in LFX term 1.

Five points are completed; only the fifth point is left. You can review the rest of the changes for now.

@de6p
Contributor Author

de6p commented Oct 9, 2025

All the requested changes have been completed.
