Skip to content

Setup CDS extractor esbuild JS bundle #203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Jul 18, 2025

Conversation

data-douser
Copy link
Collaborator

@data-douser data-douser commented Jul 15, 2025

This pull request introduces significant improvements to the CDS extractor toolchain, focusing on bundling, validation, and command execution. The changes include adding a new esbuild-based bundling process, streamlining execution scripts, enhancing command validation, and updating dependencies. These updates aim to improve performance, simplify workflows, and enhance security.

Build and Workflow Improvements:

  • .github/workflows/cds-extractor-dist-bundle.yml: Added a GitHub Actions workflow to automate the validation, linting, testing, and bundling of the CDS extractor. This includes checks for bundle size, source map presence, and Node.js shebang.
  • extractors/cds/tools/package.json: Migrated the build process to use esbuild for bundling, added scripts for validating the bundle, and removed unnecessary dependencies such as shell-quote. [1] [2] [3]

Codebase Simplification:

Command Validation Enhancements:

Script Updates:

  • extractors/cds/tools/index-files.cmd and extractors/cds/tools/index-files.sh: Updated scripts to use the pre-built cds-extractor.bundle.js instead of requiring TypeScript compilation and dependency installation at runtime. Simplified logic for checking and running the bundle. [1] [2]

These changes collectively improve the efficiency, security, and maintainability of the CDS extractor toolchain.

Changes the build process for the CDS extractor to produce
`dist/cds-extractor.bundle.js` and `dist/cds-extractor.bundle.js.map`
files instead of a pair of `.js` and `.js.map` files for every
compiled `.ts` file.

Allows for pre-build of the JavaScript code used to run the CDS
extractor while committing the bare minimum number of files for the
"distribution" of the build.
@data-douser data-douser requested a review from Copilot July 15, 2025 21:16
Copilot

This comment was marked as outdated.

Adds a GitHub Actions workflow for "CDS Extractor Distribution Bundle"
in order to integrate the linting, unit testing, compiling, and
bundling of CDS extractor TS code into a single NodeJS-compatible
bundle.
Removes shell-quote as a dependency of the CDS extractor and
implements alternative approaches to sanitizing the data sources
that had previously been sanitized via the `quote` function of the
shell-quote library, which causes a code scanning alert when included
in the all-in-one `cds-extractor.bundle.js` file.
@data-douser data-douser self-assigned this Jul 16, 2025
@data-douser data-douser requested review from lcartey and Copilot July 16, 2025 17:36
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a comprehensive build system overhaul for the CDS extractor, implementing an esbuild-based bundling process that simplifies deployment and improves security. The changes replace the previous TypeScript compilation approach with a single pre-built JavaScript bundle, eliminating runtime dependencies and command execution vulnerabilities.

  • Migrated from TypeScript compilation to esbuild bundling with automated CI/CD validation
  • Enhanced command validation with predefined secure patterns and absolute path verification
  • Streamlined deployment scripts to use pre-built bundles instead of runtime compilation

Reviewed Changes

Copilot reviewed 20 out of 23 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
validate-bundle.js New validation script for testing bundle integrity and execution
esbuild.config.mjs ESBuild configuration for creating the bundled extractor
src/cds/compiler/command.ts Enhanced command validation with secure pattern matching
src/utils.ts Added path resolution and validation for source root arguments
index-files.sh/.cmd Simplified to use pre-built bundles instead of runtime compilation
package.json Updated build process and removed shell-quote dependency
.github/workflows/cds-extractor-dist-bundle.yml New CI workflow for bundle validation
Files not reviewed (1)
  • extractors/cds/tools/package-lock.json: Language not supported

Applies suggestions from Copilot review of PR advanced-security#203 for the
`advanced-security/codeql-sap-js` repo.
@data-douser data-douser marked this pull request as ready for review July 16, 2025 18:09
Standardizes the CDS extractor build configs to use a 'node20' as
the JS runtime "target".

Updates the CDS extractor's package.json file in order to specify
required "engines" for `node` and `npm`.

Minimizes the `tsconfig.json` config used by the CDS extractor in
order to only specify configs required for linting and testing,
given that the main build config is now specified in the
`esbuild.config.mjs` file.
Copy link
Contributor

@lcartey lcartey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - overall I think the bundle approach is the right one. Added some comments on the command validation and the bundle generation.

/**
* Predefined secure CDS command patterns
*/
const ALLOWED_CDS_COMMANDS = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe we actually need to implement an allow list for cds commands. Reviewing the code, I believe the command tested against this list is exclusively generated by the getBestCdsCommand. The command provided by that function is either already one of the items from this list, or it is a command created from a fixed path prefixed with the cache directory, in which case it is not checked against this list anyway.

Instead of an allow list, I think we should just generate the "validated commands" directly - e.g. an object with the executable and args properties.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made an attempt to rework the "validated commands" approach.

@data-douser data-douser requested a review from lcartey July 17, 2025 16:04
@lcartey lcartey enabled auto-merge July 18, 2025 08:17
@lcartey lcartey merged commit 881e066 into advanced-security:main Jul 18, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants