
Releases: codesandbox/codesandbox-sdk

v2.3.0

29 Sep 08:03
8a4d86a

2.3.0 (2025-09-29)

History of Hibernation and Persistence

In the original CodeSandbox design, we optimized sandbox lifecycles to feel like cloud-based laptops. When a user stepped away from a project, the sandbox would automatically hibernate after a short period of inactivity. On return, the sandbox would restore almost instantly to its previous state, including its memory and persisted data.

At the time, most sandboxes were small, short-lived projects. We managed persistence by storing each sandbox on disk for up to seven days before archiving. This provided a seamless user lifecycle: persistence was automatic, reliable, and required no manual intervention.

We also introduced Live Forking, which allowed users to fork a running sandbox while sharing memory with the original. This enabled smooth flows such as starting in a read-only, always-up-to-date main branch, then branching into a writable sandbox without interruption.

This approach delivered:

  • A simple mental model: “My sandbox is always where I left it.”
  • Cost efficiency: unused sandboxes were automatically hibernated or archived.
  • Low configuration overhead: persistence and storage were managed behind the scenes.

What We Have Learned

As CodeSandbox evolved into an SDK, use cases multiplied — many beyond what our original system anticipated. This introduced challenges around scalability, reliability, and flexibility.

The biggest friction today comes from our opinionated approach to hibernation and persistence.

Problems with Automatic Hibernation

  • Timeout fragility: Hibernation is triggered by a configurable timeout, but only certain SDK protocol messages extend it. Other activity (e.g. HTTP requests, file system operations) does not, making the design confusing and brittle.
  • In-sandbox management: Because the timeout is managed inside the sandbox itself, it can drift or fail. This sometimes keeps sandboxes alive longer than expected, and sometimes hibernates them too early.
  • Network triggers don’t reset timeouts: While SDK users can configure sandboxes to wake on HTTP or WebSocket activity, those interactions don’t extend the timeout. Developers often struggle to keep VMs alive reliably.
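
To make the timeout behavior above concrete, here is a minimal sketch of an inactivity timer in which only some kinds of activity extend the deadline. It is a simplified model, not the platform's implementation: the event kinds, the `hibernationTime` function, and the five-minute timeout are all assumptions chosen for illustration.

```typescript
// Hypothetical activity kinds; these names are illustrative, not SDK identifiers.
type ActivityKind = "sdk-protocol" | "http-request" | "fs-operation";

interface Activity {
  atMs: number; // time of the event, in ms since the sandbox started
  kind: ActivityKind;
}

// Returns the time at which the sandbox would hibernate, given that only
// the kinds listed in `extending` push the deadline forward.
function hibernationTime(
  timeoutMs: number,
  events: Activity[],
  extending: ActivityKind[],
): number {
  let deadline = timeoutMs;
  const ordered = events.slice().sort((a, b) => a.atMs - b.atMs);
  for (const event of ordered) {
    if (event.atMs >= deadline) break; // the sandbox hibernated before this event
    if (extending.indexOf(event.kind) !== -1) {
      deadline = event.atMs + timeoutMs;
    }
  }
  return deadline;
}

const events: Activity[] = [
  { atMs: 60000, kind: "sdk-protocol" },
  { atMs: 200000, kind: "http-request" },
  { atMs: 300000, kind: "fs-operation" },
];

// Only SDK protocol messages extend the timeout (the behavior described above).
const currentBehavior = hibernationTime(300000, events, ["sdk-protocol"]);

// Every kind of activity extends the timeout (what many developers expect).
const anyActivity = hibernationTime(300000, events, [
  "sdk-protocol",
  "http-request",
  "fs-operation",
]);

console.log(currentBehavior, anyActivity); // 360000 600000
```

In this model the sandbox hibernates at 360 seconds even though it served an HTTP request and a file system operation in the meantime, which is the kind of surprise the bullets above describe.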

Problems with Archiving

  • To stabilize clusters, we recently shortened the archive window from seven to four days (and sometimes just two days during peak load). While this improved system health, it introduced unpredictable resume behavior.
  • Resuming from an archived state (CLEAN) takes significantly longer (up to 60 seconds) than a normal resume (RESUME, ~1–3 seconds).
  • SDK users now must detect the boot type, handle longer startup times, and explain this inconsistency to their end users.
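
As a rough illustration of what handling the boot type can look like in application code, the sketch below maps the two boot types named above to a startup budget and a flag for showing a "slow start" notice. The `BootType` type, the `planStartup` function, and the specific headroom values are assumptions for this example, not SDK identifiers; how you obtain the boot type depends on your SDK version.

```typescript
// Boot types as named in these notes. The type and function names are
// hypothetical and not taken from the SDK.
type BootType = "RESUME" | "CLEAN";

interface StartupPlan {
  timeoutMs: number; // how long to wait before treating startup as failed
  showSlowStartNotice: boolean; // whether to warn the end user about a long wait
}

function planStartup(bootType: BootType): StartupPlan {
  switch (bootType) {
    case "RESUME":
      // A normal resume takes roughly 1-3 seconds; 10 s of headroom is an assumption.
      return { timeoutMs: 10000, showSlowStartNotice: false };
    case "CLEAN":
      // Resuming from an archived state can take up to 60 seconds;
      // the extra 30 s of headroom is an assumption.
      return { timeoutMs: 90000, showSlowStartNotice: true };
  }
}

console.log(planStartup("RESUME"), planStartup("CLEAN"));
```

Keeping the decision in one function means the user-facing messaging and the timeout stay in sync if the archive behavior changes.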

Problems with Live Forking

Live Forking introduced scalability bottlenecks. In some scenarios, thousands of sandboxes read simultaneously from a single “origin” sandbox’s memory, degrading performance across the platform.

In short, SDK users have worked hard to adapt our system to their needs. But the wide variety of use cases has proven incompatible with our current hibernation, persistence, and forking models. This mismatch has been a major source of reliability and scalability issues.

What We Are Shipping Today

SDK v2.3.0

SDK version 2.3.0 represents our Best Practices release. This is a NON-BREAKING change. We have also rewritten our docs to show you how to best take advantage of the current state of the service.

Features

  • Add a delete method
  • Make id optional in connect and createSession, so that sessions can default to the global user

Bug Fixes

  • Defaulting to public-hosts as privacy
  • .gitignore is now included in the template build

What We Are Working On

A REST-based Sandbox Agent

The current SDK requires a WebSocket connection, adding complexity across different environments. Moving to a REST-based agent will:

  • Simplify the mental model
  • Eliminate connection management overhead
  • Make the interface easier and safer to use

Long-term persistence

When a sandbox is hibernated, we create a snapshot. If that snapshot isn’t resumed within 2–7 days (depending on cluster health), the sandbox is archived. This makes the resume process unpredictable: normally it takes 1–3 seconds, but resuming from an archived state can take up to 60 seconds.

With long-term snapshot persistence, our goal is to eliminate archiving altogether. This would ensure predictable resume times of just 1–3 seconds, no matter how long a sandbox has been hibernated.

Replace the current hibernation timeout and automatic wakeup

For the best experience and the most control, we recommend adopting active lifecycle management. This applies both to the current service and to future updates.
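
One possible shape for active lifecycle management is a wrapper that resumes a sandbox, runs some work, and then hibernates it explicitly instead of relying on an inactivity timeout. The sketch below is written against a hypothetical `SandboxLifecycle` interface with an in-memory fake; `withSandbox`, `makeFakeLifecycle`, and the interface itself are assumptions for illustration, and a real implementation would delegate to whatever resume and hibernate calls your SDK version exposes.

```typescript
// Hypothetical minimal interface, not the SDK's API.
interface SandboxLifecycle {
  resume(id: string): Promise<void>;
  hibernate(id: string): Promise<void>;
}

// Resume the sandbox, run `work`, and always hibernate afterwards,
// even if `work` throws.
async function withSandbox<T>(
  lifecycle: SandboxLifecycle,
  id: string,
  work: () => Promise<T>,
): Promise<T> {
  await lifecycle.resume(id);
  try {
    return await work();
  } finally {
    // Hibernate explicitly rather than waiting for a timeout to fire.
    await lifecycle.hibernate(id);
  }
}

// In-memory fake that records calls, used here so the sketch runs standalone.
function makeFakeLifecycle(log: string[]): SandboxLifecycle {
  return {
    async resume(id: string): Promise<void> {
      log.push("resume:" + id);
    },
    async hibernate(id: string): Promise<void> {
      log.push("hibernate:" + id);
    },
  };
}

const demoLog: string[] = [];
withSandbox(makeFakeLifecycle(demoLog), "sandbox-1", async () => {
  demoLog.push("work");
  return "done";
}).then((result) => console.log(result, demoLog));
```

The `finally` block is the design choice that matters here: the sandbox is hibernated on both success and failure, so its lifetime is bounded by your code rather than by a timer.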

That said, we also want to support a simple, reliable timeout and wakeup mechanism. Our goal is to provide a default behavior that’s easy to understand, while still allowing room for configuration where developers need it.

Here are some key questions to consider:

  • Should the hibernation timeout only be extended when calling the Sandbox Agent? (For example, using its health endpoint as a heartbeat.)
  • If the timeout should extend on any request to the sandbox, are there certain types of requests that should not extend it?
  • If any request can wake the sandbox, are there certain types of requests that should not trigger a wakeup?

Feedback & Collaboration

We deeply appreciate our users for supporting us through this transition. One of the clearest lessons has been that you want sandboxes to behave like low-level, controllable resources, not high-level “laptops in the cloud.”

Your feedback has been invaluable in shaping the future of the SDK.

As we move forward, we invite you to share:

  • Comments or concerns
  • Specific use cases you’d like us to consider
  • Interest in joining feedback sessions or implementation discussions

We’re committed to making this transition smooth and giving you the tools and flexibility you need.

With love, The CodeSandbox SDK Team ❤️

v2.2.1

16 Sep 12:31
7f6f7b4

2.2.1 (2025-09-16)

Bug Fixes

  • Throw error when invalid port is used (#188) (d0bdf31)

v2.2.0

02 Sep 12:39
ca0e02a

2.2.0 (2025-09-02)

Features

Bug Fixes

  • ensure private preview on private sandbox (#179) (04381a0)
  • prevent api config overrides (#177) (a9ec1a7)
  • Queue messages on lost connection for reconnect (#176) (c5a8ffd)

v2.1.0

22 Aug 13:38
73e3e5c

2.1.0 (2025-08-22)

Features

  • add fetching single sandbox (#142) (2f58d43)
  • add listRunning method to sandboxes namespace (#145) (6050dbd)
  • add open telemetry for sandboxes methods (#147) (b331315)
  • add tracing to Sandbox and SandboxClient, also allow passing to browser and node connectors (#150) (6ef2bf5)
  • Debug Sandboxes through CLI (#163) (9af1cdd)
  • enhance container setup logging in build command (836a7a6)
  • enhance container setup logging in build command (a6f9fe7)
  • private sandbox, public hosts with public-hosts privacy (#154) (dce7caf)

Bug Fixes

  • Add custom retry delay support for startVM API calls (#156) (ce3a282)
  • Decouple pitcher-client (#148) (3a6f9ea)
  • friendly 503 error for overloaded Sandbox (#172) (f9987b1)
  • include response handling in retries and dispose clients in build to avoid reconnects (#162) (f70903a)
  • properly dispose and prevent wakeups (#170) (029e3a5)
  • Stabilize websocket connection (#166) (cb2f330)
  • update log line length to be smaller (9a3099f)

v2.0.7

06 Aug 09:46
fbb30a6

2.0.7 (2025-08-06)

Bug Fixes

v2.0.6

06 Aug 09:01
207dd06

2.0.6 (2025-08-06)

Bug Fixes

  • Add retries to all idempotent endpoints and added parallel file writing (#140) (db8aded)
  • Fix broken authorization in preview hosts (20a4e53)
  • Fix broken authorization in preview hosts (71b38b4)
  • Update to latest Ink and React 19 and bundle React and Ink into CLI (#138) (62da4fe)

Features

  • Add running VMs to CLI Dashboard (#129)

v2.0.5

29 Jul 09:05
281a2bd

2.0.5 (2025-07-29)

Bug Fixes

  • timeout errors on keepActiveWhileConnected (a519bcf)

v2.0.4

16 Jul 13:37
5ab4da7

2.0.4 (2025-07-16)

Bug Fixes

  • fix dependencies (0211660)
  • move some devdependencies to dependencies (77f74f1)

v2.0.3

09 Jul 09:00
988b892

2.0.3 (2025-07-08)

Bug Fixes

  • disable sentry by default, make it optional (#126) (09d9e8a)

v2.0.2

02 Jul 12:43
82ec4c3

2.0.2 (2025-07-02)

Bug Fixes

  • add api client to runningvms query fn (0119379)
  • add apiClient to context and pass to query fn (75f56da)
  • Template resolve files fixes (#121) (6d455ad)