From 6b6923e128e979f6177af6a56fbe61b2e4a9a981 Mon Sep 17 00:00:00 2001 From: CPerezz Date: Wed, 22 Oct 2025 18:47:40 +0200 Subject: [PATCH 1/7] Add draft EIP: Contract Bytecode Deduplication Discount This proposal introduces a gas discount for contract deployments when the bytecode being deployed already exists in the state. The mechanism extends EIP-2930 access lists with an optional checkCodeHash flag to enable deterministic deduplication checks without breaking consensus. Key features: - Access-list based deduplication via checkCodeHash flag - Avoids GAS_CODE_DEPOSIT * L costs for duplicate deployments - Solves database divergence issues across different sync modes - Becomes particularly relevant with EIP-8037's increased gas costs This EIP is extracted from the original EIP-8037 proposal to allow independent review and adoption. --- EIPS/eip-draft.md | 226 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 226 insertions(+) create mode 100644 EIPS/eip-draft.md diff --git a/EIPS/eip-draft.md b/EIPS/eip-draft.md new file mode 100644 index 00000000000000..ea938df9639dc7 --- /dev/null +++ b/EIPS/eip-draft.md @@ -0,0 +1,226 @@ +--- +eip: draft +title: Contract Bytecode Deduplication Discount +description: Reduces gas costs for deploying duplicate contract bytecode via access-list based mechanism +author: Carlos Perez (@CPerezz) +discussions-to: https://ethereum-magicians.org/t/eip-8037-state-creation-gas-cost-increase/25694 +status: Draft +type: Standards Track +category: Core +created: 2025-10-22 +requires: 2930 +--- + +## Abstract + +This proposal introduces a gas discount for contract deployments when the bytecode being deployed already exists in the state. By extending EIP-2930 access lists with an optional `checkCodeHash` flag, transactions can signal which existing contract addresses should be checked for bytecode duplication. 
When a match is found, the deployment avoids paying `GAS_CODE_DEPOSIT * L` costs since clients already store the bytecode and only need to link the new account to the existing code hash. + +This EIP becomes particularly relevant with the adoption of EIP-8037, which increases `GAS_CODE_DEPOSIT` from 200 to 1,900 gas per byte. Under EIP-8037, deploying a 24kB contract would cost approximately 46.6M gas for code deposit alone, making the deduplication discount economically significant for applications that deploy identical bytecode multiple times. + +## Motivation + +Currently, deploying duplicate bytecode costs the same as deploying new bytecode, even though Ethereum clients don't store duplicated code in their databases. When the same bytecode is deployed multiple times, clients store only one copy and have multiple accounts point to the same code hash. Under EIP-8037's proposed gas costs, deploying a 24kB contract costs approximately 46.6M gas for code deposit alone (`1,900 × 24,576`). This charge is unfair for duplicate deployments where no additional storage is consumed. + +A naive "check if code exists in database" approach would break consensus because different nodes have different database contents due to mostly Sync-mode differences: +- Full-sync nodes: Retain all historical code, including from reverted/reorged transactions +- Snap-sync nodes: Only store code reachable from the current state trie + +Empirical analysis reveals that approximately 27,869 bytecodes existed in full-synced node databases with no live account pointing to them (as of the Cancun fork). A database lookup `CodeExists(hash)` would yield different results on different nodes, causing different gas costs and breaking consensus. + +This proposal solves the problem by making deduplication checks explicit and deterministic through access lists, ensuring all nodes compute identical gas costs regardless of their database state. 
(Note that even though full-synced clients store additional code blobs, no account's codeHash references them. Users therefore cannot obtain a discount through such code, which keeps consensus safe.) + +## Specification + +The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 and RFC 8174. + +### Access List Extension + +EIP-2930 access list tuples are extended with an optional `checkCodeHash` boolean field: + +```json +{ + "address": "0x...", + "storageKeys": ["0x..."], + "checkCodeHash": true +} +``` + +### Consensus semantics: + +- The `checkCodeHash` field is OPTIONAL. If omitted, it defaults to `false`. +- Transactions with `checkCodeHash` fields are valid both pre-fork and post-fork. +- Pre-fork: Nodes MUST ignore the `checkCodeHash` field and MUST NOT grant deduplication discounts. +- Post-fork: Nodes MUST process the `checkCodeHash` field as specified below. 
+ +### CodeHash Access-Set Construction + +Before transaction execution begins, build a set `W` (the "CodeHash Access-Set") as follows: + +``` +W = { codeHash(a) | a ∈ accessList, a.checkCodeHash = true, a exists in state, a has code } +``` + +Where: +- `W` is built from state at the **start** of transaction execution (before any state changes) +- Only addresses that **already exist** in the state contribute to `W` +- Only addresses that **have deployed code** (non-empty code) contribute to `W` +- Empty accounts or accounts with no code do NOT contribute their code hash to `W` + +### Contract Creation Gas Accounting + +When a contract creation transaction or opcode (`CREATE`/`CREATE2`) successfully completes and returns bytecode `B` of length `L`, compute `H = keccak256(B)` and apply the following gas charges: + +**Deduplication check:** +- If `H ∈ W`: Bytecode is a duplicate + - Do NOT charge `GAS_CODE_DEPOSIT * L` + - Link the new account's `codeHash` to the existing code hash `H` + - The bytecode `B` is NOT persisted (it already exists and it's the current behaviour) +- If `H ∉ W`: Bytecode is new + - Charge `GAS_CODE_DEPOSIT * L` + - Persist bytecode `B` under hash `H` + - Link the new account's `codeHash` to `H` + +**Gas costs:** +- The cost of reading `codeHash` for access-listed addresses is already covered by EIP-2929/2930 access costs (intrinsic access-list cost and cold→warm state access charges). +- No additional gas cost is introduced for the deduplication check itself. 
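The construction of `W` and the deposit rule described above can be sketched as runnable Python. This is a minimal sketch under stated assumptions, not client code: `AccessEntry`, `build_access_set`, and `deposit_gas` are hypothetical names, state is modeled as a plain dict from address to code, and `hashlib.sha3_256` stands in for Keccak-256 (the two use different padding, so real code hashes will not match).

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Set

GAS_CODE_DEPOSIT = 200  # gas per byte today; EIP-8037 proposes 1,900


def code_hash(code: bytes) -> bytes:
    # Stand-in only: Ethereum uses Keccak-256, not the NIST SHA3-256 in hashlib.
    return hashlib.sha3_256(code).digest()


@dataclass
class AccessEntry:
    """Hypothetical model of an access-list tuple with the optional flag."""
    address: str
    check_code_hash: bool = False  # mirrors `checkCodeHash`, defaults to false


def build_access_set(access_list: List[AccessEntry],
                     state: Dict[str, bytes]) -> Set[bytes]:
    """Build W from the state as it is before the transaction executes."""
    w: Set[bytes] = set()
    for entry in access_list:
        code = state.get(entry.address, b"")
        # Only flagged addresses that exist and have non-empty code contribute.
        if entry.check_code_hash and code:
            w.add(code_hash(code))
    return w


def deposit_gas(bytecode: bytes, w: Set[bytes]) -> int:
    """Deposit gas charged when a creation returns `bytecode`."""
    if code_hash(bytecode) in w:
        return 0  # duplicate: no code deposit charge
    return GAS_CODE_DEPOSIT * len(bytecode)
```

With `state = {"0xaa": b"\x60\x00"}`, flagging `0xaa` makes a deployment of the same two bytes free of deposit gas, while omitting the flag or deploying different bytes charges `GAS_CODE_DEPOSIT * L` as usual.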
+ +### Implementation Pseudocode + +```python +# Before transaction execution: +W = set() +for tuple in tx.access_list: + warm(tuple.address) # per EIP-2930/EIP-2929 rules + if tuple.checkCodeHash == true: + acc = load_account(tuple.address) + if acc exists and acc.code is not empty: + W.add(acc.codeHash) + +# On successful CREATE/CREATE2: +H = keccak256(B) +if H in W: + # Duplicate: no deposit gas + link_codehash(new_account, H) +else: + # New bytecode: charge and persist + charge(GAS_CODE_DEPOSIT * len(B)) + persist_code(H, B) + link_codehash(new_account, H) +``` + +## Rationale + +### Why Access-List Based Deduplication? + +The access-list approach provides several critical properties: + +1. Deterministic behavior: +The result depends only on the transaction's access list and current state, not on local database contents. All nodes compute the same gas cost. + +2. No reverse index requirement: +Unlike other approaches, this doesn't require maintaining a `codeHash → [accounts]` reverse index, which would add significant complexity and storage overhead. + +3. Leverages existing infrastructure: +Builds on EIP-2930 access lists and EIP-2929 access costs, requiring minimal protocol changes. + +4. Explicit opt-in: +Transactions must explicitly indicate which addresses to check. This prevents unexpected behavior and gives users/wallets control over gas optimization. + +5. Forward compatibility: +Pre-fork nodes ignore `checkCodeHash` and never grant discounts. Post-fork, all nodes enforce identical behavior. Wallets can optionally add the field to optimize gas, but its absence doesn't invalidate transactions. + +6. Avoiding have a code-root for state: +At this point, clients handle code storage on their own ways. They don't have any consensus on the deployed existing codes (besides that all of the ones referenced in account's codehash fields exist). +Changing this seems a lot more complex and unnecessary. 
+ +### Same-Block Deployments + +Sequential transaction execution ensures that a deployment storing new code makes it visible to later transactions in the same block: + +1. Transaction `T_A` deploys bytecode `B` at address `X` + - Pays full `GAS_CODE_DEPOSIT * L` (no prior contract has this bytecode) + - Code is stored under hash `H = keccak256(B)` + +2. Later transaction `T_B` in the same block deploys the same bytecode `B`: + - `T_B` includes address `X` in its access list with `checkCodeHash: true` + - When `T_B` executes, `W` is built from the current state (including `T_A`'s changes) + - Since `X` now exists, `W` contains `H` + - `T_B`'s deployment gets the discount + +> While this only tries to formalize the behaviour, it's important to remark that this kind of behaviour is complex. As it requires control over tx ordering in order to abuse. And Builders can't modify the Acess List as it is already signed with the Tx. Nevertheless, this could happen, thus is formalized here. + +### Edge Case: Simultaneous New Deployments + +If two transactions in the same block both deploy identical new bytecode and neither references an existing contract with that bytecode in their access lists, both will pay full `GAS_CODE_DEPOSIT * L`. This is acceptable because: + +- The first deployment cannot be known at transaction construction time +- This scenario is extremely rare in practice +- The complexity of special handling is not worth the minimal benefit + +## Backwards Compatibility + +This proposal requires a scheduled network upgrade but is designed to be forward-compatible with existing transactions. 
+ +**Transaction compatibility:** +- Transactions with `checkCodeHash` fields are syntactically valid both pre-fork and post-fork +- Pre-fork: The field is ignored; all deployments pay full costs +- Post-fork: The field determines deduplication behavior + +**Wallet and tooling updates:** +- RPC methods like `eth_estimateGas` MUST account for potential deduplication discounts +- Wallets SHOULD provide UI for users to specify deduplication targets +- Transaction builders MAY automatically detect duplicate deployments and add appropriate access list entries + +**Node implementation:** +- Clients MUST ignore `checkCodeHash` pre-fork +- Clients MUST enforce deduplication semantics post-fork +- No changes to state trie structure or database schema are required + +### Example Transaction + +Deploying a contract with the same bytecode as the contract at `0x1234...5678`: + +```json +{ + "from": "0xabcd...ef00", + "to": null, + "data": "0x608060405234801561001...", + "accessList": [ + { + "address": "0x1234567890123456789012345678901234567890", + "storageKeys": [], + "checkCodeHash": true + } + ] +} +``` + +If the deployed bytecode hash matches `codeHash(0x1234...5678)`, the deployment receives the deduplication discount. + +## Security Considerations + +### Gas Cost Accuracy + +The deduplication mechanism ensures that gas costs accurately reflect actual resource consumption. Duplicate deployments don't consume additional storage, so they shouldn't pay storage costs. 
+ +### Denial of Service + +The access-list mechanism prevents DoS attacks because: +- The cost of reading `codeHash` is already covered by EIP-2929/2930 +- No additional state lookups or database queries are required +- The deduplication check is O(1) (set membership test) + +### Access List Size + +Large access lists with many `checkCodeHash: true` entries could increase transaction size, but: +- Access lists are already part of transaction calldata and priced accordingly +- The `checkCodeHash` field adds minimal bytes +- Users have economic incentive to only include necessary entries + +### State Divergence + +The explicit access-list approach prevents state divergence issues that would arise from implicit database lookups. All nodes compute identical gas costs regardless of their sync mode or database contents. + +## Copyright + +Copyright and related rights waived via [CC0](../LICENSE.md). From 36d65072f91d3da7925b6cda1517534a9ade9d7f Mon Sep 17 00:00:00 2001 From: CPerezz Date: Thu, 23 Oct 2025 12:50:07 +0200 Subject: [PATCH 2/7] Address PR review comments and switch to implicit deduplication Major changes: 1. Remove checkCodeHash flag - make deduplication implicit via access lists 2. Add co-authors Wei-Han and Guillaume Ballet 3. Fix grammar: behaviour -> behavior, formalise -> formalize 4. Update snap-sync description for technical accuracy 5. Clarify edge cases for same-block deployments 6. Move Same-Block Deployments section from Rationale to Specification 7. 
Add rationale explaining why implicit design avoids chain splits Reviewer feedback addressed: - @weiihann: Remove explicit checkCodeHash flag, use implicit checking - @gballet: Chain split concerns resolved by implicit design - @gballet: Grammar and technical accuracy fixes - @weiihann: Simplify empty code handling - @weiihann: Clarify same-block deployment edge cases The implicit design provides several advantages: - No protocol changes to access list structure - Avoids chain split risks from unknown transaction fields - Simpler implementation - any address in access list contributes - Automatic optimization without explicit opt-in flags --- EIPS/eip-draft.md | 135 ++++++++++++++++++++++------------------------ 1 file changed, 64 insertions(+), 71 deletions(-) diff --git a/EIPS/eip-draft.md b/EIPS/eip-draft.md index ea938df9639dc7..091581eaa168bf 100644 --- a/EIPS/eip-draft.md +++ b/EIPS/eip-draft.md @@ -2,7 +2,7 @@ eip: draft title: Contract Bytecode Deduplication Discount description: Reduces gas costs for deploying duplicate contract bytecode via access-list based mechanism -author: Carlos Perez (@CPerezz) +author: Carlos Perez (@CPerezz), Wei-Han (@weiihann), Guillaume Ballet (@gballet) discussions-to: https://ethereum-magicians.org/t/eip-8037-state-creation-gas-cost-increase/25694 status: Draft type: Standards Track @@ -13,17 +13,17 @@ requires: 2930 ## Abstract -This proposal introduces a gas discount for contract deployments when the bytecode being deployed already exists in the state. By extending EIP-2930 access lists with an optional `checkCodeHash` flag, transactions can signal which existing contract addresses should be checked for bytecode duplication. When a match is found, the deployment avoids paying `GAS_CODE_DEPOSIT * L` costs since clients already store the bytecode and only need to link the new account to the existing code hash. 
+This proposal introduces a gas discount for contract deployments when the bytecode being deployed already exists in the state. By leveraging EIP-2930 access lists, any contract address included in the access list automatically contributes its code hash to a deduplication check. When the deployed bytecode matches an existing code hash from the access list, the deployment avoids paying `GAS_CODE_DEPOSIT * L` costs since clients already store the bytecode and only need to link the new account to the existing code hash. This EIP becomes particularly relevant with the adoption of EIP-8037, which increases `GAS_CODE_DEPOSIT` from 200 to 1,900 gas per byte. Under EIP-8037, deploying a 24kB contract would cost approximately 46.6M gas for code deposit alone, making the deduplication discount economically significant for applications that deploy identical bytecode multiple times. ## Motivation -Currently, deploying duplicate bytecode costs the same as deploying new bytecode, even though Ethereum clients don't store duplicated code in their databases. When the same bytecode is deployed multiple times, clients store only one copy and have multiple accounts point to the same code hash. Under EIP-8037's proposed gas costs, deploying a 24kB contract costs approximately 46.6M gas for code deposit alone (`1,900 × 24,576`). This charge is unfair for duplicate deployments where no additional storage is consumed. +Currently, deploying duplicate bytecode costs the same as deploying new bytecode, even though execution clients don't store duplicated code in their databases. When the same bytecode is deployed multiple times, clients store only one copy and have multiple accounts point to the same code hash. Under EIP-8037's proposed gas costs, deploying a 24kB contract costs approximately 46.6M gas for code deposit alone (`1,900 × 24,576`). This charge is unfair for duplicate deployments where no additional storage is consumed. 
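The code-deposit figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are chosen here for readability, and 24,576 bytes is the contract size used in the Motivation text.

```python
# Illustrative arithmetic only; the names below are assumptions made for this sketch.
GAS_CODE_DEPOSIT_CURRENT = 200       # gas per byte today
GAS_CODE_DEPOSIT_EIP_8037 = 1_900    # gas per byte proposed by EIP-8037
CODE_SIZE = 24_576                   # bytes in a 24kB contract


def code_deposit_gas(gas_per_byte: int, length: int) -> int:
    """Code deposit charge: GAS_CODE_DEPOSIT * L."""
    return gas_per_byte * length


current = code_deposit_gas(GAS_CODE_DEPOSIT_CURRENT, CODE_SIZE)
proposed = code_deposit_gas(GAS_CODE_DEPOSIT_EIP_8037, CODE_SIZE)
print(current, proposed)  # 4915200 46694400
```

The second figure, 46,694,400 gas, is the deposit charge that a duplicate deployment would avoid under this proposal.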
A naive "check if code exists in database" approach would break consensus because different nodes have different database contents due to mostly Sync-mode differences: - Full-sync nodes: Retain all historical code, including from reverted/reorged transactions -- Snap-sync nodes: Only store code reachable from the current state trie +- Snap-sync nodes: initially, only store code referenced in the pivot state tries, and those accumulated past the start of the sync Empirical analysis reveals that approximately 27,869 bytecodes existed in full-synced node databases with no live account pointing to them (as of the Cancun fork). A database lookup `CodeExists(hash)` would yield different results on different nodes, causing different gas costs and breaking consensus. @@ -33,38 +33,22 @@ This proposal solves the problem by making deduplication checks explicit and det The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 and RFC 8174. -### Access List Extension +### Implicit Deduplication via Access Lists -EIP-2930 access list tuples are extended with an optional `checkCodeHash` boolean field: - -```json -{ - "address": "0x...", - "storageKeys": ["0x..."], - "checkCodeHash": true -} -``` - -### Consensus semantics: - -- The `checkCodeHash` field is OPTIONAL. If omitted, it defaults to `false`. -- Transactions with `checkCodeHash` fields are valid both pre-fork and post-fork. -- Pre-fork: Nodes MUST ignore the `checkCodeHash` field and MUST NOT grant deduplication discounts. -- Post-fork: Nodes MUST process the `checkCodeHash` field as specified below. +This proposal leverages the existing EIP-2930 access list structure without any modifications. No new fields or protocol changes are required. 
### CodeHash Access-Set Construction Before transaction execution begins, build a set `W` (the "CodeHash Access-Set") as follows: ``` -W = { codeHash(a) | a ∈ accessList, a.checkCodeHash = true, a exists in state, a has code } +W = { codeHash(a) | a ∈ accessList, a exists in state, a has code } ``` Where: - `W` is built from state at the **start** of transaction execution (before any state changes) -- Only addresses that **already exist** in the state contribute to `W` -- Only addresses that **have deployed code** (non-empty code) contribute to `W` -- Empty accounts or accounts with no code do NOT contribute their code hash to `W` +- **All** addresses in the access list are checked - if they exist in state and have deployed code, their code hash is added to `W` +- Empty accounts or accounts with no code do not contribute to `W` ### Contract Creation Gas Accounting @@ -91,10 +75,9 @@ When a contract creation transaction or opcode (`CREATE`/`CREATE2`) successfully W = set() for tuple in tx.access_list: warm(tuple.address) # per EIP-2930/EIP-2929 rules - if tuple.checkCodeHash == true: - acc = load_account(tuple.address) - if acc exists and acc.code is not empty: - W.add(acc.codeHash) + acc = load_account(tuple.address) + if acc exists and acc.code is not empty: + W.add(acc.codeHash) # On successful CREATE/CREATE2: H = keccak256(B) @@ -108,31 +91,6 @@ else: link_codehash(new_account, H) ``` -## Rationale - -### Why Access-List Based Deduplication? - -The access-list approach provides several critical properties: - -1. Deterministic behavior: -The result depends only on the transaction's access list and current state, not on local database contents. All nodes compute the same gas cost. - -2. No reverse index requirement: -Unlike other approaches, this doesn't require maintaining a `codeHash → [accounts]` reverse index, which would add significant complexity and storage overhead. - -3. 
Leverages existing infrastructure: -Builds on EIP-2930 access lists and EIP-2929 access costs, requiring minimal protocol changes. - -4. Explicit opt-in: -Transactions must explicitly indicate which addresses to check. This prevents unexpected behavior and gives users/wallets control over gas optimization. - -5. Forward compatibility: -Pre-fork nodes ignore `checkCodeHash` and never grant discounts. Post-fork, all nodes enforce identical behavior. Wallets can optionally add the field to optimize gas, but its absence doesn't invalidate transactions. - -6. Avoiding have a code-root for state: -At this point, clients handle code storage on their own ways. They don't have any consensus on the deployed existing codes (besides that all of the ones referenced in account's codehash fields exist). -Changing this seems a lot more complex and unnecessary. - ### Same-Block Deployments Sequential transaction execution ensures that a deployment storing new code makes it visible to later transactions in the same block: @@ -142,41 +100,77 @@ Sequential transaction execution ensures that a deployment storing new code make - Code is stored under hash `H = keccak256(B)` 2. Later transaction `T_B` in the same block deploys the same bytecode `B`: - - `T_B` includes address `X` in its access list with `checkCodeHash: true` + - `T_B` includes address `X` in its access list - When `T_B` executes, `W` is built from the current state (including `T_A`'s changes) - - Since `X` now exists, `W` contains `H` + - Since `X` now exists and is in the access list, `W` contains `H` - `T_B`'s deployment gets the discount -> While this only tries to formalize the behaviour, it's important to remark that this kind of behaviour is complex. As it requires control over tx ordering in order to abuse. And Builders can't modify the Acess List as it is already signed with the Tx. Nevertheless, this could happen, thus is formalized here. 
+> While this only formalizes the behavior, note that exploiting it is difficult because it requires control over transaction ordering. Builders can't modify the access list, as it is already signed with the transaction. Nevertheless, the scenario can occur, so it is formalized here. ### Edge Case: Simultaneous New Deployments -If two transactions in the same block both deploy identical new bytecode and neither references an existing contract with that bytecode in their access lists, both will pay full `GAS_CODE_DEPOSIT * L`. This is acceptable because: +If two transactions in the same block both deploy identical new bytecode and neither references an existing contract with that bytecode in their access lists, both will pay full `GAS_CODE_DEPOSIT * L`. + +**Example:** +- Transaction `T_A` deploys bytecode `B` producing code hash `0xCA` at address `X` +- Transaction `T_B` (later in same block) also deploys bytecode `B` producing code hash `0xCA` at address `Y` +- If `T_B`'s access list does NOT include address `X`, then `T_B` pays full deposit cost +- Deduplication only occurs when the address of the already-deployed contract (`X`) is included in the access list + +This is acceptable because: - The first deployment cannot be known at transaction construction time +- Deduplication requires explicit opt-in via access list - This scenario is extremely rare in practice - The complexity of special handling is not worth the minimal benefit +## Rationale + +### Why Access-List Based Deduplication? + +The access-list approach provides several critical properties: + +1. Deterministic behavior: +The result depends only on the transaction's access list and current state, not on local database contents. All nodes compute the same gas cost. + +2. No reverse index requirement: +Unlike other approaches, this doesn't require maintaining a `codeHash → [accounts]` reverse index, which would add significant complexity and storage overhead. + +3. 
Leverages existing infrastructure: +Builds on EIP-2930 access lists and EIP-2929 access costs, requiring minimal protocol changes. + +4. Implicit optimization: +Any address included in the access list automatically contributes to deduplication. This provides automatic gas optimization without requiring explicit flags or special handling. + +5. Avoids chain split risks: +Since no new transaction structure is introduced, pre-fork and post-fork nodes handle the same transactions identically (just with different gas accounting post-fork). This eliminates the risk of chain splits from nodes rejecting transactions with unknown fields. + +6. Forward compatibility: +All nodes enforce identical behavior. Wallets can add addresses to access lists to optimize gas, but this doesn't change transaction validity. + +7. Avoid having a code-root for state: +At this point, clients each handle code storage in their own way. There is no consensus over the set of deployed code beyond the requirement that every code blob referenced by an account's codeHash field exists. +Changing this would be considerably more complex and is unnecessary. + ## Backwards Compatibility This proposal requires a scheduled network upgrade but is designed to be forward-compatible with existing transactions. 
**Transaction compatibility:** -- Transactions with `checkCodeHash` fields are syntactically valid both pre-fork and post-fork -- Pre-fork: The field is ignored; all deployments pay full costs -- Post-fork: The field determines deduplication behavior +- No changes to transaction structure - uses existing EIP-2930 access lists +- Existing transactions with access lists automatically benefit from deduplication post-fork +- Transactions without access lists behave identically to current behavior (no deduplication discount) **Wallet and tooling updates:** -- RPC methods like `eth_estimateGas` MUST account for potential deduplication discounts -- Wallets SHOULD provide UI for users to specify deduplication targets -- Transaction builders MAY automatically detect duplicate deployments and add appropriate access list entries +- RPC methods like `eth_estimateGas` SHOULD account for potential deduplication discounts when access lists are present +- Wallets MAY provide UI for users to add addresses to access lists for deduplication +- Transaction builders MAY automatically detect duplicate deployments and include relevant addresses in access lists **Node implementation:** -- Clients MUST ignore `checkCodeHash` pre-fork -- Clients MUST enforce deduplication semantics post-fork -- No changes to state trie structure or database schema are required +- No changes to state trie structure or database schema required +- No changes to transaction parsing or RLP encoding -### Example Transaction +## Example Transaction Deploying a contract with the same bytecode as the contract at `0x1234...5678`: @@ -188,14 +182,13 @@ Deploying a contract with the same bytecode as the contract at `0x1234...5678`: "accessList": [ { "address": "0x1234567890123456789012345678901234567890", - "storageKeys": [], - "checkCodeHash": true + "storageKeys": [] } ] } ``` -If the deployed bytecode hash matches `codeHash(0x1234...5678)`, the deployment receives the deduplication discount. 
+If the deployed bytecode hash matches `codeHash(0x1234...5678)` (which is automatically checked because the address is in the access list), the deployment receives the deduplication discount. ## Security Considerations From 6f9eb1f793acd7bb6f8d090cb3b59e655b03523a Mon Sep 17 00:00:00 2001 From: CPerezz Date: Thu, 23 Oct 2025 14:00:04 +0200 Subject: [PATCH 3/7] Address additional review comments from Guillaume - Simplify deduplication logic to more concise form - Remove pre-fork/post-fork language from chain split rationale - Clarify that only gas accounting changes at fork activation Co-authored-by: Guillaume Ballet --- EIPS/eip-draft.md | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/EIPS/eip-draft.md b/EIPS/eip-draft.md index 091581eaa168bf..a63a66e2d0ed7a 100644 --- a/EIPS/eip-draft.md +++ b/EIPS/eip-draft.md @@ -55,14 +55,11 @@ Where: When a contract creation transaction or opcode (`CREATE`/`CREATE2`) successfully completes and returns bytecode `B` of length `L`, compute `H = keccak256(B)` and apply the following gas charges: **Deduplication check:** -- If `H ∈ W`: Bytecode is a duplicate - - Do NOT charge `GAS_CODE_DEPOSIT * L` - - Link the new account's `codeHash` to the existing code hash `H` - - The bytecode `B` is NOT persisted (it already exists and it's the current behaviour) - If `H ∉ W`: Bytecode is new - Charge `GAS_CODE_DEPOSIT * L` - Persist bytecode `B` under hash `H` - Link the new account's `codeHash` to `H` +- Otherwise, link the new account's `codeHash` to the existing code hash `H` **Gas costs:** - The cost of reading `codeHash` for access-listed addresses is already covered by EIP-2929/2930 access costs (intrinsic access-list cost and cold→warm state access charges). @@ -143,7 +140,7 @@ Builds on EIP-2930 access lists and EIP-2929 access costs, requiring minimal pro Any address included in the access list automatically contributes to deduplication. 
This provides automatic gas optimization without requiring explicit flags or special handling. 5. Avoids chain split risks: -Since no new transaction structure is introduced, pre-fork and post-fork nodes handle the same transactions identically (just with different gas accounting post-fork). This eliminates the risk of chain splits from nodes rejecting transactions with unknown fields. +Since no new transaction structure is introduced, there's no risk of nodes rejecting transactions with unknown fields. The same transaction format works before and after the fork, with only the gas accounting rules changing at fork activation. 6. Forward compatibility: All nodes enforce identical behavior. Wallets can add addresses to access lists to optimize gas, but this doesn't change transaction validity. From f0e9b17535d4514ec6fde7addc5f0afdaf44591d Mon Sep 17 00:00:00 2001 From: CPerezz Date: Thu, 23 Oct 2025 14:04:24 +0200 Subject: [PATCH 4/7] Fix section structure: move Example Transaction under Reference Implementation --- EIPS/eip-draft.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/EIPS/eip-draft.md b/EIPS/eip-draft.md index a63a66e2d0ed7a..1ce98df4f8209f 100644 --- a/EIPS/eip-draft.md +++ b/EIPS/eip-draft.md @@ -167,7 +167,9 @@ This proposal requires a scheduled network upgrade but is designed to be forward - No changes to state trie structure or database schema required - No changes to transaction parsing or RLP encoding -## Example Transaction +## Reference Implementation + +### Example Transaction Deploying a contract with the same bytecode as the contract at `0x1234...5678`: From 690dea8c49609395ac24fc2e4d8d12089e3ed2d8 Mon Sep 17 00:00:00 2001 From: CPerezz Date: Thu, 23 Oct 2025 14:45:16 +0200 Subject: [PATCH 5/7] Address review: change 'explicit' to 'implicit' and remove redundant sections --- EIPS/{eip-draft.md => eip-8058.md} | 15 ++------------- 1 file changed, 2 insertions(+), 13 deletions(-) rename EIPS/{eip-draft.md => eip-8058.md} 
(94%) diff --git a/EIPS/eip-draft.md b/EIPS/eip-8058.md similarity index 94% rename from EIPS/eip-draft.md rename to EIPS/eip-8058.md index 1ce98df4f8209f..b71d54092fc39d 100644 --- a/EIPS/eip-draft.md +++ b/EIPS/eip-8058.md @@ -1,9 +1,9 @@ --- -eip: draft +eip: 8058 title: Contract Bytecode Deduplication Discount description: Reduces gas costs for deploying duplicate contract bytecode via access-list based mechanism author: Carlos Perez (@CPerezz), Wei-Han (@weiihann), Guillaume Ballet (@gballet) -discussions-to: https://ethereum-magicians.org/t/eip-8037-state-creation-gas-cost-increase/25694 +discussions-to: https://ethereum-magicians.org/t/eip-8058-contract-bytecode-deduplication-discount/25933 status: Draft type: Standards Track category: Core @@ -202,17 +202,6 @@ The access-list mechanism prevents DoS attacks because: - No additional state lookups or database queries are required - The deduplication check is O(1) (set membership test) -### Access List Size - -Large access lists with many `checkCodeHash: true` entries could increase transaction size, but: -- Access lists are already part of transaction calldata and priced accordingly -- The `checkCodeHash` field adds minimal bytes -- Users have economic incentive to only include necessary entries - -### State Divergence - -The explicit access-list approach prevents state divergence issues that would arise from implicit database lookups. All nodes compute identical gas costs regardless of their sync mode or database contents. - ## Copyright Copyright and related rights waived via [CC0](../LICENSE.md). 

From 3580318a7b0ec5c640d57407e0162ec93bdd9e83 Mon Sep 17 00:00:00 2001
From: CPerezz
Date: Thu, 23 Oct 2025 14:51:00 +0200
Subject: [PATCH 6/7] Fix markdown linting: add blank lines before lists

---
 EIPS/eip-8058.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/EIPS/eip-8058.md b/EIPS/eip-8058.md
index b71d54092fc39d..3d2ca6601bcdf7 100644
--- a/EIPS/eip-8058.md
+++ b/EIPS/eip-8058.md
@@ -22,6 +22,7 @@ This EIP becomes particularly relevant with the adoption of EIP-8037, which incr
 Currently, deploying duplicate bytecode costs the same as deploying new bytecode, even though execution clients don't store duplicated code in their databases. When the same bytecode is deployed multiple times, clients store only one copy and have multiple accounts point to the same code hash. Under EIP-8037's proposed gas costs, deploying a 24kB contract costs approximately 46.6M gas for code deposit alone (`1,900 × 24,576`). This charge is unfair for duplicate deployments where no additional storage is consumed.

 A naive "check if code exists in database" approach would break consensus because different nodes have different database contents due to mostly Sync-mode differences:
+
 - Full-sync nodes: Retain all historical code, including from reverted/reorged transactions
 - Snap-sync nodes: initially, only store code referenced in the pivot state tries, and those accumulated past the start of the sync

@@ -46,6 +47,7 @@ W = { codeHash(a) | a ∈ accessList, a exists in state, a has code }
 ```

 Where:
+
 - `W` is built from state at the **start** of transaction execution (before any state changes)
 - **All** addresses in the access list are checked - if they exist in state and have deployed code, their code hash is added to `W`
 - Empty accounts or accounts with no code do not contribute to `W`
@@ -55,6 +57,7 @@ Where:
 When a contract creation transaction or opcode (`CREATE`/`CREATE2`) successfully completes and returns bytecode `B` of length `L`, compute `H = keccak256(B)` and apply the following gas charges:

 **Deduplication check:**
+
 - If `H ∉ W`: Bytecode is new
   - Charge `GAS_CODE_DEPOSIT * L`
   - Persist bytecode `B` under hash `H`
@@ -62,6 +65,7 @@ When a contract creation transaction or opcode (`CREATE`/`CREATE2`) successfully
 - Otherwise, link the new account's `codeHash` to the existing code hash `H`

 **Gas costs:**
+
 - The cost of reading `codeHash` for access-listed addresses is already covered by EIP-2929/2930 access costs (intrinsic access-list cost and cold→warm state access charges).
 - No additional gas cost is introduced for the deduplication check itself.

@@ -109,6 +113,7 @@ Sequential transaction execution ensures that a deployment storing new code make
 If two transactions in the same block both deploy identical new bytecode and neither references an existing contract with that bytecode in their access lists, both will pay full `GAS_CODE_DEPOSIT * L`.

 **Example:**
+
 - Transaction `T_A` deploys bytecode `B` producing code hash `0xCA` at address `X`
 - Transaction `T_B` (later in same block) also deploys bytecode `B` producing code hash `0xCA` at address `Y`
 - If `T_B`'s access list does NOT include address `X`, then `T_B` pays full deposit cost
@@ -154,16 +159,19 @@ Changing this seems a lot more complex and unnecessary.
 This proposal requires a scheduled network upgrade but is designed to be forward-compatible with existing transactions.

 **Transaction compatibility:**
+
 - No changes to transaction structure - uses existing EIP-2930 access lists
 - Existing transactions with access lists automatically benefit from deduplication post-fork
 - Transactions without access lists behave identically to current behavior (no deduplication discount)

 **Wallet and tooling updates:**
+
 - RPC methods like `eth_estimateGas` SHOULD account for potential deduplication discounts when access lists are present
 - Wallets MAY provide UI for users to add addresses to access lists for deduplication
 - Transaction builders MAY automatically detect duplicate deployments and include relevant addresses in access lists

 **Node implementation:**
+
 - No changes to state trie structure or database schema required
 - No changes to transaction parsing or RLP encoding

@@ -198,6 +206,7 @@ The deduplication mechanism ensures that gas costs accurately reflect actual res
 ### Denial of Service

 The access-list mechanism prevents DoS attacks because:
+
 - The cost of reading `codeHash` is already covered by EIP-2929/2930
 - No additional state lookups or database queries are required
 - The deduplication check is O(1) (set membership test)

From 28316e2206c2995137aa51634d30909f32d23e4f Mon Sep 17 00:00:00 2001
From: CPerezz
Date: Thu, 23 Oct 2025 15:47:06 +0200
Subject: [PATCH 7/7] Address review: fix author name and change explicit to implicit

---
 EIPS/eip-8058.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/EIPS/eip-8058.md b/EIPS/eip-8058.md
index 3d2ca6601bcdf7..69a1d2b0ec12d1 100644
--- a/EIPS/eip-8058.md
+++ b/EIPS/eip-8058.md
@@ -2,7 +2,7 @@
 eip: 8058
 title: Contract Bytecode Deduplication Discount
 description: Reduces gas costs for deploying duplicate contract bytecode via access-list based mechanism
-author: Carlos Perez (@CPerezz), Wei-Han (@weiihann), Guillaume Ballet (@gballet)
+author: Carlos Perez (@CPerezz), Wei Han Ng (@weiihann), Guillaume Ballet (@gballet)
 discussions-to: https://ethereum-magicians.org/t/eip-8058-contract-bytecode-deduplication-discount/25933
 status: Draft
 type: Standards Track
@@ -28,7 +28,7 @@ A naive "check if code exists in database" approach would break consensus becaus

 Empirical analysis reveals that approximately 27,869 bytecodes existed in full-synced node databases with no live account pointing to them (as of the Cancun fork). A database lookup `CodeExists(hash)` would yield different results on different nodes, causing different gas costs and breaking consensus.

-This proposal solves the problem by making deduplication checks explicit and deterministic through access lists, ensuring all nodes compute identical gas costs regardless of their database state. (Notice here that even if fully-synced clients have more codes, there are no accounts whose codeHash actually is referencing them. Thus, users can't profit from such discounts which keeps consensus safe).
+This proposal solves the problem by making deduplication checks implicit and deterministic through access lists, ensuring all nodes compute identical gas costs regardless of their database state. (Notice here that even if fully-synced clients have more codes, there are no accounts whose codeHash actually is referencing them. Thus, users can't profit from such discounts which keeps consensus safe).

 ## Specification

@@ -122,7 +122,7 @@ If two transactions in the same block both deploy identical new bytecode and nei
 This is acceptable because:

 - The first deployment cannot be known at transaction construction time
-- Deduplication requires implicit opt-in via access list
+- Deduplication requires implicit opt-in via access list
 - This scenario is extremely rare in practice
 - The complexity of special handling is not worth the minimal benefit
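
Reviewer note: the charging rule touched by the patches above (build `W` from the access-listed addresses that hold code, then charge `GAS_CODE_DEPOSIT * L` only when `H ∉ W`) can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the patch series or of any client: `Account`, `build_witness_set`, and `code_deposit_gas` are hypothetical names, code hashes are passed in as opaque bytes rather than computed with keccak256, and `GAS_CODE_DEPOSIT` is set to the current 200 gas per byte.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set

# Current per-byte code deposit cost; EIP-8037 proposes raising it to 1,900.
GAS_CODE_DEPOSIT = 200


@dataclass(frozen=True)
class Account:
    """Hypothetical minimal account record; code_hash is None when there is no code."""
    code_hash: Optional[bytes]


def build_witness_set(access_list: Iterable[str], state: Dict[str, Account]) -> Set[bytes]:
    """W = { codeHash(a) | a in accessList, a exists in state, a has code }.

    `state` is assumed to be a snapshot taken at the start of the transaction.
    """
    witness: Set[bytes] = set()
    for address in access_list:
        account = state.get(address)
        # Missing accounts and accounts without code do not contribute to W.
        if account is not None and account.code_hash is not None:
            witness.add(account.code_hash)
    return witness


def code_deposit_gas(code_hash: bytes, code_length: int, witness: Set[bytes]) -> int:
    """Return the code deposit charge for a successful deployment."""
    if code_hash in witness:
        # Duplicate bytecode: the new account only links to the existing hash.
        return 0
    # New bytecode: pay the full per-byte deposit.
    return GAS_CODE_DEPOSIT * code_length
```

In the same-block example from the diff, calling `code_deposit_gas` for `T_B` with a witness set built from an access list that contains `X` yields no deposit charge, while an access list that omits `X` yields the full `GAS_CODE_DEPOSIT * L`.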