The general idea is to have a clear and structured format for EVM byte-code, along the lines of one of these two approaches:

Simple approach

magic 8-bits
version 8-bits
code size 16-bits
…code… variable length
jumptable size 16-bits
…jumptable… variable length
metadata size 16-bits
…metadata… variable length
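
As a rough illustration of the simple layout above, here is a minimal Python sketch of a parser. The byte order (big-endian), the names `SimpleContainer`, `parse_simple` and `_read_block`, and the rejection of trailing bytes are all assumptions made for the example; the notes do not specify them, and no particular magic value is assumed.

```python
import struct
from typing import NamedTuple, Tuple


class SimpleContainer(NamedTuple):
    magic: int
    version: int
    code: bytes
    jumptable: bytes
    metadata: bytes


def _read_block(buf: bytes, offset: int) -> Tuple[bytes, int]:
    """Read a 16-bit size followed by that many bytes; return (data, new offset)."""
    if offset + 2 > len(buf):
        raise ValueError("truncated size field")
    (size,) = struct.unpack_from(">H", buf, offset)  # big-endian is an assumption
    offset += 2
    if offset + size > len(buf):
        raise ValueError("truncated block")
    return buf[offset:offset + size], offset + size


def parse_simple(buf: bytes) -> SimpleContainer:
    """Parse: magic (8 bits), version (8 bits), then code, jumptable, metadata."""
    if len(buf) < 2:
        raise ValueError("truncated header")
    magic, version = buf[0], buf[1]
    code, off = _read_block(buf, 2)
    jumptable, off = _read_block(buf, off)
    metadata, off = _read_block(buf, off)
    if off != len(buf):
        # Rejecting trailing bytes is a choice made for this sketch.
        raise ValueError("trailing bytes after metadata")
    return SimpleContainer(magic, version, code, jumptable, metadata)
```

Because every block carries its own size, the parser never has to scan the code itself to find where the jumptable or metadata begins.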

Structured approach

magic 32-bits
version 8-bits

Inspired by other formats, we have sections following the header. Each section is formatted as:

section id 8-bits
section size 16-bits
section data variable length

We define three sections, which must appear in order of non-decreasing id (S[n+1].id >= S[n].id):

  1. Code (id = 0) → exactly one code section
  2. Jumptable (id = 1) → exactly one jumptable section
  3. Data (id = 2) → zero or more data sections
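
The section rules of the structured approach could be checked at contract creation roughly as follows. This is only a sketch: big-endian byte order, the function name `parse_sections`, the constant names, and the rejection of unknown section ids are assumptions for illustration, not part of the proposal.

```python
import struct
from typing import List, Tuple

SECTION_CODE, SECTION_JUMPTABLE, SECTION_DATA = 0, 1, 2
KNOWN_SECTIONS = (SECTION_CODE, SECTION_JUMPTABLE, SECTION_DATA)


def parse_sections(buf: bytes) -> Tuple[int, int, List[Tuple[int, bytes]]]:
    """Return (magic, version, [(section id, section data), ...]) or raise ValueError."""
    if len(buf) < 5:
        raise ValueError("truncated header")
    # Header: 32-bit magic, 8-bit version (big-endian is an assumption).
    magic, version = struct.unpack_from(">IB", buf, 0)
    sections: List[Tuple[int, bytes]] = []
    off = 5
    while off < len(buf):
        if off + 3 > len(buf):
            raise ValueError("truncated section header")
        # Section header: 8-bit id, 16-bit size.
        sec_id, size = struct.unpack_from(">BH", buf, off)
        off += 3
        if off + size > len(buf):
            raise ValueError("truncated section data")
        if sec_id not in KNOWN_SECTIONS:
            # Rejecting unknown ids is a choice made for this sketch.
            raise ValueError("unknown section id")
        if sections and sec_id < sections[-1][0]:
            raise ValueError("sections out of order")
        sections.append((sec_id, buf[off:off + size]))
        off += size
    ids = [sec_id for sec_id, _ in sections]
    if ids.count(SECTION_CODE) != 1:
        raise ValueError("expected exactly one code section")
    if ids.count(SECTION_JUMPTABLE) != 1:
        raise ValueError("expected exactly one jumptable section")
    return magic, version, sections
```

The ordering rule keeps the check to a single pass: once a section with a higher id has been seen, a lower id can never be valid again.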

Why?

The goal is to analyze the byte-code once, at contract creation, which greatly simplifies analysis at runtime. The version field in the format also makes it easier to introduce or deprecate features without adding a separate version field to the account.

Timeline

London

EIP-3541 is rolled out, rejecting any new contract whose code starts with the 0xEF byte. After the fork block, we’ll inspect all existing code on chain to select the prefix.
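
The EIP-3541 rule itself is a one-byte prefix check on newly deployed code. A minimal sketch, with `violates_eip3541` as a hypothetical helper name (real clients apply this check inside their contract-creation logic):

```python
def violates_eip3541(code: bytes) -> bool:
    """Return True if newly deployed code starts with the reserved 0xEF byte."""
    # Slicing instead of indexing means empty code simply returns False.
    return code[:1] == b"\xef"
```

Code that already exists on chain is not affected by this check, which is why the existing code still has to be inspected before the final prefix is selected.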

Shanghai

The format described in EIP-3540 introduces a simple and extensible container that requires only a minimal set of changes to clients and languages, and it introduces the concept of validation.

Cancun and later