Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add INCR privileged instructions #734

Open
wants to merge 14 commits into
base: develop
Choose a base branch
from
2 changes: 1 addition & 1 deletion .env
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ CPU_CIRCUIT_SIZE=8..21
KECCAK_CIRCUIT_SIZE=4..20
KECCAK_SPONGE_CIRCUIT_SIZE=8..17
LOGIC_CIRCUIT_SIZE=4..21
MEMORY_CIRCUIT_SIZE=17..24
MEMORY_CIRCUIT_SIZE=16..24
MEMORY_BEFORE_CIRCUIT_SIZE=16..23
MEMORY_AFTER_CIRCUIT_SIZE=7..23
POSEIDON_CIRCUIT_SIZE=4..25
19 changes: 12 additions & 7 deletions book/src/cpu_execution/privileged_instructions.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,37 +50,42 @@ ecPairing precompiles.
resetting memory sections (by calling MSTORE_32BYTES_32 with the
value 0).

7. `PROVER_INPUT`. Pushes a single prover input onto the stack.
7. `INCR`. Reads the `N`th element of the stack, and increments it in-place
by one without pushing or popping. There are 4 INCR operations, namely
INCR1, INCR2, INCR3, INCR4 to increment respectively the 1st, 2nd, 3rd
and 4th stack elements.

8. `GET_CONTEXT`. Pushes the current context onto the stack. The kernel
8. `PROVER_INPUT`. Pushes a single prover input onto the stack.

9. `GET_CONTEXT`. Pushes the current context onto the stack. The kernel
always has context 0.

9. `SET_CONTEXT`. Pops the top element of the stack and updates the
10. `SET_CONTEXT`. Pops the top element of the stack and updates the
current context to this value. It is usually used when calling
another contract or precompile, to distinguish the caller from the
callee.

10. `MLOAD_32BYTES`. Pops 2 elements from the stack (a Memory address,
11. `MLOAD_32BYTES`. Pops 2 elements from the stack (a Memory address,
and then a length $\ell$), and pushes a value onto the stack. The
pushed value corresponds to the U256 integer read from the
big-endian sequence of length $\ell$ from the memory address being
fetched. Note that an empty length is not valid, nor is a length
greater than 32 (as a U256 consists in at most 32 bytes). Missing
these conditions will result in an unverifiable proof.

11. `EXIT_KERNEL`. Pops 1 element from the stack. This instruction is
12. `EXIT_KERNEL`. Pops 1 element from the stack. This instruction is
used at the end of a syscall, before proceeding to the rest of the
execution logic. The popped element, *kexit_info*, contains several
pieces of information like the current program counter, the current
amount of gas used, and whether we are in kernel (i.e. privileged)
mode or not.

12. `MLOAD_GENERAL`. Pops 1 elements (a Memory address), and pushes the
13. `MLOAD_GENERAL`. Pops 1 elements (a Memory address), and pushes the
value stored at this memory address onto the stack. It can read any
memory location, general (similarly to MLOAD (0x51) instruction) or
privileged.

13. `MSTORE_GENERAL`. Pops 2 elements (a value and a Memory address),
14. `MSTORE_GENERAL`. Pops 2 elements (a value and a Memory address),
and writes the popped value from the stack at the fetched address.
It can write to any memory location, general (similarly to MSTORE
(0x52) / MSTORE8 (0x53) instructions) or privileged.
4 changes: 3 additions & 1 deletion evm_arithmetization/src/all_stark.rs
Original file line number Diff line number Diff line change
Expand Up @@ -159,8 +159,10 @@ pub(crate) fn all_cross_table_lookups<F: Field>() -> Vec<CrossTableLookup<F>> {
/// `CrossTableLookup` for `ArithmeticStark`, to connect it with the `Cpu`
/// module.
fn ctl_arithmetic<F: Field>() -> CrossTableLookup<F> {
let cpu_arithmetic_looking = cpu_stark::ctl_arithmetic_base_rows();
let cpu_increment_looking = cpu_stark::ctl_arithmetic_incr_op();
CrossTableLookup::new(
vec![cpu_stark::ctl_arithmetic_base_rows()],
vec![cpu_arithmetic_looking, cpu_increment_looking],
arithmetic_stark::ctl_arithmetic_rows(),
)
}
Expand Down
2 changes: 2 additions & 0 deletions evm_arithmetization/src/cpu/columns/ops.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,8 @@ pub(crate) struct OpsColumnsView<T: Copy> {
pub m_op_general: T,
/// Combines PC and PUSH0
pub pc_push0: T,
/// Flag for INCR_N instructions.
pub incr: T,

/// Flag for syscalls.
pub syscall: T,
Expand Down
1 change: 1 addition & 0 deletions evm_arithmetization/src/cpu/contextops.rs
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ const KEEPS_CONTEXT: OpsColumnsView<bool> = OpsColumnsView {
m_op_32bytes: true,
exit_kernel: true,
m_op_general: true,
incr: true,
syscall: true,
exception: true,
};
Expand Down
41 changes: 38 additions & 3 deletions evm_arithmetization/src/cpu/cpu_stark.rs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A possible optimization to avoid introducing an extra CTL just for INCR1 is to keep the same arithmetic CTL for all cases, with memory channels 1 and 2. Then you add INCR1-specific constraints to check that lv.stack_top == mem_channel[1] and nv.stack_top == mem_channel[2].

It requires using the value columns of some disable channels, but I couldn't find anywhere in the code that it wasn't possible.

Original file line number Diff line number Diff line change
Expand Up @@ -15,14 +15,14 @@ use starky::lookup::{Column, Filter};
use starky::stark::Stark;

use super::columns::CpuColumnsView;
use super::halt;
use super::kernel::constants::context_metadata::ContextMetadata;
use super::kernel::opcodes::get_opcode;
use super::membus::{NUM_CHANNELS, NUM_GP_CHANNELS};
use crate::all_stark::{EvmStarkFrame, Table};
use crate::cpu::columns::{COL_MAP, NUM_CPU_COLUMNS};
use crate::cpu::{
byte_unpacking, clock, contextops, control_flow, decode, dup_swap, gas, jumps, membus, memio,
modfp254, pc, push0, shift, simple_logic, stack, syscalls_exceptions,
byte_unpacking, clock, contextops, control_flow, decode, dup_swap, gas, halt, incr, jumps,
membus, memio, modfp254, pc, push0, shift, simple_logic, stack, syscalls_exceptions,
};
use crate::memory::segments::Segment;
use crate::memory::VALUE_LIMBS;
Expand Down Expand Up @@ -131,6 +131,39 @@ pub(crate) fn ctl_arithmetic_base_rows<F: Field>() -> TableWithColumns<F> {
)
}

/// Returns the `TableWithColumns` for the CPU rows calling arithmetic
/// `ADD` operations through the `INCR` privileged instructions.
pub(crate) fn ctl_arithmetic_incr_op<F: Field>() -> TableWithColumns<F> {
// Instead of taking single columns, we reconstruct the entire opcode value
// directly.
let mut columns = vec![Column::constant(F::from_canonical_u8(get_opcode("ADD")))];

// Read value
columns.extend(Column::singles(COL_MAP.mem_channels[1].value));

// Fixed second operand (`U256::one()`)
{
columns.push(Column::constant(F::ONE));
for _ in 1..VALUE_LIMBS {
columns.push(Column::constant(F::ZERO));
}
}

// Ignored third operand, `ADD` is a binary operation
for _ in 0..VALUE_LIMBS {
columns.push(Column::constant(F::ZERO));
}

// Returned value
columns.extend(Column::singles(COL_MAP.mem_channels[2].value));

TableWithColumns::new(
*Table::Cpu,
columns,
Filter::new_simple(Column::single(COL_MAP.op.incr)),
)
}

/// Returns a column containing stale contexts.
pub(crate) fn ctl_context_pruning_looked<F: Field>() -> TableWithColumns<F> {
TableWithColumns::new(
Expand Down Expand Up @@ -612,6 +645,7 @@ impl<F: RichField + Extendable<D>, const D: usize> Stark<F, D> for CpuStark<F, D
dup_swap::eval_packed(local_values, next_values, yield_constr);
gas::eval_packed(local_values, next_values, yield_constr);
halt::eval_packed(local_values, next_values, yield_constr);
incr::eval_packed(local_values, next_values, yield_constr);
jumps::eval_packed(local_values, next_values, yield_constr);
membus::eval_packed(local_values, yield_constr);
memio::eval_packed(local_values, next_values, yield_constr);
Expand Down Expand Up @@ -647,6 +681,7 @@ impl<F: RichField + Extendable<D>, const D: usize> Stark<F, D> for CpuStark<F, D
dup_swap::eval_ext_circuit(builder, local_values, next_values, yield_constr);
gas::eval_ext_circuit(builder, local_values, next_values, yield_constr);
halt::eval_ext_circuit(builder, local_values, next_values, yield_constr);
incr::eval_ext_circuit(builder, local_values, next_values, yield_constr);
jumps::eval_ext_circuit(builder, local_values, next_values, yield_constr);
membus::eval_ext_circuit(builder, local_values, yield_constr);
memio::eval_ext_circuit(builder, local_values, next_values, yield_constr);
Expand Down
5 changes: 3 additions & 2 deletions evm_arithmetization/src/cpu/decode.rs
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ use starky::constraint_consumer::{ConstraintConsumer, RecursiveConstraintConsume

use crate::cpu::columns::{CpuColumnsView, COL_MAP};

const OPCODES_LEN: usize = if cfg!(feature = "cdk_erigon") { 6 } else { 5 };
const OPCODES_LEN: usize = if cfg!(feature = "cdk_erigon") { 7 } else { 6 };

/// List of opcode blocks
/// Each block corresponds to exactly one flag, and each flag corresponds to
Expand Down Expand Up @@ -46,7 +46,8 @@ const OPCODES: [(u8, usize, bool, usize); OPCODES_LEN] = [
// JUMPDEST and KECCAK_GENERAL are handled manually here.
(0x56, 1, false, COL_MAP.op.jumps), // 0x56-0x57
(0x80, 5, false, COL_MAP.op.dup_swap), // 0x80-0x9f
(0xf6, 1, true, COL_MAP.op.context_op), //0xf6-0xf7
(0xe0, 2, true, COL_MAP.op.incr), // 0xe0-0xe3
(0xf6, 1, true, COL_MAP.op.context_op), // 0xf6-0xf7
(0xf9, 0, true, COL_MAP.op.exit_kernel),
// MLOAD_GENERAL and MSTORE_GENERAL flags are handled manually here.
];
Expand Down
4 changes: 2 additions & 2 deletions evm_arithmetization/src/cpu/dup_swap.rs
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@ fn channels_equal_ext_circuit<F: RichField + Extendable<D>, const D: usize>(
///
/// `offset` is the stack index before this instruction is executed, e.g. `0`
/// for the top of the stack.
fn constrain_channel_packed<P: PackedField>(
pub(crate) fn constrain_channel_packed<P: PackedField>(
is_read: bool,
filter: P,
offset: P,
Expand All @@ -64,7 +64,7 @@ fn constrain_channel_packed<P: PackedField>(
///
/// `offset` is the stack index before this instruction is executed, e.g. `0`
/// for the top of the stack.
fn constrain_channel_ext_circuit<F: RichField + Extendable<D>, const D: usize>(
pub(crate) fn constrain_channel_ext_circuit<F: RichField + Extendable<D>, const D: usize>(
builder: &mut CircuitBuilder<F, D>,
is_read: bool,
filter: ExtensionTarget<D>,
Expand Down
1 change: 1 addition & 0 deletions evm_arithmetization/src/cpu/gas.rs
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ const SIMPLE_OPCODES: OpsColumnsView<Option<u32>> = OpsColumnsView {
m_op_32bytes: KERNEL_ONLY_INSTR,
exit_kernel: None,
m_op_general: KERNEL_ONLY_INSTR,
incr: KERNEL_ONLY_INSTR,
syscall: None,
exception: None,
};
Expand Down
76 changes: 76 additions & 0 deletions evm_arithmetization/src/cpu/incr.rs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some constraints seem to be missing, and some seem to be unneeded.

The current set of constraints work as intended for INCR2-4, but for INCR1 we are not checking that the output channel is equal to the next top of the stack.
Moreover, the value read in the input channel is not constrained to match the current top of the stack (tests pass so it seems to be the case, but it sounds like coincidence to me).

I think the clean way to do it is to filter all of the current constraints with lv.opcode_bits[0] (with a new one making sure that the top of the stack doesn't change), and handle INCR1 separately with filter 1 - lv.opcode_bits[0] (you can even disable the memory channels to save some memory rows).

Copy link
Collaborator Author

@Nashtare Nashtare Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle INCR1 separately with filter 1 - lv.opcode_bits[0]

This would be catching INCR3 too, which wouldn't work. If I use both bits I'll get a degree 4, but I can introduce helper columns. I'll look at it.

Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
use plonky2::field::extension::Extendable;
use plonky2::field::packed::PackedField;
use plonky2::field::types::Field;
use plonky2::hash::hash_types::RichField;
use plonky2::iop::ext_target::ExtensionTarget;
use plonky2::plonk::circuit_builder::CircuitBuilder;
use starky::constraint_consumer::{ConstraintConsumer, RecursiveConstraintConsumer};

use super::dup_swap::{constrain_channel_ext_circuit, constrain_channel_packed};
use crate::cpu::columns::CpuColumnsView;

/// Evaluates the constraints for the DUP and SWAP opcodes.
Nashtare marked this conversation as resolved.
Show resolved Hide resolved
Nashtare marked this conversation as resolved.
Show resolved Hide resolved
pub(crate) fn eval_packed<P: PackedField>(
lv: &CpuColumnsView<P>,
nv: &CpuColumnsView<P>,
yield_constr: &mut ConstraintConsumer<P>,
) {
let filter = lv.op.incr;

let n = lv.opcode_bits[0]
+ lv.opcode_bits[1] * P::Scalar::from_canonical_u64(2)
+ lv.opcode_bits[2] * P::Scalar::from_canonical_u64(4);
Nashtare marked this conversation as resolved.
Show resolved Hide resolved
Nashtare marked this conversation as resolved.
Show resolved Hide resolved

// Disable the partial channel.
yield_constr.constraint(lv.op.incr * lv.partial_channel.used);

// Constrain the input channel's address, `is_read` and `used` fields.
let read_channel = &lv.mem_channels[1];
constrain_channel_packed(true, filter, n, read_channel, lv, yield_constr);

// Constrain the output channel's address, `is_read` and `used` fields.
let write_channel = &lv.mem_channels[2];
constrain_channel_packed(false, filter, n, write_channel, lv, yield_constr);

// Constrain the unchanged stack len.
yield_constr.constraint_transition(filter * (nv.stack_len - lv.stack_len));
}

/// Circuit version of `eval_packed`.
/// Evaluates the constraints for the DUP and SWAP opcodes.
pub(crate) fn eval_ext_circuit<F: RichField + Extendable<D>, const D: usize>(
builder: &mut CircuitBuilder<F, D>,
lv: &CpuColumnsView<ExtensionTarget<D>>,
nv: &CpuColumnsView<ExtensionTarget<D>>,
yield_constr: &mut RecursiveConstraintConsumer<F, D>,
) {
let filter = lv.op.incr;

let n = lv.opcode_bits[..3].iter().enumerate().fold(
builder.zero_extension(),
|cumul, (i, &bit)| {
builder.mul_const_add_extension(F::from_canonical_u64(1 << i), bit, cumul)
},
);

// Disable the partial channel.
{
let constr = builder.mul_extension(lv.op.incr, lv.partial_channel.used);
yield_constr.constraint(builder, constr);
}

// Constrain the input channel's address, `is_read` and `used` fields.
let read_channel = &lv.mem_channels[1];
constrain_channel_ext_circuit(builder, true, filter, n, read_channel, lv, yield_constr);

// Constrain the output channel's address, `is_read` and `used` fields.
let write_channel = &lv.mem_channels[2];
constrain_channel_ext_circuit(builder, false, filter, n, write_channel, lv, yield_constr);

// Constrain the unchanged stack len.
{
let diff = builder.sub_extension(nv.stack_len, lv.stack_len);
let constr = builder.mul_extension(filter, diff);
yield_constr.constraint_transition(builder, constr);
}
}
2 changes: 1 addition & 1 deletion evm_arithmetization/src/cpu/kernel/asm/balance.asm
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ global balance:
%mpt_read_state_trie
// stack: account_ptr, retdest
DUP1 ISZERO %jumpi(retzero) // If the account pointer is null, return 0.
%add_const(1)
INCR1
// stack: balance_ptr, retdest
%mload_trie_data
// stack: balance, retdest
Expand Down
6 changes: 2 additions & 4 deletions evm_arithmetization/src/cpu/kernel/asm/bignum/add.asm
Original file line number Diff line number Diff line change
Expand Up @@ -40,12 +40,10 @@ add_loop:
// stack: a_cur_addr, c[cur], carry_new, base_addr, i, a_cur_loc, b_cur_loc, retdest
%swap_mstore
// stack: carry_new, base_addr, i, a_cur_loc, b_cur_loc, retdest
SWAP3
%increment
SWAP3
INCR4
// stack: carry_new, base_addr, i, a_cur_loc + 1, b_cur_loc, retdest
SWAP4
%increment
INCR1
SWAP4
// stack: carry_new, base_addr, i, a_cur_loc + 1, b_cur_loc + 1, retdest
SWAP2
Expand Down
4 changes: 2 additions & 2 deletions evm_arithmetization/src/cpu/kernel/asm/bignum/addmul.asm
Original file line number Diff line number Diff line change
Expand Up @@ -87,11 +87,11 @@ addmul_loop:
// stack: i-1, base_addr, carry_limb, a_cur_loc, b_cur_loc, val, retdest
SWAP3
// stack: a_cur_loc, base_addr, carry_limb, i-1, b_cur_loc, val, retdest
%increment
INCR1
// stack: a_cur_loc+1, base_addr, carry_limb, i-1, b_cur_loc, val, retdest
SWAP4
// stack: b_cur_loc, base_addr, carry_limb, i-1, a_cur_loc+1, val, retdest
%increment
INCR1
// stack: b_cur_loc+1, base_addr, carry_limb, i-1, a_cur_loc+1, val, retdest
%stack (b, addr, c, i, a) -> (c, addr, i, a, b)
// stack: carry_limb, base_addr, i-1, a_cur_loc+1, b_cur_loc+1, val, retdest
Expand Down
4 changes: 1 addition & 3 deletions evm_arithmetization/src/cpu/kernel/asm/bignum/isone.asm
Original file line number Diff line number Diff line change
Expand Up @@ -28,8 +28,6 @@ starts_with_one:
// Starts with one, so check that the remaining limbs are zero.
// stack: len, start_loc, retdest
%decrement
SWAP1
%increment
SWAP1
INCR2
// stack: len-1, start_loc+1, retdest
%jump(iszero_bignum)
2 changes: 1 addition & 1 deletion evm_arithmetization/src/cpu/kernel/asm/bignum/iszero.asm
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ iszero_loop:
// stack: cur_val, cur_loc, end_loc, retdest
%jumpi(neqzero)
// stack: cur_loc, end_loc, retdest
%increment
INCR1
// stack: cur_loc + 1, end_loc, retdest
%stack (vals: 2) -> (vals, vals)
// stack: cur_loc + 1, end_loc, cur_loc + 1, end_loc, retdest
Expand Down
13 changes: 6 additions & 7 deletions evm_arithmetization/src/cpu/kernel/asm/bignum/modmul.asm
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@ modmul_remainder_loop:
DUP4 ADD // out_addr_i
%swap_mstore
// stack: i, base_addr, len, a_loc, b_loc, m_loc, out_loc, s1, s2, s3, retdest
%increment
INCR1
DUP3
DUP2
// stack: i+1, len, i+1, base_addr, len, a_loc, b_loc, m_loc, out_loc, s1, s2, s3, retdest
Expand Down Expand Up @@ -89,7 +89,7 @@ modmul_quotient_loop:
DUP4 ADD // s1_addr_i
%swap_mstore
// stack: i, base_addr, 2*len, len, a_loc, b_loc, m_loc, out_loc, s1, s2, s3, retdest
%increment
INCR1
DUP3
DUP2
// stack: i+1, 2*len, i+1, base_addr, 2*len, len, a_loc, b_loc, m_loc, out_loc, s1, s2, s3, retdest
Expand Down Expand Up @@ -156,13 +156,12 @@ modmul_check_loop:
SWAP1
%decrement
// stack: n-1, base_addr, i, j, retdest
Nashtare marked this conversation as resolved.
Show resolved Hide resolved
INCR3
SWAP2
%increment
// stack: i+1, base_addr, n-1, j, retdest
SWAP3
%increment
// stack: j+1, base_addr, n-1, i+1, retdest
%stack (j, addr, n, i) -> (n, addr, n, i, j)
INCR4
// stack: i+1, base_addr, n-1, j+1, retdest
%stack (i, addr, n) -> (n, addr, n, i)
// stack: n-1, base_addr, n-1, i+1, j+1, retdest
%jumpi(modmul_check_loop)
// end of modmul_check_loop
Expand Down
4 changes: 2 additions & 2 deletions evm_arithmetization/src/cpu/kernel/asm/bignum/mul.asm
Original file line number Diff line number Diff line change
Expand Up @@ -47,11 +47,11 @@ mul_addmul_return:
%decrement
// stack: n-1, base_addr, len, a_start_loc, bi, output_cur, retdest
SWAP4
%increment
INCR1
SWAP4
// stack: n-1, base_addr, len, a_start_loc, bi+1, output_cur, retdest
SWAP5
%increment
INCR1
SWAP5
// stack: n-1, base_addr, len, a_start_loc, bi+1, output_cur+1, retdest
DUP1
Expand Down
Loading
Loading