Internal working of Ethereum Virtual Machine (EVM)

The revolutionary technology that powers decentralized applications and smart contracts

Introduction👋

The Ethereum blockchain runs on the Ethereum network, an environment where all the data, accounts, and smart contracts of the blockchain exist. The EVM defines the rules for computing a new valid state from block to block.

In this blog, we are going to see what magic goes on inside the EVM!

What is a State Machine?

A state machine is a machine that moves from one state to another in response to inputs. Take the example of a locker. Before the correct password is entered, it is locked. Entering the correct password changes the locked state to the unlocked state, and the locker opens up.

image credits: theburningmonk.com
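The locker example above can be sketched in a few lines of Python. This is only an illustration: the names `LockerState` and `next_state` and the password values are made up for this post.

```python
from enum import Enum


class LockerState(Enum):
    LOCKED = "locked"
    UNLOCKED = "unlocked"


def next_state(state: LockerState, entered: str, password: str) -> LockerState:
    """Return the locker's next state after one password attempt."""
    if state is LockerState.LOCKED and entered == password:
        return LockerState.UNLOCKED
    # A wrong password, or any input while already unlocked, changes nothing.
    return state


state = LockerState.LOCKED
state = next_state(state, "0000", password="1234")  # wrong password: stays locked
state = next_state(state, "1234", password="1234")  # correct password: unlocks
```

The important part is that the next state depends only on the current state and the input, which is the same idea Ethereum applies on a much larger scale.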

Ethereum can be better described as a distributed state machine.

What is EVM?

The Ethereum Virtual Machine (EVM) is a deterministic, stack-based virtual machine.

Deterministic means it produces the same output for the same set of inputs.

EVM is the runtime environment for executing smart contracts on the Ethereum blockchain.

For Ethereum, the state is more complex than the locker's, but the general principle is the same. At the end of the day, the network stores all of its state in one large data structure. The EVM defines the specific rules for how that state can change in response to the different types of transactions that occur.

EVM behaves similarly to a mathematical state transition function. This function can be represented as follows:

Y(S, T) = S'

Given the old valid state S and a new set of valid transactions T, the state transition function Y produces a new valid state S'.
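To make Y(S, T) = S' a bit more concrete, here is a toy sketch in Python. It assumes the state is just a dictionary of account balances and a transaction is a simple transfer. The real EVM state and transaction rules are far richer, and the names `transition`, `State`, and `Transaction` are hypothetical.

```python
from typing import Dict, List, Tuple

State = Dict[str, int]               # account name -> balance (toy model)
Transaction = Tuple[str, str, int]   # (sender, recipient, amount)


def transition(state: State, txs: List[Transaction]) -> State:
    """Toy version of Y(S, T) = S': apply transfers to a copy of S."""
    new_state = dict(state)  # the old state S is never mutated
    for sender, recipient, amount in txs:
        if amount < 0 or new_state.get(sender, 0) < amount:
            continue  # this sketch simply skips invalid transfers
        new_state[sender] = new_state.get(sender, 0) - amount
        new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


s = {"alice": 10, "bob": 0}
s_prime = transition(s, [("alice", "bob", 4)])
```

Calling `transition` again with the same `s` and the same transactions gives the same `s_prime`, which is the determinism described earlier.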

EVM Architecture

💡
This part is a little technical, so stick with it to the end to grasp it thoroughly!

The EVM is made up of several internal components. Let's look at each of them in detail.

The EVM code / Smart contract bytecode

This is the low-level code executed by the EVM. Let's go a little deeper to understand how a smart contract works:

  • When we compile a smart contract, it gets converted into bytecode.

  • Bytecode is a series of machine-readable instructions represented in hexadecimal format. It is the intermediate step between writing smart contract code and executing it on the blockchain.

  • The bytecode is essentially a sequence of opcodes for the EVM. It no longer contains the function names from your Solidity code, the names of their parameters, or the types of the values they return. Instead, each opcode performs a specific operation, such as arithmetic, conditional branching, or memory manipulation.

So, compiling a smart contract means converting it into bytecode (opcodes). But we also need to be able to go the other way and convert the values returned by those functions back into human-readable form. This is where the ABI (Application Binary Interface) comes in. We will not go into ABI details here.
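To see how bytecode maps to opcodes, here is a minimal disassembler sketch in Python. It knows only a handful of opcodes (the `OPCODES` table is a small subset picked for this post), and the function name `disassemble` is made up. The sample input `6080604052` is the well-known prefix that Solidity-compiled contracts often start with.

```python
from typing import List

# A tiny subset of the EVM opcode table; the real table has many more entries.
OPCODES = {
    0x00: "STOP",
    0x01: "ADD",
    0x02: "MUL",
    0x51: "MLOAD",
    0x52: "MSTORE",
    0x54: "SLOAD",
    0x55: "SSTORE",
}


def disassemble(bytecode_hex: str) -> List[str]:
    """Turn a hex bytecode string into a list of readable instructions."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    out = []
    pc = 0  # program counter: index of the next byte to decode
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 are followed by 1..32 bytes of immediate data.
            n = op - 0x5F
            operand = code[pc + 1 : pc + 1 + n]
            out.append(f"PUSH{n} 0x{operand.hex()}")
            pc += 1 + n
        else:
            out.append(OPCODES.get(op, f"UNKNOWN(0x{op:02x})"))
            pc += 1
    return out


print(disassemble("6080604052"))
# prints ['PUSH1 0x80', 'PUSH1 0x40', 'MSTORE']
```

Notice that nothing in the output looks like a function or variable name. Only raw operations and their data are left.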

Storage

Every smart contract has its own key-value storage. It maps 256-bit keys to 32-byte (256-bit) words, so each word is read and written through its unique key. Modifying stored data is expensive in terms of gas.

The EVM storage is non-volatile, meaning the data it stores persists even when the smart contract completes execution.

Contract storage is organized as a modified Merkle Patricia Trie, a data structure that allows efficient access to and modification of the storage data.

💡
Some extra Stuff

A Patricia Tree is a specific type of trie designed to be more space-efficient than a standard trie: it stores only the unique parts of the keys in the tree. Patricia Trees are particularly useful when keys share common prefixes, as they use memory more efficiently and allow faster lookups than standard tries.

The following opcodes are used to manipulate the storage of a smart contract:

SLOAD pops a key from the Stack, loads the 256-bit word stored in Storage under that key, and pushes it onto the Stack.

SSTORE stores a 256-bit word in Storage. It pops two values from the Stack: the key from the top of the Stack, and the value to be stored from the next position.
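Here is a rough Python sketch of how these two opcodes interact with the stack. It assumes a plain dictionary for storage and a Python list for the stack (top of the stack is the end of the list), with the key on top and the value beneath it for SSTORE, as the EVM specification defines. The class name `ContractStorage` is hypothetical, and gas costs and the underlying trie are ignored.

```python
WORD_MASK = (1 << 256) - 1  # keeps values within 256 bits


class ContractStorage:
    """Toy model of one contract's persistent key-value storage."""

    def __init__(self):
        self.slots = {}  # 256-bit key -> 256-bit word

    def sload(self, stack: list) -> None:
        key = stack.pop()
        # Slots that were never written read as zero.
        stack.append(self.slots.get(key, 0))

    def sstore(self, stack: list) -> None:
        key = stack.pop()    # top of the stack
        value = stack.pop()  # next item on the stack
        self.slots[key] = value & WORD_MASK


storage = ContractStorage()
stack = [42, 1]        # value 42 pushed first, then key 1 on top
storage.sstore(stack)  # writes 42 into slot 1
stack.append(1)        # push the key again
storage.sload(stack)   # replaces the key with the stored value
print(stack)           # prints [42]
```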

Stack

The EVM is a stack-based machine, as it uses a stack data structure for execution. The stack can hold at most 1024 items, and each item is a 256-bit (32-byte) word.

When an operation is performed, values currently on top of the Stack are popped off, used in the executed operation, and then the result of the operation is pushed back onto the Stack.
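The pop, compute, push cycle can be sketched like this in Python. The class `EvmStack` and the helper `op_add` are hypothetical names for this post. The sketch assumes a maximum depth of 1024 and 256-bit words, so additions wrap around modulo 2**256.

```python
class EvmStack:
    """Toy EVM stack: at most 1024 items, each a 256-bit word."""

    MAX_DEPTH = 1024
    WORD_MASK = (1 << 256) - 1

    def __init__(self):
        self.items = []

    def push(self, value: int) -> None:
        if len(self.items) >= self.MAX_DEPTH:
            raise OverflowError("stack overflow")
        self.items.append(value & self.WORD_MASK)  # truncate to 256 bits

    def pop(self) -> int:
        if not self.items:
            raise IndexError("stack underflow")
        return self.items.pop()


def op_add(stack: EvmStack) -> None:
    """ADD: pop two words, push their sum modulo 2**256."""
    a = stack.pop()
    b = stack.pop()
    stack.push(a + b)  # push() applies the 256-bit mask


stack = EvmStack()
stack.push(2)
stack.push(3)
op_add(stack)
print(stack.items)  # prints [5]
```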

Some opcodes that operate on the stack:

PUSH, POP, DUP, SWAP, ADD, SUB, MUL, DIV, MOD, AND, OR, XOR, NOT, EQ, LT, GT, SHA3, JUMP, JUMPI.

More about Stacks.

💡
Interesting Fact

Only the top 16 elements of the stack are directly reachable. The DUP1 to DUP16 opcodes copy one of the top 16 items to the top, and SWAP1 to SWAP16 exchange the top item with one of the 16 items below it. No opcode reaches deeper than that, so a value buried further down has to be brought up with extra operations (which cost extra gas) or kept in memory instead. This is the reason behind Solidity's "Stack too deep" compiler error, and it encourages developers to structure their code so that it does not need elements deep in the stack.

Memory

The EVM memory is a linear array of bytes used by smart contracts to store and retrieve temporary data. It is used for larger data structures, like arrays or strings.

Memory is allocated dynamically at runtime: it starts empty and grows as the smart contract needs more of it.

Memory is byte-addressable, meaning each byte can be accessed through its own index.

If you are a smart contract developer, recall that you use the memory keyword when taking a string as input. Yes, it gets stored here!

Memory, in contrast to Storage, is volatile, which means the data stored in it is discarded once the contract call finishes executing.

Memory directly affects gas consumption: expanding it costs gas, and the cost grows faster the more memory is in use. Hence, developers use it carefully to optimize gas costs.
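To get a feel for the effect on gas, here is a small sketch of the memory cost formula from the Ethereum Yellow Paper: 3 gas per 32-byte word plus the number of words squared divided by 512. The function names `memory_cost` and `expansion_cost` are my own, and this covers only the memory expansion part of gas, not the cost of the opcodes themselves.

```python
def memory_cost(size_in_bytes: int) -> int:
    """Total gas attributed to a memory of the given size."""
    words = (size_in_bytes + 31) // 32  # round up to whole 32-byte words
    return 3 * words + words * words // 512


def expansion_cost(old_size: int, new_size: int) -> int:
    """Gas charged for growing memory from old_size to new_size bytes."""
    return max(0, memory_cost(new_size) - memory_cost(old_size))


print(memory_cost(32))          # prints 3
print(memory_cost(1024))        # prints 98
print(memory_cost(32 * 1024))   # prints 5120
```

For small sizes the cost is almost linear, but the squared term takes over as memory grows, which is why very large in-memory arrays get expensive quickly.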

Some of the opcodes used are:

MLOAD, MSTORE, MSTORE8, MSIZE
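As a rough illustration of these four opcodes, here is a toy memory model in Python built on a `bytearray`. The class `EvmMemory` is hypothetical. It assumes memory grows in 32-byte steps and that words are stored big-endian, and it ignores gas.

```python
class EvmMemory:
    """Toy model of EVM memory: a byte array that grows in 32-byte words."""

    def __init__(self):
        self.data = bytearray()

    def _expand(self, end: int) -> None:
        # Grow with zero bytes up to the next multiple of 32 that covers `end`.
        if end > len(self.data):
            new_size = (end + 31) // 32 * 32
            self.data.extend(bytes(new_size - len(self.data)))

    def mstore(self, offset: int, value: int) -> None:
        """MSTORE: write a 32-byte word at the given byte offset."""
        self._expand(offset + 32)
        self.data[offset:offset + 32] = (value % 2**256).to_bytes(32, "big")

    def mstore8(self, offset: int, value: int) -> None:
        """MSTORE8: write a single byte at the given byte offset."""
        self._expand(offset + 1)
        self.data[offset] = value & 0xFF

    def mload(self, offset: int) -> int:
        """MLOAD: read a 32-byte word starting at the given byte offset."""
        self._expand(offset + 32)
        return int.from_bytes(self.data[offset:offset + 32], "big")

    def msize(self) -> int:
        """MSIZE: current size of memory in bytes."""
        return len(self.data)


mem = EvmMemory()
mem.mstore(0, 42)
print(mem.mload(0), mem.msize())  # prints 42 32
mem.mstore8(32, 0xFF)
print(mem.msize())                # prints 64
```

Writing a single byte at offset 32 still grows memory by a whole 32-byte word, which matches the word-based growth described above.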

More about Memory.

Conclusion

We have looked at some of the internal workings of the EVM. A lot goes on while a smart contract executes, from running its bytecode to storing and retrieving data. The EVM has a lot more to explore, but this is enough for today.

Do comment with any suggestions or your views on the blog, and visit my blog page to read more interesting technical stuff.

Thanks for reading!🙏

Did you find this article valuable?

Support Agrim Sharma by becoming a sponsor. Any amount is appreciated!