This guide assumes you're somewhat familiar with:
- The esp-open-sdk
- The GCC toolchain
The first step in understanding the ESP8266 is looking at how its functionality is distributed. As seen in the not-so-well-documented memory map, the system has several memory sections, each of which contains useful information.
The ESP8266 has a Tensilica Xtensa IP core which is responsible for the code execution. The code is separated into three parts:
- The internal ROM
- The SDK provided libraries
- The user code
All three parts of the code interact with each other in a manner that is outside the scope of this article. I will document them once I finish reverse engineering all of the basic registers.
All of the low level functionality is located in the internal ROM, and most of the functions available there can be seen in the ROM linker script. Here is an excerpt:
    ...
    PROVIDE ( ets_timer_handler_isr = 0x40002da8 );
    PROVIDE ( ets_timer_init = 0x40002e68 );
    PROVIDE ( ets_timer_setfn = 0x40002c48 );
    PROVIDE ( ets_uart_printf = 0x40002544 );
    PROVIDE ( ets_update_cpu_frequency = 0x40002f04 );
    PROVIDE ( ets_vprintf = 0x40001f00 );
    PROVIDE ( ets_wdt_disable = 0x400030f0 );
    PROVIDE ( ets_wdt_enable = 0x40002fa0 );
    PROVIDE ( ets_wdt_get_mode = 0x40002f34 );
    ...
This simply means that each of these symbols has a predefined address inside the internal ROM (0x40000000 - 0x4000FFFF).
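To make the symbol-to-address mapping concrete, here is a minimal Python sketch that turns the excerpt into a lookup table. The regular expression and the `parse_provides` helper are made up for illustration and are not part of the SDK. The 64 KiB ROM size is an assumption taken from the dump length used later in this article.

```python
import re

# Excerpt copied from the ROM linker script quoted in this article.
LD_EXCERPT = """
PROVIDE ( ets_timer_handler_isr = 0x40002da8 );
PROVIDE ( ets_timer_init = 0x40002e68 );
PROVIDE ( ets_timer_setfn = 0x40002c48 );
PROVIDE ( ets_uart_printf = 0x40002544 );
PROVIDE ( ets_update_cpu_frequency = 0x40002f04 );
PROVIDE ( ets_vprintf = 0x40001f00 );
PROVIDE ( ets_wdt_disable = 0x400030f0 );
PROVIDE ( ets_wdt_enable = 0x40002fa0 );
PROVIDE ( ets_wdt_get_mode = 0x40002f34 );
"""

# Matches lines of the form: PROVIDE ( name = 0xADDRESS );
PROVIDE_RE = re.compile(r"PROVIDE\s*\(\s*(\w+)\s*=\s*(0x[0-9a-fA-F]+)\s*\)\s*;")

ROM_BASE = 0x40000000
ROM_SIZE = 0x10000  # assumption: the 64 KiB dumped later in this article


def parse_provides(text):
    """Return {symbol: address} for every PROVIDE entry found in text."""
    return {name: int(addr, 16) for name, addr in PROVIDE_RE.findall(text)}


symbols = parse_provides(LD_EXCERPT)

# Print the symbols in address order and check that each one lies in the ROM.
for name, addr in sorted(symbols.items(), key=lambda item: item[1]):
    in_rom = ROM_BASE <= addr < ROM_BASE + ROM_SIZE
    print(f"{addr:#010x}  {name}  in_rom={in_rom}")
```

Pointing `parse_provides` at the full linker script instead of the excerpt should give a table of every ROM function the script exports.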
Now on to the hands-on part. The first step is dumping the internal ROM with the SDK-provided esptool.py (you can skip this part if you already have the dump):
    $ esptool dump_mem 0x40000000 0x10000 rom_dump.bin
We choose the function that we want to reverse engineer. For this example, let's go with a simple one:
void ets_update_cpu_frequency(int freqmhz);
We look up the address of the function in the ROM linker script, which in this case is 0x40002f04, as seen in the excerpt above. With this information, we can start disassembling the dump.
For this we use the SDK-provided objdump (xtensa-lx106-elf-objdump):
    $ xtensa-lx106-elf-objdump -bbinary -mxtensa --adjust-vma=0x40000000 --start-address=0x40002f04 -D rom_dump.bin | less
Objdump will disassemble the ROM dump as a raw binary, shift the addresses by 0x40000000 (the start address of the ROM), and start the disassembly at 0x40002f04, which is the address of our function.
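The address adjustment is plain arithmetic: a ROM address minus the ROM base gives the byte offset inside the dump file. Below is a small Python sketch of that mapping. The helper names `rom_offset` and `read_rom` are hypothetical, and the buffer is a stand-in for rom_dump.bin so the snippet runs without a real dump; the seven bytes placed in it are the opcode bytes from this article's disassembly listing.

```python
ROM_BASE = 0x40000000  # start of the internal ROM, same value as --adjust-vma
ROM_SIZE = 0x10000     # length passed to dump_mem


def rom_offset(address):
    """Translate a ROM address into a byte offset inside the dump file."""
    offset = address - ROM_BASE
    if not 0 <= offset < ROM_SIZE:
        raise ValueError(f"{address:#x} is outside the dumped ROM range")
    return offset


def read_rom(dump, address, length):
    """Return `length` bytes of the dump starting at ROM address `address`."""
    start = rom_offset(address)
    return dump[start:start + length]


# Stand-in for the contents of rom_dump.bin (assumption: all zeros except
# for the first seven bytes of ets_update_cpu_frequency).
fake_dump = bytearray(ROM_SIZE)
start = rom_offset(0x40002f04)
fake_dump[start:start + 7] = bytes.fromhex("31f1ff29030df0")

print(hex(rom_offset(0x40002f04)))              # 0x2f04
print(read_rom(fake_dump, 0x40002f04, 7).hex())  # 31f1ff29030df0
```

With a real dump, replacing `fake_dump` with the bytes read from rom_dump.bin should let you pull out the raw bytes of any function listed in the linker script.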
The objdump output will look something like this:
    40002f04: 31f1ff    l32r    a3, 0x40002ec8
    40002f07: 2903      s32i.n  a2, a3, 0
    40002f09: 0df0      ret.n
    40002f0b: 0021ef    excw
    40002f0e: ff        .byte 0xff
    40002f0f: 2802      l32i.n  a2, a2, 0
    40002f11: 0df0      ret.n
    40002f13: 002632    excw
    40002f16: 0e        .byte 0xe
    40002f17: 26620f    beqi    a2, 6, 0x40002f2a
    40002f1a: 42c2f4    addi    a4, a2, -12
    40002f1d: 0cd3      movi.n  a3, 13
    40002f1f: 0c02      movi.n  a2, 0
    40002f21: 402383    moveqz  a2, a3, a4
    40002f24: 0df0      ret.n
    40002f26: 0cb2      movi.n  a2, 11
    40002f28: 0df0      ret.n
objdump output will be in the following format:
address: opcode mnemonic
The complete reference (and a good read if you like this sort of thing) is the Xtensa ISA.
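If you want to process the listing with a script, the format above is easy to split apart. The following Python sketch is one way to do it, assuming the output looks exactly like the listing in this article (address, colon, opcode bytes as one hex group, then mnemonic and operands). `parse_line` and `until_return` are hypothetical helper names.

```python
import re

# First lines of the objdump listing shown in this article.
LISTING = """\
40002f04: 31f1ff l32r a3, 0x40002ec8
40002f07: 2903 s32i.n a2, a3, 0
40002f09: 0df0 ret.n
40002f0b: 0021ef excw
40002f0e: ff .byte 0xff
"""

# address: opcode mnemonic [operands]
LINE_RE = re.compile(r"^\s*([0-9a-f]+):\s+([0-9a-f]+)\s+(\S+)\s*(.*)$")


def parse_line(line):
    """Split one listing line into (address, opcode bytes, mnemonic, operands)."""
    match = LINE_RE.match(line)
    if match is None:
        return None  # not an instruction line (blank line, header, ...)
    address, raw, mnemonic, operands = match.groups()
    return int(address, 16), bytes.fromhex(raw), mnemonic, operands.strip()


def until_return(listing):
    """Collect parsed instructions up to and including the first return."""
    body = []
    for line in listing.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        body.append(parsed)
        if parsed[2] in ("ret", "ret.n"):
            break
    return body


for address, raw, mnemonic, operands in until_return(LISTING):
    print(f"{address:#010x}  {raw.hex():<6}  {mnemonic} {operands}")
```

Stopping at the first return is only a heuristic for short leaf functions like this one; a function with several exit paths needs a closer look at its branches.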
Going back to the assembly, we read the code up to the first return:
    40002f04: 31f1ff    l32r    a3, 0x40002ec8  ; Load 32-bit PC-Relative
    40002f07: 2903      s32i.n  a2, a3, 0       ; Narrow Store 32-bit
    40002f09: 0df0      ret.n                   ; Narrow Non-Windowed Return
So basically, the l32r loads the 32-bit word stored at 0x40002ec8 (which holds the address 0x3FFFC704) into a3; the s32i.n then stores a2 (the argument) at the address held in a3 (0x3FFFC704) plus an offset of 0; and the function returns. In other words, all the function does is store its argument at 0x3FFFC704.
Pretty simple huh :)?
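The reading above can be double-checked with a few lines of Python. The sketch below decodes the l32r target by hand, following the L32R definition in the Xtensa ISA (op0 is 0001 and the 16-bit immediate is one-extended, shifted left by two and added to the word-aligned PC + 3), and then models the function as a store into a dictionary that stands in for memory. It assumes the opcode column lists the bytes in memory order and that the literal at 0x40002ec8 is 0x3FFFC704, as stated in this article. The names `l32r_target` and `ets_update_cpu_frequency_model` are made up for illustration.

```python
def l32r_target(pc, insn):
    """Return the literal address an L32R instruction at `pc` loads from."""
    word = int.from_bytes(insn[:3], "little")
    if word & 0xF != 0x1:
        raise ValueError("not an L32R instruction")
    imm16 = word >> 8
    # The immediate is extended with ones, so the offset is always negative.
    offset = (imm16 - 0x10000) << 2
    return ((pc + 3) & ~3) + offset


# Literal pool contents as reported in this article (not read from a dump).
LITERALS = {0x40002ec8: 0x3FFFC704}


def ets_update_cpu_frequency_model(memory, freq_mhz):
    """Model of the three ROM instructions; `memory` maps address to value."""
    a2 = freq_mhz                                     # first argument
    literal = l32r_target(0x40002f04, bytes.fromhex("31f1ff"))
    a3 = LITERALS[literal]                            # l32r   a3, 0x40002ec8
    memory[a3 + 0] = a2                               # s32i.n a2, a3, 0
    # ret.n: nothing left to do


memory = {}
ets_update_cpu_frequency_model(memory, 80)
print(hex(l32r_target(0x40002f04, bytes.fromhex("31f1ff"))))  # 0x40002ec8
print({hex(addr): value for addr, value in memory.items()})
```

The decoded target matches the address objdump printed, which is a nice sanity check that the dump and the base address line up.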
Edit: I've taken the time to disassemble the entire ROM with the available symbols corrected; check it out here. It's provided as a .txt because it's huge!