Shellcoding is an excellent way to learn more about assembly language and how a program communicates with the underlying OS.
Why do we, as red teamers and penetration testers, write shellcode?
In practice, shellcode is code that is injected into a running program to make it do something it was not designed to do, for example through a buffer overflow. For that reason, shellcode is generally used as the “payload” of an exploit.
Basic Terms
Stack
The stack is a data structure, more specifically a Last In, First Out (LIFO) data structure, which means that the most recent data placed, or pushed, onto the stack is the next item to be removed, or popped, from the stack.
The stack stores local variables, information relating to function calls, and other information used to clean up the stack after a function or procedure is called.
The stack grows downward through the address space, from high addresses toward low addresses.
High memory address (e.g., 0xFFFF)
|
| <-- Stack starts here (empty stack pointer)
|
| Function A is called
| Push return address
| Push local variables
|
| Function B is called
| Push return address
| Push local variables
|
V
Low memory address (e.g., 0x0000)
Each time a function is called:
A stack frame is created (return address, arguments, local variables).
This stack frame is pushed at a lower address than the one before.
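The LIFO behaviour described above can be sketched with a small array-backed stack. This is only an illustration: the names `int_stack`, `stack_push`, and `stack_pop` are hypothetical, and the array grows upward, whereas the real call stack grows toward lower addresses.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 16

/* A fixed-size LIFO stack of ints (hypothetical type, for illustration). */
struct int_stack {
    int items[STACK_CAPACITY];
    size_t top; /* number of items currently stored */
};

/* Push a value. Returns 0 on success, -1 if the stack is full. */
int stack_push(struct int_stack *s, int value)
{
    assert(s != NULL);
    if (s->top == STACK_CAPACITY)
        return -1;
    s->items[s->top++] = value;
    return 0;
}

/* Pop the most recently pushed value into *out. Returns -1 if empty. */
int stack_pop(struct int_stack *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->top == 0)
        return -1;
    *out = s->items[--s->top];
    return 0;
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1: the last item in is the first item out.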
Heap
The heap is a region of memory used for dynamic allocation. Unlike the stack, it has no fixed LIFO or FIFO order: blocks are allocated and freed in whatever order the program requests.
The heap is used to hold program data, more specifically dynamically allocated variables.
Allocated at runtime (not at compile-time)
Can grow and shrink as needed (until the system limit is reached).
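As a minimal sketch of runtime allocation, the function below builds an array on the heap and grows it with realloc() when it fills up. The function name `make_squares` and the starting capacity of 4 are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Build an array holding the squares of 0..n-1 on the heap.
 * The buffer starts small and doubles whenever it is full.
 * Returns NULL on allocation failure; the caller must free() the result.
 */
int *make_squares(size_t n)
{
    size_t capacity = 4;
    int *data = malloc(capacity * sizeof *data);
    if (data == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        if (i == capacity) {
            capacity *= 2;
            int *bigger = realloc(data, capacity * sizeof *data);
            if (bigger == NULL) {
                free(data);
                return NULL;
            }
            data = bigger;
        }
        data[i] = (int)(i * i);
    }

    /* Invariant: the buffer is large enough for all n items. */
    assert(capacity >= n);
    return data;
}
```

The size is decided while the program runs, which is what separates heap storage from a fixed-size local array on the stack.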
<sys/mman.h> is a C header file on Unix/Linux systems that provides access to memory management functions, most notably mmap() and munmap().
It gives us low-level control over how memory is allocated, protected, and shared in our program, beyond what malloc() or new can do.
Common Functions in <sys/mman.h>
| Function | Description |
| --- | --- |
| mmap() | Maps memory (e.g., allocates a memory region that can be read, written, or executed). Often used for shared memory, file-backed memory, or raw executable memory for shellcode. |
| munmap() | Unmaps a memory region created by mmap(). |
| mprotect() | Changes the protection (read/write/exec) of memory pages. |
| msync() | Flushes changes made in memory-mapped files to disk. |
The mmap() call, whose result is stored in exec, allocates a region of memory that is:
Readable,
Writable,
Executable,
Private to this process,
Not backed by any file.
| Argument | Meaning |
| --- | --- |
| 0 | Let the OS choose the address for the memory region. |
| sizeof(code) | Size of the memory to allocate, in bytes (same size as your shellcode). |
| `PROT_READ \| PROT_WRITE \| PROT_EXEC` | The memory can be read, written, and executed. |
| MAP_ANON or MAP_ANONYMOUS | Memory is not backed by any file; it is just zero-initialized memory. |
| MAP_PRIVATE | Memory changes are private to this process (not shared with others). |
| -1 | File descriptor (not used because we’re using anonymous mapping). |
| 0 | Offset in the file (also not used here). |
memcpy(exec, code, sizeof(code));
Copies the shellcode (from the code array) into the executable memory region pointed to by exec.
| Part | Meaning |
| --- | --- |
| memcpy | A standard C function that copies memory. |
| exec | Destination address: memory returned by mmap() that’s marked as executable. |
| code | Source address: your shellcode, stored as a byte array. |
| sizeof(code) | Number of bytes to copy (length of the shellcode). |
On most modern operating systems we can’t directly execute code stored in a regular data array (like char code[]), because that memory is marked non-executable by default for security. So we:
Allocate a memory region that can be executed (mmap(...)).
Copy the shellcode there with memcpy.
Run it by casting and calling as a function.
((void(*)())exec)();
This line casts the exec pointer to a function pointer and then calls it like a normal function. Step by step:

| Part | Meaning |
| --- | --- |
| exec | A void* pointer to memory where your shellcode is copied. |
| (void(*)())exec | Casts exec to a pointer to a function that returns void and takes no arguments. |
| ((void(*)())exec)() | Calls that function. This jumps to the start of the shellcode and executes it. |
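The cast is easier to read when it is split up. The sketch below applies it to an ordinary C function instead of shellcode, so it behaves the same on any architecture. `mark_called`, `calls`, and the `void_fn` typedef are hypothetical names, and converting between void* and a function pointer is something POSIX systems support rather than something ISO C guarantees.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical target: an ordinary function that records that it ran. */
int calls = 0;
void mark_called(void) { calls++; }

/* Readable alias for "pointer to function taking nothing, returning void". */
typedef void (*void_fn)(void);

/* One-line form, as in the article, with an explicit (void) argument list. */
void call_inline_cast(void *addr)
{
    assert(addr != NULL);
    ((void (*)(void))addr)();
}

/* The same operation split into a cast and a call. */
void call_with_typedef(void *addr)
{
    assert(addr != NULL);
    void_fn fn = (void_fn)addr;
    fn();
}
```

Both helpers do exactly the same thing; calling each once with `(void *)mark_called` leaves `calls` at 2.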
Disable/Enable ASLR
What is ASLR?
Address Space Layout Randomization (ASLR) is a security feature used by operating systems like Windows, Linux, and macOS. It randomizes the addresses at which a program’s memory regions are placed.
Why does ASLR exist?
Traditionally, memory addresses in a program were predictable. Attackers could guess where in memory to inject or run malicious code.
With ASLR:
The memory addresses used by a program change every time the program is run.
This makes it very hard for an attacker to know where their malicious code should go.
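One way to observe this is to print a few addresses and run the program several times. The helper below is a sketch with a hypothetical name, `describe_addresses`; it formats the address of a local variable, a heap block, and the function itself.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Write the address of a stack variable, a heap block, and this function
 * into out. Returns the value of snprintf (characters that were needed).
 */
int describe_addresses(char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);

    int local = 0;             /* lives on the stack */
    void *block = malloc(16);  /* lives on the heap (NULL if malloc fails) */

    int written = snprintf(out, out_size,
                           "stack: %p\nheap:  %p\ncode:  %p\n",
                           (void *)&local, block,
                           (void *)describe_addresses);
    free(block);
    return written;
}
```

With ASLR enabled, the stack and heap values usually differ between runs. The code address typically only changes when the program is built as a position-independent executable. With ASLR disabled, all three tend to stay the same.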
What gets randomized?
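Every time a program starts, regions such as the stack, the heap, and shared libraries can be placed at different addresses. The executable itself is also moved when it is built as a position-independent executable (PIE).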
The printf() call in this example prints a formatted string to the console.
Now, let’s break down the string =^..^= meow \x00 meow:
=^..^=: This is just a simple string, which is the face of a cat or kitten in “ASCII art” (often referred to as a “meow face”).
meow: This is the literal string “meow”.
\x00: This is a hexadecimal escape sequence in C. \x00 represents the null character (ASCII value 0), the special character that marks the end of a string in C.
In this case, \x00 won’t display anything in the output, but it is still stored in the string literal that is passed to printf().
meow: Another literal string “meow”.
So, the final string passed to printf() will look something like this in memory:
=^..^= meow [null byte] meow
Explanation of the Output:
=^..^=: This is the “meow face” or ASCII art representing a cat.
meow: This is the first “meow”, the only one that was printed.
After this, we’re seeing the prompt (voldemort@IdeaPad:~/Malware$), which is our terminal prompt. This is just the normal shell prompt, indicating that our program has finished executing.
Why \x00 didn’t show up:
The null character \x00 (value 0) is a non-printable character that C uses to mark the end of a string. printf() therefore treats it as the terminator rather than as something to print.
The null byte is still present in memory, but printf() stops at its first occurrence, which comes right after the first “meow”. Nothing after it, including the second “meow”, reaches the terminal.
So, even though \x00 is in the string, it won’t be printed, and the output we get is:
=^..^= meow
The null byte \x00 is a non-printable character that serves as a string terminator in C, but doesn’t actually show up in the output.
Our terminal prompt (voldemort@IdeaPad:~/Malware$) is shown after the program finishes running, but this is separate from the program output itself.
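The difference between what is stored and what is printed can be checked directly. In this sketch, `MEOW`, `printed_length`, `stored_length`, and `write_all_bytes` are hypothetical names; the string is the one discussed above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The string from the example, kept as an array so sizeof sees every byte. */
static const char MEOW[] = "=^..^= meow \x00 meow";

/* Bytes that printf("%s") would print: everything before the first \x00. */
size_t printed_length(void)
{
    return strlen(MEOW);
}

/* Bytes the literal occupies in memory, including both null bytes. */
size_t stored_length(void)
{
    return sizeof MEOW;
}

/* Write every stored byte except the final terminator to stream. */
size_t write_all_bytes(FILE *stream)
{
    assert(stream != NULL);
    return fwrite(MEOW, 1, sizeof MEOW - 1, stream);
}
```

`printed_length()` is 12 while `stored_length()` is 19: the second “meow” is in memory, but string functions never reach it. `write_all_bytes` gets past the null byte only because fwrite() is told the length explicitly.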
Assembly code
Program 1
    ; exit1.asm
    section .data
    section .bss
    section .text
        global _start
    _start:
        mov eax, 0
        mov eax, 1
        int 0x80
Explanation
section .data
This is where you put variables with values already known. Example:
    section .data
    msg db "Hello", 0
section .bss
This is for empty variables (you’ll fill them later). Example:
    section .bss
    buffer resb 64 ; reserve 64 bytes
Think: .data = pre-filled, .bss = empty box.
section .text
This is where your actual instructions/code live.
Like C’s main(), your program starts here.
What is global _start?
global _start makes the _start label accessible to the linker.
When the program runs, Linux says:
“Where do I start?”
So we write:
    global _start
    _start:
This is like saying: “Start from here!”
Equivalent to int main() in C.
mov eax, 0
This moves the value 0 into the eax register.
But it’s immediately overwritten, so it’s useless here.
mov eax, 1
eax is like a temporary number holder.
We say: “Put number 1 in eax.”
Why number 1? Because:
System calls = Asking the OS to do something
Linux gives us a list of numbers. Each number = a request:
| Number | Syscall |
| --- | --- |
| 1 | exit() |
| 4 | write() |
| 5 | open() |
| 6 | close() |
| 11 | execve() |
These are the 32-bit x86 numbers. We can find the full list at /usr/include/asm/unistd_32.h (the exact path can vary by distribution).
So:
    mov eax, 1
    int 0x80
Means: “Hey Linux, I want to exit.” The exit status is read from ebx, which this program does not set explicitly.
What is int 0x80?
It switches to kernel mode (a special mode where Linux runs).
It tells Linux: “Please perform the syscall I asked for.”
It’s like hitting the enter key after typing the syscall.
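The same "number plus arguments" idea is visible from C through the syscall() wrapper that glibc provides. This is a sketch with hypothetical helper names (`raw_write`, `raw_puts`, `raw_exit`). The SYS_* constants hold the numbers for the ABI the program is compiled for, so on x86-64 SYS_write and SYS_exit are 1 and 60 rather than the 32-bit values 4 and 1, and the kernel is entered with the `syscall` instruction instead of int 0x80.

```c
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Ask the kernel to write len bytes from buf to fd. Returns bytes written. */
long raw_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    return syscall(SYS_write, fd, buf, len);
}

/* Write a C string to standard output using the raw syscall. */
long raw_puts(const char *s)
{
    assert(s != NULL);
    return raw_write(STDOUT_FILENO, s, strlen(s));
}

/*
 * Ask the kernel to end the calling thread with the given status.
 * In a single-threaded program this ends the process and never returns.
 */
void raw_exit(int status)
{
    syscall(SYS_exit, status);
}
```

Calling `raw_puts("meow\n")` followed by `raw_exit(0)` is the C counterpart of loading a syscall number and its arguments into registers and then triggering the kernel.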
In C strings, the null byte (0x00) means “end of string”.
So, if our shellcode (which is often injected as a string) contains a null byte in the middle, it might get truncated or cut off, making it fail. Example: