Windows PE Header Parser (C++ Project)

🧠 What It Does

This project is a simple C++ program that parses and prints key information from a Windows Portable Executable (PE) file — such as notepad.exe.

Repository: Windows-PE-Header-Parser
Language: C++

By inspecting the DOS and NT headers, the program extracts metadata used in reverse engineering and malware analysis:

Opens and memory maps a .exe file using the Windows API

Validates the DOS signature (the MZ magic bytes)

Retrieves and displays: Magic bytes (in hex and ASCII), Offset to NT headers, Number of sections, Entry point address

🪄 What Are “Magic Bytes”?

Magic bytes are the first few bytes of a file, used to identify its format.

Hex: 0x4D 0x5A in file order (0x5A4D when read as a little-endian 16-bit value)

ASCII: MZ

The initials “MZ” refer to Mark Zbikowski, one of the original developers of the MS-DOS executable format. These bytes are always found at the very beginning of the file and are crucial for confirming the file type.
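
The two hex spellings above describe the same bytes. As a minimal, portable sketch of why (the helper name readLe16 is made up for illustration and is not taken from the project), reading the bytes 'M' and 'Z' least-significant-byte first produces 0x5A4D:

```cpp
#include <cstdint>

// Hypothetical helper: combine two bytes into a 16-bit value,
// least significant byte first (little-endian, as PE files store integers).
constexpr std::uint16_t readLe16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// The first two bytes of a PE file, in file order: 'M' (0x4D), 'Z' (0x5A).
constexpr unsigned char kMagicBytes[2] = {0x4D, 0x5A};

// Read as a single little-endian 16-bit value, they become 0x5A4D.
static_assert(readLe16(kMagicBytes) == 0x5A4D, "MZ reads as 0x5A4D");
```

This is why a hex dump shows 4D 5A while code that compares a 16-bit field uses the constant 0x5A4D.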

📊 Sample Output:

🔧 How It Works (Overview):

File Mapping: Uses CreateFile, CreateFileMapping, and MapViewOfFile to access the .exe file in memory.
DOS Header Parsing: Checks that the file starts with the bytes 0x4D 0x5A ("MZ"), which read as the 16-bit value 0x5A4D.
NT Header Location: Finds the offset to the NT headers using the DOS header.
PE Header Parsing: Reads the NT headers to extract metadata like section count and entry point.

🗂️ 1. File Mapping: Instead of reading the file byte by byte, the program uses Windows memory-mapped file APIs to map the executable into its address space. The operating system then pages the contents in on demand. Here’s what each function does:

CreateFile: Opens the .exe file (like notepad.exe) for reading.

CreateFileMapping: Creates a file mapping object backed by the open file, which lets the file contents be treated as if they were part of memory.

MapViewOfFile: Maps a view of that object into the process’s address space and returns a pointer to it, making the file accessible just like a normal array.

📌 Why it matters: Mapping avoids explicit read calls and buffer management, and it lets you work with the file as a block of memory, which is well suited to parsing binary structures through pointers.
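
The three calls described above can be sketched as follows. This is an illustrative, Windows-only outline, not the project's actual code: the names MappedFile, mapFileReadOnly, and unmapFile are hypothetical, and the choice of the wide-character (W) API variants and of the error handling is an assumption.

```cpp
#ifdef _WIN32
#include <windows.h>

// Hypothetical holder for everything that must be released later.
struct MappedFile {
    HANDLE file = INVALID_HANDLE_VALUE;
    HANDLE mapping = nullptr;
    const unsigned char* base = nullptr;  // start of the mapped file
    unsigned long long size = 0;          // file size, for bounds checks
};

// Release the view and both handles, in reverse order of acquisition.
void unmapFile(MappedFile& m) {
    if (m.base) UnmapViewOfFile(m.base);
    if (m.mapping) CloseHandle(m.mapping);
    if (m.file != INVALID_HANDLE_VALUE) CloseHandle(m.file);
    m = MappedFile{};
}

// Map a file read-only. Returns false (and releases anything acquired) on failure.
bool mapFileReadOnly(const wchar_t* path, MappedFile& out) {
    out = MappedFile{};

    // 1. Open the file for reading.
    out.file = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (out.file == INVALID_HANDLE_VALUE) return false;

    LARGE_INTEGER fileSize{};
    if (!GetFileSizeEx(out.file, &fileSize)) { unmapFile(out); return false; }
    out.size = static_cast<unsigned long long>(fileSize.QuadPart);

    // 2. Create a read-only mapping object covering the whole file (size 0, 0).
    out.mapping = CreateFileMappingW(out.file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    if (out.mapping == nullptr) { unmapFile(out); return false; }

    // 3. Map a view of the entire file into this process's address space.
    out.base = static_cast<const unsigned char*>(
        MapViewOfFile(out.mapping, FILE_MAP_READ, 0, 0, 0));
    if (out.base == nullptr) { unmapFile(out); return false; }

    return true;
}
#endif  // _WIN32
```

A caller would pass a path such as L"C:\\Windows\\System32\\notepad.exe", parse through the base pointer while staying within size bytes, and call unmapFile when done.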

📜 2. DOS Header Parsing: At the very start of every PE file is the DOS Header. The program reads the first two bytes of the file to verify they are 0x4D 0x5A — which corresponds to ‘MZ’ in ASCII.

These are known as magic bytes.

If the file doesn’t start with ‘MZ’, it’s not a valid PE file.

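📌 Why it matters: Validating the DOS header is a cheap first check before going deeper. On its own it only shows that the file is an MZ-style executable; the "PE\0\0" signature at the start of the NT headers is what confirms it is a PE file.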
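
A minimal, portable sketch of this check is shown below. It works on a raw byte buffer rather than the IMAGE_DOS_HEADER struct from windows.h so it can be compiled anywhere; the function name hasDosSignature and the constant names are hypothetical, and the project itself may compare the e_magic field against IMAGE_DOS_SIGNATURE instead.

```cpp
#include <cstddef>
#include <cstdint>

// Value of e_magic when the bytes 'M','Z' are read as a little-endian WORD.
constexpr std::uint16_t kDosMagic = 0x5A4D;

// Size in bytes of the DOS header (IMAGE_DOS_HEADER in <windows.h>).
constexpr std::size_t kDosHeaderSize = 64;

// Hypothetical helper: true if the buffer is large enough to hold a DOS
// header and begins with the MZ signature.
bool hasDosSignature(const unsigned char* data, std::size_t size) {
    if (data == nullptr || size < kDosHeaderSize) return false;
    const std::uint16_t magic =
        static_cast<std::uint16_t>(data[0] | (data[1] << 8));
    return magic == kDosMagic;
}
```

Checking the buffer size first matters: a truncated or empty file would otherwise be read past its end.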

📍 3. NT Header Location: Within the DOS header, there’s a field called e_lfanew. This gives the offset in the file where the NT Headers (the real “PE” structure) begin.

This value varies from file to file; something like 0xF8 is typical, so it must always be read from the header rather than assumed.

The program adds this offset to the file base address to locate the start of the NT headers.

📌 Why it matters: The DOS header is just a stub. The real meat — entry point, section table, etc. — lives in the NT headers, and this tells us where to look.
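
As a hedged sketch of this step, the function below reads e_lfanew from its fixed position in the DOS header (offset 0x3C), bounds-checks it, and confirms the "PE\0\0" signature at the target. The names findNtHeaders, readLe32, and the constants are hypothetical, and the sketch assumes the MZ signature has already been checked.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kDosHeaderSize = 64;   // size of the DOS header
constexpr std::size_t kLfanewOffset = 0x3C;  // position of e_lfanew within it

// Read a 32-bit little-endian value from four bytes.
std::uint32_t readLe32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: stores the file offset of the NT headers in ntOffset.
// Returns false if e_lfanew points outside the buffer or the "PE\0\0"
// signature is missing.
bool findNtHeaders(const unsigned char* data, std::size_t size,
                   std::size_t& ntOffset) {
    if (data == nullptr || size < kDosHeaderSize) return false;

    // e_lfanew is a signed 32-bit field; a negative value read as unsigned
    // becomes very large and is rejected by the bounds check below.
    const std::uint32_t lfanew = readLe32(data + kLfanewOffset);
    if (lfanew > size || size - lfanew < 4) return false;

    const unsigned char* nt = data + lfanew;  // base address + offset
    if (nt[0] != 'P' || nt[1] != 'E' || nt[2] != 0 || nt[3] != 0) return false;

    ntOffset = lfanew;
    return true;
}
```

The bounds check is the important design choice here: e_lfanew comes from the file, so a malformed or hostile executable can make it point anywhere.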

🧩 4. PE Header Parsing: Once the program has the address of the NT headers, it reads key fields like:

Number of sections: Tells you how many parts (code, data, resources, etc.) the executable has.

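Entry point address: This is the relative virtual address (RVA) where execution begins when the file is run. It is an offset from the image base, so the actual address in memory is the image base plus this value.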

These values come from specific sub-structures inside the NT headers, like IMAGE_FILE_HEADER and IMAGE_OPTIONAL_HEADER.

📌 Why it matters: These values are essential for reverse engineers, malware analysts, or anyone trying to understand how a Windows program behaves under the hood.
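
To make the field locations concrete, here is a portable sketch that reads both values by offset from the start of the NT headers. On Windows the same fields are FileHeader.NumberOfSections and OptionalHeader.AddressOfEntryPoint in the IMAGE_NT_HEADERS struct from windows.h, which is likely what the project uses; the names PeSummary and readPeSummary are hypothetical, and the sketch assumes the "PE\0\0" signature was already verified.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type for the two fields discussed above.
struct PeSummary {
    std::uint16_t numberOfSections;
    std::uint32_t entryPointRva;  // relative to the image base
};

// Offsets from the start of the NT headers.
// Layout: 4-byte signature, 20-byte IMAGE_FILE_HEADER, then the optional header.
constexpr std::size_t kNumberOfSectionsOffset = 4 + 2;   // after Machine
constexpr std::size_t kOptionalHeaderOffset = 4 + 20;
constexpr std::size_t kEntryPointOffset = kOptionalHeaderOffset + 16;

std::uint16_t readLe16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

std::uint32_t readLe32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Reads the section count and entry point RVA. Returns false if the buffer
// is too small to contain both fields at the given NT header offset.
bool readPeSummary(const unsigned char* data, std::size_t size,
                   std::size_t ntOffset, PeSummary& out) {
    if (data == nullptr || ntOffset > size ||
        size - ntOffset < kEntryPointOffset + 4) {
        return false;
    }
    const unsigned char* nt = data + ntOffset;
    out.numberOfSections = readLe16(nt + kNumberOfSectionsOffset);
    out.entryPointRva = readLe32(nt + kEntryPointOffset);
    return true;
}
```

AddressOfEntryPoint sits at the same position in both the 32-bit (PE32) and 64-bit (PE32+) optional headers, so this sketch does not need to branch on the optional header's Magic field.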

🔗 View on GitHub: https://github.com/JasonEiler/Windows-PE-Header-Parser