Windows PE Header Parser (C++ Project)

🧠 What It Does

This project is a simple C++ program that parses and prints key information from a Windows Portable Executable (PE) file — such as notepad.exe.

Repository: Windows-PE-Header-Parser
Language: C++



By inspecting the DOS and NT headers, the program extracts metadata used in reverse engineering and malware analysis:

  • Opens and memory maps a .exe file using the Windows API
  • Validates the DOS signature (the MZ magic bytes)
  • Retrieves and displays: Magic bytes (in hex and ASCII), Offset to NT headers, Number of sections, Entry point address


🪄 What Are “Magic Bytes”?

Magic bytes are the first few bytes of a file, used to identify its format.

  • Hex: 0x4D 0x5A in file order (0x5A4D when read as a little-endian 16-bit value)
  • ASCII: MZ

The initials “MZ” refer to Mark Zbikowski, one of the original developers of the MS-DOS executable format. These bytes are always found at the very beginning of the file and are crucial for confirming the file type.
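The two ways of writing the magic value (0x4D 0x5A as bytes, 0x5A4D as a number) can be confusing, so here is a minimal sketch of how they relate. The helper names read_le16 and magic_as_ascii are hypothetical and are not taken from the repository.

```cpp
#include <cstdint>
#include <string>

// Combine two bytes in little-endian order (low byte first), which is how
// x86 interprets a 16-bit WORD stored in memory or on disk.
constexpr std::uint16_t read_le16(std::uint8_t low, std::uint8_t high) {
    return static_cast<std::uint16_t>(low | (high << 8));
}

// Render the two magic bytes as ASCII characters, in file order.
inline std::string magic_as_ascii(std::uint8_t first, std::uint8_t second) {
    return std::string{static_cast<char>(first), static_cast<char>(second)};
}
```

With the file's first two bytes, read_le16(0x4D, 0x5A) gives 0x5A4D and magic_as_ascii(0x4D, 0x5A) gives "MZ".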



📊 Sample Output:



🔧 How It Works (Overview):

  1. File Mapping: Uses CreateFile, CreateFileMapping, and MapViewOfFile to access the .exe file in memory.
  2. DOS Header Parsing: Checks that the file starts with the bytes 0x4D 0x5A ("MZ"), which read as the little-endian 16-bit value 0x5A4D.
  3. NT Header Location: Finds the offset to the NT headers using the DOS header.
  4. PE Header Parsing: Reads the NT headers to extract metadata like section count and entry point.

🗂️ 1. File Mapping: Instead of reading the file byte by byte, the program uses Windows memory-mapped file APIs to map the executable into the process's address space. The operating system then pages the contents in on demand. Here’s what each function does:

  • CreateFile: Opens the .exe file (like notepad.exe) for reading.
  • CreateFileMapping: Creates a file-mapping object, which lets the file contents be treated as if they were part of memory.
  • MapViewOfFile: Maps a view of that object into the process’s address space and returns a pointer to it, so the file can be accessed like a normal array.

📌 Why it matters: This approach avoids explicit read calls and buffer management, and lets you work with the file as one contiguous block of memory, which suits parsing fixed-layout binary structures.
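The three calls above can be sketched as a small RAII wrapper. This is an illustration under stated assumptions, not the repository's code. The class name FileView is hypothetical, the ANSI variants CreateFileA and CreateFileMappingA are chosen for brevity, and the non-Windows branch exists only so the sketch compiles on other platforms (there it reads the file into a buffer instead of mapping it).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

#ifdef _WIN32
#include <windows.h>
#else
#include <fstream>
#include <iterator>
#include <vector>
#endif

// Hypothetical helper: exposes a file as a read-only block of bytes and
// releases everything it acquired in the destructor.
class FileView {
public:
    explicit FileView(const std::string& path) {
#ifdef _WIN32
        // 1. Open the file for reading.
        file_ = CreateFileA(path.c_str(), GENERIC_READ, FILE_SHARE_READ, nullptr,
                            OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
        if (file_ == INVALID_HANDLE_VALUE) return;
        LARGE_INTEGER size{};
        if (!GetFileSizeEx(file_, &size) || size.QuadPart == 0) return;
        // 2. Create a read-only file-mapping object covering the whole file.
        mapping_ = CreateFileMappingA(file_, nullptr, PAGE_READONLY, 0, 0, nullptr);
        if (mapping_ == nullptr) return;
        // 3. Map a view of the whole file into this process's address space.
        view_ = MapViewOfFile(mapping_, FILE_MAP_READ, 0, 0, 0);
        if (view_ == nullptr) return;
        data_ = static_cast<const std::uint8_t*>(view_);
        size_ = static_cast<std::size_t>(size.QuadPart);
#else
        // Assumption: fallback for non-Windows builds, reads into a buffer.
        std::ifstream in(path, std::ios::binary);
        if (!in) return;
        buffer_.assign(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
        if (buffer_.empty()) return;
        data_ = reinterpret_cast<const std::uint8_t*>(buffer_.data());
        size_ = buffer_.size();
#endif
    }

    ~FileView() {
#ifdef _WIN32
        // Release in reverse order of acquisition.
        if (view_ != nullptr) UnmapViewOfFile(view_);
        if (mapping_ != nullptr) CloseHandle(mapping_);
        if (file_ != INVALID_HANDLE_VALUE) CloseHandle(file_);
#endif
    }

    FileView(const FileView&) = delete;
    FileView& operator=(const FileView&) = delete;

    bool ok() const { return data_ != nullptr; }
    const std::uint8_t* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
#ifdef _WIN32
    HANDLE file_ = INVALID_HANDLE_VALUE;  // CreateFile fails with INVALID_HANDLE_VALUE
    HANDLE mapping_ = nullptr;            // CreateFileMapping fails with NULL
    void* view_ = nullptr;
#else
    std::vector<char> buffer_;
#endif
    const std::uint8_t* data_ = nullptr;
    std::size_t size_ = 0;
};
```

A caller would construct a FileView with the path to an executable, check ok(), and then hand data() and size() to the header-parsing steps. Note that the two failure values differ: CreateFile reports failure with INVALID_HANDLE_VALUE, while CreateFileMapping and MapViewOfFile return NULL.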



📜 2. DOS Header Parsing: At the very start of every PE file is the DOS Header. The program reads the first two bytes of the file to verify they are 0x4D 0x5A — which corresponds to ‘MZ’ in ASCII.

  • These are known as magic bytes.
  • If the file doesn’t start with ‘MZ’, it’s not a valid PE file.

📌 Why it matters: Validating the DOS header ensures we’re working with a proper executable before going deeper.
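A minimal sketch of the signature check, written against a plain byte buffer so it does not depend on the Windows headers. The function name has_dos_signature and the constant names are hypothetical. On Windows, the same comparison is usually written by casting the mapped base to IMAGE_DOS_HEADER and comparing e_magic with IMAGE_DOS_SIGNATURE.

```cpp
#include <cstddef>
#include <cstdint>

// Same value as IMAGE_DOS_SIGNATURE in <winnt.h>: the bytes 'M' (0x4D) and
// 'Z' (0x5A) read as one little-endian 16-bit value.
constexpr std::uint16_t kDosSignature = 0x5A4D;

// IMAGE_DOS_HEADER is 64 bytes long; anything shorter cannot hold one.
constexpr std::size_t kDosHeaderSize = 64;

// Returns true if the buffer is large enough for a DOS header and begins
// with the MZ magic bytes.
bool has_dos_signature(const std::uint8_t* data, std::size_t size) {
    if (data == nullptr || size < kDosHeaderSize) return false;
    const std::uint16_t e_magic =
        static_cast<std::uint16_t>(data[0] | (data[1] << 8));
    return e_magic == kDosSignature;
}
```

The size check comes first so that a truncated or empty file is rejected before any byte is read.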



📍 3. NT Header Location: Within the DOS header, there’s a field called e_lfanew. This gives the offset in the file where the NT Headers (the real “PE” structure) begin.

  • The value varies with the linker and the size of the DOS stub; 0xF8 is one common example.
  • The program adds this offset to the base address of the mapped file to locate the start of the NT headers.

📌 Why it matters: The DOS header is just a stub. The real meat — entry point, section table, etc. — lives in the NT headers, and this tells us where to look.
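The lookup can be sketched as follows, again on a plain byte buffer. The function name locate_nt_headers is hypothetical. The bounds check and the test for the "PE\0\0" signature are additions for illustration; the description above does not say whether the program performs them.

```cpp
#include <cstddef>
#include <cstdint>

// e_lfanew is the last field of IMAGE_DOS_HEADER, at byte offset 0x3C.
constexpr std::size_t kLfanewOffset = 0x3C;
// The NT headers begin with the 4-byte signature "PE\0\0".
constexpr std::size_t kPeSignatureSize = 4;

// Read a 32-bit little-endian value from four consecutive bytes.
std::uint32_t read_le32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Returns a pointer to the NT headers inside the buffer, or nullptr if
// e_lfanew points outside the buffer or the PE signature is missing.
const std::uint8_t* locate_nt_headers(const std::uint8_t* base, std::size_t size) {
    if (base == nullptr || size < kLfanewOffset + 4) return nullptr;
    // e_lfanew is a signed LONG; reading it as unsigned turns a negative
    // value into a very large one, which the bounds check then rejects.
    const std::uint32_t e_lfanew = read_le32(base + kLfanewOffset);
    if (e_lfanew > size || size - e_lfanew < kPeSignatureSize) return nullptr;
    const std::uint8_t* nt = base + e_lfanew;
    const bool is_pe = nt[0] == 'P' && nt[1] == 'E' && nt[2] == 0 && nt[3] == 0;
    return is_pe ? nt : nullptr;
}
```

Checking e_lfanew against the file size matters for untrusted input such as malware samples, where the field may deliberately point outside the file.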



🧩 4. PE Header Parsing: Once the program has the address of the NT headers, it reads key fields like:

  • Number of sections: Tells you how many parts (code, data, resources, etc.) the executable has.
  • Entry point address: The relative virtual address (RVA) where execution begins. The loader adds it to the image base when the file is run.

The section count comes from IMAGE_FILE_HEADER (the NumberOfSections field) and the entry point from IMAGE_OPTIONAL_HEADER (the AddressOfEntryPoint field). Both are sub-structures of the NT headers.

📌 Why it matters: These values are essential for reverse engineers, malware analysts, or anyone trying to understand how a Windows program behaves under the hood.
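As a rough sketch, both fields can be read at fixed byte offsets from the start of the NT headers. The struct PeSummary and the function read_pe_summary are hypothetical names. With the Windows headers available, the same values are reachable as FileHeader.NumberOfSections and OptionalHeader.AddressOfEntryPoint on an IMAGE_NT_HEADERS pointer.

```cpp
#include <cstddef>
#include <cstdint>

// Offsets measured from the start of the NT headers:
//   4-byte "PE\0\0" signature, then IMAGE_FILE_HEADER (20 bytes),
//   then the optional header.
// NumberOfSections follows the 2-byte Machine field of the file header.
constexpr std::size_t kNumberOfSectionsOffset = 4 + 2;
// AddressOfEntryPoint sits 16 bytes into the optional header, in both
// the PE32 and PE32+ layouts.
constexpr std::size_t kEntryPointOffset = 4 + 20 + 16;

struct PeSummary {
    std::uint16_t number_of_sections = 0;
    std::uint32_t entry_point_rva = 0;  // relative to the image base
};

// Fills `out` from the bytes at `nt`; `available` is the number of bytes
// that can safely be read starting at `nt`.
bool read_pe_summary(const std::uint8_t* nt, std::size_t available, PeSummary& out) {
    if (nt == nullptr || available < kEntryPointOffset + 4) return false;
    const std::uint8_t* s = nt + kNumberOfSectionsOffset;
    out.number_of_sections = static_cast<std::uint16_t>(s[0] | (s[1] << 8));
    const std::uint8_t* e = nt + kEntryPointOffset;
    out.entry_point_rva = static_cast<std::uint32_t>(e[0]) |
                          (static_cast<std::uint32_t>(e[1]) << 8) |
                          (static_cast<std::uint32_t>(e[2]) << 16) |
                          (static_cast<std::uint32_t>(e[3]) << 24);
    return true;
}
```

Reading bytes at explicit offsets avoids alignment and struct-packing concerns, at the cost of being less readable than the struct-cast style that the Windows headers make possible.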



🔗 View on GitHub: https://github.com/JasonEiler/Windows-PE-Header-Parser