Forensics CTF Basics

Beginner | 30 min | Writeup

Essential forensics techniques for CTFs

Learning Objectives

  • Extract file metadata
  • Identify file types
  • Use binwalk and strings
  • Handle corrupted files

Forensics challenges test your ability to analyze files, extract hidden data, and piece together digital evidence. They're like being a detective, but instead of fingerprints, you're looking for hidden strings, embedded files, and metadata!

The first rule of forensics: don't modify the evidence! In CTFs, always work on copies. In real investigations this is critical, because evidence that has been altered may not hold up in court.
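
The work-on-copies habit can be sketched as follows. This is a minimal illustration, assuming `sha256sum` from GNU coreutils is available; the file name `evidence.bin`, the `work/` directory, and the generated sample file are placeholders, not part of any real challenge.

```shell
#!/usr/bin/env bash
# Sketch: hash the original, work on a copy, and confirm the original is unchanged.
# Assumes sha256sum (GNU coreutils); all file names are placeholders.
set -eu

workdir=$(mktemp -d)
printf 'sample evidence\n' > "$workdir/evidence.bin"   # stand-in for the challenge file

# Record the original's hash before touching anything.
orig_hash=$(sha256sum "$workdir/evidence.bin" | cut -d' ' -f1)

# Make a working copy, then protect the original from accidental writes.
mkdir -p "$workdir/work"
cp "$workdir/evidence.bin" "$workdir/work/evidence.bin"
chmod a-w "$workdir/evidence.bin"

# ... analyze and modify only the copy ...
printf 'scratch edit\n' >> "$workdir/work/evidence.bin"

# Verify the original still matches the recorded hash.
after_hash=$(sha256sum "$workdir/evidence.bin" | cut -d' ' -f1)
if [ "$orig_hash" = "$after_hash" ]; then
  echo "original unchanged"
else
  echo "original was modified" >&2
fi
```

Recording the hash first means any accidental change to the original is detectable later, even if the write protection is bypassed.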

First Steps with Any File

bash
# 1. Identify the file type
file mystery_file
# Output: PNG image, ELF executable, etc.
# Don't trust the extension!

# 2. Check file signature (magic bytes)
xxd mystery_file | head -5
# JPEG: FF D8 FF
# PNG: 89 50 4E 47 0D 0A 1A 0A
# GIF: 47 49 46 38
# PDF: 25 50 44 46 (%PDF)
# ZIP: 50 4B 03 04 (PK)
# ELF: 7F 45 4C 46

# 3. Extract strings
strings mystery_file | head -50
strings -n 10 mystery_file  # Min 10 chars

# 4. Search for flags
strings mystery_file | grep -i flag
# Match PREFIX{...}; digits and underscores are allowed in the prefix, -o prints only the match
strings mystery_file | grep -oE "[A-Za-z0-9_]+\{[^}]+\}"

Always Run These

file, strings, and xxd should be your automatic first three commands for ANY forensics challenge. They often reveal the flag directly!
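
The flag search can also run directly on the binary, without going through `strings`. The sketch below assumes a `PREFIX{...}` flag format and a `grep` that supports `-a` and `-o` (GNU grep does); `find_flags` and the generated sample file are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: search a binary file for PREFIX{...}-style flags.
# find_flags is a hypothetical helper name; the flag format is an assumption.

find_flags() {
  # -a treats binary input as text, -o prints only the matching part,
  # LC_ALL=C makes grep compare raw bytes regardless of the locale.
  LC_ALL=C grep -aoE '[A-Za-z0-9_]+\{[^}]+\}' "$1"
}

# Demonstration on a small file with non-printable bytes around the flag.
sample=$(mktemp)
printf '\000\001junk\377picoCTF{h1dden_fl4g}\000more' > "$sample"
find_flags "$sample"
```

Because `-o` prints only the matched text, the surrounding binary bytes never reach the terminal.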

Metadata Extraction

bash
# exiftool - The metadata Swiss army knife
exiftool image.jpg
# Shows: camera model, GPS coordinates, creation date, software used

# Specific fields
exiftool -GPSPosition image.jpg
exiftool -Comment image.jpg  # Often contains hidden messages!
exiftool -Author document.pdf

# Remove all metadata (for privacy)
# Rewrites image.jpg; exiftool keeps the original as image.jpg_original
exiftool -all= image.jpg

# PDF metadata
exiftool document.pdf
pdfinfo document.pdf

# Office documents (DOCX, XLSX are ZIP files!)
unzip document.docx -d extracted/
cat extracted/docProps/core.xml

EXIF data in images can reveal amazing things: GPS coordinates where a photo was taken, the camera/phone model, even the owner's name!
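
Once a DOCX has been unzipped, individual fields can be pulled out of `docProps/core.xml` with standard text tools. This is a rough sketch rather than a real XML parser: `xml_field` is a hypothetical helper, and it assumes each element sits on one line with no nested tags. The sample file below is a hand-written stand-in for a real `core.xml`.

```shell
#!/usr/bin/env bash
# Sketch: pull one field out of an unzipped DOCX's docProps/core.xml.
# xml_field is a hypothetical helper; it assumes the element is on a single
# line and contains no nested tags.

xml_field() {
  local tag=$1 file=$2
  # Keep only <tag ...>value</tag>, then strip the surrounding tags.
  grep -oE "<${tag}[^>]*>[^<]*</${tag}>" "$file" | sed -E 's/<[^>]+>//g'
}

# Demonstration with a tiny stand-in for extracted/docProps/core.xml.
demo=$(mktemp)
printf '%s' '<cp:coreProperties><dc:creator>alice</dc:creator><dc:description>flag{in_the_metadata}</dc:description></cp:coreProperties>' > "$demo"

xml_field dc:creator "$demo"
xml_field dc:description "$demo"
```

For anything beyond a quick look, a proper XML tool is the safer choice, since regular expressions break on multi-line or nested elements.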

Finding Hidden Data with Binwalk

bash
# binwalk scans for embedded files/data signatures

# Analyze file
binwalk suspicious_image.png
# Shows offsets where other files begin

# Extract embedded files
binwalk -e suspicious_image.png
# Creates _suspicious_image.png.extracted/

# Extract ALL detected signatures
binwalk --dd='.*' suspicious_image.png

# Common findings:
# - ZIP files appended to images
# - Embedded executables
# - Hidden file systems
# - Encrypted containers

# Check extracted files
file _suspicious_image.png.extracted/*

bash
# Other extraction tools

# foremost - File carving
foremost -i disk.img -o output/

# scalpel - Faster file carving
# (enable the file types you want in scalpel.conf first)
scalpel -o output/ disk.img

# photorec - Recovery tool (interactive)
photorec disk.img

# dd - Manual extraction at offset
# skip = start offset in bytes, count = length in bytes (both from binwalk's output)
dd if=file.bin of=extracted.zip bs=1 skip=12345 count=6789
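
The manual extraction step can be sketched without binwalk: find the byte offset of a signature, then copy everything from that offset onward. The helper names `sig_offset` and `carve_from` are hypothetical, the sample file is generated for the demonstration, and the sketch assumes a `grep` that supports `-a`, `-b`, and `-o` together (GNU grep does).

```shell
#!/usr/bin/env bash
# Sketch: locate an appended ZIP by its signature and carve it out.
# sig_offset and carve_from are hypothetical helper names.

sig_offset() {
  # Print the 0-based byte offset of the first PK\x03\x04 signature in file $1.
  local sig
  sig=$(printf 'PK\003\004')
  LC_ALL=C grep -aboF "$sig" "$1" | head -n 1 | cut -d: -f1
}

carve_from() {
  # Copy everything from byte offset $2 of file $1 into file $3.
  # tail -c +N counts from 1, hence the +1.
  tail -c "+$(( $2 + 1 ))" "$1" > "$3"
}

# Demonstration: 9 bytes of fake image data followed by a fake ZIP header.
tmp=$(mktemp -d)
printf 'IMAGEDATAPK\003\004payload' > "$tmp/suspicious.bin"

offset=$(sig_offset "$tmp/suspicious.bin")
carve_from "$tmp/suspicious.bin" "$offset" "$tmp/carved.zip"
echo "signature at offset $offset"
```

`tail -c` is used here instead of `dd bs=1` because it avoids copying one byte at a time; `dd` remains the better fit when a fixed `count` is needed.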

Hex Analysis

bash
# Hex editors and viewers

# xxd - Quick hex dump
xxd file.bin | head -20
xxd -p file.bin  # Plain hex only

# hexdump
hexdump -C file.bin | head -20

# Online: hexed.it
# GUI: ghex, bless, HxD (Windows)

# Common things to look for:
# - Readable strings in hex
# - File signatures at unusual offsets
# - Patterns (repeated bytes, zeros)
# - Corrupted magic bytes (fix to open file)

# Fix corrupted file header in place
# (overwrites the first 4 bytes of broken.png, so run it on a copy)
printf '\x89\x50\x4e\x47' | dd of=broken.png bs=1 conv=notrunc

Fixing Corrupted Files

bash
# PNG repair
pngcheck -v image.png  # Check for errors

# If the 8-byte header is wrong, replace it:
# Correct PNG header: 89 50 4E 47 0D 0A 1A 0A
printf '\x89\x50\x4E\x47\x0D\x0A\x1A\x0A' > header.bin
dd if=broken.png of=body.bin bs=1 skip=8  # Everything after the bad header
cat header.bin body.bin > fixed.png

# ZIP repair
zip -FF broken.zip --out fixed.zip

# JPEG repair
# Often missing FF D8 FF at start
# Prepending only works if the bytes were removed; if they were overwritten, replace them instead
printf '\xFF\xD8\xFF\xE0' > header.bin
cat header.bin broken.jpg > fixed.jpg

# PDF repair
# Should start with %PDF-
printf '%%PDF-1.4\n' > header.bin
cat header.bin broken.pdf > fixed.pdf
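
The PNG case can be wrapped into a small helper that repairs a copy and leaves the damaged file untouched. This is a sketch under one assumption: the signature bytes were overwritten, not deleted, so the rest of the file is already at the right offsets. `fix_png_header` and `first_bytes_hex` are hypothetical names, and the "broken" file is generated for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: restore the 8-byte PNG signature on a copy of a damaged file.
# fix_png_header is a hypothetical helper; it assumes the signature bytes were
# overwritten (not deleted), so the rest of the file is already in place.

fix_png_header() {
  local src=$1 dst=$2
  cp "$src" "$dst"
  # Octal escapes are used because every POSIX printf understands them.
  # Bytes written: 89 50 4E 47 0D 0A 1A 0A
  printf '\211PNG\r\n\032\n' | dd of="$dst" bs=1 conv=notrunc 2>/dev/null
}

first_bytes_hex() {
  # Print the first $2 bytes of file $1 as lowercase hex without spaces.
  head -c "$2" "$1" | od -An -tx1 | tr -d ' \n'
}

# Demonstration: a fake PNG whose first four bytes were replaced with "XXXX".
tmp=$(mktemp -d)
printf 'XXXX\r\n\032\nIHDR-and-the-rest' > "$tmp/broken.png"

fix_png_header "$tmp/broken.png" "$tmp/fixed.png"
first_bytes_hex "$tmp/fixed.png" 8
```

`conv=notrunc` is what keeps `dd` from truncating the file after the eight bytes it writes.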

File Signature Reference

Keep a list of common file signatures handy:
  • JPEG: FF D8 FF
  • PNG: 89 50 4E 47 0D 0A 1A 0A
  • GIF: 47 49 46 38 (GIF8)
  • PDF: 25 50 44 46 (%PDF)
  • ZIP/DOCX: 50 4B 03 04 (PK)
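
The signature list above can be turned into a small lookup. The sketch below is not a replacement for `file`; `magic_type` is a hypothetical helper that knows only the five signatures listed here, and the sample file is generated for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: classify a file by its leading bytes using the signatures listed above.
# magic_type is a hypothetical helper, not a replacement for `file`.

magic_type() {
  local hex
  # First 8 bytes as uppercase hex with no separators, e.g. 89504E470D0A1A0A.
  hex=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n' | tr 'a-f' 'A-F')
  case $hex in
    FFD8FF*)          echo "JPEG" ;;
    89504E470D0A1A0A) echo "PNG" ;;
    47494638*)        echo "GIF" ;;
    25504446*)        echo "PDF" ;;
    504B0304*)        echo "ZIP/DOCX" ;;
    *)                echo "unknown" ;;
  esac
}

# Demonstration: a file named .jpg that actually starts with a PDF header.
tmp=$(mktemp -d)
printf '%%PDF-1.4\n' > "$tmp/holiday.jpg"
magic_type "$tmp/holiday.jpg"
```

The demonstration file has a `.jpg` name but a PDF header, which is the kind of mismatch that checking magic bytes is meant to catch.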

Archive Analysis

bash
# ZIP files

# List contents
unzip -l archive.zip

# Extract
unzip archive.zip -d output/

# Password-protected ZIP
zip2john archive.zip > hash.txt
john hash.txt --wordlist=rockyou.txt

# Or with fcrackzip
fcrackzip -v -u -D -p rockyou.txt archive.zip

# Check for a zip bomb before extracting:
# compare compressed vs. uncompressed sizes and look for nested archives
zipinfo archive.zip

# RAR files
unrar x archive.rar
rar2john archive.rar > hash.txt
john hash.txt --wordlist=rockyou.txt

# 7z files
7z x archive.7z
# Cracking 7z passwords is much slower: the format uses AES-256 with a deliberately slow key derivation

Forensics Checklist

□ file - Identify file type
□ strings - Extract readable text
□ xxd | head - Check magic bytes
□ exiftool - Extract metadata
□ binwalk - Find embedded files
□ grep for flag patterns
□ Check if file is archive (try unzip/unrar)
□ If image: check for stego
□ If corrupted: fix header bytes
□ If encrypted: try common passwords
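
The first few checklist items can be chained into one helper. This is only a sketch: `triage` is a hypothetical name, the flag regex assumes a `PREFIX{...}` format, and tools that are not installed are skipped instead of causing a failure.

```shell
#!/usr/bin/env bash
# Sketch: run the checklist's first steps, skipping tools that are not installed.
# triage is a hypothetical helper; the flag regex assumes a PREFIX{...} format.

triage() {
  local f=$1 tool
  for tool in file exiftool binwalk; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "== $tool =="
      "$tool" "$f"
    else
      echo "== $tool: not installed, skipping =="
    fi
  done

  echo "== magic bytes =="
  head -c 16 "$f" | od -An -tx1

  echo "== flag pattern =="
  LC_ALL=C grep -aoE '[A-Za-z0-9_]+\{[^}]+\}' "$f" || echo "(no match)"
}

# Demonstration on a small generated file.
sample=$(mktemp)
printf 'GIF89a\000\000ctf{triage_works}\000' > "$sample"
triage "$sample"
```

The remaining checklist items (stego, header repair, password guessing) depend on what this first pass turns up, so they are left as manual steps.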

Key Takeaways

  • Always run: file, strings, xxd, exiftool, binwalk
  • Don't trust file extensions - check magic bytes
  • Metadata fields (comments, author, GPS) are a common hiding place for flags and hints
  • Binwalk finds files embedded inside other files
  • Corrupted files can often be fixed by repairing headers
  • Modern Office documents (DOCX, XLSX) are ZIP archives in disguise