Design Decisions
This document records the key architectural decisions made during the design of the Windows File System Minifilter project, the alternatives considered, and the rationale for each choice.
Related: System Overview · Driver Architecture · ML Classifier Module
Decision Map
flowchart TB
subgraph Decisions["Key Design Decisions"]
D1["DD-1: Minifilter\nvs Legacy Filter"]
D2["DD-2: Three-Process\nArchitecture"]
D3["DD-3: Filter Port\nvs DeviceIoControl"]
D4["DD-4: Named Pipe\nvs Shared Memory"]
D5["DD-5: Entropy-Based\nClassification"]
D6["DD-6: Extension-Only\nFiltering"]
D7["DD-7: Altitude 47777"]
D8["DD-8: Stream Context\nfor Delete Tracking"]
end
D1 --> D3
D2 --> D4
D3 --> D2
D5 --> D6
style Decisions fill:#1a1a2e,color:#fff
DD-1: Minifilter API vs Legacy File System Filter Driver
|  | Minifilter (Chosen ✅) | Legacy Filter Driver |
|---|---|---|
| API | FltRegisterFilter, high-level callbacks | IoAttachDeviceToDeviceStack, raw IRP handling |
| Complexity | Medium: Filter Manager handles stack ordering | High: must manually manage attachment and IRP forwarding |
| Stability | High: Microsoft-maintained infrastructure | Low: easy to crash the system with incorrect IRP handling |
| Future-proof | Recommended by Microsoft since Vista | Deprecated; no new features |
| Altitude system | Built-in ordering and isolation | Manual stack management |
Rationale: The minifilter framework abstracts away most of the complexity of file system filter development. Filter Manager handles device attachment, IRP routing, and teardown. This dramatically reduces the risk of blue-screen crashes and simplifies maintenance.
DD-2: Three-Process Architecture (Driver → Monitor → Scanner)
flowchart LR
subgraph Chosen["Chosen: Three-Process ✅"]
D1["Driver"] --> M1["Monitor"] --> S1["Scanner"]
end
subgraph Alt["Alternative: Two-Process"]
D2["Driver"] --> MS["Monitor + Scanner\n(combined)"]
end
style Chosen fill:#2d6a4f,color:#fff
style Alt fill:#6c757d,color:#fff
Rationale:
- Separation of concerns: The monitor is a thin relay that must be highly reliable (connected to the kernel port). The scanner performs expensive PE parsing that may crash on malformed binaries. Isolating them prevents a scanner crash from disconnecting the kernel communication.
- Independent restarts: The scanner can be restarted without dropping the kernel connection, because the monitor holds that connection.
- SEH isolation: The scanner wraps PE parsing in __try/__except (SEH). A crash in the scanner process does not affect the monitor or driver.
Trade-off: Adds IPC overhead (named pipe) and operational complexity (two user-mode processes to manage).
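The fault-isolation argument can be illustrated with a small user-mode sketch. This is not the project's monitor code: it is written in Python for brevity, `run_scanner` and `supervise` are hypothetical names, and the "scanner" is a throwaway child process. It only shows that a child process can fail while the parent keeps running and decides whether to restart it.

```python
import subprocess
import sys


def run_scanner(child_code):
    """Run a stand-in scanner in its own process and return its exit code."""
    return subprocess.run([sys.executable, "-c", child_code]).returncode


def supervise(child_code, max_restarts=2):
    """Start the stand-in scanner, restarting it after a non-zero exit.

    Returns the list of exit codes observed. The supervising process (the
    role the monitor plays in this design) is unaffected by child failures.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        code = run_scanner(child_code)
        exit_codes.append(code)
        if code == 0:
            break  # the child finished cleanly; no restart needed
    return exit_codes
```

In a two-process design the same failure would take down the code that owns the kernel port, which is the scenario this decision avoids.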
DD-3: Filter Communication Port vs DeviceIoControl
|  | Filter Comm Port (Chosen ✅) | DeviceIoControl |
|---|---|---|
| Direction | Kernel → User push model | User → Kernel pull model |
| Latency | Low: driver pushes immediately | Higher: user must poll |
| Complexity | FltSendMessage / FilterGetMessage | DeviceIoControl + IOCTL codes |
| Security | Built-in security descriptor | Must implement manually |
Rationale: The filter communication port provides a natural push model: the driver sends messages as events occur. A plain DeviceIoControl design would require the monitor to poll, adding latency and CPU overhead. The filter port also integrates cleanly with the minifilter framework and supports connection/disconnection callbacks for lifecycle management.
DD-4: Named Pipe vs Shared Memory (Monitor → Scanner)
|  | Named Pipe (Chosen ✅) | Shared Memory + Events |
|---|---|---|
| Complexity | Low: message-oriented, built-in framing | High: manual synchronization, ring buffers |
| Message boundaries | Automatic (PIPE_TYPE_MESSAGE) | Manual framing required |
| Cross-process | Built-in | Requires CreateFileMapping + named events |
| Throughput | ~100K msgs/sec (sufficient) | Higher (millions/sec) |
| Debugging | Easy: PipeList, handle inspection | Difficult: memory dumps |
Rationale: Named pipes provide message-oriented IPC with built-in framing and are trivial to implement. The scan workload is I/O-bound (reading PE files from disk), so pipe throughput is not the bottleneck. Shared memory would be over-engineering for this use case.
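To make the "manual framing" cost concrete, the sketch below shows the kind of length-prefix framing a byte-oriented transport (such as a shared-memory ring buffer) would need, and which a message-mode pipe makes unnecessary. The 4-byte little-endian header and the names `frame` and `unframe` are assumptions chosen for illustration, not part of this project.

```python
import struct

# Illustrative header: 4-byte little-endian payload length.
HEADER = struct.Struct("<I")


def frame(payload):
    """Prefix a payload with its length so a reader can find its boundary."""
    return HEADER.pack(len(payload)) + payload


def unframe(buffer):
    """Split a byte stream into complete messages plus leftover bytes.

    Incomplete trailing data (a partial header or partial payload) is
    returned untouched so the caller can retry once more bytes arrive.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        messages.append(buffer[offset + HEADER.size:end])
        offset = end
    return messages, buffer[offset:]
```

With PIPE_TYPE_MESSAGE the pipe preserves message boundaries itself, so none of this bookkeeping has to be written or tested.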
DD-5: Entropy-Based Heuristic Classification
flowchart LR
subgraph Chosen["Chosen: Entropy Heuristic ✅"]
E["Section Entropy\n> 6.99"]
I["Import Count\n> 10"]
E --> V{"Both true?"}
I --> V
V -->|Yes| MAL["MALICIOUS"]
V -->|No| CLN["CLEAN"]
end
style MAL fill:#e63946,color:#fff
style CLN fill:#2d6a4f,color:#fff
Rationale:
- Packed/encrypted malware exhibits high entropy (> 7.0 bits per byte, out of a maximum of 8.0) in code sections because compression and encryption produce near-random byte distributions.
- Legitimate software typically has structured code sections with lower entropy (5.0–6.5) and uses many standard imports.
- This heuristic is deliberately simple to serve as a baseline classifier that can be extended with additional features (opcode analysis, API call patterns) without restructuring the pipeline.
Trade-off: High false-positive rate on legitimately packed software (UPX, Themida) and high-entropy data sections. The threshold (6.99) was chosen as an aggressive starting point; see Adding Detection Rules for tuning guidance.
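A minimal sketch of the rule in the diagram, assuming entropy is the Shannon entropy of a section's raw bytes measured in bits per byte. The function names are hypothetical and PE parsing is left out: the caller is assumed to supply the section bytes and the import count.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 6.99  # threshold quoted in this document
IMPORT_THRESHOLD = 10     # import-count threshold from the diagram


def shannon_entropy(data):
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def classify(section_bytes, import_count):
    """Flag a file only when both conditions from the diagram hold."""
    high_entropy = shannon_entropy(section_bytes) > ENTROPY_THRESHOLD
    many_imports = import_count > IMPORT_THRESHOLD
    return "MALICIOUS" if high_entropy and many_imports else "CLEAN"
```

Keeping both thresholds as named constants is one way to make the tuning described in Adding Detection Rules a one-line change.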
DD-6: Extension-Based Filtering (.exe / .dll Only)
Rationale:
- Focusing on PE executables reduces noise by orders of magnitude, because most file I/O on Windows involves documents, logs, and temporary files.
- .exe and .dll are the primary vectors for malware execution on Windows.
- The check is performed at the kernel level (IsTargetExtension()) to avoid sending irrelevant messages across the kernel/user boundary, saving IPC overhead.
Trade-off: Misses malware using non-standard extensions (.scr, .cpl, .ocx). This is an intentional scope limitation for the current version. The function can be trivially extended; see Adding Detection Rules.
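The filtering logic can be sketched as follows. This is a user-mode Python illustration of the idea, not the kernel implementation of IsTargetExtension(): the name `is_target_extension`, the case-insensitive comparison, and the extension tuple are assumptions.

```python
TARGET_EXTENSIONS = (".exe", ".dll")


def is_target_extension(path, extensions=TARGET_EXTENSIONS):
    """Case-insensitive check of the extension of the final path component."""
    # Look only at the file name so a dot in a directory name cannot match.
    name = path.replace("/", "\\").rsplit("\\", 1)[-1]
    dot = name.rfind(".")
    if dot == -1:
        return False  # no extension at all
    return name[dot:].lower() in extensions
```

Extending the scope to the extensions named in the trade-off would then be a matter of passing a longer tuple, for example `is_target_extension(path, (".exe", ".dll", ".scr"))`.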
DD-7: Altitude 47777 (FSFilter Activity Monitor)
| Altitude Range | Group | Our Position |
|---|---|---|
| 320000–329999 | FSFilter Anti-Virus | - |
| 140000–149999 | FSFilter Encryption | - |
| 40000–49999 | FSFilter Bottom | 47777 ✅ |
Rationale: Altitude 47777 falls in the FSFilter Bottom range (40000–49999), below the anti-virus and encryption groups. Microsoft's FSFilter Activity Monitor load order group is allocated 360000–389999, so "activity monitor" here describes the driver's passive role rather than its altitude range. Since this driver never blocks, modifies, or redirects any I/O operation, a low altitude is appropriate: its pre-operation callbacks run after higher-altitude filters have processed a request, providing a more accurate view of what actually reaches the file system.
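The ordering effect of altitude can be shown with a short sketch. Filter Manager calls pre-operation callbacks from the highest altitude to the lowest and post-operation callbacks in the reverse order. The filter names and the two non-47777 altitudes below are made up for illustration.

```python
def pre_operation_order(filters):
    """Filter names in pre-operation call order: highest altitude first."""
    return sorted(filters, key=lambda name: filters[name], reverse=True)


def post_operation_order(filters):
    """Post-operation callbacks run in the reverse of pre-operation order."""
    return list(reversed(pre_operation_order(filters)))


# Hypothetical set of loaded filters (name -> altitude).
LOADED = {
    "example_av": 325000,
    "example_encryption": 145000,
    "this_driver": 47777,
}
```

Here `pre_operation_order(LOADED)` places `this_driver` last, which matches the intent of observing requests after the other filters have acted on them.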
DD-8: Stream Context for Delete Tracking
Problem: File deletion in Windows is not a single atomic operation. It can occur via:
- FILE_DELETE_ON_CLOSE flag on CreateFile
- FileDispositionInformation / FileDispositionInformationEx via SetFileInformation
- Actual deletion confirmed only during IRP_MJ_CLEANUP
Solution: A FLT_STREAM_CONTEXT tracks the delete disposition state across these IRPs. The context is created on the first delete-related operation and persists until the file handle is cleaned up.
flowchart LR
C["IRP_MJ_CREATE\n(DELETE_ON_CLOSE)"] --> CTX["Stream Context\nCreated"]
S["IRP_MJ_SET_INFO\n(SetDisposition)"] --> CTX
CTX --> CL["IRP_MJ_CLEANUP"]
CL --> Q["FltQueryInformationFile"]
Q -->|STATUS_FILE_DELETED| N["Notify User Mode\n(once only)"]
style CTX fill:#e07a5f,color:#fff
style N fill:#2d6a4f,color:#fff
Rationale: This pattern is directly derived from the Microsoft Delete File Detection Sample. The InterlockedIncrement/InterlockedDecrement pattern on NumOps handles the race condition where multiple threads issue concurrent SetDisposition calls on the same file. The IsNotified flag ensures exactly-once notification semantics.
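The bookkeeping can be modelled in user mode to show how the counter and the flag interact. This is a simplified Python sketch, not the driver's code: a lock stands in for the interlocked operations, the class and method names are hypothetical, and it assumes that an operation which detects a race skips its post-operation step, leaving the counter above zero so that cleanup still checks for deletion.

```python
import threading


class DeleteTracker:
    """Simplified stand-in for the per-stream delete-tracking context."""

    def __init__(self, delete_on_close=False):
        self._lock = threading.Lock()  # stands in for Interlocked* calls
        self.num_ops = 0               # in-flight SetDisposition operations
        self.set_disp = False          # last disposition set without a race
        self.delete_on_close = delete_on_close
        self.is_notified = False

    def pre_set_disposition(self):
        """Count the operation; return True if it should run a post step."""
        with self._lock:
            self.num_ops += 1
            return self.num_ops == 1   # False means a race was detected

    def post_set_disposition(self, delete_file):
        """Record the outcome of the only in-flight operation."""
        with self._lock:
            self.set_disp = delete_file
            self.num_ops -= 1

    def on_cleanup(self, file_is_deleted):
        """Return True exactly once, when a tracked delete is confirmed."""
        with self._lock:
            candidate = (self.num_ops > 0 or self.set_disp
                         or self.delete_on_close)
            if candidate and file_is_deleted and not self.is_notified:
                self.is_notified = True
                return True
            return False
```

A second `on_cleanup(True)` call on the same tracker returns False, which is the exactly-once behaviour the IsNotified flag is meant to provide.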
Summary Table
| ID | Decision | Key Driver | Risk |
|---|---|---|---|
| DD-1 | Minifilter API | Stability, simplicity | Lock-in to Filter Manager |
| DD-2 | Three-process architecture | Fault isolation | Operational complexity |
| DD-3 | Filter communication port | Push model, low latency | Single connection limit |
| DD-4 | Named pipe for IPC | Simplicity, message framing | Throughput ceiling |
| DD-5 | Entropy-based classification | Catches packed malware | False positives on packed legitimate software |
| DD-6 | Extension filtering | Noise reduction | Misses non-standard PE extensions |
| DD-7 | Altitude 47777 | Passive monitoring | Sees post-filter I/O only |
| DD-8 | Stream context delete tracking | Correctness | Memory overhead per tracked file |