Office 97-2003 RC4 — Hashcat Modes 9700 / 9800

TL;DR — Microsoft Office 97 through 2003 (.doc, .xls, .ppt formats) used a Standard Encryption Header with 40-bit RC4. The 40-bit key length predates the lifting of US export restrictions in 2000, so the entire keyspace is finite enough that recovery is effectively guaranteed for legitimate owners — duration is the only variable.

What Office 97-2003 actually encrypted

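Office 97-2003 password protection in Word, Excel, and PowerPoint operates on files stored in a Compound File Binary Format (CFB, also called OLE2) container. The encryption metadata (salt, verifier, and verifier hash) lives inside the application's own streams rather than in a dedicated stream: the table stream for Word, the FilePass record in the workbook stream for Excel, and the document stream for PowerPoint. A separate 'EncryptionInfo' stream is characteristic of password-protected Office 2007+ files, not of this generation.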

The encryption itself is RC4 with a 40-bit key by default. Microsoft offered alternative cipher choices through Crypto API providers (Microsoft Base Cryptographic Provider, Microsoft Strong Cryptographic Provider, Microsoft Enhanced Cryptographic Provider), but the default was 40-bit RC4 for compatibility. Files saved with a longer CryptoAPI key length are not covered by the 40-bit keyspace argument in this article.

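Hashcat distinguishes two modes for this generation: 9700 covers the MD5-based key derivation (oldoffice types 0 and 1), the original scheme and the default for Word and Excel (.doc, .xls); 9800 covers the SHA-1-based CryptoAPI derivation (oldoffice types 3 and 4), which is what PowerPoint (.ppt) uses and what Word and Excel use when a CryptoAPI provider is selected. In the default 40-bit configuration the cipher and key length are identical; only the hash function in the KDF differs.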

  • File extensions: .doc, .xls, .ppt (Office 97-2003 binary)
  • Cipher: RC4 with 40-bit key
  • 9700: MD5-based KDF (oldoffice types 0 and 1), Word/Excel default
  • 9800: SHA-1-based KDF (oldoffice types 3 and 4), PowerPoint and CryptoAPI-protected Word/Excel
  • Container: OLE2 / Compound File Binary (CFB)

Why 40-bit Office RC4 is recoverable

Forty bits = 2^40 ≈ 1.1 trillion possible keys. By 2026 hardware standards, this entire keyspace is exhaustively searchable on modern GPU clusters within tractable timeframes. Recovery doesn't require guessing the password — it can target the cipher key directly.
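The keyspace arithmetic above can be checked in a few lines. This is a minimal sketch: the key-testing rates are hypothetical values chosen only to show how the worst-case duration scales, not measured benchmarks of any hardware.

```python
KEY_BITS = 40
keyspace = 2 ** KEY_BITS  # 1,099,511,627,776 candidate keys


def worst_case_hours(keys_per_second: float) -> float:
    """Hours needed to try every 40-bit key at a sustained rate."""
    return keyspace / keys_per_second / 3600


# Hypothetical sustained rates (assumptions, not benchmarks).
for rate in (1e8, 1e9, 1e10):
    print(f"{rate:.0e} keys/s -> {worst_case_hours(rate):.2f} h worst case")
```

On average a search finds the key after covering about half the keyspace, so the expected duration is roughly half the worst case.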

Once the cipher key is known, the document body decrypts to plaintext regardless of how complex the original password was. A 4-character password and a 40-character password both map to keys in the same finite 2^40 space, so the longer password adds no work for a key search. This is the unusual property that makes recovery effectively guaranteed for the 40-bit cases of modes 9700/9800.
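To illustrate why password length drops out, here is a deliberately simplified toy model. It is not the real Office key derivation, which has more steps; the function name toy_40bit_key and the all-zero salt are assumptions made for the example. The only point is that any derivation which truncates to 5 bytes can never produce more than 2^40 distinct keys.

```python
import hashlib


def toy_40bit_key(password: str, salt: bytes) -> bytes:
    """Toy model only: hash the password, mix in the salt, keep 5 bytes."""
    h0 = hashlib.md5(password.encode("utf-16-le")).digest()
    h1 = hashlib.md5(h0[:5] + salt).digest()
    return h1[:5]  # 5 bytes = 40 bits, whatever the password length


salt = bytes(16)  # placeholder salt (all zeros), for illustration only
short_key = toy_40bit_key("abcd", salt)
long_key = toy_40bit_key("a" * 40, salt)

# Both keys are 40-bit values drawn from the same keyspace.
print(short_key.hex(), long_key.hex())
```

The two passwords give different keys, but both keys are points in the same 2^40 space, so an exhaustive key search covers them equally.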

After the key is recovered, the file can be decrypted and re-saved without password protection while preserving content, formulas, charts, embedded objects, and metadata. The decrypted content is the same as what an authenticated reader would see after entering the password.

How to identify Office 97-2003 files

The file extension is a strong signal: .doc (Word), .xls (Excel), .ppt (PowerPoint) — without the trailing 'x' that newer Office formats use. Internally, the OLE2 container starts with the magic bytes D0 CF 11 E0 A1 B1 1A E1.
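The magic-byte check is easy to script. A minimal sketch, with helper names (looks_like_ole2, file_looks_like_ole2) invented for this example:

```python
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole2(header: bytes) -> bool:
    """True if the first 8 bytes match the OLE2 / CFB signature."""
    return header[:8] == OLE2_MAGIC


def file_looks_like_ole2(path: str) -> bool:
    """Read only the first 8 bytes of a file and test the signature."""
    with open(path, "rb") as fh:
        return looks_like_ole2(fh.read(8))
```

A match only identifies the container. It does not say whether the file is password-protected, or which application wrote it.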

An OLE2 inspection tool (oletools, olebrowse, oledump.py, 7-Zip 'Open archive') reveals the stream structure. In protected legacy files the encryption header sits inside the application's own streams; a separate EncryptionInfo stream indicates a password-protected Office 2007+ file instead. office2john (from John the Ripper) extracts the hash without requiring the password.

The hash format uses the $oldoffice$ prefix followed by a numeric type identifier. Types 0 and 1 use the MD5-based derivation with 40-bit RC4 (mode 9700). Types 3 and 4 use the SHA-1-based CryptoAPI derivation (mode 9800): type 3 is 40-bit RC4, while type 4 uses a 128-bit RC4 key and so falls outside the 40-bit keyspace argument. Office 2007+ hashes use a different prefix, $office$, and different modes entirely (9400, 9500, 9600).
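Picking the Hashcat mode from an extracted hash can be sketched as below. The mapping follows Hashcat's published mode list, where types 0 and 1 belong to mode 9700 and types 3 and 4 to mode 9800. The function name hashcat_mode_for and the placeholder hash are assumptions for the example, and the function expects the bare hash without the filename prefix that office2john prepends.

```python
OLDOFFICE_MODES = {0: 9700, 1: 9700, 3: 9800, 4: 9800}


def hashcat_mode_for(hash_line: str) -> int:
    """Return the Hashcat mode for a bare $oldoffice$ hash string."""
    prefix = "$oldoffice$"
    if not hash_line.startswith(prefix):
        raise ValueError("not an oldoffice hash")
    # The type identifier is the field before the first '*'.
    type_field = hash_line[len(prefix):].split("*", 1)[0]
    try:
        return OLDOFFICE_MODES[int(type_field)]
    except (ValueError, KeyError):
        raise ValueError(f"unsupported oldoffice type: {type_field!r}")


# Placeholder hash with made-up all-zero fields, for illustration only.
example = "$oldoffice$1*" + "*".join(["00" * 16] * 3)
print(hashcat_mode_for(example))
```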

Why these files still appear in 2026

Office 2007 introduced the OOXML formats (.docx, .xlsx, .pptx) with stronger encryption, but the legacy binary formats remained fully supported in every Office version through Microsoft 365. Many enterprise document management systems standardised on .doc/.xls during the 2000s and have kept the format internally for compatibility.

Common modern sources: long-archived legal disclosure files, mid-2000s financial templates, government documents that predate the Office 2007 rollout, and files exported from accounting systems that still default to .doc/.xls for compatibility.

When organisations migrate document archives, the original encryption is preserved by default — not upgraded. So a memo encrypted in 2003 with Office XP is still 40-bit RC4 today, even after several round-trips through modern systems.

Office 97-2003 vs OOXML side-by-side

Office 2007 was a complete rewrite of the file format and encryption. OOXML files (.docx, .xlsx, .pptx) are ZIP containers with XML inside; when password-protected, the package is encrypted with AES (128-bit by default in Office 2007 and 2010, 256-bit by default from Office 2013) and wrapped in an OLE2 container. The Hashcat modes for the new family are 9400 (Office 2007), 9500 (Office 2010), and 9600 (Office 2013+), all categorically harder than 9700/9800.
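The "ZIP container with XML inside" structure can be checked with Python's standard zipfile module. A minimal sketch, assuming an unencrypted package; the helper name looks_like_ooxml and the tiny in-memory stand-in package are invented for the example.

```python
import io
import zipfile


def looks_like_ooxml(data: bytes) -> bool:
    """True if data is a ZIP archive containing [Content_Types].xml."""
    buf = io.BytesIO(data)
    if not zipfile.is_zipfile(buf):
        return False
    with zipfile.ZipFile(buf) as zf:
        return "[Content_Types].xml" in zf.namelist()


# Build a tiny stand-in package in memory (not a real document).
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as zf:
    zf.writestr("[Content_Types].xml", "<Types/>")

print(looks_like_ooxml(sample.getvalue()))
```

A password-protected OOXML file is stored inside an OLE2 wrapper rather than as a bare ZIP, so this check returns False for it even though the extension is .docx, .xlsx, or .pptx.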

If you have a choice between distributing an .xls and an .xlsx and confidentiality matters, .xlsx is materially stronger. If you're recovering an old document, the .xls/.doc/.ppt path is the favourable case.

Practical recovery flow

Standard flow for recovering a mode 9700/9800 file: (1) drop the file into a browser-based analyser to confirm format and version — this happens client-side, file content never leaves the browser; (2) run a free check against fast attack techniques — these complete quickly and identify whether the password is in any common pattern; (3) for documents the free check doesn't crack, the cipher-key search runs on GPU clusters with bounded duration.

The honest expectation: 40-bit RC4 documents have a near-100% recovery rate for legitimate owners. The variability is duration, not outcome. Some are recovered in minutes; others take hours of GPU time.

Frequently Asked Questions

Is recovering my own Office 97-2003 file legal?
Recovering a password to a file you own or are authorised to access is fully legal in every Tier 1 jurisdiction. Document ownership and authorisation must be confirmed before paid recovery proceeds.
How long does mode 9700/9800 typically take?
We don't publish specific timing because it depends on the queue and machine availability. The relevant point: the keyspace is finite, so recovery is bounded — duration varies, outcome doesn't.
Will the recovered file be identical to the original?
Yes. Content, formulas, embedded objects, charts, and metadata are preserved. The unlocked file contains the same data an authenticated reader would see after entering the password.
Why doesn't password length matter for mode 9700?
The KDF always produces a 40-bit key regardless of password length. A 6-character password and a 60-character password yield keys from the same 2^40 space. Recovery operates on the keyspace, not the password directly.
How is mode 9700 different from mode 9800?
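Mode 9700 covers the MD5-based KDF (oldoffice types 0 and 1), the Word/Excel default. Mode 9800 covers the SHA-1-based CryptoAPI KDF (oldoffice types 3 and 4), used by PowerPoint and by Word/Excel files saved with a CryptoAPI provider. In the 40-bit case the cipher and key length are identical; only the hash function differs.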
Can I do this myself with open-source tools?
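Yes. Hashcat (modes 9700 and 9800 for password attacks, plus the collider modes 9710/9720 and 9810/9820 that search the 40-bit key directly) and John the Ripper, with its office2john extraction script, handle this format. Time and electricity are the constraints: a single consumer GPU may take hours to days; managed services run on multi-GPU clusters.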
What if the file is partially corrupted?
The OLE2 container needs to be parseable for any recovery to apply. If the file is structurally corrupted, repair must come first.
Does VBA project password affect this?
VBA project password is separate from document open password — it protects macro source code, not document content. VBA passwords have their own protection layer that's typically much weaker than the document password.

Have a file in this category?

Start with a free analysis. The encryption type is detected in your browser, then a free check runs through fast techniques before any paid attempt. You only pay if a recovery actually works.