Air-Gapped AI: Document Processing Without Internet
Some environments can't have internet access. Others just don't want it. Here's how air-gapped and offline AI document processing works, what's possible today, and where the limits are.
When the network is the threat
For most professionals, "no internet" is a temporary inconvenience. For some, it's a security requirement.
Government classified facilities, defense contractors, intelligence agencies, and certain financial trading floors operate in environments where network connectivity is either prohibited or tightly restricted. These aren't theoretical scenarios. SCIF (Sensitive Compartmented Information Facility) environments, air-gapped industrial control systems, and secure document processing rooms are real workplaces where real people handle real documents.
In these environments, cloud AI is not an option -- not because of preference, but because of physics. There is no network connection to the cloud. The AI either runs locally or it doesn't run at all.
But air-gapped AI isn't just for classified environments. It's relevant for anyone who wants the strongest possible guarantee that their documents never leave their device -- attorneys handling trade secret litigation, healthcare organizations processing patient records in high-security wings, financial institutions with regulatory data isolation requirements.
This article covers what's currently possible, what the architecture looks like, and where the boundaries are.
Spectrum of disconnection
"Air-gapped" isn't binary in practice. There's a spectrum:
Fully air-gapped: No network connectivity at all. No Ethernet, no Wi-Fi, no Bluetooth, no cellular. Data enters and leaves only through physical media (USB drives, optical discs) with strict procedures. This is SCIF-level isolation.
Network-isolated: Connected to an internal network but not the internet. Devices can communicate with each other and with local servers, but there's no route to external services. Common in corporate environments with high-security zones.
Intermittently connected: Has internet access sometimes but not always. Field operations, remote locations, aircraft, ships. The AI needs to work during disconnected periods.
Connected but restricted: Has internet access but organizational policy restricts what data can traverse the network. The connection exists but certain documents can't be transmitted over it.
Each level has different implications for AI document processing.
What works offline today
Document processing has two distinct phases: file parsing and AI analysis. Their offline capabilities differ.
File parsing: fully offline
Reading files -- extracting text from PDFs, DOCX, XLSX -- is entirely a local computation. No network needed. docrew's Rust parsers work the same whether you're connected to the internet or sitting in a Faraday cage.
This is a significant capability by itself. Extracting text, identifying document structure, parsing tables, reading metadata -- all of this happens without any network dependency. For workflows that need document reading without AI analysis, the system is fully air-gap compatible.
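To illustrate how little infrastructure file parsing needs, here is a minimal sketch that extracts text from a DOCX file using only the Python standard library. (docrew's actual parsers are written in Rust; this toy version just demonstrates that reading a DOCX -- a zip archive of XML -- is pure local computation.)

```python
import re
import zipfile

def extract_docx_text(path: str) -> str:
    """Extract plain text from a DOCX file with no network access.
    A DOCX is a zip archive; the body lives in word/document.xml."""
    with zipfile.ZipFile(path) as z:
        xml = z.read("word/document.xml").decode("utf-8")
    # Turn paragraph boundaries into newlines, then strip all tags.
    xml = xml.replace("</w:p>", "\n")
    return re.sub(r"<[^>]+>", "", xml)
```

This runs identically on a connected workstation or inside a SCIF: the only I/O is a local file read.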
Code execution: fully offline
When the agent writes Python scripts to process extracted data -- calculate totals, transform formats, generate reports -- that code runs in a local sandbox. No network needed (and the sandbox prevents network access by design, even if connectivity exists).
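A hypothetical example of the kind of script the agent might generate -- totaling invoice amounts already extracted from local files. It operates purely on in-memory data structures, so the sandbox's network denial costs it nothing:

```python
# Hypothetical agent-generated code: summarize invoice rows
# already extracted from local documents. Pure local computation --
# no file or network access required.
invoices = [
    {"vendor": "Acme", "amount": 1200.00},
    {"vendor": "Globex", "amount": 850.50},
    {"vendor": "Acme", "amount": 300.00},
]

totals: dict[str, float] = {}
for row in invoices:
    totals[row["vendor"]] = totals.get(row["vendor"], 0.0) + row["amount"]

print(totals)  # {'Acme': 1500.0, 'Globex': 850.5}
```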
AI analysis: requires network (for now)
The language model inference -- the "smart" part that understands natural language, extracts meaning, makes comparisons -- requires calling a model API. This is where the frontier cloud models live: models with hundreds of billions of parameters running on clusters of GPUs.
This creates a split: file processing and code execution are fully offline, but AI reasoning requires connectivity to a model endpoint.
The local model option
There is an alternative: run a language model locally. Tools like Ollama, llama.cpp, and LM Studio can run open-source models on consumer hardware. Models like Llama 3, Mistral, and Phi run on machines with 16GB+ of RAM.
The trade-off is quality. As of 2026, locally-runnable models are significantly less capable than frontier cloud models for complex document tasks. They handle simple extraction and summarization adequately, but struggle with nuanced contract comparison, multi-document reasoning, and complex financial analysis.
The gap is closing. Each generation of open-source models gets more capable while requiring less hardware. But for professional document processing where accuracy matters, frontier models through an API remain the quality benchmark.
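For teams experimenting with the local route, a tool like Ollama exposes an HTTP API on the loopback interface, so inference traffic never leaves the machine. A sketch of calling its `/api/generate` endpoint with the standard library (the endpoint and payload shape follow Ollama's documented REST API; the model name is whatever you have pulled locally):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # loopback only

def build_payload(text: str, model: str = "llama3") -> dict:
    """Build a non-streaming generation request for a local model."""
    return {
        "model": model,
        "prompt": f"Summarize the following document:\n\n{text}",
        "stream": False,
    }

def summarize_locally(text: str, model: str = "llama3") -> str:
    """POST to the local Ollama server; traffic stays on this machine."""
    data = json.dumps(build_payload(text, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The quality caveats above still apply: this keeps everything on-device, at the cost of weaker reasoning than a frontier model.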
Architecture for air-gapped processing
For organizations that need air-gapped AI document processing, several architectures are viable:
Option 1: Offline file processing + manual analysis
Use the local agent for all file operations: reading documents, extracting text, structuring data, generating reports from extracted content. The AI analysis step (which requires model inference) is replaced by human analysis of the extracted and structured data.
This is the simplest approach and the most compatible with strict air-gap requirements. The agent still provides significant value: it automates the mechanical work of reading files, extracting text, parsing tables, and organizing data. The human does the reasoning that would otherwise be done by the language model.
Best for: Fully air-gapped environments where no model inference is acceptable.
Option 2: Local model inference
Deploy a local language model alongside the agent. The model runs on the same machine or on a local server within the air-gapped network. All inference stays within the secure perimeter.
The hardware requirements depend on the model:
- 7B parameter models (Mistral 7B, Llama 3 8B): 8GB RAM, runs on any modern laptop. Adequate for basic extraction and summarization.
- 13-14B parameter models (Llama 2 13B, Phi-4 14B): 16GB RAM. Better quality, handles moderate complexity.
- 70B+ parameter models (Llama 3 70B): 64GB+ RAM or dedicated GPU. Approaches frontier quality for many tasks but requires workstation-class hardware.
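The rule of thumb behind those numbers: a model quantized to 4 bits needs about half a byte per parameter, plus headroom for the KV cache and runtime. A back-of-envelope estimate (the 20% overhead factor is an assumption, not a measured figure):

```python
def estimate_ram_gb(params_billion: float, bits: int = 4,
                    overhead: float = 1.2) -> float:
    """Back-of-envelope RAM estimate for quantized model weights.
    `overhead` covers KV cache, activations, and runtime -- a rough guess."""
    weight_bytes = params_billion * 1e9 * bits / 8
    return round(weight_bytes * overhead / 1e9, 1)

for size in (7, 14, 70):
    print(f"{size}B at 4-bit: ~{estimate_ram_gb(size)} GB")
```

Roughly 4 GB for a 7B model, 8 GB for 14B, and 42 GB for 70B -- consistent with the RAM tiers listed above once OS and application overhead are added.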
Best for: Network-isolated environments where local hardware can support model inference.
Option 3: Batch processing with data diode
Process files locally in the air-gapped environment. Extract text content. Transfer the text (not the files) through a data diode or cross-domain solution to a connected environment where model inference runs. Return results through the same controlled path.
A data diode enforces one-way data flow -- data can exit the secure zone but nothing can enter (or vice versa). Round-tripping analysis results therefore takes a pair of one-way paths or a cross-domain solution: text flows out through one, results flow back through another, with human review at each transfer point.
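One way to make the outbound transfer auditable is to package the extracted text with a checksum manifest before it crosses the diode, so the receiving side can verify integrity. A sketch under assumed conventions (the file layout and manifest format here are illustrative, not a standard):

```python
import hashlib
import json
from pathlib import Path

def package_for_transfer(texts: dict[str, str], out_dir: str) -> Path:
    """Write extracted text files plus a SHA-256 manifest, ready for
    one-way transfer. Only text crosses the boundary -- never the
    original document files."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for name, text in texts.items():
        data = text.encode("utf-8")
        (out / f"{name}.txt").write_bytes(data)
        manifest[f"{name}.txt"] = hashlib.sha256(data).hexdigest()
    manifest_path = out / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest_path
```

The reviewer at the transfer point can diff the manifest against recomputed hashes to confirm nothing was altered in transit.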
Best for: Organizations with formal cross-domain transfer processes already in place.
Option 4: Intermittent sync
For environments with occasional connectivity, batch model inference requests during connected windows. The agent queues analysis requests while offline, sends them when connectivity is available, and integrates results when they return.
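A minimal version of that queue can be a local spool directory: requests accumulate on disk while offline and are drained during a connectivity window. A sketch (the `send` callable is a stand-in for whatever transport the deployment actually uses):

```python
import json
from pathlib import Path

class OfflineQueue:
    """Spool analysis requests to local disk while disconnected,
    then flush them during a connectivity window."""

    def __init__(self, spool_dir: str):
        self.spool = Path(spool_dir)
        self.spool.mkdir(parents=True, exist_ok=True)

    def enqueue(self, request_id: str, prompt: str) -> None:
        path = self.spool / f"{request_id}.json"
        path.write_text(json.dumps({"id": request_id, "prompt": prompt}))

    def flush(self, send) -> list[str]:
        """Send each queued request via `send(request)`, deleting it
        only after success. Returns the ids that were flushed."""
        done = []
        for path in sorted(self.spool.glob("*.json")):
            request = json.loads(path.read_text())
            send(request)          # raises on failure; file is kept
            path.unlink()
            done.append(request["id"])
        return done
```

Because the spool is plain files, a failed flush leaves the remaining requests intact for the next window.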
Best for: Field operations, remote sites, or environments with scheduled connectivity windows.
The security advantages of local-first architecture in restricted environments
Even outside fully air-gapped scenarios, local-first architecture provides security properties that matter for restricted environments:
No persistent external connections
docrew doesn't maintain WebSocket connections, long-polling sessions, or persistent channels to external servers. When it needs model inference, it makes an HTTPS request. When it's done, the connection closes. Between requests, there's zero network activity.
This matters for network monitoring. Security teams can observe all outbound traffic and verify that only text content (not files) is transmitted. The traffic pattern is simple: discrete HTTPS requests to a known endpoint, with measurable payload sizes.
Sandbox enforcement
The OS-level sandbox prevents executed code from accessing the network, even if the machine has connectivity. This is relevant for environments where the machine is connected but executed code shouldn't be.
On macOS, the Seatbelt profile explicitly denies network-outbound. On Linux, bwrap's network namespace isolation achieves the same result. These are kernel-level controls that the sandboxed code cannot bypass.
Auditable data flow
Every piece of data that leaves the machine goes through a documented path: text content, via HTTPS, to a known model API endpoint. There's no background sync, no telemetry uploads, no undocumented network activity.
For environments that require data flow certification, this simplicity is an asset. The data flow can be described in a single sentence: "Extracted text content is sent via TLS 1.3 to [endpoint] for language model inference."
Practical considerations
Storage and processing capacity
Air-gapped machines need sufficient local storage for the documents and extracted data. For typical document processing (thousands of PDFs, DOCX files, spreadsheets), a standard 512GB SSD is more than adequate.
Processing capacity is rarely the bottleneck. File parsing is fast on modern hardware. The bottleneck, if any, is model inference -- which is either handled by a cloud API (when connected) or a local model (with associated quality trade-offs).
Updates and maintenance
Air-gapped software needs a mechanism for updates. docrew's update packages can be verified via code signing and applied from physical media. The binary is self-contained -- no dependency downloads, no runtime installations required.
For local model deployments, model weights are static files that can be transferred via physical media and verified via checksums.
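Verifying a transferred weights file against a published checksum is a few lines of standard-library Python -- the expected digest would come from the model publisher, carried alongside the file on the same physical media:

```python
import hashlib
from pathlib import Path

def verify_weights(path: str, expected_sha256: str,
                   chunk_size: int = 1 << 20) -> bool:
    """Stream the file through SHA-256 in 1 MiB chunks, so multi-GB
    weight files never need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```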
Training and documentation
Offline environments may not have access to online documentation or support. Ensure documentation is available locally. docrew's help system works offline, and this blog can be saved for offline reference.
Real-world scenarios
Defense contractor document review
A defense contractor needs to analyze 300 technical proposals for a procurement evaluation. The documents contain export-controlled data (ITAR/EAR) and must be processed in a controlled environment.
Architecture: Air-gapped workstations with docrew running locally. A 70B local model handles analysis. File parsing, text extraction, and comparison happen entirely offline. Results are reviewed by cleared personnel and exported via approved channels.
Healthcare facility records processing
A hospital needs to analyze 5 years of patient records for a quality improvement study. The records are stored on an isolated internal network per HIPAA requirements.
Architecture: Network-isolated workstation with docrew. Files are accessed from the internal network share. Text extraction is local. For AI analysis, the extracted text (de-identified per study protocol) is processed through an approved model endpoint.
Law firm trade secret litigation
A law firm handling trade secret litigation needs to analyze 10,000 documents produced in discovery. The documents contain the client's most sensitive proprietary information.
Architecture: docrew on attorney workstations with full internet connectivity but with the understanding that no document files are uploaded anywhere. Local parsing extracts text. The model API receives only extracted text content. The raw files -- containing proprietary formulas, manufacturing processes, and strategic plans -- never leave the firm's machines.
The future of offline AI
The trajectory is clear: local models are getting more capable while requiring less hardware. What required a data center in 2023 runs on a high-end laptop in 2026. This trend will continue.
Within 2-3 years, locally-runnable models will likely match current frontier model quality for most document processing tasks. When that happens, fully air-gapped AI document processing -- with no external model dependencies at all -- becomes practical for all but the most demanding analytical tasks.
In the meantime, the architecture to adopt is one that minimizes network dependency by default. Process files locally. Execute code in a sandbox. Send only text to the model API. Keep the raw documents home.
Whether you're in a SCIF or a law office, the principle is the same: the fewer places your documents travel, the fewer places they can be compromised. Local-first isn't just a privacy feature. It's a security architecture.