How to Evaluate AI Tools for Data Privacy: A Buyer's Guide
Not all AI tools handle your data the same way. This buyer's guide gives you a practical framework for evaluating AI document processing tools on privacy, security, and compliance.
The privacy evaluation most buyers skip
When organizations evaluate AI tools for document processing, the demo usually focuses on capability: "Look, it can extract data from PDFs! It can compare contracts! It can summarize reports!"
Privacy evaluation happens as an afterthought, if it happens at all. Someone in IT or legal asks "is it secure?" The vendor says "yes, we use AES-256 encryption and we're SOC 2 Type II certified." Everyone nods and moves on.
This is how organizations end up with AI tools that process their most sensitive documents through architectures they don't understand, under terms of service they haven't read, with data flows they can't trace.
A proper privacy evaluation doesn't require a security team or a law degree. It requires asking the right questions and understanding what the answers mean. This guide gives you both.
The seven questions
Question 1: Where does my file go when the AI processes it?
This is the foundational question. The answer determines everything else.
Best answer: "Your file stays on your device. We extract text locally and send only the text to the language model."
Acceptable answer: "Your file is uploaded to our servers, processed, and deleted within [specific timeframe]. We don't use your data for training."
Red flag answer: "Your data is secure in our cloud infrastructure." (Vague, doesn't answer where or how long.)
What to dig into: Ask for a data flow diagram. If the vendor can't show you exactly where your file content travels from upload to deletion, they either don't know or don't want you to know.
docrew's answer: Files are parsed locally by the Rust agent on your desktop. The raw file never leaves your device. Extracted text is sent to the language model via a proxy for analysis. The text is processed transiently by the model API.
Question 2: What data does the language model see?
The language model is the component that does the "AI" work. Understanding what it receives is critical.
Best answer: "The model receives only the extracted text content needed for your specific task."
Acceptable answer: "The model receives the full text content of your document."
Red flag answer: "The model processes your document." (Ambiguous -- does it receive the raw file? Just text? Including metadata?)
What to dig into: Ask specifically whether the model receives file metadata (author, creation date, revision history), embedded images, or the raw file binary. There's a significant privacy difference between "the model sees the words in your contract" and "the model sees the entire PDF including tracked changes and embedded objects."
Question 3: Does the provider store my document content? For how long?
Storage creates risk. The longer data persists and the more places it's stored, the higher the exposure.
Best answer: "We never store your document content. Processing is transient."
Acceptable answer: "We store your content for [X] days for [specific reason]. It's encrypted at rest and automatically deleted."
Red flag answer: "We retain data in accordance with our privacy policy." (Sends you to a legal document instead of answering directly.)
What to dig into: Ask about logs. Many vendors don't "store" your documents in the traditional sense, but they log API requests that contain document content. These logs may be retained for weeks or months for debugging and quality assurance. Ask specifically: "Do your system logs contain any portion of my document content? For how long?"
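A vendor with good log hygiene can usually describe how they keep document content out of logs, typically by redacting payload fields before anything is written. A minimal sketch of that idea -- the field names here are hypothetical, not any particular vendor's schema:

```python
import json
import logging

# Hypothetical payload keys that could carry document content.
REDACTED_FIELDS = {"text", "content", "document"}

def redact_payload(payload: dict) -> dict:
    """Replace document-content fields with a placeholder before logging."""
    return {
        key: "[REDACTED]" if key in REDACTED_FIELDS else value
        for key, value in payload.items()
    }

request = {"task": "summarize", "text": "Confidential contract terms..."}
logging.info("api request: %s", json.dumps(redact_payload(request)))
```

If a vendor can't describe a mechanism like this, assume their debug logs contain your document text.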
Question 4: Is my data used for model training?
This is the question most people know to ask, but the answers are often more nuanced than "yes" or "no."
Best answer: "No. We never use your data for training, regardless of your plan tier."
Acceptable answer: "Not on our paid plans. Free tier data may be used for improvement."
Red flag answer: "Your data helps improve our models for everyone." (Means yes, they train on your data.)
What to dig into: Ask about "model improvement" and "quality evaluation" separately from "training." Some providers don't use your data for training model weights but do use it for human evaluation, prompt optimization, or fine-tuning. These activities still involve humans or systems reading your document content.
Question 5: Where are the servers? Can I choose the region?
Data location matters for compliance (GDPR, data sovereignty) and for understanding your legal exposure.
Best answer: "You can choose your processing region. EU data stays in the EU."
Acceptable answer: "Processing happens in [specific region]. We don't transfer data across regions."
Red flag answer: "Our infrastructure is globally distributed for optimal performance." (Means your data could be processed anywhere.)
What to dig into: Ask about all components, not just the primary servers. Your document might be processed in the EU, but if the logs are stored in the US or if the backup systems are in Asia, your data has still crossed borders.
docrew routes EU users to EU Vertex AI endpoints (europe-west1) and US users to US endpoints (us-east1). The routing is based on the user's profile setting, not their IP address, ensuring consistent regional processing.
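Profile-based routing like this is simple to reason about: the region is a fixed lookup, not a guess from the request. A sketch of the idea -- the region names and endpoints come from the paragraph above, but the code itself is illustrative, not docrew's actual implementation:

```python
# Map a user's profile region to a fixed model endpoint.
# Endpoint names are from the text; the mapping code is illustrative.
REGION_ENDPOINTS = {
    "EU": "europe-west1",
    "US": "us-east1",
}

def endpoint_for(profile_region: str) -> str:
    """Route by profile setting, not IP, so processing stays in one region."""
    try:
        return REGION_ENDPOINTS[profile_region]
    except KeyError:
        raise ValueError(f"unsupported region: {profile_region}")
```

Because the lookup ignores the request's origin, an EU profile resolves to the EU endpoint even when the user is traveling.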
Question 6: What happens if the provider is breached?
Breaches happen. What matters is the blast radius.
Best answer: "We don't store your documents, so a breach of our systems wouldn't expose your document content."
Acceptable answer: "We maintain incident response procedures, carry cyber insurance, and will notify you within [timeframe]. Your data is encrypted at rest, so a breach of storage wouldn't expose plaintext content."
Red flag answer: "We take security very seriously." (Non-answer. Everyone says this.)
What to dig into: Ask what data an attacker would access in a worst-case breach scenario. Would they get raw files? Text content? Metadata? Logs? The provider should be able to articulate their breach blast radius specifically, not in generalities.
Question 7: Can I verify the privacy claims?
Trust but verify. Or better: don't trust, verify.
Best answer: "You can monitor network traffic to see exactly what data leaves your device. Here's how."
Acceptable answer: "We have SOC 2 Type II certification, and you can request our latest audit report."
Red flag answer: "Just trust us." (Not acceptable for sensitive documents.)
What to dig into: Ask for technical verification methods, not just certifications. Can you use network monitoring tools to observe what data leaves your device? Can you inspect the data in transit? Can you request a copy of everything the provider holds about you?
With local processing tools like docrew, verification is straightforward: run a network monitor (Wireshark, Little Snitch, Charles Proxy) and observe what's transmitted. You'll see text content going to the model API and nothing else. No file uploads, no binary data, no unexplained connections.
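If you prefer a scriptable check over a GUI monitor, you can capture the app's open connections (for example with `lsof -i -nP` on macOS or Linux) and review the remote endpoints it talks to. A small parsing sketch, assuming standard `lsof` output where the NAME column looks like `local:port->remote:port`:

```python
def remote_endpoints(lsof_output: str) -> set[str]:
    """Extract remote host:port pairs from `lsof -i -nP` output."""
    endpoints = set()
    for line in lsof_output.splitlines():
        for field in line.split():
            # Established connections show as 10.0.0.5:54321->142.250.1.1:443
            if "->" in field:
                endpoints.add(field.split("->", 1)[1])
    return endpoints
```

Run it against a capture taken while processing a document; every endpoint in the result should be one the vendor has documented.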
The evaluation matrix
Score each tool on a 1-5 scale for each dimension:
| Dimension | 1 (Poor) | 3 (Acceptable) | 5 (Excellent) |
|-----------|----------|----------------|----------------|
| File handling | Uploads and stores raw files | Uploads, processes, deletes within days | Processes locally, never uploads |
| Model exposure | Full file sent to model | Full text sent to model | Only relevant text sent to model |
| Data retention | Indefinite or unclear | Defined retention, automatic deletion | No retention (transient processing) |
| Training use | Uses data for training by default | Opt-out available | Never uses data for training |
| Regional control | No regional choice | Limited regional options | User-selected regional processing |
| Breach exposure | Stored files at risk | Encrypted stored files at risk | No stored files to expose |
| Verifiability | No verification method | Audit reports available | Independently verifiable by user |
A score of 25-35 indicates strong privacy practices. Below 20 warrants serious concern for sensitive documents. Below 15 is a deal-breaker for regulated industries.
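The matrix is easy to turn into a small scoring helper. A sketch using the seven dimensions from the table and the thresholds above (the label for the 20-24 band is my own, since the text doesn't name one):

```python
DIMENSIONS = [
    "file_handling", "model_exposure", "data_retention", "training_use",
    "regional_control", "breach_exposure", "verifiability",
]

def assess(scores: dict[str, int]) -> tuple[int, str]:
    """Total the 1-5 scores and apply the guide's thresholds."""
    missing = set(DIMENSIONS) - scores.keys()
    if missing:
        raise ValueError(f"unscored dimensions: {sorted(missing)}")
    total = sum(scores[d] for d in DIMENSIONS)
    if total >= 25:
        verdict = "strong privacy practices"
    elif total >= 20:
        verdict = "borderline; review carefully"  # band not named in the guide
    elif total >= 15:
        verdict = "serious concern for sensitive documents"
    else:
        verdict = "deal-breaker for regulated industries"
    return total, verdict
```

Scoring each vendor the same way keeps the comparison honest when different people run the evaluation.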
Beyond the checklist: architecture matters more than promises
Promises are contractual. Architecture is structural.
A promise says "we won't use your data for training." An architecture makes it impossible because the data isn't stored anywhere that training pipelines can access.
A promise says "we'll delete your data within 30 days." An architecture makes deletion unnecessary because the data was never persisted.
A promise says "your data is encrypted at rest." An architecture makes encryption at rest irrelevant because there's nothing at rest to encrypt (the data stays on your device).
When evaluating AI tools, prioritize architectural privacy over promissory privacy. Look for tools where the privacy properties are enforced by the system design, not just by policies that could change.
Specific scenarios
Scenario 1: Law firm evaluating contract analysis tools
Priority dimensions: File handling, breach exposure, verifiability
Key requirements: Attorney-client privilege must be protected. No third-party storage of client documents. Verifiable data flow.
Recommendation: Local processing is strongly preferred. The risk of privilege waiver from uploading client documents outweighs the convenience of cloud processing.
Scenario 2: Healthcare organization processing patient records
Priority dimensions: Data retention, regional control, training use
Key requirements: HIPAA compliance. BAA with any processor. Minimum necessary standard for PHI exposure.
Recommendation: Either local processing or a cloud provider with a signed BAA. Local processing is simpler for compliance because it reduces the scope of the BAA.
Scenario 3: Financial services firm analyzing reports
Priority dimensions: Breach exposure, regional control, model exposure
Key requirements: SOX compliance, data sovereignty, protection of non-public financial information.
Recommendation: Local processing with regional model routing. Financial data should not be stored on third-party servers.
Scenario 4: Small business automating invoice processing
Priority dimensions: Data retention, training use, cost
Key requirements: Reasonable privacy without enterprise overhead. Don't contribute business data to model training.
Recommendation: Either a paid-tier cloud service (no training on data) or local processing. Avoid free-tier cloud services that may use data for model improvement.
The procurement conversation
When you're ready to evaluate specific tools, here's a template for the vendor conversation:
"We're evaluating AI tools for document processing. Our documents include [describe sensitivity level]. We need to understand your data handling in detail. Can you provide:
- A data flow diagram showing where our document content goes from when we start processing to when it's fully deleted from your systems?
- Your data retention schedule, including logs, caches, backups, and sub-processor systems?
- Confirmation of whether any of our document content is used for model training, fine-tuning, human evaluation, or quality assurance -- on any plan tier?
- The specific regions where our data is processed and stored?
- Your breach notification SLA and a description of what document content an attacker would access in a worst-case breach of your systems?
- A method for us to independently verify your data handling claims (network monitoring guidance, audit reports, or both)?"
Vendors who can answer these questions clearly and specifically are worth considering. Vendors who respond with marketing language, redirect to legal documents, or can't provide specifics should give you pause.
The privacy spectrum
AI tools for document processing exist on a spectrum:
Maximum convenience, minimum privacy: Full cloud processing. Upload files, get results. Files stored on provider infrastructure. Convenient but creates maximum data exposure.
Balanced approach: Cloud processing with strong contractual protections. Defined retention, encryption, no training use. Good privacy if you trust the provider.
Maximum privacy, minimum compromise: Local processing with cloud model inference. Files stay local. Only text reaches the model. Transient processing. Minor setup overhead for maximum data control.
For sensitive documents -- legal, medical, financial, confidential business information -- the right choice is clear. The modest setup overhead of local processing is trivial compared to the risk and compliance cost of cloud file processing.
The best privacy evaluation is the one that leads you to an architecture where the hard questions have simple answers. "Where does my file go?" "Nowhere. It stays on your device." That's the answer that makes compliance simple, risk low, and trust unnecessary.