
How to Compare Two Document Versions and Get a Summary of Changes

Drop two versions of a contract, policy, or report into docrew and get a structured summary of every change -- additions, deletions, modifications, and their significance.


The version comparison problem

You have two versions of a document. A vendor sent a revised contract. Your colleague updated a policy. You need to know what changed.

Reading both documents side by side works for short documents but falls apart beyond a few pages. A 30-page contract with changes scattered across sections is difficult to compare by eye. You will catch obvious additions, but you will miss subtle modifications: a liability cap reduced by half, a notice period shortened by 30 days, a single word deleted from a warranty clause that narrows its scope.

Word processors offer comparison features, but they fail when the revised document arrives as a clean PDF with no markup, when formatting changes flag every paragraph as different, or when you need to compare a DOCX against a PDF.

What you need is a semantic comparison -- an analysis that understands what the document says, not just how the text differs character by character.

Step one: drop both documents into a conversation

Open docrew and start a new conversation. Make sure both document versions are in your workspace.

The two files can be in different formats. The original might be a DOCX, the revision a PDF. docrew reads both formats natively, so format mismatch is not a problem.

Name the files clearly so you can refer to them in your prompt, for example "agreement-v1.docx" and "agreement-v2.pdf".

Step two: ask for a comparison

Tell the agent what you need:

"Compare agreement-v1.docx and agreement-v2.pdf. Identify every change between the two versions. For each change, tell me: which section it is in, what the original text said, what the new text says, and whether the change is substantive or cosmetic. Organize the results by section."

From this prompt, the agent reads both documents, aligns their sections, identifies the differences, classifies each one, and presents the results organized by section.

What the agent does

Document parsing. The agent reads both files using built-in parsers -- extracting headings, paragraphs, lists, and tables from DOCX; reconstructing structure from formatting cues in PDF.

Structure mapping. The agent maps each document's structure and builds a correspondence between their sections. This handles reorganization -- if indemnification moved from Section 9 to Section 12, the agent recognizes it as a relocation rather than a deletion plus an addition.
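To make the idea of section correspondence concrete, here is a minimal sketch that pairs sections by how similar their body text is, ignoring their numbers. It is an illustration only, not docrew's implementation: the function name `match_sections`, the 0.6 threshold, and the dictionary-of-sections input are all assumptions made for the example, and it uses plain string similarity from Python's standard `difflib` rather than any semantic analysis.

```python
from difflib import SequenceMatcher


def match_sections(old, new, threshold=0.6):
    """Pair each old section with its most similar unused new section.

    `old` and `new` map a section heading to its body text.
    Returns (pairs, added): `pairs` maps each old heading to a new heading,
    or to None if nothing is similar enough; `added` lists new headings
    that were not matched to any old section.
    The threshold is an arbitrary illustrative value.
    """
    pairs = {}
    unused = set(new)
    for old_name, old_text in old.items():
        best_name, best_score = None, 0.0
        for new_name in sorted(unused):  # sorted for deterministic tie-breaking
            score = SequenceMatcher(None, old_text, new[new_name]).ratio()
            if score > best_score:
                best_name, best_score = new_name, score
        if best_name is not None and best_score >= threshold:
            pairs[old_name] = best_name
            unused.discard(best_name)
        else:
            pairs[old_name] = None  # treated as deleted
    return pairs, sorted(unused)


old_version = {
    "9 Indemnification": "Each party shall indemnify the other against third-party claims.",
    "10 Notices": "Notices must be sent in writing to the addresses listed below.",
}
new_version = {
    "10 Notices": "Notices must be sent in writing to the addresses listed below.",
    "12 Indemnification": "Each party will indemnify the other against third-party claims.",
    "13 Audit": "Customer may audit the vendor's records once per year.",
}

pairs, added = match_sections(old_version, new_version)
for old_name, new_name in pairs.items():
    print(old_name, "->", new_name)
print("added:", added)
```

Because the match is driven by content, the indemnification clause is paired with its new location even though its number changed, and only the audit section is reported as new. A greedy match like this can go wrong when several sections share boilerplate wording, so treat it as a starting point.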

Content comparison. With structures aligned, the agent compares each clause between versions, identifying additions, deletions, and modifications.

Classification. Each change is assessed as substantive or cosmetic. Substantive changes affect meaning, obligations, or rights: a dollar amount changed, a time period shortened, a new obligation added. Cosmetic changes affect presentation only: section numbers updated, "shall" changed to "will," a paragraph reformatted into a list.

This classification lets you focus on substantive changes first and skip cosmetic ones entirely if needed.
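The substantive/cosmetic split can be approximated with a simple rule, shown below as a rough sketch. It assumes a change is cosmetic when the two clauses are identical after ignoring case, spacing, punctuation, and a small table of drafting-style swaps. The names `STYLE_SWAPS`, `normalize`, and `classify` are hypothetical, and this rule-based check is a stand-in for illustration, not the way the agent actually judges significance.

```python
import re

# Hypothetical table of drafting-style swaps that do not change meaning.
STYLE_SWAPS = {"shall": "will"}


def normalize(clause):
    """Reduce a clause to lowercase word and number tokens."""
    # Words, plus numbers that keep their internal separators (e.g. 2,000,000).
    tokens = re.findall(r"[a-z]+|\d+(?:[.,]\d+)*", clause.lower())
    return [STYLE_SWAPS.get(token, token) for token in tokens]


def classify(old_clause, new_clause):
    """Label a pair of clause bodies as unchanged, cosmetic, or substantive."""
    if old_clause == new_clause:
        return "unchanged"
    if normalize(old_clause) == normalize(new_clause):
        return "cosmetic"
    return "substantive"


print(classify("The Vendor shall give notice.",
               "The vendor will give  notice."))
print(classify("Liability is capped at 2,000,000 dollars.",
               "Liability is capped at 1,000,000 dollars."))
```

A rule like this errs in both directions. Lowercasing hides a change from a defined term such as "Services" to the ordinary word "services," and a reworded clause with the same meaning is still flagged as substantive. That gap is the reason a meaning-aware review is useful.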

Reading the comparison output

The agent presents a structured change summary. A typical output:

Section 1 -- Definitions. One modification. The definition of "Services" was expanded to include "maintenance and support services" in addition to "consulting services." Broadens the agreement's scope. Substantive.

Section 4 -- Fees and Payment. Two modifications. Hourly rate changed from 200 dollars to 225 dollars. Payment terms changed from net-30 to net-45. Both substantive.

Section 7 -- Term and Termination. One addition, one modification. New termination-for-convenience clause added, allowing either party to terminate with 60 days' notice. Automatic renewal period changed from 12 months to 6 months. Both substantive.

Section 9 -- Limitation of Liability. Aggregate liability cap reduced from 2,000,000 dollars to 1,000,000 dollars. Substantive -- halves the maximum exposure.

Section 12 -- General Provisions. Three cosmetic changes. Section numbering updated. "Shall" replaced with "will." Governing law section reformatted but substance unchanged.

You can read this in two minutes and understand the full scope of the revision.

Beyond Word redline: semantic comparison

Word's comparison tool finds every inserted, deleted, or changed character and marks it with color coding. A changed comma gets the same treatment as a changed dollar amount. You must apply your own judgment to every markup.
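A small word-level diff shows the same limitation. The sketch below uses `difflib.SequenceMatcher` from Python's standard library as a stand-in for a generic text diff (it is not Word's algorithm), and the helper name `word_changes` and the sample sentence are made up for the example.

```python
from difflib import SequenceMatcher


def word_changes(old, new):
    """Return (tag, old_words, new_words) for every non-matching span."""
    a, b = old.split(), new.split()
    changes = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag != "equal":
            changes.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


old = "Fees are due monthly, and liability is capped at 2,000,000 dollars."
new = "Fees are due monthly and liability is capped at 1,000,000 dollars."

for change in word_changes(old, new):
    print(change)
# ('replace', 'monthly,', 'monthly')
# ('replace', '2,000,000', '1,000,000')
```

Both results carry the same "replace" tag. Nothing in the output distinguishes the dropped comma from the halved cap, so the reader still has to decide which one matters.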

docrew's comparison operates on meaning. It does not merely show you that "2,000,000" became "1,000,000" -- it tells you the liability cap was halved. It does not merely show a paragraph was added -- it tells you a new termination right was introduced and describes its conditions.

This is especially valuable for substantial rewrites. When a clause is restructured with different wording but overlapping meaning, a text diff produces a confusing mess. The agent describes the change in plain language: "The indemnification clause was restructured. The original provided mutual indemnification with no exclusions. The revised version limits vendor indemnification to direct damages only, excluding consequential damages."

Comparing across formats

In real-world workflows, versions do not always arrive in the same format:

  • The original contract is a DOCX; the counterparty sends a PDF revision.
  • You have a signed PDF of the current agreement and a DOCX of the proposed amendment.
  • An old version exists only as a scanned PDF; the new version is a clean digital document.

Traditional comparison tools require converting both documents to the same format first. PDF-to-Word conversion introduces artifacts -- broken tables, misplaced headers -- that show up as false changes, burying real differences in noise.

docrew compares content, not file formatting. The agent extracts text and structure from each document independently using format-appropriate parsers. Format differences between files do not create false positives.
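One way to picture why extraction artifacts need not become false changes is a normalization pass applied to text from both files before comparing. The following is a hedged sketch using only Python's standard library. The function name `normalize_extracted` and the specific cleanup rules are assumptions for illustration and say nothing about how docrew's parsers work internally.

```python
import re
import unicodedata

# Map typographic quotes to their plain ASCII equivalents.
QUOTES = str.maketrans({"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'})


def normalize_extracted(text):
    """Smooth over common extraction differences between file formats."""
    text = unicodedata.normalize("NFKC", text)    # e.g. the "fi" ligature becomes "fi"
    text = text.translate(QUOTES)                 # curly quotes become straight quotes
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # rejoin words hyphenated at line breaks
    return re.sub(r"\s+", " ", text).strip()      # collapse line breaks and extra spaces


pdf_text = "The con\ufb01dentiality obli-\ngations of the \u201cVendor\u201d\nsurvive termination."
docx_text = 'The confidentiality obligations of the "Vendor" survive termination.'

print(normalize_extracted(pdf_text) == normalize_extracted(docx_text))
```

The hyphen rule is deliberately naive: it would also merge a genuine compound such as "third-party" if the PDF happened to break the line at its hyphen, so a real pipeline would need a dictionary check or similar safeguard.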

Follow-up questions and deeper analysis

The initial comparison gives you the overview. Continue the conversation for deeper analysis:

Focus on a section. "Tell me more about the limitation of liability changes. What was the language before and after, and what are the practical implications?"

Assess overall risk. "Does the revised version favor us or the counterparty? What is the net impact on our risk position?" The agent reviews all modifications as a group: "The revised version is moderately less favorable. The halved liability cap and new termination-for-convenience clause reduce your protections."

Generate a response. "Draft a list of changes we should push back on in negotiation, with rationale for each."

Summarize for stakeholders. "Write a two-paragraph summary of key changes for the project team. Keep it non-technical."

These follow-ups are fast because the agent already has both documents in context.

Practical scenarios beyond contracts

The same workflow applies to any document that goes through revisions:

Policy documents. Compare this year's information security policy against last year's to identify new requirements and changed procedures.

Regulatory filings. Compare draft and final versions of a submission. Confirm all reviewer comments were addressed.

Reports and proposals. Compare the first draft against the final version to see how the document evolved and whether revisions introduced inconsistencies.

Employment agreements. Compare old and new template versions to understand and communicate changes to hiring managers.

Insurance policies. Compare renewal documents against the prior year's policy for changes in coverage, exclusions, limits, and premiums.

Saving and sharing comparison results

Ask the agent to export the comparison: "Save the change summary as agreement-v1-v2-comparison.md in my workspace." For formal reviews: "Create a CSV with columns for section, change type, original text, revised text, classification, and significance." For executive communication: "Write a one-paragraph summary of the most important changes for email to the project sponsor."
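If you want to know what such a CSV might look like, or to build one yourself from a change list, the sketch below writes the six columns named in the prompt using Python's standard `csv` module. The column identifiers, the `changes_to_csv` helper, and the wording of the significance notes are assumptions for illustration; the actual layout of the agent's export may differ.

```python
import csv
import io

# Column names chosen for this example; the agent's own headers may differ.
COLUMNS = ["section", "change_type", "original_text",
           "revised_text", "classification", "significance"]


def changes_to_csv(changes):
    """Serialize a list of change dictionaries to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(changes)
    return buffer.getvalue()


changes = [
    {"section": "4 Fees and Payment", "change_type": "modification",
     "original_text": "200 dollars per hour", "revised_text": "225 dollars per hour",
     "classification": "substantive", "significance": "Hourly rate increased"},
    {"section": "9 Limitation of Liability", "change_type": "modification",
     "original_text": "2,000,000 dollars", "revised_text": "1,000,000 dollars",
     "classification": "substantive", "significance": "Halves the maximum exposure"},
]

print(changes_to_csv(changes))
```

The `csv` module quotes values that contain commas, such as "2,000,000 dollars," so amounts survive a round trip through a spreadsheet without being split across columns.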

The agent adapts the output to whatever format your workflow requires.

A workflow that saves hours

Comparing documents by hand is one of the most time-consuming professional tasks. With docrew, you provide two files and a single instruction. The agent reads both, maps structures, identifies every change, classifies each by significance, and presents a structured summary.

The time saved scales with complexity. A five-page policy comparison saves 15 minutes. A 50-page contract comparison saves an hour. A portfolio of ten revised agreements saves a full day.

Everything happens locally -- both documents are read from your hard drive, with no uploads to cloud comparison services. The files stay where they are. The analysis comes to you.
