How to Create Summary Reports from Multiple Source Documents
Build structured summary reports by pulling data from financial statements, project updates, and metrics dashboards. Let the AI agent extract, organize, and assemble -- you review and send.
The monthly report problem
Every organization has a report that someone dreads assembling. A monthly executive summary pulling numbers from eight different documents. A board report combining financial statements, project updates, risk registers, and KPI dashboards. A compliance report referencing policy documents, audit findings, and remediation logs.
The data exists, scattered across spreadsheets, PDFs, Word documents, and slide decks -- each produced by a different team, in a different format. The person responsible becomes a human aggregation engine: open each source, find the relevant section, copy the number, paste it into the template, repeat. For a typical monthly report drawing from eight sources, this takes four to six hours.
This is exactly the kind of multi-document synthesis that docrew handles well. The agent reads all your source files locally, extracts the data you specify, and assembles it into the report structure you define. You review and refine instead of copying and pasting.
Define the report structure first
Before pointing the agent at any documents, define what the finished report should look like. A good structure answers three questions: what sections does the report contain, what data goes into each section, and where does that data come from.
For a monthly executive summary, the structure might look like this:
Section 1: Financial Overview. Source: monthly P&L, balance sheet. Data needed: revenue, expenses, net income, cash position, variance from budget.
Section 2: Project Status. Source: status reports from engineering, marketing, operations. Data needed: project name, phase, percent complete, milestones hit or missed, blockers.
Section 3: Sales Pipeline. Source: CRM export. Data needed: total pipeline value, new opportunities, deals closed, win rate, forecast.
Section 4: Customer Metrics. Source: customer success dashboard. Data needed: NPS, churn rate, ticket volume, resolution time.
Section 5: Team and Operations. Source: HR report, IT status report. Data needed: headcount, open positions, uptime, security incidents.
Section 6: Risk and Compliance. Source: risk register, audit tracker. Data needed: top risks by severity, new risks, open findings, remediation progress.
Section 7: Key Decisions Needed. Source: derived from all sections. Data needed: decisions requiring executive input.
Write this structure down. You will give it to the agent along with the source documents.
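One way to write the structure down is as a small piece of data that can be rendered into the text you paste into the conversation. The sketch below is only an illustration: the `Section` class, `REPORT_STRUCTURE`, and `render_structure` are hypothetical names and are not part of docrew.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """One report section: its title, where the data lives, what to extract."""
    title: str
    sources: tuple[str, ...]
    data_needed: tuple[str, ...]


# The seven sections from the example structure above.
REPORT_STRUCTURE = (
    Section("Financial Overview", ("monthly P&L", "balance sheet"),
            ("revenue", "expenses", "net income", "cash position",
             "variance from budget")),
    Section("Project Status",
            ("status reports from engineering, marketing, operations",),
            ("project name", "phase", "percent complete",
             "milestones hit or missed", "blockers")),
    Section("Sales Pipeline", ("CRM export",),
            ("total pipeline value", "new opportunities", "deals closed",
             "win rate", "forecast")),
    Section("Customer Metrics", ("customer success dashboard",),
            ("NPS", "churn rate", "ticket volume", "resolution time")),
    Section("Team and Operations", ("HR report", "IT status report"),
            ("headcount", "open positions", "uptime", "security incidents")),
    Section("Risk and Compliance", ("risk register", "audit tracker"),
            ("top risks by severity", "new risks", "open findings",
             "remediation progress")),
    Section("Key Decisions Needed", ("derived from all sections",),
            ("decisions requiring executive input",)),
)


def render_structure(sections) -> str:
    """Render the sections as numbered plain-text lines for a prompt."""
    lines = []
    for number, s in enumerate(sections, start=1):
        lines.append(
            f"Section {number}: {s.title}. "
            f"Source: {', '.join(s.sources)}. "
            f"Data needed: {', '.join(s.data_needed)}."
        )
    return "\n".join(lines)


print(render_structure(REPORT_STRUCTURE))
```

Keeping the structure as data means a new section or data point is a one-line change, and the rendered text stays consistent from month to month.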
Gather your source documents
Collect all source documents into a single folder. For the executive summary above, the folder might contain:
financials-october.xlsx -- P&L and balance sheet
engineering-status-oct.docx -- engineering project updates
marketing-status-oct.pdf -- marketing project updates
operations-report-oct.docx -- operations team report
sales-pipeline-oct.xlsx -- CRM export
customer-metrics-oct.pdf -- customer success dashboard
hr-report-oct.xlsx -- headcount and hiring data
risk-register-oct.xlsx -- risk register with severity ratings
Name the files clearly. The agent reads file names as context, so financials-october.xlsx is more useful than report_v3_final_FINAL.xlsx. Include only the current version of each document.
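A quick check of the folder can catch unclear or leftover file names before the agent sees them. The sketch below is a rough heuristic, not a docrew feature: the pattern and the `flag_ambiguous_names` helper are assumptions made for illustration, and the markers it looks for ("final", "copy", "draft", "v3"-style suffixes) are only examples.

```python
import re
from pathlib import Path

# Hypothetical markers that often signal an old or unclear version.
# The lookbehind keeps "v3" from matching inside a longer word.
AMBIGUOUS = re.compile(r"final|copy|draft|(?<![a-z])v\d+", re.IGNORECASE)


def list_folder(folder: str) -> list[str]:
    """Return the names of the files directly inside the folder, sorted."""
    return sorted(p.name for p in Path(folder).iterdir() if p.is_file())


def flag_ambiguous_names(names: list[str]) -> list[str]:
    """Return the file names whose stem contains a version-like marker."""
    return [name for name in names if AMBIGUOUS.search(Path(name).stem)]


names = ["financials-october.xlsx", "report_v3_final_FINAL.xlsx"]
print(flag_ambiguous_names(names))
```

In practice you would pass `list_folder(...)` for your own source folder and rename or archive anything the check flags.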
Point the agent at the documents and define the task
Open docrew and start a new conversation. Provide both the report structure and the source document location.
A practical prompt: "I need to create a monthly executive summary report for October. The source documents are in the October Reports folder on my desktop. The report should have these sections: Financial Overview from the financials spreadsheet, Project Status from the three status reports, Sales Pipeline from the sales spreadsheet, Customer Metrics from the customer metrics PDF, Team and Operations from the HR spreadsheet and the operations report, and Risk and Compliance from the risk register. For each section, extract the key numbers and highlights. Write the report as a Word document with section headers, bullet points for data, and a brief narrative paragraph per section."
The agent scans the folder, identifies all documents, reads each one -- navigating spreadsheet tabs, parsing tables in PDFs, reading narrative sections in Word files -- then extracts specific data points for each section of your report.
How the agent extracts and assembles
The agent does not simply copy text from source documents. It reads each source, understands the content, and extracts the specific data points your structure requires.
From the financials spreadsheet, it pulls summary row totals and calculates budget variance. From the three project status reports -- which may use different formats (tables, narrative paragraphs, bullet lists) -- it normalizes everything into a consistent format. From the sales spreadsheet, it reads pipeline stages and calculates totals. From the customer metrics PDF, it locates the KPI summary and extracts headline numbers. From the HR and operations sources, it pulls headcount and uptime figures. From the risk register, it sorts by severity and identifies top risks.
The agent then assembles everything into your defined structure, writing a brief narrative paragraph per section that provides context for the numbers. The output is a complete, structured report -- not raw data.
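Budget variance is the calculation most worth understanding yourself, because its sign convention is easy to get backwards. The sketch below shows the usual arithmetic, actual minus budget, as a standalone helper. It is not a description of how docrew computes it internally, and the figures are made up for illustration.

```python
def budget_variance(actual: float, budget: float) -> tuple[float, float]:
    """Return (absolute variance, variance as a percentage of budget).

    Convention assumed here: variance = actual - budget, so a negative
    result means the actual figure came in under budget.
    """
    if budget == 0:
        raise ValueError("budget must be non-zero to compute a percentage")
    difference = actual - budget
    return difference, difference * 100 / budget


# Hypothetical figures: revenue of 880,000 against a budget of 1,000,000.
difference, percent = budget_variance(880_000, 1_000_000)
print(f"Revenue variance: {difference:,.0f} ({percent:.1f}% of budget)")
```

The same sign reads differently by line item: negative is unfavorable for revenue, while for expenses a negative variance means spending came in under budget.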
Review the output
Open the generated report and review each section against the source documents.
Numerical accuracy. Spot-check the most important figures -- revenue, net income, pipeline total. Complex spreadsheet formulas or nested tables occasionally produce extraction errors.
Context appropriateness. If revenue is down 12% from budget, the narrative should note this as significant, not gloss over it.
Completeness. Make sure every section has content from its designated source. If a section is thin, the agent may not have found the data in the expected location.
Tone consistency. Executive reports have an organizational voice. Tell the agent "Make the tone more direct" or provide last month's report as a tone reference.
The first report requires the most review. After two or three iterations of refining your instructions, the output improves and review gets faster.
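The spot-check on key figures can be made systematic by keeping a short list of numbers you read from the sources yourself and comparing them with what the report states. The sketch below assumes you type both sets of figures in by hand; `spot_check`, the tolerance, and the sample values are hypothetical.

```python
import math


def spot_check(reported: dict[str, float], source: dict[str, float],
               rel_tol: float = 0.005) -> list[str]:
    """Compare reported figures with source figures.

    Returns one message per figure that is missing from the report or
    differs from the source by more than rel_tol (0.5% by default).
    """
    problems = []
    for name, expected in source.items():
        if name not in reported:
            problems.append(f"{name}: missing from report")
        elif not math.isclose(reported[name], expected, rel_tol=rel_tol):
            problems.append(
                f"{name}: report has {reported[name]}, source has {expected}"
            )
    return problems


# Made-up figures: the report transposed two digits in net income
# and left out the pipeline total.
source = {"revenue": 880_000, "net income": 95_000, "pipeline total": 2_400_000}
reported = {"revenue": 880_000, "net income": 59_000}

for problem in spot_check(reported, source):
    print(problem)
```

A small relative tolerance avoids false alarms when the report rounds a figure while still catching transposed digits or a value pulled from the wrong tab.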
Handling messy source documents
Spreadsheets with multiple tabs. Tell the agent which tab to use: "Pull revenue from the Summary tab." Being explicit saves time.
PDFs with charts. The agent extracts text but cannot read embedded charts. Ensure source documents include data in text or table form, not only visually.
Inconsistent formatting. The agent reads content by meaning, not position, so it adapts when teams change their report format.
Missing source documents. Tell the agent: "The customer metrics report is not available yet. Write that section as 'Pending' and complete the rest."
Password-protected files. Remove the protection before adding the file to the source folder.
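Missing sources are easier to handle when you know about them before starting the conversation. The sketch below checks a folder listing against the files each section expects and drafts the "Pending" instruction for anything absent. The section-to-file mapping follows the example folder in this article; `EXPECTED_SOURCES`, `find_missing`, and `pending_notes` are hypothetical names, not docrew features.

```python
# Which example files feed which section (taken from the sample folder).
EXPECTED_SOURCES = {
    "Financial Overview": ["financials-october.xlsx"],
    "Project Status": ["engineering-status-oct.docx",
                       "marketing-status-oct.pdf",
                       "operations-report-oct.docx"],
    "Sales Pipeline": ["sales-pipeline-oct.xlsx"],
    "Customer Metrics": ["customer-metrics-oct.pdf"],
    "Team and Operations": ["hr-report-oct.xlsx"],
    "Risk and Compliance": ["risk-register-oct.xlsx"],
}


def find_missing(expected: dict[str, list[str]],
                 present_names: list[str]) -> dict[str, list[str]]:
    """Map each section to the expected files that are not in the folder."""
    present = set(present_names)
    missing = {}
    for section, files in expected.items():
        absent = [name for name in files if name not in present]
        if absent:
            missing[section] = absent
    return missing


def pending_notes(missing: dict[str, list[str]]) -> list[str]:
    """Draft one instruction per section that lacks a source file."""
    return [
        f"The {section} source ({', '.join(files)}) is not available yet. "
        "Write that section as 'Pending' and complete the rest."
        for section, files in missing.items()
    ]


# Example: everything has arrived except the customer metrics PDF.
present = [name for files in EXPECTED_SOURCES.values() for name in files
           if name != "customer-metrics-oct.pdf"]
for note in pending_notes(find_missing(EXPECTED_SOURCES, present)):
    print(note)
```

For a real run, `present` would be the list of file names in your source folder, for example from `os.listdir`.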
Recurring workflows: same structure, new data
The real payoff comes in month two. The report structure does not change -- only the data does.
Approach one: reference the previous report. Include last month's generated report alongside the new source documents. Tell the agent to use the same structure and add month-over-month comparisons.
Approach two: save your prompt. Keep your detailed report prompt in a text file. Each month, update the folder reference and run it again for consistent output.
Over time, refine the prompt to capture nuances from review. If the CFO always asks about cash runway, add it. If the CEO wants a one-paragraph executive summary at the top, add that section. Each refinement makes the next report better.
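The saved-prompt approach can be as simple as a template with placeholders for the month and the folder, plus a list of refinements that grows as reviews surface new requests. The sketch below uses Python's standard `string.Template`. The template text is an abbreviated stand-in for your full prompt, and `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names.

```python
from string import Template

# Abbreviated stand-in for the saved prompt; in practice this text could
# live in a plain text file that is read in each month.
PROMPT_TEMPLATE = Template(
    "I need to create a monthly executive summary report for $month. "
    "The source documents are in the $folder folder on my desktop. "
    "Use the same sections and formatting as last month's report "
    "and add month-over-month comparisons."
)


def build_prompt(month: str, folder: str, refinements=()) -> str:
    """Fill in the month and folder, then append any accumulated refinements."""
    prompt = PROMPT_TEMPLATE.substitute(month=month, folder=folder)
    for note in refinements:
        prompt += " " + note
    return prompt


# Refinements collected from earlier reviews (examples only).
refinements = [
    "Include cash runway in the Financial Overview.",
    "Start with a one-paragraph executive summary.",
]
print(build_prompt("November", "November Reports", refinements))
```

Because `substitute` raises an error when a placeholder is left unfilled, a forgotten month or folder name surfaces immediately instead of ending up in the prompt.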
Scaling to larger document sets
The same approach works for larger document sets. Board reports might draw from 15 to 20 sources. Quarterly business reviews might combine 20 to 30 documents from every department. Compliance reports might reference dozens of policy documents and audit findings.
The key constraint is not document count but the clarity of your report structure. If you can define what goes where and which source provides it, the agent assembles the report regardless of volume.
Common mistakes to avoid
Vague report structures. "Summarize these eight documents" produces a generic summary. Naming specific sections, with specific data points from specific files, produces a useful report.
Stale source documents. Include only current versions. Archive old versions elsewhere.
Skipping the review. The agent handles extraction and assembly -- two-thirds of the work. The remaining third is human judgment: whether the framing accurately represents the situation, and whether these are the right numbers to highlight.
Not iterating on the prompt. Use the review process to refine your instructions. By the third or fourth month, the prompt will be dialed in and review will be fast.
What this replaces
The traditional process: open eight documents, manually extract 50 to 100 data points, organize into a template, write narrative paragraphs, format, distribute. Total: four to six hours.
With docrew: collect documents, give the agent your structure, review the output. Total: 30 to 45 minutes, mostly review.
The consistency benefit matters as much as the time savings. Human assembly is error-prone under pressure -- a number from the wrong tab, a status from last month's document, a variance calculated backwards. The agent extracts from the current documents every time, and it does not get tired at data point number 87.