Automate PDF Report Generation from Your Data

Step-by-step guide to automating PDF report generation from databases, APIs, and spreadsheets. Python and Node.js examples included.

By LightningPDF Team · · 7 min read
Automate PDF Report Generation from Your Data
TL;DR: Automate PDF reports by combining your data source with HTML templates and LightningPDF's API. Generate charts, tables, and branded reports programmatically in seconds.

Automate PDF Report Generation from Your Data

Most businesses produce weekly or monthly reports -- sales summaries, financial snapshots, project updates. Most teams produce them, and most still rely on someone opening a spreadsheet, copying numbers into a document, and exporting a PDF by hand. This manual process is slow, error-prone, and impossible to scale.

Automated PDF report generation is the process of programmatically pulling data from databases, APIs, or spreadsheets, merging it into an HTML template, and converting the result into a polished PDF document via an API, eliminating manual formatting and data entry.

This guide walks you through building a complete automated reporting pipeline. You will connect to data sources, build dynamic HTML templates, generate PDFs through the LightningPDF API, and schedule reports to run automatically. Code examples are provided in Python and Node.js.

Why Automate Report Generation?

The Cost of Manual Reporting

Consider a SaaS company that sends monthly performance reports to 500 clients. Each report requires pulling metrics from the database, formatting tables and charts, adding the client's branding, and exporting to PDF. If each report takes 10 minutes to produce manually, that is over 83 hours of labor every month -- more than two full-time work weeks dedicated to a task that a script can handle in minutes.

Beyond the time cost, manual reporting introduces errors. A misplaced decimal, an outdated chart, or a copy-paste mistake in one report undermines credibility. When a client questions a number, tracking down the source of the error in a manually assembled document is painful.

What Automated Reporting Gives You

  • Consistency: Every report follows the exact same template, formatting, and calculation logic.
  • Speed: Generate hundreds of reports in minutes instead of days.
  • Accuracy: Data flows directly from the source into the report with no manual transcription.
  • Timeliness: Schedule reports to generate and deliver automatically, never missing a deadline.
  • Audit trail: Every report is generated from a known data query at a known time, making discrepancies easy to trace.

Common Use Cases

Here's where automated reports show up:

  • SaaS analytics: Monthly usage reports for customers showing API calls, storage, active users
  • Financial services: Portfolio performance reports, transaction summaries, compliance documents
  • E-commerce: Sales reports, inventory summaries, supplier performance reviews
  • Healthcare: Patient visit summaries, lab result reports, billing statements
  • Education: Student progress reports, course completion certificates, attendance records
  • Marketing: Campaign performance reports, SEO audits, social media analytics

Connecting to Data Sources

Before you can generate a report, you need to pull the data that populates it. Here are the most common patterns.

From a Database (PostgreSQL/MySQL)

import psycopg2
from datetime import datetime, timedelta

def get_monthly_sales_data(client_id, month, year):
    conn = psycopg2.connect(
        host="db.example.com",
        database="analytics",
        user="report_reader",
        password="secure-password"
    )
    cursor = conn.cursor()

    cursor.execute("""
        SELECT
            product_name,
            SUM(quantity) as total_qty,
            SUM(revenue) as total_revenue,
            COUNT(DISTINCT order_id) as order_count
        FROM orders
        WHERE client_id = %s
          AND EXTRACT(MONTH FROM order_date) = %s
          AND EXTRACT(YEAR FROM order_date) = %s
        GROUP BY product_name
        ORDER BY total_revenue DESC
    """, (client_id, month, year))

    columns = [desc[0] for desc in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]

    cursor.close()
    conn.close()
    return rows

From a REST API

import requests

def get_analytics_data(api_token, date_range):
    response = requests.get(
        "https://analytics.example.com/api/v2/metrics",
        headers={"Authorization": f"Bearer {api_token}"},
        params={
            "start_date": date_range["start"],
            "end_date": date_range["end"],
            "metrics": "pageviews,sessions,conversions,revenue"
        }
    )
    return response.json()["data"]

From a Spreadsheet (CSV/Excel)

import pandas as pd

def get_spreadsheet_data(file_path):
    df = pd.read_excel(file_path, sheet_name="Monthly Data")
    # Clean and aggregate
    summary = df.groupby("category").agg({
        "revenue": "sum",
        "units": "sum",
        "profit_margin": "mean"
    }).reset_index()
    return summary.to_dict("records")

Building Dynamic Report Templates

HTML-based reports let you use the full web stack -- CSS for styling, JavaScript charting libraries for visualizations, and template engines for dynamic content. For a thorough grounding in HTML-to-PDF conversion approaches, our HTML-to-PDF complete guide covers the fundamentals.

A Complete Report Template

Here is a professional report template with a header, summary metrics, a data table, and a footer:

<!DOCTYPE html>
<html>
<head>
<style>
  @page {
    size: A4;
    margin: 20mm;
    @bottom-center {
      content: "Page " counter(page) " of " counter(pages);
      font-size: 10px;
      color: #94a3b8;
    }
  }
  * { margin: 0; padding: 0; box-sizing: border-box; }
  body {
    font-family: 'Helvetica Neue', Arial, sans-serif;
    color: #1e293b;
    font-size: 13px;
    line-height: 1.6;
  }
  .report-header {
    background: linear-gradient(135deg, #4F46E5 0%, #7C3AED 100%);
    color: #fff;
    padding: 30px 40px;
    margin: -20mm -20mm 30px -20mm;
    display: flex;
    justify-content: space-between;
    align-items: center;
  }
  .report-header h1 { font-size: 24px; font-weight: 700; }
  .report-header .meta { text-align: right; opacity: 0.9; font-size: 13px; }
  .summary-cards {
    display: flex;
    gap: 16px;
    margin-bottom: 30px;
  }
  .card {
    flex: 1;
    background: #f8fafc;
    border: 1px solid #e2e8f0;
    border-radius: 8px;
    padding: 20px;
    text-align: center;
  }
  .card .value {
    font-size: 28px;
    font-weight: 700;
    color: #4F46E5;
  }
  .card .label {
    font-size: 11px;
    text-transform: uppercase;
    letter-spacing: 1px;
    color: #64748b;
    margin-top: 4px;
  }
  .card .change {
    font-size: 12px;
    margin-top: 8px;
  }
  .change.up { color: #16a34a; }
  .change.down { color: #dc2626; }
  h2 {
    font-size: 16px;
    color: #1e293b;
    border-bottom: 2px solid #4F46E5;
    padding-bottom: 8px;
    margin: 30px 0 16px 0;
  }
  table {
    width: 100%;
    border-collapse: collapse;
    margin-bottom: 20px;
    page-break-inside: auto;
  }
  thead { display: table-header-group; }
  tr { page-break-inside: avoid; }
  th {
    background: #4F46E5;
    color: #fff;
    padding: 10px 14px;
    text-align: left;
    font-size: 11px;
    text-transform: uppercase;
    letter-spacing: 0.5px;
  }
  td {
    padding: 10px 14px;
    border-bottom: 1px solid #e2e8f0;
  }
  tbody tr:nth-child(even) { background: #f8fafc; }
  .chart-placeholder {
    background: #f1f5f9;
    border: 2px dashed #cbd5e1;
    border-radius: 8px;
    padding: 40px;
    text-align: center;
    color: #64748b;
    margin: 20px 0;
  }
  .footer {
    margin-top: 40px;
    padding-top: 16px;
    border-top: 1px solid #e2e8f0;
    font-size: 11px;
    color: #94a3b8;
    text-align: center;
  }
</style>
</head>
<body>
  <div class="report-header">
    <div>
      <h1>Monthly Sales Report</h1>
      <p>Acme Corp -- Client Analytics Division</p>
    </div>
    <div class="meta">
      <p>Report Period: February 2026</p>
      <p>Generated: Feb 23, 2026</p>
      <p>Prepared for: Widget Industries</p>
    </div>
  </div>

  <div class="summary-cards">
    <div class="card">
      <div class="value">$124,500</div>
      <div class="label">Total Revenue</div>
      <div class="change up">+12.3% vs last month</div>
    </div>
    <div class="card">
      <div class="value">1,847</div>
      <div class="label">Orders</div>
      <div class="change up">+8.1% vs last month</div>
    </div>
    <div class="card">
      <div class="value">$67.40</div>
      <div class="label">Avg Order Value</div>
      <div class="change up">+3.9% vs last month</div>
    </div>
    <div class="card">
      <div class="value">94.2%</div>
      <div class="label">Fulfillment Rate</div>
      <div class="change down">-0.8% vs last month</div>
    </div>
  </div>

  <h2>Revenue by Product Category</h2>
  <table>
    <thead>
      <tr>
        <th>Category</th>
        <th>Units Sold</th>
        <th>Revenue</th>
        <th>Margin</th>
        <th>MoM Change</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td>Electronics</td><td>423</td><td>$52,875</td>
        <td>32.1%</td><td style="color:#16a34a">+15.2%</td>
      </tr>
      <tr>
        <td>Accessories</td><td>891</td><td>$35,640</td>
        <td>45.3%</td><td style="color:#16a34a">+9.8%</td>
      </tr>
      <tr>
        <td>Software Licenses</td><td>156</td><td>$23,400</td>
        <td>89.0%</td><td style="color:#16a34a">+22.1%</td>
      </tr>
      <tr>
        <td>Support Plans</td><td>89</td><td>$12,585</td>
        <td>72.5%</td><td style="color:#dc2626">-3.4%</td>
      </tr>
    </tbody>
  </table>

  <div class="footer">
    <p>This report was auto-generated. Data sourced from analytics database.</p>
    <p>Acme Corp -- Confidential</p>
  </div>
</body>
</html>

Notice the CSS page-break-inside: avoid on table rows and display: table-header-group on thead -- these ensure tables reflow cleanly across pages. For more on handling page breaks, see our page break troubleshooting guide.

Generating Reports with LightningPDF

With your data source connected and your template built, generating the PDF takes one API call. Send the populated HTML to the LightningPDF API and receive a PDF back.

Python: Full Report Pipeline

import requests
import base64
from datetime import datetime

API_URL = "https://api.lightningpdf.dev/api/v1/pdf/generate"
API_KEY = "your-api-key"

def build_report_html(data, client_name, period):
    """Merge data into the report HTML template."""

    # Build summary cards
    total_revenue = sum(row["total_revenue"] for row in data)
    total_orders = sum(row["order_count"] for row in data)
    avg_order = total_revenue / total_orders if total_orders else 0

    # Build table rows
    table_rows = ""
    for row in data:
        margin = (row["total_revenue"] - row.get("cost", 0)) / row["total_revenue"] * 100
        table_rows += f"""
        <tr>
            <td>{row['product_name']}</td>
            <td>{row['total_qty']:,}</td>
            <td>${row['total_revenue']:,.2f}</td>
            <td>{margin:.1f}%</td>
        </tr>"""

    return f"""<!DOCTYPE html>
    <html><head><style>
      body {{ font-family: Helvetica, sans-serif; padding: 40px; color: #1e293b; }}
      .header {{
        background: linear-gradient(135deg, #4F46E5, #7C3AED);
        color: #fff; padding: 30px; margin: -40px -40px 30px -40px;
      }}
      .header h1 {{ font-size: 24px; margin-bottom: 4px; }}
      .cards {{ display: flex; gap: 16px; margin-bottom: 30px; }}
      .card {{
        flex: 1; background: #f8fafc; border: 1px solid #e2e8f0;
        border-radius: 8px; padding: 20px; text-align: center;
      }}
      .card .value {{ font-size: 28px; font-weight: 700; color: #4F46E5; }}
      .card .label {{
        font-size: 11px; text-transform: uppercase;
        letter-spacing: 1px; color: #64748b;
      }}
      h2 {{ border-bottom: 2px solid #4F46E5; padding-bottom: 8px;
           margin: 24px 0 12px 0; font-size: 16px; }}
      table {{ width: 100%; border-collapse: collapse; }}
      th {{ background: #4F46E5; color: #fff; padding: 10px 14px;
          text-align: left; font-size: 11px; text-transform: uppercase; }}
      td {{ padding: 10px 14px; border-bottom: 1px solid #e2e8f0; }}
      tr:nth-child(even) {{ background: #f8fafc; }}
      tr {{ page-break-inside: avoid; }}
    </style></head><body>
      <div class="header">
        <h1>Monthly Performance Report</h1>
        <p>{client_name} -- {period}</p>
        <p>Generated: {datetime.now().strftime('%B %d, %Y')}</p>
      </div>
      <div class="cards">
        <div class="card">
          <div class="value">${total_revenue:,.0f}</div>
          <div class="label">Total Revenue</div>
        </div>
        <div class="card">
          <div class="value">{total_orders:,}</div>
          <div class="label">Total Orders</div>
        </div>
        <div class="card">
          <div class="value">${avg_order:,.2f}</div>
          <div class="label">Avg Order Value</div>
        </div>
      </div>
      <h2>Revenue by Product</h2>
      <table>
        <thead><tr><th>Product</th><th>Units</th><th>Revenue</th><th>Margin</th></tr></thead>
        <tbody>{table_rows}</tbody>
      </table>
    </body></html>"""


def generate_report(html, filename):
    """Send HTML to LightningPDF and save the PDF."""
    response = requests.post(
        API_URL,
        headers={
            "X-API-Key": API_KEY,
            "Content-Type": "application/json"
        },
        json={
            "html": html,
            "options": {
                "format": "A4",
                "print_background": True,
                "margin": {
                    "top": "20mm",
                    "bottom": "20mm",
                    "left": "15mm",
                    "right": "15mm"
                }
            }
        }
    )

    if response.status_code == 200:
        result = response.json()
        pdf_bytes = base64.b64decode(result["data"]["pdf"])
        with open(filename, "wb") as f:
            f.write(pdf_bytes)
        print(f"Report saved: {filename} ({result['data']['generation_time_ms']}ms)")
        return pdf_bytes
    else:
        raise Exception(f"Generation failed: {response.status_code} {response.text}")


# Usage: generate a single report
data = get_monthly_sales_data(client_id=42, month=2, year=2026)
html = build_report_html(data, "Widget Industries", "February 2026")
generate_report(html, "report_widget_industries_feb_2026.pdf")

Node.js: Report Generation

const fs = require('fs');

const API_URL = 'https://api.lightningpdf.dev/api/v1/pdf/generate';
const API_KEY = 'your-api-key';

async function generateReport(data, clientName, period) {
  const totalRevenue = data.reduce((sum, row) => sum + row.revenue, 0);
  const totalOrders = data.reduce((sum, row) => sum + row.orders, 0);

  const tableRows = data.map(row => `
    <tr>
      <td>${row.category}</td>
      <td>${row.orders.toLocaleString()}</td>
      <td>$${row.revenue.toLocaleString('en-US', {minimumFractionDigits: 2})}</td>
      <td>${row.growth > 0 ? '+' : ''}${row.growth.toFixed(1)}%</td>
    </tr>
  `).join('');

  const html = `<!DOCTYPE html>
  <html><head><style>
    body { font-family: Helvetica, sans-serif; padding: 40px; }
    .header { background: linear-gradient(135deg, #4F46E5, #7C3AED);
              color: #fff; padding: 30px; margin: -40px -40px 30px; }
    .cards { display: flex; gap: 16px; margin-bottom: 30px; }
    .card { flex: 1; background: #f8fafc; border: 1px solid #e2e8f0;
            border-radius: 8px; padding: 20px; text-align: center; }
    .card .value { font-size: 28px; font-weight: 700; color: #4F46E5; }
    .card .label { font-size: 11px; text-transform: uppercase;
                   letter-spacing: 1px; color: #64748b; }
    table { width: 100%; border-collapse: collapse; }
    th { background: #4F46E5; color: #fff; padding: 10px; text-align: left; }
    td { padding: 10px; border-bottom: 1px solid #e2e8f0; }
  </style></head><body>
    <div class="header">
      <h1>Monthly Report</h1>
      <p>${clientName} -- ${period}</p>
    </div>
    <div class="cards">
      <div class="card">
        <div class="value">$${totalRevenue.toLocaleString()}</div>
        <div class="label">Revenue</div>
      </div>
      <div class="card">
        <div class="value">${totalOrders.toLocaleString()}</div>
        <div class="label">Orders</div>
      </div>
    </div>
    <table>
      <thead><tr><th>Category</th><th>Orders</th><th>Revenue</th><th>Growth</th></tr></thead>
      <tbody>${tableRows}</tbody>
    </table>
  </body></html>`;

  const response = await fetch(API_URL, {
    method: 'POST',
    headers: {
      'X-API-Key': API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      html,
      options: { format: 'A4', print_background: true }
    })
  });

  if (!response.ok) throw new Error(`API error: ${response.status}`);

  const result = await response.json();
  const pdfBuffer = Buffer.from(result.data.pdf, 'base64');
  const filename = `report_${clientName.toLowerCase().replace(/\s+/g, '_')}_${period.replace(/\s+/g, '_')}.pdf`;
  fs.writeFileSync(filename, pdfBuffer);
  console.log(`Report saved: ${filename} (${result.data.generation_time_ms}ms)`);
  return pdfBuffer;
}

Batch Report Generation

When you need to generate reports for many clients at once, the batch API is significantly more efficient than making individual calls. Instead of one HTTP request per report, you send a single request containing all the report data:

def generate_batch_reports(clients_data):
    """Generate reports for multiple clients in one API call."""
    items = []
    for client in clients_data:
        html = build_report_html(
            client["data"],
            client["name"],
            client["period"]
        )
        items.append({
            "html": html,
            "options": {"format": "A4", "print_background": True},
            "metadata": {"client_id": client["id"]}
        })

    response = requests.post(
        "https://api.lightningpdf.dev/api/v1/pdf/batch",
        headers={
            "X-API-Key": API_KEY,
            "Content-Type": "application/json"
        },
        json={"items": items}
    )

    batch = response.json()
    print(f"Queued {batch['data']['total']} reports for generation")
    return batch["data"]["batch_id"]

For language-specific patterns beyond what is shown here, see our guides for Python, Node.js, and Go.

Scheduling Automated Reports

The final piece of the pipeline is scheduling. You want reports to generate and deliver themselves without any human intervention.

Cron Job (Linux/macOS)

The simplest approach is a cron job that runs your script on a schedule:

# Generate monthly reports on the 1st of every month at 6:00 AM
0 6 1 * * /usr/bin/python3 /opt/reports/generate_monthly_reports.py >> /var/log/reports.log 2>&1

Python Script with Schedule

For more control, use a scheduling library:

import schedule
import time

def monthly_report_job():
    """Generate and email monthly reports for all clients."""
    clients = get_active_clients()
    for client in clients:
        data = get_monthly_sales_data(client["id"], month=2, year=2026)
        html = build_report_html(data, client["name"], "February 2026")
        pdf_bytes = generate_report(html, f"report_{client['id']}.pdf")
        send_report_email(client["email"], pdf_bytes, client["name"])
    print(f"Sent {len(clients)} monthly reports")

# Run on the 1st of each month
schedule.every().month.at("06:00").do(monthly_report_job)

while True:
    schedule.run_pending()
    time.sleep(60)

Node.js with node-cron

const cron = require('node-cron');

// Run at 6:00 AM on the 1st of every month
cron.schedule('0 6 1 * *', async () => {
  console.log('Starting monthly report generation...');
  const clients = await getActiveClients();
  for (const client of clients) {
    const data = await getClientData(client.id);
    const pdf = await generateReport(data, client.name, 'February 2026');
    await sendEmail(client.email, pdf);
  }
  console.log(`Sent ${clients.length} reports`);
});

Webhook-Triggered Reports

For event-driven reports (for example, generating a report each time a data import completes), use webhooks:

from flask import Flask, request

app = Flask(__name__)

@app.route("/webhook/data-import-complete", methods=["POST"])
def handle_import_complete():
    payload = request.json
    client_id = payload["client_id"]
    period = payload["period"]

    data = get_monthly_sales_data(client_id, period["month"], period["year"])
    html = build_report_html(data, payload["client_name"], period["label"])
    generate_report(html, f"report_{client_id}_{period['label']}.pdf")

    return {"status": "report_generated"}, 200

LightningPDF vs Alternatives for Report Generation

Feature LightningPDF Puppeteer (Self-Hosted) DocRaptor PDFShift
Generation speed <100ms native, <2s Chromium 1-3 seconds 2-5 seconds 1-3 seconds
Batch API Up to 100 per call Manual loops Not available Not available
Template marketplace 15+ starter templates None None None
CSS support Full (flexbox, grid, gradients) Full Chrome PrinceXML subset Full Chrome
Cost (5K reports/mo) $29 (Pro) ~$300 infra $79 $29
Scheduling built-in Webhooks + batch DIY Webhooks Webhooks
Chart rendering Inline SVG, Chart.js via Chromium Full JS support Limited JS Full JS
Page break handling Automatic + CSS control CSS only CSS Paged Media CSS only

For deeper comparisons, see LightningPDF vs Puppeteer for the build-vs-buy analysis, LightningPDF vs DocRaptor for print-quality considerations, and LightningPDF vs PDFShift for a feature-by-feature breakdown. Our full API comparison benchmarks all seven major PDF APIs across speed, features, and pricing.

Tips for Production-Quality Reports

Embedding Charts as SVG

Render charts as inline SVG -- they scale perfectly and don't need a network request. Unlike JavaScript-based charting libraries that require a browser runtime, SVG is static markup that renders identically in every PDF engine:

<svg width="400" height="200" viewBox="0 0 400 200">
  <!-- Simple bar chart -->
  <rect x="20" y="40" width="60" height="140" fill="#4F46E5" rx="4"/>
  <rect x="100" y="80" width="60" height="100" fill="#7C3AED" rx="4"/>
  <rect x="180" y="20" width="60" height="160" fill="#4F46E5" rx="4"/>
  <rect x="260" y="100" width="60" height="80" fill="#7C3AED" rx="4"/>
  <!-- Labels -->
  <text x="50" y="195" text-anchor="middle" font-size="11" fill="#64748b">Q1</text>
  <text x="130" y="195" text-anchor="middle" font-size="11" fill="#64748b">Q2</text>
  <text x="210" y="195" text-anchor="middle" font-size="11" fill="#64748b">Q3</text>
  <text x="290" y="195" text-anchor="middle" font-size="11" fill="#64748b">Q4</text>
</svg>

Handling Large Tables

When your report contains tables with hundreds of rows, page breaks become critical. Use CSS to keep the table header visible on every page and prevent rows from splitting:

thead { display: table-header-group; }
tfoot { display: table-footer-group; }
tr { page-break-inside: avoid; }

Multi-Page Layouts

For reports with distinct sections, force page breaks between them:

.section { page-break-before: always; }
.section:first-child { page-break-before: auto; }

Getting Started

  1. Sign up for a free LightningPDF account -- 50 PDFs per month, no credit card required.
  2. Grab your API key from the dashboard.
  3. Connect to your data source using the patterns above.
  4. Build or customize a report template from the marketplace, or design your own in the visual designer.
  5. Wire up generation with the Python or Node.js examples in this guide.
  6. Schedule your reports with cron, node-cron, or webhook triggers.

When you are ready to scale beyond the free tier, the Starter plan at $9/month covers 2,000 reports, and the Pro plan at $29/month handles 10,000 reports.

Read the full API documentation for additional options including landscape orientation, custom page sizes, header and footer injection, and watermarks.

Frequently Asked Questions

How do I automate PDF report generation?

Connect your data source to an HTML template using Python or Node.js, then send the populated HTML to the LightningPDF API. The API returns a base64-encoded PDF. Schedule the script with cron or a task scheduler to run automatically on a daily, weekly, or monthly basis without manual intervention.

Can I include charts and graphs in automated PDF reports?

Yes. The most reliable method is rendering charts as inline SVG directly in your HTML template. You can also use JavaScript charting libraries like Chart.js, which LightningPDF renders through its Chromium engine. For maximum compatibility and speed, pre-render charts server-side and embed them as SVG or base64 PNG images.

How many reports can I generate per month with LightningPDF?

The free tier includes 50 PDFs per month. The Starter plan at $9 per month covers 2,000 reports, the Pro plan at $29 per month handles 10,000, and higher tiers scale to 150,000 and beyond. Batch API calls let you generate up to 100 reports in a single request for maximum throughput.

L

LightningPDF Team

Building fast, reliable PDF generation tools for developers.

Ready to generate PDFs?

Start free with 50 PDFs per month. No credit card required.

Get Started Free