The One About Automating PDF Invoices with n8n
I really hate manual data entry. Every month I have to download my credit card statement and spend 20 minutes copy-pasting transactions into my budget spreadsheet. Date, description, amount. Repeat 50 times.
Instead of continuing this manual cycle of procrastinating, I figured this would be a simple and good excuse to explore n8n for automating it.
The Setup
The only manual part I have to do now is drop a PDF credit card statement into a Google Drive folder, and then n8n handles the rest for me.
The workflow handles everything:
- Detects the new file (checks every hour, could increase this)
- Extracts the text
- Sends it to Claude to parse transactions
- Logs everything to Google Sheets
- Notifies me via Discord when done
Boom! I still go through the added transactions and assign them to my wife or me (or if it should be split).
How It Works
Google Drive Trigger → Download PDF → Extract Text → Clean Text → Claude → Parse JSON → Google Sheets → Discord
The key is using Claude to parse the PDF. Instead of regex, table extraction libraries, or OCR services, I just send the extracted text with a prompt:
Extract all transactions from this credit card statement.
Return only JSON with this structure:
{
"transactions": [
{
"transactionDate": "YYYY-MM-DD",
"description": "string",
"amount": number
}
]
}
Temperature: 0
I tried more elaborate prompts at first, but this simple version worked best. Sometimes that’s just how it goes 🤷.
Claude analyzes the statement and returns good JSON. No hallucinations so far with this setup. I did notice it missed some transactions when building out the system, but the nice thing is I’ve defined the exact JSON structure I want back.
What I Learned
Quick proof of concept, didn’t have to spend much time getting the result I wanted. Claude just gets what a transaction looks like, even when formatting varies. The cool thing is we have two different credit cards and both work fine (albeit the structure is quite similar).
Being explicit about the JSON structure and setting temperature to 0 made the difference between messy results and good output. Didn’t play around with this too much, just wanted to make sure Sonnet wouldn’t hallucinate transactions.
I didn’t build this to save time — I built it because me and my wife kept procrastinating on the manual part. Now we actually do the tracking instead of putting it off.
What’s Next
There are a few things I’d like to improve to make it more stable and feature rich:
- Auto-categorizing transactions since that’s still manual (although I do prefer this one since I get a monthly overview)
- Maybe add a basic comparison on what we spent this month? Although I’d rather make a separate workflow for that, something that runs once per month or is triggerable from Discord/WhatsApp
- Replace Discord with WhatsApp since my wife isn’t on Discord
- Test this with Haiku to see if I get similar results for cheaper
- Error handling, monitoring, etc. would also be nice
That’s it!
Stack: n8n, Claude 3.7 Sonnet, Google Drive, Google Sheets, Discord