Imagine you are from the Finance department, whose daily contributions are significant to the company’s financial health. Yet, a large chunk of your valuable time is swallowed up by a highly repetitive, manual process: invoice processing.
Every day, you find yourself playing a relentless “matching game” in the office, one your boss will never scold you for even though it eats up so much time: “Word Search” (extracting the information from the invoice) and “Candy Crush Saga” (matching the extracted data against the corresponding purchase order and goods receipt).
Your inbox is a constant stream of emails, each with a different vendor invoice attached as a PDF. Throughout the day, hundreds of invoices arrive, from a variety of vendors, some large and structured, others from small suppliers with custom templates.
For each invoice, you must open the attachment and begin the manual data extraction process. You’re not just looking for the total amount; you are searching for:
And this is where the real challenge begins.
After extracting the invoice data, you must perform a “3-way match” to ensure the company is paying for exactly what it ordered. This involves, but may not be limited to:
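To make the matching idea concrete, here is a minimal Python sketch of a 3-way match between an invoice, its purchase order, and the goods receipt. The `Line` structure, its field names, and the specific checks are assumptions for illustration only; a real finance system will have its own rules and tolerances.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Line:
    """One line item. Field names are illustrative assumptions."""
    item_code: str
    quantity: Decimal
    unit_price: Decimal  # a real goods receipt may not carry a price


def three_way_match(invoice, purchase_order, goods_receipt,
                    price_tolerance=Decimal("0.00")):
    """Return a list of discrepancies; an empty list means a clean match."""
    ordered_by_code = {line.item_code: line for line in purchase_order}
    received_by_code = {line.item_code: line for line in goods_receipt}
    problems = []
    for line in invoice:
        ordered = ordered_by_code.get(line.item_code)
        if ordered is None:
            problems.append(f"{line.item_code}: not on purchase order")
            continue
        if abs(line.unit_price - ordered.unit_price) > price_tolerance:
            problems.append(f"{line.item_code}: unit price differs from purchase order")
        if line.quantity > ordered.quantity:
            problems.append(f"{line.item_code}: billed quantity exceeds quantity ordered")
        received = received_by_code.get(line.item_code)
        received_qty = received.quantity if received else Decimal("0")
        if line.quantity > received_qty:
            problems.append(f"{line.item_code}: billed quantity exceeds quantity received")
    return problems
```

An invoice that bills 10 units when only 8 were received would come back with one discrepancy, which is exactly the kind of case a human would otherwise have to spot by eye.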
Beyond just tired eyes from staring at invoices all day, the manual matching game causes bigger problems for how well things run and for your well-being:
Time consumption
Every invoice takes several minutes to go through. When you have dozens or even hundreds of them each day, this is not just a small task; eventually, it becomes your whole job. This leaves you little time for other important tasks.
High chance of mistakes
Typing or copying numbers from a PDF into a computer system is easy to mess up. A simple mistake like mixing up numbers or letters in an invoice or getting the total wrong can cause payment to be late, upset your suppliers, or lead to financial problems that are hard to fix later.
Scalability issues
As your company gets bigger, so does the number of invoices. Just hiring more people to deal with the extra work is not a good long-term solution. The old way of doing things can’t keep up, and everyone gets swamped, especially during busy times. This pressure can also contribute to a high employee turnover rate.
Stress from deadlines
Invoice due dates do not wait for anyone. If you get stuck playing the matching game, it can mean late fees, losing trust with suppliers, or even messing up important deliveries. The constant pressure to get everything done right and on time can be very stressful.
Now, let’s talk about the true “power-up” in your “matching game”: Azure Document Intelligence.
Azure Document Intelligence is a cloud-based service that uses machine learning to extract structured data from almost any document, whether it’s neatly organized or a bit chaotic. It is more than basic Optical Character Recognition (OCR), which only “sees” words; it also takes the context and layout of a document into account.
The service offers prebuilt models, each pre-trained on a common document type (mostly in United States formats). There is also a Custom Extraction Model, which lets you label and train on your own templates.
Let’s have a look at some prebuilt models that might be helpful to power up your Invoice Matching Game:
Invoice
Extracts vendor and customer details, invoice number, billing/shipping addresses, dates, line items (description, quantity, unit price), subtotals, taxes, and total amount. This model is useful for automating accounts payable workflows.
Receipt
Extracts merchant name, transaction date/time, itemized purchases, taxes, and total amount. This model is ideal for expense tracking and consumer analytics.
Besides the two models above, there are other prebuilt models for identity documents, US health insurance cards, US personal tax, US mortgage, US pay stubs, US bank statements, US checks, credit cards, US marriage certificates, contracts, and business cards.
Custom Extraction Model
Can’t find a prebuilt model that fits your own document templates? No worries!
Azure Document Intelligence also offers a Custom Extraction Model, a powerful tool that lets you train a document processing model tailored to the document types and layouts that the prebuilt models do not cover.
You build this model from your own labeled documents so that it extracts the specific fields and data points you need, and building it is straightforward.
So, how does this AI magic work? It’s surprisingly straightforward. You upload a document to the service, and its AI springs into action, scanning the document layout. Then you define the fields you want to label (each can be a text field, selection mark, signature, or table).
If you already have a trained custom extraction model, you can auto-label the document with it. Otherwise, label the fields you want to extract in Document Intelligence Studio.
Finally, the service returns the extracted data to you in a structured JSON format, ready for your next move in the automation game.
So, what’s next? Power Automate.
Power Automate is a low-code/no-code platform for creating automated workflows. Think of it as the orchestrator that connects different services on Azure and Microsoft Dynamics 365.
A company receives invoices as PDF attachments in an Outlook email inbox. The goal is to extract key data and save it to a SharePoint list for review.
Step 1: Set up a Power Automate flow
Set up a Power Automate flow with the “When a new email arrives” trigger. You may filter emails with a specific subject line from a specific sender to keep the flow focused.
Step 2: Get the attachment
Add the “Get attachments” action from the Outlook connector. Then save the attachment to a SharePoint Site or OneDrive folder.
Step 3: Analyze the Document with Azure Document Intelligence
This is the key step. Use the HTTP action to call the Azure Document Intelligence REST API. The request needs three components: the API endpoint URL (including the model ID, such as invoice-matching-ocr), the API key, and a request body containing the attachment’s content. Because the analysis runs asynchronously, you may need a “Delay” action or a loop that checks whether the status has become successful.
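For readers who want to see what the HTTP action is doing under the hood, here is a rough Python sketch of the same request-then-poll pattern. The API version, the `documentintelligence` URL path, and the helper names are assumptions based on the v4.0 REST API; older resources use a different path and version, so check what your resource supports.

```python
import base64
import json
import time
import urllib.request

API_VERSION = "2024-11-30"  # assumed; use the version your resource supports


def build_analyze_request(endpoint, model_id, api_key, pdf_bytes):
    """Build the POST request that submits a document for analysis."""
    url = (f"{endpoint.rstrip('/')}/documentintelligence/documentModels/"
           f"{model_id}:analyze?api-version={API_VERSION}")
    body = json.dumps({"base64Source": base64.b64encode(pdf_bytes).decode("ascii")})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": api_key,
            "Content-Type": "application/json",
        },
    )


def fetch_json(url, api_key):
    """GET a URL with the API key and decode the JSON body."""
    req = urllib.request.Request(url, headers={"Ocp-Apim-Subscription-Key": api_key})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_result(get_status, interval_seconds=2.0, max_attempts=30, sleep=time.sleep):
    """Poll until the operation reports 'succeeded' (the 'delay' loop in the flow)."""
    for _ in range(max_attempts):
        result = get_status()
        status = result.get("status")
        if status == "succeeded":
            return result
        if status == "failed":
            raise RuntimeError(f"analysis failed: {result.get('error')}")
        sleep(interval_seconds)
    raise TimeoutError("analysis did not finish in time")


def analyze_document(endpoint, model_id, api_key, pdf_bytes):
    """Submit the document, then poll the URL from the Operation-Location header."""
    request = build_analyze_request(endpoint, model_id, api_key, pdf_bytes)
    with urllib.request.urlopen(request) as resp:
        operation_url = resp.headers["Operation-Location"]
    return wait_for_result(lambda: fetch_json(operation_url, api_key))
```

In Power Automate, the first function corresponds to the HTTP action, and `wait_for_result` corresponds to a “Do until” loop with a “Delay” inside it.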
Step 4: Parse the Results
Once the analysis is complete, the API returns a JSON response. Use the “Parse JSON” action in Power Automate to extract the specific data points you need, for example, invoice_id, vendor_name, total_amount, and invoice_date.
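As a rough idea of what the parsing step pulls out, the sketch below reads the labeled fields from an analysis result in Python. It assumes the result nests fields under `analyzeResult.documents[0].fields`, that each field carries `content` and `confidence`, and that your custom model uses the field names from this example; prebuilt models use their own field names.

```python
def extract_fields(result, wanted=("invoice_id", "vendor_name",
                                   "total_amount", "invoice_date")):
    """Pick the wanted fields out of an analysis result.

    Returns a dict keyed by field name. Each value is None when the field
    is missing, otherwise a dict with the raw text and the confidence score.
    """
    documents = result.get("analyzeResult", {}).get("documents", [])
    fields = documents[0].get("fields", {}) if documents else {}
    extracted = {}
    for name in wanted:
        field = fields.get(name)
        if field is None:
            extracted[name] = None
        else:
            extracted[name] = {
                "value": field.get("content"),
                "confidence": field.get("confidence"),
            }
    return extracted
```

Keeping the confidence score next to each value is a design choice: it gives the reviewer in the approval step a hint about which fields deserve a second look.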
Step 5: Store the Data
Use an action to store the extracted data. For example, you can use the following actions, any of which also work well to store your data.
Then, map the extracted JSON values to the columns in your destination file. For example, the vendor_name JSON field goes into the vendor_name column in the SharePoint list.
Step 6: Notify the person in charge (PIC) to verify the extracted data
Finally, to keep a human in the loop and make sure the extracted data is correct, the workflow can end with “Start and wait for an approval”. This sends an approval request to the Invoice Matching Finance Officer, who checks whether the information extracted from the invoice was recorded correctly. If it was, they press “Approve” and proceed to the payment step. Otherwise, they can “Reject” and check the invoice themselves.
A custom extraction model is not limited to different templates of invoices and purchase orders. It can also be tailored to your other documents, for example, a traffic ticket.
Here, we have a scenario where a car rental company would like to extract information from the traffic ticket issued by the Royal Thai Police and then match it with their records in an Excel file. With this automation flow, they will be able to know which tenant should take responsibility for paying the traffic ticket.
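The matching step in this scenario could look roughly like the Python sketch below. It assumes the ticket fields have already been extracted and the Excel rows have already been loaded as dictionaries; the key names (`license_plate`, `violation_time`, `start`, `end`, `tenant`) are hypothetical and would follow your own labels and spreadsheet columns.

```python
from datetime import datetime


def normalize_plate(plate):
    """Remove spaces and hyphens so formatting differences do not block a match."""
    return plate.replace(" ", "").replace("-", "").upper()


def find_responsible_tenant(ticket, rentals):
    """Return the tenant who had the car when the violation happened, or None."""
    plate = normalize_plate(ticket["license_plate"])
    when = ticket["violation_time"]
    for rental in rentals:
        same_car = normalize_plate(rental["license_plate"]) == plate
        if same_car and rental["start"] <= when <= rental["end"]:
            return rental["tenant"]
    return None


# Example with made-up records
rentals = [
    {"license_plate": "AB 1234", "tenant": "Tenant A",
     "start": datetime(2024, 5, 1, 9, 0), "end": datetime(2024, 5, 5, 18, 0)},
    {"license_plate": "AB 1234", "tenant": "Tenant B",
     "start": datetime(2024, 5, 6, 9, 0), "end": datetime(2024, 5, 9, 18, 0)},
]
ticket = {"license_plate": "ab-1234", "violation_time": datetime(2024, 5, 7, 14, 30)}
responsible = find_responsible_tenant(ticket, rentals)
```

Matching on both the plate and the rental period matters here, because the same car is rented to different tenants over time.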
So, the automation workflow is like this:
If you do not want a fully automated workflow, or you would rather start your OCR workflow by uploading an invoice yourself, no problem. We can also build a web application that connects to the Custom Extraction Model on Azure Document Intelligence.
A recap of the benefits of this seamless data extraction automation workflow:
As our journey through the automated workflow comes to an end, it’s clear that combining Azure Document Intelligence and Power Automate is not just about streamlining tasks. It’s about fundamentally changing the game. No longer are organizations stuck in reactive, manual grind levels, constantly battling paperwork and data entry errors.
With this powerful tech stack, you can transform your operations into a proactive, automated powerhouse, giving your teams the ultimate cheat code to reclaim valuable time and focus on strategic missions.
This is not just efficiency; it’s about unlocking the true value of your data, turning unstructured documents into actionable insights that fuel smarter decisions. So, are you ready to level up your business?
It’s time to stop playing defense against manual bottlenecks and start a new game where automation leads to victory royale. Deploy this solution today and experience the next generation of business efficiency!