Automation has always come with challenges, and the methods for tackling them have evolved over the years. Repetitive manual processes were first automated with macros: pre-recorded sets of rules that execute tedious, repetitive jobs. Then Robotic Desktop Automation (RDA) and Robotic Process Automation (RPA) took automation to the next level, each further reducing the amount of manual work performed by humans.
The latest frontier of automation, Intelligent Automation (IA), promises to remove the remaining limitations by applying AI and ML technologies to cognitive tasks, such as reading and understanding documents, interpreting speech and text commands, and generating voice and text content in response. The Intelligent Automation technology landscape comprises tools and platforms such as Business Process Mining, Business Process Orchestration, ML and AI, Natural Language Processing (NLP), and Intelligent Document Processing (IDP).
While new technologies evolve, automating data extraction from documents remains a significant barrier that prevents many companies from implementing IA solutions successfully. Data extraction from relatively simple documents (such as simple tables or forms) has been implemented successfully industry-wide, with low rates of unrecognized documents. However, more complex documents, such as invoices and contracts, hold all kinds of important data that cannot yet be extracted fully automatically. Examples of such cases across industries include, but are not limited to:
Banking, Investment, Asset Management and Insurance:
Loan Contract Processing
Financial Statement Processing
Invoice Data Processing
Mortgage Document Data Extraction
Document Content Search
Confirmations and Pre/Post Matching
Customer Onboarding, Account Opening
Loan Applications
Compliance-Related Processes
Receipt Processing
Vendor Onboarding
Claims Handling
Mortgage Processing
Manufacturing, Supply Chain
Sales Order Processing
Accounts Payable/Receivable
Parts Requests from Customers
Remittance Processing
Order Scheduling and Tracking of Shipments
Bill of Lading
Transport Notes
Media
Contract Management
Healthcare
Billings and Claims Management
Insurance Processing
IDP is a solution that uses ML, Computer Vision and NLP technologies to capture, categorize, and extract data from PDFs, images, and office documents, and to map that data to structured data sources. Because it relies on these “smart” technologies, IDP can significantly improve the quality of data extraction from complex documents.
IDP is usually integrated with RPA, BPM and internal systems and applications, or provided as a Software as a Service (SaaS) platform. A variety of technologies can be used to implement IDP solutions:
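The categorize, extract, and map steps can be illustrated with a deliberately small Python sketch. It assumes the capture step (OCR) has already produced plain text, and it substitutes keyword matching and regular expressions for the ML, Computer Vision and NLP models a real IDP product would use. All names (`InvoiceRecord`, `categorize`, `extract_invoice`) and the sample text are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class InvoiceRecord:
    """Structured target the extracted values are mapped to (hypothetical schema)."""
    invoice_number: Optional[str]
    total: Optional[float]


def categorize(text: str) -> str:
    """Toy classifier: keyword lookup standing in for an ML document classifier."""
    lowered = text.lower()
    if "invoice" in lowered:
        return "invoice"
    if "claim" in lowered:
        return "claim"
    return "unknown"


def extract_invoice(text: str) -> InvoiceRecord:
    """Toy extractor: regular expressions standing in for NLP-based extraction."""
    number = re.search(r"Invoice\s*(?:No\.?|#)\s*:?\s*([A-Z0-9-]+)", text, re.IGNORECASE)
    total = re.search(r"Total\s*:?\s*\$?\s*([\d,]+\.\d{2})", text, re.IGNORECASE)
    return InvoiceRecord(
        invoice_number=number.group(1) if number else None,
        total=float(total.group(1).replace(",", "")) if total else None,
    )


# Assumed sample: text already captured from a scanned page by OCR.
sample = "ACME Corp\nInvoice No: INV-2041\nTotal: $1,250.00"

if categorize(sample) == "invoice":
    record = extract_invoice(sample)
    print(record)
```

Fields the patterns cannot find are left as `None` rather than guessed, so unrecognized values can be routed to manual review.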
Open Source Solutions
spaCy
AllenNLP
Stanford University NLP
Tabula
Excalibur
Pros
Mostly free of charge
Could be fully integrated into existing infrastructure without external endpoints
Cons
Lower recognition quality out of the box, as these models and libraries are not trained on exabytes of data
Requires qualified specialists to deploy on local infrastructure, build an API, and tune and train the models
Machine Learning as a Service (MLaaS) Solutions:
Amazon Comprehend
Google Cloud Platform Natural Language
IBM Watson Discovery
Microsoft Azure Cognitive Natural Language Processing
Pros
Fast and easy start: no infrastructure to deploy first, and an API is available
Pre-trained on exabytes of data, which provides much better results out of the box
Transparent, volume-based pricing, with rates gradually decreasing as volume grows
Convenient integrated annotation service
Cons
Ongoing costs, compared with an open-source solution deployed on-premises
Working with tables is a challenge
Software as a Service (SaaS) Solutions:
IBM Watson Compare & Comply
AWS Textract
Pros
Works out of the box, with decent recognition of tables and some key-value pairs
Hassle-free integration via API
Cons
Impossible or hard to tune
If a key-value pair is not recognized, extracting it requires additional effort
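In practice, that additional effort often takes the form of a fallback step: keep the key-value pairs the service did return and fill in the missing ones with custom rules. The following is a minimal sketch, assuming the service response has already been reduced to a plain dictionary. `fill_missing_fields`, the field names, and the patterns are hypothetical and do not reflect any vendor's API.

```python
import re
from typing import Dict, Optional

# Hypothetical fallback rules: one regular expression per expected field.
FALLBACK_PATTERNS: Dict[str, str] = {
    "due_date": r"Due\s+Date\s*:?\s*(\d{4}-\d{2}-\d{2})",
    "po_number": r"PO\s*(?:Number|#)\s*:?\s*(\w+)",
}


def fill_missing_fields(
    service_fields: Dict[str, Optional[str]], raw_text: str
) -> Dict[str, Optional[str]]:
    """Return a copy of service_fields with missing values filled from raw_text."""
    merged = dict(service_fields)
    for field, pattern in FALLBACK_PATTERNS.items():
        if merged.get(field):
            continue  # the service already recognized this field
        match = re.search(pattern, raw_text, re.IGNORECASE)
        merged[field] = match.group(1) if match else None
    return merged


raw_text = "PO Number: 77120\nDue Date: 2024-03-15"
service_fields = {"po_number": "77120"}  # assumption: the service missed due_date
result = fill_missing_fields(service_fields, raw_text)
print(result)
```

Values returned by the service take precedence here; the fallback rules only run for fields that are absent or empty.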
DataArt helps companies implement IDP solutions by combining open-source, MLaaS and SaaS solutions with our own IP and solution accelerators. We offer our customers IDP consulting services, pilot implementation, and integration with RPA, BPM and internal systems.