The document automation sphere would be complicated enough even if it weren’t in a constant state of flux, with companies like ours rushing to make the impossible a reality. As is, many businesses struggle in the face of choice – whether to automate at all, which processes to hand over to AI first, and how to choose the right solution for the business.
Always seeking to tailor our solutions to actual client needs, Applica reached out to a preeminent authority on intelligent text automation, Edward Benson of the natural language database startup NLUDB, to ask about the future of intelligent automation and the role of deep learning in what’s to come. This post is based on our recent correspondence.
Recognizing that many businesses, even many Fortune 500 companies, still rely on humans alone to handle documents, Benson advocates for letting automation drive productivity, profit, and competitive edge. Rather than emphasizing one automation model over the rest, he encourages organizations to seek tailor-made technological solutions that address their specific needs. In his book, Benson clearly lays out the advantages of and obstacles to each available automation type, from simple rule-based systems with straightforward applications to the far more nuanced results possible only with deep learning.
The crucial questions in choosing the right automation solution concern the goals of automation and the technical aspects of the data extraction itself. The former requires analysis of the business process at stake: Is automation possible? Will automation speed, refine, or enhance the results? The latter concerns the optimal type of automation. Here, the choices involve software requirements, data storage constraints, engineer involvement, and – most importantly – the complexity of the data: the more intricate the data and the more demanding the extraction needs, the more likely it is that deep learning will be necessary.
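That rule of thumb can be sketched as a simple decision helper. The function name, ratings, and thresholds below are illustrative assumptions for this post, not criteria from Benson's book:

```python
# Hypothetical decision helper: encodes the rule of thumb that the more
# complex the data and the more demanding the extraction needs, the more
# likely deep learning is necessary. Thresholds are illustrative only.

def suggest_automation_type(data_complexity: int, extraction_demands: int) -> str:
    """Map rough 1-5 ratings of data complexity and extraction demands
    to a suggested automation approach."""
    score = data_complexity + extraction_demands
    if score <= 4:
        return "rule-based"       # fixed layouts, simple fields
    if score <= 7:
        return "classical ML"     # some variance, moderate extraction needs
    return "deep learning"        # high variance, nuanced extraction

print(suggest_automation_type(2, 1))  # rule-based
print(suggest_automation_type(5, 5))  # deep learning
```

In practice this assessment is qualitative, but the direction of the mapping is the point: complexity pushes toward deep learning.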
“Once we figure out how to do something, it stops being called AI.”
When we asked Benson about the future of the intelligent automation industry, we wanted to know if he sees it as exclusively deep-learning based and code-free, or whether he predicts there will always be a use for some of the legacy tools currently on the market. “Deep learning is rapidly becoming a critical piece to many automation tasks,” Benson told us. “It’s undoubtedly where the industry is headed. I think we’ll see two basic outcomes with legacy tooling. Tools that perform well for a defined task won’t need to change: as the saying goes, ‘if it ain’t broke, don’t fix it.’ Tools that underperform with respect to deep learning will have to adopt deep learning as a technique or risk replacement. As for coding versus code-free, I think we will continue to see code used as the ‘glue’ which holds components together but decreasingly as the core algorithm which produces results. That core will be a combination of learned models and task-dependent constraints.”
Asked whether deep learning is already the key to companies getting the most out of their investments in document automation, Benson stated that it has become an important component of any document automation strategy. “First, because there is a valuable category of business problems that deep learning is uniquely capable of automating. And second, because deep learning can be combined with classical approaches to make the whole system
better than its parts.”
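One common shape of that hybrid pattern is a cheap classical rule for the easy cases with a learned model as the fallback. The sketch below is our illustration of the idea, not a method from Benson or NLUDB; the `model` argument is a hypothetical stand-in for a deep learning extractor:

```python
# Hybrid extraction sketch: a regex handles well-formed inputs cheaply and
# auditably; a (stubbed) learned model is consulted only when the rule fails.
import re

def extract_invoice_total(text, model=None):
    """Try a classical regex first; fall back to a learned model if provided."""
    match = re.search(r"Total:\s*\$?([\d,]+\.\d{2})", text)
    if match:
        return match.group(1)   # classical path: fast, deterministic
    if model is not None:
        return model(text)      # learned path: handles variable layouts
    return None

print(extract_invoice_total("Total: $1,234.56"))  # 1,234.56
```

The whole system beats either part alone: the rule keeps common cases cheap, while the model covers the long tail the rule cannot see.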
When it comes to the benefits companies can look forward to if they are early adopters of deep learning-based automation technology, Benson said there are too many to list. His three favorites? One, variance-robust document understanding. “This is due to the broad linguistic base that large-scale language models bring to the table. It means they can learn about concepts rather than specific words.” Two, multilingual processing as a default. “Companies will soon be able to have a unified engineering approach across all of their geographic regions.” Three, dramatically reduced training data requirements. “This is due to the practice of fine-tuning a pre-trained deep learning model rather than training from scratch.”
Next, we inquired about complex and variable documents, which are understandably the hardest to automate. These are the 80% of documents that businesses are still paying humans to review. We wanted to know what Benson thinks will be the technological tipping point in this arena. “There’s a joke that AI is defined as ‘we don’t know how to do that yet,’” Benson told us. “Once we figure out how to do something, it stops being called AI. Variable documents are similar. We only call them ‘variable’ because our algorithms can’t see their regularity. But humans can. On a shipping manifest, a human easily understands that ‘recipient’ has the same meaning as ‘TO,’ ‘for delivery to,’ and ‘destinatario.’ Even if that person doesn’t speak Chinese, they could use layout clues to guess that 收件人 has the same meaning.
“And this reveals the tipping point: when our models can learn to interpret the information that makes these situations so easy for humans but hard for current automation approaches. This includes a number of additional layers of understanding beyond mere text: large-scale language modeling, document layout semantics, document context clues, domain constraint modeling, as well as non-text features like lines, penmarks, and checkboxes. These are the extra clues that make a variable document seem so regular to a human.”
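The regularity Benson describes can be made concrete with a toy sketch. A hand-built concept table here stands in for what a large-scale language model learns implicitly; the function and table are our illustrative assumptions, not an actual extraction API:

```python
# Toy sketch: many surface forms of a shipping manifest's "recipient" field
# map to one canonical concept. Rule-based extractors see five different
# labels; a human (or a learned model) sees one concept.
CONCEPTS = {
    "recipient": {"recipient", "to", "for delivery to", "destinatario", "收件人"},
}

def canonical_field(label):
    """Return the canonical concept for a field label, if known."""
    normalized = label.strip().rstrip(":").lower()
    for concept, surface_forms in CONCEPTS.items():
        if normalized in surface_forms:
            return concept
    return None

print(canonical_field("TO:"))           # recipient
print(canonical_field("Destinatario"))  # recipient
print(canonical_field("Invoice #"))     # None
```

The tipping point Benson points to is when models learn such mappings, plus layout and visual clues, from data rather than from hand-maintained tables like this one.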
“Deep learning isn’t at odds with traditional software.”
So, we wanted to know, are unstructured documents the arena for most of the automation progress forecast for the next 5-10 years, or can we expect innovation in the processing of structured and semi-structured documents as well? “All of the above,” Benson said. “People often refer to structured documents as being ‘easy,’ but there are thriving software companies devoted entirely to cleaning and interpreting ‘structured’ data! Data understanding is challenging across the board, because the information humans rely on is sophisticated, error-ridden, and nuanced no matter what form it’s in. I think the current excitement around unstructured documents is driven by the zero-to-one factor: an entirely new category of automation is opening up for the first time, and that is resulting in a productivity gold rush. But there will be years and years of hard work and valuable progress across the board in the document understanding space.”
Deep learning may at times complement rather than replace the legacy automation solutions that companies already have in place. After all, we know from experience that this is what many businesses wind up with, either long-term or as they transition from rule- and code-based systems to more complex types of AI. “I don’t see deep learning as being at odds with traditional software at all,” Benson added, “just as an airplane isn’t at odds with its autopilot system. Deep learning is merely a better way to do certain things to make all our software better.” The thing to do, according to Benson, is to create, promote, and deploy deep learning systems that can interact easily with traditional software environments, so that companies can mix and match in the way that brings them the most value.
Edward Benson is an expert on document automation with vast entrepreneurial experience in building next-generation text processing solutions. He is the author of the book Automating Paperwork: A Practical Overview for Enterprise, which provides “an end-to-end tour of the technical decisions and tradeoffs” involved in selecting the right automation solution for every business.