Research & development

Deep Learning Starts With Ourselves


Applica is always improving: our team continually develops new document-understanding capabilities and adds features.

Researched, Reviewed, Published

Applica’s R&D team regularly publishes research papers about the breakthroughs we’ve achieved.
LIDL: Local Intrinsic Dimension Estimation Using Approximate Likelihood
Piotr Tempczyk, Rafał Michaluk, Łukasz Garncarek (Applica), Przemysław Spurek, Jacek Tabor, Adam Goliński
Most of the existing methods for estimating the local intrinsic dimension of a data distribution do not scale well to high-dimensional data. Many of them rely on a non-parametric nearest neighbors approach which suffers from the curse of dimensionality. We attempt to address that challenge by proposing a novel approach to the problem: Local Intrinsic Dimension estimation using approximate Likelihood (LIDL). Our method relies on an arbitrary density estimation method as its subroutine and hence tries to sidestep the dimensionality challenge by making use of the recent progress in parametric neural methods for likelihood estimation. We carefully investigate the empirical properties of the proposed method, compare them with our theoretical predictions, and show that LIDL yields competitive results on the standard benchmarks for this problem and that it scales to thousands of dimensions. What is more, we anticipate this approach to improve further with the continuing advances in the density estimation literature.
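The core of LIDL can be sketched in a few lines. As the abstract notes, the method treats the density estimator as a pluggable subroutine: perturb the data with Gaussian noise at several scales δ, evaluate the log-density at the query point for each scale, and read the intrinsic dimension off the slope of log-density versus log δ. The sketch below is ours, not the paper's code; the function names are assumptions, and for illustration we plug in an exact analytic density (a point mass in 2-D) where the paper would use a neural density estimator such as a normalizing flow.

```python
import numpy as np

def lidl_estimate(log_density_fn, x, deltas, ambient_dim):
    """Estimate the local intrinsic dimension at point x (hypothetical sketch).

    log_density_fn(x, delta) must return the log-density at x of the data
    distribution perturbed by Gaussian noise of scale delta. LIDL exploits
    the relation  log rho_delta(x) ~ (ID(x) - D) * log(delta) + const,
    so the slope of a linear fit over several noise scales recovers ID - D.
    """
    log_deltas = np.log(deltas)
    log_rhos = np.array([log_density_fn(x, d) for d in deltas])
    slope, _intercept = np.polyfit(log_deltas, log_rhos, 1)
    return ambient_dim + slope

# Toy check: a point mass (intrinsic dimension 0) in 2-D ambient space.
# After adding isotropic Gaussian noise of scale delta, the exact density at
# the origin is (2*pi*delta**2)**-1, i.e. log rho = -2*log(delta) - log(2*pi).
log_rho = lambda x, d: -2.0 * np.log(d) - np.log(2.0 * np.pi)
deltas = np.array([0.01, 0.02, 0.04, 0.08])
lid = lidl_estimate(log_rho, np.zeros(2), deltas, ambient_dim=2)
# lid is (numerically) 0: the slope -2 plus the ambient dimension 2
```

In the paper, the quality of the estimate tracks the quality of the density estimator, which is why the authors expect LIDL to keep improving as neural likelihood models do.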
STable: Table Generation Framework for Encoder-Decoder Models
Michał Pietruszka, Michał Turski, Łukasz Borchmann, Tomasz Dwojak, Gabriela Pałka, Karolina Szyndler, Dawid Jurkiewicz, Łukasz Garncarek
The output structure of database-like tables, consisting of values structured in horizontal rows and vertical columns identifiable by name, can cover a wide range of NLP tasks. Following this observation, we propose a framework for text-to-table neural models applicable to problems such as extraction of line items, joint entity and relation extraction, or knowledge base population. The permutation-based decoder of our proposal is a generalized sequential method that comprehends information from all cells in the table. The training maximizes the expected log-likelihood for a table's content across all random permutations of the factorization order. During the content inference, we exploit the model's ability to generate cells in any order by searching over possible orderings to maximize the model's confidence and avoid substantial error accumulation, which other sequential models are prone to. Experiments demonstrate a high practical value of the framework, which establishes state-of-the-art results on several challenging datasets, outperforming previous solutions by up to 15%.
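The inference idea in the abstract — generating cells in any order and searching over orderings to maximize confidence — can be illustrated with a greedy sketch: at each step, ask the model for a prediction on every still-empty cell (conditioned on everything filled so far) and commit the single most confident one. This is our illustration, not STable's implementation; `decode_table`, `toy_model`, and their signatures are assumptions, with a deterministic stand-in playing the role of the neural model.

```python
from typing import Callable, Dict, Tuple

Cell = Tuple[int, int]          # (row, column) coordinate in the table
Prediction = Tuple[str, float]  # (predicted value, model confidence)

def decode_table(rows: int, cols: int,
                 predict: Callable[[Dict[Cell, str], Cell], Prediction]) -> Dict[Cell, str]:
    """Fill an empty rows x cols table one cell at a time (hypothetical sketch).

    At every step, query the model for each empty cell, conditioned on the
    cells committed so far, and fill the single most confident prediction,
    mimicking STable's confidence-guided choice of generation order.
    """
    table: Dict[Cell, str] = {}
    empty = {(r, c) for r in range(rows) for c in range(cols)}
    while empty:
        # Commit the empty cell whose prediction the model trusts most.
        best = max(empty, key=lambda cell: predict(table, cell)[1])
        table[best] = predict(table, best)[0]
        empty.remove(best)
    return table

# Hypothetical stand-in model for a 2x2 line-item table: fixed predictions,
# with confidence that grows as more context cells get filled in.
def toy_model(filled: Dict[Cell, str], cell: Cell) -> Prediction:
    base = {(0, 0): ("Acme Corp", 0.9), (0, 1): ("$120", 0.6),
            (1, 0): ("Widget Inc", 0.8), (1, 1): ("$75", 0.5)}
    value, conf = base[cell]
    return value, conf + 0.02 * len(filled)

result = decode_table(2, 2, toy_model)
# Cells are committed in confidence order: (0,0), (1,0), (0,1), (1,1)
```

Compared with a fixed left-to-right, top-to-bottom decoder, letting the model commit its surest cells first is what limits the error accumulation the abstract mentions: uncertain cells are generated last, with the most context available.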

Curious to Read More?

Browse through all our scientific papers.

Our Solution Has Won Multiple Prizes

Applica’s solution regularly wins awards and competitions around the world.
April 2021
Applica’s innovative TILT model crushed the competition in the ICDAR Infographics VQA Challenge
March 2021
Applica continues to dominate the venerated Key Information Extraction Competition
February 2021
Applica beats all other AI solutions in the Document Visual Question Answering Challenge
February 2021
The Applica team wins Best Paper at SemEval 2020

Meet the Technology

Find out what makes Applica’s approach to document automation so special—and so much more powerful than other approaches.
To the Tech

Dive Into the Details

Ready for some math? Our research blog documents the latest breakthroughs, ideas, and observations from Applica’s R&D team.
To the Research Blog