3 papers - avg viability 7.3
VERSE is a tool for improving vision-language models on document understanding by visualizing and refining their visual embeddings.
A new LLM-based evaluation framework for PDF table extraction significantly outperforms existing metrics and provides practical guidance for parser selection.
PIP offers a mask-based paradigm that speeds up key information extraction from documents by 5-36x while maintaining accuracy.