evid - PDF Document Manager

evid is a Python tool for creating and managing datasets of PDF documents with associated metadata and labels. It stores metadata alongside each document and uses LaTeX-based labelling, which makes documents easy to cite and supports generating responses with Large Language Models (LLMs). A PyQt6 GUI handles adding and browsing documents. It is aimed at researchers, professionals, and anyone who needs to manage PDF-based datasets.

Key Features

  • PDF Logging: Add PDFs with metadata such as title, authors, tags, and dates.
  • Automatic Date Extraction: Extract dates automatically by parsing the text of each PDF.
  • PyQt6 GUI: Intuitive interface with tabs for adding and browsing documents.
  • LaTeX Integration: Generate LaTeX documents for labels and responses, with BibTeX support for citations.
  • Modular Database: Organize documents into datasets with YAML-based metadata storage.
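
To make two of the features above more concrete, here is a minimal Python sketch of date extraction from already-extracted PDF text and of turning a metadata record into a BibTeX entry. It is an illustration only and not evid's actual implementation: the function names (extract_dates, bibtex_entry), the two date formats it recognises, the metadata keys (title, authors, date), and the choice of the @misc entry type are all assumptions made for this example.

```python
import re
from datetime import date

# Assumption: only two date formats are handled here,
# ISO ("2023-05-02") and day-month-year ("17 April 2023").
MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}
ISO_RE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")
LONG_RE = re.compile(r"\b(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})\b")


def extract_dates(text):
    """Return every valid date found in text, in order of appearance."""
    found = []
    for m in ISO_RE.finditer(text):
        try:
            found.append((m.start(), date(int(m[1]), int(m[2]), int(m[3]))))
        except ValueError:
            continue  # e.g. month 13: looks like a date but is not one
    for m in LONG_RE.finditer(text):
        try:
            found.append((m.start(), date(int(m[3]), MONTHS[m[2]], int(m[1]))))
        except ValueError:
            continue
    # Sort by position in the text so the earliest mention comes first.
    return [d for _, d in sorted(found, key=lambda pair: pair[0])]


def bibtex_entry(key, meta):
    """Format a hypothetical metadata dict as a BibTeX @misc entry."""
    fields = {
        "title": meta["title"],
        "author": " and ".join(meta["authors"]),
        "year": str(meta["date"].year),
    }
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@misc{{{key},\n{body}\n}}"
```

In this sketch, the first date returned by extract_dates could be used as a default for the document's date field, and the string returned by bibtex_entry could be appended to a .bib file so the document can be cited from LaTeX.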

Getting Started

Quick Start

git clone <repository-url>
cd evid
poetry install
poetry run evid gui