evid - PDF Document Manager
evid is a Python-based tool for creating and managing datasets of PDF documents with associated metadata and labels. It makes documents easy to cite and supports generating responses with Large Language Models (LLMs) by organizing documents with metadata and LaTeX-based labelling. Its PyQt6 GUI for adding and browsing documents makes it suitable for researchers, professionals, and anyone who needs to manage PDF-based datasets.
Key Features
- PDF Logging: Add PDFs with metadata such as title, authors, tags, and dates.
- Automatic Date Extraction: Extract dates from PDFs using advanced text parsing.
- PyQt6 GUI: Intuitive interface with tabs for adding and browsing documents.
- LaTeX Integration: Generate LaTeX documents for labels and responses, with BibTeX support for citations.
- Modular Database: Organize documents into datasets with YAML-based metadata storage.
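To make the date extraction feature more concrete, here is a minimal sketch of how dates might be pulled from text that has already been extracted from a PDF. It is an illustration only: the `find_dates` helper and the `ISO_DATE` pattern are hypothetical names, not part of evid's API, and the sketch handles only ISO-style `YYYY-MM-DD` dates, which is likely far simpler than what evid does internally.

```python
import re
from datetime import date

# Hypothetical pattern and helper for illustration; not evid's actual code.
# Matches ISO-style dates such as 2023-04-17.
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def find_dates(text: str) -> list[date]:
    """Return every valid YYYY-MM-DD date found in text, in order of appearance."""
    found = []
    for year, month, day in ISO_DATE.findall(text):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            # Skip strings that look like dates but are not (e.g. month 99).
            continue
    return found


sample = "Received 2023-04-17; revised 2023-99-01; published 2023-06-02."
print(find_dates(sample))
# prints [datetime.date(2023, 4, 17), datetime.date(2023, 6, 2)]
```

Validating each match through `datetime.date` rather than trusting the regular expression alone keeps impossible dates out of the metadata.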
Getting Started
- Installation Guide: Set up evid on your system.
- Usage Guide: Learn how to add, browse, and manage documents.
- Development: Contribute to evid or extend its functionality.
Quick Start
git clone <repository-url>
cd evid
poetry install
poetry run evid gui