Pascal Calarco
Scholarly Communications and Systems Librarian
University of Windsor
FedHarv is a command-line utility that automates the discovery, retrieval, and packaging of open access (OA) scholarly outputs for batch ingestion into DSpace repositories, replacing manual harvesting with an automated pipeline. It is designed for OA and copyright compliance and for high-quality metadata, and it operates on a “Diamond/Gold/Hybrid Exclusive” policy: it systematically isolates “Green” (self-archived) and “Bronze” (free-to-read) content so that the repository hosts only files with clear re-use licenses or explicit open access status. To do so, it draws on several APIs, including OpenAlex, CrossRef TDM, FundRef, DOAJ, Unpaywall, and DataCite, and it packages metadata and bitstreams into customizable collection folders for an institution’s repository. FedHarv will be released under an MIT license in Summer 2026 for others to use and modify.
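The “Diamond/Gold/Hybrid Exclusive” policy can be pictured as a simple triage step over harvested records. The following is a minimal sketch, not FedHarv's actual code: the function names, the bucket labels, and the flat `oa_status` field on each record are assumptions made for illustration, with status labels following the vocabulary used by Unpaywall and OpenAlex.

```python
from typing import Dict, Iterable, List

# Assumed policy sets, read directly from the abstract's wording.
ACCEPTED = {"diamond", "gold", "hybrid"}   # eligible for repository ingest
ISOLATED = {"green", "bronze"}             # set aside, not ingested


def triage(record: dict) -> str:
    """Return 'ingest', 'isolate', or 'skip' for one harvested record.

    `record` is a hypothetical flat dict with an `oa_status` key.
    """
    status = (record.get("oa_status") or "").lower()
    if status in ACCEPTED:
        return "ingest"
    if status in ISOLATED:
        return "isolate"
    return "skip"  # closed, unknown, or missing status


def partition(records: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group records into buckets according to the triage decision."""
    buckets: Dict[str, List[dict]] = {"ingest": [], "isolate": [], "skip": []}
    for rec in records:
        buckets[triage(rec)].append(rec)
    return buckets


if __name__ == "__main__":
    # Placeholder records for illustration only.
    sample = [
        {"id": "a", "oa_status": "gold"},
        {"id": "b", "oa_status": "green"},
        {"id": "c", "oa_status": "closed"},
    ]
    for name, items in partition(sample).items():
        print(name, [r["id"] for r in items])
```

Keeping the decision in one pure function makes the policy easy to audit and to adjust for institutions that draw the line differently.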