How the citation-report site works

A short wiki-style guide explaining how the site fetches data from OpenAlex, how it calculates monthly citation counts, and which technologies it uses.

Overview

This site fetches data from the OpenAlex API and computes citation summaries for academic authors. It combines data retrieval, in-memory aggregation, and PDF/CSV export features to produce a sharable citation report.

Getting access to OpenAlex

  1. OpenAlex is a public API — you do not need an API key for basic usage. See openalex.org for documentation and rate limit details.
  2. The app queries endpoints like /authors and /works to find authors and fetch their publications and citation details.
  3. Because OpenAlex is a shared public API, the app is conservative about concurrent requests and provides progress feedback while fetching large datasets.
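
As a minimal sketch of the name search mentioned in step 2, the snippet below builds a request URL for the /authors endpoint. The helper name author_search_url and the example e-mail address are hypothetical. The search, per-page, and mailto query parameters are ones OpenAlex documents; whether the app itself sends mailto is an assumption.

```python
from urllib.parse import urlencode

OPENALEX_BASE = "https://api.openalex.org"


def author_search_url(name, mailto=None, per_page=5):
    """Build a URL that searches OpenAlex authors by name (hypothetical helper)."""
    params = {"search": name, "per-page": per_page}
    if mailto:
        # Optional: OpenAlex documents a contact e-mail parameter for polite usage.
        # Assumption: the app may or may not set this.
        params["mailto"] = mailto
    return f"{OPENALEX_BASE}/authors?{urlencode(params)}"


url = author_search_url("Marie Curie", mailto="you@example.org")
```

With Requests, the lookup itself would then be something like requests.get(url, timeout=30).json()["results"], where each result carries the OpenAlex id used in later steps.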

How data is fetched

High-level steps:

  • Resolve an author (by OpenAlex id or by name search).
  • Fetch the author's works in pages (the app uses a per-page size within the OpenAlex limit, which is 200 results per page at the time of writing).
  • For monthly citation counts, fetch the citing works for each of the author's works and extract their publication dates.
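
The paging steps above can be sketched roughly as follows. This is an illustration, not the app's actual code: fetch_all_pages, fetch_author_works, and fetch_citing_works are hypothetical names, and get_json stands in for whatever function performs the HTTP GET (for example a thin wrapper around Requests). The sketch assumes OpenAlex cursor paging (cursor=* for the first page, meta.next_cursor for the following ones) and the authorships.author.id and cites filters on /works.

```python
def fetch_all_pages(get_json, path, filter_expr, per_page=200):
    """Collect every result of one filtered OpenAlex query via cursor paging.

    get_json(path, params) is a hypothetical callable that performs the
    HTTP GET and returns the decoded JSON body as a dict.
    """
    results = []
    cursor = "*"  # "*" asks OpenAlex for the first cursor page
    while cursor:
        params = {"filter": filter_expr, "per-page": per_page, "cursor": cursor}
        page = get_json(path, params)
        results.extend(page.get("results", []))
        # next_cursor is null (None) once the last page has been returned
        cursor = page.get("meta", {}).get("next_cursor")
    return results


def fetch_author_works(get_json, author_id):
    """All works on which the given author appears."""
    return fetch_all_pages(get_json, "/works", f"authorships.author.id:{author_id}")


def fetch_citing_works(get_json, work_id):
    """All works that cite the given work."""
    return fetch_all_pages(get_json, "/works", f"cites:{work_id}")
```

Passing get_json in as a parameter keeps the paging logic independent of the HTTP client, which also makes it easy to exercise with canned responses.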

Monthly counts algorithm (summary)

The monthly counts are computed by counting the publication dates (month granularity) of papers that cite the author's works. Key points:

  • We compute two series: all citations and non-self citations (which exclude any citation where the author appears among the citing paper's authors).
  • Each citing paper contributes +1 to the month (YYYY-MM-01) of its publication date. If only a year is available, we count it as January of that year.
  • To speed up calculation, the app parses ISO date strings by slicing instead of full date parsing, and it parallelizes the per-work citing-paper fetches with a thread pool. The results are identical to the sequential version; only the wall-clock time drops.
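
A minimal sketch of the counting rules above, under stated assumptions: the names month_key, is_self_citation, and monthly_counts are hypothetical, fetch_citing_works is any callable that returns the citing works for one work id, and each citing work is a dict with the OpenAlex fields publication_date, publication_year, and authorships. The sketch counts one citation per (cited work, citing paper) pair and does not deduplicate a paper that cites several of the author's works; whether the app does so is not stated here.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def month_key(publication_date, publication_year=None):
    """Return the YYYY-MM-01 bucket for a citing paper, or None if undated."""
    if publication_date and len(publication_date) >= 7:
        return publication_date[:7] + "-01"  # "2021-03-17" -> "2021-03-01"
    if publication_date and len(publication_date) >= 4:
        return publication_date[:4] + "-01-01"  # year-only string
    if publication_year:
        return f"{publication_year}-01-01"  # year only: count as January
    return None


def is_self_citation(citing_work, author_id):
    """True if the author appears among the citing paper's authors."""
    return any(
        (a.get("author") or {}).get("id") == author_id
        for a in citing_work.get("authorships", [])
    )


def monthly_counts(work_ids, author_id, fetch_citing_works, max_workers=4):
    """Return (all_counts, non_self_counts) keyed by YYYY-MM-01."""
    all_counts, non_self_counts = Counter(), Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Fetches run in worker threads; counting stays on the calling thread,
        # so the Counters need no locking.
        for citing_works in pool.map(fetch_citing_works, work_ids):
            for citing in citing_works:
                key = month_key(
                    citing.get("publication_date"), citing.get("publication_year")
                )
                if key is None:
                    continue
                all_counts[key] += 1
                if not is_self_citation(citing, author_id):
                    non_self_counts[key] += 1
    return all_counts, non_self_counts
```

Keeping max_workers small is in line with the app's conservative use of the shared API. Because only the fetches run concurrently, the totals match what a plain sequential loop would produce.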

Why other sites don't provide this

Services like Google Scholar do not provide programmatic access to citation event timelines at scale and often block scraping. OpenAlex provides an open, research-friendly API exposing citation links and publication dates, which together make monthly timelines possible.

Libraries and languages used

  • Python — primary language for backend and data processing.
  • Flask — web framework for routes and templates.
  • Requests — HTTP client for calling OpenAlex.
  • Pandas — data manipulation and CSV/Excel exports (used for generating time series summaries).
  • Matplotlib — plotting monthly/yearly charts used in the PDF/report.
  • ReportLab / PyPDF2 — assembling multi-page PDFs and cover pages.
  • Pillow (PIL) — image handling and downsampling for PDFs.