HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings

Dataset

Description

Dataset Summary
HiFi-KPI is a large-scale dataset for extracting numerical financial key performance indicators (KPIs) from earnings filings. It is derived from the iXBRL filings mandated by the SEC, with hierarchical labels derived from the XBRL taxonomy. The dataset consists of ∼1.8M paragraphs and ∼5M entities, each linked to labels in the iXBRL calculation and presentation taxonomies.
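
To make the paragraph/entity/label relationship concrete, the following is a minimal sketch of how one record could be represented. It is not the dataset's actual schema: the class names, field names, the helper label_at_depth, and the sample sentence and tags are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One numeric mention in a paragraph (hypothetical layout, not the real schema)."""
    text: str    # the numeric span as it appears in the paragraph
    start: int   # character offset of the span in the paragraph text
    label: str   # fine-grained taxonomy label attached to the span
    parents: list[str] = field(default_factory=list)  # ancestor labels, nearest first


@dataclass
class Paragraph:
    text: str
    entities: list[Entity]


def label_at_depth(entity: Entity, levels_up: int) -> str:
    """Return the label `levels_up` steps above the fine-grained label.

    The index is clamped, so asking for more levels than exist
    returns the top-most known ancestor.
    """
    chain = [entity.label] + entity.parents
    return chain[min(levels_up, len(chain) - 1)]


# Made-up sample; the sentence and the label chain are illustrative only.
sentence = "Revenue was $120 million for the quarter."
example = Paragraph(
    text=sentence,
    entities=[
        Entity(
            text="120",
            start=sentence.index("120"),
            label="Revenues",
            parents=["GrossProfit", "OperatingIncomeLoss"],
        )
    ],
)

for ent in example.entities:
    print(ent.text, label_at_depth(ent, 0), label_at_depth(ent, 1))
```

Storing the ancestor chain alongside each entity is one way to let a model be trained or evaluated at a coarser label granularity without consulting the taxonomy again; whether the published files take this form is an assumption.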

Languages
The dataset is in English, extracted from SEC 10-K and 10-Q filings.
Date made available: 21 Feb 2025
Publisher: Hugging Face

Keywords

  • NLP
  • Quantitative Finance
