The data quality endpoint generates a structured report that surfaces documents and mappings that may require review or re-processing. The report is organized into named queues, each highlighting a different type of quality issue: documents with unusually few extracted line items, portfolio-only documents, documents with no extracted data, mappings flagged for review, numeric parse failures, and documents whose source PDFs are not available. You can scope the report to a single fund or review the entire dataset at once.
Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /data-quality | Returns a structured data quality report |
https://mkk-roan.vercel.app/api
GET /data-quality
Returns a DataQualityReport with aggregate summary statistics and per-queue lists of problematic documents or mappings.
Query parameters
Scope the report to the fund with this internal database ID. Omit to report across all funds.
Scope the report to the fund with this fund code (e.g., OJB). Use instead of fund_id when you know the fund code.
Maximum number of items to return in each queue list (e.g., low_line_item_documents, empty_documents). Accepts values from 1 to 500.
Documents with fewer extracted line items than this threshold are added to the low_line_item_documents queue. Accepts values from 0 to 100. Lower this value to reduce false positives for funds with naturally sparse data.
Response schema
The filters applied to generate this report.
The effective limits used to generate the report.
Aggregate counts across all quality queues.
Breakdown of line item mapping methods used across documents in scope.
Documents with fewer extracted line items than the low_line_item_threshold. Each item includes document metadata and the actual line_item_count.
Documents that contain portfolio entries but have no extracted line item values. These may indicate parsing failures for the financial statement pages.
Documents with no extracted line item values and no portfolio entries. These are candidates for re-processing.
Line item mappings flagged for human review, typically because mapping confidence is below an internal threshold or the mapping method is fuzzy or model.
Line item values where the value string could not be parsed into a numeric_value. Each item includes the raw value string and its source document.
Portfolio entries where one or more numeric fields (market value, nominal value, etc.) could not be parsed. Each item includes the raw field values and source document.
Documents for which no PDF binary is available in storage. The document has been indexed but the source file cannot be streamed.
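As a rough client-side illustration of how the threshold and the limit interact, the sketch below applies the documented rule (fewer line items than low_line_item_threshold) to a list of documents, then caps the returned items while keeping the full count. It is not the server's implementation: the function name, the id key, the total and items keys, and the sample data are all hypothetical, and only line_item_count comes from the schema described here.

```python
from typing import Any


def build_low_line_item_queue(
    documents: list[dict[str, Any]], threshold: int, limit: int
) -> dict[str, Any]:
    """Illustrative only: flag documents below the threshold, cap the list."""
    # "Fewer than" is read as a strict comparison here (an assumption).
    flagged = [doc for doc in documents if doc["line_item_count"] < threshold]
    # The total counts every flagged document; the list is cut at `limit`.
    return {"total": len(flagged), "items": flagged[:limit]}


if __name__ == "__main__":
    # Hypothetical sample data; "id" is an assumed key.
    sample = [
        {"id": 1, "line_item_count": 0},
        {"id": 2, "line_item_count": 3},
        {"id": 3, "line_item_count": 10},
        {"id": 4, "line_item_count": 1},
    ]
    queue = build_low_line_item_queue(sample, threshold=5, limit=2)
    print(queue["total"], [doc["id"] for doc in queue["items"]])  # 3 [1, 2]
```

With a threshold of 5 and a limit of 2, three documents qualify but only two are listed, which is why a count can exceed the length of its queue list.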
Each queue list is capped at the value of the limit parameter. The summary counts always reflect the total across all documents in scope, not just the items returned in the lists.
Error responses
| Status | Description |
|---|---|
| 400 | Invalid parameter value (e.g., limit out of range or low_line_item_threshold outside 0–100). |
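The documented ranges can also be checked on the client before a request is sent, which avoids a round trip that would end in a 400. The following is a minimal sketch, assuming both parameters are integers; validate_params is a hypothetical helper and not part of the API.

```python
from typing import Optional


def validate_params(
    limit: Optional[int] = None,
    low_line_item_threshold: Optional[int] = None,
) -> list[str]:
    """Return a list of problems; an empty list means the values are in range."""
    problems = []
    # Documented range for limit: 1 to 500.
    if limit is not None and not 1 <= limit <= 500:
        problems.append(f"limit must be between 1 and 500, got {limit}")
    # Documented range for low_line_item_threshold: 0 to 100.
    if low_line_item_threshold is not None and not 0 <= low_line_item_threshold <= 100:
        problems.append(
            "low_line_item_threshold must be between 0 and 100, "
            f"got {low_line_item_threshold}"
        )
    return problems


if __name__ == "__main__":
    print(validate_params(limit=50, low_line_item_threshold=5))
    print(validate_params(limit=0, low_line_item_threshold=101))
```

Omitted parameters are skipped rather than rejected. The page does not state the server-side defaults, so the sketch does not assume any.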
Example requests
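A minimal Python sketch of building and sending a request, using only the standard library. It assumes that https://mkk-roan.vercel.app/api is the base URL, that no authentication is required, and that the response body is JSON; none of these is confirmed on this page. The helper names and the fund ID value 12 are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed base URL, taken from the URL shown on this page.
BASE_URL = "https://mkk-roan.vercel.app/api"


def build_url(**params) -> str:
    """Build the /data-quality URL, dropping parameters left as None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/data-quality"
    return f"{url}?{query}" if query else url


def fetch_report(**params) -> dict:
    """Send the GET request and decode the body (assumed to be JSON)."""
    request = Request(
        build_url(**params),
        method="GET",
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Report across all funds.
    print(build_url())
    # Report for one fund (hypothetical ID) with a smaller queue size
    # and a lower line item threshold.
    print(build_url(fund_id=12, limit=50, low_line_item_threshold=5))
```

Calling fetch_report with the same keyword arguments sends the request. An out-of-range value causes urlopen to raise urllib.error.HTTPError with code 400.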
Example response
200