In the name of God, the Most Gracious, the Most Merciful
Data Files
449,285 structured, graded hadith records in JSONL (v3.4 deployed; v3.26 staged adds 8,241 more). 67–70 fields per record. Built for Azure AI Search, Elasticsearch, or any document store.
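For the Elasticsearch route, one possible loading path is the `_bulk` endpoint, which expects newline-delimited JSON made of an action line followed by the document line. The sketch below only builds that request body with the standard library. The index name `ipsc-hadith`, the helper name `to_bulk_lines`, and the second sample record are assumptions for illustration and do not come from the IPSC documentation.

```python
import json
from typing import Iterable, Iterator


def to_bulk_lines(records: Iterable[dict], index_name: str) -> Iterator[str]:
    """Yield NDJSON lines for an Elasticsearch _bulk request.

    Each record becomes two lines: an "index" action, then the document.
    The record's own "id" field is reused as the document _id.
    """
    for record in records:
        action = {"index": {"_index": index_name, "_id": record["id"]}}
        yield json.dumps(action, ensure_ascii=False)
        yield json.dumps(record, ensure_ascii=False)


# Hypothetical two-record sample; real records carry 67-70 fields.
# The second id is invented purely to show a multi-record body.
sample = [
    {"id": "bukhari-sahih-000001", "computedGrade": "sahih"},
    {"id": "hypothetical-000002", "computedGrade": "sahih"},
]

# The bulk body must end with a trailing newline.
bulk_body = "\n".join(to_bulk_lines(sample, "ipsc-hadith")) + "\n"
```

Keeping `ensure_ascii=False` leaves the Arabic text readable in the request body instead of escaping it to `\uXXXX` sequences.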
Looking for the commercial API on Azure infrastructure? See /api for the three-tier access model and licensing.
Data Files
ipsc-hadith-v3.jsonl
Full hadith with parsed chains, grades, enrichments. v3.26 staged adds 8,241 records.
ipsc-narrators-v3.jsonl
NRS narrator database with tiers, assessments, and graph positions. 27,118 narrators carry NRS reliability assessments. The v3.33 reconciliation against Taqrib confirmed 6,034 entries and flagged 1,345 as mismatches.
ipsc-ilal-v3.jsonl
Hidden defect cross-references from al-Daraqutni
ipsc-matn-clusters-v3.jsonl
Cross-collection content clusters with attestation. Rebuilt in v3.12 from fresh embeddings after the v3.8 matn-corruption recovery.
ipsc-entities-v3.jsonl
Hadith entity aggregations (all chains per teaching)
ipsc-glossary-v1.jsonl
Glossary v1 (2026-04-25): 730 canonical hadith-science terms with multi-source classical citations. 3-tier (Public / Research / Scholar).
manifest.json
Corpus statistics, file inventory, and the authoritative AI-involvement disclosure block (always public).
67–70 Fields
| Category | Fields |
|---|---|
| Identity | id, workId, collection, hadithNumber, bookName, chapterName |
| Source text | arabicText, isnadText, matn, englishText |
| Chain analysis | isnadStructured (array), chainContinuity, chainAttribution, chainQualityIndex (number, 0.0–1.0 chain data reliability score) |
| Grading | computedGrade, autoGrade, autoGradeDetail (worstTier, worstNarrator, worstLabel, worstPosition, narratorCount, resolvedCount, resolutionRate, chainContinuity, mursalCap, supportingChains, taqwiyah, hasMudallisAnanah, hasIkhtilat), gradeConfidence (number, 0.0–1.0 confidence score), computedConfidence, gradingNotes |
| Provenance | _provenance (array) — consolidated correction history with classical source citations. v3.26 entries: _v326PidTiebreaker, _v335ShipBlocker. |
| Enrichment | crossLinks_ilal, crossLinks_rijal, matnClusterId, clusterId (v3.26), attestationLevel, shudhudh |
| v3.26 fields | _pidTiebreakerVerdict (method=v3.26-llm-tiebreaker-sonnet, confidence, reasoning), _naqd3Override (sourceCollection, sourceAuthority, originalGrade, cappedGrade) (public-tier visible), _chainMatnConflict (public-tier visible), _v319MatchAlternatives (scholar-tier) |
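As an illustration of how the grading fields in the table might be combined, here is a minimal sketch that keeps only records with a given `computedGrade` and a `gradeConfidence` at or above a threshold. The helper name `select_records`, the sample values, and the string `placeholder-grade` are assumptions. Only the field names and the 0.0 to 1.0 range come from the table.

```python
from typing import Iterable, Iterator


def select_records(
    records: Iterable[dict], grade: str, min_confidence: float
) -> Iterator[dict]:
    """Yield records whose computedGrade matches and whose gradeConfidence
    (a 0.0-1.0 score) is at least min_confidence.

    Records without a numeric gradeConfidence are skipped rather than guessed.
    """
    for record in records:
        confidence = record.get("gradeConfidence")
        if not isinstance(confidence, (int, float)):
            continue
        if record.get("computedGrade") == grade and confidence >= min_confidence:
            yield record


# Hypothetical sample values, not taken from the corpus.
sample = [
    {"id": "a", "computedGrade": "sahih", "gradeConfidence": 0.92},
    {"id": "b", "computedGrade": "sahih", "gradeConfidence": 0.40},
    {"id": "c", "computedGrade": "placeholder-grade", "gradeConfidence": 0.99},
    {"id": "d", "computedGrade": "sahih"},  # no confidence score present
]

selected_ids = [r["id"] for r in select_records(sample, "sahih", 0.8)]
```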
For Researchers
The IPSC provides structured data for computational hadith scholarship. JSONL format, 67–70 fields per record, ~2.9 GB for the hadith corpus alone.
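At roughly 2.9 GB, the hadith file is more comfortable to read line by line than to load whole. The following is a minimal streaming sketch under that assumption. The helper name `iter_jsonl` and the in-memory sample are illustrative, and the second sample id is invented.

```python
import io
import json
from collections import Counter
from typing import IO, Iterator


def iter_jsonl(stream: IO[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSONL stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number}: invalid JSON") from exc


# In-memory stand-in for:
#   open("ipsc-hadith-v3.jsonl", encoding="utf-8")
sample_stream = io.StringIO(
    '{"id": "bukhari-sahih-000001", "computedGrade": "sahih"}\n'
    "\n"
    '{"id": "hypothetical-000002", "computedGrade": "sahih"}\n'
)

# Tally grades without holding the whole corpus in memory.
grade_counts = Counter(r.get("computedGrade") for r in iter_jsonl(sample_stream))
```

Because `iter_jsonl` is a generator, memory use stays proportional to one record at a time regardless of file size.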
Representative use cases
Each record carries parsed chain positions, resolved Person IDs, NRS tiers, transmission formulas, grading provenance, and cross-references — ready for analysis without preprocessing.
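To show what working directly with the parsed chains could look like, the sketch below orders the narrators of one record by `position` and lists adjacent pairs of resolved Person IDs, which is one possible starting point for a transmission graph. The helper names, the ID `PERSON-000002`, and the idea that an unresolved narrator simply lacks `canonicalPersonId` are assumptions. The sketch also makes no claim about which end of the chain position 1 represents.

```python
from typing import Optional


def chain_links(record: dict) -> list[tuple[int, Optional[str], Optional[int]]]:
    """Return (position, canonicalPersonId, NRS tier) per narrator, by position."""
    links = []
    for narrator in record.get("isnadStructured", []):
        nrs = narrator.get("_nrs") or {}
        links.append(
            (narrator["position"], narrator.get("canonicalPersonId"), nrs.get("tier"))
        )
    return sorted(links, key=lambda link: link[0])


def adjacent_pairs(record: dict) -> list[tuple[str, str]]:
    """Pair each narrator with the next one, keeping only fully resolved pairs."""
    ids = [person_id for _, person_id, _ in chain_links(record)]
    return [(a, b) for a, b in zip(ids, ids[1:]) if a and b]


# Hypothetical record: only PERSON-005691 appears in the published example.
record = {
    "id": "bukhari-sahih-000001",
    "isnadStructured": [
        {"position": 2, "name": "...", "canonicalPersonId": "PERSON-000002",
         "_nrs": {"tier": 3, "label": "thiqah"}},
        {"position": 1, "name": "...", "canonicalPersonId": "PERSON-005691",
         "_nrs": {"tier": 3, "label": "thiqah"}},
        {"position": 3, "name": "..."},  # assumed shape of an unresolved narrator
    ],
}

links = chain_links(record)
pairs = adjacent_pairs(record)
```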
Example
{
  "id": "bukhari-sahih-000001",
  "collection": "Sahih al-Bukhari",
  "hadithNumber": "1",
  "arabicText": "...",
  "isnadStructured": [
    {
      "position": 1,
      "name": "...",
      "canonicalPersonId": "PERSON-005691",
      "_nrs": { "tier": 3, "label": "thiqah" }
    }
  ],
  "computedGrade": "sahih",
  "autoGradeDetail": {
    "worstTier": 3,
    "chainContinuity": "continuous",
    "supportingChains": 47
  }
}
API Access