In the name of God, the Most Gracious, the Most Merciful
How to Cite the IPSC
Permanent citation formats for academic work.
Generator
Interactive Citation Builder
Version
Corpus Versioning
3926a127…99e08370
Requirements
Attribution Guidelines
When citing IPSC data, please include: the version number, the permanent URL, and a reference to the methodology document.
Individual hadith pages at ipsc.theogrid.ai/hadith/[id] are in development. For now, cite the hadith ID and reference the methodology at ipsc.theogrid.ai/methodology.
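The guidelines above can be sketched as a small citation formatter. This is a minimal illustration, not the site's Interactive Citation Builder. The class name, field names, output layout, the `https://` scheme, and the sample hadith ID are all assumptions. The access date is added by common academic convention and is not something the guidelines require.

```python
from dataclasses import dataclass
from datetime import date

# Methodology address taken from the text above; the https:// scheme is an assumption.
METHODOLOGY_URL = "https://ipsc.theogrid.ai/methodology"


@dataclass(frozen=True)
class IpscCitation:
    """Hypothetical container for the fields the attribution guidelines ask for."""

    hadith_id: str       # IPSC hadith ID (the real ID format may differ)
    corpus_version: str  # corpus version number, e.g. "v3.25.1"
    accessed: date       # access date (an added convention, not a stated requirement)

    def format(self) -> str:
        # Individual hadith pages are still in development, so the citation
        # carries the hadith ID, the version, and the methodology reference.
        return (
            f"IPSC, hadith {self.hadith_id}, corpus {self.corpus_version}. "
            f"Methodology: {METHODOLOGY_URL} "
            f"(accessed {self.accessed.isoformat()})."
        )


# Placeholder values for illustration only.
citation = IpscCitation("12345", "v3.25.1", date(2025, 1, 15))
print(citation.format())
```

Once permanent hadith URLs are available, a `permanent_url` field could replace the bare hadith ID in the formatted string.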
Changelog
Version History
LLM PID tiebreaker (12,660 PIDs assigned via Eve-Theology f5/reasoner batched 25-narrator prompts; 47.4% → 69.5% PID resolution on v3.19+ ingests). Ship-blocker remediation (2,168 records in Mawḍūʿāt / Tanzīh / Fawāʾid capped to very-weak via _naqd3Override). Phase 9 partial: 7,364 new matn embeddings, cluster merge 39.2%. NRS-Taqrīb reconciliation: 6,034 confirmed, 1,345 mismatches flagged. All hard gates green.
Citation-cascade refinement (v3.13–v3.18) and 6 new Phase 2-A primary collection ingests (Ibn al-Sunnī ʿAmal 770, Hannād Zuhd 1,429, al-Quḍāʿī Shihāb 1,497, al-ʿUqaylī Ḍuʿafāʾ 2,103, Ibn al-Mubārak Jihād 268, Ibn ʿAdī Kāmil 2,174 = +8,241 records). v3.19 OpenITI→IPSC ingestion pipeline. v3.25.1 fixed 5 pipeline robustness bugs.
Vector embeddings refresh (448,237 matns re-embedded; previous embeddings were on broken-matn data). Kanz al-ʿUmmāl re-ingestion: 14,603 records newly attributed via OpenITI symbol parser. NAQD-3 fresh-embedding re-run: 1,851 findings (300 critical, 638 high) — the previous near-zero N3-CC contradiction count was an artifact of broken embeddings.
External-examiner cycle drove a v3.11 framing recalibration. Added the _provenanceDisclosure manifest block so AI-involvement scope travels with every record, established the standing provenance-discipline rules, and closed NAQD-1 V1 (112 → 0 findings). Audit practice & scholar-collaboration program.
Corpus integrity push: temporal-issue field regeneration (256K false positives removed, 127K real issues found), arabicText_normalized cleanup (35,982 records), CDN-attribution flagging. The v3.7 first-attempt PID validator failed regression testing (12 T10+ violations); the v3.9 multi-stage validator (5 structural pre-filters + LLM tiebreaker) captured 24,485 safe PID swaps. v3.8 recovered matns from a critical broken-regex corruption (449,028 matns restored from backup; the bug survived two release cycles because the regression tests did not sample matn content).
First IPSC corpus deployed to Azure AI Search (10 tier indexes; +3 with Glossary v1; 1,569,379 docs total). 8 vector indexes (matnEmbedding × 3, narratorEmbedding × 3, defectEmbedding × 2). 7,652 narrators with classical scholarly quotations. Glossary v1 ship: 730 canonical hadith-science terms across 3 tier indexes.
Matn-criticism pipeline (Phases A–G): 304 Qur'an rulings, 271 anachronisms, 335 fabrication patterns, 1,301 mutawatir canon entries, prophetic linguistic baseline. Two-pass architecture: Pass 1 deterministic string-op scan (420,110 records, 11,441 flagged); Pass 2 multi-tier LLM (Haiku triage → Eve-Theology f5/reasoner detail → Opus 1M scholar-grade). 815 chain-matn conflicts identified.
Matn-criticism pipeline (437,740 matns analyzed). Ilal cross-linking (87,844 hadith). Teacher-student graph (889,913 edges). Hawala chain splitting (8,258 records). Qur'an cross-reference (1,279,676 term matches). NRS entries grew from 27,099 to 27,118.
Initial corpus release. 449,415 hadith from 86 classical works. 27,099 NRS entries. Five-condition grading engine. Person ID resolution pipeline (6 iterations). Structured isnad parsing. Matn clustering (54,270 bag-of-words clusters).
Every narrator reassessment, grade change, methodology update, and audit response is logged. Researchers can see exactly what changed and why.