SIGMA — Unified Global Standards Index

Free. Complete. Machine-readable. Human-navigable.

[![License](https://img.shields.io/badge/license-CC%20BY%204.0-lightgrey)](LICENSE) [![Entries](https://img.shields.io/badge/entries-88%2C203-brightgreen)](#current-data-scope) [![Domains](https://img.shields.io/badge/domains-40-blue)](data/reference/domain_taxonomy.csv) [![Quality Gate](https://img.shields.io/badge/quality%20gate-active-1f5f4b)](docs/QUALITY_GATE.md) [![Roadmap](https://img.shields.io/badge/roadmap-to%20100%25-c58b2b)](docs/superpowers/plans/2026-05-02-roadmap-to-100-percent-global-standards-index.md)

SIGMA is an open project to build the world’s most complete public index of global standards, treaties, frameworks, guidelines, and classification systems.

Repository Contents

Current Data Scope

Sample Records

SIGMA IDNameDomainMandateWhy it matters
HL-WHO-IHR-2005International Health RegulationsHealth & MedicalTreaty-bindingCore legally binding framework for preventing and responding to cross-border public health risks.
FS-CAC-CXG2-1985Guidelines on Nutrition LabellingFood Safety & AgricultureVoluntary-with-regulatory-adoptionInternational guidance for nutrition declarations and related food labelling information.
RFC-9110HTTP SemanticsICTVoluntary-with-regulatory-adoptionFoundation for interoperable web communication and API semantics.
ILO-C087Freedom of Association and Protection of the Right to Organise ConventionLabour & EmploymentTreaty-bindingOne of the core international labour rights instruments.
NSB-ANSI-USAAmerican National Standards InstituteNational Standards Body RegistryStandards bodyKey national standards body and ISO member for the United States.

Setup

  1. Create a virtual environment: python3 -m venv venv
  2. Activate it: source venv/bin/activate
  3. Install dependencies: pip install -e .

Validation

GitHub Domain Agents

SIGMA includes a free-safe GitHub Actions scaffold for domain-based automation. The Domain Agents workflow can be run manually or on a weekly schedule; it dispatches registered agents from data/reference/domain_worker_registry.csv, writes a state artifact, validates generated data, and opens a pull request when a worker changes tracked source or report files.

Start with dry_run enabled in GitHub Actions. Add optional provider tokens only through GitHub repository secrets, never in commits. See docs/GITHUB_AGENTIC_SETUP_GUIDE.md for the fillable secret template and click-by-click setup.

Relationship Extraction

Generate high-confidence relationship edges:

python3 scripts/extract_relationships.py
python3 scripts/validate_relationships.py data/relationships --processed-dir data/processed

Or run make relationships.

Release Build

Build downloadable artifacts in dist/:

make release

The release build currently emits:

dist/ is ignored by Git because release artifacts are generated outputs.

Public Search

The GitHub Pages build now publishes a searchable standards browser:

make pagefind-search

The static site generator writes public/search.html, public/search-index.json, and Pagefind-readable record pages under public/search-records/. The Pages workflow then runs npx -y pagefind@1.5.2 --site public --output-subdir pagefind so the deployed site has a low-bandwidth Pagefind search bundle plus a JSON fallback search for local previews.

The search page also includes an optional browser-side semantic helper built with Transformers.js. It loads a small quantized feature-extraction model only when a visitor requests it, runs locally in the browser, and falls back to lexical search when the model, network, or browser runtime is unavailable.

Google Sheet Sync

The public curation sheet 000_SIGMA_MASTER_DATABASE can be synchronized into the processed data layer:

make sync-google-sheet

The sync reads the public CSV export, keeps the 22-field SIGMA master schema, and writes data/processed/google_sheet_master.csv. Sheet-only curation metadata such as related_sigma_ids, llm_enriched, and last_updated is ignored by the release bundle until those fields are formally added to the published schema.

Phase 2A Health Priority Ingestion

The first Life Sciences & Health priority ingestor transforms source-confirmed WHO/Sphere/WASH records into the processed data layer:

make health-priority

The curated source table is data/reference/health_priority_sources.csv, and the generated canonical output is data/processed/health_priority_standards.csv. This slice expands Domain 1 Health & Medical, Domain 33 Water, Sanitation & Hygiene, and related humanitarian WASH coverage without duplicating the original seed IDs.

Phase 2B Codex Priority Ingestion

The first Codex Alimentarius ingestor transforms source-confirmed food-safety records into the processed data layer:

make codex

The curated source table is data/reference/codex_priority_sources.csv, and the generated canonical output is data/processed/codex_standards.csv. This slice expands Domain 2 Food Safety & Agriculture with cross-cutting Codex hygiene, additive, contaminant, labelling, analysis, sampling, and import/export inspection records.

Phase 2C Humanitarian Standards Expansion

The humanitarian priority ingestor extends Domain 17 beyond Sphere WASH:

make humanitarian-priority

The curated source table is data/reference/humanitarian_priority_sources.csv, and the generated canonical output is data/processed/humanitarian_priority_standards.csv. This slice adds CHS, INEE, IASC, UNHCR, and WHO Emergency Medical Teams standards and guidance.

Phase 2D WHO IRIS/OAI Staging

The WHO IRIS harvester stages likely normative WHO metadata for curator review before publication:

make who-iris-stage

The deterministic fixture is data/reference/who_iris_oai_sample.xml, and the staged output is data/staging/who_iris_filtered_metadata.csv. Staged rows are deliberately excluded from the release bundle until a curator promotes them into data/processed.

Phase 2E UN Treaty Collection Staging

The first UN treaty staging pass transforms source-confirmed UN Treaty Collection and OHCHR core-instrument records into a curator-review table:

make un-treaties-stage

The curated source table is data/reference/un_treaty_priority_sources.csv, and the staged output is data/staging/un_treaty_candidates.csv. This slice activates the UN Treaty Collection task for Domain 15 and stages the nine OHCHR core international human-rights treaties for review before any release-bundle promotion.

Phase 3A IAEA Safety Standards Priority Ingestion

The first IAEA ingestor transforms source-confirmed Safety Standards Series records into the processed data layer:

make iaea-priority

The curated source table is data/reference/iaea_priority_sources.csv, and the generated canonical output is data/processed/iaea_safety_standards.csv. This slice activates the IAEA gap-analysis task for Domain 31 and covers priority Safety Fundamentals, General Safety Requirements, and Specific Safety Requirements for radiation protection, emergency preparedness, nuclear regulatory frameworks, site evaluation, and nuclear power plant safety.

Phase 4A Sustainability Reporting Standards

The first sustainability reporting ingestor transforms curated GRI and SASB source records into the processed data layer:

make sustainability-reporting

The curated source table is data/reference/sustainability_reporting_sources.csv, and the generated canonical output is data/processed/sustainability_reporting_standards.csv. This slice activates the GRI/SASB gap-analysis task for Domain 26 and establishes the pattern for expanding into ISSB, ESRS, and GHG Protocol records.

Phase 5A NIST Cybersecurity and AI Priority Ingestion

The first NIST ingestor transforms source-confirmed CSRC and AI RMF records into the processed data layer:

make nist-priority

The curated source table is data/reference/nist_priority_sources.csv, and the generated canonical output is data/processed/nist_priority_standards.csv. This slice activates the NIST gap-analysis task for Domain 29 and includes cross-domain AI RMF records for Domain 30.

Phase 5B W3C Web Standards Priority Ingestion

The first W3C ingestor transforms source-confirmed W3C Technical Reports records into the processed data layer:

make w3c-priority

The curated source table is data/reference/w3c_priority_sources.csv, and the generated canonical output is data/processed/w3c_standards.csv. This slice activates the W3C gap-analysis task for Domain 28 and covers priority W3C Standards across accessibility, web APIs, verifiable credentials, data catalogues, graphics, real-time communications, and web fonts.

Phase 5C ITU Telecommunications Priority Ingestion

The first ITU ingestor transforms source-confirmed ITU-T Recommendation records into the processed data layer:

make itu-priority

The curated source table is data/reference/itu_priority_sources.csv, and the generated canonical output is data/processed/itu_recommendations.csv. This slice activates the ITU gap-analysis task for Domain 28 and covers priority telecommunications recommendations across numbering, optical transport, broadband access, video coding, public-key infrastructure, quality assessment, and machine learning in future networks.

Phase 5D ETSI ICT Standards Priority Ingestion

The first ETSI ingestor transforms source-confirmed ETSI deliverable records into the processed data layer:

make etsi-priority

The curated source table is data/reference/etsi_priority_sources.csv, and the generated canonical output is data/processed/etsi_standards.csv. This slice activates the ETSI roadmap task for Domain 28 and covers priority ETSI standards across ICT accessibility, consumer IoT cybersecurity, radio spectrum access, connected transport, 5G system architecture, NFV interoperability, and conformance assessment.

Phase 5E OASIS, Ecma, and GS1 Priority Ingestion

The open ICT priority ingestor transforms source-confirmed OASIS, Ecma, and GS1 records into the processed data layer:

make open-ict-priority

The curated source table is data/reference/open_ict_priority_sources.csv, and the generated canonical output is data/processed/open_ict_standards.csv. This slice activates the OASIS, Ecma, and GS1 roadmap tasks with priority records for MQTT, SAML, OpenDocument, STIX, TAXII, ECMAScript, JSON, GS1 General Specifications, EPCIS/CBV, and GS1 Digital Link.

Phase 6A IEC Electrotechnical Priority Ingestion

The first IEC ingestor transforms source-confirmed IEC Webstore metadata records into the processed data layer:

make iec-priority

The curated source table is data/reference/iec_priority_sources.csv, and the generated canonical output is data/processed/iec_standards.csv. This slice activates the IEC gap-analysis task for Domain 9 and covers priority electrotechnical families including IEC 60364, IEC 60601, IEC 61000, IEC 61508, IEC 62443, and IEC 60079.

Phase 6B CCSDS and ECSS Space Standards Priority Ingestion

The first space standards ingestor transforms source-confirmed CCSDS and ECSS records into the processed data layer:

make space-priority

The curated source table is data/reference/space_priority_sources.csv, and the generated canonical output is data/processed/space_standards.csv. This slice activates the CCSDS/ECSS gap-analysis task for Domain 14 and covers priority space packet, telemetry, telecommand, file delivery, systems engineering, project management, and product-assurance standards.

Phase 7A Culture and Heritage Priority Ingestion

The first culture and heritage ingestor transforms source-confirmed UNESCO, ICOMOS, ICOM, and ICCROM records into the processed data layer:

make culture-priority

The curated source table is data/reference/culture_priority_sources.csv, and the generated canonical output is data/processed/culture_heritage_standards.csv. This slice activates the culture and heritage roadmap task for Domain 21 and covers UNESCO heritage conventions, ICOMOS conservation frameworks, the ICOM Code of Ethics for Museums, and ICCROM crisis first-aid guidance.

Phase 7B Sports and Recreation Priority Ingestion

The first sports and recreation ingestor transforms source-confirmed WADA, IOC, IFAB, World Athletics, CAS, FIBA, and ITF records into the processed data layer:

make sports-priority

The curated source table is data/reference/sports_priority_sources.csv, and the generated canonical output is data/processed/sports_recreation_standards.csv. This slice activates the sports and recreation roadmap task for Domain 22 and covers anti-doping governance, Olympic Movement governance, football laws, athletics rules, sports arbitration, basketball rules, and tennis rules.

Life-Science Research Utilities

BindingDB ligand and target lookups can be run through the compact REST wrapper:

python3 scripts/bindingdb_lookup.py pdb 1Q0L --cutoff 100 --identity 92 --max-items 10
python3 scripts/bindingdb_lookup.py smiles 'CC(=O)OC1=CC=CC=C1C(=O)O' --cutoff 0.9

The wrapper delegates requests to scripts/rest_request.py, defaults broad lookups to compact output, and supports --save-raw --raw-output-path ... when a curator needs to archive the full BindingDB payload.

Phase 8A National Standards Body Registry

The first national standards body registry slice transforms source-confirmed ISO member-body records into the processed data layer:

make national-standards-bodies

The curated source table is data/reference/national_standards_bodies_sources.csv, and the generated canonical output is data/processed/national_standards_bodies.csv. This initial slice covers 10 high-impact ISO member bodies and establishes the pattern for expanding toward the full ISO national body network.

Research Task Matrix

The 24-month research plan is tracked as machine-readable work items in data/reference/research_tasks.csv. Run:

make research-tasks

This generates data/reports/research_task_coverage.csv and docs/RESEARCH_TASKS.md, covering all 40 domains, all major phases, and the enhanced integration roadmap.

Roadmap to 100 Percent

The remaining work toward the complete global standards index is documented in docs/superpowers/plans/2026-05-02-roadmap-to-100-percent-global-standards-index.md. It follows the canonical Phase 0 through Phase 9 research plan, then extends into the enhanced graph, API, automation, multilingual, and sustainability roadmap.

The friend-reviewed gap analysis is preserved in docs/SIGMA_GAP_ANALYSIS_AND_ENHANCEMENT_PLAN.md. Its actionable recommendations have been converted into roadmap checkpoints and machine-readable research tasks.

The current project status report is preserved in docs/PROJECT_STATUS_REPORT_2026-05-02.md. It summarizes the journey so far, current release metrics, remaining workstreams, lessons learned, challenges, strategies, and operating best practices.

Contribute Without Coding

Non-technical contributors can propose corrections, missing standards, or new source families through GitHub Issues. Use the issue templates for new entries and corrections, and include an official source URL for every claim.

Project Owner and Contact

The project owner and lead curator is Mohammad Ariful Islam. For public questions, corrections, missing sources, or contribution proposals, open an issue at https://github.com/sigma-standards/sigma-index/issues.

Phase 9A Quality Gate

The first release quality gate performs deterministic checks that are safe for local and CI runs:

make quality-gate

It writes data/reports/quality_gate.csv and docs/QUALITY_GATE.md, covering duplicate sigma_id values, missing required master-schema fields, and malformed official_url values. Live URL reachability remains a separate audit step because source sites can rate-limit, redirect, or block automated requests.

Publishing

GitHub Actions includes:

Enable Pages in repository settings with GitHub Actions as the source, then run the Publish Pages workflow.

Build the same site locally after creating release artifacts:

make release
python3 scripts/build_static_site.py

The local and remote Pages builds use the same script so navigation, downloads, documentation links, and all-domain coverage stay synchronized.

Enhancement Focus Added (May 2026)

The plan now explicitly includes:

Airtable / MCP Note

No Airtable MCP server is required for the current build. The project plan uses free/open collaboration paths: Google Sheets for early curation and NocoDB as the Airtable-style open-source option if a database UI is needed.

Contributing

  1. Open an issue with a source-backed addition/correction.
  2. Submit a pull request with schema-compliant updates.
  3. Include source URLs for all added or changed records.

License