1 Synesis: Ideas
Ideas: Language Extension Specification Proposals
1.1 Introduction to Extensions
This specification details the new features planned for version 1.2 of the Synesis language. The focus of this update is to evolve the language from a linear annotation system into a qualitative inference engine and knowledge-management system (Zettelkasten), and to introduce native support for canonical loci (sacred texts).
1.2 Output Artifact Configuration (RENDER)
The RENDER command allows explicitly declaring, within the project file, which file formats should be generated and where they should be saved. This ensures compilation reproducibility without relying on command-line parameters.
1.2.1 Syntax
RENDER AS [JSON | CSV | EXCEL] INTO "path/to/filename"
1.2.1.1 Semantics
- RENDER: Command verb that initiates compilation target definition. Indicates creation of a structured representation (artifact) from interpreted data.
- AS: Clause that specifies serialization format.
  - JSON: Generates the complete abstract syntax tree with traceability metadata.
  - CSV: Generates flat relational tables (separated by block type).
  - EXCEL: Generates an .xlsx file with tabs corresponding to the CSV tables.
- INTO: Clause that defines the destination path.
  - Must be a string in double quotes.
  - Relative paths to the project root are recommended to ensure portability across different machines.
  - If the destination directory doesn't exist, the compiler will attempt to create it.
1.2.1.2 Example in Context (.synp)
PROJECT "Phenomenological Study 2026"
TEMPLATE "templates/research_v1.synt"
INCLUDE BIBLIOGRAPHY "data/references.bib"
INCLUDE ANNOTATIONS "data/interviews/*.syn"
INCLUDE ONTOLOGY "ontology/main.syno"
# Declarative Output Configuration
# Generates multiple formats simultaneously in organized directories
RENDER AS JSON INTO "dist/full_data.json"
RENDER AS EXCEL INTO "reports/human_readable/matrix.xlsx"
RENDER AS CSV INTO "reports/raw_data/export.csv"
END
1.2.1.3 Validation Rules
- Multiplicity: Multiple RENDER commands are allowed in the same project. The compiler will process all of them sequentially.
- Write Permission: The compiler must have write permission to the target directory. If the destination file is locked (e.g., open in Excel), compilation will fail with an I/O error.
- Absolute Paths: Using absolute paths (e.g., C:/Users/...) will emit a Warning, as it breaks project portability.
1.3 Advanced Semantic Analysis
The compiler now acts as a methodological assistant, not just a syntactic validator.
1.3.1 Ontological Redundancy Detection (Fuzzy Matching)
The compiler will analyze the ontology for concepts with similar spelling (Lexical Similarity), preventing knowledge graph fragmentation.
- Algorithm: Levenshtein / Jaro-Winkler distance.
- Behavior: Warning during compilation.
- Detection Example: SocialInteraction vs Social_Interaction (see the sketch after this list).
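A minimal detection sketch in Python, assuming the ontology's concept names are already available as a list; the normalization step, the distance threshold, and the warning text are illustrative, not part of the specification:

```python
# Flag ontology concepts whose names are nearly identical (potential redundancy).
from itertools import combinations

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            current.append(min(
                current[j - 1] + 1,            # insertion
                previous[j] + 1,               # deletion
                previous[j - 1] + (ca != cb),  # substitution
            ))
        previous = current
    return previous[-1]

def normalize(name: str) -> str:
    # Ignore case and word separators so "Social_Interaction" ~ "SocialInteraction".
    return name.lower().replace("_", "").replace("-", "").replace(" ", "")

def redundancy_warnings(concepts: list[str], max_distance: int = 2) -> list[str]:
    warnings = []
    for a, b in combinations(concepts, 2):
        if levenshtein(normalize(a), normalize(b)) <= max_distance:
            warnings.append(f"Warning: possibly redundant concepts '{a}' and '{b}'")
    return warnings

print(redundancy_warnings(["SocialInteraction", "Social_Interaction", "Trust"]))
# -> one warning for the first pair
```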
1.3.2 Hierarchical Cycle Validation
Prevents the creation of logical loops in taxonomic definitions.
- Rule: A concept cannot be an ancestor of itself.
- Fatal Error: Compilation is aborted if a cycle is detected (e.g., A > B > C > A); see the sketch below.
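A minimal cycle-detection sketch, assuming the taxonomy has been collected into a parent-to-children mapping; the function name and reporting format are illustrative:

```python
# Depth-first search with node coloring: a "gray" child reached again is a back edge,
# i.e., a concept that is an ancestor of itself.
def find_cycle(children: dict[str, list[str]]) -> "list[str] | None":
    """Return one cycle as a list of concepts, or None if the taxonomy is acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color: dict[str, int] = {}
    stack: list[str] = []

    def visit(node: str) -> "list[str] | None":
        color[node] = GRAY
        stack.append(node)
        for child in children.get(node, []):
            if color.get(child, WHITE) == GRAY:        # back edge -> cycle found
                return stack[stack.index(child):] + [child]
            if color.get(child, WHITE) == WHITE:
                found = visit(child)
                if found:
                    return found
        color[node] = BLACK
        stack.pop()
        return None

    for node in list(children):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None

# The fatal-error case from the spec: A > B > C > A
print(find_cycle({"A": ["B"], "B": ["C"], "C": ["A"]}))   # ['A', 'B', 'C', 'A']
```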
1.3.3 Dead Code Elimination (Ghost Concepts)
Identifies concepts defined in the ontology that were never applied to any ITEM or ZETTEL; see the sketch below.
- Behavior: Warning (Codebook cleanup report).
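A minimal sketch of the check, assuming the sets of defined and applied concepts have already been collected while walking the AST; the names and the warning text are illustrative:

```python
# Ghost concepts = defined in the ontology but never applied anywhere.
def ghost_concepts(defined: set[str], used_in_items: set[str], used_in_zettels: set[str]) -> set[str]:
    return defined - used_in_items - used_in_zettels

unused = ghost_concepts(
    defined={"Providence", "Anxiety", "Trust"},
    used_in_items={"Providence"},
    used_in_zettels=set(),
)
for concept in sorted(unused):
    print(f"Warning: concept '{concept}' is never applied to any ITEM or ZETTEL")
```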
1.4 New Structural Commands
1.4.1 TAXONOMY (Rigid Hierarchy)
Replaces loose use of TOPIC for strict “is-a” relationship definitions. Allows property inheritance.
Syntax:
TAXONOMY
Emotions
> Negative
> Anger
> Fear
> Positive
> Joy
END
1.4.2 CONSTRAINT (Business Rules)
Defines logical invariants to ensure methodological consistency of annotations.
Syntax:
CONSTRAINT
# Mutual exclusion: An item cannot have both codes simultaneously
FORBID Code "Sentiment: Positive" WITH Code "Risk: Critical"
# Logical dependency: Code B requires presence of Code A
REQUIRE Code "Solution" IF Code "Problem" PRESENT
END
1.4.3 CROSSREF (Citation Network)
Formalizes rhetorical relationships between bibliographic sources (Source-to-Source), independent of internal annotations.
Syntax:
CROSSREF
@smith2019 -> REFUTES -> @doe2018
@jones2020 -> EXTENDS -> @smith2019
END
1.5 Knowledge Management (Zettelkasten)
Introduces capability to create “Permanent Notes” that synthesize knowledge independent of a specific bibliographic source.
1.5.1 ZETTEL Block
Top-level entity, sibling of SOURCE and ONTOLOGY. Does not require @bibref.
Syntax:
# Syntax: ZETTEL [UNIQUE_ID]
ZETTEL 2026011401
title: "The Atomicity Principle"
body: "A DSL should force concept breakdown..."
# Explicit connections (Rhizome)
links: 2026011205, 2025123001
# Metadata
tags: DSL Design
origin: @turing1936
END
1.5.2 Exclusive Fields (ZETTEL FIELDS)
In the .synt template file, new field types are supported:
- LINK (Type REFERENCE): Creates bidirectional edges between ZETTEL nodes.
- ORIGIN (Type REFERENCE): Creates citation edge to a SOURCE.
1.6 Canonical Annotation (Biblical/Exegetical)
Specialized addressing system for texts that use the Locus pattern (Book, Chapter, Verse) instead of author-date bibliographic reference.
1.6.1 BIBLE Block
Replaces SOURCE block for sacred texts. Uses USX standard for book identification.
Syntax:
# Syntax: BIBLE [USX_CODE] [CHAP:VERSE] [(VERSION)]
BIBLE ROM 8:28-30 (ESV)
# Pericope title (String with mandatory quotes to support punctuation)
pericope: "The efficacy of calling: The eternal purpose."
ITEM
quote: "And we know that for those who love God all things work together..."
code: Providence
END
END
1.6.2 Addressing Rules (Parsing)
Parser applies strict rules to avoid ambiguity:
- USX Code: Exactly 3 uppercase letters (e.g., MAT, GEN, PSA).
- Separators: Mandatory use of : to separate chapter/verse and - for ranges (see the parsing sketch after this list).
- Pericope: Must be delimited by double quotes "" or triple """ to allow special characters, commas, and line breaks without breaking the parser.
1.6.3 Canon Template Example
TEMPLATE biblical_exegesis
BIBLE FIELDS
OPTIONAL pericope
END
ITEM FIELDS
REQUIRED quote
REQUIRED code
OPTIONAL original_term # E.g.: Greek/Hebrew
END
1.6.3.1 Appendix A: Implementation Notes (Lark Grammar)
Grammar snippet to support BIBLE header:
// Canon Block Definition
canon_block: "BIBLE" usx_code bible_ref version? block_body "END"
// USX Codes (strict 3 uppercase letters)
usx_code: UCASE_LETTER UCASE_LETTER UCASE_LETTER
// Supported references: 8, 8:28, 8:28-30, 8:28-9:1
bible_ref: chapter
| chapter ":" verse
| chapter ":" verse "-" verse
| chapter ":" verse "-" chapter ":" verse
chapter: INT
verse: INT | "END"
version: "(" UCASE_LETTER+ ")" | "@" LCASE_LETTER+ DIGIT+
1.7 Immediate Visualization
JSON output is useful for processing, but humans need to see the graph.
Idea: Create a synesis serve command that starts a simple local server showing the connection graph (using D3.js or Cytoscape) generated from exported JSON.
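A minimal sketch of what such a command could do with the standard library, assuming the exported JSON and a static viewer page (built with D3.js or Cytoscape) live in a dist/ directory; the command name, port, and directory layout are assumptions, not an implemented feature:

```python
# Serve the exported JSON plus a static graph-viewer page on a local port.
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

def serve(directory: str = "dist", port: int = 8000) -> None:
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    with ThreadingHTTPServer(("127.0.0.1", port), handler) as httpd:
        print(f"Serving {directory} at http://127.0.0.1:{port} (Ctrl+C to stop)")
        httpd.serve_forever()

if __name__ == "__main__":
    serve()  # a viewer page in dist/ could load full_data.json and render the graph
```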
1.8 Data Science Integration
CSV/Excel export support is good, but direct Pandas integration would be killer.
Example: A Python library (import synesis) that loads the project directly into a DataFrame without going through the terminal, facilitating use in Jupyter Notebooks.
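A hypothetical usage sketch in a Jupyter cell; the synesis package and synesis.load are proposals from this document (not a published API), and the project filename and column names are illustrative:

```python
import synesis  # proposed package, not yet published

# Compile the project and get its annotations as a DataFrame, with no terminal step.
df = synesis.load("phenomenological_study.synp")
df.groupby("code").size().sort_values(ascending=False)
```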
1.9 Possible Workflow
The premise is exact: the Compiler Core (Parser + Semantic Analysis) acts as the single source of truth. File export, the Python library (synesis.load), and the LSP server all consume the same Abstract Syntax Tree (AST) and the same Symbol Table.
Here is the objective workflow for this implementation, focusing on the “Compiler as a Library” architecture:
1.9.1 Step 1: I/O Decoupling (Input/Output)
The first step is to ensure the compiler doesn’t “know” it’s writing files.
- Action: Separate the export functions. Today, there is probably an export_csv(ast, filepath) function.
- Refactoring: Split it into two (sketched below):
  - transform_to_tabular_data(ast): Receives the AST and returns lists of dictionaries/objects in memory (without touching disk).
  - write_csv(data, filepath): Receives the data and only writes the file.
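A minimal sketch of the split, assuming an AST object that exposes an items collection; the row fields are illustrative:

```python
import csv

def transform_to_tabular_data(ast) -> list[dict]:
    """Pure transformation: AST in, plain rows out. Never touches the disk."""
    return [
        {"source": item.source_ref, "quote": item.quote, "code": item.code}
        for item in ast.items
    ]

def write_csv(data: list[dict], filepath: str) -> None:
    """I/O only: serialize the rows produced by the transformation step."""
    with open(filepath, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(data[0]) if data else [])
        writer.writeheader()
        writer.writerows(data)
```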
1.9.2 Step 2: Public Core API Exposure
The main Synesis module should expose programmatic access to compilation.
- Action: Make orchestration function externally accessible.
- Definition: Create/adjust a method (e.g., Synesis.compile_source()) that accepts a string or file path and returns the Project object (the semantic validation result containing all linked structures); a minimal sketch follows.
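A minimal sketch of the entry point; only compile_source and the returned Project are named in this document, and the parse/validate internals below are placeholders:

```python
from dataclasses import dataclass, field
from pathlib import Path
from types import SimpleNamespace

@dataclass
class Project:
    """Result of semantic validation: the linked structures plus diagnostics."""
    ast: object = None
    diagnostics: list = field(default_factory=list)

class Synesis:
    @staticmethod
    def compile_source(source) -> Project:
        """Accept raw Synesis text or a path to a .syn/.synp file."""
        as_path = Path(source)
        text = as_path.read_text(encoding="utf-8") if as_path.is_file() else str(source)
        # Placeholder: the real parser and analyzer would build and validate the full AST from `text`.
        ast = SimpleNamespace(items=[], raw=text)
        return Project(ast=ast)
```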
1.9.3 Step 3: Python Adapter Creation (Data Science)
This is the consumer you suggested (synesis.load).
- Dependency: Define Pandas as optional dependency.
- Flow (sketched below):
  - Calls the Step 2 API to get the Project object in memory.
  - Calls the Step 1 transformation function (transform_to_tabular_data).
  - Instantiates Pandas DataFrames using this data.
  - Applies typing (converts strings to datetime objects, categories, etc.) based on the AST types.
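A minimal sketch of the adapter, reusing the names from the Step 1 and Step 2 sketches above; the lazy Pandas import keeps it an optional dependency, and the typed column names are illustrative:

```python
def load(path: str):
    """Proposed synesis.load: compile a project and return a typed DataFrame."""
    import pandas as pd                               # optional dependency, imported on use

    project = Synesis.compile_source(path)            # Step 2 API
    rows = transform_to_tabular_data(project.ast)     # Step 1 transformation
    df = pd.DataFrame(rows)
    # Apply typing based on the AST field types (column names are illustrative).
    if "date" in df.columns:
        df["date"] = pd.to_datetime(df["date"], errors="coerce")
    if "code" in df.columns:
        df["code"] = df["code"].astype("category")
    return df
```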
1.9.4 Step 4: LSP Adaptation (Editor Intelligence)
LSP consumes same AST, but needs different data (location instead of value).
- Action: Ensure AST nodes (Item, Source, Field) preserve origin metadata (line, column, file); see the sketch after this list.
- Flow:
- LSP sends current file text (even incomplete).
- Core attempts parsing (with error tolerance).
- Core returns Symbol Table (list of available ontologies and definitions).
- LSP uses this table to provide Autocomplete and Go to Definition.
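A minimal sketch of the position-carrying structures the LSP needs, assuming dataclass-based nodes; the class and field names (and the example line/column) are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Origin:
    file: str
    line: int
    column: int

@dataclass
class Symbol:
    name: str
    kind: str          # e.g. "concept", "source", "zettel"
    origin: Origin

class SymbolTable:
    def __init__(self) -> None:
        self._symbols: dict[str, Symbol] = {}

    def define(self, symbol: Symbol) -> None:
        self._symbols[symbol.name] = symbol

    def definition_of(self, name: str) -> "Origin | None":
        """Go to Definition needs a position, not a value."""
        symbol = self._symbols.get(name)
        return symbol.origin if symbol else None

    def completions(self, prefix: str) -> list[str]:
        """Autocomplete: known names matching the typed prefix."""
        return sorted(n for n in self._symbols if n.startswith(prefix))

# Example: register a concept with its origin, then answer editor queries.
table = SymbolTable()
table.define(Symbol("Providence", "concept", Origin("ontology/main.syno", 12, 1)))
print(table.completions("Prov"))          # ['Providence']
print(table.definition_of("Providence"))  # Origin(file='ontology/main.syno', line=12, column=1)
```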
1.9.5 Step 5: CLI as “Thin Client”
The synesis compile terminal command becomes just another client, like Python or LSP.
- Action: The CLI only collects arguments, calls the Step 2 API, and uses the file-writing functions (write_csv, write_json) defined in Step 1; a minimal sketch follows.
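A minimal sketch of the thin client, reusing the names from the Step 1 and Step 2 sketches above (write_json is the JSON counterpart of write_csv mentioned in Step 1); the flag names and defaults are illustrative:

```python
import argparse

def main() -> None:
    parser = argparse.ArgumentParser(prog="synesis", description="Thin CLI over the compiler core")
    parser.add_argument("project", help="path to the .synp project file")
    parser.add_argument("--format", choices=["json", "csv"], default="csv")
    parser.add_argument("--out", default="dist/export.csv")
    args = parser.parse_args()

    project = Synesis.compile_source(args.project)    # Step 2 API
    data = transform_to_tabular_data(project.ast)     # Step 1 transformation
    if args.format == "csv":
        write_csv(data, args.out)                     # Step 1 writer
    else:
        write_json(data, args.out)                    # Step 1 writer (JSON counterpart)

if __name__ == "__main__":
    main()
```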
1.9.6 Data Flow Summary
- Input: .syn / .synt files
- Central Engine (Synesis Core): Generates AST + Semantic Validation.
- Distribution:
- via CLI -> Writes to Disk (JSON/CSV).
- via Python SDK -> Delivers Objects/DataFrames in Memory.
- via LSP -> Delivers Position Metadata and Errors to Editor.