evoproc.ga_scaffold_structured

GA scaffold for LLM‑generated global‑state procedures.

This module wires up a lightweight genetic algorithm (GA) over your Procedure JSON schema. It uses LLM‑driven crossover and mutation operators, a validator‑driven structural hygiene scorer, optional task‑evaluation scoring, and diversity via random immigrants.

Global‑state semantics

  • Step 1 must take exactly one input: ["problem_text"].

  • Later steps may read any variable produced by earlier steps (no strict pass‑through).

  • The final step must output exactly ["final_answer"] (a description, not a computed value).
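
The three rules above can be checked mechanically. The sketch below assumes a hypothetical procedure layout (a "steps" list whose entries carry "inputs" and "outputs" lists of variable names); the real Procedure schema may name these fields differently, and validate_procedure_structured remains the authoritative check.

```python
from typing import Any, Dict, List


def check_global_state(proc: Dict[str, Any]) -> List[str]:
    """Return human-readable violations of the three global-state rules.

    Assumes a hypothetical layout: proc["steps"] is a list of dicts with
    "inputs" and "outputs" lists. Adjust the keys to your actual schema.
    """
    problems: List[str] = []
    steps = proc.get("steps", [])
    if not steps:
        return ["procedure has no steps"]

    # Rule 1: the first step reads only the raw problem text.
    if steps[0].get("inputs") != ["problem_text"]:
        problems.append('step 1 inputs must be ["problem_text"]')

    # Rule 2: a step may read anything produced by an earlier step.
    available = {"problem_text"}
    for i, step in enumerate(steps, start=1):
        for name in step.get("inputs", []):
            if name not in available:
                problems.append(f"step {i} reads undefined variable {name!r}")
        available.update(step.get("outputs", []))

    # Rule 3: the last step emits exactly the final answer.
    if steps[-1].get("outputs") != ["final_answer"]:
        problems.append('final step outputs must be ["final_answer"]')
    return problems


example = {
    "steps": [
        {"inputs": ["problem_text"], "outputs": ["april_sales"]},
        {"inputs": ["april_sales"], "outputs": ["may_sales"]},
        # Reads a step-1 variable directly: allowed under global state.
        {"inputs": ["april_sales", "may_sales"], "outputs": ["final_answer"]},
    ]
}
print(check_global_state(example))  # prints []
```

Note that the third step reads april_sales even though the second step did not pass it through; that is what "no strict pass-through" permits.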

Typical usage

from evoproc.ga_scaffold_structured import *
from evoproc.scorers import (
    StructuralHygieneScorer,
    ProcScorerAdapter,
    TaskEvalScorer,
)
from evoproc.validators import validate_procedure_structured


ga = ProcedureGA(
    model="gemma3:latest",
    create_proc_fn=create_procedure_prompt,
    query_fn=query,
    schema_json_fn=lambda: Procedure.model_json_schema(),
    validate_fn=validate_procedure_structured,
    repair_fn=query_repair_structured,
    scorer=ProcScorerAdapter(
        StructuralHygieneScorer(validate_fn=validate_procedure_structured)
    ),
    cfg=GAConfig(population_size=8, max_generations=5, seed=42),
)


best, history = ga.run(
    task_description="Solve: Natalia sold clips to 48 friends in April...",
    # Supply these three for TaskEval scoring; otherwise structural scoring is used:
    final_answer_schema=None,
    eval_fn=None,  # (state, proc) -> float
    run_steps_fn=None,  # executes a procedure end-to-end and returns `state`
    print_progress=True,
)

Classes

  • CrossoverOperator – LLM‑based crossover for global‑state procedures.

  • GAConfig – Genetic algorithm configuration.

  • Individual – Population member wrapper.

  • MutationOperator – LLM-driven mutation for global-state procedures.

  • ProcedureGA – GA driver that orchestrates initialize → evaluate → select → reproduce → next gen.

  • Scorer – GA-facing scorer protocol.

class evoproc.ga_scaffold_structured.GAConfig

Bases: object

Genetic algorithm configuration.

Variables:
  • population_size – Number of individuals per generation.

  • elitism – Number of top individuals copied unchanged to the next generation.

  • crossover_rate – Probability of producing a child via crossover in reproduction.

  • mutation_rate – Probability of producing a child via mutation (fallback is also mutation).

  • max_generations – Total number of evolutionary iterations.

  • tournament_k – Tournament size for parent selection.

  • seed – Random seed for reproducibility (also forwarded to LLM ops where applicable).

  • random_immigrant_rate – Fraction of remaining slots each generation filled with freshly generated procedures.

population_size: int = 5
elitism: int = 2
crossover_rate: float = 0.7
mutation_rate: float = 0.3
max_generations: int = 10
tournament_k: int = 3
seed: Optional[int] = None
random_immigrant_rate: float = 0.1
__init__(population_size=5, elitism=2, crossover_rate=0.7, mutation_rate=0.3, max_generations=10, tournament_k=3, seed=None, random_immigrant_rate=0.1)
Parameters:
  • population_size (int)

  • elitism (int)

  • crossover_rate (float)

  • mutation_rate (float)

  • max_generations (int)

  • tournament_k (int)

  • seed (int | None)

  • random_immigrant_rate (float)

Return type:

None
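
To make the interplay of these fields concrete, the sketch below splits one generation's slots into elites, random immigrants, and bred children, and shows a size-k tournament. It uses a stand-in dataclass that mirrors the documented defaults; the rounding rule for immigrants and the helper names plan_slots and tournament_select are assumptions for illustration, not the module's actual code.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GAConfigSketch:
    """Stand-in mirroring the documented GAConfig defaults."""
    population_size: int = 5
    elitism: int = 2
    crossover_rate: float = 0.7
    mutation_rate: float = 0.3
    max_generations: int = 10
    tournament_k: int = 3
    seed: Optional[int] = None
    random_immigrant_rate: float = 0.1


def plan_slots(cfg: GAConfigSketch) -> Tuple[int, int, int]:
    """Split a generation into (elites, immigrants, bred children)."""
    elites = min(cfg.elitism, cfg.population_size)
    remaining = cfg.population_size - elites
    # Assumption: immigrants are a rounded fraction of the non-elite slots.
    immigrants = round(remaining * cfg.random_immigrant_rate)
    return elites, immigrants, remaining - immigrants


def tournament_select(fitnesses: List[float], k: int, rng: random.Random) -> int:
    """Return the index of the fittest of k randomly sampled individuals."""
    contenders = rng.sample(range(len(fitnesses)), min(k, len(fitnesses)))
    return max(contenders, key=lambda i: fitnesses[i])


cfg = GAConfigSketch(population_size=8, seed=42)
print(plan_slots(cfg))  # prints (2, 1, 5)

rng = random.Random(cfg.seed)
parent_index = tournament_select([0.2, 0.9, 0.4, 0.7], cfg.tournament_k, rng)
```

A larger tournament_k raises selection pressure, since the best of more contenders is more likely to be a top individual; with the default population_size of 5 the same rounding rule would yield no immigrants at all.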

class evoproc.ga_scaffold_structured.Individual

Bases: object

Population member wrapper.

Variables:
  • proc (dict) – Procedure JSON that validates your Pydantic‑derived schema.

  • fitness (Optional[float]) – Last computed scalar fitness (None until evaluated).

  • notes (str) – Optional debugging/instrumentation notes.

proc: Dict[str, Any]
fitness: Optional[float] = None
notes: str = ''
__init__(proc, fitness=None, notes='')
Parameters:
  • proc (Dict[str, Any])

  • fitness (float | None)

  • notes (str)

Return type:

None

class evoproc.ga_scaffold_structured.Scorer

Bases: Protocol

GA-facing scorer protocol.

Implementations must accept an object with a .proc JSON field and return a scalar fitness.

score(ind, **kwargs)
Return type:

float

Parameters:
  • ind – Object exposing a .proc JSON field.

  • kwargs (Any)

__init__(*args, **kwargs)
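
Because Scorer is a protocol, any object with a compatible score method can serve as a scorer. The sketch below defines local stand-ins (ScorerLike, IndividualSketch) so that it runs without evoproc installed, plus a toy StepCountScorer that prefers shorter procedures; the "steps" key and the scoring rule are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Protocol, runtime_checkable


@dataclass
class IndividualSketch:
    """Stand-in for Individual with the documented fields."""
    proc: Dict[str, Any]
    fitness: Optional[float] = None
    notes: str = ""


@runtime_checkable
class ScorerLike(Protocol):
    """Stand-in for the Scorer protocol: one score method."""
    def score(self, ind: Any, **kwargs: Any) -> float: ...


class StepCountScorer:
    """Toy scorer: fewer steps means higher fitness, in (0, 1]."""

    def score(self, ind: Any, **kwargs: Any) -> float:
        n_steps = len(ind.proc.get("steps", []))
        return 1.0 / (1.0 + n_steps)


ind = IndividualSketch(proc={"steps": [{"id": 1}, {"id": 2}, {"id": 3}]})
scorer = StepCountScorer()
ind.fitness = scorer.score(ind)
print(isinstance(scorer, ScorerLike), ind.fitness)  # prints True 0.25
```

StepCountScorer never inherits from the protocol class; structural typing only requires that the method exists with a matching shape.
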
class evoproc.ga_scaffold_structured.CrossoverOperator

Bases: object

LLM‑based crossover for global‑state procedures.

Combines two parent procedures (A, B) into a single coherent child by prompting the LLM to synthesize an integrated plan that:

  • Preserves the Step 1 rule (inputs == ["problem_text"]).

  • Adheres to global‑state semantics (later steps can read any earlier variables).

  • Ends with exactly one output "final_answer".

  • Validates against the provided schema.

Notes

This operator does not splice JSON directly; it asks the LLM to synthesize a crossover child, which tends to yield more coherent procedures than mechanical concatenation.
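
A rough sketch of the synthesize-rather-than-splice idea: both parents are serialized into one prompt together with the constraints listed above, and the LLM's JSON reply becomes the child. The prompt wording, build_crossover_prompt, crossover_sketch, and the stub fake_query are hypothetical; only the query(prompt, model, fmt, seed) -> str call shape is taken from this page.

```python
import json
from typing import Any, Callable, Dict, Optional

Proc = Dict[str, Any]
QueryFn = Callable[[str, str, Optional[Dict[str, Any]], Optional[int]], str]


def build_crossover_prompt(parent_a: Proc, parent_b: Proc) -> str:
    """Assemble a prompt asking for one integrated child procedure."""
    return "\n".join([
        "Combine the two procedures below into ONE coherent procedure.",
        'Step 1 must take exactly the input ["problem_text"].',
        "Later steps may read any variable produced by an earlier step.",
        'The final step must output exactly ["final_answer"].',
        "Return JSON only.",
        "Parent A:", json.dumps(parent_a, indent=2),
        "Parent B:", json.dumps(parent_b, indent=2),
    ])


def fake_query(prompt: str, model: str,
               fmt: Optional[Dict[str, Any]], seed: Optional[int]) -> str:
    """Stub standing in for a real LLM call; returns a fixed JSON child."""
    return json.dumps(
        {"steps": [{"inputs": ["problem_text"], "outputs": ["final_answer"]}]}
    )


def crossover_sketch(parent_a: Proc, parent_b: Proc, query_fn: QueryFn,
                     model: str, schema: Optional[Dict[str, Any]],
                     seed: int = 1234) -> Proc:
    prompt = build_crossover_prompt(parent_a, parent_b)
    raw = query_fn(prompt, model, schema, seed)
    # A real operator would validate the child and repair it if needed.
    return json.loads(raw)


child = crossover_sketch({"steps": []}, {"steps": []},
                         fake_query, "gemma3:latest", None)
```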

__init__(model, query_fn, schema_json_fn, validate_fn, repair_fn, seed=1234)

Initialize the crossover operator.

Parameters:
  • model

  • query_fn

  • schema_json_fn

  • validate_fn

  • repair_fn

  • seed

class evoproc.ga_scaffold_structured.MutationOperator

Bases: object

LLM-driven mutation for global-state procedures.

Applies exactly one small edit per call (rewrite / split / insert / remove / rename / verify) and returns a full, schema‑valid procedure JSON. Each candidate is post‑processed with repair_fn and rejected if the validator reports fatal diagnostics. If a procedure‑level scorer is supplied and accept_if_not_worse is True, a mutation is accepted only if it scores at least as well as the original.

__init__(model, query_fn, schema_json_fn, validate_fn, repair_fn, proc_scorer, rng, seed, *, accept_if_not_worse=True, max_llm_tries=2)

Initialize the mutation operator.

Parameters:
  • model (str) – LLM name to use for mutation prompts.

  • query_fn (Callable[[str, str, Optional[Dict[str, Any]], Optional[int]], str]) – Callable query(prompt, model, fmt, seed) -> str returning JSON text.

  • schema_json_fn (Callable[[], Dict[str, Any]]) – Callable returning the Procedure JSON schema (dict).

  • validate_fn (Callable[[Dict[str, Any]], List[Dict[str, Any]]]) – Returns a list of diagnostics for a procedure JSON.

  • repair_fn (Callable[[Dict[str, Any], str], Dict[str, Any]]) – Minimally repairs a procedure JSON using the LLM.

  • proc_scorer (Optional[Any]) – Optional object exposing score_proc(proc_json) -> float.

  • rng (Optional[Random]) – Optional PRNG for sampling mutation intents.

  • seed (int) – Forwarded to query_fn for deterministic results.

  • accept_if_not_worse (bool) – If True, reject candidates that score worse than the original.

  • max_llm_tries (int) – Number of mutation attempts before falling back to the original.

Return type:

None
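
The acceptance logic described above can be sketched as a small retry loop. Here propose is a hypothetical callable standing in for the LLM edit plus repair_fn, and the "severity" == "fatal" diagnostic field is an assumed convention; the real validator's diagnostic format may differ.

```python
from typing import Any, Callable, Dict, List, Optional

Proc = Dict[str, Any]


def mutate_with_guard(
    original: Proc,
    propose: Callable[[Proc, int], Proc],
    validate_fn: Callable[[Proc], List[Dict[str, Any]]],
    score_proc: Optional[Callable[[Proc], float]] = None,
    accept_if_not_worse: bool = True,
    max_llm_tries: int = 2,
) -> Proc:
    """Return an accepted mutant, or the original if every attempt fails."""
    baseline = score_proc(original) if score_proc else None
    for attempt in range(max_llm_tries):
        candidate = propose(original, attempt)
        # Reject candidates carrying fatal diagnostics (assumed field name).
        if any(d.get("severity") == "fatal" for d in validate_fn(candidate)):
            continue
        # With a scorer, keep only candidates that score at least as well.
        if score_proc and accept_if_not_worse and score_proc(candidate) < baseline:
            continue
        return candidate
    return original


# Toy usage: procedures are {"steps": [...]}; fewer steps score higher.
def toy_validate(proc: Proc) -> List[Dict[str, Any]]:
    return [] if proc["steps"] else [{"severity": "fatal", "msg": "empty"}]


def toy_score(proc: Proc) -> float:
    return -float(len(proc["steps"]))


def toy_propose(proc: Proc, attempt: int) -> Proc:
    # First attempt removes every step (invalid); second removes one step.
    return {"steps": []} if attempt == 0 else {"steps": proc["steps"][:-1]}


result = mutate_with_guard({"steps": ["a", "b"]}, toy_propose,
                           toy_validate, toy_score)
print(result)  # prints {'steps': ['a']}
```

The first toy proposal is discarded for a fatal diagnostic; the second passes validation and scores better, so it is returned. Had both attempts failed, the caller would get the original back unchanged.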

class evoproc.ga_scaffold_structured.ProcedureGA

Bases: object

GA driver that orchestrates initialize → evaluate → select → reproduce → next gen.

You provide your model + callable hooks (query_fn, create_proc_fn, validators, repair, and optionally a task‑eval runner). By default, the GA uses a structural hygiene scorer; you can swap in task‑eval scoring by supplying final_answer_schema, eval_fn, and run_steps_fn to run().
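
For orientation, here is a generic textbook version of the initialize → evaluate → select → reproduce loop, run on a toy bit-string problem. It is not ProcedureGA's code: random immigrants are omitted, the helper names are hypothetical, and in ProcedureGA the crossover and mutate callables would be LLM calls rather than list operations.

```python
import random
from typing import Callable, List, Optional, Tuple

Genome = List[int]


def evolve(
    init: Callable[[random.Random], Genome],
    fitness: Callable[[Genome], float],
    crossover: Callable[[Genome, Genome, random.Random], Genome],
    mutate: Callable[[Genome, random.Random], Genome],
    *,
    population_size: int = 5,
    elitism: int = 2,
    crossover_rate: float = 0.7,
    tournament_k: int = 3,
    max_generations: int = 10,
    seed: Optional[int] = None,
) -> Tuple[Genome, List[Genome]]:
    rng = random.Random(seed)
    pop = [init(rng) for _ in range(population_size)]      # initialize
    history: List[Genome] = []

    def pick(ranked: List[Genome]) -> Genome:
        # Tournament selection: best of k random contenders.
        k = min(tournament_k, len(ranked))
        return max(rng.sample(ranked, k), key=fitness)

    for _ in range(max_generations):
        ranked = sorted(pop, key=fitness, reverse=True)    # evaluate
        history.append(ranked[0])
        next_gen = ranked[:elitism]                        # elites survive
        while len(next_gen) < population_size:             # reproduce
            parent = pick(ranked)                          # select
            if rng.random() < crossover_rate:
                child = crossover(parent, pick(ranked), rng)
            else:
                child = mutate(parent, rng)
            next_gen.append(child)
        pop = next_gen                                     # next gen
    return max(pop, key=fitness), history


def one_point(a: Genome, b: Genome, rng: random.Random) -> Genome:
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def flip_bit(a: Genome, rng: random.Random) -> Genome:
    i = rng.randrange(len(a))
    return a[:i] + [1 - a[i]] + a[i + 1:]


best, history = evolve(
    lambda rng: [rng.randint(0, 1) for _ in range(12)],
    sum, one_point, flip_bit, population_size=8, seed=42,
)
```

Because the elites are copied unchanged, the best fitness recorded in history can never decrease from one generation to the next.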

__init__(model, create_proc_fn, query_fn, schema_json_fn, validate_fn, repair_fn, scorer=None, cfg=GAConfig(population_size=5, elitism=2, crossover_rate=0.7, mutation_rate=0.3, max_generations=10, tournament_k=3, seed=None, random_immigrant_rate=0.1), rng=None)

Initialize the GA with model/context functions and configuration.

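Parameters:
  • model

  • create_proc_fn

  • query_fn

  • schema_json_fn

  • validate_fn

  • repair_fn

  • scorer

  • cfg

  • rng
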
Return type:

None

initialize_population(task_description)

Generate the initial population by repeatedly calling _generate_one.

Return type:

List[Individual]

Returns:

A list of Individual objects with proc populated.

Parameters:

task_description (str)

evaluate(pop, scorer=None, **kwargs)

Compute fitness for every individual in‑place using a scorer.

Parameters:
  • pop (List[Individual]) – Population to evaluate.

  • scorer (Optional[Scorer]) – Optional override; must implement score(individual) -> float.

  • kwargs (Any)

Return type:

None
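
A minimal sketch of in-place evaluation with an optional scorer override. The stand-in classes and the evaluate_in_place helper are hypothetical; the point is only that fitness is written onto each individual, nothing is returned, and an explicitly passed scorer takes precedence over the default one.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class IndividualSketch:
    """Stand-in for Individual; only the fields needed here."""
    proc: Dict[str, Any]
    fitness: Optional[float] = None


class LengthScorer:
    """Toy default scorer: fitness is the number of steps."""
    def score(self, ind: Any, **kwargs: Any) -> float:
        return float(len(ind.proc.get("steps", [])))


class ConstantScorer:
    """Toy override scorer: every individual gets the same fitness."""
    def score(self, ind: Any, **kwargs: Any) -> float:
        return 1.0


def evaluate_in_place(pop: List[IndividualSketch], default_scorer: Any,
                      scorer: Optional[Any] = None, **kwargs: Any) -> None:
    """Write a fitness onto every individual; nothing is returned."""
    active = scorer if scorer is not None else default_scorer
    for ind in pop:
        ind.fitness = float(active.score(ind, **kwargs))


pop = [IndividualSketch({"steps": [1, 2]}), IndividualSketch({"steps": []})]

evaluate_in_place(pop, LengthScorer())
default_fitnesses = [ind.fitness for ind in pop]    # [2.0, 0.0]

evaluate_in_place(pop, LengthScorer(), scorer=ConstantScorer())
override_fitnesses = [ind.fitness for ind in pop]   # [1.0, 1.0]
```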

run(task_description, final_answer_schema=None, eval_fn=None, run_steps_fn=None, print_progress=False)

Execute the full GA loop and return the best individual plus the history of elites.

If final_answer_schema, eval_fn, and run_steps_fn are all provided, the GA uses task‑eval scoring for that run; otherwise it uses the structural hygiene scorer.

Parameters:
  • task_description (str) – Natural‑language problem the procedures should solve.

  • final_answer_schema (Optional[Dict[str, Any]]) – JSON schema for the final step (required for TaskEval scoring).

  • eval_fn (Optional[Callable[[Dict[str, Any], Dict[str, Any]], float]]) – Callable (state, proc) -> float that grades an executed procedure.

  • run_steps_fn (Optional[Callable[..., Dict[str, Any]]]) – Callable that executes a procedure and returns the final state dict.

  • print_progress (bool) – If True, prints generation‑level fitness summaries.

Return type:

Tuple[Individual, List[Individual]]

Returns:

A tuple of (best_individual, elites_history).
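
A hedged sketch of the three task-eval hooks and the all-or-nothing switch described above. The {"value": ...} shape of state["final_answer"], the arbitrary expected value 42, the stub executor's signature, and the helper pick_scoring_mode are assumptions for illustration; a real run_steps_fn would execute each step through the LLM.

```python
from typing import Any, Callable, Dict, Optional

State = Dict[str, Any]
Proc = Dict[str, Any]

# Hypothetical JSON schema for the final step's structured answer.
final_answer_schema: Dict[str, Any] = {
    "type": "object",
    "properties": {"value": {"type": "number"}},
    "required": ["value"],
}


def make_exact_match_eval(expected: float) -> Callable[[State, Proc], float]:
    """Build an eval_fn(state, proc) -> float that grades by exact match."""
    def eval_fn(state: State, proc: Proc) -> float:
        answer = state.get("final_answer")
        if not isinstance(answer, dict):
            return 0.0
        return 1.0 if answer.get("value") == expected else 0.0
    return eval_fn


def fake_run_steps(proc: Proc, problem_text: str, **kwargs: Any) -> State:
    """Stub executor returning a fixed final state (no LLM involved)."""
    return {"problem_text": problem_text, "final_answer": {"value": 42}}


def pick_scoring_mode(schema: Optional[Dict[str, Any]],
                      eval_fn: Optional[Callable[..., float]],
                      run_steps_fn: Optional[Callable[..., State]]) -> str:
    """Task-eval scoring needs all three hooks; otherwise fall back."""
    hooks = (schema, eval_fn, run_steps_fn)
    return "task_eval" if all(h is not None for h in hooks) else "structural"


eval_fn = make_exact_match_eval(42)
state = fake_run_steps({"steps": []}, "toy problem")
print(eval_fn(state, {"steps": []}))                                # prints 1.0
print(pick_scoring_mode(None, eval_fn, fake_run_steps))             # prints structural
print(pick_scoring_mode(final_answer_schema, eval_fn, fake_run_steps))  # prints task_eval
```

Omitting any one of the three hooks silently selects structural scoring, so it is worth confirming that all three are passed when task accuracy is what you want to optimize.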