evoproc.scorers

Scorers for procedure evolution.

  • StructuralHygieneScorer: scores a procedure JSON using validators + structure heuristics.

  • TaskEvalScorer: runs a procedure end-to-end and scores with a user-provided eval_fn.

  • ProcScorerAdapter: adapts any “proc scorer” (that scores JSON) to GA-style “individual” scoring.

Usage with GA

from src.scoring import StructuralHygieneScorer, ProcScorerAdapter from src.validators import validate_procedure_structured

proc_scorer = StructuralHygieneScorer(validate_fn=validate_procedure_structured) scorer = ProcScorerAdapter(proc_scorer) # GA expects scorer.score(individual)

Classes

HasProc

Minimal protocol for GA 'individual'-like objects.

ProcScorer

Scores a raw procedure JSON (not an Individual).

ProcScorerAdapter

Adapter that lets a procedure-level scorer be used by GA code that calls scorer.score(individual).

Scorer

GA-facing scorer protocol.

StructuralHygieneScorer

Structural hygiene scorer for global-state procedures (scores procedure JSON).

TaskEvalScorer

Execute a procedure (via run_steps_fn) and grade with a user-provided eval_fn(state, proc) -> float.

class evoproc.scorers.HasProc[source]

Bases: Protocol

Minimal protocol for GA ‘individual’-like objects.

proc: Dict[str, Any]
__init__(*args, **kwargs)
class evoproc.scorers.Scorer[source]

Bases: Protocol

GA-facing scorer protocol.

score(ind, **kwargs)[source]
Return type:

float

Parameters:
__init__(*args, **kwargs)
class evoproc.scorers.ProcScorer[source]

Bases: Protocol

Scores a raw procedure JSON (not an Individual).

score_proc(p)[source]
Return type:

float

Parameters:

p (Dict[str, Any])

__init__(*args, **kwargs)
class evoproc.scorers.ProcScorerAdapter[source]

Bases: Scorer

Adapter that lets a procedure-level scorer be used by GA code that calls scorer.score(individual).

__init__(proc_scorer)[source]
Parameters:

proc_scorer (ProcScorer)

Return type:

None

score(ind, **kwargs)[source]
Return type:

float

Parameters:
class evoproc.scorers.StructuralHygieneScorer[source]

Bases: object

Structural hygiene scorer for global-state procedures (scores procedure JSON).

Components (higher is better; starts at base):
  • Validator penalties:
    • fatal diagnostics: -w_fatal each

    • repairable diags: -w_repair each

  • Redefinition penalty: -w_redefine * (# vars redefined)

  • Unused outputs penalty: -w_unused * (# unused outputs)

  • Soft length cap: -w_len * sigmoid(max(0, n_steps - target_steps))

  • Extraction-first reward:+w_extract if step 1 looks like an extraction

__init__(validate_fn, *, base=1.0, w_fatal=1.0, w_repair=0.2, w_redefine=0.25, w_unused=0.25, w_len=0.3, target_steps=6, w_extract=0.25)[source]
Parameters:
Return type:

None

score_proc(p)[source]

Return a scalar fitness for a procedure JSON.

Return type:

float

Parameters:

p (Dict[str, Any])

class evoproc.scorers.TaskEvalScorer[source]

Bases: Scorer

Execute a procedure (via run_steps_fn) and grade with a user-provided eval_fn(state, proc) -> float. Expects GA to call score(individual).

Parameters:
  • run_steps_fn (Callable[[Dict[str, Any], str, Dict[str, Any], str], Dict[str, Any]]) – Callable that executes the procedure over the question with the given schema/model and returns a final state dict (e.g., your run_steps).

  • eval_fn (Callable[[Dict[str, Any], Dict[str, Any]], float]) – Callable (state, proc) -> float returning a scalar fitness.

  • question (str) – The task prompt to run.

  • final_answer_schema (Dict[str, Any]) – JSON schema passed to the last step runner.

  • model (str) – LLM name for execution.

  • strict_require_key (Optional[str]) – If set, returns -1.0 when the key is missing from state.

__init__(run_steps_fn, eval_fn, question, final_answer_schema, model, strict_require_key=None)[source]
Parameters:
Return type:

None

score(ind, **kwargs)[source]
Return type:

float

Parameters: