Pipeline transformability

This chapter explains how pipelines and Stage-2 pipelines compose, why the seed-rooted Stage-2 form has its own sugar surface even though it shares the Stage2Pipeline<Base, L> type, and how the transformation layers fit together.

The core mechanics are in scope for every user; the two appendices at the end are flagged interested user only and cover the hard design problems that were worked through to arrive at the current shape. Most readers can stop at The two stages at a glance.

The two stages at a glance

A pipeline has a Stage 1 (base slots — coalgebraic form) and a Stage 2 (stacked lifts — algebraic transformation form). A .lift() call moves a pipeline from Stage 1 to Stage 2.

Two transformation vocabularies:

Stage 1 (reshape): rewrites base slots in place. A SeedPipeline’s .filter_seeds, .wrap_grow, .map_node_bi, .map_seed_bi all produce another SeedPipeline of (possibly different) type parameters. Cheap; no lift chain involved.
Stage 2 (lift-chain compose): appends a library lift to the chain. .wrap_init, .zipmap, .map_r_bi, .map_n_bi, .filter_edges, .memoize_by, .explain, .wrap_accumulate, .wrap_finalize — each one delegates internally to then_lift(Domain::xxx_lift(...)).

The same sugar name may appear at both stages with different semantics. map_node_bi at Stage 1 is a reshape (new Stage-1 pipeline); map_n_bi at Stage 2 composes a ShapeLift onto the chain. Distinct names make the stage unambiguous at the call site.

The Lift trait — the single transformation primitive

Every Stage-2 sugar ultimately builds a value implementing Lift<D, N, H, R>:

#![allow(unused)]
fn main() {
/// Domain-generic transformer over the `(treeish, fold)` pair.
///
/// A `Lift` rewrites the graph side and/or the fold side, possibly
/// changing their carrier types, and hands the result to a
/// continuation. The caller's continuation-return type `T` flows
/// through, so the chain of output types stays inferred across
/// composition (`ComposedLift<L1, L2>`).
///
/// Grow is deliberately absent from this signature. Only the Seed
/// finishing lift ([`SeedLift`](super::SeedLift)) needs a grow
/// input; it is composed internally by
/// `hylic_pipeline::PipelineExecSeed::run` and does not travel as
/// a 3-slot signature through the `Lift` trait.
///
/// See [Lifts](https://hylic.balcony.codes/concepts/lifts.html).
pub trait Lift<D, N, H, R>
where D: Domain<N> + Domain<Self::N2>,
      N: Clone + 'static, H: Clone + 'static, R: Clone + 'static,
{
    /// Output node type after the lift has been applied.
    type N2:   Clone + 'static;
    /// Output heap type after the lift has been applied.
    type MapH: Clone + 'static;
    /// Output result type after the lift has been applied.
    type MapR: Clone + 'static;

    /// Apply the lift to `(treeish, fold)` and invoke `cont` with
    /// the transformed pair.
    fn apply<T>(
        &self,
        treeish: <D as Domain<N>>::Graph<N>,
        fold:    <D as Domain<N>>::Fold<H, R>,
        cont: impl FnOnce(
            <D as Domain<Self::N2>>::Graph<Self::N2>,
            <D as Domain<Self::N2>>::Fold<Self::MapH, Self::MapR>,
        ) -> T,
    ) -> T;
}
}

A Lift takes a (treeish, fold) pair over (N, H, R) and produces another over (N2, MapH, MapR). The library ships four atoms:

IdentityLift — pass-through; the seed of every chain.
ComposedLift<L1, L2> — sequential composition, L1’s outputs feeding L2’s inputs.
ShapeLift — the universal store-three-xforms lift that every library sugar instantiates (one per axis: treeish, fold, plus N-type).
SeedLift — the finishing lift that closes the seed axis (assembled at run time; not user-constructed in the common path).

A chain is a right-associated tree of ComposedLift values rooted at IdentityLift; every .then_lift(L) or sugar call wraps the current tip in ComposedLift<current, L>. Each lift specifies how it rewrites the pair via its apply method, which uses CPS so the caller’s continuation-return type threads through composition.

See Lifts — cross-axis transforms for the full catalogue and the atom-level reference.

One Stage-2 type, two Base configurations

Stage 2 has a single pipeline type — Stage2Pipeline<Base, L> — but its sugar surface bifurcates by Base because the chain L operates over a different node type in each configuration.

Treeish-rooted: `Stage2Pipeline<TreeishPipeline<…>, L>`

When the base is a TreeishPipeline (or another treeish-rooted Stage2Pipeline being extended), the chain L: Lift<D, N, H, R> operates over the base’s N directly. All sugars come from the blanket trait Stage2SugarsShared / Stage2SugarsLocal.

Seed-rooted: `Stage2Pipeline<SeedPipeline<…>, L>`

When the base is a SeedPipeline, SeedLift is assembled at .run() time from the base’s grow plus the caller-supplied root_seeds and entry_heap, and composed as the first lift in the run-time chain. SeedLift’s output node type is SeedNode<N>, so every lift in the stored chain L operates over SeedNode<N> — not the user’s N.

Even though the chain runs over SeedNode<N>, Stage-2 sugars on this form take user closures over &N. The translation is done by Wrap dispatch: one Stage2SugarsShared / Stage2SugarsLocal trait covers both Bases, peeling &SeedNode<N> to &N on the seed-rooted side and acting as a pass-through on the treeish-rooted side. EntryRoot defaults are encoded in the per-sugar peel routines (EntryRoot passes through wrap_init, filter_edges always admits its children, etc.) — see the table later in this chapter for the full set.

The seed-pipeline-unification (one cycle ago) collapsed the two historical struct types LiftedPipeline and LiftedSeedPipeline into the single Stage2Pipeline<Base, L>, and reduced the parallel-but-separate sugar catalogues into one Wrap-dispatched trait per domain. The old type names no longer exist.

The follow-on stage2-base-run-unification cycle did the same operation on the run path: a single Stage2Pipeline::run_with_inputs body lives in stage2/run/mod.rs, generic over Base: Stage2Base. The base contributes a RunInputs<'i, CurN> GAT, a PreLift lift, and a provide_run_essentials callback that builds the pre-chain lift from inputs and yields the descend-root reference at the post-chain type. The two former per-domain run files (stage2/run/seed_shared.rs, stage2/run/seed_local.rs) were retired; the per-domain residue lives in seed/stage2_base_*.rs, where the Domain<N>::Grow<Seed, N> GAT non-normalisation forces a per-domain pinning of the SeedLift constructor.

Run composition: how `.run()` actually produces a result

For a treeish-rooted Stage2Pipeline<TreeishPipeline<…>, L>:

1. Base = TreeishPipeline: yield (treeish, fold) over (N, H, R).
2. Chain L applied: (treeish, fold) over (N, H, R)
                  → (treeish', fold') over (N2, MapH, MapR).
3. Executor: run(&fold', &treeish', &root) → MapR.

For a seed-rooted Stage2Pipeline<SeedPipeline<…>, L>:

1. Base = SeedPipeline: hold (grow, seeds_from_node, fold) over (N, Seed, H, R).
   At .run time the user adds (root_seeds, entry_heap: H).

2. Fuse: treeish_base = seeds_from_node.map(grow)   — over N.

3. SeedLift::apply (the first, innermost lift):
   (treeish_base, fold_base) over (N, H, R)
   → (treeish_lifted, fold_lifted) over (SeedNode<N>, H, R).

4. Chain L applied: (treeish_lifted, fold_lifted) over (SeedNode<N>, H, R)
                  → (treeish', fold') over (SeedNode<N2>, MapH, MapR).

5. Executor: run(&fold', &treeish', &SeedNode::entry_root()) → MapR.

The critical design point in step 3 is that SeedLift is applied inside the chain, not as an outer wrap. That’s what lets the user chain L transform the fold and treeish in ways that would otherwise depend on the seed-closing step having already happened. See Appendix A for the history of this choice.

Transformation variance at a glance

Each axis in the (N, H, R) triple has a distinct variance:

Axis	Role in fold	Variance	Sugar
N	`init(&N) → H` (contra)	invariant (used in both fold and graph)	`map_node_bi` (S1) / `map_n_bi` (S2) — bijection required
H	`&mut H` (internal)	invariant (never crosses node boundaries)	`wrap_init`, `wrap_accumulate`, `wrap_finalize`
R	`finalize(&H) → R`; `&R` in acc	invariant (appears in and out)	`zipmap`, `map_r_bi`

Invariance on all three is why every N- and R-change sugar requires both a forward and a backward closure. See Transforms and variance for the categorical picture.

Where the abstractions stop

One deliberate asymmetry remains in the current surface:

Auto-lift on TreeishPipeline but not on SeedPipeline. tp.wrap_init(w) works directly (the Stage-1 → Stage-2 transition is implicit); sp.wrap_init(w) is a compile error — write sp.lift().wrap_init(w) explicitly. The asymmetry is intentional: a treeish-rooted Stage-2 chain operates over the same N as the base, so the lift is invisible to the user; a seed-rooted Stage-2 chain operates over SeedNode<N>, and silencing that transition would surface in any lift whose output type mentions N (the explainer being the canonical example). Forcing .lift() makes the row type’s appearance traceable to a single line in the user’s source.

The historical second asymmetry — a parallel inherent-methods catalogue on the seed-rooted form — was eliminated by the seed-pipeline-unification. Stage-2 sugars are now one trait per domain (Stage2SugarsShared / Stage2SugarsLocal), blanket implemented over both Bases, with Wrap doing the per-Base type projection. See Wrap dispatch.

Appendix A — interested user only: seed pipeline, lifting, and run composition

This section is intentionally deeper than the preceding narrative and is safe to skip. It exists for readers who want to know why the current shape is as it is.

The problem the SeedPipeline solves

A hylomorphism fuses a coalgebra (N → children) with an algebra (children → result). In practice the dependency graph often speaks a different type than the algebra: module paths, URLs, database keys — a Seed that must be grown to an N before the algebra can inspect it. A SeedPipeline carries the triple (grow: Seed → N, seeds_from_node: N → Seed*, fold: N → H → R) and fuses seed-axis and node-axis into one treeish at run time.

The fusion is:

seeds_from_node:  N → Seed*            via Edgy<N, Seed>
    .map(grow):   Seed → N            via the domain's grow xform
=>  treeish:      N → N*               via Edgy<N, N> ≡ Treeish<N>

The result is a Treeish<N> that the executor can walk. But before walking, execution needs a root. The user supplies entry seeds (root_seeds: Edgy<(), Seed>) and an initial heap (entry_heap: H); these must be turned into (a) a starting node the executor can descend from, and (b) a top-level accumulation protocol for the children’s results.

The EntryRoot-as-node compromise

The executor’s run method takes a single root: run(fold, treeish, &N) → R. To handle a forest of entry seeds under a single-root executor, the library invents a synthetic root row:

#![allow(unused)]
fn main() {
/// Opaque row type in a seed-closed chain's treeish. Values are
/// either the synthetic `EntryRoot` row (seed fan-out) or a resolved
/// `Node(N)`. User code inspects via [`is_entry_root`](Self::is_entry_root),
/// [`as_node`](Self::as_node), [`into_node`](Self::into_node), and
/// [`map_node`](Self::map_node); the variants are sealed.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct SeedNode<N> {
    // Exposed `pub` (not `pub(crate)`) so the doc-hidden
    // `seed_node_internal` module can re-export it for
    // `hylic-pipeline`'s dispatch. User code should treat this field
    // as opaque and use `is_entry_root` / `as_node` / `map_node`.
    #[doc(hidden)]
    pub inner: SeedNodeInner<N>,
}

/// Library-internal variant carrier for `SeedNode<N>`. Exposed
/// `pub` only to make crate-external re-export through the
/// `seed_node_internal` doc-hidden module possible. User code
/// should never name this directly.
#[doc(hidden)]
#[derive(Clone, PartialEq, Eq, Hash)]
pub enum SeedNodeInner<N> {
    EntryRoot,
    Node(N),
}
}

SeedLift wraps the treeish so that:

SeedNode::EntryRoot.visit fans out to SeedNode::Node(grow(s)) for each entry seed.
SeedNode::Node(n).visit delegates to the base treeish.

And wraps the fold so that:

init(SeedNode::EntryRoot) = entry_heap_fn() — returns the user’s entry_heap.
init(SeedNode::Node(n)) = base.init(n).
accumulate / finalize are uniform.

The executor then begins at &SeedNode::entry_root() and walks normally. At the value level the EntryRoot row participates in the fold like any other node — it receives children’s R via accumulate, has its own finalize, and produces the final R.

This is a compromise, not the most principled shape: a native-forest executor (one that accepts run_forest(fold, treeish, roots: &[N], initial_heap: H) → R directly) would eliminate SeedNode<N> entirely from the chain’s node type and strip the leak from user-visible result types. The refactor cost is significant (touches the Executor trait, every executor impl, and the accumulation protocol) and was deferred. See Sealed SeedNode for how the current shape is sealed at the user surface.

Why SeedLift is composed first (not last)

The library considered two architectures for where SeedLift sits relative to the stored user chain L.

Option A (rejected). SeedLift as the outermost lift, wrapped around the user’s chain. The user’s chain operates over plain N; SeedLift wraps the result to introduce SeedNode<N> at the outside.

Under this arrangement, N-changing lifts inside the user’s chain would produce an N2 that needs to be re-introduced as the inner node type that SeedLift’s grow-output is wrapped as. Because the Lift trait cannot, in general, surface both a forward and backward map over N (variance is already invariant on N), Option A would force L::N2 = Base::N — the chain could not change N. For N-change to work on the seed path, this invariance had to be broken.

Option B (shipped). SeedLift as the innermost lift, composed first at run time. The user’s chain operates over SeedNode<N> from .lift() onward. N-changing lifts inside the chain change SeedNode<N> → SeedNode<N2>, which is natural — the chain sees EntryRoot as part of the structure and any N-transform that preserves EntryRoot works.

The cost Option B pays is the SeedNode<N> leak into chain-tip types (visible in ExplainerResult<SeedNode<N>, H, R>). That cost is bounded: the sugar layer hides the variant in user closures (EntryRoot is auto-routed), and SeedExplainerResult (via From) projects the trace to N-typed when the user wants a sealed view.

Why SeedLift is assembled at run time, not at `.lift()`

SeedLift needs three ingredients: grow (from the base), root_seeds (from the caller of .run), and entry_heap (from the caller of .run). Only the first is available at .lift() time. The library has two places to put the root_seeds and entry_heap:

(a) as parameters to .lift(), turning it into a real constructor — which then requires knowing the entry data before any Stage-2 sugars are composed.

(b) as parameters to .run(), letting Stage-2 chains be composed and reused across different (seeds, h0) inputs.

The library chose (b): a seed-rooted Stage2Pipeline is a reusable computation; seeds + initial heap vary per call. This lets patterns like:

let lsp = pipe.lift().wrap_init(w).zipmap(m);
let r1 = lsp.run_from_slice(&exec, &seeds1, h0);
let r2 = lsp.run_from_slice(&exec, &seeds2, h0);

work without reconstructing the chain.

The default semantics of EntryRoot-dispatch in sugars

Each user-closure sugar on the seed-rooted Stage2Pipeline makes a per-sugar decision about EntryRoot:

Sugar	EntryRoot behaviour
`wrap_init(w)`	EntryRoot bypasses `w`; original init (returns `entry_heap`) runs
`wrap_accumulate(w)`	applied uniformly (no N-signature to dispatch on)
`wrap_finalize(w)`	applied uniformly
`filter_edges(p)`	EntryRoot always admits its children; `p(&CurN)` applied to Nodes
`memoize_by(k)`	EntryRoot uncached (keyed `None`); Nodes keyed `Some(k(n))`
`zipmap(m)` / `map_r_bi`(fwd,bwd)	applied uniformly
`map_n_bi(co, contra)`	EntryRoot → EntryRoot; Node(n) ↔ Node(f(n))
`explain()`	EntryRoot is a fold row with its own trace

Users who need different defaults — e.g., filter_edges that excludes EntryRoot’s fan-out — use .then_lift(Domain::xxx_lift::<SeedNode<CurN>, …>(pred)) directly. The sugars hide the common case; the raw surface remains available for specialisation.

Appendix B — interested user only: the typing front

Why `Lift::apply` uses CPS

A direct apply signature would return the transformed pair (Graph<D, N2>, Fold<D, N2, H2, R2>). Composition stacks these returns:

fn apply(self) -> (Graph<D, N_k>, Fold<D, N_k, H_k, R_k>)

After three chained lifts, N_k, H_k, R_k are all associated types of a ComposedLift<ComposedLift<ComposedLift<…>, …>, …>. No single named alias exists for the final return type before the whole chain is constructed, and Rust’s inference cannot thread the unnamed types through composition.

The CPS form threads the caller’s T outward:

fn apply<T>(
    &self,
    treeish: Graph<D, N>,
    fold:    Fold<D, N, H, R>,
    cont: impl FnOnce(Graph<D, N2>, Fold<D, N2, H2, R2>) -> T,
) -> T;

Whatever type the caller’s closure produces at the innermost apply (usually the executor’s MapR) flows back through every enclosing apply call. Rust infers the intermediate pair types at each junction without needing a named alias.

GAT normalisation helpers

The Domain<N> trait exposes three GATs:

type Fold<H, R>;
type Graph<E>;
type Grow<Seed, NOut>;

For Shared: Grow<Seed, N> = Arc<dyn Fn(&Seed) -> N + Send + Sync>. For Local: Grow<Seed, N> = Rc<dyn Fn(&Seed) -> N>.

Inside a generic impl body parameterised over some other type (say N), Rust’s trait solver does not reduce <Shared as Domain<N>>::Grow<Seed, N> to Arc<dyn Fn…>. The types compare equal by definition, but the solver doesn’t do the reduction unless Self = Shared is pinned in the enclosing scope.

The library works around this via free helper functions that pin the domain:

fn shared_grow_as_arc<Seed, NOut>(
    g: <Shared as Domain<NOut>>::Grow<Seed, NOut>,
) -> Arc<dyn Fn(&Seed) -> NOut + Send + Sync> { g }

Inside this function’s body, Self = Shared is the only option and the GAT normalises. The function is a no-op at runtime (the same value, type-coerced in), but makes the type checker happy in a context that was about to timeout.

stage2/run/gat_helpers.rs collects the helpers (a Shared set and a Local set), one per GAT crossing point.

The trait-twin Shared/Local pattern

Every sugar file has a _shared.rs and a _local.rs version. Bodies are line-for-line identical; only Arc vs Rc, and Send + Sync vs nothing, differ.

Why `SeedNode<N>` cannot be fully hidden in chain-tip types

Lift’s associated type N2 is the chain’s output node type, which appears in every type parameter of every chain-tip result the user sees. On the seed path, SeedLift sets N2 = SeedNode<N>; any later lift that preserves N preserves SeedNode<N>; any later lift that changes N via map_n_bi produces SeedNode<N2>. Concretely: ExplainerResult’s first type parameter — the per-node “heap.node” field — ends up being SeedNode<N>.

Hiding this would require either:

A projection pre-baked into each lift’s output (each lift carries a “strip SeedNode” step). This would require the library to know at composition time whether the input was a SeedPipeline — a structural property the Lift trait doesn’t expose.
A seal at the value-variant level (SeedNode’s variants become non-matchable). The current design does this: variants are pub(crate), user code inspects via is_entry_root, as_node, map_node. The type name still appears in chain-tip result types, but the enum-nature is sealed.

The library ships (2) plus a projection helper (SeedExplainerResult::from) for users who want the type-name seal on the explainer result. (1) would be cleaner but requires trait-level machinery that’s not currently on the roadmap.

Unified sugar catalogue across both bases

The chain-bound mismatch between Lift<D, SeedNode<N>, H, R> (seed-rooted) and Lift<D, N, H, R> (treeish-rooted) is bridged by a Wrap dispatch trait:

Wrap is a type-only trait with a GAT Of<UN>. Two impls: Identity::Of<UN> = UN (treeish-rooted), SeedWrap::Of<UN> = SeedNode<UN> (seed-rooted).
Stage2Base is implemented by every Stage-1 base; it carries type Wrap: Wrap so each base names its own projection.
Stage2SugarsShared / Stage2SugarsLocal are the unified sugar traits, blanket-implemented on every Stage2Pipeline<Base, L> where Base: Stage2Base and L: Lift<D, <Base::Wrap as Wrap>::Of<UN>, …>. Each sugar has one canonical body; the per-Base build closure is provided by a WrapShared / WrapLocal impl that peels &SeedNode<UN> to &UN on the seed-rooted side.

See Wrap dispatch for the implementation.

Constraints carried by the typing

The CPS shape of Lift::apply is what makes the chain composable in Rust without HKT-style trait parameters.
stage2/run/gat_helpers.rs carries one helper per GAT crossing point per domain. Zero-cost at runtime; refreshes whenever rustc’s GAT normalisation gets stronger.
The _shared.rs / _local.rs sugar twin files keep both domains readable without macros; the bodies stay line-for- line identical.
SeedNode<N> is sealed but visible in chain-tip types whose lift output mentions the chain’s N. The SeedExplainerResult::from projection lifts the row out of the explainer’s tip type for callers that prefer not to see it.

Keyboard shortcuts

hylic