Project System

The project system is the central abstraction in Forgather. A Project resolves a configuration file through the template inheritance chain and provides access to all configured components.

Quick Example

from forgather.project import Project

proj = Project("train_tiny_llama.yaml")

# Materialize the full training script
training_script = proj()

# Materialize individual components
model_factory = proj("model")
train_dataset = proj("train_dataset")

model = model_factory()

forgather.project.Project dataclass

Central user-facing abstraction for a Forgather ML experiment.

A Project loads a YAML configuration file through a Jinja2 template inheritance chain, parses it into a node graph, and can materialise any named target from that graph into live Python objects. It is the primary entry point for interactive experiment development and for training scripts.

Parameters:

Name Type Description Default
config_name str

Name of the configuration template to load (e.g. "train_tiny_llama.yaml"). An empty string or None loads the project's default configuration as declared in meta.yaml.

''
project_dir str or PathLike

Path to the project directory. Must contain a meta.yaml file and a templates/ sub-directory. Defaults to the current working directory.

'.'
**kwargs

Additional keyword arguments forwarded to the Jinja2 preprocessor as template variables.

{}
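
The fallback rule for config_name described above can be sketched in isolation. This is a minimal stand-alone illustration, not Forgather's code: `resolve_config_name` is a hypothetical helper, and `"default.yaml"` stands in for whatever default meta.yaml declares.

```python
from typing import Optional


def resolve_config_name(config_name: Optional[str], default_config: str) -> str:
    """Return config_name, or default_config when config_name is None or empty."""
    if not config_name:
        return default_config
    return config_name


# "default.yaml" is a placeholder for the default declared in meta.yaml.
print(resolve_config_name(None, "default.yaml"))
print(resolve_config_name("", "default.yaml"))
print(resolve_config_name("train_tiny_llama.yaml", "default.yaml"))
```

Both None and the empty string select the default, so callers can pass through an optional command-line argument without checking it first.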

Attributes:

Name Type Description
config_name str

Name of the selected configuration; automatically set to the project default when config_name is empty or None.

project_dir str

Absolute path to the project directory.

meta MetaConfig

Parsed project metadata (search paths, default config, etc.).

environment ConfigEnvironment

Jinja2 + YAML preprocessing environment used to load templates.

config Any

The parsed node graph produced from the preprocessed YAML. None when no configuration has been loaded yet.

pp_config str

The fully preprocessed YAML text (after Jinja2 rendering), useful for debugging template issues.

Examples:

Load a project from the current directory using the default configuration:

>>> proj = Project()
>>> training_script = proj()

Load a specific configuration and materialise individual targets:

>>> proj = Project("train_tiny_llama.yaml", "examples/tutorials/tiny_llama")
>>> model = proj("model")
>>> model, tokenizer = proj("model", "tokenizer")

Notes

When debugging a configuration, it is usually easier to construct the project incrementally, because each step then produces its own diagnostic messages. See project_config.ipynb for a step-by-step notebook example.

Source code in src/forgather/project.py
@dataclass()
class Project:
    """Central user-facing abstraction for a Forgather ML experiment.

    A ``Project`` loads a YAML configuration file through a Jinja2 template
    inheritance chain, parses it into a node graph, and can materialise any
    named target from that graph into live Python objects.  It is the primary
    entry point for interactive experiment development and for training scripts.

    Parameters
    ----------
    config_name : str, optional
        Name of the configuration template to load (e.g. ``"train_tiny_llama.yaml"``).
        An empty string or ``None`` loads the project's default configuration as
        declared in ``meta.yaml``.
    project_dir : str or os.PathLike, optional
        Path to the project directory.  Must contain a ``meta.yaml`` file and a
        ``templates/`` sub-directory.  Defaults to the current working directory.
    **kwargs
        Additional keyword arguments forwarded to the Jinja2 preprocessor as
        template variables.

    Attributes
    ----------
    config_name : str
        Name of the selected configuration; automatically set to the project
        default when *config_name* is empty or ``None``.
    project_dir : str
        Absolute path to the project directory.
    meta : MetaConfig
        Parsed project metadata (search paths, default config, etc.).
    environment : ConfigEnvironment
        Jinja2 + YAML preprocessing environment used to load templates.
    config : Any
        The parsed node graph produced from the preprocessed YAML.  ``None``
        when no configuration has been loaded yet.
    pp_config : str
        The fully preprocessed YAML text (after Jinja2 rendering), useful for
        debugging template issues.

    Examples
    --------
    Load a project from the current directory using the default configuration:

    >>> proj = Project()
    >>> training_script = proj()

    Load a specific configuration and materialise individual targets:

    >>> proj = Project("train_tiny_llama.yaml", "examples/tutorials/tiny_llama")
    >>> model = proj("model")
    >>> model, tokenizer = proj("model", "tokenizer")

    Notes
    -----
    When debugging a configuration it is usually easier to construct the project
    incrementally for better diagnostic messages.  See ``project_config.ipynb``
    for a step-by-step notebook example.
    """

    config_name: str
    project_dir: str
    meta: MetaConfig
    environment: ConfigEnvironment
    config: Any
    pp_config: str

    def __init__(
        self,
        config_name: Optional[str] = "",
        project_dir: Optional[str | os.PathLike] = ".",
        **kwargs,
    ):
        assert os.path.exists(
            project_dir
        ), f"The directory, '{project_dir}', does not exist."
        assert os.path.isdir(project_dir), f"'{project_dir}' is not a directory."

        self.project_dir = os.path.abspath(project_dir)

        # Load project meta-data
        self.meta = MetaConfig(self.project_dir)

        # Get the default configuration
        default_config = self.meta.default_config()
        if config_name is None:
            config_name = ""
        self.config_name = config_name if len(config_name) else default_config

        # Construct a project environment
        self.environment = ConfigEnvironment(
            searchpath=self.meta.searchpath,
            global_vars=preprocessor_globals(project_dir, self.meta.workspace_root),
        )

        if config_name is not None:
            self.load_config(config_name, **kwargs)
        else:
            self.config = None
            self.pp_config = None

    def load_config(self, config_name: str, **kwargs):
        """Load and parse the named configuration template.

        Preprocesses the template through the Jinja2 environment, then parses
        the resulting YAML into a node graph.  The results are stored in
        ``self.config`` and ``self.pp_config``.

        Parameters
        ----------
        config_name : str
            Name of the configuration template to load, relative to the
            project's ``config_prefix`` directory (e.g. ``"train.yaml"``).
        **kwargs
            Additional keyword arguments forwarded to the Jinja2 preprocessor
            as template variables.
        """
        # Load the pre-processed config and the config graph
        self.config, self.pp_config = self.environment.load(
            self.meta.config_path(config_name), **kwargs
        ).get()

    def add_template(self, name, data):
        """Add an in-memory template definition to the Jinja2 loader.

        Parameters
        ----------
        name : str
            Template name used to reference this template from other templates
            (e.g. via ``-- extends`` or ``-- include``).
        data : str
            Raw template source text.
        """
        loader = self.environment.get_loader()
        loader.add_template(name, data)

    def __call__(self, *args, asdict=False, **kwargs):
        """Materialise one or more targets from the loaded configuration graph.

        Each call traverses the node graph and constructs fresh Python objects
        for the requested targets.  Calling this method multiple times will
        produce independent object instances; share a single call when you need
        objects that reference each other (e.g. model and optimizer sharing the
        same parameter tensors).

        Parameters
        ----------
        *args : str
            Names of the output targets to build.  When called with no
            arguments (or with a single empty string), the ``"main"`` target is
            built.  When multiple names are given, a generator that yields the
            corresponding objects in the same order is returned.
        asdict : bool, optional
            When ``True`` the return value is always a :class:`~forgather.dotdict.DotDict`
            mapping target names to their materialised objects, regardless of
            how many targets were requested.  Default is ``False``.
        **kwargs
            Additional context variables forwarded to the graph materialisation
            engine.

        Returns
        -------
        object
            The materialised ``"main"`` target when called with no arguments.
        object
            The single materialised target when exactly one name is given and
            *asdict* is ``False``.
        generator
            A generator yielding the materialised targets in order when multiple
            names are given and *asdict* is ``False``.
        DotDict
            A dot-accessible dictionary mapping every requested target name to
            its materialised object when *asdict* is ``True``.

        Raises
        ------
        RuntimeError
            If no configuration has been loaded (i.e. ``self.config`` is ``None``).

        Examples
        --------
        >>> proj = Project("train.yaml")

        Build the default ``main`` target:

        >>> training_script = proj()

        Build a single named target:

        >>> model = proj("model")

        Unpack multiple targets in one call (avoids duplicate construction):

        >>> model, tokenizer = proj("model", "tokenizer")

        Collect targets into a dot-accessible dict:

        >>> outputs = proj("model", "tokenizer", asdict=True)
        >>> outputs.model
        """

        if self.config is None:
            raise RuntimeError("The project does not have a loaded configuration")

        if len(args) == 0 or args[0] == "":
            mtargets = ("main",)
        elif isinstance(args[0], list):
            # Preserve legacy interface for now.
            asdict = True
            mtargets = args[0]
        else:
            mtargets = args

        kwargs |= dict(pp_config=self.pp_config)
        outputs = Latent.materialize(
            self.config, mtargets=mtargets, context_vars=kwargs
        )

        if asdict:
            return DotDict(outputs)
        if len(mtargets) == 1:
            return outputs[mtargets[0]]
        else:
            return (outputs[key] for key in mtargets)

load_config(config_name, **kwargs)

Load and parse the named configuration template.

Preprocesses the template through the Jinja2 environment, then parses the resulting YAML into a node graph. The results are stored in self.config and self.pp_config.

Parameters:

Name Type Description Default
config_name str

Name of the configuration template to load, relative to the project's config_prefix directory (e.g. "train.yaml").

required
**kwargs

Additional keyword arguments forwarded to the Jinja2 preprocessor as template variables.

{}
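
The two stages described above (preprocess the template, then parse the result) and the two stored values can be sketched with standard-library stand-ins. This is only an illustration of the shape of the pipeline: `string.Template` takes the place of Jinja2, `json.loads` takes the place of the YAML parser, and the free function `load_config` here is hypothetical rather than the method itself.

```python
import json
from string import Template


def load_config(template_text: str, **variables):
    """Render the template, then parse the rendered text.

    Returns (config, pp_config), mirroring the two values the method stores.
    """
    # Stage 1: preprocessing. string.Template is a stand-in for Jinja2.
    pp_config = Template(template_text).substitute(**variables)
    # Stage 2: parsing. json.loads is a stand-in for the YAML parser.
    config = json.loads(pp_config)
    return config, pp_config


source = '{"model": {"hidden_size": $hidden_size}, "lr": $lr}'
config, pp_config = load_config(source, hidden_size=256, lr=0.001)

print(pp_config)                       # the rendered text, useful for debugging
print(config["model"]["hidden_size"])  # the parsed structure
```

Keeping the rendered text next to the parsed result is what makes template problems easy to inspect: when parsing fails or a value looks wrong, the intermediate text shows exactly what the parser received.
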
Source code in src/forgather/project.py
def load_config(self, config_name: str, **kwargs):
    """Load and parse the named configuration template.

    Preprocesses the template through the Jinja2 environment, then parses
    the resulting YAML into a node graph.  The results are stored in
    ``self.config`` and ``self.pp_config``.

    Parameters
    ----------
    config_name : str
        Name of the configuration template to load, relative to the
        project's ``config_prefix`` directory (e.g. ``"train.yaml"``).
    **kwargs
        Additional keyword arguments forwarded to the Jinja2 preprocessor
        as template variables.
    """
    # Load the pre-processed config and the config graph
    self.config, self.pp_config = self.environment.load(
        self.meta.config_path(config_name), **kwargs
    ).get()

add_template(name, data)

Add an in-memory template definition to the Jinja2 loader.

Parameters:

Name Type Description Default
name str

Template name used to reference this template from other templates (e.g. via -- extends or -- include).

required
data str

Raw template source text.

required
Source code in src/forgather/project.py
def add_template(self, name, data):
    """Add an in-memory template definition to the Jinja2 loader.

    Parameters
    ----------
    name : str
        Template name used to reference this template from other templates
        (e.g. via ``-- extends`` or ``-- include``).
    data : str
        Raw template source text.
    """
    loader = self.environment.get_loader()
    loader.add_template(name, data)

__call__(*args, asdict=False, **kwargs)

Materialise one or more targets from the loaded configuration graph.

Each call traverses the node graph and constructs fresh Python objects for the requested targets. Calling this method multiple times will produce independent object instances; share a single call when you need objects that reference each other (e.g. model and optimizer sharing the same parameter tensors).
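
The difference between one call and several can be illustrated with a toy materialiser. This sketch assumes, as the paragraph above suggests, that objects are shared within a single call and rebuilt on the next one; `GRAPH`, `Model`, `Optimizer` and `materialize` are hypothetical names and do not reflect Forgather's internals.

```python
class Model:
    """Placeholder for a constructed model."""


class Optimizer:
    """Placeholder for an optimizer that keeps a reference to its model."""

    def __init__(self, model):
        self.model = model


# Hypothetical graph: each target maps to (constructor, names of dependencies).
GRAPH = {
    "model": (Model, ()),
    "optimizer": (Optimizer, ("model",)),
}


def materialize(*targets):
    """Build the requested targets, sharing nodes within this one call only."""
    built = {}  # per-call cache, discarded when the call returns

    def build(name):
        if name not in built:
            factory, deps = GRAPH[name]
            built[name] = factory(*(build(dep) for dep in deps))
        return built[name]

    return tuple(build(name) for name in targets)


# One call: the optimizer refers to the same model object that is returned.
model, optimizer = materialize("model", "optimizer")
print(optimizer.model is model)  # True

# Two calls: each call builds its own model, so the objects differ.
(model_a,) = materialize("model")
(optimizer_b,) = materialize("optimizer")
print(optimizer_b.model is model_a)  # False
```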

Parameters:

Name Type Description Default
*args str

Names of the output targets to build. When called with no arguments (or with a single empty string), the "main" target is built. When multiple names are given, a generator that yields the corresponding objects in the same order is returned.

()
asdict bool

When True, the return value is always a DotDict (forgather.dotdict.DotDict) mapping target names to their materialised objects, regardless of how many targets were requested. Default is False.

False
**kwargs

Additional context variables forwarded to the graph materialisation engine.

{}

Returns:

Type Description
object

The materialised "main" target when called with no arguments.

object

The single materialised target when exactly one name is given and asdict is False.

generator

A generator yielding the materialised targets in order when multiple names are given and asdict is False.

DotDict

A dot-accessible dictionary mapping every requested target name to its materialised object when asdict is True.
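
The return shapes listed above can be sketched with a small stand-alone function. `select_outputs` is a hypothetical helper operating on already-built objects, and a plain dict stands in for DotDict.

```python
def select_outputs(outputs, targets, asdict=False):
    """Shape already-built outputs the way the Returns table describes."""
    if asdict:
        return dict(outputs)  # a plain dict stands in for DotDict
    if len(targets) == 1:
        return outputs[targets[0]]
    # A generator expression: lazy and single-use.
    return (outputs[name] for name in targets)


built = {"model": "<model>", "tokenizer": "<tokenizer>"}

single = select_outputs(built, ("model",))
model, tokenizer = select_outputs(built, ("model", "tokenizer"))
as_mapping = select_outputs(built, ("model", "tokenizer"), asdict=True)

gen = select_outputs(built, ("model", "tokenizer"))
first_pass = list(gen)
second_pass = list(gen)  # empty: the generator is already exhausted

print(single, model, tokenizer, as_mapping["model"], first_pass, second_pass)
```

Because the multi-target form is a generator, it suits tuple unpacking but can only be consumed once; wrap it in `list(...)` or pass asdict=True if the results need to be indexed or reused.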

Raises:

Type Description
RuntimeError

If no configuration has been loaded (i.e. self.config is None).

Examples:

>>> proj = Project("train.yaml")

Build the default main target:

>>> training_script = proj()

Build a single named target:

>>> model = proj("model")

Unpack multiple targets in one call (avoids duplicate construction):

>>> model, tokenizer = proj("model", "tokenizer")

Collect targets into a dot-accessible dict:

>>> outputs = proj("model", "tokenizer", asdict=True)
>>> outputs.model
Source code in src/forgather/project.py
def __call__(self, *args, asdict=False, **kwargs):
    """Materialise one or more targets from the loaded configuration graph.

    Each call traverses the node graph and constructs fresh Python objects
    for the requested targets.  Calling this method multiple times will
    produce independent object instances; share a single call when you need
    objects that reference each other (e.g. model and optimizer sharing the
    same parameter tensors).

    Parameters
    ----------
    *args : str
        Names of the output targets to build.  When called with no
        arguments (or with a single empty string), the ``"main"`` target is
        built.  When multiple names are given, a generator that yields the
        corresponding objects in the same order is returned.
    asdict : bool, optional
        When ``True`` the return value is always a :class:`~forgather.dotdict.DotDict`
        mapping target names to their materialised objects, regardless of
        how many targets were requested.  Default is ``False``.
    **kwargs
        Additional context variables forwarded to the graph materialisation
        engine.

    Returns
    -------
    object
        The materialised ``"main"`` target when called with no arguments.
    object
        The single materialised target when exactly one name is given and
        *asdict* is ``False``.
    generator
        A generator yielding the materialised targets in order when multiple
        names are given and *asdict* is ``False``.
    DotDict
        A dot-accessible dictionary mapping every requested target name to
        its materialised object when *asdict* is ``True``.

    Raises
    ------
    RuntimeError
        If no configuration has been loaded (i.e. ``self.config`` is ``None``).

    Examples
    --------
    >>> proj = Project("train.yaml")

    Build the default ``main`` target:

    >>> training_script = proj()

    Build a single named target:

    >>> model = proj("model")

    Unpack multiple targets in one call (avoids duplicate construction):

    >>> model, tokenizer = proj("model", "tokenizer")

    Collect targets into a dot-accessible dict:

    >>> outputs = proj("model", "tokenizer", asdict=True)
    >>> outputs.model
    """

    if self.config is None:
        raise RuntimeError("The project does not have a loaded configuration")

    if len(args) == 0 or args[0] == "":
        mtargets = ("main",)
    elif isinstance(args[0], list):
        # Preserve legacy interface for now.
        asdict = True
        mtargets = args[0]
    else:
        mtargets = args

    kwargs |= dict(pp_config=self.pp_config)
    outputs = Latent.materialize(
        self.config, mtargets=mtargets, context_vars=kwargs
    )

    if asdict:
        return DotDict(outputs)
    if len(mtargets) == 1:
        return outputs[mtargets[0]]
    else:
        return (outputs[key] for key in mtargets)

forgather.meta_config.MetaConfig dataclass

Project metadata loaded from meta.yaml.

MetaConfig reads and parses the meta.yaml file that sits at the root of every Forgather project. It resolves template search paths, locates the workspace root by walking up the directory tree, and exposes the configuration values needed by Project (forgather.project.Project) to set up its ConfigEnvironment.

Parameters:

Name Type Description Default
project_dir str or PathLike

Path to the project directory containing meta.yaml. Defaults to the current working directory (".").

'.'
meta_name str

Name of the metadata file to load. Defaults to "meta.yaml".

PROJECT_META_NAME

Attributes:

Name Type Description
project_dir str

Path to the project directory as supplied to __init__.

name str

Name of the meta file (e.g. "meta.yaml").

project_name str or None

Human-readable project name declared in meta.yaml.

description str or None

Short project description declared in meta.yaml.

meta_path str

Path to the meta file, formed by joining project_dir and the meta file name; absolute only when project_dir is absolute.

searchpath list of str

Ordered list of absolute directory paths searched for config templates. Derived from the searchdir key in meta.yaml, defaulting to [project_dir/templates].

system_path str or None

Optional system-level template search path from meta.yaml.

config_prefix str

Sub-directory inside the search path where leaf configuration files live. Defaults to "configs".

default_cfg str or None

Name of the default configuration file as declared in meta.yaml. When None, default_config() picks the first template found under config_prefix.

config_dict dict

Raw dictionary parsed from meta.yaml.

workspace_root str

Absolute path to the workspace root directory (the directory that contains forgather_workspace/), found by walking up from project_dir.
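
The upward search for the workspace root can be sketched as a loop that tests each ancestor in turn. This is a stand-alone illustration: `find_upwards` and `has_workspace_marker` are hypothetical names, and the temporary directory layout exists only for the demonstration.

```python
import os
import tempfile


def find_upwards(start, predicate):
    """Return the nearest directory at or above start that satisfies predicate, else None."""
    current = os.path.abspath(start)
    while True:
        if predicate(current):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent


def has_workspace_marker(directory):
    """True when directory contains a forgather_workspace/ sub-directory."""
    return os.path.isdir(os.path.join(directory, "forgather_workspace"))


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.realpath(tmp)
    os.makedirs(os.path.join(root, "forgather_workspace"))
    project = os.path.join(root, "examples", "my_project")
    os.makedirs(project)

    found = find_upwards(project, has_workspace_marker)
    missing = find_upwards(project, lambda directory: False)
    print(found == root, missing)
```

The loop ends at the filesystem root, where a directory is its own parent, so a missing marker yields None instead of looping forever.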

Raises:

Type Description
ValueError

If the project directory does not exist, meta.yaml is not found, or no forgather_workspace/ directory exists in the ancestor hierarchy.

Examples:

>>> meta = MetaConfig("/path/to/my_project")
>>> print(meta.project_name)
My Project
>>> print(meta.searchpath)
['/path/to/my_project/templates', '/path/to/workspace/forgather_workspace']
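
How searchpath is derived from the parsed metadata can be sketched as follows. `derive_searchpath` is a hypothetical helper that follows the rule given under Attributes: use the searchdir key when present, otherwise fall back to project_dir/templates, then make every entry absolute.

```python
import os


def derive_searchpath(meta: dict, project_dir: str) -> list:
    """Use meta["searchdir"] if present, else [project_dir/templates]; make all entries absolute."""
    raw = meta.get("searchdir", [os.path.join(project_dir, "templates")])
    return [os.path.abspath(path) for path in raw]


default_paths = derive_searchpath({}, "my_project")
explicit_paths = derive_searchpath({"searchdir": ["templates", "../shared"]}, "my_project")

print(default_paths)
print(explicit_paths)
```

Note that `os.path.abspath` resolves relative entries against the current working directory, not against project_dir, so in this sketch relative searchdir entries depend on where the process was started.
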
Source code in src/forgather/meta_config.py
@dataclass()
class MetaConfig:
    """Project metadata loaded from ``meta.yaml``.

    ``MetaConfig`` reads and parses the ``meta.yaml`` file that sits at the
    root of every Forgather project.  It resolves template search paths,
    locates the workspace root by walking up the directory tree, and exposes
    the configuration values needed by :class:`~forgather.project.Project` to
    set up its :class:`ConfigEnvironment`.

    Parameters
    ----------
    project_dir : str or os.PathLike, optional
        Path to the project directory containing ``meta.yaml``.  Defaults to
        the current working directory (``"."``).
    meta_name : str, optional
        Name of the metadata file to load.  Defaults to ``"meta.yaml"``.

    Attributes
    ----------
    project_dir : str
        Path to the project directory as supplied to ``__init__``.
    name : str
        Name of the meta file (e.g. ``"meta.yaml"``).
    project_name : str or None
        Human-readable project name declared in ``meta.yaml``.
    description : str or None
        Short project description declared in ``meta.yaml``.
    meta_path : str
        Path to the meta file (*project_dir* joined with the meta file name);
        absolute only when *project_dir* is absolute.
    searchpath : list of str
        Ordered list of absolute directory paths searched for config templates.
        Derived from the ``searchdir`` key in ``meta.yaml``, defaulting to
        ``[project_dir/templates]``.
    system_path : str or None
        Optional system-level template search path from ``meta.yaml``.
    config_prefix : str
        Sub-directory inside the search path where leaf configuration files
        live.  Defaults to ``"configs"``.
    default_cfg : str or None
        Name of the default configuration file as declared in ``meta.yaml``.
        When ``None``, :meth:`default_config` picks the first template found
        under *config_prefix*.
    config_dict : dict
        Raw dictionary parsed from ``meta.yaml``.
    workspace_root : str
        Absolute path to the workspace root directory (the directory that
        contains ``forgather_workspace/``), found by walking up from
        *project_dir*.

    Raises
    ------
    ValueError
        If the project directory does not exist, ``meta.yaml`` is not found, or
        no ``forgather_workspace/`` directory exists in the ancestor hierarchy.

    Examples
    --------
    >>> meta = MetaConfig("/path/to/my_project")
    >>> print(meta.project_name)
    My Project
    >>> print(meta.searchpath)
    ['/path/to/my_project/templates', '/path/to/workspace/forgather_workspace']
    """

    # The path of the project directory
    project_dir: str

    # The name of the meta file
    name: str

    # The name of the current project
    project_name: Optional[str]

    # The description of the current project
    description: Optional[str]

    # The path to the meta file
    meta_path: str

    # Paths to search for config templates in
    searchpath: List[str]

    # The value of the system_path from the meta-config
    system_path: Optional[str]

    # The name of the sub-directory in which leaf configurations are located
    config_prefix: str

    # The default configuration
    default_cfg: Optional[str]

    # The raw config dictionary
    config_dict: dict

    # The path to the workspace root
    workspace_root: str

    def __init__(self, project_dir=".", meta_name=PROJECT_META_NAME):
        self.name = meta_name
        self.meta_path = os.path.join(project_dir, meta_name)
        config = self._load_config(self.meta_path, project_dir=project_dir)
        self.config_dict = config
        self.project_dir = project_dir
        self.searchpath = config.get(
            "searchdir", [os.path.join(project_dir, "templates")]
        )
        self.searchpath = [os.path.abspath(path) for path in self.searchpath]
        self.config_prefix = config.get("config_prefix", "configs")
        self.default_cfg = config.get("default_config", None)
        self.system_path = config.get("system_path", None)
        self.project_name = config.get("name", None)
        self.description = config.get("description", None)
        if self.system_path is not None:
            self.system_path = self.norm_path(self.system_path)

    def __str__(self):
        s = ""
        s += f"Project Name: {self.project_name}\n"
        s += f"Description: {self.description}\n"
        s += f"Default Config: {self.default_cfg}\n"
        s += f"Project Directory: {self.project_dir}\n"
        s += f"Workspace Root: {self.workspace_root}\n"
        s += f"Config Prefix: {self.config_prefix}\n"
        s += f"Search Path: {self.searchpath}\n"

        return s

    def norm_path(self, path):
        return os.path.normpath(os.path.join(self.project_dir, path))

    def default_config(self):
        """Return the name of the default configuration template.

        Returns
        -------
        str
            The value of ``default_config`` from ``meta.yaml`` if set;
            otherwise the name of the first template discovered under
            *config_prefix* across all search paths.
        """
        if self.default_cfg is not None:
            return self.default_cfg
        else:
            # Pick the first in the list.
            return next(self.find_templates(self.config_prefix))[0]

    def config_path(self, config_template=None):
        """Return the template-relative path for the given configuration name.

        Parameters
        ----------
        config_template : str or None, optional
            Name of the configuration template (e.g. ``"train.yaml"``).
            When ``None`` or an empty string, the default configuration is used.

        Returns
        -------
        str
            Path of the form ``"{config_prefix}/{config_template}"`` suitable
            for passing to :meth:`ConfigEnvironment.load`.
        """
        if config_template is None or len(config_template) == 0:
            config_template = self.default_config()
        return os.path.join(self.config_prefix, config_template)

    def find_templates(self, prefix="", suffix=".yaml"):
        """Iterate over all templates in the search path matching a prefix and suffix.

        Walks every directory in :attr:`searchpath`, descending into the
        sub-directory given by *prefix*, and yields ``(name, path)`` pairs for
        every file whose name ends with *suffix*.  Hidden directories
        (names starting with ``"."``) are skipped.

        Parameters
        ----------
        prefix : str, optional
            Sub-directory to search within each search-path entry.  Defaults to
            ``""`` (search from the root of each search-path entry).
        suffix : str, optional
            File extension filter.  Defaults to ``".yaml"``.

        Yields
        ------
        template_name : str
            Template name relative to the prefixed search directory, suitable
            for use with :meth:`ConfigEnvironment.load`.
        template_path : str
            Filesystem path to the template file.

        Examples
        --------
        Find all templates under a ``models`` directory in any search-path entry:

        >>> for template_name, template_path in meta.find_templates("models"):
        ...     print(template_name, template_path)
        """
        for templates_dir in self.searchpath:
            templates_dir = os.path.relpath(templates_dir)
            templates_dir = os.path.join(templates_dir, prefix)
            for dirpath, dirnames, filenames in os.walk(templates_dir):
                # Prune hidden directories in place so os.walk does not descend
                # into them (removing items while iterating can skip entries).
                dirnames[:] = [d for d in dirnames if not d.startswith(".")]
                for filename in filenames:
                    if filename.endswith(suffix):
                        template_path = os.path.join(dirpath, filename)
                        # strip prefix
                        template_name = template_path[len(templates_dir) :]
                        if template_name.startswith("/"):
                            template_name = template_name[1:]
                        yield (template_name, template_path)

    def _load_config(self, config_path: str | os.PathLike, /, **kwargs) -> ConfigDict:
        project_directory, template_name = os.path.split(config_path)
        if not os.path.exists(project_directory):
            raise ValueError(f"The directory, '{project_directory}', does not exist.")
        elif not os.path.isdir(project_directory):
            raise ValueError(f"'{project_directory}' is not a directory.")
        elif not os.path.isfile(config_path):
            raise ValueError(
                f"The template, '{template_name}', does not exist in '{project_directory}'"
            )
        # Build searchpath for meta-config.
        # We include the project, the workspace config, and the user's Forgather config directory.
        searchpath = [project_directory]

        self.workspace_root = self.find_workspace_dir(project_directory)
        searchpath.append(os.path.join(self.workspace_root, WORKSPACE_CONFIG_DIR_NAME))
        kwargs["workspace_root"] = self.workspace_root

        user_templates_dir = os.path.join(forgather_config_dir(), "templates")
        if os.path.isdir(user_templates_dir):
            searchpath.append(user_templates_dir)

        self.environment = ConfigEnvironment(
            searchpath=searchpath,
            global_vars=preprocessor_globals(project_directory, self.workspace_root),
        )
        config = self.environment.load(template_name, **kwargs)
        return config.config

    @staticmethod
    def find_workspace_dir(project_dir):
        """Walk up the directory tree to find the Forgather workspace root.

        The workspace root is the nearest ancestor directory that contains a
        ``forgather_workspace/`` sub-directory.

        Parameters
        ----------
        project_dir : str
            Starting directory for the upward search.

        Returns
        -------
        str
            Absolute path to the workspace root directory.

        Raises
        ------
        ValueError
            If no ``forgather_workspace/`` directory is found in any ancestor.
        """

        def is_workspace(root_dir):
            workspace_config_dir = os.path.join(root_dir, WORKSPACE_CONFIG_DIR_NAME)
            return os.path.isdir(workspace_config_dir)

        workspace_root = MetaConfig._find_dir(project_dir, is_workspace)
        if not workspace_root:
            raise ValueError(
                f"Workspace directory, 'forgather_workspace', was not found at or above project directory {project_dir}"
            )
        return workspace_root

    @staticmethod
    def find_project_dir(project_dir):
        """Walk up the directory tree to find the nearest Forgather project directory.

        A project directory is one that directly contains a ``meta.yaml`` file.

        Parameters
        ----------
        project_dir : str
            Starting directory for the upward search.

        Returns
        -------
        str
            Absolute path to the nearest project directory that contains
            ``meta.yaml``.

        Raises
        ------
        ValueError
            If no project directory is found at or above *project_dir*.
        """

        def is_project(root_dir):
            target_dir = os.path.join(root_dir, PROJECT_META_NAME)
            return os.path.isfile(target_dir)

        found_project_dir = MetaConfig._find_dir(project_dir, is_project)
        if not found_project_dir:
            raise ValueError(f"No projects were found at or above {project_dir}")
        return found_project_dir

    @staticmethod
    def _find_dir(root, match_regex):
        root = os.path.abspath(root)

        while True:
            if match_regex(root):
                return root
            parent_dir, _ = os.path.split(root)
            if parent_dir == root:
                return None
            root = parent_dir

default_config()

Return the name of the default configuration template.

Returns:

Type Description
str

The value of default_config from meta.yaml if set; otherwise the name of the first template discovered under config_prefix across all search paths.
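
The fallback rule can be sketched in isolation. This is a minimal illustration, not the library's code: `pick_default_config` is a hypothetical helper, and raising `ValueError` when no template exists is an assumption made for the sketch (the method shown in the source listing calls `next()` without a default).

```python
from typing import Iterable, Optional, Tuple


def pick_default_config(
    declared_default: Optional[str],
    discovered: Iterable[Tuple[str, str]],
) -> str:
    """Return the declared default, else the name of the first discovered template.

    Hypothetical helper for illustration; `discovered` yields
    (template_name, template_path) pairs, as find_templates() does.
    """
    if declared_default is not None:
        return declared_default
    first = next(iter(discovered), None)
    if first is None:
        # Assumption for this sketch: fail with an explicit message.
        raise ValueError("no configuration templates were found")
    return first[0]


templates = [
    ("train.yaml", "templates/configs/train.yaml"),
    ("eval.yaml", "templates/configs/eval.yaml"),
]
print(pick_default_config(None, templates))         # first discovered template
print(pick_default_config("eval.yaml", templates))  # declared default wins
```

Because the fallback depends on discovery order, declaring default_config in meta.yaml is the more predictable choice when a project holds several configurations.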

Source code in src/forgather/meta_config.py
def default_config(self):
    """Return the name of the default configuration template.

    Returns
    -------
    str
        The value of ``default_config`` from ``meta.yaml`` if set;
        otherwise the name of the first template discovered under
        *config_prefix* across all search paths.
    """
    if self.default_cfg is not None:
        return self.default_cfg
    else:
        # Pick the first in the list.
        return next(self.find_templates(self.config_prefix))[0]

config_path(config_template=None)

Return the template-relative path for the given configuration name.

Parameters:

Name Type Description Default
config_template str or None

Name of the configuration template (e.g. "train.yaml"). When None or an empty string, the default configuration is used.

None

Returns:

Type Description
str

Path of the form "{config_prefix}/{config_template}" suitable for passing to ConfigEnvironment.load.
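
As a rough sketch of the rule described above, the path is the configuration prefix joined with either the requested name or the default. `build_config_path` and its `default_name` argument are hypothetical stand-ins for the method and for the result of default_config().

```python
import os
from typing import Optional


def build_config_path(
    config_prefix: str,
    config_template: Optional[str],
    default_name: str,
) -> str:
    """Hypothetical stand-in for config_path(), for illustration only."""
    # None and the empty string both select the default configuration.
    if not config_template:
        config_template = default_name
    # os.path.join uses the platform's path separator.
    return os.path.join(config_prefix, config_template)


explicit = build_config_path("configs", "train.yaml", default_name="baseline.yaml")
fallback = build_config_path("configs", "", default_name="baseline.yaml")
print(explicit)
print(fallback)
```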

Source code in src/forgather/meta_config.py
def config_path(self, config_template=None):
    """Return the template-relative path for the given configuration name.

    Parameters
    ----------
    config_template : str or None, optional
        Name of the configuration template (e.g. ``"train.yaml"``).
        When ``None`` or an empty string, the default configuration is used.

    Returns
    -------
    str
        Path of the form ``"{config_prefix}/{config_template}"`` suitable
        for passing to :meth:`ConfigEnvironment.load`.
    """
    if config_template is None or len(config_template) == 0:
        config_template = self.default_config()
    return os.path.join(self.config_prefix, config_template)

find_templates(prefix='', suffix='.yaml')

Iterate over all templates in the search path matching a prefix and suffix.

Walks every directory in searchpath, descending into the sub-directory given by prefix, and yields (name, path) pairs for every file whose name ends with suffix. Hidden directories (names starting with ".") are skipped.

Parameters:

Name Type Description Default
prefix str

Sub-directory to search within each search-path entry. Defaults to "" (search from the root of each search-path entry).

''
suffix str

File extension filter. Defaults to ".yaml".

'.yaml'

Yields:

Name Type Description
template_name str

Template name relative to the prefixed search directory, suitable for use with ConfigEnvironment.load.

template_path str

Filesystem path to the template file.

Examples:

Find all templates under a models directory in any search-path entry:

>>> for template_name, template_path in meta.find_templates("models"):
...     print(template_name, template_path)
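
The traversal itself can be tried without a Forgather project. The sketch below is a standalone approximation of the behaviour described above, not the library's implementation: it takes the search path as an argument instead of reading it from a MetaConfig, and it computes names with `os.path.relpath`. The temporary directory layout is invented for the demonstration.

```python
import os
import tempfile


def find_templates(searchpath, prefix="", suffix=".yaml"):
    """Yield (template_name, template_path) for matching files under each entry."""
    for templates_dir in searchpath:
        base = os.path.join(templates_dir, prefix)
        for dirpath, dirnames, filenames in os.walk(base):
            # Prune hidden directories in place so os.walk never descends into them.
            dirnames[:] = [d for d in dirnames if not d.startswith(".")]
            for filename in filenames:
                if filename.endswith(suffix):
                    path = os.path.join(dirpath, filename)
                    yield os.path.relpath(path, base), path


with tempfile.TemporaryDirectory() as root:
    layout = (
        "models/tiny.yaml",
        "models/big/large.yaml",
        "models/.cache/old.yaml",  # hidden directory: skipped
        "models/notes.txt",        # wrong suffix: skipped
    )
    for rel in layout:
        full = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()
    names = sorted(name for name, _ in find_templates([root], prefix="models"))
    print(names)
```

Assigning to `dirnames[:]` matters here: `os.walk` only honours pruning when the list it handed out is modified in place.
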
Source code in src/forgather/meta_config.py
def find_templates(self, prefix="", suffix=".yaml"):
    """Iterate over all templates in the search path matching a prefix and suffix.

    Walks every directory in :attr:`searchpath`, descending into the
    sub-directory given by *prefix*, and yields ``(name, path)`` pairs for
    every file whose name ends with *suffix*.  Hidden directories
    (names starting with ``"."``) are skipped.

    Parameters
    ----------
    prefix : str, optional
        Sub-directory to search within each search-path entry.  Defaults to
        ``""`` (search from the root of each search-path entry).
    suffix : str, optional
        File extension filter.  Defaults to ``".yaml"``.

    Yields
    ------
    template_name : str
        Template name relative to the prefixed search directory, suitable
        for use with :meth:`ConfigEnvironment.load`.
    template_path : str
        Filesystem path to the template file.

    Examples
    --------
    Find all templates under a ``models`` directory in any search-path entry:

    >>> for template_name, template_path in meta.find_templates("models"):
    ...     print(template_name, template_path)
    """
    for templates_dir in self.searchpath:
        templates_dir = os.path.relpath(templates_dir)
        templates_dir = os.path.join(templates_dir, prefix)
        for dirpath, dirnames, filenames in os.walk(templates_dir):
            # Prune hidden directories in place so os.walk() skips them.
            # (Calling remove() while iterating over dirnames can skip entries.)
            dirnames[:] = [d for d in dirnames if not d.startswith(".")]
            for filename in filenames:
                if filename.endswith(suffix):
                    template_path = os.path.join(dirpath, filename)
                    # strip prefix
                    template_name = template_path[len(templates_dir) :]
                    if template_name.startswith("/"):
                        template_name = template_name[1:]
                    yield (template_name, template_path)

find_workspace_dir(project_dir) staticmethod

Walk up the directory tree to find the Forgather workspace root.

The workspace root is the nearest directory, starting at project_dir itself and moving upward, that contains a forgather_workspace/ sub-directory.

Parameters:

Name Type Description Default
project_dir str

Starting directory for the upward search.

required

Returns:

Type Description
str

Absolute path to the workspace root directory.

Raises:

Type Description
ValueError

If no forgather_workspace/ directory is found in any ancestor.
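
The upward search can be sketched with the standard library alone. This is an illustrative approximation under stated assumptions: `find_upwards` and `has_workspace_marker` are hypothetical names, and the directory tree is created in a temporary location just for the demonstration.

```python
import os
import tempfile


def find_upwards(start, predicate):
    """Return the nearest directory at or above `start` accepted by `predicate`, else None."""
    current = os.path.abspath(start)
    while True:
        if predicate(current):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without a match
            return None
        current = parent


def has_workspace_marker(directory):
    # A workspace root directly contains a forgather_workspace/ sub-directory.
    return os.path.isdir(os.path.join(directory, "forgather_workspace"))


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.realpath(tmp)
    os.makedirs(os.path.join(root, "forgather_workspace"))
    project = os.path.join(root, "examples", "tiny_llama")
    os.makedirs(project)
    found = find_upwards(project, has_workspace_marker)
    print(found == root)

not_found = find_upwards(os.getcwd(), lambda directory: False)
print(not_found)
```

The sketch returns None when nothing matches; the documented method turns that case into a ValueError.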

Source code in src/forgather/meta_config.py
@staticmethod
def find_workspace_dir(project_dir):
    """Walk up the directory tree to find the Forgather workspace root.

    The workspace root is the nearest directory, starting at *project_dir*
    itself and moving upward, that contains a ``forgather_workspace/``
    sub-directory.

    Parameters
    ----------
    project_dir : str
        Starting directory for the upward search.

    Returns
    -------
    str
        Absolute path to the workspace root directory.

    Raises
    ------
    ValueError
        If no ``forgather_workspace/`` directory is found in any ancestor.
    """

    def is_workspace(root_dir):
        workspace_config_dir = os.path.join(root_dir, WORKSPACE_CONFIG_DIR_NAME)
        return os.path.isdir(workspace_config_dir)

    workspace_root = MetaConfig._find_dir(project_dir, is_workspace)
    if not workspace_root:
        raise ValueError(
            f"Workspace directory 'forgather_workspace' was not found at or above {project_dir}"
        )
    return workspace_root

find_project_dir(project_dir) staticmethod

Walk up the directory tree to find the nearest Forgather project directory.

A project directory is one that directly contains a meta.yaml file.

Parameters:

Name Type Description Default
project_dir str

Starting directory for the upward search.

required

Returns:

Type Description
str

Absolute path to the nearest project directory that contains meta.yaml.

Raises:

Type Description
ValueError

If no project directory is found at or above project_dir.
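
Two consequences of "nearest" are worth seeing concretely: the starting directory itself is checked first, and with nested projects the innermost one wins. The sketch below illustrates this with `pathlib`; `nearest_project_dir` is a hypothetical helper that returns a `Path` rather than the `str` the documented method returns, and the nested layout is invented for the example.

```python
import tempfile
from pathlib import Path


def nearest_project_dir(start):
    """Return the closest directory at or above `start` that contains meta.yaml."""
    start = Path(start).resolve()
    # Check the starting directory first, then each ancestor in turn.
    for candidate in (start, *start.parents):
        if (candidate / "meta.yaml").is_file():
            return candidate
    raise ValueError(f"No project directory found at or above {start}")


with tempfile.TemporaryDirectory() as tmp:
    outer = Path(tmp).resolve() / "outer"
    inner = outer / "experiments" / "inner"
    (inner / "templates").mkdir(parents=True)
    (outer / "meta.yaml").touch()
    (inner / "meta.yaml").touch()

    # Starting inside the inner project resolves to the inner project.
    from_inner = nearest_project_dir(inner / "templates")
    # Starting between the two resolves to the outer project.
    from_outer = nearest_project_dir(outer / "experiments")
    print(from_inner.name, from_outer.name)
```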

Source code in src/forgather/meta_config.py
@staticmethod
def find_project_dir(project_dir):
    """Walk up the directory tree to find the nearest Forgather project directory.

    A project directory is one that directly contains a ``meta.yaml`` file.

    Parameters
    ----------
    project_dir : str
        Starting directory for the upward search.

    Returns
    -------
    str
        Absolute path to the nearest project directory that contains
        ``meta.yaml``.

    Raises
    ------
    ValueError
        If no project directory is found at or above *project_dir*.
    """

    def is_project(root_dir):
        target_dir = os.path.join(root_dir, PROJECT_META_NAME)
        return os.path.isfile(target_dir)

    found_project_dir = MetaConfig._find_dir(project_dir, is_project)
    if not found_project_dir:
        raise ValueError(f"No projects were found at or above {project_dir}")
    return found_project_dir

forgather.config.ConfigEnvironment

Jinja2 preprocessing and YAML parsing environment for Forgather configurations.

ConfigEnvironment wraps a PPEnvironment (a customised Jinja2 environment from forgather.preprocess) and a suite of custom YAML constructors that translate !call, !singleton, !factory, !partial, and !var tags into Node objects (forgather.latent). The result of load() is a Config containing both the parsed node graph and the preprocessed YAML text.

Parameters:

Name Type Description Default
searchpath str, os.PathLike, or iterable of str/PathLike

Directories searched for templates, in priority order. Non-existent directories are silently ignored. Defaults to (".",).

('.',)
pp_environment Environment or None

A pre-configured Jinja2 environment to use instead of the default PPEnvironment. When None (default) a fresh PPEnvironment is created from searchpath.

None
global_vars dict or None

Variables injected into the Jinja2 global namespace and available in every template. Merged with any variables already present in pp_environment. Defaults to None, which is treated as an empty dict.

None
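
The searchpath handling described in the table (a single path is accepted, and non-existent directories are dropped silently) can be sketched as follows. `normalize_searchpath` is a hypothetical name used only for this illustration; it mirrors the documented behaviour rather than reproducing the constructor.

```python
import os
import tempfile
from pathlib import Path


def normalize_searchpath(searchpath):
    """Hypothetical helper: return the list of existing directories to search."""
    # A single str or PathLike becomes a one-element list.
    if isinstance(searchpath, (str, os.PathLike)):
        searchpath = [searchpath]
    # Entries that are not existing directories are dropped without error,
    # and the original priority order is preserved.
    return [path for path in searchpath if os.path.isdir(path)]


with tempfile.TemporaryDirectory() as existing:
    missing = os.path.join(existing, "does_not_exist")
    single = normalize_searchpath(existing)
    mixed = normalize_searchpath([missing, Path(existing)])
    print(single == [existing], mixed == [Path(existing)])
```

One consequence of the silent filtering is that a misspelled template directory does not fail at construction time; it typically surfaces later as a template that cannot be found.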

Examples:

>>> env = ConfigEnvironment(searchpath=["/path/to/templates"])
>>> config = env.load("configs/train.yaml")
>>> node_graph, pp_text = config.get()
Source code in src/forgather/config.py
class ConfigEnvironment:
    """Jinja2 preprocessing and YAML parsing environment for Forgather configurations.

    ``ConfigEnvironment`` wraps a :class:`~forgather.preprocess.PPEnvironment`
    (a customised Jinja2 environment) and a suite of custom YAML constructors
    that translate ``!call``, ``!singleton``, ``!factory``, ``!partial``, and
    ``!var`` tags into :class:`~forgather.latent.Node` objects.  The result of
    :meth:`load` is a :class:`Config` containing both the parsed node graph and
    the preprocessed YAML text.

    Parameters
    ----------
    searchpath : str, os.PathLike, or iterable of str/PathLike, optional
        Directories searched for templates, in priority order.  Non-existent
        directories are silently ignored.  Defaults to ``(".",)``.
    pp_environment : jinja2.Environment or None, optional
        A pre-configured Jinja2 environment to use instead of the default
        :class:`~forgather.preprocess.PPEnvironment`.  When ``None`` (default)
        a fresh ``PPEnvironment`` is created from *searchpath*.
    global_vars : dict or None, optional
        Variables injected into the Jinja2 global namespace and available in
        every template.  Merged with any variables already present in
        *pp_environment*.  Defaults to ``None``, which is treated as an empty
        dict.

    Examples
    --------
    >>> env = ConfigEnvironment(searchpath=["/path/to/templates"])
    >>> config = env.load("configs/train.yaml")
    >>> node_graph, pp_text = config.get()
    """

    def __init__(
        self,
        searchpath: Iterable[str | os.PathLike] | str | os.PathLike = (".",),
        pp_environment: Optional[Environment] = None,
        global_vars: Optional[Dict[str, Any]] = None,
    ):
        if global_vars is None:
            global_vars = {}
        # Wrap a single str or os.PathLike in a list
        if isinstance(searchpath, (str, os.PathLike)):
            searchpath = [searchpath]
        assert isinstance(searchpath, Iterable), "searchpath must be Iterable"

        # Remove non-existent directories from searchpath
        searchpath = list(filter(lambda path: os.path.isdir(path), searchpath))

        if pp_environment is None:
            pp_environment = PPEnvironment(searchpath=searchpath)
        self.pp_environment = pp_environment
        self.pp_environment.globals |= global_vars

    def get_pp_environment(self):
        return self.pp_environment

    def get_loader(self):
        return self.pp_environment.loader

    def preprocess(
        self,
        config_path: os.PathLike | str,
        /,
        **kwargs,
    ) -> ConfigText:
        """Render a configuration template through Jinja2 and return the result.

        Locates *config_path* in the search path, renders it with the
        configured global variables plus any extra *kwargs*, and returns the
        resulting YAML text.

        Parameters
        ----------
        config_path : str or os.PathLike
            Template path relative to the search path (e.g. ``"configs/train.yaml"``).
        **kwargs
            Additional keyword arguments passed as Jinja2 template variables,
            overriding globals for this render.

        Returns
        -------
        ConfigText
            The rendered YAML text.  :class:`ConfigText` is a ``str`` subclass
            that additionally exposes :meth:`~ConfigText.with_line_numbers`.
        """
        try:
            template = self.pp_environment.get_template(str(config_path))
            return ConfigText(template.render(**kwargs))
        except jinja2_exceptions.TemplateError as exc:
            raise self._wrap_template_error(exc, str(config_path)) from exc

    def preprocess_with_trace(
        self,
        config_path: os.PathLike | str,
        /,
        **kwargs,
    ) -> Tuple["ConfigText", List[Tuple[str, str]]]:
        """Preprocess *config_path* and also return the per-template trace.

        Runs :meth:`preprocess` inside :func:`capture_pp` so the second element
        of the returned tuple is the ordered list of
        ``(template_name, preprocessed_source)`` pairs that participated in the
        render — the same data that ``pp_verbose`` prints to stdout, but
        returned programmatically.

        Returns
        -------
        (ConfigText, list[tuple[str, str]])
            The fully rendered text plus the per-template trace (load order).
        """
        with capture_pp() as trace:
            text = self.preprocess(config_path, **kwargs)
        return text, list(trace)

    @staticmethod
    def _wrap_template_error(
        exc: jinja2_exceptions.TemplateError, config_path: str
    ) -> PreprocessError:
        """Convert a Jinja2 ``TemplateError`` into a structured ``PreprocessError``."""
        template_name: Optional[str] = None
        lineno: Optional[int] = None
        source_context: Optional[str] = None
        message = str(exc)

        if isinstance(exc, jinja2_exceptions.TemplateSyntaxError):
            template_name = exc.name or exc.filename or config_path
            lineno = exc.lineno
            message = exc.message or message
            if exc.source:
                source_context = _format_source_excerpt(exc.source, lineno)
        else:
            # UndefinedError, TemplateNotFound, etc. — fall back to whatever
            # contextual info is available; line info usually missing.
            template_name = getattr(exc, "name", None) or config_path

        return PreprocessError(
            message,
            template_name=template_name,
            lineno=lineno,
            source_context=source_context,
            original=exc,
        )

    def preprocess_from_string(
        self,
        config: str,
        /,
        **kwargs,
    ) -> ConfigText:
        """Render a configuration template supplied as a string through Jinja2.

        Parameters
        ----------
        config : str
            Raw template source text (may contain Jinja2 directives).
        **kwargs
            Additional keyword arguments passed as Jinja2 template variables.

        Returns
        -------
        ConfigText
            The rendered YAML text.
        """
        template = self.pp_environment.from_string(config)
        return ConfigText(template.render(**kwargs))

    def load(
        self,
        config_path: os.PathLike | str,
        /,
        **kwargs,
    ) -> Config:
        """Preprocess and parse a configuration file into a node graph.

        Combines :meth:`preprocess` and :meth:`load_from_ppstring` into a
        single call.

        Parameters
        ----------
        config_path : str or os.PathLike
            Template path relative to the search path.
        **kwargs
            Additional Jinja2 template variables forwarded to :meth:`preprocess`.

        Returns
        -------
        Config
            Container holding the parsed node graph and the preprocessed YAML
            text.

        Raises
        ------
        Exception
            Any YAML or node-graph parse error, annotated with the numbered
            preprocessed source for easier debugging.
        """
        pp_config = self.preprocess(config_path, **kwargs)
        return self.load_from_ppstring(pp_config)

    def load_from_string(
        self,
        config: str,
        /,
        **kwargs,
    ) -> Config:
        """Preprocess and parse a configuration supplied as a string.

        Parameters
        ----------
        config : str
            Raw template source text.
        **kwargs
            Additional Jinja2 template variables forwarded to
            :meth:`preprocess_from_string`.

        Returns
        -------
        Config
            Container holding the parsed node graph and the preprocessed YAML
            text.
        """
        pp_config = self.preprocess_from_string(config, **kwargs)
        return self.load_from_ppstring(pp_config)

    def load_from_ppstring(self, pp_config: str) -> Config:
        """Parse an already-preprocessed YAML string into a node graph.

        Parses *pp_config* with the custom YAML constructors (``!call``,
        ``!singleton``, ``!factory``, ``!partial``, ``!var``, etc.) and
        validates the resulting graph with :meth:`~forgather.latent.Latent.check`.

        Parameters
        ----------
        pp_config : str
            Fully rendered YAML text (output of Jinja2 preprocessing).

        Returns
        -------
        Config
            Container holding the parsed node graph and *pp_config*.

        Raises
        ------
        Exception
            Any YAML parse error or node-graph validation error, annotated with
            line-numbered source text.
        """
        try:
            loaded_config = load_depth_first(pp_config, Loader=ConfigLoader)
            Latent.check(loaded_config)
        except yaml.YAMLError as error:
            note = format_line_numbers(pp_config)
            raise self._wrap_yaml_error(error, pp_config) from add_exception_notes(
                error, note
            )
        except Exception as error:
            raise add_exception_notes(error, format_line_numbers(pp_config))
        if isinstance(loaded_config, dict):
            loaded_config = ConfigDict(loaded_config)
        return Config(loaded_config, pp_config)

    def render_code(
        self,
        config_path: os.PathLike | str,
        /,
        *,
        target: Optional[str] = "main",
        **kwargs,
    ) -> str:
        """Render *config_path* as Python source via :func:`forgather.codegen.generate_code`.

        Mirrors the ``forgather code`` CLI: preprocesses + parses the config,
        looks up *target* (default ``"main"``) in the resulting node graph,
        and runs the codegen template. When *target* is ``None`` the entire
        config graph is rendered (useful for reviewing every materialisable
        target in one document).

        Raises
        ------
        PreprocessError
            Jinja2 preprocessing failed (delegated from :meth:`preprocess`).
        YamlParseError
            The preprocessed text was not valid YAML.
        CodeGenError
            *target* was not found in the config or codegen itself raised.
        """
        from .codegen import generate_code  # avoid cycle at import time

        config = self.load(config_path, **kwargs).config
        if target is None:
            obj = config
        else:
            try:
                obj = config[target]
            except (KeyError, TypeError) as exc:
                available = sorted(config.keys()) if isinstance(config, Mapping) else []
                raise CodeGenError(
                    f"target {target!r} not found in config "
                    f"(available: {', '.join(available) if available else '<none>'})",
                    template_name=str(config_path),
                    original=exc,
                ) from exc
        try:
            return generate_code(obj)
        except Exception as exc:
            raise CodeGenError(
                str(exc),
                template_name=str(config_path),
                original=exc,
            ) from exc

    @staticmethod
    def _wrap_yaml_error(exc: yaml.YAMLError, pp_config: str) -> "YamlParseError":
        """Convert ``yaml.YAMLError`` into a structured :class:`YamlParseError`."""
        lineno: Optional[int] = None
        message = str(exc)
        # Most yaml errors expose problem_mark / problem; MarkedYAMLError is the
        # common base. We use problem_mark for line/column and problem for the
        # short message; everything else falls back to ``str(exc)``.
        problem_mark = getattr(exc, "problem_mark", None)
        if problem_mark is not None:
            lineno = problem_mark.line + 1  # PyYAML marks are 0-based
        problem = getattr(exc, "problem", None)
        context = getattr(exc, "context", None)
        if problem:
            message = f"{context}: {problem}" if context else problem
        source_context = _format_source_excerpt(pp_config, lineno)
        return YamlParseError(
            message,
            template_name="<preprocessed>",
            lineno=lineno,
            source_context=source_context,
            original=exc,
        )

    def find_referenced_templates(
        self,
        template_name: os.PathLike | str,
        /,
        **kwargs,
    ) -> Iterator[Tuple[int, str, str]]:
        """Iterate over the full template inheritance hierarchy for a given template.

        Traces actual template loading at render time so that dynamic
        ``extends`` / ``include`` references (those whose targets are computed
        by Jinja2 expressions) are captured in addition to statically declared
        ones.

        Parameters
        ----------
        template_name : str or os.PathLike
            Name of the root template to analyse (relative to the search path).
        **kwargs
            Forwarded to the inner :meth:`load` call so dynamic-args
            conditional includes (e.g. ``include "trainers/" + trainer_type +
            ".yaml"``) resolve to the right files. Omit for the static-default
            view.

        Yields
        ------
        level : int
            Depth of this template in the hierarchy (0 = root).
        name : str
            Template name as it appears in the loader.
        filename : str
            Filesystem path to the template file.
        """
        # Use render-time tracing to get complete template hierarchy
        load_sequence, dependencies = self._trace_template_rendering(
            template_name, **kwargs
        )

        # Convert to the expected format with hierarchy levels
        template_levels = self._build_hierarchy_levels(load_sequence, dependencies)

        for template_name, level in template_levels:
            # Get filename for this template
            filename = next(
                (filename for name, filename in load_sequence if name == template_name),
                template_name,
            )
            yield (level, template_name, filename)

    def get_template_dependencies(self, template_name: os.PathLike | str, /, **kwargs):
        """Return raw dependency relationships for a template, suitable for graph generation.

        Parameters
        ----------
        template_name : str or os.PathLike
            Name of the root template to analyse.
        **kwargs
            Forwarded to the inner :meth:`load` so the trace reflects which
            templates are actually included given those Jinja variables.

        Returns
        -------
        load_sequence : list of tuple[str, str]
            Ordered list of ``(template_name, filename)`` pairs in the order
            templates were loaded during rendering.
        dependencies_dict : dict[str, set[str]]
            Mapping from each template name to the set of template names it
            directly references (via ``extends`` or ``include``).
        """
        return self._trace_template_rendering(template_name, **kwargs)

    def _trace_template_rendering(
        self, template_name: os.PathLike | str, /, **kwargs
    ) -> Tuple[List[Tuple[str, str]], Dict[str, Set[str]]]:
        """
        Trace all templates loaded during rendering using a tracing loader

        Returns: (load_sequence, dependencies)
        """
        from .preprocess import PPLoader

        # Create a tracing version of the loader
        class TracingPPLoader(PPLoader):
            def __init__(self, original_loader):
                # Copy configuration from original loader
                if hasattr(original_loader, "searchpath"):
                    super().__init__(original_loader.searchpath)
                else:
                    super().__init__([])

                # Copy existing templates
                if hasattr(original_loader, "templates"):
                    self.templates = original_loader.templates.copy()

                self.load_trace = []
                self.load_stack = []
                self.dependencies = {}
                self.is_tracing = False
                # Also track static relationships from original method
                self.static_dependencies = {}

            def get_source(self, environment, template):
                result = super().get_source(environment, template)

                if self.is_tracing:
                    source, filename, uptodate = result
                    self.load_trace.append((template, filename))

                    # Analyze the source to understand relationship types
                    static_refs = self._get_static_references(
                        source, environment, template, filename
                    )

                    # Store static relationships for this template (these are the real dependencies)
                    if static_refs:
                        self.static_dependencies[template] = set(static_refs)

                    self.load_stack.append(template)

                return result

            def _get_static_references(
                self, source, environment, template_name, filename
            ):
                """Get static template references using the original method logic"""
                try:
                    ast = environment.parse(
                        source, name=template_name, filename=filename
                    )
                    return sorted(
                        (
                            x
                            for x in meta.find_referenced_templates(ast)
                            if x is not None
                        ),
                        key=lambda a: 1 if a.endswith(".yaml") else -1,
                    )
                except Exception:
                    # Best effort: a template that fails to parse contributes
                    # no static references.
                    return []

        # Replace loader temporarily
        original_loader = self.pp_environment.loader
        tracing_loader = TracingPPLoader(original_loader)

        try:
            self.pp_environment.loader = tracing_loader
            tracing_loader.is_tracing = True

            # Render the template to trace all dependencies. Forwarding
            # **kwargs is what lets dynamic-args-driven includes resolve
            # to the right templates (e.g. trainer_type → trainers/X.yaml).
            self.load(template_name, **kwargs)

            # Use static dependencies, but also try to infer dynamic relationships
            static_deps = tracing_loader.static_dependencies.copy()

            # Post-process to identify likely dynamic relationships
            dynamic_deps = self._identify_dynamic_relationships(
                tracing_loader.load_trace, static_deps, tracing_loader
            )

            # Merge dynamic relationships into static ones
            for parent, children in dynamic_deps.items():
                if parent not in static_deps:
                    static_deps[parent] = set()
                static_deps[parent].update(children)

            return tracing_loader.load_trace.copy(), static_deps

        finally:
            # Restore original loader
            self.pp_environment.loader = original_loader

    def _build_hierarchy_levels(
        self, load_sequence: List[Tuple[str, str]], dependencies: Dict[str, Set[str]]
    ) -> List[Tuple[str, int]]:
        """
        Build hierarchy levels preserving multiple inheritance hierarchies

        This tries to maintain the structure where templates that are included/extended
        appear at appropriate levels rather than forcing a single linear hierarchy.
        """
        if not load_sequence:
            return []

        # Build reverse dependency map (child -> parents)
        parents = {}
        for parent, children in dependencies.items():
            for child in children:
                if child not in parents:
                    parents[child] = set()
                parents[child].add(parent)

        # Find root templates (those with no parents or are the starting template)
        root_template = load_sequence[0][0]
        all_templates = {name for name, _ in load_sequence}
        roots = {root_template}

        # Also consider templates that aren't children of any other template as roots
        children_set = set()
        for children in dependencies.values():
            children_set.update(children)

        for template in all_templates:
            if template not in children_set and template != root_template:
                roots.add(template)

        # Use a more sophisticated approach that preserves hierarchy structure
        levels = []
        processed = set()

        def assign_level(template, level, visited_path=None):
            if visited_path is None:
                visited_path = set()

            if template in visited_path:  # Cycle detection
                return
            if template in processed:
                return

            processed.add(template)
            levels.append((template, level))
            visited_path.add(template)

            # Process children
            children = dependencies.get(template, set())
            for child in sorted(children):  # Sort for consistent output
                if child not in processed:
                    assign_level(child, level + 1, visited_path.copy())

            visited_path.remove(template)

        # Process templates in order they appear in load_sequence
        # This preserves the natural discovery order while building hierarchy
        remaining_templates = {name for name, _ in load_sequence}

        # Start with the main root
        if root_template in remaining_templates:
            assign_level(root_template, 0)
            remaining_templates.remove(root_template)

        # Process templates in load order, but only if they haven't been processed
        # This helps maintain the structure where includes appear at reasonable levels
        for template_name, _ in load_sequence:
            if template_name in remaining_templates:
                # Check if this template has any unprocessed parents
                template_parents = parents.get(template_name, set())
                unprocessed_parents = template_parents - processed

                if not unprocessed_parents:  # No unprocessed parents, can be a root
                    assign_level(template_name, 0)
                    remaining_templates.remove(template_name)

        # Finally, ensure all templates from load_sequence are included
        # (in case there are disconnected components)
        for template_name, _ in load_sequence:
            if template_name not in processed:
                # Find the minimum level based on parents
                min_level = 0
                if template_name in parents:
                    parent_levels = []
                    for parent in parents[template_name]:
                        parent_level = next(
                            (level for t, level in levels if t == parent), -1
                        )
                        if parent_level >= 0:
                            parent_levels.append(parent_level)
                    if parent_levels:
                        min_level = max(parent_levels) + 1

                levels.append((template_name, min_level))
                processed.add(template_name)

        return levels

    def _identify_dynamic_relationships(
        self, load_sequence, static_deps, tracing_loader
    ):
        """
        Identify dynamic template relationships using a simple heuristic

        Look for specific known patterns like 'tiny.trainer_config' followed by 'trainers/*'
        """
        dynamic_deps = {}

        # Simple heuristic: look for known dynamic patterns
        for i, (template_name, filename) in enumerate(load_sequence):
            # Look for tiny.trainer_config followed by trainers/ templates
            if template_name == "tiny.trainer_config":
                # Look at the next few templates
                for j in range(i + 1, min(i + 3, len(load_sequence))):
                    next_template, _ = load_sequence[j]
                    if next_template.startswith("trainers/"):
                        # This is likely the dynamic resolution
                        dynamic_deps[template_name] = {next_template}
                        break

        return dynamic_deps

    def _has_dynamic_extends(self, source):
        """Check if template has dynamic extends/include syntax"""
        import re

        dynamic_pattern = re.compile(
            r"--\s*(?:extends|include)\s+([a-zA-Z_][a-zA-Z0-9_]*(?:\.[a-zA-Z_][a-zA-Z0-9_]*)*)(?:\s|$)",
            re.MULTILINE,
        )
        return bool(dynamic_pattern.search(source))

    def _looks_like_dynamic_target(self, template_name):
        """Check if template name looks like a likely dynamic resolution target"""
        return (
            template_name.startswith("trainers/")
            or template_name.startswith("models/")
            or template_name.startswith("datasets/")
            or template_name.startswith("callbacks/")
            or "trainer" in template_name.lower()
            or "model" in template_name.lower()
        )
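
The pattern in `_has_dynamic_extends` treats an unquoted, optionally dotted identifier after `extends` or `include` as a dynamic reference, while quoted template names are considered static. The following standalone sketch reuses the same regular expression on a hypothetical template snippet; `dynamic_targets` and `SAMPLE` are illustrative names, not part of the library, and the `--` line-statement prefix is assumed from the pattern itself.

```python
import re
from typing import List

# Same pattern as in _has_dynamic_extends: an unquoted, optionally dotted
# identifier following "-- extends" or "-- include".
DYNAMIC_PATTERN = re.compile(
    r"--\s*(?:extends|include)\s+([a-zA-Z_][a-zA-Z0-9_]*(?:\.[a-zA-Z_][a-zA-Z0-9_]*)*)(?:\s|$)",
    re.MULTILINE,
)


def dynamic_targets(source: str) -> List[str]:
    """Return the variable expressions used as extends/include targets."""
    return DYNAMIC_PATTERN.findall(source)


# Hypothetical template text: two static (quoted) and two dynamic references.
SAMPLE = """\
-- extends "base.yaml"
-- include ns.trainer_template
-- include 'trainers/default.yaml'
-- extends parent_template
"""

targets = dynamic_targets(SAMPLE)
print(targets)
```

Only the two unquoted references are reported, because a leading quote character cannot start the identifier group.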

preprocess(config_path, /, **kwargs)

Render a configuration template through Jinja2 and return the result.

Locates config_path in the search path, renders it with the configured global variables plus any extra kwargs, and returns the resulting YAML text.

Parameters:

Name Type Description Default
config_path str or PathLike

Template path relative to the search path (e.g. "configs/train.yaml").

required
**kwargs

Additional keyword arguments passed as Jinja2 template variables, overriding globals for this render.

{}

Returns:

Type Description
ConfigText

The rendered YAML text. ConfigText is a str subclass that additionally exposes with_line_numbers().

Source code in src/forgather/config.py
def preprocess(
    self,
    config_path: os.PathLike | str,
    /,
    **kwargs,
) -> ConfigText:
    """Render a configuration template through Jinja2 and return the result.

    Locates *config_path* in the search path, renders it with the
    configured global variables plus any extra *kwargs*, and returns the
    resulting YAML text.

    Parameters
    ----------
    config_path : str or os.PathLike
        Template path relative to the search path (e.g. ``"configs/train.yaml"``).
    **kwargs
        Additional keyword arguments passed as Jinja2 template variables,
        overriding globals for this render.

    Returns
    -------
    ConfigText
        The rendered YAML text.  :class:`ConfigText` is a ``str`` subclass
        that additionally exposes :meth:`~ConfigText.with_line_numbers`.
    """
    try:
        template = self.pp_environment.get_template(str(config_path))
        return ConfigText(template.render(**kwargs))
    except jinja2_exceptions.TemplateError as exc:
        raise self._wrap_template_error(exc, str(config_path)) from exc
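
The precedence rule described for `preprocess` (per-render keyword arguments override the configured globals) can be pictured as a layered lookup. This is only a sketch of that rule using `collections.ChainMap`, not the library's or Jinja2's implementation; `env_globals` and `effective_variables` are hypothetical names.

```python
from collections import ChainMap

# Hypothetical environment-wide globals.
env_globals = {"model_name": "tiny_llama", "batch_size": 32}


def effective_variables(global_vars: dict, **kwargs) -> dict:
    """Per-render variables shadow globals; the globals stay unmodified."""
    # ChainMap searches its maps left to right, so kwargs win on conflicts.
    return dict(ChainMap(kwargs, global_vars))


variables = effective_variables(env_globals, batch_size=8, seed=42)
print(variables["batch_size"], env_globals["batch_size"])
```

The override applies to a single render only: the globals dictionary still holds its original `batch_size` afterwards.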

preprocess_with_trace(config_path, /, **kwargs)

Preprocess config_path and also return the per-template trace.

Runs preprocess() inside capture_pp() so that the second element of the returned tuple is the ordered list of (template_name, preprocessed_source) pairs that participated in the render. This is the same data that pp_verbose prints to stdout, but returned programmatically.

Returns:

Type Description
(ConfigText, list[tuple[str, str]])

The fully rendered text plus the per-template trace (load order).

Source code in src/forgather/config.py
def preprocess_with_trace(
    self,
    config_path: os.PathLike | str,
    /,
    **kwargs,
) -> Tuple["ConfigText", List[Tuple[str, str]]]:
    """Preprocess *config_path* and also return the per-template trace.

    Runs :meth:`preprocess` inside :func:`capture_pp` so the second element
    of the returned tuple is the ordered list of
    ``(template_name, preprocessed_source)`` pairs that participated in the
    render — the same data that ``pp_verbose`` prints to stdout, but
    returned programmatically.

    Returns
    -------
    (ConfigText, list[tuple[str, str]])
        The fully rendered text plus the per-template trace (load order).
    """
    with capture_pp() as trace:
        text = self.preprocess(config_path, **kwargs)
    return text, list(trace)
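
Because the trace is a plain list of `(template_name, preprocessed_source)` pairs in load order, it can be post-processed with ordinary Python. The sketch below summarizes a hand-written trace of that shape; `summarize_trace` and the sample data are hypothetical and do not call the library.

```python
from typing import List, Tuple


def summarize_trace(trace: List[Tuple[str, str]]) -> List[str]:
    """One entry per template: load position, name, and source line count."""
    return [
        f"{index}: {name} ({len(source.splitlines())} line(s))"
        for index, (name, source) in enumerate(trace)
    ]


# Hypothetical trace with the same shape as the second return value.
trace = [
    ("configs/train.yaml", "-- extends 'base.yaml'\nlr: 1e-3\n"),
    ("base.yaml", "model: tiny\n"),
]

summary = summarize_trace(trace)
print("\n".join(summary))
```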

preprocess_from_string(config, /, **kwargs)

Render a configuration template supplied as a string through Jinja2.

Parameters:

Name Type Description Default
config str

Raw template source text (may contain Jinja2 directives).

required
**kwargs

Additional keyword arguments passed as Jinja2 template variables.

{}

Returns:

Type Description
ConfigText

The rendered YAML text.

Source code in src/forgather/config.py
def preprocess_from_string(
    self,
    config: str,
    /,
    **kwargs,
) -> ConfigText:
    """Render a configuration template supplied as a string through Jinja2.

    Parameters
    ----------
    config : str
        Raw template source text (may contain Jinja2 directives).
    **kwargs
        Additional keyword arguments passed as Jinja2 template variables.

    Returns
    -------
    ConfigText
        The rendered YAML text.
    """
    template = self.pp_environment.from_string(config)
    return ConfigText(template.render(**kwargs))

load(config_path, /, **kwargs)

Preprocess and parse a configuration file into a node graph.

Combines preprocess() and load_from_ppstring() into a single call.

Parameters:

Name Type Description Default
config_path str or PathLike

Template path relative to the search path.

required
**kwargs

Additional Jinja2 template variables forwarded to preprocess().

{}

Returns:

Type Description
Config

Container holding the parsed node graph and the preprocessed YAML text.

Raises:

Type Description
Exception

Any YAML or node-graph parse error, annotated with the numbered preprocessed source for easier debugging.

Source code in src/forgather/config.py
def load(
    self,
    config_path: os.PathLike | str,
    /,
    **kwargs,
) -> Config:
    """Preprocess and parse a configuration file into a node graph.

    Combines :meth:`preprocess` and :meth:`load_from_ppstring` into a
    single call.

    Parameters
    ----------
    config_path : str or os.PathLike
        Template path relative to the search path.
    **kwargs
        Additional Jinja2 template variables forwarded to :meth:`preprocess`.

    Returns
    -------
    Config
        Container holding the parsed node graph and the preprocessed YAML
        text.

    Raises
    ------
    Exception
        Any YAML or node-graph parse error, annotated with the numbered
        preprocessed source for easier debugging.
    """
    pp_config = self.preprocess(config_path, **kwargs)
    return self.load_from_ppstring(pp_config)

load_from_string(config, /, **kwargs)

Preprocess and parse a configuration supplied as a string.

Parameters:

Name Type Description Default
config str

Raw template source text.

required
**kwargs

Additional Jinja2 template variables forwarded to preprocess_from_string().

{}

Returns:

Type Description
Config

Container holding the parsed node graph and the preprocessed YAML text.

Source code in src/forgather/config.py
def load_from_string(
    self,
    config: str,
    /,
    **kwargs,
) -> Config:
    """Preprocess and parse a configuration supplied as a string.

    Parameters
    ----------
    config : str
        Raw template source text.
    **kwargs
        Additional Jinja2 template variables forwarded to
        :meth:`preprocess_from_string`.

    Returns
    -------
    Config
        Container holding the parsed node graph and the preprocessed YAML
        text.
    """
    pp_config = self.preprocess_from_string(config, **kwargs)
    return self.load_from_ppstring(pp_config)

load_from_ppstring(pp_config)

Parse an already-preprocessed YAML string into a node graph.

Parses pp_config with the custom YAML constructors (!call, !singleton, !factory, !partial, !var, etc.) and validates the resulting graph with forgather.latent.Latent.check().

Parameters:

Name Type Description Default
pp_config str

Fully rendered YAML text (output of Jinja2 preprocessing).

required

Returns:

Type Description
Config

Container holding the parsed node graph and pp_config.

Raises:

Type Description
Exception

Any YAML parse error or node-graph validation error, annotated with line-numbered source text.

Source code in src/forgather/config.py
def load_from_ppstring(self, pp_config: str) -> Config:
    """Parse an already-preprocessed YAML string into a node graph.

    Parses *pp_config* with the custom YAML constructors (``!call``,
    ``!singleton``, ``!factory``, ``!partial``, ``!var``, etc.) and
    validates the resulting graph with :meth:`~forgather.latent.Latent.check`.

    Parameters
    ----------
    pp_config : str
        Fully rendered YAML text (output of Jinja2 preprocessing).

    Returns
    -------
    Config
        Container holding the parsed node graph and *pp_config*.

    Raises
    ------
    Exception
        Any YAML parse error or node-graph validation error, annotated with
        line-numbered source text.
    """
    try:
        loaded_config = load_depth_first(pp_config, Loader=ConfigLoader)
        Latent.check(loaded_config)
    except yaml.YAMLError as error:
        note = format_line_numbers(pp_config)
        raise self._wrap_yaml_error(error, pp_config) from add_exception_notes(
            error, note
        )
    except Exception as error:
        raise add_exception_notes(error, format_line_numbers(pp_config))
    if isinstance(loaded_config, dict):
        loaded_config = ConfigDict(loaded_config)
    return Config(loaded_config, pp_config)
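
The error handling above attaches the line-numbered preprocessed text to whatever exception the parser raised. The implementations of `format_line_numbers` and `add_exception_notes` are not shown on this page, so the sketch below uses hypothetical stand-ins (`format_numbered`, `attach_note`) and a toy `key: value` parser in place of the YAML loader to illustrate the general technique.

```python
def format_numbered(text: str) -> str:
    """Prefix each line with a right-aligned, 1-based line number."""
    lines = text.splitlines()
    width = len(str(len(lines)))
    return "\n".join(f"{n:>{width}}: {line}" for n, line in enumerate(lines, 1))


def attach_note(error: Exception, note: str) -> Exception:
    """Attach a note to the exception and return it so it can be re-raised."""
    if hasattr(error, "add_note"):  # available from Python 3.11
        error.add_note(note)
    else:  # older interpreters: emulate the __notes__ attribute
        error.__notes__ = getattr(error, "__notes__", []) + [note]
    return error


def parse_strict(pp_text: str) -> dict:
    """Toy 'key: value' parser standing in for the real YAML loader."""
    try:
        result = {}
        for line in pp_text.splitlines():
            key, value = line.split(": ", 1)  # ValueError when malformed
            result[key] = value
        return result
    except Exception as error:
        raise attach_note(error, format_numbered(pp_text))


try:
    parse_strict("lr: 0.001\nbroken line")
except ValueError as exc:
    notes = exc.__notes__
    print(notes[0])
```

On Python 3.11 and later the note is also printed beneath the traceback, so the offending line can be located in the numbered text.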

render_code(config_path, /, *, target='main', **kwargs)

Render config_path as Python source via forgather.codegen.generate_code().

Mirrors the forgather code CLI: preprocesses and parses the config, looks up target (default "main") in the resulting node graph, and runs the codegen template. When target is None, the entire config graph is rendered (useful for reviewing every materialisable target in one document).

Raises:

Type Description
PreprocessError

Jinja2 preprocessing failed (delegated from preprocess()).

YamlParseError

The preprocessed text was not valid YAML.

CodeGenError

target was not found in the config or codegen itself raised.

Source code in src/forgather/config.py
def render_code(
    self,
    config_path: os.PathLike | str,
    /,
    *,
    target: Optional[str] = "main",
    **kwargs,
) -> str:
    """Render *config_path* as Python source via :func:`forgather.codegen.generate_code`.

    Mirrors the ``forgather code`` CLI: preprocesses + parses the config,
    looks up *target* (default ``"main"``) in the resulting node graph,
    and runs the codegen template. When *target* is ``None`` the entire
    config graph is rendered (useful for reviewing every materialisable
    target in one document).

    Raises
    ------
    PreprocessError
        Jinja2 preprocessing failed (delegated from :meth:`preprocess`).
    YamlParseError
        The preprocessed text was not valid YAML.
    CodeGenError
        *target* was not found in the config or codegen itself raised.
    """
    from .codegen import generate_code  # avoid cycle at import time

    config = self.load(config_path, **kwargs).config
    if target is None:
        obj = config
    else:
        try:
            obj = config[target]
        except (KeyError, TypeError) as exc:
            available = sorted(config.keys()) if isinstance(config, Mapping) else []
            raise CodeGenError(
                f"target {target!r} not found in config "
                f"(available: {', '.join(available) if available else '<none>'})",
                template_name=str(config_path),
                original=exc,
            ) from exc
    try:
        return generate_code(obj)
    except Exception as exc:
        raise CodeGenError(
            str(exc),
            template_name=str(config_path),
            original=exc,
        ) from exc
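
The target lookup in `render_code` reports the available top-level keys when the requested target is missing. A minimal standalone sketch of that lookup follows; `TargetNotFound` is a hypothetical stand-in for `CodeGenError`, and the sample config is a plain dictionary, not a real node graph.

```python
from collections.abc import Mapping


class TargetNotFound(LookupError):
    """Hypothetical stand-in for the library's CodeGenError."""


def lookup_target(config, target):
    """Return config[target], or the whole config when target is None."""
    if target is None:
        return config
    try:
        return config[target]
    except (KeyError, TypeError) as exc:
        # Only mappings can list their targets; anything else reports <none>.
        available = sorted(config.keys()) if isinstance(config, Mapping) else []
        listing = ", ".join(available) if available else "<none>"
        raise TargetNotFound(
            f"target {target!r} not found in config (available: {listing})"
        ) from exc


config = {"main": "train()", "model": "build_model()"}

try:
    lookup_target(config, "optimizer")
except TargetNotFound as exc:
    message = str(exc)
    print(message)
```

Catching `TypeError` alongside `KeyError` covers configs whose top level is not a mapping, such as a list.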

find_referenced_templates(template_name, /, **kwargs)

Iterate over the full template inheritance hierarchy for a given template.

Traces actual template loading at render time so that dynamic extends / include references (those whose targets are computed by Jinja2 expressions) are captured in addition to statically declared ones.

Parameters:

Name Type Description Default
template_name str or PathLike

Name of the root template to analyse (relative to the search path).

required
**kwargs

Forwarded to the inner load() call so that conditional includes driven by dynamic arguments (e.g. include "trainers/" + trainer_type + ".yaml") resolve to the right files. Omit them to get the static-default view.

{}

Yields:

Name Type Description
level int

Depth of this template in the hierarchy (0 = root).

name str

Template name as it appears in the loader.

filename str

Filesystem path to the template file.

Source code in src/forgather/config.py
def find_referenced_templates(
    self,
    template_name: os.PathLike | str,
    /,
    **kwargs,
) -> Iterator[Tuple[int, str, str]]:
    """Iterate over the full template inheritance hierarchy for a given template.

    Traces actual template loading at render time so that dynamic
    ``extends`` / ``include`` references (those whose targets are computed
    by Jinja2 expressions) are captured in addition to statically declared
    ones.

    Parameters
    ----------
    template_name : str or os.PathLike
        Name of the root template to analyse (relative to the search path).
    **kwargs
        Forwarded to the inner :meth:`load` call so dynamic-args
        conditional includes (e.g. ``include "trainers/" + trainer_type +
        ".yaml"``) resolve to the right files. Omit for the static-default
        view.

    Yields
    ------
    level : int
        Depth of this template in the hierarchy (0 = root).
    name : str
        Template name as it appears in the loader.
    filename : str
        Filesystem path to the template file.
    """
    # Use render-time tracing to get complete template hierarchy
    load_sequence, dependencies = self._trace_template_rendering(
        template_name, **kwargs
    )

    # Convert to the expected format with hierarchy levels
    template_levels = self._build_hierarchy_levels(load_sequence, dependencies)

    for template_name, level in template_levels:
        # Get filename for this template
        filename = next(
            (filename for name, filename in load_sequence if name == template_name),
            template_name,
        )
        yield (level, template_name, filename)
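
The `(level, name, filename)` tuples are convenient for printing an indented tree. The sketch below is a simplified, standalone version of the depth-first level assignment on a hypothetical dependency map; unlike `_build_hierarchy_levels`, it starts from a single root and ignores disconnected templates.

```python
from typing import Dict, List, Set, Tuple


def assign_levels(root: str, dependencies: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """Depth-first walk from root; each template is listed once, at first visit."""
    levels: List[Tuple[str, int]] = []
    seen: Set[str] = set()

    def visit(name: str, level: int) -> None:
        if name in seen:  # also guards against cycles
            return
        seen.add(name)
        levels.append((name, level))
        for child in sorted(dependencies.get(name, ())):  # sorted for stable output
            visit(child, level + 1)

    visit(root, 0)
    return levels


# Hypothetical dependency map: parent template -> directly referenced templates.
deps = {
    "train.yaml": {"base.yaml", "trainers/default.yaml"},
    "base.yaml": {"defaults.yaml"},
}

levels = assign_levels("train.yaml", deps)
tree = "\n".join("    " * level + name for name, level in levels)
print(tree)
```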

get_template_dependencies(template_name, /, **kwargs)

Return raw dependency relationships for a template, suitable for graph generation.

Parameters:

Name Type Description Default
template_name str or PathLike

Name of the root template to analyse.

required
**kwargs

Forwarded to the inner load() call so the trace reflects which templates are actually included given those Jinja variables.

{}

Returns:

Name Type Description
load_sequence list of tuple[str, str]

Ordered list of (template_name, filename) pairs in the order templates were loaded during rendering.

dependencies_dict dict[str, set[str]]

Mapping from each template name to the set of template names it directly references (via extends or include).

Source code in src/forgather/config.py
def get_template_dependencies(self, template_name: os.PathLike | str, /, **kwargs):
    """Return raw dependency relationships for a template, suitable for graph generation.

    Parameters
    ----------
    template_name : str or os.PathLike
        Name of the root template to analyse.
    **kwargs
        Forwarded to the inner :meth:`load` so the trace reflects which
        templates are actually included given those Jinja variables.

    Returns
    -------
    load_sequence : list of tuple[str, str]
        Ordered list of ``(template_name, filename)`` pairs in the order
        templates were loaded during rendering.
    dependencies_dict : dict[str, set[str]]
        Mapping from each template name to the set of template names it
        directly references (via ``extends`` or ``include``).
    """
    return self._trace_template_rendering(template_name, **kwargs)
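
Since the returned dependency mapping has the shape `dict[str, set[str]]`, it can be turned into graph input with a few lines of code. The following sketch emits Graphviz DOT text from a hypothetical mapping of that shape; `to_dot` is an illustrative helper, not part of the library.

```python
from typing import Dict, Set


def to_dot(dependencies: Dict[str, Set[str]]) -> str:
    """Render parent -> child edges as Graphviz DOT text."""
    lines = ["digraph templates {"]
    for parent in sorted(dependencies):
        for child in sorted(dependencies[parent]):
            lines.append(f'    "{parent}" -> "{child}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical mapping with the same shape as the second return value.
deps = {
    "train.yaml": {"trainers/default.yaml", "base.yaml"},
    "base.yaml": {"defaults.yaml"},
}

dot = to_dot(deps)
print(dot)
```

Sorting parents and children keeps the output deterministic, which is helpful when the generated graph is checked into version control.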