# `LlamaCppEx.ModelManager.Backend`
[🔗](https://github.com/nyo16/llama_cpp_ex/blob/main/lib/llama_cpp_ex/model_manager/backend.ex#L1)

Behaviour for the model I/O the manager performs on the write path.

The default implementation is `LlamaCppEx.ModelManager.ModelIO`, which delegates
to `LlamaCppEx.Hub`, `LlamaCppEx.Model`, and `LlamaCppEx.Server`. Tests inject a
fake via the `:io` start option to exercise load/unload lifecycle without real
GGUF files.
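A fake of the kind described above might look like the following minimal sketch. The module name `FakeBackend`, the fixed path, and the use of an `Agent` as a stand-in server are illustrative assumptions, not part of the library. `load_model/2` returns a bare reference instead of a real `LlamaCppEx.Model.t()`, which is enough for lifecycle tests but would not satisfy Dialyzer. If `llama_cpp_ex` is not compiled, the `@behaviour` and `@impl` lines only produce compiler warnings.

```elixir
defmodule FakeBackend do
  # Hypothetical test double; never touches the filesystem or the Hub.
  @behaviour LlamaCppEx.ModelManager.Backend

  @impl true
  def resolve_source(_source, _opts) do
    # Pretend every source resolves to the same zero-byte local file.
    {:ok, "/tmp/fake-model.gguf", 0}
  end

  @impl true
  def load_model(_path, _opts) do
    # A reference stands in for the model struct (assumption for tests only).
    {:ok, make_ref()}
  end

  @impl true
  def start_server(id, path, _opts) do
    # An Agent stands in for LlamaCppEx.Server so tests get a live pid.
    Agent.start_link(fn -> %{id: id, path: path} end)
  end

  @impl true
  def stop_server(pid) do
    # Agent.stop/1 is synchronous and returns :ok.
    Agent.stop(pid)
  end
end
```

Such a module would then be passed through the `:io` start option so the manager calls it in place of `LlamaCppEx.ModelManager.ModelIO`.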

Inference dispatch (`generate`/`stream`/`chat`/`embed`) does NOT go through this
behaviour. Those calls read the ETS table directly in the caller's process and
invoke the relevant module, keeping the manager process off the hot path.
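The split between a serialized write path and a direct read path can be illustrated with a generic sketch. Everything here is hypothetical: the module `RegistrySketch`, the table name, and the entry shape are made up for illustration and are not the manager's actual internals.

```elixir
defmodule RegistrySketch do
  use GenServer

  # Hypothetical table name; the real manager's table is not documented here.
  @table :model_registry_sketch

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Write path: serialized through the owning process.
  def put(id, entry), do: GenServer.call(__MODULE__, {:put, id, entry})

  # Read path: runs entirely in the caller's process, no message to the owner.
  def fetch(id) do
    case :ets.lookup(@table, id) do
      [{^id, entry}] -> {:ok, entry}
      [] -> {:error, :not_loaded}
    end
  end

  @impl true
  def init(_opts) do
    # :protected lets any process read while only the owner writes.
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    {:ok, %{}}
  end

  @impl true
  def handle_call({:put, id, entry}, _from, state) do
    :ets.insert(@table, {id, entry})
    {:reply, :ok, state}
  end
end
```

With this layout, a slow load or unload handled by the owning process does not block concurrent lookups from inference callers.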

# `load_model`

```elixir
@callback load_model(
  String.t(),
  keyword()
) :: {:ok, LlamaCppEx.Model.t()} | {:error, term()}
```

Loads a model directly (for `:direct` mode).

# `resolve_source`

```elixir
@callback resolve_source(
  LlamaCppEx.ModelManager.Entry.source(),
  keyword()
) :: {:ok, String.t(), non_neg_integer()} | {:error, term()}
```

Resolves a source to a local file path and its byte size, downloading from the
Hub if needed.
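For the case where the file is already on disk, the return shape can be produced with `File.stat/1`. This is a sketch only: `LocalSourceSketch.resolve_local/1` is a hypothetical helper, and it assumes the source is a plain path string, which may not match the actual `LlamaCppEx.ModelManager.Entry.source()` type.

```elixir
defmodule LocalSourceSketch do
  @spec resolve_local(String.t()) ::
          {:ok, String.t(), non_neg_integer()} | {:error, term()}
  def resolve_local(path) when is_binary(path) do
    case File.stat(path) do
      # A regular file: report its path and size in bytes.
      {:ok, %File.Stat{type: :regular, size: size}} -> {:ok, path, size}
      # Directories, devices, and so on are rejected.
      {:ok, %File.Stat{type: type}} -> {:error, {:not_a_regular_file, type}}
      # Propagate POSIX errors such as :enoent unchanged.
      {:error, reason} -> {:error, reason}
    end
  end
end
```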

# `start_server`

```elixir
@callback start_server(id :: term(), path :: String.t(), keyword()) ::
  {:ok, pid()} | {:error, term()}
```

Starts a backing `LlamaCppEx.Server` for `id` (for `:server` mode).

# `stop_server`

```elixir
@callback stop_server(pid()) :: :ok
```

Stops a backing server, dropping its context and model refs.

---

*Consult [api-reference.md](api-reference.md) for complete listing*
