Package a small brain
Wrap a focused LoRA adapter with provenance, protected inputs, evaluation data, fingerprints, and routing metadata.
SLMCortex lets you compose focused small brains into an extensible local runtime, validate them before execution, and avoid turning every coding workflow into a hosted LLM bill.
$ python scripts/run_slmcortex_demo.py
package python_slm
package debugging_slm
compose runtime/
validate-runtime runtime/
infer --dry-run
agent run --dry-run
outputs: python_slm/ debugging_slm/ runtime/ agent-trace.json
The landing path mirrors the actual product path in the source repo: package focused capabilities, compose a runtime bundle, validate it, then serve local inference or run the bounded agent workflow.
Combine validated SLM packages into an extensible runtime bundle without mutating the source assets.
Check package and runtime structure before inference, serving, or agent behavior touches a local repository.
Use the same Runtime Core for dry-run routing, model-backed inference, a compatibility server, and bounded agent runs.
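As a rough illustration of what "validate before execution" can mean, the sketch below recomputes file fingerprints and compares them with a manifest before anything runs. It is not SLMCortex's actual validator: the `manifest.json` file name, the required keys, and the `validate_package` helper are all hypothetical, chosen only to mirror the metadata named above.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_package(package_dir: Path) -> list[str]:
    """Collect every problem found rather than stopping at the first one."""
    manifest_path = package_dir / "manifest.json"  # hypothetical file name
    if not manifest_path.is_file():
        return [f"missing {manifest_path.name}"]
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))

    problems = []
    # Hypothetical required keys, loosely based on the metadata listed above.
    for key in ("provenance", "routing", "fingerprints"):
        if key not in manifest:
            problems.append(f"manifest lacks '{key}'")

    # Each fingerprint entry maps a relative file name to its expected digest.
    for name, expected in manifest.get("fingerprints", {}).items():
        target = package_dir / name
        if not target.is_file():
            problems.append(f"missing file: {name}")
        elif fingerprint(target) != expected:
            problems.append(f"fingerprint mismatch: {name}")
    return problems
```

An empty list means the package passed these checks. Returning problems instead of raising lets a dry run report everything that is wrong in one pass.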
The point is not to claim better models. The point is to make local coding capabilities cheaper to evaluate, easier to inspect, and reliable enough to run through bounded control flow.
Paid LLM services can turn every coding workflow into a metered remote dependency.
SLMCortex keeps focused SLM capabilities local, packaged, and inspectable before they run.
One large general model is often used where a smaller, focused capability would be enough.
Compose small brains into a runtime bundle and extend the engine one capability at a time.
Hosted API bills can climb before the workflow is even proven reliable.
Start with dry-run validation, and move to real model-backed inference only when your local backend and model setup are ready.
The documented setup starts from a standard virtual environment and editable install.
The demo script packages the checked-in adapters, composes a runtime, validates it, and runs dry-run inference and the agent flow.
MLX is used on Apple Silicon; GGUF covers Linux, Windows, Intel-based macOS, and any setup that explicitly selects GGUF.
A non-streaming, OpenAI-compatible server is available for local runtime experiments.
The v0.1 agent is local and single-run, with writes controlled by flags rather than hidden background behavior.
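The backend split described above (MLX on Apple Silicon, GGUF everywhere else or when requested explicitly) can be sketched as a small selection function. This is a guess at the rule, not the project's code: `pick_backend`, its parameters, and the `"mlx"` / `"gguf"` return values are hypothetical names. Only the `platform` calls are standard-library facts.

```python
import platform


def pick_backend(system=None, machine=None, force_gguf=False):
    """Choose a backend name from the host platform (hypothetical helper)."""
    # Fall back to the running host when no values are passed in.
    system = system or platform.system()
    machine = machine or platform.machine()

    # An explicit GGUF request wins on every platform.
    if force_gguf:
        return "gguf"
    # Native Python on Apple Silicon reports Darwin with an arm64 machine.
    if system == "Darwin" and machine == "arm64":
        return "mlx"
    # Linux, Windows, and Intel-based macOS all fall through to GGUF.
    return "gguf"
```

Accepting `system` and `machine` as arguments keeps the rule testable on any machine. Note that a Python interpreter running under Rosetta on Apple Silicon reports `x86_64`, so this sketch would select GGUF there.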
SLMCortex is useful to evaluate because its limits are visible. The current release is a narrow local path, not a broad production-agent platform.
Use the docs to verify the demo, inspect commands, or read the runtime architecture.
Install SLMCortex and run the fastest no-model validation path.
See the packaging, routing, runtime, serving, and agent commands.
Understand Factory, Composer, Runtime Core, and Agent Runtime.