Bio-Inspired Spiking Language Model

Neurons that learn to speak

Real Sam is a language model built from biologically inspired spiking neurons rather than transformers. Using Leaky Integrate-and-Fire (LIF) neurons, curriculum learning, and environment-driven plasticity, it learns language in stages, much as a child does.

Parameters: 6M
Perplexity: 11
Firing rate: ~10%
Neuron type: LIF

A child doesn't start with Shakespeare

Real Sam learns language through a biological curriculum — simple patterns first, complex ones later. Each phase builds on the last, with automatic advancement when the model converges.

01 Words

Learn individual tokens and basic word patterns. The neural environment is calm, and every neuron settles into stable firing.

seq_len = 8, data: TinyStories

02 Phrases

Combine words into meaningful pairs such as "the cat", "a big", and "once upon". The model discovers that certain words predict others.

seq_len = 16, data: TinyStories

03 Sentences

Grammar emerges: subject-verb agreement, punctuation, dialogue. Neurons encode syntactic rules in spike patterns.

seq_len = 32, data: TinyStories

04 Stories

Narrative structure, character arcs, emotional content. Multi-paragraph stories with dialogue and cause-and-effect.

seq_len = 64, data: TinyStories

05 Conversations

The model learns to respond: "Hello" gets a reply, and questions get answers. The learning rate is fully reset, and neurons rewire for dialogue.

seq_len = 32 → 128, data: Dolly / Alpaca / oasst1

06 Blended

Stories and conversations merge into the complete speaker, able to narrate, converse, and generate across all contexts.

seq_len = 128, data: All sources
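As a rough illustration of automatic phase advancement, here is a minimal sketch of a plateau-based scheduler. The phase names, sequence lengths, and datasets come from the list above; the `CurriculumScheduler` class, its `patience` and `min_delta` parameters, and the plateau rule itself are assumptions made for this example, not Real Sam's actual convergence test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    seq_len: int   # sequence length listed for the phase
    dataset: str   # data source listed for the phase


# Values as listed on this page. Conversations is listed as 32 -> 128;
# only the final length is kept here for simplicity.
PHASES = [
    Phase("Words", 8, "TinyStories"),
    Phase("Phrases", 16, "TinyStories"),
    Phase("Sentences", 32, "TinyStories"),
    Phase("Stories", 64, "TinyStories"),
    Phase("Conversations", 128, "Dolly / Alpaca / oasst1"),
    Phase("Blended", 128, "All sources"),
]


@dataclass
class CurriculumScheduler:
    patience: int = 3         # hypothetical: stale epochs tolerated before advancing
    min_delta: float = 0.01   # hypothetical: smallest loss drop that counts as progress
    phase_index: int = 0
    best_loss: float = float("inf")
    stale_epochs: int = 0

    @property
    def phase(self) -> Phase:
        return PHASES[self.phase_index]

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if the phase advanced."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.stale_epochs = 0
            return False
        self.stale_epochs += 1
        at_last_phase = self.phase_index == len(PHASES) - 1
        if self.stale_epochs >= self.patience and not at_last_phase:
            self.phase_index += 1
            self.best_loss = float("inf")   # the new phase starts a fresh baseline
            self.stale_epochs = 0
            return True
        return False


if __name__ == "__main__":
    sched = CurriculumScheduler(patience=2)
    for loss in [3.0, 2.5, 2.5, 2.5]:
        advanced = sched.step(loss)
        print(loss, sched.phase.name, advanced)
```

A training loop would call `step` once per epoch and rebuild its data loader with the new `seq_len` whenever it returns `True`.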

Binary spikes, continuous thought

Each token becomes a binary spike pattern. Six layers of LIF neurons process it through residual connections, producing the next-token prediction via a weight-tied readout.

Input: Token (integer ID)
Encoder: STE Spikes (binary {0,1})
Projection: Linear (256 → 512)
× 6 Layers: Environment Block (LIF + diversity + residual)
Readout: Weight-Tied (embedding.T)
Output: Next Token (vocab logits)
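To make the LIF stage of the pipeline above concrete, here is a minimal forward-only sketch of textbook leaky integrate-and-fire dynamics in plain Python. The decay factor `beta`, the threshold of 1.0, and the subtract-on-spike reset are assumptions chosen for illustration; the page does not state Real Sam's actual constants or reset rule. The straight-through estimator (STE) only affects the backward pass during training, so it does not appear in this sketch.

```python
def lif_layer(inputs, beta=0.9, threshold=1.0):
    """Simulate a layer of LIF neurons over time.

    inputs: list of time steps, each a list of input currents (one per neuron).
    Returns a list of binary spike vectors with the same shape.
    beta and threshold are illustrative values, not the model's settings.
    """
    n = len(inputs[0])
    v = [0.0] * n                 # membrane potential of each neuron
    spikes_over_time = []
    for current in inputs:
        spikes = []
        for i in range(n):
            v[i] = beta * v[i] + current[i]   # leak, then integrate the input
            fired = 1 if v[i] >= threshold else 0
            if fired:
                v[i] -= threshold             # soft reset after a spike
            spikes.append(fired)
        spikes_over_time.append(spikes)
    return spikes_over_time


if __name__ == "__main__":
    # One neuron driven by a constant current of 0.4 for six steps.
    print(lif_layer([[0.4]] * 6))  # [[0], [0], [1], [0], [0], [1]]
```

With a constant sub-threshold input, the neuron charges for a few steps, fires once, and starts again, which is how a continuous drive turns into a sparse binary spike train.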

What makes it different

Four bio-inspired mechanisms work together. Each is grounded in neuroscience, and each contributes to a model that learns more like a brain than a calculator.

Curriculum Learning

Data complexity increases in phases, mimicking infant language development. The model masters words before attempting sentences, sentences before stories. Phase transitions are automatic — triggered by loss convergence.

Inspired by: Developmental neuroscience, Elman (1993)

Shared Environment

One global stress signal modulates all neurons simultaneously, like cortisol in the bloodstream. This creates stable network-wide coordination: when the loss is high, the environment is stressed and neurons explore more.

Inspired by: Cortical Labs CL1/DishBrain, Free Energy Principle

Neuron Diversity

Each neuron has a fixed "personality": a diversity factor sampled at initialization, like biological receptor density. Given the same environment signal, neurons respond differently, so some act as sensitive explorers and others as resilient anchors.

Inspired by: Neuromodulation (Marder 2002), gain modulation (Salinas 2000)
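One way to picture how a shared environment signal and per-neuron diversity could combine is sketched below. Everything here is hypothetical: the sigmoid mapping from loss to stress, the `baseline` value, the uniform sampling range, and the idea of multiplying the two into an "exploration scale" are assumptions for illustration, since the page does not give the actual formulas.

```python
import math
import random


def stress_from_loss(loss, baseline=2.5, sharpness=1.0):
    """Map the current loss to a global stress level in (0, 1).

    Hypothetical sigmoid: a loss above `baseline` pushes stress toward 1.
    """
    return 1.0 / (1.0 + math.exp(-sharpness * (loss - baseline)))


def sample_diversity(n_neurons, low=0.5, high=1.5, seed=0):
    """Sample one fixed diversity factor per neuron at initialization."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n_neurons)]


def exploration_scale(stress, diversity):
    """Same global signal, different per-neuron response."""
    return [stress * d for d in diversity]


if __name__ == "__main__":
    diversity = sample_diversity(4)       # fixed for the lifetime of the network
    for loss in (4.0, 2.0):               # early training vs. later training
        stress = stress_from_loss(loss)
        print(loss, round(stress, 3), exploration_scale(stress, diversity))
```

In this reading, neurons with a large factor react strongly to stress (the explorers) while those with a small factor barely move (the anchors); the scale could drive, for example, the amount of noise added to each membrane potential.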

Firing Rate Regularization

Neurons maintain a sparse firing rate of about 10% through a loss penalty rather than threshold manipulation, so there is one unified optimization target. Backpropagation then discovers efficient sparse codes, similar to those found in biological cortex.

Inspired by: Sparse coding (Olshausen 1996), homeostatic plasticity
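A firing-rate penalty of this kind can be sketched as a single extra term added to the task loss. The squared-error form and the `weight` coefficient below are assumptions for illustration; only the 10% target comes from the text above.

```python
def firing_rate(spikes):
    """Fraction of (time step, neuron) slots that contain a spike."""
    total = sum(sum(step) for step in spikes)
    count = sum(len(step) for step in spikes)
    return total / count


def firing_rate_penalty(spikes, target=0.10, weight=1.0):
    """Squared distance between the observed firing rate and the target.

    The squared-error form and `weight` are illustrative assumptions.
    """
    return weight * (firing_rate(spikes) - target) ** 2


def total_loss(task_loss, spikes, target=0.10, weight=1.0):
    """Single optimization target: prediction loss plus sparsity penalty."""
    return task_loss + firing_rate_penalty(spikes, target, weight)


if __name__ == "__main__":
    sparse = [[1, 0, 0, 0, 0, 0, 0, 0, 0, 0]]   # 1 spike in 10 slots
    dense = [[1, 1], [1, 1]]                    # every neuron fires every step
    print(firing_rate(sparse), firing_rate_penalty(sparse))
    print(firing_rate(dense), firing_rate_penalty(dense))
```

Because the penalty is part of the loss, the thresholds themselves never need to be adjusted by hand: the optimizer is simply charged for activity that drifts away from the target rate.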

What a 6-million-parameter spiking network can write

Real output from the trained model. No cherry-picking — direct generations from binary spike computations.

real-sam v4 / phase 4 / 6,047,238 params
perplexity: 11
firing rate: ~10%
tokens/sec: 24,227
binary spikes: true

From random noise to coherent speech

The curriculum progression — each phase builds on the last. Perplexity measures prediction quality (lower = better).

Phase      | Seq Length | Epochs  | Train Loss  | Val Loss    | Perplexity
Words      | 8          | 1 → 25  | 4.22 → 2.62 | 3.33 → 2.64 | 28 → 14
Phrases    | 16         | 26 → 29 | 2.48 → 2.45 | 2.55 → 2.55 | 13 → 13
Sentences  | 32         | 30 → 33 | 2.41 → 2.40 | 2.48 → 2.48 | 12 → 12
Stories    | 64         | 34 → 37 | 2.38 → 2.43 | 2.40 → 2.44 | 11
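The perplexity column in the table above is consistent with exponentiating the validation loss, assuming the losses are mean per-token cross-entropy measured in nats (the page does not state the units, so this is an assumption). The short check below reproduces the rounded values from the validation losses in the table.

```python
import math

# Validation loss at the start and end of each phase, copied from the table.
VAL_LOSS = {
    "Words": (3.33, 2.64),
    "Phrases": (2.55, 2.55),
    "Sentences": (2.48, 2.48),
    "Stories": (2.40, 2.44),
}


def perplexity(cross_entropy_nats):
    """Perplexity is the exponential of the mean cross-entropy (in nats)."""
    return math.exp(cross_entropy_nats)


if __name__ == "__main__":
    for phase, (start, end) in VAL_LOSS.items():
        print(phase, round(perplexity(start)), round(perplexity(end)))
```

Read this way, a perplexity of 11 means the model is, on average, about as uncertain as if it were choosing uniformly among 11 tokens at each step.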

Build the future of biological AI

Real Sam is open source. Explore the code, train your own spiking language model, or contribute to the architecture.


Powered by spikes, not attention