Model Expansion
Expand Width
Duplicate-and-Divide for Dense Neurons
Duplicate a neuron k times and divide each copy's outgoing weights by k, so the next layer's dot product stays the same: the forward pass is preserved exactly, and backprop still works.
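A minimal sketch of this duplicate-and-divide trick, assuming plain NumPy dense layers with ReLU; the shapes and the widen helper are illustrative, not from these notes:

import numpy as np

rng = np.random.default_rng(0)

# Layer 1: 3 inputs -> 4 hidden units; Layer 2: 4 hidden -> 2 outputs.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

def forward(x, W1, b1, W2, b2):
    h = np.maximum(0, W1 @ x + b1)  # ReLU hidden layer
    return W2 @ h + b2

def widen(W1, b1, W2, neuron, k):
    # Replace hidden `neuron` with k identical copies whose outgoing
    # weights are each divided by k, so the next dot product is unchanged.
    W1_new = np.vstack([W1] + [W1[neuron:neuron + 1]] * (k - 1))
    b1_new = np.concatenate([b1, np.full(k - 1, b1[neuron])])
    W2_new = np.hstack([W2] + [W2[:, neuron:neuron + 1]] * (k - 1))
    copies = [neuron] + list(range(W2.shape[1], W2.shape[1] + k - 1))
    W2_new[:, copies] = (W2[:, neuron] / k)[:, None]
    return W1_new, b1_new, W2_new

x = rng.normal(size=3)
W1w, b1w, W2w = widen(W1, b1, W2, neuron=1, k=3)
assert np.allclose(forward(x, W1, b1, W2, b2),
                   forward(x, W1w, b1w, W2w, b2))

One caveat: the k copies receive identical gradients and so stay identical under plain gradient descent; a common remedy is to add a little noise to the duplicated weights to break the symmetry.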
Expand Depth
Residual Layer
Insert a residual layer with all-zero parameters (W, b): the skip connection keeps the forward pass exactly correct, while the new layer opens another trainable path for the signal.
With zero parameters the new branch outputs act(0·x + 0) = act(0), which is zero for (almost) any common activation (ReLU, tanh; not sigmoid), so the forward pass is unchanged. Gradients also flow backward through the skip connection, so earlier layers keep training.
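A minimal sketch of a zero-initialized residual block, assuming the form y = x + act(Wx + b); the names are illustrative. At insertion time the block is exactly the identity, and a finite-difference check shows the new weights still affect the output (tanh'(0) = 1), so they are trainable:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5)

d = x.shape[0]
W, b = np.zeros((d, d)), np.zeros(d)  # all-zero new parameters

def residual_block(x, W, b, act=np.tanh):
    return x + act(W @ x + b)  # skip path + new trainable path

# Forward pass is exactly preserved at insertion time: act(0) = 0.
assert np.allclose(residual_block(x, W, b), x)

# Perturbing one weight changes the output, so the new path gets gradient.
eps = 1e-6
W_pert = W.copy()
W_pert[0, 1] += eps
assert abs(residual_block(x, W_pert, b)[0] - x[0]) > 0

Note that frameworks usually define ReLU'(0) = 0, so with ReLU the new weights see no gradient while the pre-activations sit exactly at zero; a smooth activation like tanh avoids this at initialization.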
Model Shrinking
Expanding vs. shrinking is like staffing: adding more workers to do the same job is easy, but fewer workers can't do the same job, which is why shrinking is hard.
Model shrinking is harder than expansion, and it is usually an approximation rather than an exact transformation. The operations include:
Reduce neurons in a layer: merge duplicate neurons and add their outgoing weights into the next layer. This is only exact for identity or ReLU activations, as sketched below.
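A minimal sketch of exact neuron merging under ReLU; the names and shapes are illustrative, and the example forces two neurons to be duplicates so the merge is exact. Identical incoming weights give identical activations, so deleting one neuron and adding its outgoing weights onto its duplicate preserves the output:

import numpy as np

rng = np.random.default_rng(0)

W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W1[3], b1[3] = W1[2], b1[2]  # make hidden neurons 2 and 3 exact duplicates
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

def forward(x, W1, b1, W2, b2):
    h = np.maximum(0, W1 @ x + b1)  # ReLU hidden layer
    return W2 @ h + b2

def merge(W1, b1, W2, keep, drop):
    # Fold neuron `drop` into its duplicate `keep` (assumes drop > keep):
    # delete its row, then add its outgoing weights onto the kept column.
    W2m = np.delete(W2, drop, axis=1)
    W2m[:, keep] += W2[:, drop]  # exact because the activations are equal
    return np.delete(W1, drop, axis=0), np.delete(b1, drop), W2m

x = rng.normal(size=3)
W1m, b1m, W2m = merge(W1, b1, W2, keep=2, drop=3)
assert np.allclose(forward(x, W1, b1, W2, b2),
                   forward(x, W1m, b1m, W2m, b2))

In practice neurons are rarely exact duplicates, which is why shrinking a real model only approximates the original function.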