DNN Improvements
Directions
Data Encoding
Positional Encoding
Model
- Weight initialization: torch.nn layers are randomly initialized by default ✓
The nn.init package provides convenient initialization methods, e.g.:
nn.init.kaiming_normal_(self.fc.weight)
- Batch Normalization ✓
- Layer Normalization
- Dropout (initialization, normalization, and dropout are combined in the sketch after this list)
- Knowledge Distillation: compress an over-parametrized teacher into a smaller student (see the distillation sketch after this list)
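
A minimal sketch of how the model-side items could fit together in one module, assuming a simple fully connected regression network; the layer sizes, dropout rate, and placement of the norms are illustrative assumptions, not taken from these notes:

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Illustrative MLP combining Kaiming init, BatchNorm, LayerNorm, and Dropout."""

    def __init__(self, in_dim=128, hidden_dim=256, out_dim=1, p_drop=0.1):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden_dim)
        self.bn1 = nn.BatchNorm1d(hidden_dim)   # Batch Normalization
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.ln2 = nn.LayerNorm(hidden_dim)     # Layer Normalization
        self.drop = nn.Dropout(p_drop)          # Dropout
        self.fc = nn.Linear(hidden_dim, out_dim)

        # torch.nn already randomizes weights; nn.init lets us override the scheme.
        nn.init.kaiming_normal_(self.fc1.weight, nonlinearity="relu")
        nn.init.kaiming_normal_(self.fc2.weight, nonlinearity="relu")
        nn.init.kaiming_normal_(self.fc.weight)

    def forward(self, x):
        x = torch.relu(self.bn1(self.fc1(x)))
        x = torch.relu(self.ln2(self.fc2(x)))
        x = self.drop(x)
        return self.fc(x)
```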
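For the knowledge-distillation item, one common formulation (an assumption here, since the notes do not spell it out) blends the ordinary task loss with a term pulling the student toward the frozen teacher's predictions; for a regression setup with MSE, the weighting alpha below is a hypothetical default:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_out, teacher_out, target, alpha=0.5):
    """Blend the MSE task loss with an MSE term toward the teacher's outputs.
    alpha is a hypothetical weighting, not taken from the notes."""
    task_loss = F.mse_loss(student_out, target)
    distill_loss = F.mse_loss(student_out, teacher_out.detach())
    return alpha * task_loss + (1.0 - alpha) * distill_loss
```

The over-parametrized network plays the teacher and stays frozen; a smaller student is trained with this blended loss.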
Loss Function
- MSE ✓
- Conservative
- L2 regularization (weight_decay in Adam; see the sketch after this list)
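
A minimal sketch of the loss-function items, assuming a stand-in linear model and Adam as the optimizer; the learning rate, decay strength, and batch shapes are placeholders:

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 1)          # stand-in model; any nn.Module works
criterion = nn.MSELoss()           # MSE loss
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-3,                       # placeholder learning rate
    weight_decay=1e-4,             # L2 regularization strength (placeholder)
)

x = torch.randn(32, 128)           # dummy batch: 32 samples, 128 features
y = torch.randn(32, 1)

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```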
Activation Function
Accelerator
Alleviate the CPU data-reading bottleneck
- Interleave CPU I/O with GPU mini-batch training by loading data in background threads/processes (see the sketch below).
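
One common way to get this overlap in PyTorch is DataLoader's background workers plus pinned memory and asynchronous host-to-device copies; note that DataLoader workers are separate processes rather than threads, but the effect (batch N+1 is read and prepared while batch N trains on the GPU) is the same. The dataset and sizes below are placeholders:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder in-memory dataset; in practice this is whatever Dataset reads from disk.
dataset = TensorDataset(torch.randn(10_000, 128), torch.randn(10_000, 1))

loader = DataLoader(
    dataset,
    batch_size=256,
    shuffle=True,
    num_workers=4,            # background workers read/decode upcoming batches
    pin_memory=True,          # page-locked host memory enables async GPU copies
    prefetch_factor=2,        # each worker keeps 2 batches ready ahead of time
    persistent_workers=True,  # keep workers alive across epochs
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
for x, y in loader:
    # non_blocking=True overlaps the copy with compute when pin_memory is set
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    # ... forward / backward / optimizer step on the GPU ...
```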