Machine Learning Guidance
A collection of my Machine Learning and Data Science projects,
- including both theoretical and practical (applied) ML,
- with references (papers, ebooks, repos, tools, etc.), ranging from beginner to advanced.
Generative vs Discriminative Model
Given the training data set $D = \{(x_i, y_i) \mid 1 \le i \le N,\ N \in \mathbb{Z}\}$, where $y_i$ is the corresponding output for the input $x_i$.
| Aspect \ Model | Generative | Discriminative |
| Learning objective | Joint probability $P(x, y)$ | Conditional probability $P(y \mid x)$ |
| Formulation | Class prior $P(y)$ and class-conditional $P(x \mid y)$ | Conditional likelihood $P(y \mid x)$ |
| Result | Indirect: $P(y \mid x)$ obtained via Bayes' rule | Direct classification |
| Examples | Naive Bayes, HMM | Logistic Regression, SVM, DNN |
Reference: Generative and Discriminative Model, Professor Andrew Ng
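As a minimal sketch of the two routes to $P(y \mid x)$, the snippet below estimates it on a small made-up data set, once through the class prior and class-conditional (generative) and once by direct counting (discriminative). The data `D` and both function names are assumptions chosen for illustration; with a single discrete feature and full count tables the two estimates coincide.

```python
from collections import Counter

# Toy data set D = {(x_i, y_i)}: binary feature x, binary label y (made-up values).
D = [(0, 0), (0, 0), (0, 1), (1, 0), (1, 1), (1, 1), (1, 1), (0, 0)]

def generative_posterior(data, x, y):
    """P(y|x) obtained indirectly: estimate P(y) and P(x|y), then apply Bayes' rule."""
    n = len(data)
    label_counts = Counter(label for _, label in data)
    joint_counts = Counter(data)

    def joint(xv, yv):
        prior = label_counts[yv] / n                            # P(y)
        likelihood = joint_counts[(xv, yv)] / label_counts[yv]  # P(x|y)
        return prior * likelihood                               # P(x, y)

    evidence = sum(joint(x, label) for label in label_counts)   # P(x)
    return joint(x, y) / evidence

def discriminative_posterior(data, x, y):
    """P(y|x) estimated directly from the labels observed for this x."""
    labels_for_x = [label for xv, label in data if xv == x]
    return labels_for_x.count(y) / len(labels_for_x)

print(generative_posterior(D, 1, 1))      # 0.75
print(discriminative_posterior(D, 1, 1))  # 0.75
```

In richer settings the two approaches usually differ, because the generative model commits to an assumed form for $P(x \mid y)$ (for example, feature independence in Naive Bayes).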
Types of Learning
- Learner $L(X_I) = h \in H$
  - Input training data $X_I$, where $x_i \in \mathbb{R}^n$
  - Hypothesis $h_\omega : X \subseteq \mathbb{R}^n \to Y$, with weights $\omega$
    - maps attribute vectors $X$ to labels/outputs $Y = \{y_1, \dots, y_n\}$
    - For a NN, $h(x) = f(\omega; x)$, explicitly parameterized by $\omega$
    - For a generative model, $f : Z \to X$, where $Z$ is the latent variable
 
 
| Output \ Type | Unsupervised | Supervised |
| Continuous $Y = \mathbb{R}$ | Clustering & Dim Reduction | Regression |
|  | ○ SVD | ○ Linear / Polynomial |
|  | ○ PCA | ○ Non-Linear Regression |
|  | ○ K-means | ○ Decision Trees |
|  | ○ GAN ○ VAE ○ Diffusion | ○ Random Forest |
| Discrete $Y = \{\text{Categories}\}$ | Association / Feature Analysis | Classification |
|  | ○ Apriori | ○ Bayesian ○ SVM |
|  | ○ FP-Growth | ○ Logistic Regression ○ Perceptron |
|  | ○ HMM | ○ kNN / Trees |
And more,
| Aspect \ Type | Semi-Supervised | Reinforcement | 
| Learn from | Labels available | Rewards | 
| Methods | pseudo-labels | ○ Q learning | 
|  | iteratively | ○ Markov Decision Process | 
Reinforcement Learning
- At each time step, the agent is in some state
  - when it performs an action, it moves to a new state and receives a reward
  - it has no advance knowledge of how actions affect either the new state or the reward
 
Goal
- Value-based $V(s)$
  - the long-term return the agent expects from the current state under policy $\pi$
- Policy-based
  - learn which action to take in every state so as to gain maximum reward in the future
  - Deterministic: for any state, the policy $\pi$ always produces the same action
  - Stochastic: every action is chosen with a certain probability
- Model-based
  - build a virtual model of each environment
  - the agent learns to perform in that specific environment
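The state, action, and reward loop described above can be sketched with tabular Q-learning on a tiny chain environment. Everything here is an assumption made for illustration: the five-state chain, the `step` function, and the hyperparameter values. The agent never reads the dynamics inside `step`; it only observes the resulting state and reward.

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal (hypothetical toy chain)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right

def step(state, action):
    """Environment dynamics, unknown to the agent: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore occasionally, otherwise act greedily
            # (ties between equally good actions are broken at random).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()
# Greedy (deterministic) policy read off the learned values; V(s) = max_a Q(s, a).
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

On this chain the learned greedy policy should choose "right" (action 1) in every non-terminal state, since that is the only way to reach the reward.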
 
Geometric Deep Learning
The bigger picture of learning with invariances and symmetries:
| Domain | Structure | Symmetry / Bias | Example | 
| Images | 2D grid | Translation equivariant | CNNs | 
| Sequences | 1D sequence | Order-aware | RNNs, Transformers | 
| Sets / Point Clouds | Unordered set | Permutation invariant | Deep Sets, PointNet | 
| Graphs | Nodes + edges | Permutation equivariant | GNNs, Graph Isomorphism Networks | 
| Manifolds / Spheres | 2D surface embedded in 3D | Rotation equivariant | Spherical CNNs | 
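Two of the symmetries in the table can be checked numerically. The sketch below uses a circular 1D convolution as a stand-in for a CNN layer and a sum-pooled readout in the style of Deep Sets; the helper names and the element-wise encoder are hypothetical, chosen only to make the properties visible.

```python
def conv1d(signal, kernel):
    """Circular 1D convolution (cross-correlation), a stand-in for a CNN layer."""
    n, k = len(signal), len(kernel)
    return [sum(kernel[j] * signal[(i + j) % n] for j in range(k)) for i in range(n)]

def shift(signal, offset):
    """Circularly translate a signal by `offset` positions."""
    n = len(signal)
    return [signal[(i - offset) % n] for i in range(n)]

def deep_set(points):
    """Deep Sets-style readout: per-element feature map, then sum pooling."""
    phi = [(p, p * p) for p in points]           # hypothetical element-wise encoder
    return tuple(sum(col) for col in zip(*phi))  # sum pooling ignores order

signal = [1, 4, 2, 0, 3]
kernel = [1, -1, 2]

# Translation equivariance: convolving a shifted input equals shifting the output.
assert conv1d(shift(signal, 2), kernel) == shift(conv1d(signal, kernel), 2)

# Permutation invariance: reordering the set leaves the pooled output unchanged.
assert deep_set([3, 1, 2]) == deep_set([2, 3, 1])
```

Equivariance means the output transforms along with the input; invariance means the output does not change at all.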
Feature Engineering
- Feature Selection
  - After fitting, plot residuals vs any predictor variable
  - Look for linearly dependent feature vectors
- Imputation
- Handling Outliers
  - Removal, replacing values, capping, discretization
- Encoding
  - Integer Encoding
  - One-Hot Encoding (enum -> binary)
- Scaling
  - Normalization (min-max scaling to 0-1)
  - Standardization
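A minimal sketch of the capping, encoding, and scaling steps listed above, using only the Python standard library. The function names and sample values are illustrative assumptions; `min_max` and `standardize` assume the column has at least two distinct values.

```python
from statistics import mean, pstdev

def cap(values, low, high):
    """Cap outliers into the range [low, high]."""
    return [min(max(v, low), high) for v in values]

def min_max(values):
    """Normalization: rescale linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Standardization: zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def integer_encode(categories):
    """Integer encoding: map each category to an index, in sorted order."""
    index = {c: i for i, c in enumerate(sorted(set(categories)))}
    return [index[c] for c in categories], index

def one_hot(categories):
    """One-hot encoding: one binary column per category."""
    codes, index = integer_encode(categories)
    return [[1 if code == j else 0 for j in range(len(index))] for code in codes]

ages = [20, 30, 40, 200]                 # made-up column with one outlier
capped = cap(ages, 0, 100)               # -> [20, 30, 40, 100]
print(min_max(capped))                   # -> [0.0, 0.125, 0.25, 1.0]
print(one_hot(["red", "green", "red"]))  # -> [[0, 1], [1, 0], [0, 1]]
```

Integer encoding implies an ordering between categories, so one-hot encoding is usually the safer choice for unordered ones.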
 
Inference
| Aspect | Bayesianism | Frequentism |
| Interpretation of Probability | A measure of belief or uncertainty | The limit of relative frequencies in repeated experiments |
| Methods | Combines prior knowledge with data, updating beliefs via Bayes' rule to obtain posterior distributions | Hypothesis testing, MLE, confidence intervals |
| Treatment of Uncertainty (Random Variables) | Parameters | Data set |
| Handling of Data | Useful when prior information is available or when the focus is on prediction intervals | Often requires larger sample sizes |
| Flexibility | Flexible models that can be updated as new data arrives | More rigid, tied to specific statistical methods |
| Computational Complexity | Can be computationally intensive, especially for models with high-dimensional parameter spaces | Simpler computation, often more straightforward in practice |
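The contrast in the table can be made concrete with a coin-flip example. This is a sketch under stated assumptions: the data (three heads in three flips) is made up, and the uniform Beta(1, 1) prior is one choice among many.

```python
def mle(heads, flips):
    """Frequentist point estimate: maximum-likelihood estimate of P(heads)."""
    return heads / flips

def beta_posterior(heads, flips, prior_a=1.0, prior_b=1.0):
    """Bayesian update: Beta(prior_a, prior_b) prior + binomial data -> Beta posterior."""
    return prior_a + heads, prior_b + (flips - heads)

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

heads, flips = 3, 3                  # made-up data: three heads in three flips
print(mle(heads, flips))             # 1.0: the MLE claims tails never occurs
a, b = beta_posterior(heads, flips)  # uniform Beta(1, 1) prior (an assumption)
print(beta_mean(a, b))               # 0.8: the prior tempers the small sample
```

With more flips the two estimates move closer together, which is one reading of the "larger sample sizes" entry in the table.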
Empiricism
Applied ML Best Practice
DNN Troubleshooting
Basic
- Initial test set + a single metric to improve
- Target performance
  - Human-level performance, published results, previous baselines, etc.
 
Intuition
- Results can be sensitive to small changes in hyperparameters and dataset makeup.
                          Tune hyperparameter
                                  |
Start simple -> Implement & Debug -> Evaluate -> ?
                                  |
                         Improve model & Data
- Start simple: use the simplest model and data possible (e.g. LeNet on a subset of the data)
- Implement & Debug: once the model runs, overfit a single batch and reproduce a known result
- Evaluate: apply the bias-variance decomposition
- Tuning: coarse-to-fine random search
- Improve model/data
  - Make the model bigger if it underfits
  - Add data or regularize if it overfits
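The coarse-to-fine random search mentioned above can be sketched as two rounds of sampling on a log scale. The `validation_loss` function is a hypothetical stand-in for training a model and measuring validation loss; its optimum at a learning rate of 1e-3, the search ranges, and the trial counts are all assumptions for illustration.

```python
import random

def validation_loss(log_lr):
    """Hypothetical stand-in for training a model and returning validation loss.
    It is minimized at a learning rate of 1e-3 (log10 = -3)."""
    return (log_lr + 3.0) ** 2

def random_search(low, high, trials, rng):
    """Sample log10(learning rate) uniformly in [low, high]; keep the best trial."""
    samples = [rng.uniform(low, high) for _ in range(trials)]
    return min(samples, key=validation_loss)

rng = random.Random(0)
# Coarse pass: wide range on a log scale.
coarse_best = random_search(-6.0, 0.0, trials=20, rng=rng)
# Fine pass: narrow range centred on the coarse winner.
fine_candidate = random_search(coarse_best - 0.5, coarse_best + 0.5, trials=20, rng=rng)
best = min(coarse_best, fine_candidate, key=validation_loss)
print(f"best learning rate ~ {10 ** best:.2e}")
```

Sampling the exponent rather than the learning rate itself spreads trials evenly across orders of magnitude, which is the usual reason to search on a log scale.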
 
Troubleshooting
OpenAI Talk
DNN improvements
Improvement direction
My Projects
Machine Learning Real World Data, University of Cambridge IA
- Text Classification; Naive Bayes; Cross-Validation,
- HMM; Social Network
Theoretical Machine Learning with Problem Sets, Stanford CS229
- Linear classifiers (Logistic Regression, GDA), SVM, etc
- Stochastic Gradient Descent; L1 L2 Regularization
Deep Learning for Computer Vision with Problem Sets, Stanford CS231n
- Image Classification + Localization (x, y, w, h)
  [ Supervised Learning, Discrete label + Regression ]
  - kNN; Softmax classifier; SVM classifier; CNN
 
- Object Detection
- Semantic / Instance Segmentation
- Image Captioning
- RNN, Attention, Transformer
- Positional Encoding
 
- Video understanding
- Generative model (GAN, VAE)
- Self-Supervised Learning
See more: Visual Computing
More
- Data Science | University of Cambridge, undergraduate course.
- AI | University of Cambridge, IB
  - Search, Games, CSPs, Knowledge Representation and Reasoning, Planning, NN.
- Machine Learning and Bayesian Inference | University of Cambridge, undergraduate course.
  - Linear classifiers (SVM), Unsupervised learning (K-means, EM), Bayesian networks
 
- Geometric Deep Learning | Cambridge, Oxford Master's courses.
Reference
📝OpenAI cookbook
Generative Pre-trained Transformer (GPT) from Scratch (Andrej Karpathy)
Paper
Library
- NumPy, Matplotlib, pandas, TensorFlow
- Caffe, Keras
- XGBoost, gensim