Machine Learning Guidance
A collection of my Machine Learning and Data Science projects,
- covering both theoretical and practical (applied) ML,
- with references (papers, ebooks, repos, tools, etc.), ranging from beginner to advanced.
Generative vs Discriminative Model
Given the training data set D = {(x_i, y_i) ∣ 1 ≤ i ≤ N}, where y_i is the corresponding output for the input x_i.
| Aspect \ Model | Generative | Discriminative |
| --- | --- | --- |
| Learning objective | Joint probability P(x, y) | Conditional probability P(y ∣ x) |
| Formulation | Class prior P(y) and class-conditional likelihood P(x ∣ y) | Posterior P(y ∣ x), modeled directly |
| Result | P(y ∣ x) obtained indirectly via Bayes' rule | Direct classification |
| Examples | Naive Bayes, HMM | Logistic Regression, SVM, DNN |
Reference: Generative and Discriminative Models, Professor Andrew Ng
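As a minimal sketch of the contrast, assuming scikit-learn is available: GaussianNB fits the class prior P(y) and class-conditional P(x ∣ y) and recovers P(y ∣ x) via Bayes' rule, while LogisticRegression fits P(y ∣ x) directly (the synthetic data and settings below are illustrative).

```python
# Generative vs discriminative classifier on the same synthetic data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB            # generative: models P(y), P(x|y)
from sklearn.linear_model import LogisticRegression   # discriminative: models P(y|x)

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

gen = GaussianNB().fit(X_tr, y_tr)                        # P(y|x) via Bayes' rule
disc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)  # P(y|x) fit directly

print("Generative (GaussianNB):", gen.score(X_te, y_te))
print("Discriminative (LogisticRegression):", disc.score(X_te, y_te))
```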
Types of Learning
- Learner L(X) = h ∈ H
  - Input: training data X, where x_i ∈ R^n
- Hypothesis h_ω : X ⊆ R^n → Y, with weights ω
  - maps attribute vectors X to labels/outputs Y = {y_1, ..., y_n}
  - For a NN, h(x) = f(ω; x), explicitly parameterized by ω
  - For a generative model, f : Z → X, where Z is the latent variable
| Output \ Type | Unsupervised | Supervised |
| --- | --- | --- |
| Continuous Y = R | Clustering & Dimensionality Reduction<br>○ SVD<br>○ PCA<br>○ K-means<br>○ GAN ○ VAE ○ Diffusion | Regression<br>○ Linear / Polynomial<br>○ Non-Linear Regression<br>○ Decision Trees<br>○ Random Forest |
| Discrete Y = {Categories} | Association / Feature Analysis<br>○ Apriori<br>○ FP-Growth<br>○ HMM | Classification<br>○ Bayesian ○ SVM<br>○ Logistic Regression ○ Perceptron<br>○ kNN / Trees |
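A tiny sketch of the continuous-output row above, assuming scikit-learn: K-means clusters unlabeled points, while linear regression learns a labeled mapping (the toy data is an illustrative assumption).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Unsupervised: cluster unlabeled points (no y is given)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print("cluster sizes:", np.bincount(labels))

# Supervised regression: learn the mapping x -> y from labeled pairs
x = rng.uniform(0, 10, (100, 1))
y = 3.0 * x.ravel() + rng.normal(0, 0.5, 100)
print("recovered slope (~3):", LinearRegression().fit(x, y).coef_[0])
```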
And more:

| Aspect \ Type | Semi-Supervised | Reinforcement |
| --- | --- | --- |
| Learns from | Partially labelled data | Rewards |
| Methods | ○ Pseudo-labelling, applied iteratively | ○ Q-learning (sketched below)<br>○ Markov Decision Process |
Reinforcement Learning
- The agent is in a state at each timestep
- When an action is performed, the agent moves to a new state and receives a reward
- There is no knowledge in advance of how actions affect either the new state or the reward
Goal
- Value-based V(s)
  - the agent learns the expected long-term return of the current state under policy π
- Policy-based
  - the policy is learned directly, so the action performed in every state helps to gain maximum reward in the future
  - Deterministic: for any state, the same action is produced by the policy π
  - Stochastic: every action has a certain probability
- Model-based
  - build a virtual model of the environment
  - the agent learns to perform in that specific environment
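A minimal tabular Q-learning sketch, as referenced in the table above; the chain environment and all hyperparameters are illustrative assumptions, not from the source.

```python
import random

# Toy chain world: states 0..4, actions 0 (left) / 1 (right);
# reaching state 4 gives reward 1 and ends the episode.
N_STATES, ACTIONS = 5, (0, 1)
alpha, gamma, eps = 0.1, 0.9, 0.1   # learning rate, discount, exploration rate
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    reward = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, reward, s2 == N_STATES - 1

random.seed(0)
for episode in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy: explore with probability eps, else act greedily
        a = random.choice(ACTIONS) if random.random() < eps else max(ACTIONS, key=lambda i: Q[s][i])
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

print(Q)  # the "right" action should dominate in every state
```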
Geometric Deep Learning
The bigger picture of learning with invariances and symmetries:
| Domain | Structure | Symmetry / Bias | Example |
| --- | --- | --- | --- |
| Images | 2D grid | Translation equivariant | CNNs |
| Sequences | 1D sequence | Order-aware | RNNs, Transformers |
| Sets / Point Clouds | Unordered set | Permutation invariant | Deep Sets, PointNet |
| Graphs | Nodes + edges | Permutation equivariant | GNNs, Graph Isomorphism Networks |
| Manifolds / Spheres | 2D surface embedded in 3D | Rotation equivariant | Spherical CNNs |
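A minimal NumPy sketch of the permutation-invariance row (Deep Sets style); the weights and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(3, 8))   # per-element encoder weights (illustrative)
W_rho = rng.normal(size=(8, 1))   # readout weights applied after pooling

def deep_set(X):
    """Deep Sets style: f(X) = rho(sum_i phi(x_i)).
    Summing over elements makes the output permutation invariant."""
    H = np.tanh(X @ W_phi)                 # phi: encode each element independently
    return np.tanh(H.sum(axis=0) @ W_rho)  # rho: applied after invariant pooling

X = rng.normal(size=(5, 3))        # a set of 5 points in R^3
perm = rng.permutation(5)
print(np.allclose(deep_set(X), deep_set(X[perm])))  # True: order does not matter
```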
Feature Engineering
- Feature Selection
  - After fitting, plot residuals vs. any predictor variable
  - Check for linearly-dependent feature vectors
- Imputation
- Handling Outliers
  - Removal, replacing values, capping, discretization
- Encoding
  - Integer Encoding
  - One-Hot Encoding (enum -> binary)
- Scaling
  - Normalization: min-max, to [0, 1]
  - Standardization: zero mean, unit variance
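A small sketch of the encoding and scaling steps, assuming scikit-learn >= 1.2 (for the sparse_output argument); the toy columns are illustrative.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, MinMaxScaler, StandardScaler

colors = np.array([["red"], ["green"], ["blue"], ["green"]])
print(OneHotEncoder(sparse_output=False).fit_transform(colors))  # enum -> binary columns

x = np.array([[1.0], [5.0], [10.0]])
print(MinMaxScaler().fit_transform(x))    # normalization: rescale to [0, 1]
print(StandardScaler().fit_transform(x))  # standardization: zero mean, unit variance
```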
Inference
| Aspect | Bayesianism | Frequentism |
| --- | --- | --- |
| Interpretation of probability | A measure of belief or uncertainty | The limit of relative frequencies in repeated experiments |
| Methods | Prior knowledge plus belief updates (Bayes' rule) to obtain posterior distributions | Hypothesis testing, MLE, confidence intervals |
| What is treated as random | Parameters | Data set |
| Handling of data | Useful when prior information is available or when the focus is on prediction intervals | Often requires larger sample sizes |
| Flexibility | Flexible modeling; models can be updated as new data arrives | More rigid; relies on specific statistical methods |
| Computational complexity | Can be computationally intensive, e.g. for models with high-dimensional parameter spaces | Simpler computation; often more straightforward in practice |
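A coin-flip sketch of the contrast, assuming SciPy: the frequentist MLE is a point estimate of the bias, while the Bayesian posterior (Beta-Bernoulli conjugacy, with an assumed Beta(2, 2) prior) is a full distribution over it.

```python
from scipy import stats

heads, n = 7, 10                  # observed data: 7 heads in 10 flips

# Frequentist: maximum-likelihood point estimate of the coin's bias
print(f"MLE: {heads / n:.2f}")

# Bayesian: Beta(2, 2) prior (an assumption) conjugate-updated to a Beta posterior
a0, b0 = 2, 2
posterior = stats.beta(a0 + heads, b0 + (n - heads))
print(f"Posterior mean: {posterior.mean():.2f}")
print(f"95% credible interval: {posterior.interval(0.95)}")
```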
Empiricism
Applied ML Best Practice
DNN Troubleshooting
Basic
- Initial test set + a single metric to improve
- Target performance
  - Human-level performance, published results, previous baselines, etc.
Intuition
- Results can be sensitive to small changes in hyperparameters and dataset makeup.
The troubleshooting loop:

Start simple -> Implement & Debug -> Evaluate -> met the target?
If not, tune hyperparameters or improve model & data, then re-evaluate.
- Start simple: the simplest model & data possible (e.g. LeNet on a subset of the data)
- Implement & Debug: once the model runs, overfit a single batch & reproduce a known result (see the sketch after this list)
- Evaluate: apply the bias-variance decomposition
- Tune: coarse-to-fine random search over hyperparameters
- Improve model/data
  - Make the model bigger if it underfits
  - Add data or regularize if it overfits
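A minimal PyTorch sketch of the overfit-a-single-batch debugging step referenced above; the model, shapes, and hyperparameters are illustrative assumptions. If the loss does not drive toward zero on one fixed batch, suspect a bug.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))  # toy model
x, y = torch.randn(32, 20), torch.randint(0, 2, (32,))                 # one fixed batch
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

# Should be near zero; otherwise check the model/loss/optimizer wiring.
print(f"final loss on the single batch: {loss.item():.4f}")
```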
Figures: DNN troubleshooting decision chart (OpenAI talk); DNN improvement directions.
My Projects
Machine Learning and Real-world Data, University of Cambridge IA
- Text Classification; Naive Bayes; Cross-Validation
- HMM; Social Networks
Theoretical Machine Learning with Problem Sets, Stanford CS229
- Linear classifiers (Logistic Regression, GDA), SVM, etc.
- Stochastic Gradient Descent; L1/L2 Regularization
Deep Learning for Computer Vision with Problem Sets, Stanford CS231n
- Image Classification + Localization (x, y, w, h)
[ Supervised Learning, Discrete label + Regression ]
- kNN; Softmax classifier; SVM classifier; CNN
- Object Detection
- Semantic / Instance Segmentation
- Image Captioning
- RNN, Attention, Transformer
- Positional Encoding
- Video understanding
- Generative model (GAN, VAE)
- Self-Supervised Learning
See more: Visual Computing
More
- Data Science | Uni. of Cambridge, Undergraduate course.
- AI | Uni. of Cambridge, IB
  - Search, Games, CSPs, Knowledge Representation and Reasoning, Planning, NN.
- Machine Learning and Bayesian Inference | Uni. of Cambridge, Undergraduate course.
  - Linear classifiers (SVM), Unsupervised learning (K-means, EM), Bayesian networks
- Geometric Deep Learning | Cambridge & Oxford Master's courses.
Reference
📝 OpenAI Cookbook
Generative Pre-trained Transformer (GPT) from Scratch (Andrej Karpathy)
Paper
Library
- NumPy, Matplotlib, pandas, TensorFlow
- Caffe, Keras
- XGBoost, gensim