Machine Learning Guidance
A collection of my Machine Learning and Data Science Projects, including both Theoretical and Practical (Applied) ML.
In addition, references (papers, ebooks, repos, tools, etc.) that are interesting and helpful are attached, ranging from beginner to advanced.
Methods
Data Modelling & Prediction
Generative vs Discriminative Model
Given the training data set D = {(x_i, y_i) ∣ i ≤ N, N ∈ Z}, where y_i is the output corresponding to the input x_i, the two model families differ as follows.
| Aspect \ Model | Generative | Discriminative |
| --- | --- | --- |
| Learning objective | Joint probability P(x,y) | Conditional probability P(y∣x) |
| Formulation | Class prior P(y) and class-conditional P(x∣y) | Conditional likelihood P(y∣x) |
| Result | Not direct: P(y∣x) is obtained via Bayes' rule | Direct classification |
| Examples | Naive Bayes, HMM | Logistic Regression, SVM, DNN |
Reference: Generative and Discriminative Model, Professor Andrew NG
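As a minimal sketch of the generative route, the snippet below estimates P(y) and P(x∣y) by counting and then applies Bayes' rule to recover P(y∣x). The toy data set and the function names `fit_generative` and `posterior` are assumptions made for illustration; they are not taken from the referenced material.

```python
from collections import Counter, defaultdict


def fit_generative(samples):
    """Estimate the class prior P(y) and class-conditional P(x|y) by counting."""
    label_counts = Counter(y for _, y in samples)
    joint_counts = defaultdict(Counter)
    for x, y in samples:
        joint_counts[y][x] += 1
    n = len(samples)
    prior = {y: c / n for y, c in label_counts.items()}
    likelihood = {
        y: {x: c / label_counts[y] for x, c in xs.items()}
        for y, xs in joint_counts.items()
    }
    return prior, likelihood


def posterior(x, prior, likelihood):
    """Bayes' rule: P(y|x) = P(x|y) P(y) / sum over y' of P(x|y') P(y')."""
    joint = {y: likelihood[y].get(x, 0.0) * prior[y] for y in prior}
    evidence = sum(joint.values())
    if evidence == 0.0:
        raise ValueError("x was never observed, so P(x) is estimated as zero")
    return {y: p / evidence for y, p in joint.items()}


# Hypothetical toy data: one discrete feature x and a binary label y.
data = [
    ("sunny", "play"), ("sunny", "play"), ("rainy", "play"),
    ("rainy", "stay"), ("rainy", "stay"), ("sunny", "stay"),
]
prior, likelihood = fit_generative(data)
post = posterior("sunny", prior, likelihood)
```

A discriminative model would skip the two estimation steps and fit P(y∣x) directly, for example with logistic regression.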
Types of Learning
- Learner L(X_I) = h ∈ H
- Input training data X_I, where x_i ∈ R
- Hypothesis h_ω: X ⊆ R^n → Y, with weights ω
- mapping attribute vectors X to labels/outputs Y = {y_1, ..., y_n}
- For NN, h(x) = f(ω; x), explicitly parameterized by ω
- For generative models, f: Z → X, where Z is the latent variable
| Output \ Type | Unsupervised | Supervised |
| --- | --- | --- |
| Continuous Y = R | Clustering & Dim Reduction | Regression |
| | ○ SVD | ○ Linear / Polynomial |
| | ○ PCA | ○ Non-Linear Regression |
| | ○ K-means | ○ Decision Trees |
| | ○ GAN ○ VAE ○ Diffusion | ○ Random Forest |
| Discrete Y = {Categories} | Association / Feature Analysis | Classification |
| | ○ Apriori | ○ Bayesian ○ SVM |
| | ○ FP-Growth | ○ Logistic Regression ○ Perceptron |
| | ○ HMM | ○ kNN / Trees |
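To make the supervised/unsupervised split concrete, here is a small sketch with one method from each column: ordinary least squares for a line (supervised, continuous output) and K-means on scalars (unsupervised). The helper names `fit_line` and `kmeans_1d`, the toy data, and the fixed initial centroids are assumptions chosen to keep the example short.

```python
def fit_line(xs, ys):
    """Supervised: ordinary least squares for y = w * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    w = cov / var
    b = mean_y - w * mean_x
    return w, b


def kmeans_1d(points, centroids, iterations=10):
    """Unsupervised: Lloyd's algorithm on scalars, starting from given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[k]
            for k, c in enumerate(clusters)
        ]
    return centroids


# Labels ys are given, so the line is learned from (x, y) pairs.
w, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])

# No labels here: the structure is found from the points alone.
centers = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], centroids=[1.0, 12.0])
```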
Further types of learning:
| Aspect \ Type | Semi-Supervised | Reinforcement |
| --- | --- | --- |
| Learn from | Labels available | Rewards |
| Methods | Pseudo-labels, iteratively | ○ Q-learning ○ Markov Decision Process |
Reinforcement Learning
- At each timestep, the agent is in some state
- When an action is performed, the agent moves to a new state and receives a reward
- The agent has no advance knowledge of how actions affect either the new state or the reward
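The loop described above can be sketched with tabular Q-learning on a toy chain of five states. Everything specific here is an assumption for illustration: the environment, the reward of 1 at the last state, the values of `alpha` and `gamma`, and the purely random exploration policy (Q-learning is off-policy, so it can still learn the greedy policy from random behaviour).

```python
import random

N_STATES = 5        # states 0..4; state 4 is terminal (hypothetical toy chain)
ACTIONS = (0, 1)    # 0 = move left, 1 = move right


def step(state, action):
    """Environment dynamics, unknown to the learner: (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # random exploration
            next_state, reward, done = step(state, action)
            # Bootstrap from the best action in the next state unless terminal.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy read off the learned table for the non-terminal states.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

The learner only ever sees sampled `(state, action, reward, next_state)` transitions, never the body of `step`.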
Goal
- Value-based V(s)
  - the agent estimates the expected long-term return from the current state under policy π
- Policy-based
  - the policy chooses, in every state, the action that helps gain maximum reward in the future
  - Deterministic: for any state, the policy π always produces the same action
  - Stochastic: every action is chosen with a certain probability
- Model-based
  - a virtual model is created for each environment
  - the agent learns to perform in that specific environment
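As a rough illustration of these terms, the sketch below computes the discounted long-term return of one trajectory and contrasts a deterministic policy with a stochastic one. The discount factor `gamma`, the state and action names, and the probabilities are all assumptions made for the example.

```python
import random


def discounted_return(rewards, gamma=0.9):
    """Long-term return G = r_0 + gamma * r_1 + gamma^2 * r_2 + ... of one trajectory."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Deterministic policy: each state maps to exactly one action.
deterministic_policy = {"s0": "right", "s1": "right"}

# Stochastic policy: each state maps to a probability distribution over actions.
stochastic_policy = {
    "s0": {"left": 0.2, "right": 0.8},
    "s1": {"left": 0.5, "right": 0.5},
}


def sample_action(policy, state, rng):
    """Draw one action according to the policy's probabilities for this state."""
    actions, probs = zip(*policy[state].items())
    return rng.choices(actions, weights=probs, k=1)[0]


g = discounted_return([0.0, 0.0, 1.0])  # only the third reward is non-zero
action = sample_action(stochastic_policy, "s0", random.Random(0))
```

V(s) is the expectation of such returns over trajectories that start in s and follow π; a single trajectory gives only one sample of it.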
Feature Engineering
- Feature Selection
  - After fitting, plot Residuals vs any Predictor Variable
  - Linearly-dependent feature vectors
- Imputation
- Handling Outliers
  - Removal, Replacing values, Capping, Discretization
- Encoding
  - Integer Encoding
  - One-Hot Encoding (enum -> binary)
- Scaling
  - Normalization, min-max / 0-1
  - Standardization
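Several of the steps listed above fit in a few lines each. The following is a sketch in plain Python; the function names are hypothetical, mean imputation is just one possible imputation strategy, and in practice pandas or a preprocessing library would usually do this work.

```python
import math


def impute_mean(values):
    """Imputation: replace missing entries (None) with the mean of the present ones."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def cap(values, lower, upper):
    """Outlier capping: clip every value into [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def integer_encode(categories):
    """Integer encoding: map each distinct category to an index (sorted order)."""
    index = {c: i for i, c in enumerate(sorted(set(categories)))}
    return [index[c] for c in categories], index


def one_hot(categories):
    """One-hot encoding: one binary column per distinct category."""
    codes, index = integer_encode(categories)
    return [[1 if code == j else 0 for j in range(len(index))] for code in codes]


def min_max(values):
    """Normalization to [0, 1]; assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Standardization to zero mean and unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


filled = impute_mean([1.0, None, 3.0])
capped = cap([1, 50, 1000], lower=0, upper=100)
encoded = one_hot(["red", "green", "red"])
scaled = min_max([2.0, 4.0, 6.0])
z_scores = standardize([1.0, 2.0, 3.0])
```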
Inference
| Aspect | Bayesianism | Frequentism |
| --- | --- | --- |
| Interpretation of probability | A measure of belief or uncertainty | The limit of relative frequencies in repeated experiments |
| Methods | Combines prior knowledge with data and updates beliefs (Bayes' rule) to obtain posterior distributions | Hypothesis testing, MLE, confidence intervals |
| Treatment of uncertainty (random variables) | Parameters | Data set |
| Handling of data | Useful when prior information is available or when the focus is on prediction intervals | Often requires larger sample sizes |
| Flexibility | Flexible models that can be updated as new data arrive | More rigid, relying on specific statistical methods |
| Computational complexity | Can be computationally intensive, especially for models with high-dimensional parameter spaces | Simpler computation; may be more straightforward in practice |
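A coin-flip example shows the two views side by side. This is a sketch under stated assumptions: the counts are made up, and the Bayesian side uses a Beta prior with hypothetical parameters `prior_a` and `prior_b`, chosen because the Beta distribution is conjugate to the binomial likelihood.

```python
def mle_estimate(heads, flips):
    """Frequentist point estimate: the maximum-likelihood value of P(heads)."""
    return heads / flips


def posterior_beta(heads, flips, prior_a=1.0, prior_b=1.0):
    """Bayesian update: Beta(a, b) prior + binomial data -> Beta(a + heads, b + tails)."""
    return prior_a + heads, prior_b + (flips - heads)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


heads, flips = 7, 10
theta_mle = mle_estimate(heads, flips)

# Prior belief that the coin is roughly fair, encoded as Beta(2, 2).
a, b = posterior_beta(heads, flips, prior_a=2.0, prior_b=2.0)
theta_post = beta_mean(a, b)

# New data arrive: the old posterior simply becomes the new prior.
a2, b2 = posterior_beta(3, 10, prior_a=a, prior_b=b)
theta_post2 = beta_mean(a2, b2)
```

The posterior mean sits between the prior mean (0.5) and the MLE (0.7), and moves toward the data as more flips are observed.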
Empiricism
Applied ML Best Practice
DNN Troubleshooting
Basic
- Initial test set + a single metric to improve
- Target performance
- Human-level performance, published results, previous baselines, etc.
Intuition
- Results can be sensitive to small changes in hyperparameters and dataset makeup.
Workflow: Start simple -> Implement & Debug -> Evaluate -> Tune hyperparameters or Improve model & data
- Start simple: simplest model & data possible (LeNet on a subset of the data)
- Implement & Debug: Once the model runs, overfit a single batch & reproduce a known result
- Evaluate: Apply the bias-variance decomposition
- Tuning: Coarse-to-fine random search
- Improve model/data
  - Make the model bigger if it underfits
  - Add data or regularize if it overfits
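The coarse-to-fine random search in the tuning step can be sketched as follows. The `validation_loss` function is a stand-in for a real training run and is purely hypothetical, as are the search ranges and trial counts; the point is only the two-stage sampling on a log scale.

```python
import random


def validation_loss(log10_lr):
    """Hypothetical stand-in for training a model; best at a learning rate of 1e-3."""
    return (log10_lr + 3.0) ** 2


def random_search(low, high, trials, rng):
    """Sample exponents uniformly in [low, high]; return (best_loss, best_exponent)."""
    candidates = [rng.uniform(low, high) for _ in range(trials)]
    return min((validation_loss(c), c) for c in candidates)


rng = random.Random(0)

# Coarse stage: a wide range of learning rates, 1e-6 to 1e0, on a log scale.
coarse_loss, coarse_exp = random_search(-6.0, 0.0, trials=20, rng=rng)

# Fine stage: a narrow range centred on the coarse winner.
fine_loss, fine_exp = random_search(coarse_exp - 0.5, coarse_exp + 0.5, trials=20, rng=rng)

# Keep whichever stage produced the lower loss.
best_loss, best_exp = min((coarse_loss, coarse_exp), (fine_loss, fine_exp))
best_lr = 10 ** best_exp
```

Sampling the exponent rather than the learning rate itself spreads trials evenly across orders of magnitude, which is usually what matters for this kind of hyperparameter.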
Troubleshooting
OpenAI Talk
DNN improvements
Improvement direction
My Projects
Foundation of Machine Learning (naive NLP, Network)
Machine Learning Real World Data, University of Cambridge IA
MLRD-Cambridge_IA
- Text Classification
- Naive Bayes
- Cross-Validation, NLP
- HMM
- Social Network
Theoretical Machine Learning
Theoretical Machine Learning with Problem Sets, Stanford CS229
ML-Stanford_CS229
- Basic Concepts
- Linear classifiers (Logistic Regression, GDA)
- Stochastic Gradient Descent
- L1 L2 Regularization
- SVM
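Three of the topics above (logistic regression, stochastic gradient descent, L2 regularization) combine naturally in one small sketch. This is not code from the problem sets; the toy data, the learning rate `lr`, the penalty weight `l2`, and the function name `sgd_logistic` are assumptions for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sgd_logistic(data, epochs=200, lr=0.1, l2=0.01, seed=0):
    """Logistic regression trained one sample at a time, with an L2 penalty on w."""
    rng = random.Random(seed)
    data = list(data)
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit the samples in a fresh random order each epoch
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss with respect to the logit
            # L2 regularization adds l2 * w to the weight gradient (bias is not penalized).
            w = [wi - lr * (err * xi + l2 * wi) for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


# Hypothetical 1-D data, separable around x = 0.
train = [([-2.0], 0), ([-1.0], 0), ([1.0], 1), ([2.0], 1)]
w, b = sgd_logistic(train)
predictions = [int(sigmoid(w[0] * x[0] + b) > 0.5) for x, _ in train]
```

Swapping the `l2 * wi` term for `l1 * sign(wi)` would give the L1 variant, which tends to push small weights to exactly zero.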
Computer Vision
Theoretical Computer Vision with Problem Sets, Stanford CS231n
DL-for-CV-Stanford_CS231n
- Image Classification + Localization (x,y,w,h)
[ Supervised Learning, Discrete label + Regression ]
- kNN
- Softmax classifier
- SVM classifier
- CNN
- Cross Validation
- Object Detection
- Semantic / Instance Segmentation
- Image Captioning
- RNN, Attention, Transformer
- Positional Encoding
- Video understanding
- Generative model (GAN, VAE)
- Self-Supervised Learning
See more: Visual Computing
More
Reference
OpenAI cookbook
📝OpenAI cookbook
Paper
Library Used
NumPy, matplotlib, pandas, TensorFlow
Caffe, Keras
XGBoost, gensim