Course catalog: Machine Learning – Data Science Training
        Course outline:

            Machine Learning – Data Science Training

        Machine Learning introduction
        Types of Machine learning – supervised vs unsupervised learning
        From Statistical learning to Machine learning
        The Data Mining workflow:
        Business understanding
        Data Understanding
        Data preparation
        Modelling
        Evaluation
        Deployment
        Machine learning algorithms
        Choosing an appropriate algorithm for the problem
        Overfitting and bias-variance tradeoff in ML
        ML libraries and programming languages
        Why use a programming language
        Choosing between R and Python
        Python crash course
        Python resources
        Python Libraries for Machine learning
        Jupyter notebooks and interactive coding
        Testing ML algorithms
        Generalization and overfitting
        Avoiding overfitting
        Holdout method
        Cross-Validation
        Bootstrapping
        Evaluating numerical predictions
        Measures of accuracy: ME, MSE, RMSE, MAPE
        Parameter and prediction stability
        Evaluating classification algorithms
        Accuracy and its problems
        The confusion matrix
        Unbalanced classes problem
        Visualizing model performance
        Profit curve
        ROC curve
        Lift curve
        Model selection
        Model tuning – grid search strategies
        Examples in Python
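
As one possible illustration of the accuracy measures listed above (ME, MSE, RMSE, MAPE), here is a minimal sketch in plain Python. The function name `error_measures` and the sample values are made up for illustration and are not course material.

```python
import math

def error_measures(actual, predicted):
    """Return ME, MSE, RMSE and MAPE for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    me = sum(errors) / n                     # mean error (signed, shows bias)
    mse = sum(e * e for e in errors) / n     # mean squared error
    rmse = math.sqrt(mse)                    # same units as the target
    # MAPE is undefined when an actual value is zero;
    # this sketch assumes none of them are.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"ME": me, "MSE": mse, "RMSE": rmse, "MAPE": mape}

# Made-up values, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
scores = error_measures(actual, predicted)
print(scores)
```

Note that ME can be zero even when individual errors are not, because positive and negative errors cancel; that is one reason the squared and absolute measures are reported alongside it.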
        Data preparation
        Data import and storage
        Understand the data – basic explorations
        Data manipulations with pandas library
        Data transformations – Data wrangling
        Exploratory analysis
        Missing observations – detection and solutions
        Outliers – detection and strategies
        Standardization, normalization, binarization
        Qualitative data recoding
        Examples in Python
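
The three rescaling steps named above (standardization, normalization, binarization) could be sketched as follows using only the standard library. The helper names, the `ages` column, and the threshold are hypothetical; in practice a library such as pandas or scikit-learn would normally be used instead.

```python
import statistics

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # A constant column has std == 0; this sketch assumes that is not the case.
    return [(v - mean) / std for v in values]

def min_max_normalize(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def binarize(values, threshold):
    """Map each value to 1 if it exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]

ages = [20, 30, 40, 50, 60]  # made-up column, for illustration only
z = standardize(ages)
scaled = min_max_normalize(ages)
flags = binarize(ages, threshold=35)
print(z, scaled, flags)
```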
        Classification
        Binary vs multiclass classification
        Classification via mathematical functions
        Linear discriminant functions
        Quadratic discriminant functions
        Logistic regression and probability approach
        k-nearest neighbors
        Naïve Bayes
        Decision trees
        CART
        Bagging
        Random Forests
        Boosting
        XGBoost
        Support Vector Machines and kernels
        Maximal Margin Classifier
        Support Vector Machine
        Ensemble learning
        Examples in Python
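
To make one of the classifiers above concrete, the following is a minimal sketch of k-nearest neighbors with Euclidean distance and majority voting. The function `knn_predict` and the toy points are assumptions made for illustration; a real workflow would more likely rely on a library implementation.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Two made-up, well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.2, 1.4)))
print(knn_predict(points, labels, (6.4, 6.6)))
```

An odd `k` avoids tied votes in the binary case; features on very different scales would normally be standardized first, since the distance is sensitive to units.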
        Regression and numerical prediction
        Least squares estimation
        Variable selection techniques
        Regularization and stability – L1, L2
        Nonlinearities and generalized least squares
        Polynomial regression
        Regression splines
        Regression trees
        Examples in Python
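
As a small illustration of least squares estimation and L2 regularization from the topics above, here is a sketch for a single predictor. It assumes the intercept is left unpenalized, in which case the ridge slope has the closed form `sxy / (sxx + penalty)`. The name `fit_line` and the data are made up.

```python
def fit_line(x, y, l2_penalty=0.0):
    """Fit y = intercept + slope * x by least squares.

    With l2_penalty > 0 the slope is shrunk toward zero (ridge regression
    with an unpenalized intercept); l2_penalty = 0 gives ordinary least squares.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / (sxx + l2_penalty)
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Made-up data lying exactly on y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(x, y)
ridge_intercept, ridge_slope = fit_line(x, y, l2_penalty=5.0)
print(intercept, slope, ridge_intercept, ridge_slope)
```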
        Unsupervised learning
        Clustering
        Centroid-based clustering – k-means, k-medoids, PAM, CLARA
        Hierarchical clustering – DIANA, AGNES
        Model-based clustering – EM
        Self-organizing maps
        Cluster evaluation and assessment
        Dimensionality reduction
        Principal component analysis and factor analysis
        Singular value decomposition
        Multidimensional Scaling
        Examples in Python
        Text mining
        Preprocessing data
        The bag-of-words model
        Stemming and lemmatization
        Analyzing word frequencies
        Sentiment analysis
        Creating word clouds
        Examples in Python
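
The bag-of-words model and word-frequency analysis listed above could be sketched like this. The tokenizer is deliberately naive (lowercase ASCII letters and apostrophes only), and the helper names and sample sentences are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(documents):
    """Return a sorted vocabulary and one row of word counts per document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, matrix

docs = ["The cat sat on the mat.", "The dog sat."]  # made-up documents
vocabulary, matrix = bag_of_words(docs)
print(vocabulary)
for row in matrix:
    print(row)
```

Word order is discarded by design: each document is reduced to a vector of counts over a shared vocabulary, which is what downstream models such as Naïve Bayes consume.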
        Recommendation engines and collaborative filtering
        Recommendation data
        User-based collaborative filtering
        Item-based collaborative filtering
        Examples in Python
        Association pattern mining
        Frequent itemset algorithms
        Market basket analysis
        Examples in Python
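
For market basket analysis, the usual rule measures (support, confidence, lift) can be computed directly from a list of transactions. The sketch below is not a frequent itemset mining algorithm such as Apriori; it only scores one given rule. The function names and baskets are made up.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    cons = support(transactions, consequent)
    confidence = both / ante   # estimate of P(consequent | antecedent)
    lift = confidence / cons   # above 1 suggests a positive association
    return {"support": both, "confidence": confidence, "lift": lift}

# Made-up baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
metrics = rule_metrics(baskets, {"bread"}, {"milk"})
print(metrics)
```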
        Outlier Analysis
        Extreme value analysis
        Distance-based outlier detection
        Density-based methods
        High-dimensional outlier detection
        Examples in Python
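
One simple form of extreme value analysis is the interquartile-range (IQR) rule, sketched below with the standard library. The 1.5 multiplier is the conventional default rather than something this outline specifies, and the function name and readings are hypothetical.

```python
import statistics

def iqr_outliers(values, multiplier=1.5):
    """Return the values lying outside [Q1 - m*IQR, Q3 + m*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower = q1 - multiplier * iqr
    upper = q3 + multiplier * iqr
    return [v for v in values if v < lower or v > upper]

readings = [10, 12, 11, 13, 12, 11, 10, 95]  # made-up data with one extreme value
outliers = iqr_outliers(readings)
print(outliers)
```

Quartile definitions differ slightly between libraries, so the exact fences may not match those produced by other tools on the same data.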
        Machine Learning case study
        Business problem understanding
        Data preprocessing
        Algorithm selection and tuning
        Evaluation of findings
        Deployment