Course catalog: Big Data Analytics Training 1

        Course outline:

        Section 1: Simple linear regression

        Fit a simple linear regression between two variables in R; interpret output from R; use models to predict a response variable; validate the assumptions of the model.
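
As an illustration of what fitting a simple linear regression involves, the sketch below computes the least-squares intercept and slope directly from their closed-form expressions. The course itself works in R, where `lm(y ~ x)` is the usual tool; Python is used here only as a language-neutral sketch. The function names and the data are made up for this example.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) that minimise the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-deviations of x and y
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predict the response for a new value of the explanatory variable."""
    return intercept + slope * x_new


# Hypothetical data lying exactly on the line y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_simple_linear_regression(x, y)

# Residuals are the starting point for checking model assumptions
residuals = [yi - predict(intercept, slope, xi) for xi, yi in zip(x, y)]

print(intercept, slope)
print(predict(intercept, slope, 5.0))
```

Because the example data has no noise, every residual is zero up to rounding; with real data, a plot of the residuals against the fitted values is a common first check of the model assumptions.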

        Section 2: Modelling data

        Adapt the simple linear regression model in R to deal with multiple variables; incorporate continuous and categorical variables in the models; select the best-fitting model by inspecting the R output.

        Section 3: Many models

        Manipulate nested data frames in R; use R to apply simultaneous linear models to large data frames by stratifying the data; interpret the output of learner models.

        Section 4: Classification

        Adapt linear models to handle a categorical response variable; implement logistic regression (LR) in R; implement generalised linear models (GLMs) in R; implement linear discriminant analysis (LDA) in R.
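
To make the idea of adapting a linear model to a binary response concrete, here is a minimal sketch of logistic regression with one explanatory variable, fitted by plain gradient descent on the average negative log-likelihood. This is only an illustration: in R the usual route is `glm(y ~ x, family = binomial)`, which uses a different fitting algorithm. The function names, learning rate, iteration count and data are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Map a linear predictor to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(x, y, learning_rate=0.1, n_iter=2000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(n_iter):
        # Gradient of the average negative log-likelihood
        errors = [sigmoid(b0 + b1 * xi) - yi for xi, yi in zip(x, y)]
        grad0 = sum(errors) / n
        grad1 = sum(e * xi for e, xi in zip(errors, x)) / n
        b0 -= learning_rate * grad0
        b1 -= learning_rate * grad1
    return b0, b1


# Hypothetical data: larger x makes class 1 more likely, with some overlap
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic_regression(x, y)
p_low = sigmoid(b0 + b1 * 0.5)   # estimated probability of class 1 at x = 0.5
p_high = sigmoid(b0 + b1 * 4.0)  # estimated probability of class 1 at x = 4.0

print(b0, b1)
print(p_low, p_high)
```

The classes overlap slightly on purpose: with perfectly separable data the maximum-likelihood coefficients are not finite, so the estimates would keep growing with more iterations.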

        Section 5: Prediction using models

        Implement the principles of building a model to do prediction using classification; split data into training and test sets, perform cross-validation and compute model evaluation metrics; use model selection for explaining data with models; analyse overfitting and the bias-variance trade-off in prediction problems.
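
The splitting and cross-validation steps above can be sketched as follows. The example partitions the data into k folds, fits a straight line on the training folds, and scores it by mean squared error on the held-out fold. It is a minimal sketch with hypothetical function names and made-up data, written in Python for illustration rather than in the R used by the course.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(x, y):
    """Least-squares intercept and slope for a single explanatory variable."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def cross_validate(x, y, k=5, seed=0):
    """Return the held-out mean squared error for each of the k folds."""
    scores = []
    for test_idx in k_fold_indices(len(x), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        # Fit on the training folds only
        intercept, slope = fit_line([x[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        # Evaluate on the fold that the model has not seen
        squared_errors = [(y[i] - (intercept + slope * x[i])) ** 2
                          for i in test_idx]
        scores.append(mean(squared_errors))
    return scores


# Hypothetical data: a line with small alternating noise
x = [float(i) for i in range(20)]
y = [3.0 + 0.5 * xi + 0.1 * (-1) ** i for i, xi in enumerate(x)]

folds = k_fold_indices(len(x), 5)
scores = cross_validate(x, y, k=5)

print(scores)
print(mean(scores))
```

Averaging the per-fold scores gives an estimate of the error on unseen data. A training error far below this average is one sign of overfitting.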

        Section 6: Getting bigger

        Set up and apply sparklyr; use logical verbs in R by applying native sparklyr versions of the verbs.

        Section 7: Supervised machine learning with sparklyr

        Apply sparklyr to machine learning regression and classification models; use machine learning models for prediction; illustrate how distributed computing techniques can be used for “bigger” problems.

        Section 8: Deep learning

        Use massive amounts of data to train multi-layer networks for classification; understand some of the guiding principles behind training deep networks, including the use of autoencoders, dropout, regularization, and early termination; use sparklyr and H2O to train deep networks.

        Section 9: Deep learning applications and scaling up

        Understand some of the ways in which massive amounts of unlabelled and partially labelled data are used to train neural network models; leverage existing trained networks for targeting new applications; implement architectures for object classification and object detection and assess their effectiveness.

        Section 10: Bringing it all together

        Consolidate your understanding of the relationships between the methodologies presented in this course, and of their relative strengths, weaknesses and range of applicability.