Statistical Analysis Using SAS

This course covers basic to advanced statistical and analytic techniques, with case studies and live projects that show how they apply in different business areas. The techniques are useful for work in areas of analytics such as marketing, insurance, banking, and risk analytics.

Introduction to Analytics & Types of Analytics

  • Evolution of Analytics
  • Definition of Analytics
  • Scope of analytics in different industries
  • Descriptive Analysis
  • Predictive Analysis
  • Prescriptive Analysis

Parametric Tests

  • Z test
  • T Test
  • Two Independent Sample T Test

The One-Sample T-Test in SAS

  • A manual computation
    • A data vector
    • The functions: mean(), sd(), (pqrd)norm()
    • Finding confidence intervals
    • Finding p-values
    • Issues with data
      • Using data stored in data frames (attach()/detach(), with())
      • Missing values
      • Cleaning up data
  • EDA graphs
    • histogram()
    • boxplot()
    • densityplot() and qqnorm()
  • The t.test() function
  • P-values
  • Confidence intervals
  • The power of a t test
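
The manual computation listed above (mean, standard deviation, a normal quantile, then a confidence interval and a p-value) can be sketched as follows. This is only an illustration in Python rather than SAS or R; the function name `one_sample_summary` and the data values are made up for the example, and it uses the normal approximation suggested by the (pqrd)norm() bullet. A t-based procedure would use the Student t distribution instead, which gives slightly wider intervals for small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_summary(x, mu0, conf_level=0.95):
    """Manual one-sample test of H0: mean == mu0 (normal approximation)."""
    n = len(x)
    xbar = mean(x)                      # sample mean
    se = stdev(x) / sqrt(n)             # standard error of the mean
    stat = (xbar - mu0) / se            # standardized test statistic
    z = NormalDist()                    # standard normal distribution
    crit = z.inv_cdf(1 - (1 - conf_level) / 2)   # two-sided critical value
    ci = (xbar - crit * se, xbar + crit * se)    # confidence interval
    p_value = 2 * (1 - z.cdf(abs(stat)))         # two-sided p-value
    return {"mean": xbar, "se": se, "statistic": stat,
            "ci": ci, "p_value": p_value}


# Hypothetical data vector, invented for this sketch.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 4.7, 5.9, 5.2]
result = one_sample_summary(sample, mu0=5.0)
print(result)
```

Missing values would need to be dropped from the data vector before calling the function, since the sketch does not handle them.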

The Two-Sample T-Tests and the Chi-Square GOF Test in SAS

  • GUIs
    • Rcmdr
    • PMG
  • Tests with two data vectors x, and y
    • Two independent samples, no equal-variance assumption
    • Two independent samples, assuming equal variance
    • Matched samples
    • Data stored using a factor to label one of two groups: x ~ f
    • Boxplots for displaying more than two samples
    • The chisq.test() function
      • Goodness of fit (R square and adjusted R Square)
      • Test of homogeneity or independence

Concept of Analysis of Variance

  • Types of ANOVA
  • One-Way ANOVA
  • Two-Way ANOVA

Association between Variables

  • Chi square Test for Independence
  • Formulate an analysis plan
  • Analyze sample data
  • Interpret result
  • Scatter plot: interpretation of scatter plots
  • Correlation among variables
  • Types of Correlation
  • Partial Correlation

The Simple Linear Regression Model in SAS

  • The basics of the Wilkinson-Rogers notation: y ~ x
  • y ~ x linear regression
  • Scatterplots with regression lines
  • Reading the output of lm()
  • Confidence intervals for beta_0, beta_1
  • Tests on beta_0, beta_1
  • Identifying points in a plot
  • Diagnostic plots
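
As a small illustration of what fitting y ~ x involves, the least-squares estimates of beta_0 and beta_1 can be computed directly from the data. The sketch below is in Python rather than SAS or R; the function name `fit_line` and the data are hypothetical, chosen so that the points lie exactly on y = 1 + 2x.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares estimates (beta0, beta1) for the model y = beta0 + beta1 * x."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)                       # spread of x
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))  # co-variation of x and y
    beta1 = sxy / sxx               # slope
    beta0 = ybar - beta1 * xbar     # intercept
    return beta0, beta1


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
beta0, beta1 = fit_line(x, y)

# Residuals are the raw material for the diagnostic plots.
residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]
print(beta0, beta1, residuals)
```

With real data the residuals would not all be zero, and plotting them against the fitted values is one of the usual diagnostic checks.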

Bootstrapping in SAS, Permutation Tests

  • An introduction to bootstrapping
  • The sample() function
  • A bootstrap sample
  • Forming several bootstrap samples
  • Aside: for loops vs. matrices and speed
  • Using the bootstrap
  • An introduction to permutation tests
  • A permutation test simulation

Cluster Analysis / Segmentation Analysis

Approaches to Cluster Analysis

  • Agglomerative Method
  • Divisive Method

Non-Hierarchical Method: K-Means Clustering

Multiple/ Linear Regression

  • Simple Linear regression
  • Method of Least Squares
  • Multiple linear regression with SAS
  • Simple examples; dummy explanatory variables; interpreting regression coefficients; finding a parsimonious model

Generalized Linear Models With SAS

  • Logistic regression with SAS
  • The need for a different model when the response variable is binary; the logistic transform; fitting the model to some simple examples; deviance residuals
  • Multiple regression and logistic regression as special cases of the generalized linear model
  • The Poisson model for count data
  • The problem of overdispersion

Characterizing Time Series and the Forecasting Goal; Evaluating Predictive Accuracy and Data Partitioning

  • Concepts of trend, cyclical, seasonal, and random components
  • Visualizing time series
  • Time series components
  • Forecasting vs. explanation
  • Performance evaluation
  • Naive forecasts
  • Different Approaches of Time Series
    • Stepwise autoregression
    • Exponential smoothing
    • Winters method
  • Random walk model
  • Unit Root problem
  • Correlogram
  • AR process (autoregressive)
  • MA process (moving average)

Analysing Longitudinal Data Using SAS

  • Examples of longitudinal data
  • Simple graphics for longitudinal data and simple inference using the summary measure approach
  • The ‘long form’ of longitudinal data
  • Mixed-effects models for longitudinal data

Generalized Estimating Equations

  • Modeling the correlational structure of the repeated measurements
  • The generalized estimating equation approach for non-normal response variables in longitudinal data
  • The dropout problem