Multiple Regression

This example shows how to perform multiple regression using statsample and daru.

The lr() shorthand calls Statsample::Regression.multiple. Internally, statsample implements multiple regression with either pure-Ruby methods or GSL methods, so it runs even when GSL is absent. The Ruby implementations are much slower than their GSL counterparts, however, so it is recommended that you install the rb-gsl or gsl-nmatrix gem before proceeding (both work only on MRI).

rb-gsl can be installed directly from RubyGems with gem install rb-gsl. To install gsl-nmatrix, see this blog post.
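
Before running the regression, it can help to check whether a GSL binding is loadable at all. The following is a minimal sketch, not statsample's own detection logic: the helper name gsl_available? is hypothetical, and it assumes that both rb-gsl and gsl-nmatrix are loaded with require 'gsl'.

```ruby
# Hypothetical helper: probe for a GSL binding without aborting when it is missing.
# Assumption: both rb-gsl and gsl-nmatrix are loaded via `require 'gsl'`.
def gsl_available?
  require 'gsl'
  true
rescue LoadError
  false
end

# Report which kind of engine is likely to be available in this environment.
engine = gsl_available? ? 'GSL' : 'pure Ruby'
puts "GSL binding loadable: #{gsl_available?} (expect the #{engine} code path)"
```

If this prints false, the regression still runs, only more slowly. The Engine line in the regression summary shows which engine statsample actually used.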

In [1]:
require 'statsample'

Statsample::Analysis.store(Statsample::Regression::Multiple) do
  Daru.lazy_update = true
  
  samples=2000
  ds = Daru::DataFrame.new({
    :a  => rnorm(samples),
    :b  => rnorm(samples),
    :cc => rnorm(samples),
    :d  => rnorm(samples)}, clone: false)
  attach(ds)
  ds[:y] = a*5+b*3+cc*2+d+rnorm(samples)
  
  # REMEMBER: It is _mandatory_ to call #update after assignment cycles if you
  # want your operations to be performed as expected.
  ds.update
  summary lr(ds,:y)
  
  Daru.lazy_update = false
end
Statsample::Analysis.run_batch
Analysis 2016-03-26 02:39:03 +0000
= Statsample::Regression::Multiple
  == Multiple reggresion of a,b,cc,d on y
    Engine: Statsample::Regression::Multiple::RubyEngine
    Cases(listwise)=2000(2000)
    R=0.987
    R^2=0.975
    R^2 Adj=0.975
    Std.Error R=0.992
    Equation=-0.004 + 4.990a + 2.965b + 1.986cc + 0.991d
    === ANOVA
      ANOVA Table
+------------+-----------+------+-----------+-----------+-------+
|   source   |    ss     |  df  |    ms     |     f     |   p   |
+------------+-----------+------+-----------+-----------+-------+
| Regression | 76290.828 | 4    | 19072.707 | 19373.613 | 0.000 |
| Error      | 1964.014  | 1995 | 0.984     |           |       |
| Total      | 78254.842 | 1999 | 19073.691 |           |       |
+------------+-----------+------+-----------+-----------+-------+

    Beta coefficients
+----------+--------+-------+-------+---------+
|  coeff   |   b    | beta  |  se   |    t    |
+----------+--------+-------+-------+---------+
| Constant | -0.004 | -     | 0.022 | -0.170  |
| a        | 4.990  | 0.805 | 0.022 | 226.825 |
| b        | 2.965  | 0.460 | 0.023 | 129.376 |
| cc       | 1.986  | 0.317 | 0.022 | 89.238  |
| d        | 0.991  | 0.160 | 0.022 | 44.963  |
+----------+--------+-------+-------+---------+
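
To see where the numbers in this summary come from, the sketch below fits the same kind of model with ordinary least squares, using only Ruby's matrix library. It is an illustration under stated assumptions, not statsample's implementation: the std_normals helper (a Box-Muller stand-in for rnorm), the fixed seed, and all variable names are made up for this example, and the normal equations are solved by direct matrix inversion for brevity.

```ruby
require 'matrix'

# Hypothetical stand-in for rnorm: standard normal draws via Box-Muller.
def std_normals(n, rng)
  Array.new(n) do
    Math.sqrt(-2.0 * Math.log(1.0 - rng.rand)) * Math.cos(2.0 * Math::PI * rng.rand)
  end
end

rng = Random.new(42) # fixed seed, for a repeatable illustration only
n   = 2000
a, b, cc, d, noise = Array.new(5) { std_normals(n, rng) }
y = Array.new(n) { |i| 5 * a[i] + 3 * b[i] + 2 * cc[i] + d[i] + noise[i] }

# Design matrix with a leading column of ones for the constant term.
x  = Matrix.rows(Array.new(n) { |i| [1.0, a[i], b[i], cc[i], d[i]] })
yv = Matrix.column_vector(y)

# Normal equations: coef = (X'X)^-1 X'y
xt   = x.transpose
coef = ((xt * x).inverse * xt * yv).column(0).to_a

# Sums of squares that make up an ANOVA table.
fitted   = (x * Matrix.column_vector(coef)).column(0).to_a
mean_y   = y.sum / n
ss_total = y.sum { |v| (v - mean_y)**2 }
ss_error = y.each_with_index.sum { |v, i| (v - fitted[i])**2 }
ss_reg   = ss_total - ss_error

k         = 4                  # number of predictors
df_error  = n - k - 1          # 1995 for n = 2000
r2        = ss_reg / ss_total
r2_adj    = 1.0 - (1.0 - r2) * (n - 1) / df_error
f_stat    = (ss_reg / k) / (ss_error / df_error)
std_error = Math.sqrt(ss_error / df_error)

puts format('Equation=%.3f + %.3fa + %.3fb + %.3fcc + %.3fd', *coef)
puts format('R^2=%.3f  R^2 Adj=%.3f  F=%.3f  Std.Error=%.3f', r2, r2_adj, f_stat, std_error)
```

The exact figures differ from the run above because the random draws differ, but the same relations hold for the reported table: 76290.828 / 78254.842 ≈ 0.975 gives R^2, (76290.828 / 4) / (1964.014 / 1995) ≈ 19373.6 gives f, and the square root of 1964.014 / 1995 ≈ 0.992 gives the reported standard error.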