This example shows how to perform multiple regression using statsample and daru.
The lr() shorthand calls Statsample::Regression.multiple. Internally, statsample
implements multiple regression with either pure-Ruby methods or GSL methods, which lets
it run even when GSL is not installed. The Ruby implementations are much slower than the
GSL ones, however, so it is recommended that you install the rb-gsl or gsl-nmatrix gem
before proceeding (these work only on MRI). rb-gsl can be installed directly from RubyGems
with gem install rb-gsl. To install gsl-nmatrix, see this blog post.
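Before running the example, it can help to confirm that a GSL binding is actually loadable. The snippet below is a minimal sketch, not part of statsample: gsl_available? is a hypothetical helper name, and it assumes that both rb-gsl and gsl-nmatrix are loaded with require 'gsl'.

```ruby
# Hypothetical helper (not part of statsample): reports whether a GSL
# binding can be loaded. Assumes rb-gsl and gsl-nmatrix both expose
# themselves through `require 'gsl'`.
def gsl_available?
  require 'gsl'
  true
rescue LoadError
  # No GSL binding is installed, so the slower pure-Ruby code paths apply.
  false
end

if gsl_available?
  puts 'GSL binding found'
else
  puts 'GSL binding not found; expect the slower pure-Ruby implementation'
end
```

If the second message is printed, the example still runs, only more slowly.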
require 'statsample'
Statsample::Analysis.store(Statsample::Regression::Multiple) do
  Daru.lazy_update = true
  samples = 2000
  # Four independent, normally distributed predictors.
  ds = Daru::DataFrame.new({
    :a  => rnorm(samples),
    :b  => rnorm(samples),
    :cc => rnorm(samples),
    :d  => rnorm(samples)}, clone: false)
  attach(ds)
  # Response: a known linear combination of the predictors plus normal noise.
  ds[:y] = a*5 + b*3 + cc*2 + d + rnorm(samples)
  # REMEMBER: With lazy updating enabled, it is _mandatory_ to call #update
  # after assignment cycles so that subsequent operations behave as expected.
  ds.update
  summary lr(ds, :y)
  Daru.lazy_update = false
end
Statsample::Analysis.run_batch
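
To make clear what the regression above estimates, here is a minimal sketch that fits the same kind of model with ordinary least squares, using only Ruby's standard matrix library. It is an illustration of the technique, not statsample's implementation. The helper names std_normal and ols_coefficients are hypothetical, and the data is generated with a seeded Box-Muller draw rather than rnorm.

```ruby
require 'matrix'

# Hypothetical helper: one standard-normal draw via the Box-Muller transform.
def std_normal(rng)
  Math.sqrt(-2.0 * Math.log(1.0 - rng.rand)) * Math.cos(2.0 * Math::PI * rng.rand)
end

# Hypothetical helper: ordinary least squares through the normal equations,
# b = (X'X)^-1 X'y. `rows` is an array of predictor arrays and `y` an array
# of responses. Returns [intercept, slope_1, ..., slope_k].
def ols_coefficients(rows, y)
  x  = Matrix.rows(rows.map { |r| [1.0] + r }) # prepend the intercept column
  xt = x.transpose
  ((xt * x).inverse * xt * Vector.elements(y)).to_a
end

rng  = Random.new(42)
n    = 2000
rows = Array.new(n) { Array.new(4) { std_normal(rng) } }
# Same generating model as the statsample example: y = 5a + 3b + 2cc + d + noise
y = rows.map { |a, b, cc, d| a * 5 + b * 3 + cc * 2 + d + std_normal(rng) }

coefs = ols_coefficients(rows, y)
puts coefs.map { |c| c.round(2) }.inspect
```

With 2000 samples, the estimated slopes should land close to the generating values 5, 3, 2 and 1, with an intercept near 0. The coefficient table in the statsample summary should show the same pattern.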