require 'daru'
require 'open-uri'
# URI.open is the open-uri entry point; Kernel#open no longer accepts URLs on Ruby 3.0+.
content = URI.open('https://d37djvu3ytnwxt.cloudfront.net/asset-v1:MITx+15.071x_3+1T2016+type@asset+block/WHO.csv')
df = Daru::DataFrame.from_csv content
df = df.at 0..6
df.first
Say we want to know about each region as a whole. Let's index our dataset by the 'Region' vector.
df.index = Daru::CategoricalIndex.new df['Region'].to_a
df.first 5
List all the regions:
df.index.categories
Let's find out how many countries lie in the Africa region.
df.row['Africa'].size
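To make the idea behind a categorical index a little more concrete, here is a minimal plain-Ruby sketch that does not use daru at all. It only illustrates the concept of mapping each category to the positions of its rows; it is not how daru implements `Daru::CategoricalIndex`. The rows, the country names 'A' to 'D', and the variable names are made up for illustration.

```ruby
# Toy rows with made-up placeholder values, not taken from the WHO dataset.
rows = [
  { country: 'A', region: 'Africa' },
  { country: 'B', region: 'Europe' },
  { country: 'C', region: 'Africa' },
  { country: 'D', region: 'Americas' }
]

# Map each category to the positions of the rows that carry it.
positions_by_region = Hash.new { |hash, key| hash[key] = [] }
rows.each_with_index { |row, pos| positions_by_region[row[:region]] << pos }

# The distinct categories, in order of first appearance.
categories = positions_by_region.keys

# Selecting by category returns every row stored at the recorded positions.
africa_rows = positions_by_region['Africa'].map { |pos| rows[pos] }

puts categories.inspect
puts africa_rows.size
```

Because a category can repeat, a lookup by category returns a group of rows rather than a single row, which is why counting the rows of one region is just a size call on the selection.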
Finding the mean life expectancy of Europe is as easy as:
df.row['Europe']['LifeExpectancy'].mean
Let's see the maximum life expectancy in South-East Asia:
df.row['South-East Asia']['LifeExpectancy'].max
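The two lookups above are per-region aggregates: pick the rows of one region, then reduce one column. As a rough sketch of the same idea in plain Ruby (no daru), the snippet below computes the mean and the maximum for every region in one pass. The records and their numbers are invented placeholders, and `values_by_region` and `summary` are hypothetical names.

```ruby
# Toy records with made-up numbers; they are placeholders, not WHO figures.
records = [
  { region: 'Europe', life_expectancy: 80.0 },
  { region: 'Europe', life_expectancy: 70.0 },
  { region: 'South-East Asia', life_expectancy: 65.0 },
  { region: 'South-East Asia', life_expectancy: 75.0 }
]

# Collect the life-expectancy values under each region.
values_by_region = records
  .group_by { |rec| rec[:region] }
  .transform_values { |recs| recs.map { |rec| rec[:life_expectancy] } }

# Reduce each region's values to a mean and a maximum.
summary = values_by_region.transform_values do |values|
  { mean: values.sum / values.size, max: values.max }
end

puts summary['Europe'][:mean]
puts summary['South-East Asia'][:max]
```

With the categorical index in place, daru lets us skip the explicit grouping step and ask for one region directly, as the cells above do.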
Let's see the countries in Europe that top the list for LifeExpectancy:
df.row['Europe'].sort(['LifeExpectancy'], ascending: false).first 5
Let's see the countries in South-East Asia that have a high FertilityRate:
df.row['South-East Asia'].sort(['FertilityRate']).row.at(-10..-1)
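The last two cells pick the highest values in two different ways: sort descending and take the first rows, or sort ascending and take the last rows. Here is a small plain-Ruby sketch (no daru) showing that both give the same rows, just in opposite order. The records, the country names 'P' to 'S', and the rates are made-up placeholders.

```ruby
# Toy records with made-up fertility rates; placeholders, not WHO figures.
records = [
  { country: 'P', fertility_rate: 2.1 },
  { country: 'Q', fertility_rate: 3.4 },
  { country: 'R', fertility_rate: 1.8 },
  { country: 'S', fertility_rate: 2.9 }
]

# Ascending sort, then keep the last two rows (the two highest rates).
highest_asc = records.sort_by { |rec| rec[:fertility_rate] }.last(2)

# Descending sort, then keep the first two rows: same rows, reverse order.
highest_desc = records.sort_by { |rec| -rec[:fertility_rate] }.first(2)

puts highest_asc.map { |rec| rec[:country] }.inspect
puts highest_desc.map { |rec| rec[:country] }.inspect
```

In plain Ruby, `Array#last` and `Array#first` simply return fewer elements when the array is shorter than the requested count, so this sketch stays safe for small groups.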