This notebook describes indexing in Daru::DataFrame with the newly added Categorical Index and other index classes.

In [1]:
require 'daru'
Out[1]:
true

Helper function to get a sample dataframe.

In [2]:
def sample_df idx
  Daru::DataFrame.new({
    a: 1..5,
    b: 'a'..'e',
    c: 11..15
  }, index: idx)
end
Out[2]:
:sample_df

Categorical Index

In [3]:
idx = Daru::CategoricalIndex.new [:a, :b, :a, :b, :c]
Out[3]:
#<Daru::CategoricalIndex(5): {a, b, a, b, c}>
In [4]:
df = sample_df idx
Out[4]:
Daru::DataFrame(5x3)
a b c
a 1 a 11
b 2 b 12
a 3 c 13
b 4 d 14
c 5 e 15

#row[]

Retrieve rows by category or position.

Note: When an index is both a valid category and a valid position, it is treated as a category.

In [5]:
df.row[:a, :c]
Out[5]:
Daru::DataFrame(3x3)
a b c
a 1 a 11
a 3 c 13
c 5 e 15
In [6]:
df.row[0, 1]
Out[6]:
Daru::DataFrame(2x3)
a b c
a 1 a 11
b 2 b 12
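
The precedence rule in the note above is easiest to see with an index whose categories include integers. The following is a minimal plain-Ruby sketch of that rule, not daru's actual implementation; the helper name `resolve_positions` and the sample labels are hypothetical and exist only for illustration.

```ruby
# Illustrative model only; not daru's internal implementation.
# Resolves each key to row positions, preferring categories over positions.
def resolve_positions(labels, *keys)
  keys.flat_map do |key|
    if labels.include?(key)
      # Key is a category: return every position that carries this label.
      labels.each_index.select { |i| labels[i] == key }
    elsif key.is_a?(Integer) && key.between?(0, labels.size - 1)
      # Key is not a category, so fall back to treating it as a position.
      [key]
    else
      raise IndexError, "#{key.inspect} is neither a category nor a valid position"
    end
  end
end

labels = [:a, 1, :a, 1, :c]        # hypothetical index with integer categories
resolve_positions(labels, :a, :c)  # => [0, 2, 4]
resolve_positions(labels, 1)       # => [1, 3], because 1 is a category here
resolve_positions(labels, 0)       # => [0], because 0 is not a category
```

In this model, `1` selects the rows labelled `1` rather than the second row, which is why the position-only accessors are useful when a key is ambiguous.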

#[]

Fetches vectors; works similarly to #row[].

In [7]:
df[:a, :b]
Out[7]:
Daru::DataFrame(5x2)
a b
a 1 a
b 2 b
a 3 c
b 4 d
c 5 e

#row.at

Retrieve rows by position.

In [8]:
df.row.at 0, 1, 2
Out[8]:
Daru::DataFrame(3x3)
a b c
a 1 a 11
b 2 b 12
a 3 c 13

#at

Retrieve vectors by position.

In [9]:
df.at 0, 1
Out[9]:
Daru::DataFrame(5x2)
a b
a 1 a
b 2 b
a 3 c
b 4 d
c 5 e

#row[]=

Set rows by categories or positions.

Note: When an index is both a valid category and a valid position, it is treated as a category.

In [10]:
df.row[:a] = ['x', 'y', 'z']
df
Out[10]:
Daru::DataFrame(5x3)
a b c
a x y z
b 2 b 12
a x y z
b 4 d 14
c 5 e 15
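
Assigning by category writes the same values into every row that carries that category, as the output above shows for both `a` rows. A minimal plain-Ruby sketch of that behaviour follows; it is an assumption-laden model rather than daru's implementation, and the helper name `set_rows_by_category!` is hypothetical.

```ruby
# Illustrative model only; not daru's internal implementation.
# Rows are plain arrays, paired with one label per row.
def set_rows_by_category!(rows, labels, category, values)
  positions = labels.each_index.select { |i| labels[i] == category }
  raise IndexError, "unknown category #{category.inspect}" if positions.empty?

  # Give each matching row its own copy so the rows do not share one array.
  positions.each { |pos| rows[pos] = values.dup }
  rows
end

labels = [:a, :b, :a, :b, :c]
rows = [[1, 'a', 11], [2, 'b', 12], [3, 'c', 13], [4, 'd', 14], [5, 'e', 15]]

set_rows_by_category!(rows, labels, :a, ['x', 'y', 'z'])
rows[0] # => ["x", "y", "z"]
rows[1] # => [2, "b", 12]
rows[2] # => ["x", "y", "z"]
```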

#[]=

Works similarly to #row[]=, but for vectors.

In [11]:
df[:a] = [1]*5
df
Out[11]:
Daru::DataFrame(5x3)
a b c
a 1 y z
b 1 b 12
a 1 y z
b 1 d 14
c 1 e 15

#row.set_at

Set the rows at the given positions to a given vector.

In [12]:
# Reset the dataframe
df = sample_df idx
Out[12]:
Daru::DataFrame(5x3)
a b c
a 1 a 11
b 2 b 12
a 3 c 13
b 4 d 14
c 5 e 15
In [13]:
df.row.set_at [0, 4], ['x', 'y', 'z']
df
Out[13]:
Daru::DataFrame(5x3)
a b c
a x y z
b 2 b 12
a 3 c 13
b 4 d 14
c x y z

#set_at

Works similarly to #row.set_at, but for vectors.

In [14]:
df.set_at [0, 1], [nil]*5
df
Out[14]:
Daru::DataFrame(5x3)
a b c
a z
b 12
a 13
b 14
c z

Other index classes

In [ ]: