Chapter 4 Vectors, matrices and functions

4.1 Vectors

The basic object in R is the vector (a scalar is simply a vector of length one). The most common way to create a vector is the concatenation function c():
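
A minimal sketch of a call that yields the output below. The variable name x is an assumption; any name would do.

```r
# x is a hypothetical name for the vector of prices
x <- c(150, 162, 155, 157)
x   # typing the name prints the vector
```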

## [1] 150 162 155 157

Elements are extracted with square brackets; positive indices select elements and negative indices exclude them:
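
The exact calls behind the three results below are not known; the following sketch is one set of calls consistent with them, again assuming the vector is named x.

```r
x <- c(150, 162, 155, 157)   # assumed vector of prices

x[1]        # first element
x[-(2:4)]   # everything except elements 2 to 4
x[3:4]      # elements 3 and 4
```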

## [1] 150
## [1] 150
## [1] 155 157

One can also index with a logical vector; the extracted elements are those at the positions where the vector is TRUE. For example, to extract the prices greater than 156:
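
As a sketch, with the hypothetical names x for the prices and keep for the logical vector:

```r
x <- c(150, 162, 155, 157)   # assumed vector of prices

keep <- x > 156   # one TRUE/FALSE per element of x
keep
x[keep]           # only the elements where keep is TRUE
```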

## [1] FALSE  TRUE FALSE  TRUE
## [1] 162 157

An alternative is the which() function, which returns the indices of the elements that satisfy a logical condition:
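
A possible version of this step, where idx is a hypothetical name for the vector of positions:

```r
x <- c(150, 162, 155, 157)   # assumed vector of prices

idx <- which(x > 156)   # positions of the elements greater than 156
idx
x[idx]                  # the corresponding elements
```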

## [1] 2 4
## [1] 162 157

Indexing can also be used to modify an element:
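
For instance, setting the first element to 0 (the value seen in the output below), still assuming the vector is named x:

```r
x <- c(150, 162, 155, 157)   # assumed vector of prices

x[1] <- 0   # assignment through an index replaces that element in place
x
```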

## [1]   0 162 155 157

It is possible to give names (labels) to the elements of a vector and to extract elements by name:
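
A sketch that reproduces the output below. Building the labels with paste0() is a convenience chosen here; they could equally be typed out by hand.

```r
x <- c(0, 162, 155, 157)   # assumed vector, after the modification above

names(x)                            # NULL: no names have been set yet
names(x) <- paste0("model.", 1:4)   # "model.1", ..., "model.4"
x
x["model.3"]                        # extraction by name
```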

## NULL
## model.1 model.2 model.3 model.4 
##       0     162     155     157
## model.3 
##     155

All the elements of a vector must have the same mode; when numbers and strings are mixed, everything is coerced to character:
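
The coercion can be checked as follows, with y as a hypothetical name:

```r
y <- c(1, 2, "a", "b")   # the numbers 1 and 2 are converted to strings
y
mode(y)
```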

## [1] "1" "2" "a" "b"
## [1] "character"

To generate the vector of the first \(n\) integers we use the syntax 1:n; more generally, a:b gives the integers from a to b:
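
A short sketch, taking n = 10 to match the output below:

```r
n <- 10
1:n   # integers from 1 to n
2:6   # integers from 2 to 6
```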

##  [1]  1  2  3  4  5  6  7  8  9 10
## [1] 2 3 4 5 6

To generate more general sequences we use the seq() function:
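
One call that gives the even numbers shown below; the name s is an assumption.

```r
# from, to and by set the start, the end and the step of the sequence
s <- seq(from = 2, to = 20, by = 2)
s
```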

##  [1]  2  4  6  8 10 12 14 16 18 20

We can create a vector of repeated elements with rep():
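
Two calls consistent with the output below (the first argument is the value, the second the number of repetitions):

```r
ones <- rep(1, 3)      # the value 1 repeated three times
ones
empty <- rep(NA, 4)    # four missing values, handy for pre-allocating a vector
empty
```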

## [1] 1 1 1
## [1] NA NA NA NA

4.2 Matrices

A matrix is a vector with a dim attribute of length two. All the elements of a matrix therefore have the same mode. To create a matrix, use the matrix() function:
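
A sketch that reproduces the two matrices below. The names A and A_byrow are assumptions.

```r
A <- matrix(2:7, nrow = 2)                        # filled column by column
A
A_byrow <- matrix(2:7, nrow = 2, byrow = TRUE)    # filled row by row
A_byrow
```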

##      [,1] [,2] [,3]
## [1,]    2    4    6
## [2,]    3    5    7
##      [,1] [,2] [,3]
## [1,]    2    3    4
## [2,]    5    6    7

By default, matrix() fills the new matrix column by column. Indexing is done with brackets, giving the row index first and the column index second:
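
The following calls are consistent with the results below, assuming the first matrix above is stored as A:

```r
A <- matrix(2:7, nrow = 2)   # assumed matrix

A[2, ]         # second row
A[, 3]         # third column
A[2, 1]        # element in row 2, column 1
A[1, 2]        # element in row 1, column 2
A[, c(1, 3)]   # columns 1 and 3, returned as a matrix
```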

## [1] 3 5 7
## [1] 6 7
## [1] 3
## [1] 4
##      [,1] [,2]
## [1,]    2    6
## [2,]    3    7

To vertically (resp. horizontally) merge two matrices we use rbind() (resp. cbind()):
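
As a sketch, the outputs below can be obtained by joining A with -A side by side and stacking A on top of 2 * A (A being the assumed matrix from above):

```r
A <- matrix(2:7, nrow = 2)   # assumed matrix

wide <- cbind(A, -A)      # horizontal merge: 2 rows, 6 columns
wide
tall <- rbind(A, 2 * A)   # vertical merge: 4 rows, 3 columns
tall
```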

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    2    4    6   -2   -4   -6
## [2,]    3    5    7   -3   -5   -7
##      [,1] [,2] [,3]
## [1,]    2    4    6
## [2,]    3    5    7
## [3,]    4    8   12
## [4,]    6   10   14

4.3 Operations on numerical vectors and matrices

Element-wise operations:
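
The outputs below are consistent with a vector x = (3, 4, 1, 6) and the matrix A used earlier. Both names, and the exact expressions, are assumptions.

```r
x <- c(3, 4, 1, 6)           # assumed vector
A <- matrix(2:7, nrow = 2)   # assumed matrix

x + 2         # the scalar is added to every element
2 * x
x^2
x / 2
x / x
x * (x + 1)   # product taken element by element
sqrt(A)       # functions are also applied element by element
A^2
```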

## [1] 5 6 3 8
## [1]  6  8  2 12
## [1]  9 16  1 36
## [1] 1.5 2.0 0.5 3.0
## [1] 1 1 1 1
## [1] 12 20  2 42
##          [,1]     [,2]     [,3]
## [1,] 1.414214 2.000000 2.449490
## [2,] 1.732051 2.236068 2.645751
##      [,1] [,2] [,3]
## [1,]    4   16   36
## [2,]    9   25   49

Transpose, multiplication, inverse:
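
A sketch using t(), solve() and the matrix product %*%. The names B and B_inv are assumptions, with B taken as the first two columns of A so that it is square.

```r
A <- matrix(2:7, nrow = 2)   # assumed matrix

t(A)                 # transpose: 3 rows, 2 columns
B <- A[, 1:2]        # a square 2 x 2 block of A
B
B_inv <- solve(B)    # inverse of B
B_inv
B %*% B_inv          # identity matrix, up to floating-point round-off
```

The last product may show a tiny non-zero off-diagonal entry, as in the output below; this is rounding error, not a mistake in the inverse.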

##      [,1] [,2]
## [1,]    2    3
## [2,]    4    5
## [3,]    6    7
##      [,1] [,2]
## [1,]    2    4
## [2,]    3    5
##      [,1] [,2]
## [1,] -2.5    2
## [2,]  1.5   -1
##      [,1]         [,2]
## [1,]    1 1.776357e-15
## [2,]    0 1.000000e+00

The transpose of a vector is a row matrix:
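
This can be checked on the assumed vector x; transposing twice gives a column matrix:

```r
x <- c(3, 4, 1, 6)   # assumed vector

dim(t(x))   # 1 row, 4 columns
t(t(x))     # 4 rows, 1 column
```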

## [1] 1 4
##      [,1]
## [1,]    3
## [2,]    4
## [3,]    1
## [4,]    6

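Pay attention to the following examples, in which %*% treats a plain vector as a row or a column matrix, whichever makes the product conformable: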
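
The original calls are not known. The sketch below is one plausible set that gives results of the same shape and value, assuming x = (3, 4, 1, 6); the use of as.matrix() for the third result is a guess.

```r
x <- c(3, 4, 1, 6)   # assumed vector

x %*% t(x)     # x is treated as a column: 4 x 4 outer product
t(x) %*% x     # row times column: 1 x 1 matrix holding the sum of squares
as.matrix(x)   # a plain vector converted to a 4 x 1 column matrix
x %*% x        # two plain vectors: inner product, again a 1 x 1 matrix
```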

##      [,1] [,2] [,3] [,4]
## [1,]    9   12    3   18
## [2,]   12   16    4   24
## [3,]    3    4    1    6
## [4,]   18   24    6   36
##      [,1]
## [1,]   62
##      [,1]
## [1,]    3
## [2,]    4
## [3,]    1
## [4,]    6
##      [,1]
## [1,]   62

4.4 Factors

A factor is a vector used to represent a qualitative variable, i.e. a variable that takes a finite number of discrete values. Its values, or categories, are called levels in R.
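
A sketch that reproduces the output below, with city as a hypothetical variable name:

```r
city <- factor(c("paris", "lyon", "lyon", "paris", "nantes"))
city           # values, followed by the set of levels
class(city)
levels(city)   # distinct categories, sorted alphabetically by default
```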

## [1] paris  lyon   lyon   paris  nantes
## Levels: lyon nantes paris
## [1] "factor"
## [1] "lyon"   "nantes" "paris"

A factor has the numeric mode. The reason for this counter-intuitive fact is that the elements of a factor are stored as integer codes that index its levels, and the levels are sorted in lexicographic order by default:
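
The integer codes can be inspected as follows (city is an assumed name; as.integer() is one way to reveal the codes):

```r
city <- factor(c("paris", "lyon", "lyon", "paris", "nantes"))

mode(city)         # storage mode of the underlying codes
as.integer(city)   # lyon = 1, nantes = 2, paris = 3
```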

## [1] "numeric"
## [1] 3 1 1 3 2

4.5 User-defined functions

Example:
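
The function used in the original example is not known. The sketch below defines a hypothetical function f, with a default value for its second argument, chosen so that the three calls give the results shown below.

```r
# f is a hypothetical function; y defaults to 5 when it is not supplied
f <- function(x, y = 5) {
  z <- x - 2 * y   # z exists only inside the function
  z                # the last evaluated expression is the return value
}

f(2)              # uses the default y = 5
f(2, 2)           # arguments matched by position
f(y = 1, x = 5)   # arguments matched by name, in any order
```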

## [1] -8
## [1] -2
## [1] 3

Any variable defined in a function is local and does not appear in the workspace: try to run
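
For instance, a minimal sketch with a hypothetical function g and a hypothetical local variable local_total:

```r
g <- function(x) {
  local_total <- x + 1   # created inside g(), discarded when g() returns
  local_total
}

g(3)
exists("local_total")    # FALSE in a fresh session: the variable never reached the workspace
```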

4.6 Exercises

  1. Let \(x\) be a vector with the elements of a sample:
##  [1] 45 63 17 32 54 57 41 29 34 37 18 39 46 43
  • Write code that returns
    • the third element of the sample;
    • the first four elements of the sample;
    • the elements strictly greater than 35;
    • all elements except those in positions 3, 9 and 12.
  • Replace the first element with a missing value, then give the positions of all elements less than 30.
  2. Write a function weighted_average that takes as inputs two vectors \(x=(x_1,\ldots,x_n)\) and \(w=(w_1,\ldots,w_n)\) and computes the weighted mean \[ \frac{1}{\sum_{i=1}^n w_i}\sum_{i=1}^n w_i x_i \]