After module XX, you should be able to:
why create your own function?
Here we will write a function that multiplies some number (x) by 2:
When you run the line of code above, you make it ready to use (no output yet!) Let’s test it!
Adding the curly brackets - { }
- allows you to use functions spanning multiple lines:
return
If we want something specific for the function’s output, we use return()
. Note, if you want to return more than one object, you need to put it into a list using the list()
function.
[[1]]
[1] 40
[[2]]
[1] 10
This is a brief introduction. The syntax is:
functionName = function(inputs) {
< function body >
return(list(value1, value2))
}
Note to create the function for use you need to
Only then will the function be available in the Environment pane and ready to use.
Functions can take multiple arguments / inputs. Here the function has two arguments x
and y
Functions can have default arguments. This lets us use the function without specifying the arguments
We got an answer b/c we put defaults into the function arguments.
Let’s write a function, sqdif
, that:
x
and y
with default values of 2 and 3.functionName = function(inputs) {
< function body >
return(list(value1, value2))
}
[1] 1
[1] 25
[1] 25
Functions can have any kind of data type input. For example, here is a function with characters:
...
(elipsis) argument\[ \text{geometric mean}(\mathbf{x}) = \sqrt[n]{x_1 \cdot x_2 \cdot \ldots \cdot x_n} = \exp\left( \frac{1}{n}\left(\log x_1 + \ldots + \log x_n \right) \right) \]
...
(elipsis) argumentgeo_mean <- function(...) {
x <- c(...)
output <- exp(mean(log(x)))
return(output)
}
geo_mean(2, 3, 4, 5)
[1] 3.309751
...
is a placeholder for as many arguments as you want....
for pass-throughThe most common use of ...
is to “pass-through” arguments to underlying functions.
mean()
has additional arguments, what if we want to use those in our geometric mean calculation?
Or we can do this an easier way.
traceback()
browser()
debug()
and debugonce()
Now, let’s see how we can repeat our loop example from the measles dataset with a function instead.
Here we are going to set up a function that takes our data frame and outputs the median, first and third quartiles, and range of measles cases for a specified country.
get_country_stats <- function(df, iso3_code){
country_data <- subset(df, iso3c == iso3_code)
# Get the summary statistics for this country
country_cases <- country_data$measles_cases
country_quart <- quantile(
country_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)
)
country_range <- range(country_cases, na.rm = TRUE)
country_name <- unique(country_data$country)
country_summary <- data.frame(
country = country_name,
min = country_range[[1]],
Q1 = country_quart[[1]],
median = country_quart[[2]],
Q3 = country_quart[[3]],
max = country_range[[2]]
)
return(country_summary)
}
Now we can use the function.
functionName = function(arguments) {
< function body >
return(list(outputs))
}
Create your own function that calculates the incidence rate per \(K\) cases in a given country for all years where that country appeared (e.g. incidence per 10,000 people or another number \(K\)).
Step 1. Determine your arguments.
Step 2. Begin your function by subsetting the data to include only the country specified in the arguments (i.e, country_data
).
Step 3. Get the years of interest (what errors might you run into here?).
Step 4. Calculate the incidence rate per \(K\) people in each year.
Step 5. Return the output neatly.
Bonus exercise: allow the country code argument to be optional, or add a new argument that allows the user to only calculate the incidence for certain years.
get_incidence <- function(df, iso3_code, K = 10000){
country_data <- subset(df, iso3c == iso3_code)
incidence <- country_data$measles_cases / country_data$population * K
names(incidence) <- country_data$year
return(incidence)
}
get_incidence(df=meas, iso3_code="IND")
2023 2022 2021 2020 2019 2018 2017
0.45603205 0.28907547 0.04049550 0.04013214 0.07540965 0.14224947 0.08884979
2016 2015 2014 2013 2012 2011 2010
0.12886248 0.22805022 0.20294566 0.06416849 0.02593200 0.26744142 0.25356807
2009 2008 2007 2006 2005 2004 2003
0.45918728 0.36675830 0.34583747 0.54747898 0.31794361 0.48794093 0.42192914
2002 2001 2000 1999 1998 1997 1996
0.36459551 0.47990173 0.36649458 0.20195097 0.33276727 0.60861874 0.47872368
1995 1994 1993 1992 1991 1990 1989
0.38882932 0.53443386 0.54397290 0.79109798 0.87586166 1.02948793 1.94564007
1988 1987 1986 1985 1984 1983 1982
1.93562738 3.10083364 1.94360299 2.06623051 2.50206072 1.73819235 2.00496602
1981 1980
2.83437652 1.63650050
2023 2022 2021 2020 2019 2018 2017
0.73692544 0.35526364 0.44939088 0.12090842 0.09252406 1.50215163 0.42402323
2016 2015 2014 2013 2012 2011 2010
0.12658949 0.01829650 0.06578580 0.42607889 0.39791131 0.22084288 0.22221137
2009 2008 2007 2006 2005 2004 2003
0.04539161 0.06072114 0.15396495 0.42910095 0.17095625 0.24893257 0.28404208
2002 2001 2000 1999 1998 1997 1996
0.23906241 0.24174444 0.13370480 0.19640005 0.16036998 0.13075755 0.07942591
1995 1994 1993 1992 1991 1990 1989
0.12920918 0.10994611 0.15667487 0.24245113 0.05176020 1.88755151 0.21035120
1988 1987 1986 1985 1984 1983 1982
5.14440235 4.41204007 4.20439485 2.74769085 1.84269015 2.30609510 2.26464854
1981 1980
3.44249798 3.54397944
for (i in 1:n) {
this_country <- my_countries[[i]]
country_data <- subset(meas, iso3c == this_country)
incidence <- country_data$measles_cases / country_data$population * 10000
names(incidence) <- country_data$year
res[[i]] <- incidence
}
str(res)
List of 5
$ : Named num [1:44] 0.456 0.2891 0.0405 0.0401 0.0754 ...
..- attr(*, "names")= chr [1:44] "2023" "2022" "2021" "2020" ...
$ : Named num [1:44] 0.7369 0.3553 0.4494 0.1209 0.0925 ...
..- attr(*, "names")= chr [1:44] "2023" "2022" "2021" "2020" ...
$ : Named num [1:44] 0.0162 0.0182 0.012 0.1439 0.3521 ...
..- attr(*, "names")= chr [1:44] "2023" "2022" "2021" "2020" ...
$ : Named num [1:44] 0.3117 0.0426 0.0476 0.1322 0.1491 ...
..- attr(*, "names")= chr [1:44] "2023" "2022" "2021" "2020" ...
$ : Named num [1:44] 0.127 0.0895 NA NA 0.0261 ...
..- attr(*, "names")= chr [1:44] "2023" "2022" "2021" "2020" ...
List of 5
$ : Named num [1:44] 0.456 0.2891 0.0405 0.0401 0.0754 ...
..- attr(*, "names")= chr [1:44] "2023" "2022" "2021" "2020" ...
$ : Named num [1:44] 0.7369 0.3553 0.4494 0.1209 0.0925 ...
..- attr(*, "names")= chr [1:44] "2023" "2022" "2021" "2020" ...
$ : Named num [1:44] 0.0162 0.0182 0.012 0.1439 0.3521 ...
..- attr(*, "names")= chr [1:44] "2023" "2022" "2021" "2020" ...
$ : Named num [1:44] 0.3117 0.0426 0.0476 0.1322 0.1491 ...
..- attr(*, "names")= chr [1:44] "2023" "2022" "2021" "2020" ...
$ : Named num [1:44] 0.127 0.0895 NA NA 0.0261 ...
..- attr(*, "names")= chr [1:44] "2023" "2022" "2021" "2020" ...