
A remote glimpse into the useR! 2021 conference

Anna Quaglieri | R-Ladies Melbourne Meetup | 3 Aug 2021

Art by Danielle Navarro, Silhouette in Teal (2021) Random walk, flametree L-system

1

A bit about me

I earned my Bachelor's and Master's degrees in Statistics across the Universities of Bologna, Glasgow and Melbourne

3

A bit of research

  • Master's research in Population Genetics at WEHI, followed by 1.5 years there as a research assistant (RA)
  • PhD in Cancer Genomics at WEHI

A bit outside of research

  • After my PhD, I decided to try a different path outside of research
  • Worked as a Data Science consultant for 1.5 years at the AI consultancy Eliiza
5

For the past two months

I'm a Bioinformatics Data Scientist at the Melbourne-based startup Mass Dynamics

Mass Dynamics is on a mission to free humanity and society from the burden of disease by helping more life scientists transform proteomics data to knowledge - better, faster and easier.

8

What I actually do every day

💻 Work in a fun team: Collaborate with an interdisciplinary team of scientists, developers and marketing experts.

📙 Learn: Learn the intricacies and amazingness of mass spectrometry (the most widely used technique to quantify proteins in a sample) and what life scientists need to make the best use of their experiments.

👩‍💻 Code in R: Assemble workflows in R to analyse mass spectrometry data.

Open Science: Learn and strive for reproducibility and openness in what we produce.

🥜 In a nutshell: Study and learn, think, build solutions (mainly in R packages), debug, debug, debug, repeat.

14

useR! 2021

Art by Will Chase, USA, Terrazzo, confetti (2021) Voronoi Tessellation, Poisson disc sampling

15

useR! is one of my favorite conferences!

R-Ladies dinner useR! 2018

R-Ladies online catch-up 2021

16

Disclaimer

  • My highlights mostly come from talks presented in our time zone

  • All talks and workshops will be made available online very soon! I'll keep you posted

18

Teaching and learning statistics

Artwork by @allison_horst

20

Developing a datasets based R package to teach environmental data science

Author and speaker: Allison Horst

  • 🔗 Introduced {lterdatasampler}: LTER Data Sampler 📦 (LTER = Long Term Ecological Research Network)
  • LTER's goal is education and training

Lessons learnt from this talk

  • A great way to learn how to build an R 📦 is to create a data package (a package that only includes data)

  • Myth: "I don't have super complex, cool new functions... so I cannot write an R 📦"

  • Free gift! Data packages are an enormously useful tool for teaching purposes (how many times have you used the iris dataset [1]?!)

[1] R. A. Fisher (1936). "The use of multiple measurements in taxonomic problems". Annals of Eugenics. 7 (2): 179–188. doi:10.1111/j.1469-1809.1936.tb02137.x. hdl:2440/15227

24

🐧 {palmerpenguins} is the new {iris} 🌷

Authors: Allison Horst, Alison Hill, Kristen Gorman

The πŸ”—{palmerpenguins} πŸ“¦ provides a great dataset for data exploration & visualization, as an alternative to iris.

install.packages("palmerpenguins")
library(palmerpenguins)

25

🐧 {palmerpenguins} is the new {iris} 🌷

Authors: Allison Horst, Alison Hill, Kristen Gorman

The πŸ”—{palmerpenguins} πŸ“¦ provides a great dataset for data exploration & visualization, as an alternative to iris.

install.packages("palmerpenguins")
library(palmerpenguins)

Meet the penguins!

Artwork by @allison_horst

26
library(palmerpenguins)
library(dplyr)
library(DT)

# Preview the first six rows of the dataset
penguins %>% head()
## # A tibble: 6 × 8
##   species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex
##   <fct>   <fct>           <dbl>         <dbl>            <int>       <int> <fct>
## 1 Adelie  Torge…           39.1          18.7              181        3750 male
## 2 Adelie  Torge…           39.5          17.4              186        3800 fema…
## 3 Adelie  Torge…           40.3          18                195        3250 fema…
## 4 Adelie  Torge…           NA            NA                 NA          NA <NA>
## 5 Adelie  Torge…           36.7          19.3              193        3450 fema…
## 6 Adelie  Torge…           39.3          20.6              190        3650 male
## # … with 1 more variable: year <int>

Creating R packages: resources that I found super useful!

Get started with 🔗 R Packages by Hadley Wickham. Easy to read and very comprehensive.

What you need to get started:

  1. Code (one function is enough) and/or data (it doesn't have to be large!)

  2. An R project in a new folder /path/to/myPackage

  3. Run usethis::create_package("/path/to/myPackage") (more info: https://r-pkgs.org/workflows101.html). This creates the metadata and other files that you need to bundle your code into a package!

  4. You're all set!

33

To recap: set up your R package in a few lines!

usethis::create_package("/path/to/myPackage")

# Create a small example dataset and store it in the package's data/ folder
data_fake <- data.frame(First = 1:200,
                        Second = rep("A", 200))
usethis::use_data(data_fake)

Source: R Packages by Hadley Wickham
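To make the data step less magical, here is a minimal base R sketch of roughly what storing a dataset in a package amounts to: saving the object as an .rda file inside the package's data/ folder. The temporary folder and the name pkg_dir are assumptions made so the example is self-contained; in a real package you would let usethis::use_data() handle this.

```r
# Hypothetical stand-in for the package folder; a temp dir keeps this self-contained
pkg_dir <- file.path(tempdir(), "myPackage")
dir.create(file.path(pkg_dir, "data"), recursive = TRUE, showWarnings = FALSE)

data_fake <- data.frame(First = 1:200, Second = rep("A", 200))

# Roughly what usethis::use_data(data_fake) does: one .rda file per object in data/
rda_path <- file.path(pkg_dir, "data", "data_fake.rda")
save(data_fake, file = rda_path)

# Loading the file back restores an object with the same name
restored <- new.env()
load(rda_path, envir = restored)
head(restored$data_fake)
```

Once the package is installed, users would simply see data_fake after loading the package, without ever touching the .rda file.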

35

{fusen} 📦: Create a package from Rmd

Speaker and author: Sébastien Rochette

If you know how to create an R Markdown file, then you know how to build a package.

🔗 Introduction to {fusen}

Philosophy: You don't need to move functions and files around to create a package; you only need your Rmd with functions, documentation, tests and examples.

install.packages("fusen")
library(fusen)

39

Rmd-first approach to writing an R 📦

  1. Write your Rmd, naming code chunks with prefixes such as description, function, tests, examples

  2. These prefixes tell {fusen} how to create your package

Inflate!

41

Building and maintaining OpenIntro using the R ecosystem

Speaker: Mine Çetinkaya-Rundel

42

Another lesson on building a data-centric package!

{OpenIntro} depends on 3 other data packages. See the package 🔗 DESCRIPTION

43

Teaching and learning Bayesian statistics with {bayesrules} 📦

Speaker: Mine Dogucu (🔗 GitHub repo and 🔗 Talk slides)

44

Data Visualisation

Art by Ijeamaka Anyene, USA, Sunset (2021)

45

Keynote: Expanding the vocabulary of R graphics

Speaker: Paul Murrell, author of {grDevices} 📦

  • {grDevices} is the graphics engine, containing functions that support both base and grid graphics.
  • The {grid} 📦 is a low-level system for plotting within R ({ggplot2} 📦 is built on it)
  • Grid graphics and R's base graphics are two separate systems.
  • Grid graphics are usually worth using when you need a very unusual plot that cannot be created with ggplot2

46

🔗 Paul Murrell's New Features in the R Graphics Engine

Gradient and radial fill

library(grid)
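The code on this slide did not survive beyond library(grid), so here is a minimal sketch of gradient fills rather than the speaker's own example. It assumes R >= 4.1.0 (where linearGradient() and radialGradient() were added to {grid}) and draws to a temporary PDF, a device that supports these fills; the colours and the out_file name are arbitrary choices.

```r
library(grid)

# Gradient fills need R >= 4.1.0 and a graphics device that supports them (pdf() does)
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
grid.newpage()

# Background rectangle filled with a linear gradient between two colours
grid.rect(gp = gpar(fill = linearGradient(colours = c("white", "steelblue"))))

# Circle filled with a radial gradient, drawn on top of the rectangle
grid.circle(r = 0.3,
            gp = gpar(fill = radialGradient(colours = c("white", "darkorange"))))

dev.off()
```

Opening the file stored in out_file shows both fills; on devices without gradient support the shapes may be drawn without a fill.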

47

Pattern fills

library(grid)
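As with gradients, only library(grid) survived on this slide, so the following is a hedged sketch of a pattern fill, not the original example. It assumes R >= 4.1.0, where pattern() lets any grob act as a repeating tile; the tile size, colours and the out_file name are illustrative choices.

```r
library(grid)

# Pattern fills need R >= 4.1.0 and a supporting device such as pdf()
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
grid.newpage()

# The tile: a small grey dot
dot <- circleGrob(r = unit(2, "mm"), gp = gpar(col = NA, fill = "grey40"))

# Repeat the tile every 5 mm in both directions
dots <- pattern(dot, width = unit(5, "mm"), height = unit(5, "mm"),
                extend = "repeat")

# Use the pattern as the fill of a rectangle
grid.rect(width = 0.8, height = 0.8, gp = gpar(fill = dots))

dev.off()
```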

48

Going beyond statistical plots

🔗 Paul Murrell's talk: Going beyond statistical plots

You can build Illustrator-like visualisations!

49

The {virgo} 📦

Speaker: Stuart Lee

Authors: Stuart Lee and Earo Wang

🔗 Introduction to virgo

  • Makes it easy to build interactive graphics for exploratory data analysis

  • Allows cross-plot interactivity without having to build a Shiny app

  • {virgo} plots also work within Shiny

53

{virgo} in action!

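library(virgo)
library(palmerpenguins)

# An interval (brush) selection that can be shared between plots
selection <- select_interval()

# Scatter plot: points inside the selection are coloured by species, the rest are black
p <- penguins %>%
  vega() %>%
  mark_circle(
    enc(
      x = bill_length_mm,
      y = bill_depth_mm,
      color = encode_if(selection, species, "black")
    )
  )
p

🔗 Introduction to virgo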

54
# Histogram of body mass; the purple overlay and the red mean line follow the selection
p_right <- penguins %>%
  vega(enc(x = body_mass_g)) %>%
  mark_histogram(bin = list(maxbins = 20)) %>%
  mark_histogram(color = "purple", bin = list(maxbins = 20),
                 selection = selection) %>%
  mark_rule(enc(x = vg_mean(body_mass_g)), color = "red", size = 4,
            selection = selection)
p_right

🔗 Introduction to virgo

55

Concatenate interactivity!

hconcat(p, p_right)

🔗 Introduction to virgo

56

Do you see what I see? The {microshades} 📦

Speaker and author: Lisa Karstens, PhD

🔗 Introduction to microshades

Provides custom colour-shading palettes that improve:

  • accessibility for people with colour vision deficiency (CVD)
  • data organization
remotes::install_github("KarstensLab/microshades")

Two crafted colour palettes:

  • microshades_cvd_palettes
  • microshades_palettes

Total of 30 available colours per palette.

58

{microshades} in action!

🐧 {palmerpenguins} with {microshades} example code: https://karstenslab.github.io/microshades/articles/non-microbiome_data.html

59

Data validation and Unit testing

Art by Antonio Sánchez, Spain, Jellyfish (2018), Sines and cosines

60

{autotest} 📦: Automatic testing for R packages

Speaker and author: Mark Padgham, Software Research Scientist at rOpenSci

🔗 Introduction to {autotest}

install.packages("autotest")
  • {autotest} goes through the examples of your R 📦 functions and mutates (i.e. changes) the input parameters of the function calls.

  • This lets you check how robust the package is to a range of inputs

63

{autotest} in action!

library(autotest)
y <- autotest_package(package = "stats", functions = "var", test = TRUE)

64

A fresh look at unit testing with {tinytest} 📦

Speaker and author: Mark van der Loo

🔗 Introduction to {tinytest}

install.packages("tinytest")
  • Its purpose is to make it easier to write unit tests for an R 📦

  • It gives you clearer reports and pointers to where the errors actually occurred

[1] M van der Loo (2017). tinytest: R package version 1.2.4. https://cran.r-project.org/package=tinytest

[2] MPJ van der Loo (2020) A method for deriving information from running R code. R-Journal (Accepted) https://arxiv.org/abs/2002.07472

66

{tinytest} in action!

🔗 Overview of {tinytest} functionalities

library(tinytest)
addOne <- function(x) x + 1
# subOne is deliberately wrong (it subtracts 2) so that the second test fails
subOne <- function(x) x - 2

# this test should pass
tinytest::expect_equal(addOne(1), 2)
## ----- PASSED      : <-->
##  call| tinytest::expect_equal(addOne(1), 2)

# this test will fail
tinytest::expect_equal(subOne(2), 1)
## ----- FAILED[data]: <-->
##  call| tinytest::expect_equal(subOne(2), 1)
##  diff| Expected '1', got '0'
69

Tutorial: Data validation with the {validate} 📦

Author: Mark van der Loo

The purpose is to provide easy-to-use tools to check that your data is valid!

[1] MPJ van der Loo and E de Jonge (2020). Data Validation Infrastructure for R. Journal of Statistical Software, Accepted for publication. https://arxiv.org/abs/1912.09759

[2] MPJ van der Loo (2020) The Data Validation Cookbook version 1.0.1. https://data-cleaning.github.io/validate

70

{validate} in action!

library(palmerpenguins)
head(penguins)
## # A tibble: 6 × 8
##   species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex
##   <fct>   <fct>           <dbl>         <dbl>            <int>       <int> <fct>
## 1 Adelie  Torge…           39.1          18.7              181        3750 male
## 2 Adelie  Torge…           39.5          17.4              186        3800 fema…
## 3 Adelie  Torge…           40.3          18                195        3250 fema…
## 4 Adelie  Torge…           NA            NA                 NA          NA <NA>
## 5 Adelie  Torge…           36.7          19.3              193        3450 fema…
## 6 Adelie  Torge…           39.3          20.6              190        3650 male
## # … with 1 more variable: year <int>

# Cross-tabulate islands and species
table(penguins$island, penguins$species)
##
##             Adelie Chinstrap Gentoo
##   Biscoe        44         0    124
##   Dream         56        68      0
##   Torgersen     52         0      0
71

Create a validator with rules

  • Separate multiple validations by a comma

  • The example shows multivariate validation including completeness validation (is_complete) and conditional validations.

library(validate)
rules <- validator(flipper_length_mm > 0,
                   is_complete(bill_depth_mm, flipper_length_mm, bill_depth_mm),
                   if (island %in% "Biscoe") species %in% c("Adelie"))
confront(penguins, rules) %>% summary()
##   name items passes fails nNA error warning
## 1   V1   344    342     0   2 FALSE   FALSE
## 2   V2   344    342     2   0 FALSE   FALSE
## 3   V3   344    220   124   0 FALSE   FALSE
##                                                     expression
## 1                                        flipper_length_mm > 0
## 2 is_complete(bill_depth_mm, flipper_length_mm, bill_depth_mm)
## 3       !(island %vin% "Biscoe") | (species %vin% c("Adelie"))
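To see where the passes, fails and nNA columns come from, the same three kinds of rule can be tallied by hand in base R. This is only a sketch of the idea, not how {validate} is implemented; the toy data frame and the tally() helper are hypothetical and do not use the real penguins data.

```r
# Toy records standing in for penguins (hypothetical values)
toy <- data.frame(
  island            = c("Biscoe", "Biscoe", "Dream", "Torgersen"),
  species           = c("Adelie", "Gentoo", "Chinstrap", "Adelie"),
  flipper_length_mm = c(181, 217, 195, NA)
)

# Rule 1: a comparison on a missing value cannot be evaluated, so it stays NA
r1 <- toy$flipper_length_mm > 0
# Rule 2: completeness check, a missing value is a failure rather than an NA
r2 <- !is.na(toy$flipper_length_mm)
# Rule 3: "if A then B" is evaluated as "not A, or B"
r3 <- !(toy$island %in% "Biscoe") | (toy$species %in% "Adelie")

# Hypothetical helper: count TRUE, FALSE and NA results for one rule
tally <- function(x) {
  c(passes = sum(x, na.rm = TRUE), fails = sum(!x, na.rm = TRUE), nNA = sum(is.na(x)))
}

res <- rbind(V1 = tally(r1), V2 = tally(r2), V3 = tally(r3))
print(res)
```

The conditional rule is the one that fails for the Gentoo record on Biscoe, mirroring the failures reported for V3 in the summary above.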
76

Whetting your appetite for next month's R-Ladies Melbourne event

Spoiler Alert!

Art by Will Chase, USA, Triangle disintegration (2019), Curl noise, trigonometry

77

Here is the anomalous-down!

Author and speaker: Dr Sevvandi Kandanaarachchi

R packages that Sevvandi developed to find anomalies in high-dimensional data:

80

Art by Ijeamaka Anyene, USA, Clouds (2021)

Keynote: Heidi Seibold

Research Software Engineers (RSE) & Academia

81

Heidi's path

83

What does an RSE do?

  • An RSE builds software for research

  • Generally writes code and teaches researchers about software

  • Consults with researchers on any kind of software problem

88

Tools for Open and Reproducible Science

It's a bit too much to expect researchers to do all of those things on top of their research!

The RSE is here to help!

89

If you think this is for you...

You can become part of the community!

90

Art by Will Chase, USA, Bubble strings (2021), Flow fields, circle packing, perlin noise

R in production

91

What does "R in production" mean?

Example from my experience:

  • 👩‍💻 I'm a data scientist at Mass Dynamics and I build R 📦s to analyse mass spectrometry data

  • But Mass Dynamics also wants to make the functionality of these R packages easily available to 👩‍🔬 life scientists, removing the barrier of having to learn to code

  • The solution: the life scientist interacts with an easy-to-use user interface (UI, aka frontend), which runs my R 📦 in the background (aka backend)

  • Every time a scientist interacts with the UI, the R 📦 is run -> This is R in production 🎉!

95

How do we do that?

There is a bit of engineering setup and jargon to digest, and there are various ways of accomplishing this task!

Tricky aspects:

  • Many of them concern the engineering setup (which is not my expertise!)

  • However, from my side I need to make sure that:

    • all dependencies needed by my R packages are available in production

    • all packages are put into production with a defined version to allow reproducibility

    • Managing dependencies can be really tricky!

    • How do you know everything that you need, when a package depends on another package, which depends on yet another one, and so on?
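The "package depends on a package that depends on another package" problem is a graph traversal. The sketch below resolves it by hand on a made-up dependency graph; the package names and the all_deps() helper are hypothetical. On real packages, base R's tools::package_dependencies() with recursive = TRUE answers the same question.

```r
# Hypothetical dependency graph: each package maps to its direct dependencies
deps <- list(
  myPackage = c("pkgA", "pkgB"),
  pkgA      = c("pkgC"),
  pkgB      = c("pkgC", "pkgD"),
  pkgC      = character(0),
  pkgD      = character(0)
)

# Collect direct and indirect dependencies of one package
all_deps <- function(pkg, graph) {
  seen <- character(0)
  todo <- graph[[pkg]]
  while (length(todo) > 0) {
    current <- todo[1]
    todo <- todo[-1]
    if (current %in% seen) next          # already handled, skip it
    seen <- c(seen, current)
    # Queue the dependencies of the package we just visited
    todo <- c(todo, setdiff(graph[[current]], seen))
  }
  sort(seen)
}

all_deps("myPackage", deps)
```

Even though myPackage only declares pkgA and pkgB, production also needs pkgC and pkgD.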
101

What are package dependencies?

You find them in the DESCRIPTION file of a package, listed under fields such as Depends, Imports, Suggests and LinkingTo.
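A DESCRIPTION file can be read programmatically with base R's read.dcf(). The sketch below writes a small made-up DESCRIPTION to a temporary file (the package name and its dependencies are assumptions for illustration) and extracts the dependency fields.

```r
# Write a hypothetical DESCRIPTION file to a temporary location
desc_file <- tempfile()
writeLines(c(
  "Package: myPackage",
  "Version: 0.0.1",
  "Depends: R (>= 4.0)",
  "Imports: dplyr, ggplot2",
  "Suggests: testthat"
), desc_file)

# Read only the fields that declare dependencies; absent fields come back as NA
fields <- c("Depends", "Imports", "Suggests", "LinkingTo")
desc <- read.dcf(desc_file, fields = fields)
print(desc)

# Split the comma-separated Imports field into individual package names
imports <- strsplit(desc[1, "Imports"], ",\\s*")[[1]]
imports
```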

103

How to find them all?

How do you determine all the dependencies needed to reproduce an R package/project environment, so that it is reproducible, open, shareable and safe for production?

My summary of suggestions after discussing with speakers at useR! 2021:

  • {renv} 📦:

    • is the emerging method to manage package/project dependencies in R.
    • Calling renv::snapshot() saves the state of the project library to the lockfile (called renv.lock)
    • Really useful, but for a given project it could capture more than you need (which may affect speed).
  • Hard-code your dependencies, start minimal and grow:

    • It's really useful to grow your dependency list little by little.
    • Start with package imports (looking at the DESCRIPTION file) and then look for system dependencies
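The core idea behind a lockfile is simply to record which version of each package is in use. The sketch below does this by hand for a few base packages (chosen only so that it runs anywhere); the CSV format and the lock_file name are illustrative assumptions, and a real renv.lock written by renv::snapshot() records more detail.

```r
# Packages whose versions we want to pin (base packages, so the sketch runs anywhere)
pkgs <- c("stats", "utils", "tools")

# Record the installed version of each package
lock <- data.frame(
  package   = pkgs,
  version   = vapply(pkgs, function(p) as.character(utils::packageVersion(p)),
                     character(1)),
  row.names = NULL
)

# Store the record in a file that could be kept under version control
lock_file <- tempfile(fileext = ".csv")
write.csv(lock, lock_file, row.names = FALSE)
read.csv(lock_file)
```

With {renv}, the recording step is renv::snapshot() and renv::restore() reinstalls the recorded versions on another machine.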
105

Ways to find out the system dependencies of a 📦

  • r-hub/sysreqs 📦 provides a database with an API to quickly find out which system packages or other software need to be available to build and use R packages.

    • Usage: sysreqs::sysreq_commands(desc = "path/to/a/DESCRIPTION/file") returns the commands that install the necessary runtime system dependencies.
  • rstudio/r-system-requirements: a catalogue of system dependencies maintained independently by RStudio, used to power RStudio Package Manager

  • {maketools} 📦: to get runtime dependencies (only for Linux)

maketools::package_sysdeps("stringi")
## # A tibble: 1 × 6
##   shlib          package headers source version url
##   <chr>          <chr>   <chr>   <chr>  <chr>   <chr>
## 1 libc++.1.dylib <NA>    <NA>    <NA>   <NA>    <NA>
108

These awesome suggestions come from:

Speakers at useR! 2021:

  • Peter Solymos, speaker for: Data science serverless-style with R and OpenFaas
  • Max Held, speaker for: Bridging the Unproductive Valley: Building Data Products Strictly Without Magic

Other resources:

109

Productionising ML models developed in R

Author: Surya Avala, ML Engineer at @whispr (Talk slides)

(Not from useR!, but I found it really useful!)

Philosophy: Wrap everything up with Docker 🐳 and use {plumber} 📦 to generate an API for R.
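To give a feel for the {plumber} side, here is a minimal sketch of an API file. It is not taken from the talk: the endpoint path, the function name mean_endpoint and the file name plumber.R are assumptions. The #* comments are plumber annotations that turn a plain R function into an HTTP endpoint.

```r
# plumber.R: a hypothetical API file; the #* comments are plumber annotations

#* Return the mean of comma-separated numbers
#* @param values Comma-separated numbers, e.g. "1,2,3"
#* @get /mean
mean_endpoint <- function(values = "") {
  x <- as.numeric(strsplit(values, ",")[[1]])
  list(n = length(x), mean = mean(x))
}

# The endpoint is a plain R function, so it can be checked without a server
res <- mean_endpoint("1,2,3")
str(res)

# To serve it (assumes {plumber} is installed and this file is saved as plumber.R):
# plumber::pr_run(plumber::pr("plumber.R"), port = 8000)
```

In a Docker setup, the container would install R, {plumber} and the pinned package versions, then run the serving command when it starts.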

110

aRt gallery

Yonder 1831, 2021 by Thomas Lin Pedersen (Denmark). Flow lines, nearest neighbour, texture blending

111

Thanks to

112

Glossary of R packages mentioned

Data Validation and Package Testing

Next month surprise!

Manage package dependencies

113

Art by Ijeamaka Anyene, USA, Arcs IV (2020)

Any questions?

You can find me at:

114
