Get up to speed: Useful reads

DataSHIELD

Throughout this workshop, some details about DataSHIELD and “resources” are not explained in depth, because the reader is expected to be familiar with them. If that is not the case, the following free online books and papers cover that material.

  • DataSHIELD paper: Description of what DataSHIELD is.

  • DataSHIELD wiki: Materials about DataSHIELD including:

    • Beginner material
    • Recorded DataSHIELD workshops
    • Information on current release of DataSHIELD
  • resource book: In this book you will find information about:

    • DataSHIELD (Section 5)
    • What resources are (Section 6/7)

Opal

We will interact with DataSHIELD through a data warehouse called Opal. This server handles the authentication of our credentials and the storage of data and “resources”, and it provides an R server where the non-disclosive analysis is conducted. Information about it can also be found online.

“resources”: A very simple explanation without any technicalities

It is important to have a solid understanding of what “resources” are and how we work with them, since we will use them to load our data into the R sessions. For that reason, we include a very brief, non-technical description of them here.

The “resources” can be imagined as a data structure that contains the information about where to find a data set and the access credentials to it; we as DataSHIELD users are not able to look at this information (it is privately stored on the Opal server), but we can load it into our remote R session to make use of it. Following that, the next step comes naturally.

Once the remote R session holds the information needed to access a dataset (a table, for example), we have to actually retrieve that dataset into the remote R session in order to analyze it. This step is called resolving the resource.
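To make the distinction between a “resource” and its resolved data more concrete, here is a toy analogy in plain R. It does not use DataSHIELD or Opal at all: the list fields (url, format, credentials) and the helper resolve_resource are hypothetical names chosen for illustration, and a real “resource” is stored and handled by the server rather than built by the user.

```r
# Toy analogy in plain R (no DataSHIELD involved). All names are hypothetical.

# Pretend this file lives somewhere that only the server can reach.
data_file <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, value = c(2.5, 3.1, 4.8)),
          data_file, row.names = FALSE)

# A "resource" only describes where the data is and how to access it.
resource <- list(
  url = data_file,
  format = "csv",
  credentials = list(user = "hypothetical_user", secret = "hypothetical_secret")
)

# "Resolving" uses that description to actually fetch the data.
resolve_resource <- function(res) {
  stopifnot(res$format == "csv")
  read.csv(res$url)
}

resource.resolved <- resolve_resource(resource)
```

In this sketch, resource is only a description of the data, while resource.resolved is the data frame that can be analyzed.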

These two steps appear in the code we provide as follows:

Loading the information of a “resource”:

DSI::datashield.assign.resource(conns, "resource", "resource.path.in.opal.server")

Resolving the “resource”:

DSI::datashield.assign.expr(conns, "resource.resolved", expr = as.symbol("as.resource.data.frame(resource)"))

This toy code first loads the “resource” into a variable called resource; it then retrieves the data that the “resource” points to and assigns it to a variable called resource.resolved.
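For readers who want to see where these two calls sit in a complete session, the following is a minimal sketch, assuming the DSI and DSOpal packages are installed and an Opal server is reachable. The helper names build_resolve_expr and load_and_resolve are hypothetical, and the server name, URL, user and password are placeholders, not real values. The resolving expression is built here as an R call rather than with as.symbol; this is one possible way to write it, not necessarily the one used in the workshop code.

```r
# Sketch only: the session part needs DSI, DSOpal and a reachable Opal server.
# URL, user, password and resource path are placeholders, not real values.

# Build the call as.resource.data.frame(<symbol>) for a given variable name.
build_resolve_expr <- function(resource_symbol) {
  as.call(list(as.symbol("as.resource.data.frame"), as.symbol(resource_symbol)))
}

# Load a "resource" into the remote R session and resolve it into a data frame.
load_and_resolve <- function(conns, resource_path,
                             resource_symbol = "resource",
                             data_symbol = "resource.resolved") {
  # Step 1: load the information of the "resource"
  DSI::datashield.assign.resource(conns, resource_symbol, resource_path)
  # Step 2: resolve the "resource"
  DSI::datashield.assign.expr(conns, data_symbol,
                              expr = build_resolve_expr(resource_symbol))
  invisible(data_symbol)
}

# Example session (not run; replace the placeholders with your own details).
if (FALSE) {
  library(DSOpal)  # provides the "OpalDriver" used below
  builder <- DSI::newDSLoginBuilder()
  builder$append(server = "study1", url = "https://opal.example.org",
                 user = "user", password = "password", driver = "OpalDriver")
  conns <- DSI::datashield.login(logins = builder$build(), assign = FALSE)
  load_and_resolve(conns, "resource.path.in.opal.server")
  DSI::datashield.logout(conns)
}
```

Wrapping the two steps in one function keeps the variable names consistent between loading and resolving, which is where mistakes tend to creep in when the calls are typed by hand.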