Thursday, March 31, 2016

Linking MATLAB to REDCap

REDCap is a mature, secure web application for building and managing online surveys and databases. It's used extensively in the biomedical community to store and manage clinical data. The Campbell lab uses it to manage clinical data about the patients and organ donors who donate cardiac samples for our research.

Although REDCap has many advantages, I've found exporting data to be tricky. In principle, you can export data in formats for:

  • SPSS
  • SAS
  • R
  • Stata
  • CSV / Excel

In practice, everything but the CSV / Excel format requires downloading two or three files. One of these files has the data; the others contain code that tells the statistics package how to import it. I haven't experimented with this approach because I'm not keen on tying my data files to specific import code. That seems complicated.

As noted above, REDCap labels its CSV output format as "CSV / Excel". It's important to note, though, that the file is not an xlsx spreadsheet. It's a CSV file, which uses commas to separate entries. It might look like this:

Date,Name,Age,Comments,BMI
11/22/2015,Mike,22,Nothing,24.8
11/23/2015,Ken,23,Interestingly, Ken had red hair, and also blue eyes,25.2

This looks okay at first, but you quickly get into trouble if any of the fields include commas. For example, splitting the second data row at every comma gives seven entries instead of five, and the BMI column ends up holding " Ken had red hair" instead of 25.2.
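
The problem is easy to reproduce in MATLAB. The snippet below is only an illustration of naive comma splitting with strsplit, applied to the example row above. It is not the import code I actually used.

```matlab
% Header and the problematic row from the example above.
header_line = 'Date,Name,Age,Comments,BMI';
data_line = '11/23/2015,Ken,23,Interestingly, Ken had red hair, and also blue eyes,25.2';

% Naive parsing: split at every comma.
header_fields = strsplit(header_line, ',');
data_fields = strsplit(data_line, ',');

% The header has 5 fields but the data row splits into 7.
fprintf('Header has %d fields, data row has %d\n', ...
    numel(header_fields), numel(data_fields));

% Look up the BMI column by name and see what landed there.
bmi_index = find(strcmp(header_fields, 'BMI'));
fprintf('Value read as BMI: "%s"\n', data_fields{bmi_index});
```

The last line prints " Ken had red hair" as the BMI, because the commas inside the comment were treated as field separators.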

I ran into this problem with one of our datasets and developed several workarounds. They did the job, but I wasn't confident that I would catch every new error going forward. Then I remembered that REDCap has an API and thought I would give that a look.

This post gave me some useful pointers, but I don't have much experience with Python, so I had trouble getting things to work. The code in the REDCap API Sandbox also seemed a bit buggy.

The real breakthrough came when I realized that cURL makes it easy to communicate via HTTP. I used these examples and some snippets from the REDCap API and was quickly able to download a REDCap report in JSON format.

For example, the command generated by this print statement


curl_command = sprintf('curl --data "token=%s&content=report&format=json&report_id=2313&rawOrLabel=label&rawOrLabelHeaders=label&exportCheckboxLabel=true&returnFormat=json" https://redcap.uky.edu/redcap/api/ -o %s',my_token,output_file_name);

sucks my REDCap report from the server and saves it to the specified text file.
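
To run a command like this from inside MATLAB, the string can be handed to the operating system with system(). The sketch below is one way to do that, under some assumptions: cURL is installed and on the system path, the command starts with "curl --data" (which sends the quoted fields as an HTTP POST body), and the token and file name are placeholders. The run_command flag is something I've added for illustration so the command can be inspected before anything is sent.

```matlab
% Placeholder values (hypothetical): substitute your own token and file name.
my_token = 'YOUR_API_TOKEN';
output_file_name = 'redcap_report.json';

% Server address and report number taken from the command above.
api_url = 'https://redcap.uky.edu/redcap/api/';
report_id = 2313;

% Build the cURL command. The leading 'curl --data' is an assumption.
curl_command = sprintf(['curl --data "token=%s&content=report&format=json' ...
    '&report_id=%d&rawOrLabel=label&rawOrLabelHeaders=label' ...
    '&exportCheckboxLabel=true&returnFormat=json" %s -o %s'], ...
    my_token, report_id, api_url, output_file_name);
disp(curl_command);

% Set to true once a real token is in place.
run_command = false;
if run_command
    % A nonzero status means cURL itself reported a failure.
    [status, cmd_output] = system(curl_command);
    if status ~= 0
        error('cURL failed with status %d: %s', status, cmd_output);
    end
end
```

A zero status only says that cURL ran. If the token or report number is wrong, I'd expect the saved file to hold an error message from the server rather than the report, so it's worth opening the file the first time.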

I then used the MATLAB JSONlab toolbox to turn the data into a MATLAB structure, which is what I really wanted for my data processing anyway.
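
As a rough illustration of that last step, the sketch below uses jsondecode, which is built into MATLAB from R2016b onwards, in place of JSONlab. The JSON text is made up: it reuses the fields from the CSV example and assumes the report arrives as an array of records whose values are all strings.

```matlab
% Hypothetical JSON in the shape assumed for a report export: an array of
% records with string values. Field names follow the CSV example.
json_text = ['[{"Date":"11/22/2015","Name":"Mike","Age":"22",' ...
    '"Comments":"Nothing","BMI":"24.8"},' ...
    '{"Date":"11/23/2015","Name":"Ken","Age":"23",' ...
    '"Comments":"Interestingly, Ken had red hair, and also blue eyes",' ...
    '"BMI":"25.2"}]'];

% jsondecode returns a struct array when every record has the same fields.
records = jsondecode(json_text);

% Values arrive as text, so convert numeric fields explicitly.
bmi = str2double({records.BMI});

% The commas in the comment no longer interfere with the BMI field.
fprintf('%s: BMI %.1f\n', records(2).Name, bmi(2));
fprintf('Comment: %s\n', records(2).Comments);
```

With a downloaded report, json_text would come from the saved file instead, for example json_text = fileread(output_file_name). Because the fields are named, the comment that broke the CSV parsing comes through in one piece.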

Bottom line, from now on, I think I'm going to give up exporting data from REDCap directly, and suck it into MATLAB using the REDCap API. That way, I get around some of the limitations of the CSV (or REDCap alternative) outputs and I can get the latest version of the database on the fly whenever I need it.
Welcome to Data Driving, Ken Campbell's new blog on scientific computing.