As a Workspaces Project Edition user, you have two options when uploading data into your workspace, both of which follow the same basic outline detailed below. These are:
- selecting files which have previously been de-identified
- selecting files which do not require de-identification
To upload either type of tabular data as a CSV file from the web interface, start by navigating to the workspace you want to upload data into.
Once in the correct workspace, navigate to the ‘Add’ dropdown menu and select ‘Upload Data’. In the ‘Upload Data’ panel shown above, browse to select the CSV file you wish to upload and click ‘Upload File’.
If your data has already been de-identified, you should select the statement indicating that your file either does not contain any identifiable data, or that it has already been de-identified.
Once selected, you also have the option to provide an accompanying table definition file (TDF), which describes the fields and data within the CSV file. This is not mandatory. If a TDF is not provided, a new one will be created based on the input gathered during the upload process, and you can download it at the end of the upload process if you wish.
You can also provide an authorisation reference for the dataset if applicable. This might be the name of the project or study which this dataset has been approved for use in, or the name of the data owner who has provided consent for this data to be uploaded.
Describe your dataset
The next screen requires you to enter a title and gives you the option of adding a basic description for your dataset. You can also provide an optional web URL pointing to the location of the dataset in a web-available repository, or to a scientific journal article relating to the dataset.
Parse your data
The next screen allows you to configure how your data should be processed when it is uploaded into the workspace.
The following settings can be configured on this screen:
- Data table name: The name of the database table that your data will be loaded into.
- Delimiter: The character used to separate the columns in your CSV file.
- Include header row: Determines if the first row in your CSV file contains column headers.
- Text qualifier: The character used to surround text within each column in the CSV file.
- Null qualifier: This character in the CSV file will be replaced with a database Null value when it is loaded into the workspace database.
- Encoding: The character encoding set to use when processing the data. Several common options are provided, with the default set to UTF-8.
Clicking on a column header in the grid allows you to change the name of the column that will be created in the workspace database. Similarly, clicking on the data type directly below the column header allows you to change the type of the column that will be created.
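If you want to check how a file will be read before you upload it, the parse settings above can be approximated locally. The following is a minimal Python sketch using the standard csv module. It is not the workspace's own parser, and the function names (parse_csv_text, parse_csv_file) and the "NULL" null qualifier are assumptions made for illustration.

```python
import csv
import io


def parse_csv_text(text, delimiter=",", text_qualifier='"',
                   null_qualifier="NULL", has_header=True):
    """Parse CSV text using settings that mirror the 'Parse your data' screen.

    Hypothetical helper for local checking only; the workspace's own
    parsing may differ in detail.
    """
    reader = csv.reader(io.StringIO(text),
                        delimiter=delimiter,
                        quotechar=text_qualifier)
    rows = list(reader)
    if not rows:
        return [], []

    if has_header:
        # 'Include header row': the first row supplies the column names.
        header, data = rows[0], rows[1:]
    else:
        # No header row: generate placeholder column names.
        header = [f"column_{i + 1}" for i in range(len(rows[0]))]
        data = rows

    # 'Null qualifier': values equal to the qualifier become None.
    # Note that csv.reader strips the text qualifier first, so a quoted
    # and an unquoted occurrence are treated the same way here.
    cleaned = [[None if value == null_qualifier else value for value in row]
               for row in data]
    return header, cleaned


def parse_csv_file(path, encoding="utf-8", **settings):
    """Read a file with the chosen encoding, then parse it as above."""
    with open(path, newline="", encoding=encoding) as handle:
        return parse_csv_text(handle.read(), **settings)


sample = 'id,name,notes\n1,"Smith, Jane",NULL\n2,Bob,"ok"\n'
header, rows = parse_csv_text(sample)
print(header)
print(rows)
```

In this sample, the text qualifier keeps the comma in "Smith, Jane" inside a single column, and the NULL value in the notes column is read as an empty (None) value. If the output does not look as expected, adjusting the delimiter, text qualifier, or encoding here can help you decide which settings to choose on the upload screen.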
Describe your fields
This section gives you the opportunity to alter the label of each column by changing the text in the ‘Label’ field and allows you to provide a description of the data captured within that column in the ‘Description’ field. This will ultimately provide a metadata description of each field should you wish to share the resulting TDF with another individual.
Once you have successfully stepped through all of the steps above, you will be informed that your data is ready to be uploaded. Pressing ‘Upload’ will upload data from your CSV file to the workspace.
When the upload is complete, a note will appear in the Summary tab notifying you that the upload was successful. You can also select ‘Download TDF’ if you wish to share the table definition file that has been built up during the upload process. The TDF is not automatically saved in your workspace, so you may wish to download a copy if you plan to use it in future.