Raw Data

Exporting Raw Data

You can use the Researcher Dashboard to export data collected from data sources or activity responses. Both can be exported as CSV, and some data types offer additional formats: GPS data can also be exported as KML or GPX, and contact network data can also be exported as GEXF.

To export your data, open the Researcher Dashboard and navigate to the Data Export page.

On this page, start by choosing the participants whose data you want to export and the type of data to export. Depending on the type of data, Avicenna may also ask you to choose the export format. For most data sources, you can download the data as a CSV file. For GPS you can also choose KML or GPX, and for Bluetooth Beacons you can also choose GEXF.

After selecting the export format, you can choose the date range. Pressing Export starts the data export process. The export may take up to a few hours to complete, depending on the size of the data. You can come back to this page at any time to check the status of your data export. When the export file is ready, you will receive an email, and you can return to the Data Export page to download the file.

Note that the Data Export page also lists all Survey Response export requests, even though you create those exports from the Responses page, as explained in the View Responses document.

Each data export will be available for download for up to 7 days. After that, Avicenna automatically deletes the export file. If you need to download the data again, you must create a new data export.

At the moment, Avicenna does not limit the size of the data export file. For sensor data this file can be very large, especially if the export covers a long date range and many participants, so it is not uncommon for a file to exceed 10 GB. If you expect your data export to be very large, we ask that you break it into multiple requests so that Avicenna generates multiple smaller files.
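One way to plan several smaller requests is to split the overall date range into consecutive windows and submit one export per window on the Data Export page. The following is a minimal sketch; the function name `split_date_range` and the 30-day window are illustrative assumptions, not Avicenna settings, and it assumes both ends of a date range are inclusive.

```python
from datetime import date, timedelta


def split_date_range(start, end, max_days):
    """Yield (chunk_start, chunk_end) pairs that cover start..end inclusive.

    Each chunk spans at most max_days calendar days.
    """
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    chunk_start = start
    while chunk_start <= end:
        # The last chunk is clipped so it never runs past the end date.
        chunk_end = min(chunk_start + timedelta(days=max_days - 1), end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end + timedelta(days=1)


# Hypothetical study period, split into windows of at most 30 days.
chunks = list(split_date_range(date(2024, 1, 1), date(2024, 3, 31), max_days=30))
for chunk_start, chunk_end in chunks:
    print(chunk_start.isoformat(), "to", chunk_end.isoformat())
```

Each printed pair can then be entered as the date range of a separate export request. Splitting by participant group is another option when a single window is still too large.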

Data Fields

The data fields you will find in each file depend on the type of data being exported. For survey responses, the available fields are explained in the View Responses document. For the sensor-based data sources, the fields are described in the related section of the Data Sources document.

Downloading Record Counts

Download Sample Data

You can download a CSV file containing the count of the collected records in each hour-long interval grouped by data source for each participant in the study. This provides a quick overview of the data collection volume across different data sources in your study.

To do that, on the Data Export page, click on Download All Data Sources Record Counts as CSV.

The downloaded CSV file includes the following columns:

  • data_source_id: The unique identifier for the data source. See the Data Source ID Reference for more details.
  • user_id: The participant’s unique identifier.
  • participant_type: Either “Main” or “Test”.
  • device_id: The unique identifier of the device that collected the data.
  • time_bin: The date and hour for which the aggregated count is reported.
  • count: The number of records collected for that data source during that hour.
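As an illustration of working with this file, the sketch below sums the hourly counts into one total per participant and data source using only the Python standard library. The column names come from the list above; the sample rows, including the `time_bin` and `device_id` values, are made up for the example and may not match the exact formatting of a real download.

```python
import csv
import io
from collections import defaultdict


def total_counts(csv_file):
    """Sum the hourly counts per (user_id, data_source_id) pair."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        key = (row["user_id"], row["data_source_id"])
        totals[key] += int(row["count"])
    return dict(totals)


# Made-up sample rows standing in for a downloaded record counts file.
sample = io.StringIO(
    "data_source_id,user_id,participant_type,device_id,time_bin,count\n"
    "15,1001,Main,dev-a,2024-01-01 09:00,120\n"
    "15,1001,Main,dev-a,2024-01-01 10:00,80\n"
    "16,1001,Main,dev-a,2024-01-01 09:00,6\n"
    "15,1002,Test,dev-b,2024-01-01 09:00,45\n"
)

totals = total_counts(sample)
for (user_id, data_source_id), count in sorted(totals.items()):
    print(user_id, data_source_id, count)
```

For a real file, replace the `io.StringIO` sample with `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded CSV.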

Data Source ID Reference

The following table provides a reference for mapping data source IDs to their corresponding data sources:

ID Data Source
1 Accelerometer
2 Ambient Temperature
3 Gyroscope
4 Gravity
5 Light
6 Linear Acceleration
7 Magnetic Field
8 Orientation
9 Pressure
10 Proximity
11 Relative Humidity
13 WiFi
14 Bluetooth
15 GPS
16 Battery
19 Ambient Audio
20 App Usage (Legacy)
24 Screen State
25 Pedometer
26 Activity Recognition
27 Bluetooth Beacon
30 Fitbit Heart Rate
33 Fitbit Sleep
37 Garmin Health
39 HealthKit
42 Garmin Health Daily
43 Garmin Health Heart
44 Garmin Health Respiration
45 Garmin Health Sleep Daily
46 Garmin Health Sleep
47 Fitbit Activity Summary
48 Garmin Health Pulse Ox
49 Garmin Health Stress
50 Garmin Health Body Composition
51 Garmin Health User Metrics
52 Weather
53 Fitbit Activity
54 Fitbit Sleep Level
55 Fitbit Active Zone
58 WHOOP Workout
59 WHOOP Sleep
60 WHOOP Recovery
61 Polar Exercise
62 Polar Sleep
63 Polar Continuous Heart Rate
64 Polar SleepWise Circadian Bedtime
65 Polar SleepWise Alertness
66 Fitbit Weight Log
67 SensorKit Heart Rate
68 SensorKit Accelerometer
69 SensorKit Rotation Rate
70 SensorKit Ambient Light
71 SensorKit Ambient Pressure
72 SensorKit Device Usage Report
73 SensorKit Keyboard Metrics
74 SensorKit Message Usage Report
75 SensorKit On Wrist State
76 SensorKit Pedometer
77 SensorKit Phone Usage Report
78 SensorKit Telephony Speech Metrics
79 SensorKit Siri Speech Metrics
80 SensorKit Visits
81 SensorKit Wrist Temperature
82 Hexoskin Shirt
98 HealthKit Activity
99 HealthKit Vital Signs
100 HealthKit Sleep Analysis
103 HealthKit State of Mind
105 Web Activity Tracking
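When analyzing the record counts file, it can help to translate IDs into readable names. The sketch below hard-codes a handful of entries copied from the table above; the dictionary and the helper name `data_source_name` are illustrative, and a real script would include every ID relevant to the study.

```python
# A partial mapping copied from the reference table; extend it as needed.
DATA_SOURCE_NAMES = {
    1: "Accelerometer",
    15: "GPS",
    16: "Battery",
    24: "Screen State",
    27: "Bluetooth Beacon",
    39: "HealthKit",
    105: "Web Activity Tracking",
}


def data_source_name(data_source_id):
    """Return the readable name for an ID, or a placeholder if it is not mapped."""
    return DATA_SOURCE_NAMES.get(int(data_source_id), f"Unknown ({data_source_id})")


print(data_source_name("15"))  # IDs read from a CSV arrive as strings
print(data_source_name(12))    # 12 does not appear in the table above
```

Returning a placeholder instead of raising an error keeps a summary script running when it meets an ID that has not been added to the mapping yet.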

Direct Database Access

While Avicenna’s data export allows you to create complex queries and export any data from your study as CSV, this will not cover all analysis cases. For more advanced use-cases, you may need to connect directly to the database.

We can provide direct database access to your team to query and work with your study data. At the moment, this feature is not automatically provided. If you need to have direct database access, please contact Avicenna Support.

Handling of Timezones

Every piece of information stored in Avicenna is time-stamped as appropriate. All time values are stored internally in UTC.

However, keep in mind that all participants’ data exported from your study will be based on the participants’ timezones. This is because presenting participants’ data in their local timezones enhances researchers’ ability to interpret and analyze the data accurately, aligning it with the study protocol and the participants’ context.
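If your analysis needs to compare participants in different timezones on a common clock, local timestamps can be converted back to UTC. The sketch below assumes, purely for illustration, that a timestamp is an ISO 8601 string that carries its UTC offset; check the actual timestamp columns in your export before relying on it. The helper name `to_utc` is hypothetical.

```python
from datetime import datetime, timezone


def to_utc(local_timestamp):
    """Convert an ISO 8601 timestamp with a UTC offset to UTC."""
    parsed = datetime.fromisoformat(local_timestamp)
    if parsed.tzinfo is None:
        # Without an offset the conversion is ambiguous, so refuse to guess.
        raise ValueError("timestamp has no UTC offset; the participant's timezone is needed")
    return parsed.astimezone(timezone.utc)


# Hypothetical value: 08:30 local time in a timezone five hours behind UTC.
print(to_utc("2024-03-10T08:30:00-05:00").isoformat())
# prints 2024-03-10T13:30:00+00:00
```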

Troubleshooting

I see extra lines in the exported CSV files

When working with the exported CSV files (e.g., survey responses), cell values with line breaks may appear as extra lines in some text editors. The CSV format handles this by enclosing such content in double quotes ("). Most data analysis tools (e.g., Excel, Google Sheets, R, Python) correctly interpret these quoted fields.

Try importing the CSV into a spreadsheet application like Google Sheets and see if the issue persists.
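The behavior can also be checked in Python. In the sketch below, the sample text and its column names are made up; it shows that a quoted value containing a line break spans two physical lines but is parsed by the standard `csv` module as a single field.

```python
import csv
import io

# Made-up sample: the second record contains a line break inside a quoted field.
raw = 'user_id,response\n1001,"First line\nSecond line"\n1002,Single line\n'

# newline="" leaves line endings untouched so the csv module can handle them.
rows = list(csv.reader(io.StringIO(raw, newline="")))

print(len(raw.splitlines()))  # physical lines seen by a text editor
print(len(rows))              # logical rows, including the header
print(repr(rows[1][1]))       # the multi-line value, kept as one field
```

When reading a real export, open the file with `open(path, newline="", encoding="utf-8")` for the same reason.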