UTOPIAN Data Safe Haven
What is the UTOPIAN Data Safe Haven?
The UTOPIAN Data Safe Haven is a secure, researchable database composed of de-identified patient records extracted from electronic medical records (EMRs) in contributing primary care practices associated with the University of Toronto.
When a patient registers with a family physician and seeks healthcare, the normal expectation is that their health data will be shared only with individuals and institutions within the circle of care, for the purpose of providing direct care. Family physicians, as the data custodians, have a key role in ensuring that public trust and confidence are maintained. This need for trust and confidence goes far beyond achieving the right level of data security and privacy. Issues of trust and consent for sharing data for direct care are already addressed by policy and professional guidance, but potential secondary uses such as research need to be considered if Ontario is to develop this capability in the way that other jurisdictions have internationally.
Through UTOPIAN, we are developing an Ontario Primary Care EMR Database and providing data extraction and analysis services. The database will be based initially on the family physician practices participating in UTOPIAN but will be expanded to include other parts of Ontario and other health structures such as Community Health Centres.
Significant work has been undertaken to ensure that the extracted EMR data are transformed into de-identified, research-ready data and that the highest standards of privacy are maintained. Data extracted from the EMRs of participating practices are cleaned, coded, de-identified and transferred to the secure UTOPIAN Data Safe Haven server every three months.
The data stored in the UTOPIAN Data Safe Haven are used by researchers at the Department of Family and Community Medicine (DFCM) to answer questions about primary health care. The data are retained for the duration of the project for all patients who do not opt out and for all providers who do not withdraw consent.
The UTOPIAN dataset is available for research. To find out more about the UTOPIAN dataset, please click the links below:
- Geographical distribution of practices that contribute data to the UTOPIAN Data Safe Haven
- 2018 Q1 data cycle statistics (pdf doc)
Please contact Ivanka Pribramska, UTOPIAN Research Administrator, for more information about how to access the dataset.
The data extracted for the UTOPIAN Data Safe Haven are shared with the following entities:
Institute for Clinical Evaluative Sciences (ICES)
The data from the UTOPIAN Data Safe Haven can be linked with administrative data held at the Institute for Clinical Evaluative Sciences (ICES). A separate linkage file containing a unique patient identifier is generated within the practice environment and securely transferred directly to ICES. This file is then used to link UTOPIAN data to ICES data holdings. No data identifying health care providers or patients are released from ICES. Note: as of December 2017, this linked dataset is not yet available for research, pending final approvals.
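The separation between de-identified research data and a linkage file can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the field names, the keyed-hash study ID, and the helper functions `make_study_id`, `split_records` and `link` are hypothetical and do not describe the actual UTOPIAN or ICES procedures.

```python
import hashlib
import hmac

# Hypothetical field names; the source does not list the actual identifiers.
IDENTIFYING_FIELDS = {"health_number", "name", "date_of_birth"}


def make_study_id(secret_key: bytes, health_number: str) -> str:
    """Derive a pseudonymous study ID with a keyed hash (an assumed technique)."""
    digest = hmac.new(secret_key, health_number.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def split_records(records, secret_key):
    """Return (de-identified records, linkage rows) built from identifiable records."""
    deidentified, linkage = [], []
    for rec in records:
        study_id = make_study_id(secret_key, rec["health_number"])
        # Keep only non-identifying fields in the research copy.
        clinical = {k: v for k, v in rec.items() if k not in IDENTIFYING_FIELDS}
        clinical["study_id"] = study_id
        deidentified.append(clinical)
        # The linkage row is the only place the identifier and study ID meet.
        linkage.append({"study_id": study_id, "health_number": rec["health_number"]})
    return deidentified, linkage


def link(deidentified, linkage, admin_by_health_number):
    """Join de-identified records to administrative data using the linkage rows."""
    health_number_by_study_id = {row["study_id"]: row["health_number"] for row in linkage}
    linked = []
    for rec in deidentified:
        admin = admin_by_health_number.get(health_number_by_study_id[rec["study_id"]])
        if admin is not None:
            # The merged row carries the study ID but never the identifier itself.
            linked.append({**rec, **admin})
    return linked
```

In this sketch the research copy and the linkage rows travel separately, so only the party holding both (here, the caller of `link`) can join the two sources, and the joined rows still contain no direct identifier.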
Canadian Primary Care Sentinel Surveillance Network (CPCSSN)
The Canadian Primary Care Sentinel Surveillance Network (CPCSSN) is a multi-disease electronic medical record surveillance system that aims to improve the quality of care for Canadians. By collecting de-identified health information from the EMRs of participating physicians (also referred to as sentinels), CPCSSN has created a platform for multi-level research across Canada. Primary care EMR data are extracted quarterly; the data are cleaned and coded using case definitions developed for the following conditions: diabetes, hypertension, COPD, depression, osteoarthritis, epilepsy, Parkinson's disease and dementia (CPCSSN case definitions).
In the Greater Toronto Area, CPCSSN runs under the UTOPIAN umbrella. After the data in the UTOPIAN Data Safe Haven have been fully de-identified, a copy is forwarded to and merged into the national CPCSSN Data Repository. Only approved researchers are allowed to use the CPCSSN Data Repository to conduct primary care research.
Through CPCSSN, UTOPIAN is able to run multiple queries for patient eligibility based on specific project needs. This helps researchers identify sites of interest for their study population. By using preliminary de-identified data, it is possible to coordinate with site-specific data custodians to vet potential patients for clinical trials. UTOPIAN is also able to share ready-to-use queries with sites so that these can be executed on identifiable patient data for study-specific purposes (e.g. sending study invitation letters or e-mails to eligible patients).
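As a rough illustration of such an eligibility query, the sketch below counts de-identified patients who meet study criteria at each site. The record fields (`site_id`, `age`, `conditions`), the criteria and the function `count_eligible_by_site` are hypothetical and are not taken from the CPCSSN data dictionary.

```python
from collections import Counter


def count_eligible_by_site(records, min_age, required_condition):
    """Count records per site that meet an age threshold and carry a condition.

    `records` is an iterable of dicts with hypothetical keys
    'site_id', 'age' and 'conditions' (a collection of condition names).
    """
    counts = Counter()
    for rec in records:
        if rec["age"] >= min_age and required_condition in rec["conditions"]:
            counts[rec["site_id"]] += 1
    return counts
```

Per-site counts of this kind are enough to identify sites of interest without exposing any identifiable information; the same criteria could then be run by a site against its own identifiable records.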
As of the end of Q1 2018 (March 31, 2018), the dataset contained 575,311 de-identified patient primary care records, contributed by 359 family physicians (UTOPIAN Data Safe Haven - statistics for Q1 2018).
Below are a few resources providing more information about CPCSSN and the CPCSSN dataset:
- To obtain more information about access to CPCSSN national data, click here.
- To see the current UTOPIAN projects using CPCSSN national data please check Projects and Publications.
- CPCSSN data dictionary
- Results of research done using CPCSSN data - publications
- Results of research done using EMRALD data - publications
Diabetes Action Canada (DAC) Data Repository
Launched in March 2016, Diabetes Action Canada (DAC) is a national network that aims to transform the health trajectory of patients living with diabetes and its related complications. The network facilitates meaningful connections between patients, primary care providers, and specialists to improve care and achieve significant cost savings within the health system. It also enables respectful communication with researchers to co-build studies that produce solutions for the most important health concerns identified by patients. For more information about DAC, click here.
UTOPIAN Data Safe Haven Data Flow
For current UTOPIAN data contributors, click here for more information.
For prospective UTOPIAN data contributors, click here for more information.