What is i2b2?

i2b2 is an acronym that stands for “Informatics for Integrating Biology and the Bedside.” It is an NIH-funded National Center for Biomedical Computing (NCBC) devoted to translational research (http://www.i2b2.org).

More specifically, it is a scalable, open-source informatics framework and architecture that can be used to host a research data warehouse. This architecture consists of two major pieces. The first is the back-end infrastructure (the “Hive”), which handles security, access rights, and management of the underlying data repository. The second is an application suite of query and mining tools (the “Workbench”) that allows users to ask questions about the data. The system was first developed within the Partners HealthCare system in Boston at MGH, where it served as the architecture for their Research Patient Data Registry (RPDR).

What is the MSM Clinical and Translational Research Data Repository (CTRDR)?

The MSM CTRDR, also known as the i2b2 repository, is a centralized research data repository designed to integrate data from sources that range from electronic health records and lab results to genetic and research data, as well as other public sources such as birth registries and government data such as Medicaid. The CTRDR is an IRB-approved limited (i.e., not completely de-identified) data repository built on the i2b2 database architecture. This architecture ensures that researchers do not have access to identified data until they receive approval from the IRB.


The i2b2 repository is intended to serve the MSM research community.

What are the potential benefits of the i2b2 repository?

By integrating data from many of the Institution’s clinical and research sources, the Biomedical Informatics Unit will be able to provide investigators with a more complete view of a patient than might otherwise be possible.

The potential benefits of this project are enormous. For the first time, researchers and investigators will be able to directly query a large data repository for the purposes of cohort identification and hypothesis generation. At the same time, we will be able to augment many existing research databases with a wealth of clinical information that was previously unavailable, allowing investigators to draw new insights and conclusions as they perform their data analysis.


Researchers will be able to run queries and perform simple data analysis on the limited data for the purposes of cohort identification and hypothesis generation. If an investigator identifies a suitable cohort or a promising avenue of research, they can request access to the fully identified records of those patients through the IRB.

How can I access the i2b2 repository?


Authorized users: please click here.
Non-authorized users: please click here to submit a BIU services request.

Access to the i2b2 Workbench is obtained by completing and submitting a Biomedical Informatics Unit request form. After access is granted, the database administrator assigns the user a user name and password. Any MSM faculty member can request an i2b2 repository account. Access for other research staff will be granted, but a supervising faculty member must first approve the request. These staff members will be placed in a user group with the supervising faculty member, who assumes all responsibility for the collective actions of the group.

The repository is accessed primarily through a web browser, allowing users to identify and analyze patient cohorts. We also plan to allow users to download extracts from the repository for offline analysis and hypothesis generation, provided they have signed a Limited Data Use Agreement form.

There are three primary routes for users to access data in the i2b2 repository:


i) Through the web-based Workbench, which handles user authentication and provides automated query, export, and analysis tools.
ii) Through consultation with the BIU team, who create specialized queries and reports based on the user’s data needs.
iii) Through project-specific data marts that integrate an investigator’s research database with other clinical data from the general i2b2 repository.