List of fhir-to-omop Tools and Utilities: Create and Populate an OMOP CDM Database with FHIR Patients From Scratch

This page summarizes the tools available in the fhir-to-omop project for building and populating an OMOP CDM database from scratch using data from a FHIR server. All of the components used to do this are also available independently and are listed below as Utilities. They cover tasks such as authenticating to a FHIR server using OAuth2 (or any other method), getting the list of patient ids from a FHIR server, retrieving those patients, and converting FHIR patients into OMOP CDM resources. The tools and utilities listed below are provided publicly under the Apache 2 license by the fhir-to-omop project. Each tool has a simplest complete example in the form of a JUnit integration test. These tests can be run individually to observe the intended behavior, and they are also run routinely as part of our build process. To run them, follow the instructions for how to use fhir-to-omop as a developer.

Details for Each Component are Available Here:

- Build Your OMOP CDM Database
- Get the IDs of the Patients on Your FHIR Server
- Download Patients from Your FHIR Server
- Upload Your FHIR Patients into Your OMOP CDM Database

Notes on Configuration and Testing

All of the fhir-to-omop tools are configured using a single properties file. Most of the parameters in this file are for the instance of OMOP that can be created locally by these tools, so most of them are already populated with default values. A copy of this file is provided in the root directory of the fhir-to-omop.zip file created for each distribution (e.g. fhir-to-omop-0.0.14.zip). The fhir-to-omop project maintains a robust set of JUnit integration tests that can be run individually. The entire JUnit test suite can be run using the RunAllIntegrationTests class.
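As a rough illustration of how a single properties file can drive a set of tools, the sketch below parses properties-format text with java.util.Properties and falls back to a default when a key is absent. The class name and the key names (fhir.server.url, omop.db.name) are hypothetical and are not taken from the project; consult the app.properties file shipped with the distribution for the actual parameter names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class AppPropertiesSketch {

    /** Parses text in the standard .properties format into a Properties object. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try (Reader reader = new StringReader(text)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the value for key, or defaultValue when the key is missing. */
    static String get(Properties props, String key, String defaultValue) {
        return props.getProperty(key, defaultValue);
    }

    public static void main(String[] args) {
        // Hypothetical keys, for illustration only.
        Properties props = parse("fhir.server.url=https://example.org/fhir\n");
        System.out.println(get(props, "fhir.server.url", "(not set)"));
        System.out.println(get(props, "omop.db.name", "omop_default"));
    }
}
```

To read a real file instead of a string, the same Properties.load call accepts a reader or input stream opened on that file.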

Notes on Design Patterns and System Architecture

The architecture and design of the fhir-to-omop project are discussed in detail here; however, a few key features should be noted on this page.

Authentication and Application Parameters

All authentication tokens and other configurable parameters are stored in the app.properties file. A (mostly) working example of this file is included in the root folder of the zip file used by application users. For full functionality you will need to provide your UMLS and Synthea keys. Developers can add an app.properties file to the fhir-to-omop/src/main/resources/auth folder. This file is explicitly excluded from our repository in .gitignore and is explicitly removed as part of the deploy build.

Data Value Objects (DVOs)

The OMOP Common Data Model (CDM) tables are represented in the Java code base of fhir-to-omop as Data Value Objects (DVOs). These DVOs are auto-generated by querying the database's metadata and then constructing Java classes that exactly represent these tables.

The intersection between FHIR and OMOP is completely encapsulated in the OmopPersonEverything class. An OmopPersonEverything is constructed from a FHIR Patient/[ID]/$everything resource and exposes that resource as collections of DVOs that map exactly to OMOP tables. These DVOs can be used to write data directly to the database using the Dao class, or to represent the data as CDM structures in other ways (e.g. by writing the data to CSV files that represent the CDM tables). In some cases additional information is derived from the data. This additional data is managed using the DvoProxy class (which creates DVO classes under the hood). Details for the OmopPersonEverything and OmopPersonEverythingFactory classes are listed in the Utilities section below.
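To make the DVO idea concrete, here is a minimal sketch of a value object with one field per column and a method that renders it as a CSV row. The class PersonDvoSketch is hypothetical and is not one of the project's generated classes; the field types are assumptions. The column names are the required columns of the OMOP CDM person table.

```java
final class PersonDvoSketch {

    /** Header matching the column order used by toCsvRow(). */
    static final String CSV_HEADER =
            "person_id,gender_concept_id,year_of_birth,race_concept_id,ethnicity_concept_id";

    // One field per column of the table this object represents.
    Integer personId;
    Integer genderConceptId;
    Integer yearOfBirth;
    Integer raceConceptId;
    Integer ethnicityConceptId;

    /** Renders the fields as one CSV row; null values become empty cells. */
    String toCsvRow() {
        Object[] values = {personId, genderConceptId, yearOfBirth, raceConceptId, ethnicityConceptId};
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                row.append(',');
            }
            if (values[i] != null) {
                row.append(values[i]);
            }
        }
        return row.toString();
    }

    public static void main(String[] args) {
        PersonDvoSketch person = new PersonDvoSketch();
        person.personId = 1;
        person.genderConceptId = 8532; // OMOP standard concept for FEMALE
        person.yearOfBirth = 1980;
        person.ethnicityConceptId = 0; // 0 = no matching concept
        System.out.println(CSV_HEADER);
        System.out.println(person.toCsvRow());
    }
}
```

Because the object carries no persistence logic of its own, the same instance could be handed to a database writer or to a CSV writer, which is the flexibility the paragraph above describes.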

Tools

The fhir-to-omop tools are the more complete solutions that perform complex tasks, such as creating an entire OMOP instance from scratch (including populated vocabulary tables) or populating an OMOP instance with the entire contents of a FHIR patient server. All of the tools listed in this section can be launched from the Standalone Application. The standalone command for each tool is shown below the title of its description (e.g. "fhir-to-omop i" will build a complete OMOP CDM database).
Build Your OMOP CDM Database

Build a complete OMOP CDM Database
(fhir-to-omop i)

The CreateOmopInstanceTool class will build a complete OMOP CDM database that is ready to receive data and to be plugged into OHDSI tools. This tool will:

  • Drop the OMOP Database to be created if it exists
  • Create the OMOP database instance using the CDM 5.4 scripts
  • Create the OMOP DQD results database instance
  • Create an OMOP Database user
  • Create tables used for mappings
  • Create a cdm_source record (this is used by the Data Quality Dashboard)
  • Create the sequences used to generate the primary keys when FHIR resources are imported
  • Load all of the terminology tables (e.g. concept)
  • Create indexes used during the data loading process
The following prerequisites must be met before this tool can be run.
  • A running instance of Microsoft SQL Server
  • A folder containing the vocabulary files for the instance, downloaded from Athena
  • Correct settings in the app.properties file. The app.properties file provided with the distribution has everything required for a default build except for the location of your Athena vocabulary files and the settings for your MS SQL Server database.
  • Java: This application was developed and tested using Java 1.8.0
This tool can be run using the CreateOmopInstanceTool class directly. This class has a main method that will create the database, user, etc. based on the parameters in the app.properties file.

This tool can be run from the Standalone Application using the "instant-omop" or "i" option:
fhir-to-omop instant-omop
fhir-to-omop i

Get the IDs of the Patients on Your FHIR Server

Get a list of all of the IDs for all of the patients on a FHIR server.
(fhir-to-omop ids)

Before you can ask for a patient, you need to know which patients exist on a server. This tool fetches the entire list of patient ids from a server and writes them to a series of files containing 1000 patient ids each. The files are written to the directory specified in the app.properties file, which is also where this tool gets all of its other parameters. This tool was developed using the Synthea SyntheticMass FHIR Server. To run it against SyntheticMass you will need to get Synthea credentials as described in the Getting Started document.
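The batching described above (1000 ids per file) amounts to splitting one long list into fixed-size chunks. The sketch below shows only that splitting step; the class name is hypothetical, and how the project actually fetches the ids and names the files is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

final class PatientIdChunker {

    /** Splits ids into consecutive chunks of at most chunkSize elements. */
    static List<List<String>> chunk(List<String> ids, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, ids.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(ids.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            ids.add("patient-" + i); // placeholder ids
        }
        List<List<String>> chunks = chunk(ids, 1000);
        for (int i = 0; i < chunks.size(); i++) {
            System.out.println("file " + i + ": " + chunks.get(i).size() + " ids");
        }
    }
}
```

Each chunk would then be written to its own file in the configured directory.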

This tool can be run using the DownloadPatientIds class. This class has a main method that can be called directly.

This tool can be run from the Standalone Application using the "download-patient-ids" or "ids" option:
fhir-to-omop download-patient-ids
fhir-to-omop ids

Download Patients from Your FHIR Server

Download a set of patients from a FHIR Server.
(fhir-to-omop d)

The DownloadPatients tool is used to download a set of patients from a FHIR server. This class has a main method that can be called directly. The FHIR server is configured in the app.properties file. This tool uses the files downloaded by the DownloadPatientIds tool, which contain the patient ids for the FHIR server; the directory containing these id files is also set in the app.properties file. When the tool is run, all of the resources that make up the Patient/[id]/$everything resource for each patient are downloaded. The Patient/[id]/$everything response may be paged, so each patient is usually represented by more than one file. The files for each patient are downloaded into a directory named after the patient id. Each file in that directory is prefixed with the download order (0_, 1_, etc.), the patient id, and an arbitrary GUID (to prevent duplicates).
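The naming scheme described above can be sketched as follows. This is an illustration only: the class name, the underscore between the patient id and the GUID, and the .json extension are assumptions, and the project's actual file names may differ.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

final class DownloadFileNaming {

    /** Builds "<order>_<patientId>_<guid>.json"; separator and extension are assumptions. */
    static String fileName(int downloadOrder, String patientId) {
        return downloadOrder + "_" + patientId + "_" + UUID.randomUUID() + ".json";
    }

    /** Resolves the target path as <root>/<patientId>/<fileName>. */
    static Path targetPath(Path root, int downloadOrder, String patientId) {
        return root.resolve(patientId).resolve(fileName(downloadOrder, patientId));
    }

    public static void main(String[] args) {
        Path root = Paths.get("fhir-patients"); // hypothetical output directory
        // One file per page of the paged $everything response.
        for (int page = 0; page < 3; page++) {
            System.out.println(targetPath(root, page, "patient-123"));
        }
    }
}
```

The random GUID means two downloads of the same page never collide on disk, while the numeric prefix keeps the page order recoverable.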

This tool can be run from the Standalone Application using the "download" or "d" option:
fhir-to-omop download
fhir-to-omop d

Upload Patients

Upload Your FHIR Patients into Your OMOP CDM Database
(fhir-to-omop u)

The PopulateOmopInstanceFromFhirFiles tool can be used to populate an OMOP instance from a set of files representing FHIR patients. This tool is configured in the app.properties file. This tool will scan a root directory. Each directory in the root directory represents a patient, and all of the files in that directory are Patient/[id]/$everything resources. Each Patient/[id]/$everything file is read, and the set of files is parsed into a single OmopPerson object. The OmopPerson object is composed of Data Value Objects (DVOs) that are exact 1:1 mappings to the OMOP CDM. The OmopPerson object is then written to the OMOP CDM Database.
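The directory layout described above (one subdirectory per patient, several files per subdirectory) can be walked with the standard java.nio.file API. The sketch below only collects the files per patient; the class name is hypothetical, and parsing the files into an OmopPerson is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PatientDirectoryScanner {

    /** Maps each patient directory name to the sorted list of regular files it contains. */
    static Map<String, List<Path>> scan(Path root) {
        Map<String, List<Path>> patients = new TreeMap<>();
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(root)) {
            for (Path dir : dirs) {
                if (!Files.isDirectory(dir)) {
                    continue; // ignore stray files in the root directory
                }
                List<Path> files = new ArrayList<>();
                try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                    for (Path file : entries) {
                        if (Files.isRegularFile(file)) {
                            files.add(file);
                        }
                    }
                }
                Collections.sort(files); // lexical order, not numeric
                patients.put(dir.getFileName().toString(), files);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return patients;
    }
}
```

Note that the sort is lexical, so a file prefixed 10_ would sort before one prefixed 2_; if page order matters, the numeric prefix should be parsed and compared as a number.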

The PopulateOmopInstanceFromFhirFiles tool has a main method and can be called directly.

This tool can be run from the Standalone Application using the "upload" or "u" option:
fhir-to-omop upload
fhir-to-omop u

Atlas: Initialization

Initialize Atlas MS SQL Server Database and PostgreSQL Database
(fhir-to-omop atlas)

This tool is provided to automate the creation of the database objects (e.g. users, permissions, schemas, tables, etc.) required for installation of Atlas. Detailed instructions for installing Atlas are provided here.

This tool can be run from the Standalone Application using the "atlas" option:
fhir-to-omop atlas

Atlas: Datasources

Initialize Atlas MS SQL Server Datasources
(fhir-to-omop atlas2)

This tool is provided to automate the creation of the database objects (e.g. users, permissions, schemas, tables, etc.) required for the Atlas Data Sources functionality. This needs to be a separate process from the Atlas initialization because other steps must be performed before this step can be completed (e.g. the Postgres server needs to be created manually before this process can be executed). Detailed instructions for installing Atlas are provided here.

This tool can be run from the Standalone Application using the "atlas2" option:
fhir-to-omop atlas2

Utilities

The fhir-to-omop utilities are the building blocks of the fhir-to-omop environment. These utilities can be used individually. Each utility has a simplest complete example that exists as a JUnit integration test in the fhir-to-omop project. This integration test is named at the end of the description for each entry in the list below.
OAUTH Authentication: The fhir-to-omop tools use dynamic class loading so that any authentication method can be integrated into this framework. Authentication is done in the FhirServerAuthenticator class. This class dynamically loads the implementation of the HttpClientAuthenticator interface specified in the app.properties file. The HeaderTokenAuthenticator class is an implementation of HttpClientAuthenticator that is provided by this framework. This is the authenticator that has been used to authenticate to the SyntheticMass FHIR patient server. Other authenticators will be made available as new servers become supported. The FhirServerAuthenticatorIntegrationTest class demonstrates how this authentication is implemented.
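The general technique of loading an implementation by class name can be sketched with standard Java reflection. Everything in this sketch is a hypothetical stand-in: the Authenticator interface, its authHeader method, and DemoTokenAuthenticator are invented for illustration and do not reproduce the signatures of the project's HttpClientAuthenticator or HeaderTokenAuthenticator.

```java
final class AuthenticatorLoading {

    /** Hypothetical stand-in for an authenticator interface. */
    interface Authenticator {
        String authHeader();
    }

    /** Hypothetical implementation that returns a fixed bearer-token header. */
    static class DemoTokenAuthenticator implements Authenticator {
        public DemoTokenAuthenticator() {
        }

        @Override
        public String authHeader() {
            return "Bearer demo-token";
        }
    }

    /** Instantiates the named class through its no-argument constructor. */
    static Authenticator load(String className) {
        try {
            Class<? extends Authenticator> type =
                    Class.forName(className).asSubclass(Authenticator.class);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load authenticator: " + className, e);
        }
    }

    public static void main(String[] args) {
        // In a real setup the class name would come from a properties file.
        String className = DemoTokenAuthenticator.class.getName();
        System.out.println(load(className).authHeader());
    }
}
```

Because the calling code depends only on the interface, supporting a new server means adding one class and changing one configuration value.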
FHIR Parser Classes: The org.nachc.tools.fhirtoomop.fhir.parser package contains a series of Parser classes. These classes use the HAPI FHIR API to parse FHIR resources. The Parser classes encapsulate HAPI FHIR API functionality and simplify common tasks by reducing indirection and aggregating functionality into single method calls.
FHIR Patient: The FhirPatient class is a Data Value Object (DVO) that represents a FHIR Patient resource. The FhirPatient is composed of a series of FhirParser objects that represent the different components of a FhirPatient. For example, each FhirPatient instance has a single PatientParser representing the patient data and lists of EncounterParser, ConditionParser, MedicationRequestParser, etc. objects.
FHIR Patient Factory: The FhirPatientFactory class builds a FhirPatient from a FhirPatientResources instance. FhirPatientResources represents the list of resources that make up a single FHIR Patient/[id]/$everything resource. FhirPatientResources is an interface and can be used for any FHIR resource that can be converted into an InputStream (e.g. FhirPatientResources can use files or pull data directly from a FHIR REST service). Use of the FhirPatientFactory is demonstrated in the FhirPatientFactoryIntegrationTest class.
DVO Classes: The org.nachc.tools.omop.yaorma.dvo package contains a series of Data Value Objects (DVOs). These DVOs represent a one-to-one mapping of the OMOP Common Data Model (CDM). Each DVO represents an OMOP CDM table. Each DVO has a field that represents each attribute of the OMOP CDM entity it represents. These classes are used to compose the OmopPerson class described below. These classes can be written to the database directly, or can be used to otherwise persist the information they contain (for example, they could be used to write the data to CSV files).
OMOP Person: The OmopPerson class represents all of the OMOP CDM data for a single person. The OmopPerson object is composed of collections of DVO objects that are one-to-one mappings to the OMOP CDM. Instances of OmopPerson are created from a FhirPatient object by the OmopPersonFactory class described below.
OMOP Person Factory: The OmopPersonFactory class takes an instance of a FhirPatient and creates an OmopPerson instance. Each component of the OmopPerson is built by its respective Builder class. Builder classes are found in the org.nachc.tools.fhirtoomop.omop.person.factory.builder package. The functionality of the OmopPersonFactory class is demonstrated in the OmopPersonBuilderIntegrationTest class.
Write OMOP Person to CDM Database: The WriteOmopPersonToDatabase class writes a single OmopPerson to an instance of the CDM as specified in the app.properties file. The functionality of this class is demonstrated in the WriteOmopPeopleToDatabaseIntegrationTest class. This class is used in a multi-threaded environment to write up to 200 patients per second to an OMOP instance (approximately 1.5 million patients in 2 hours).
Write OMOP People to CDM Database (Threading Implementation): The WriteOmopPeopleToDatabase class implements the threading model used by the fhir-to-omop tools to write FHIR Patients to OMOP. This threading model is very efficient and is capable of processing 1.5 million FHIR Patients in approximately 2 hours. That figure covers the entire process: reading the files from disk, parsing the FHIR content using HAPI FHIR, mapping FHIR systems and codes to OMOP vocabulary_id and concept_id values, and inserting into the OMOP data tables. This threading model is based on the ThreadTool framework described below. The functionality of the WriteOmopPeopleToDatabase class is demonstrated in the WriteOmopPeopleToDatabaseIntegrationTest class.
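The mapping step mentioned above (FHIR system and code to OMOP vocabulary_id and concept_id) can be pictured as a two-stage lookup. The sketch below is not the project's mapping code: the class and method names are hypothetical, the in-memory map stands in for a lookup against the OMOP concept table, and the registered concept_id in the usage example is a placeholder. The three system URIs and vocabulary_id values shown are the standard FHIR and OMOP identifiers for SNOMED, LOINC, and RxNorm.

```java
import java.util.HashMap;
import java.util.Map;

final class ConceptLookupSketch {

    /** concept_id 0 is the OMOP convention for "no matching concept". */
    static final int NO_MATCHING_CONCEPT = 0;

    private static final Map<String, String> SYSTEM_TO_VOCABULARY = new HashMap<>();

    static {
        SYSTEM_TO_VOCABULARY.put("http://snomed.info/sct", "SNOMED");
        SYSTEM_TO_VOCABULARY.put("http://loinc.org", "LOINC");
        SYSTEM_TO_VOCABULARY.put("http://www.nlm.nih.gov/research/umls/rxnorm", "RxNorm");
    }

    // Keyed by "vocabulary_id|concept_code"; a stand-in for the concept table.
    private final Map<String, Integer> conceptIds = new HashMap<>();

    /** Registers one (vocabulary_id, concept_code) to concept_id mapping. */
    void register(String vocabularyId, String conceptCode, int conceptId) {
        conceptIds.put(vocabularyId + "|" + conceptCode, conceptId);
    }

    /** Returns the OMOP vocabulary_id for a FHIR system URI, or null if unknown. */
    static String vocabularyIdFor(String fhirSystem) {
        return SYSTEM_TO_VOCABULARY.get(fhirSystem);
    }

    /** Returns the concept_id for a FHIR coding, or 0 when nothing matches. */
    int conceptIdFor(String fhirSystem, String code) {
        String vocabularyId = vocabularyIdFor(fhirSystem);
        if (vocabularyId == null) {
            return NO_MATCHING_CONCEPT;
        }
        Integer id = conceptIds.get(vocabularyId + "|" + code);
        return id == null ? NO_MATCHING_CONCEPT : id;
    }
}
```

For example, after `lookup.register("LOINC", "demo-code", 12345)` (placeholder values), `lookup.conceptIdFor("http://loinc.org", "demo-code")` yields that id, while any unregistered code or unknown system yields 0.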
thread-tool (Threading Framework): The thread-tool project is a framework that can be used to easily create very efficient multi-threaded applications. This framework provides several advantages over other strategies for implementing multi-threaded applications in Java:
  • Simplicity and Ease of Use: The implementer using this framework only needs to implement a class that extends ThreadToolUser and then create and execute a ThreadRunner. The details of implementing a Java threading model are encapsulated by the framework, which eliminates errors that often occur when multi-threaded applications are written in Java from scratch.
  • Sustained Continuous High Throughput: Other threading models in Java tend to create a set of threads and run that set until every thread finishes. Only then is another set of threads created and executed. This leads to spikes and troughs in activity. When a new set of threads starts, all threads are running and peak activity is achieved. As each thread finishes, activity diminishes until only one or a few stragglers are running. Activity then stays low until the final thread finishes and a new batch is created and started. The ThreadTool prevents these periods of inactivity by creating groups of threads. A constant number of groups is kept running at all times. As one group finishes, it is replaced by a new group of threads. When one group is left with only a few stragglers, the other groups are still running, so a relatively constant and near-maximum level of activity is maintained.
  • Reuse: This framework can be used anywhere high performance multi-threading is required. The thread-tool framework can be included in any Java project using the following Maven dependency declaration.
    <dependency>
    	<groupId>org.nachc.cad.tools</groupId>
    	<artifactId>thread-tool</artifactId>
    	<version>0.0.2</version>
    </dependency>
    
Functionality of the thread-tool framework is demonstrated in the ThreadRunnerIntegrationTest class of the thread-tool project.
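The rolling-group idea described in the list above can be approximated with plain java.util.concurrent, as in the sketch below. This sketch does not use the thread-tool API (the signatures of ThreadToolUser and ThreadRunner are not shown on this page), and the class name, method name, and parameters are hypothetical. A fixed pool acts as the set of group slots; each slot starts one thread per task in its group, waits for them, and then picks up the next waiting group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class RollingThreadGroups {

    /** Runs tasks in groups, keeping up to concurrentGroups groups active; returns tasks completed. */
    static int run(List<Runnable> tasks, int groupSize, int concurrentGroups) {
        if (groupSize <= 0 || concurrentGroups <= 0) {
            throw new IllegalArgumentException("groupSize and concurrentGroups must be positive");
        }
        AtomicInteger completed = new AtomicInteger();
        // Each pool thread is one "group slot"; a freed slot takes the next group.
        ExecutorService groupSlots = Executors.newFixedThreadPool(concurrentGroups);
        for (int start = 0; start < tasks.size(); start += groupSize) {
            List<Runnable> group = tasks.subList(start, Math.min(start + groupSize, tasks.size()));
            groupSlots.execute(() -> runGroup(group, completed));
        }
        groupSlots.shutdown();
        try {
            groupSlots.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for groups", e);
        }
        return completed.get();
    }

    /** Starts one thread per task in the group and waits for all of them. */
    private static void runGroup(List<Runnable> group, AtomicInteger completed) {
        List<Thread> threads = new ArrayList<>();
        for (Runnable task : group) {
            Thread thread = new Thread(() -> {
                task.run();
                completed.incrementAndGet();
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

With, say, 4 tasks per group and 3 concurrent groups, up to 12 task threads run at once, and a group that is down to its last straggler does not stop the other two groups from working.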