TOOLS: MetaMap

Documentation: Description of MetaMap Data Versions

Description

In order to accommodate the UMLS source-vocabulary licensing permissions and processing requirements of as many users as possible, MetaMap releases beginning in 2011 have included three distinct versions of the data that are based mostly on the Restriction Categories of Metathesaurus source vocabularies. (See the Restriction Categories tab on the UMLS Source Release Documentation page for all UMLS sources’ Restriction Categories.) Each data version includes a Strict and Relaxed model; listed from smallest to largest, the three versions are:

  1. Base: The Base data version includes those source vocabularies with no associated licensing restrictions beyond those of the UMLS license; this version includes all and only sources of Restriction Category 0.
  2. USAbase: The USAbase data version includes those source vocabularies that have no associated restrictions beyond a UMLS license and are free for use in US-based projects; this version includes the Base vocabularies (those with Restriction Category 0), plus the five Category-4 sources and the four Category-9 sources (including, most notably, SNOMEDCT). The USAbase version is a proper superset of the Base version, and might be the most appropriate version for users with a SNOMEDCT license. This data version is MetaMap's default, but the default can be overridden by specifying another user-installed data version using the -V flag. (See Additional Data Sets.)

     IMPORTANT NOTE: Users without SNOMEDCT licenses should use the Base data version.

  3. NLM: The NLM data version includes the full Metathesaurus other than the CPT, CPTSP, HCPT, and MTHCH vocabularies from the CPT family, and the HCDT, HCPCS, and MTHHH vocabularies from the HCPCS family.

Specifying a data version using the -V option

Assuming that the "public_mm/bin" directory is in your program path and the "2012AB Base" optional database is installed, one can specify the Base MetaMap data version for the 2012AB release using the following command:

    $ metamap -V Base -Z 2012AB ...

Similarly, one can specify the NLM data version for the 2012AA dataset (assuming that the 2012AA NLM optional dataset has been installed) using the command:

    $ metamap -V NLM -Z 2012AA ...

Or, if using metamap16 (in which 2015AB is the default), specify only the data version:

    $ metamap -V NLM ...
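
The commands above all follow the same pattern, which can be sketched as a small shell helper. This is only an illustrative sketch: the function name build_metamap_cmd is hypothetical and not part of MetaMap, it only prints a command line rather than running metamap, and it assumes the data version names are spelled exactly as in this document. It uses only the -V and -Z flags shown above.

```shell
# Hypothetical helper (not part of MetaMap): print a metamap command line
# for a given data version and, optionally, a dataset release.
build_metamap_cmd() {
    version="${1:-}"   # data version: Base, USAbase, or NLM
    release="${2:-}"   # optional release, e.g. 2012AB; empty means use the default

    # Accept only the three data versions described in this document
    # (assumption: names must match this spelling exactly).
    case "$version" in
        Base|USAbase|NLM) ;;
        *)
            echo "unknown data version: $version" >&2
            return 1
            ;;
    esac

    # Add -Z only when a release was given explicitly.
    if [ -n "$release" ]; then
        echo "metamap -V $version -Z $release"
    else
        echo "metamap -V $version"
    fi
}
```

For example, "build_metamap_cmd Base 2012AB" prints "metamap -V Base -Z 2012AB", and "build_metamap_cmd NLM" prints "metamap -V NLM". The helper does not check that the corresponding optional dataset is actually installed.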