TOOLS: MetaMap

Distribution Frequently Asked Questions

Windows error: "'C:\Program' is not recognized"

If you see the error: "'C:\Program' is not recognized as an internal or external command, operable program or batch file." try moving the "public_mm" directory to a directory without any spaces in it or any of its parent directories.

If possible, install it in "C:\" (or "D:\", or the top-level directory of another volume).

The MetaMap program and its supporting programs were originally written for Unix, and these programs sometimes do not work well with directory paths that contain spaces.

Linux GLIBC requirements (Glibc 2.5 or greater)

MetaMap must be run on a Linux system using version 2.5 or greater of the GNU C Library (glibc). Linux distributions using glibc 2.5 or greater include, but are not limited to, RHEL 5 (not RHEL 4), Fedora Core 6 or later, Debian 5.0 (Lenny), and Ubuntu 8.04 or later.

To determine which version of glibc is present on Red Hat RPM-based systems, open a command prompt on your Linux server, type "rpm -q glibc", and press the Enter key.

On non-RPM-based Linux systems, open a command prompt and type "ls -al /lib/libc*". This returns a list of the glibc files on your system, including their version numbers. (Reference: http://www.gordano.com/kb.htm?q=2108)
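As a distribution-independent alternative, the following sketch compares the installed glibc version against the 2.5 minimum. It assumes a glibc-based system where "getconf GNU_LIBC_VERSION" is available; the helper name version_at_least is made up for this example and is not part of MetaMap.

```shell
#!/bin/sh
# Return success if version $1 (e.g. "2.17") is >= version $2 (e.g. "2.5").
# Only the major and minor components are compared.
# (version_at_least is a hypothetical helper, not part of MetaMap.)
version_at_least() {
    have_major=${1%%.*}
    have_rest=${1#*.}
    have_minor=${have_rest%%.*}
    want_major=${2%%.*}
    want_rest=${2#*.}
    want_minor=${want_rest%%.*}
    if [ "$have_major" -gt "$want_major" ]; then return 0; fi
    if [ "$have_major" -lt "$want_major" ]; then return 1; fi
    [ "$have_minor" -ge "$want_minor" ]
}

# On glibc-based systems getconf prints something like "glibc 2.17".
installed=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')
if [ -z "$installed" ]; then
    echo "Could not determine the glibc version with getconf"
elif version_at_least "$installed" 2.5; then
    echo "glibc $installed is new enough for MetaMap"
else
    echo "glibc $installed is older than 2.5"
fi
```

If getconf does not report a version, fall back to the rpm or ls commands described above.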

Running the Tagger server on an alternate port

Question

I observed an interesting thing when running 04FilterStrict script in which I get this error:

ERROR in calling tcp_connect for WSD Server on host localhost/port 1795:

 tcp_mishap(tcp_connect.connect host(localhost), port(1795),99)

I observed that the script ./skrmedpostctl is still running and has not terminated.

Also, sometimes while running the ./skrmedpostctl I get the error message that I can not create a socket at port number 1795. I have to change the port number inside the script before the tagger finally gets started. Could that be a source of this problem?

Answer

Yes. If you change the port number that skrmedpostctl uses, you also need to change public_mm/bin/SKRrun (or SKRrun.in, and then re-run install.sh) to match the port you set in skrmedpostctl. The environment variables that should be changed in SKRrun are:

TAGGER_SERVER_DEFAULT_TCP_PORT=1795
TAGGER_SERVER_TCP_PORT_0=1795
TAGGER_SERVER_TCP_PORT_1=1795
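The edit can also be scripted. The following is a minimal sketch, not part of the MetaMap distribution: set_tagger_port is a hypothetical helper, 1796 is an arbitrary example port, and the demonstration runs against a temporary file rather than the real SKRrun. It assumes the three variables appear as plain NAME=value assignments, optionally preceded by "export".

```shell
#!/bin/sh
# Rewrite the three tagger port variables in the given file.
# Usage: set_tagger_port <file> <port>
# (set_tagger_port is a hypothetical helper, not part of MetaMap.)
set_tagger_port() {
    file=$1
    port=$2
    tmp=$(mktemp) || return 1
    if sed -E "s/^([[:space:]]*(export[[:space:]]+)?TAGGER_SERVER_(DEFAULT_TCP_PORT|TCP_PORT_[01])=).*/\1${port}/" "$file" > "$tmp"; then
        # Write back through cat so the file keeps its permissions.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}

# Demonstration on a scratch file that mimics the relevant SKRrun lines.
demo=$(mktemp)
printf '%s\n' \
    'TAGGER_SERVER_DEFAULT_TCP_PORT=1795' \
    'TAGGER_SERVER_TCP_PORT_0=1795' \
    'TAGGER_SERVER_TCP_PORT_1=1795' > "$demo"
set_tagger_port "$demo" 1796
cat "$demo"
rm -f "$demo"
```

To apply it for real, point the function at public_mm/bin/SKRrun after making a backup copy, and use the same port you set in skrmedpostctl.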

Public MM Error: "metamap09.BINARY: not found" on 64-bit Linux

When invoking the following MetaMap command on a 64-bit Linux system:

echo "lung cancer" | ./bin/metamap09 -I

MetaMap gives the following output:

.../public_mm/bin/SKRrun -L 2009 -M /DATA/XDR -B /BDB4 -w .../public_mm/lexicon .../mm/public_mm/bin/metamap09.BINARY -Z 09 -I
exec: 154: .../public_mm/bin/metamap09.BINARY: not found

And running bin/metamap09.BINARY in the public_mm directory gives the following error:

-bash: bin/metamap09.BINARY: No such file or directory

This uninformative message occurs when the operating system attempts to map the executable into memory but cannot find supporting shared libraries that use the same architecture as the executable.

The executable file metamap09.BINARY is a statically linked 32-bit binary, but it should run on a 64-bit system provided that you have a 32-bit version of libc-*.so in /lib (as well as a /lib64/libc-*.so). (The locations of these files may vary.) Other 32-bit supporting files may also need to be installed.
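To confirm that an architecture mismatch is the cause, you can compare the word size of the executable with the kernel architecture. The sketch below reads the class byte of the ELF header directly (offset 4: 1 means 32-bit, 2 means 64-bit). The helper name elf_class is made up for this example, and the script assumes it is run from the public_mm directory.

```shell
#!/bin/sh
# Print 32 or 64 for an ELF executable, based on the EI_CLASS byte
# (offset 4 of the ELF header: 1 = 32-bit, 2 = 64-bit).
# (elf_class is a hypothetical helper, not part of MetaMap.)
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case $class in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo "unknown"; return 1 ;;
    esac
}

binary=bin/metamap09.BINARY
if [ -f "$binary" ]; then
    echo "$binary: $(elf_class "$binary")-bit executable; kernel architecture: $(uname -m)"
else
    echo "No file at $binary; run this from the public_mm directory"
fi
```

A 32-bit result on an x86_64 kernel means the 32-bit support libraries described below are needed.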

Ubuntu and Debian systems

To run 32-bit MetaMap on a 64-bit Ubuntu or Debian system you may have to do one of the following things:

1. If you're running Ubuntu 14 or greater or Debian 7 (Wheezy) or greater use the command:

 sudo dpkg --add-architecture i386

Then:

 sudo apt-get update
 sudo apt-get install libc6:i386 libncurses5:i386 libstdc++6:i386

2. If you are using Ubuntu v12.04 or below or Debian 6 (Squeeze) or below, use this:

 echo "foreign-architecture i386" | sudo tee /etc/dpkg/dpkg.cfg.d/multiarch

Then:

 sudo apt-get update
 sudo apt-get install libc6:i386 libncurses5:i386 libstdc++6:i386

Alternatively, for Ubuntu v12.04 use:

  sudo apt-get install ia32-libs-multiarch

or:

  sudo apt-get install ia32-libs-i386

or:

  sudo apt-get install ia32-libs

Which of these packages is available depends on the distribution version.

The APT program will install any necessary dependencies as well. If you're running some other Linux distribution, check the corresponding package repositories for the necessary 32-bit packages.
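On Debian-based systems you can check whether the i386 architecture is already enabled before installing anything. This is a sketch only: it assumes a dpkg version that supports the --print-foreign-architectures option, and the helper name lists_i386 is made up for this example.

```shell
#!/bin/sh
# Succeeds when "i386" appears as a whole line in the list of
# architectures read from standard input.
# (lists_i386 is a hypothetical helper, not part of MetaMap.)
lists_i386() {
    grep -qx 'i386'
}

if command -v dpkg >/dev/null 2>&1; then
    if dpkg --print-foreign-architectures | lists_i386; then
        echo "i386 is enabled as a foreign architecture"
    else
        echo "i386 is not enabled as a foreign architecture"
    fi
else
    echo "dpkg not found; this check applies to Debian-based systems only"
fi
```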

Red Hat systems

If you are running on Red Hat 6, the following libraries must be installed:

glibc.i686
glibc-static.i686
libgcc.i686
libstdc++.i686

You can install the packages as root using the following command:

yum install glibc.i686 glibc-static.i686 libgcc.i686 libstdc++.i686

Java API Questions

MMServer denies client connection

If you see the following error from mmserver:

! [PBEANS] - denied connection from 10.56.79.25) 

Edit the file public_mm/bin/SKRrun.11 and change the environment variable ACCEPTED_HOSTS on line 47:

# for mmserver
export ACCEPTED_HOSTS="['127.0.0.1']"

Add the IP address of the client machine (not the host running mmserver) to the client list assigned to ACCEPTED_HOSTS. For example, to allow mmserver to accept clients on host 192.168.1.123:

# mmserver will accept clients at host 192.168.1.123:
export ACCEPTED_HOSTS="['127.0.0.1','192.168.1.123']"

All of the entries must be IP addresses; hostnames will not work.
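Because hostnames are not accepted, it can help to build the value from a checked list of addresses. The following sketch is an illustration only: accepted_hosts_value is a hypothetical helper, and its check only confirms that each argument has the dotted-quad shape of an IPv4 address, not that the numbers are in range.

```shell
#!/bin/sh
# Print an ACCEPTED_HOSTS value for the given IPv4 addresses.
# Fails if an argument does not look like a dotted-quad address,
# which catches hostnames such as "localhost".
# (accepted_hosts_value is a hypothetical helper, not part of MetaMap.)
accepted_hosts_value() {
    list=""
    for ip in "$@"; do
        if ! printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
            echo "not an IPv4 address: $ip" >&2
            return 1
        fi
        # Append the quoted address, comma-separated after the first entry.
        list="${list:+$list,}'$ip'"
    done
    printf '[%s]\n' "$list"
}

# Print the line to place in SKRrun.11 for the example client above.
echo "export ACCEPTED_HOSTS=\"$(accepted_hosts_value 127.0.0.1 192.168.1.123)\""
```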

"Connection refused" errors on Mac OS X

On Mac OS X 10.6, the Java API test program, testapi.sh, gets the error: Error when querying Prolog Server: Connection refused.

One possible solution is to go to "System Preferences" -> Security -> Firewall -> Advanced and unset the option "Block all incoming connections". (You may still want to block certain services.) Then, in "System Preferences" -> Network -> Advanced -> Proxies, set the option "Exclude simple hostnames".

See also:

Mac OS X v10.5, 10.6: About the Application Firewall
http://support.apple.com/kb/ht1810

Data File Builder Questions

How to resolve tcp_connect problems with 04FilterStrict

When using the Data File Builder Suite, the program filter_mrconso displays tagger errors when running the 04FilterStrict script. The error is:

Established connection to Tagger Server on localhost.

ERROR in calling tcp_connect for WSD Server on host localhost/port 1765:
tcp_mishap(tcp_connect.connect host(localhost), port(1765),99)

ERROR in calling tcp_connect for WSD Server on host localhost/port 1765:
tcp_mishap(tcp_connect.connect host(localhost), port(1765),99)

DONE
calling split_filtered_mrconso

One option is to run two tagger servers, each on a separate computer. In the script public_mm/bin/SKRrun you can change the environment variables TAGGER_SERVER_HOST_0 and TAGGER_SERVER_HOST_1 (or TAGGER_SERVER_HOSTS for metamap09v2) to point to the computers where the tagger servers reside. The examples below assume the servers are on the machines yukon and potomac.

If you're running metamap09v2, you must specify the servers as follows:

# The TAGGER_SERVER_HOSTS variable is for Quintus Prolog (metamap09v2)
TAGGER_SERVER_HOSTS="['yukon', 'potomac']"
export TAGGER_SERVER_HOSTS

Also, you can specify more than two servers if you're using metamap09v2:

# The TAGGER_SERVER_HOSTS variable is for Quintus Prolog (metamap09v2).
TAGGER_SERVER_HOSTS="['yukon', 'potomac', 'hudson', 'ohio']"
export TAGGER_SERVER_HOSTS

If you're using metamap09 (not metamap09v2), you can use a maximum of two tagger servers, specified as follows:

TAGGER_SERVER_HOST_0=yukon
export TAGGER_SERVER_HOST_0

TAGGER_SERVER_HOST_1=potomac
export TAGGER_SERVER_HOST_1

Using custom data sets

Q: Will I still be able to run the original (uncustomized) MMTx after adding some custom files and running the followup processes to use them?

A: Creating a custom dataset will NOT affect your original MetaMap installation. However, it will be necessary to use the "-Z" option to access the new dataset:

$ ./bin/metamap09v2 -Z <custom dataset name>

For example, if the name of your data set was "obs_custom", you would use the command:

$ ./bin/metamap09v2 -Z obs_custom 

Adding terms to the dataset.

Q: If I just want to add some terms to the default UMLS files, do I have to reproduce the UMLS MRCON, MRSO, etc., in my custom "umls" directory?

A: Yes, if you want to add terms to the default UMLS files, you will need to generate the UMLS Original Release Format files using the UMLS MetamorphoSys release for that year. These files are not supplied with the Public MetaMap release. The new records containing the additional terms must conform to the record organization described in the "Table Descriptions" section of the "MetaMap Data File Builder" document.

Removing terms from the dataset.

Q: What if I also want to remove some UMLS terms from the lexicon based on arbitrary criteria (other than source as per Metamorphosys)? Can I just tell it what terms I don't like, or do I have to reproduce MRCON without them? What about MRSO, MRSTY, etc?

A: You would still need to generate the Original Release Format files using Metamorphosys. You would then need to remove any records referencing the terms you wish to remove. MRSO should probably be edited as well to remove any records with the same SUIs (string unique identifiers) as the removed MRCON records.

Do you have a list of Semantic Types and how do I map their abbreviations to their full names?

The Semantic Type information is contained in the UMLS "SRDEF" file obtained when you download the UMLS Metathesaurus and run MetamorphoSys. For lines beginning with "STY", the third field contains the full name of the semantic type and the ninth field contains the abbreviation.
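The field positions described above can be used to build the abbreviation list yourself. The sketch below assumes SRDEF is pipe-delimited; semantic_type_map is a hypothetical helper name, and the two sample lines are shortened illustrations rather than verbatim SRDEF records.

```shell
#!/bin/sh
# Read SRDEF-formatted lines on standard input and print
# "abbrev|Full Semantic Type Name" for each semantic type (STY) record.
# (semantic_type_map is a hypothetical helper, not part of MetaMap.)
semantic_type_map() {
    # Field 1 is the record type, field 3 the full name, field 9 the abbreviation.
    awk -F'|' '$1 == "STY" { print $9 "|" $3 }'
}

# Shortened sample input: one STY record and one non-STY record,
# which the filter skips. The field contents are illustrative.
printf '%s\n' \
    'STY|T047|Disease or Syndrome|B2.2.1.2.1|(definition omitted)||||dsyn||' \
    'RL|T000|example relation|R0|(definition omitted)||||XX||' |
    semantic_type_map
```

To process a real file, redirect it into the function, for example: semantic_type_map < SRDEF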

For convenience, we have included a parsable list of Semantic Types and their abbreviations from the 2011AA version of the UMLS via the link below. The format of the file is "abbrev|Full Semantic Type Name".


2011AA version of Semantic Type Mappings (3.6 kb)

Also, we have included a parsable list of Semantic Groups and their mappings to the Semantic Types via the link below. The format of the file is "Semantic Group Abbrev|Semantic Group Name|TUI|Full Semantic Type Name". For additional information on Semantic Groups, please refer to the Semantic Groups web site.


2011 Semantic Group File (5.7 kb)

What is the function of the servers supplied with MetaMap?

The MedPost/SKR Tagger Server (skrmedpostctl) is a part-of-speech tagger server that MetaMap requires to tag the input text before parsing.

The WSD Server (wsdserverctl) is only necessary if you wish MetaMap to use Word Sense Disambiguation (WSD).

The MetaMap Server (mmserver) is also available for those who want to use the MetaMap Java API. It is not used by the command-line version of MetaMap.

Why is the second run of MetaMap on the same input faster?

Linux, and possibly some other operating systems, cache file input/output (I/O) from previous invocations of programs (including MetaMap), so a second run can read its data files from memory rather than from disk.

Why do I need to be online to run MetaMap?

You do NOT need to be online to use MetaMap as long as the Tagger server is running on the same computer on which you are running MetaMap.

Running WSD Server on an alternate port

To change the port used by the WSD Server, update the file public_mm/WSD_Server/config/disambServer.cfg, modifying the section:

# ----------------------------------
# TCP port for the WSD server. 
# ----------------------------------
DISAMB_SERVER_TCP_PORT=5554

to use the new port (in this case 8554):

# ----------------------------------
# TCP port for the WSD server. 
# ----------------------------------
DISAMB_SERVER_TCP_PORT=8554

For MetaMap, open the file public_mm/bin/SKRrun.12 (and public_mm/bin/SKRrun.12.in) and change the line:

export WSD_SERVER_PORT=5554

to use the new port (in this case 8554):

export WSD_SERVER_PORT=8554
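Since the port must match in both places, a quick consistency check can catch a missed edit. This is a minimal sketch: config_value is a hypothetical helper, and the script assumes it is run from the directory that contains public_mm and that both settings are plain NAME=value lines (optionally preceded by "export").

```shell
#!/bin/sh
# Print the value assigned to variable $2 in file $1, accepting an
# optional leading "export". If the variable is assigned more than once,
# the last assignment wins. Prints nothing if the variable is absent.
# (config_value is a hypothetical helper, not part of MetaMap.)
config_value() {
    sed -n -E "s/^[[:space:]]*(export[[:space:]]+)?$2=[[:space:]]*//p" "$1" | tail -n 1
}

cfg=public_mm/WSD_Server/config/disambServer.cfg
run=public_mm/bin/SKRrun.12
if [ -f "$cfg" ] && [ -f "$run" ]; then
    server_port=$(config_value "$cfg" DISAMB_SERVER_TCP_PORT)
    client_port=$(config_value "$run" WSD_SERVER_PORT)
    if [ "$server_port" = "$client_port" ]; then
        echo "WSD port settings agree: $server_port"
    else
        echo "Mismatch: WSD Server uses $server_port, MetaMap uses $client_port"
    fi
else
    echo "Run this from the directory that contains public_mm"
fi
```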