Semantic Models Package (SEMMOD)

Background

In both intelligence and command-and-control operations, the ability to identify and process natural language is pivotal. The task is made difficult by the volume of such information available, which makes automated methods important in narrowing the search for crucial information. Unlike the search engine technologies that succeed on the World Wide Web, these applications must emphasize not only the precision of retrieved results but also recall. A number of methods for extracting semantic information have been introduced in recent years but have yet to be compared systematically in military-like contexts. In this package we implement some of the more prominent methods in preparation for their use in a systematic comparison. The methods covered are:

  1. Vector Space Model (Salton, Wong & Yang, 1975)
  2. Latent Semantic Analysis (Martin & Berry, 2007)
  3. Topics model (Griffiths & Steyvers, 2002)
  4. Non-negative matrix factorization (Lee & Seung, 1999; Ge & Iwata, 2002)
  5. Sparse Non-negative matrix factorization (Shashua & Hazan, 2005)
  6. Independent Components Analysis (Isbell & Viola, 1998)
  7. Sparse ICA (Bronstein, Bronstein, Zibulevsky & Zeevi, 2005)
  8. Syntagmatic Paradigmatic model (Dennis, 2005)
  9. Constructed Semantics Model (Kwantes, 2005)
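
As a rough illustration of the first method in the list, the sketch below ranks documents against a query by cosine similarity over raw term counts. It is not SEMMOD code: the function names and the sample sentences are invented for this example, and a full vector space model would normally apply term weighting as well.

```python
import math


def term_vector(text):
    """Map a document to a sparse bag-of-words vector {term: count}."""
    vector = {}
    for term in text.lower().split():
        vector[term] = vector.get(term, 0) + 1
    return vector


def cosine(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented sample documents, for illustration only.
docs = [
    "enemy convoy sighted near the bridge",
    "convoy sighted crossing the bridge at dawn",
    "weather report for the coastal region",
]
query = term_vector("convoy near bridge")

# Most similar document first.
ranked = sorted(docs, key=lambda d: cosine(query, term_vector(d)), reverse=True)
```

Documents that share no terms with the query score zero, which is the limitation the latent-variable methods further down the list are intended to address.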

SEMMOD Prerequisites

The current version of SEMMOD is 1.7. SEMMOD has been tested under 32-bit and 64-bit versions of both Windows and Linux (Ubuntu) and is released under the GNU General Public License. The package is written primarily in Python; however, sections have been optimized in C to enable timely compilation of model spaces. SEMMOD also relies on the Numpy and Scipy packages for matrix calculations.

  1. SEMMOD Package (Windows & Linux) - (http://mall.psy.ohio-state.edu/Semmod.tar.gz)
  2. Python 2.6 - (http://www.python.org/)
  3. Numpy 1.6.1 - (http://numpy.scipy.org/)
  4. Scipy 0.9.0 - (http://www.scipy.org/)
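
One way to check what is already installed against the list above is a short script like the following. This is only a sketch: `installed_versions` is a name made up for this example, and it reports whatever version string each package exposes rather than checking compatibility with SEMMOD.

```python
import sys


def installed_versions(packages=("numpy", "scipy")):
    """Return {name: version string, or None if the package cannot be imported}."""
    versions = {"python": "%d.%d.%d" % tuple(sys.version_info[:3])}
    for name in packages:
        try:
            module = __import__(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in sorted(installed_versions().items()):
        sys.stdout.write("%-8s %s\n" % (name, version or "not installed"))
```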

Installation

All dependencies can be installed in binary form, for example with apt-get install or the Windows installers. For the best performance, however, install numpy and scipy from source with ATLAS and GFORTRAN. Installing ATLAS is worth the extra steps: it makes the HanDles/SEMMOD/Numpy code run considerably faster, especially in numpy.dot() calculations.
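
To see whether a given numpy build benefits from an optimized BLAS, it can help to time a matrix product before and after the source install. The sketch below is one way to do that; the helper name `time_dot` and the matrix size are arbitrary choices for this example, and the timings it reports depend on the machine.

```python
import sys
import time


def time_dot(n=500, repeats=3):
    """Return the best wall-clock time in seconds for an n x n numpy.dot(),
    or None if numpy is not installed."""
    try:
        import numpy
    except ImportError:
        return None
    a = numpy.random.rand(n, n)
    b = numpy.random.rand(n, n)
    best = None
    for _ in range(repeats):
        start = time.time()
        numpy.dot(a, b)
        elapsed = time.time() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


if __name__ == "__main__":
    seconds = time_dot()
    if seconds is None:
        sys.stdout.write("numpy is not installed\n")
    else:
        import numpy
        # Lists the BLAS/LAPACK libraries this numpy build was linked against.
        numpy.show_config()
        sys.stdout.write("500x500 numpy.dot: %.4f s\n" % seconds)
```

Running the script once with the binary packages and again after building against ATLAS gives a direct comparison on the same hardware.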


Hints for installing Numpy/Scipy (with ATLAS) on Ubuntu:

GFORTRAN

       $ sudo apt-get install gfortran

ATLAS

       $ sudo apt-get install libatlas-base-dev

NUMPY
To compile the numpy and scipy code from source you'll need python-dev:

       $ sudo apt-get install python-dev
    

Then to install numpy, from the numpy source directory:

       $ python setup.py build --fcompiler=gnu95
       $ sudo python setup.py install

SCIPY
From the scipy source directory:

       $ python setup.py build --fcompiler=gnu95
       $ sudo python setup.py install

SEMMOD

       $ cd [your path]/semmod
       $ tar zxf semmod-1.7.tar.gz
       $ cd semmod-1.7
       $ sudo python setup.py install

Installation Problems

Running 64-bit Linux?

By default SEMMOD comes with C modules compiled for 32-bit Linux systems. If you are running a 64-bit system, you will need to run each of the following setup.py scripts with the command "python setup.py":

  [your install directory]/semmod-1.7/semmod/spsvd/setup.py
  [your install directory]/semmod-1.7/semmod/csm/setup.py
  [your install directory]/semmod-1.7/semmod/spnmf/setup.py
  [your install directory]/semmod-1.7/semmod/topics/setup.py

Note: Do this before running the main installation script:

  [your install directory]/semmod-1.7/setup.py
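
The per-module step can be scripted. The sketch below assumes the directory layout shown above and simply runs "setup.py" with no extra arguments in each of the four module directories; the function names are invented for this example, and the install directory is whatever path you unpacked SEMMOD into.

```python
import os
import struct
import subprocess
import sys

# Subdirectories listed above whose C modules ship compiled for 32-bit Linux.
C_MODULE_DIRS = ("spsvd", "csm", "spnmf", "topics")


def is_64bit_python():
    """True when the running interpreter uses 8-byte pointers."""
    return struct.calcsize("P") * 8 == 64


def rebuild_commands(install_dir):
    """Return (working directory, command) pairs, one per C module."""
    base = os.path.join(install_dir, "semmod-1.7", "semmod")
    # sys.executable keeps the build on the same interpreter that runs this script.
    return [(os.path.join(base, name), [sys.executable, "setup.py"])
            for name in C_MODULE_DIRS]


def rebuild(install_dir):
    """Run each setup.py in its own directory, stopping on the first failure."""
    for cwd, command in rebuild_commands(install_dir):
        subprocess.check_call(command, cwd=cwd)


# Example call (the path is a placeholder, not a real location):
#     if is_64bit_python():
#         rebuild("/path/to/your/install/directory")
```

Once the four modules have been rebuilt, the main setup.py can be run as described in the SEMMOD installation steps.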

Papers

Stone, B., & Dennis, S. (2011). Semantic models and corpora choice when using semantic fields to predict eye movement on web pages. International Journal of Human-Computer Studies, 69(11), 720-740.

Stone, B., Dennis, S., & Kwantes, P. J. (2011). Comparing Methods for Single Paragraph Similarity Analysis. Topics in Cognitive Science, 3(1), 92-122.

Stone, B., Dennis, S., & Kwantes, P. J. (2008). A Systematic Comparison of Semantic Models on Human Similarity Rating Data: The effectiveness of subspacing. The Proceedings of the Thirtieth Conference of the Cognitive Science Society.

Spaces
