Christian Borgelt's Web Pages

PyFIM - Frequent Item Set Mining for Python

Download

fim.so (785 kb) GNU/Linux Python 3.10 shared object
fim.pyd (396 kb) Windows Python 3.10 dynamic module
pyfim.zip (865 kb) C sources, version 6.30 (2022.11.22)
pyfim.tar.gz (802 kb)

Description

PyFIM is an extension module that makes several frequent item set mining implementations available as functions in Python 3.10 or later. Currently apriori, eclat, fpgrowth, sam, relim, carpenter, ista, accretion and apriacc are available as functions, although the interfaces do not offer all of the options of the command line programs. (Note that lcm is available as an algorithm mode of eclat.) There is also a "generic" function fim, which is essentially the same function as fpgrowth, only with a simplified interface (fewer options). Finally, there is a function arules for generating association rules (with a simplified interface compared to apriori, eclat and fpgrowth, which can also be used to generate association rules).
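To make concrete what these functions compute, the following minimal sketch finds frequent item sets by brute force in pure Python. It is only a reference for tiny inputs, not how PyFIM works internally. The helper frequent_item_sets and the sample transactions are hypothetical names introduced for illustration. The optional comparison at the end assumes that fim.fpgrowth interprets a negative supp as an absolute transaction count and returns (item set, support) pairs by default; please confirm this with help(fim.fpgrowth) before relying on it.

```python
from itertools import combinations

def frequent_item_sets(transactions, min_count):
    """Brute-force reference: map each frequent item set to its support count."""
    tracts = [frozenset(t) for t in transactions]
    items = sorted(set().union(*tracts)) if tracts else []
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # support = number of transactions that contain all items
            count = sum(1 for t in tracts if cset <= t)
            if count >= min_count:
                result[cset] = count
                found = True
        if not found:
            # no frequent set of this size, hence none of any larger size
            break
    return result

transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"],
                ["b", "c"], ["a", "b", "c"]]
reference = frequent_item_sets(transactions, min_count=3)
for items, count in sorted(reference.items(), key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(items), count)

try:
    import fim          # only available once PyFIM is installed
except ImportError:
    fim = None

if fim is not None:
    # Assumption: supp=-3 means "at least 3 transactions" and each
    # pattern starts with (item set, support); check help(fim.fpgrowth).
    mined = {frozenset(p[0]): p[1] for p in fim.fpgrowth(transactions, supp=-3)}
    print("matches brute force:", mined == reference)
```

For real data the brute-force helper is far too slow; it merely documents the expected result on a toy example.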

How to use the functions can be seen in the example scripts testfim.py and testacc.py in the source package (directory pyfim/ex). From a Python script or command prompt interface, call help(fim), help(apriori) (or help(fim.apriori)), help(eclat) (or help(fim.eclat)) etc. or print, for example, apriori.__doc__, eclat.__doc__ etc. for a description of the functions and their arguments.
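As a convenience, the docstrings mentioned above can also be summarized in one go. The sketch below is an illustration only: the helper doc_summaries is a hypothetical name, and the list of function names is simply taken from the description above. If the module is not installed, the helper returns an empty dictionary instead of failing.

```python
import importlib

NAMES = ["fim", "apriori", "eclat", "fpgrowth", "sam", "relim",
         "carpenter", "ista", "accretion", "apriacc", "arules"]

def doc_summaries(module_name="fim", names=NAMES):
    """Return {name: first docstring line}; empty dict if the module is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return {}
    summaries = {}
    for name in names:
        func = getattr(module, name, None)
        if func is None:
            summaries[name] = "(not available)"
            continue
        lines = (func.__doc__ or "").strip().splitlines()
        summaries[name] = lines[0] if lines else "(no docstring)"
    return summaries

for name, line in doc_summaries().items():
    print(f"{name:10s} {line}")
```

For the full argument descriptions, help(fim.apriori) etc. remains the authoritative source.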

The shared objects made available above were compiled for Python 3.10 on Linux Mint 21 and the dynamic modules made available above were compiled for Python 3.10 on Windows 10.

Installation: precompiled version

If you have a GNU/Linux system, you can use this extension module by simply downloading the shared object made available above and storing it in a directory that is on your PYTHONPATH (environment variable). A typical (local) installation directory is $HOME/lib/ while a typical (global) installation directory is /usr/local/lib/python3.10/site-packages/. (Note that you may need root rights to copy into the latter directory.) A typical (local) installation directory for the Anaconda Python distribution is $HOME/anaconda/lib/python3.10/site-packages/.
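If you are unsure which directories apply to your own interpreter, a small script can list them. This is a sketch using only the standard library; the helper name candidate_dirs is hypothetical, and the directories it reports depend on your installation.

```python
import os
import site
import sysconfig

def candidate_dirs():
    """Collect directories from which this interpreter may load extension modules."""
    # entries of the PYTHONPATH environment variable, if it is set
    dirs = [p for p in os.environ.get("PYTHONPATH", "").split(os.pathsep) if p]
    # site-packages directory of the running interpreter
    dirs.append(sysconfig.get_paths()["platlib"])
    # per-user site-packages (only searched if it exists and user site is enabled)
    dirs.append(site.getusersitepackages())
    unique = []
    for d in dirs:
        if d not in unique:
            unique.append(d)
    return unique

for d in candidate_dirs():
    status = "writable" if os.access(d, os.W_OK) else "not writable or missing"
    print(f"{d}: {status}")
```

A directory reported as writable can be used without root rights.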

If you have a Windows system, downloading the Python dynamic module made available above and placing it into the extension module directory of your Python installation should work. Consult the manual of your Python installation to find the correct directory. A typical directory for (global) installation is C:\Program Files\Python310\Lib\site-packages\. (Note that you may need administrator rights to copy into this directory.) A typical installation directory for the Anaconda Python distribution is C:\Anaconda3\Lib\site-packages\.

If you have trouble on Microsoft Windows, check whether you have the Microsoft Visual C++ Redistributable for Visual Studio 2022 (see under "Other Tools and Frameworks") installed, as the library was compiled with Microsoft Visual Studio 2022.
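After copying the file, one way to check whether the interpreter actually finds the module (on either platform) is to ask the import machinery where it would load it from, without importing it. This is a minimal sketch; the helper name locate_module is hypothetical.

```python
import importlib.util

def locate_module(name="fim"):
    """Return the file an import of `name` would load, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin

location = locate_module()
if location is None:
    print("fim not found on the module search path; check the installation directory")
else:
    print("fim would be loaded from", location)
```

If the module is found but the import itself still fails on Windows, a missing redistributable as described above is a plausible cause.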

Installation: using setuptools

Another way to install the extension module for your system is to use the Python script setup_fim.py (in the source package), which uses Python's setuptools package to build and install the module.

On a GNU/Linux system call the script with

./setup_fim.py install

in a terminal window to build and install the extension module. If you get a "Permission denied" error message, check whether the file setup_fim.py is marked as executable. If it is not, add the executable flag with the command

chmod +x setup_fim.py

Alternatively, call the script explicitly through Python:

python3 setup_fim.py install

On a Microsoft Windows system call the script with

python3 setup_fim.py install

in a command prompt window to build and install the extension module. Note, however, that this direct call to Python is possible on Microsoft Windows only if the directory in which the program python.exe resides is contained in your PATH environment variable (check its contents at a command prompt with echo %PATH%). Otherwise you may have to specify the full path to the Python program. A typical form of the command for this case is

"C:\Program Files\Python310\python.exe" setup_fim.py install
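If you do not know the full path of your interpreter, you can ask Python itself, for example from IDLE or any other Python prompt. The sketch below uses only the standard library; the helper name interpreter_paths is hypothetical.

```python
import shutil
import sys

def interpreter_paths():
    """Report the running interpreter and the 'python' found via PATH, if any."""
    return {
        "running": sys.executable,          # full path of this interpreter
        "on PATH": shutil.which("python"),  # None if 'python' is not on PATH
    }

for label, path in interpreter_paths().items():
    print(f"{label}: {path or 'not found'}")
```

The path printed as "running" is the one to put in quotes in front of setup_fim.py install.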

In addition, building the module requires a C compiler. On a GNU/Linux system Python uses the system C compiler, which for GNU/Linux is usually the GNU C compiler gcc. This compiler is essentially part of the system and thus basically always available. One may only have to install the Python development files (package python3-dev for Debian-based GNU/Linux distributions).
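Whether the Python development files are present can be checked by looking for the header Python.h in the include directory that the interpreter reports. This is a sketch under the assumption that the build uses the standard include directory; the helper name python_header_path is hypothetical.

```python
import os
import sysconfig

def python_header_path():
    """Return the path at which the running interpreter expects Python.h."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")

header = python_header_path()
if os.path.isfile(header):
    print("found", header)
else:
    print("missing", header, "(the Python development files may not be installed)")
```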

On a Windows system Python commonly uses Microsoft Visual Studio C/C++, which therefore needs to be installed. Note that the Community Edition of this C compiler can be obtained (perfectly legally) free of charge.

Note generally (for GNU/Linux as well as for Microsoft Windows) that installing this extension module for all users may require root/administrator rights in order to copy the shared object/Python dynamic module to the standard extension module directory. Local installations (for individual users) are also possible.

Installation: recompilation with makefiles

On GNU/Linux (provided the Python development files are installed – package python3-dev on Debian-based distributions), you may also install the extension module by simply calling

make all

in the source directory pyfim/src and copying the resulting shared object fim.so to a directory that is on your PYTHONPATH (environment variable).

On Windows you may also install the extension module by simply calling

nmake /f pyfim.mak all

in a command prompt of Microsoft Visual Studio C/C++ in the source directory pyfim/src and copying the resulting dynamic module fim.pyd to the extension module directory of your Python installation.

Should the compilation fail, check the definition of the variable PYDIR in the files makefile (GNU/Linux) or pyfim.mak (Windows).

If you are using the Anaconda Python distribution, you may use the special makefile pyfim_conda.mak, which is configured for Anaconda 1.8.0 installed in the default path. If you have a different version or installed to a non-standard path, you may have to adapt the definitions of CONDAINC and CONDALIB in pyfim_conda.mak.

References

An overview of frequent item set mining in general and several specific algorithms can be found in the following paper:

More information about frequent item set mining, implementations of other algorithms as well as test data sets can be found at the Frequent Itemset Mining Implementations Repository.