fim.so | (785 kb) | GNU/Linux Python 3.10 shared object |
fim.pyd | (396 kb) | Windows Python 3.10 dynamic module |
pyfim.zip | (865 kb) | C sources, version 6.30 (2022.11.22) |
pyfim.tar.gz | (802 kb) |
PyFIM is an extension module that makes several frequent item set
mining implementations available as functions in Python 3.10 or later.
Currently apriori, eclat, fpgrowth, sam, relim, carpenter, ista,
accretion and apriacc are available as functions, although the
interfaces do not offer all of the options of the command line
programs. (Note that lcm is available as an algorithm mode of eclat.)
There is also a "generic" function fim, which is essentially the same
function as fpgrowth, only with a simplified interface (fewer options).
Finally, there is a function arules for generating association rules
(simplified interface compared to apriori, eclat and fpgrowth, which
can also be used to generate association rules).
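To make concrete what these functions compute, here is a minimal sketch under stated assumptions. The helper brute_force_frequent is a hypothetical pure-Python reference miner (not part of PyFIM) that only works for tiny data sets. The helper mine_with_pyfim shows how the same task might be handed to fpgrowth; the keyword arguments target, supp (negative values meaning an absolute support) and report, as well as the shape of the returned pairs, are assumptions that should be checked against help(fpgrowth).

```python
from itertools import combinations


def brute_force_frequent(transactions, min_count):
    """Return {frozenset(items): support} for every item set that is
    contained in at least min_count transactions (tiny inputs only)."""
    tsets = [frozenset(t) for t in transactions]
    items = sorted(set().union(*tsets)) if tsets else []
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in tsets if cset <= t)
            if support >= min_count:
                result[cset] = support
                found = True
        if not found:
            # No frequent set of this size, so no larger set can be frequent.
            break
    return result


def mine_with_pyfim(transactions, min_count):
    """Same task via PyFIM (requires the fim module to be installed).
    Assumption: supp < 0 is an absolute support and report='a' yields
    (items, support) pairs; verify with help(fpgrowth)."""
    from fim import fpgrowth
    pairs = fpgrowth(transactions, target='s', supp=-min_count, report='a')
    return {frozenset(items): supp for items, supp in pairs}


# Small illustrative data set (made up for this example).
TRANSACTIONS = [
    ['a', 'b', 'c'],
    ['a', 'b'],
    ['a', 'c'],
    ['b', 'c'],
    ['a', 'b', 'c', 'd'],
]
```

Calling brute_force_frequent(TRANSACTIONS, 3) yields the three single items a, b, c and the three pairs built from them; if the assumptions about the interface hold, mine_with_pyfim(TRANSACTIONS, 3) should report the same item sets.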
How to use the functions can be seen in the example scripts testfim.py
and testacc.py in the source package (directory pyfim/ex). From a
Python script or command prompt interface, call help(fim),
help(apriori) (or help(fim.apriori)), help(eclat) (or help(fim.eclat))
etc., or print, for example, apriori.__doc__, eclat.__doc__ etc. for a
description of the functions and their arguments.
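As a small illustration of the docstring route, the following sketch collects the first docstring line of several functions in one go. The helper doc_summaries is a hypothetical convenience function, not part of PyFIM; it returns an empty dictionary if the module cannot be imported, so it is safe to run even before the module is installed.

```python
import importlib


def doc_summaries(module_name, names):
    """Map each name to the first line of its docstring.
    Names without a docstring (or missing names) map to None;
    an empty dict is returned if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return {}
    summaries = {}
    for name in names:
        obj = getattr(module, name, None)
        doc = obj.__doc__ if obj is not None else None
        lines = doc.strip().splitlines() if doc else []
        summaries[name] = lines[0] if lines else None
    return summaries


if __name__ == "__main__":
    wanted = ["fim", "apriori", "eclat", "fpgrowth", "arules"]
    for name, line in doc_summaries("fim", wanted).items():
        print(f"{name}: {line}")
```

For the full argument descriptions, help(apriori) or print(apriori.__doc__) remains the authoritative source.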
The shared objects made available above were compiled for Python 3.10 on Linux Mint 21, and the dynamic modules made available above were compiled for Python 3.10 on Windows 10.
If you have a GNU/Linux system, you can use this extension module
by simply downloading the shared object made available above and
storing it in a directory that is on your PYTHONPATH (environment
variable). A typical (local) installation directory is $HOME/lib/
while a typical (global) installation directory is
/usr/local/lib/python3.10/site-packages/. (Note that you may need
root rights to copy into the latter directory.) A typical (local)
installation directory for the Anaconda Python distribution is
$HOME/anaconda/lib/python3.10/site-packages/.
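Whether the shared object ended up in a place Python actually searches can be checked with a short sketch like the one below. The helper names locate_module and pythonpath_entries are hypothetical; the sketch relies only on the standard library.

```python
import importlib.util
import os


def locate_module(name):
    """Return the file a top-level module would be loaded from,
    or None if Python cannot find it on its search path."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def pythonpath_entries():
    """Return the directories listed in the PYTHONPATH environment
    variable (an empty list if the variable is not set)."""
    raw = os.environ.get("PYTHONPATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


if __name__ == "__main__":
    print("PYTHONPATH entries:", pythonpath_entries())
    print("fim found at:", locate_module("fim"))
```

If locate_module("fim") returns None after copying fim.so, the chosen directory is most likely not on the search path of the interpreter you are running.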
If you have a Windows system, downloading the Python dynamic
module made available above and placing it into the extension
module directory of your Python installation should work.
Consult the manual of your Python installation to find the
correct directory. A typical directory for (global) installation is
C:\Program Files\Python310\Lib\site-packages\. (Note that you may
need administrator rights to copy into this directory.) A typical
installation directory for the Anaconda Python distribution is
C:\Anaconda3\Lib\site-packages\.
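Instead of guessing the directory, you can ask the interpreter itself. The following sketch (the helper name extension_module_dirs is hypothetical) reports the site-packages directory for platform-specific modules and the per-user site-packages directory; it should work the same way on GNU/Linux and on Windows.

```python
import site
import sysconfig


def extension_module_dirs():
    """Return the directories into which extension modules are
    usually installed for the running interpreter."""
    return {
        # Installation-wide directory for platform-specific modules.
        "global": sysconfig.get_paths()["platlib"],
        # Per-user directory (no administrator rights needed).
        "user": site.getusersitepackages(),
    }


if __name__ == "__main__":
    for kind, path in extension_module_dirs().items():
        print(f"{kind}: {path}")
```

Run this with the same Python interpreter that is supposed to import the module, since different installations (for example a system Python and Anaconda) report different directories.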
If you have trouble on Microsoft Windows, check whether you have the Microsoft Visual C++ Redistributable for Visual Studio 2022 (see under "Other Tools and Frameworks") installed, as the library was compiled with Microsoft Visual Studio 2022.
Another way to install the extension module for your system is to
use the Python script setup_fim.py (in the source package), which
uses Python's setuptools package to build and install the module.
On a GNU/Linux system call the script with
./setup_fim.py install
in a terminal window to build and install the extension module.
If you get a "Permission denied" error message, check whether the
file setup_fim.py is marked as executable. If it is not, add the
executable flag with the command
chmod +x setup_fim.py
Alternatively, call the script explicitly through Python:
python3 setup_fim.py install
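If several Python installations coexist, it can help to make sure the setup script is run by exactly the interpreter that should receive the module. A minimal sketch, assuming it is started with that interpreter: sys.executable holds the full path of the running Python program, which also sidesteps the question of whether that program is on your PATH. The helper names and the source_dir parameter are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def setup_command(script="setup_fim.py", action="install"):
    """Build the command line that runs the setup script with the
    interpreter that is executing this code."""
    return [sys.executable, script, action]


def run_setup(source_dir):
    """Run the setup script located in source_dir and return its
    exit status (0 usually indicates success)."""
    script = Path(source_dir) / "setup_fim.py"
    if not script.is_file():
        raise FileNotFoundError(f"no setup_fim.py in {source_dir}")
    completed = subprocess.run(setup_command(script.name),
                               cwd=source_dir, check=False)
    return completed.returncode
```

Calling run_setup with the directory that contains setup_fim.py is then equivalent to typing the explicit command in that directory.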
On a Microsoft Windows system call the script with
python3 setup_fim.py install
in a command prompt window to build and install the extension module.
Note, however, that this direct call to Python is possible on Microsoft
Windows only if the directory in which the program python.exe resides
is contained in your PATH environment variable (check its contents at
a command prompt with echo %PATH%). Otherwise you may have to specify
the full path to the Python program. A typical form of the command for
this case is
"C:\Program Files\Python310\python.exe" setup_fim.py install
In addition, building the module requires a C compiler.
On a GNU/Linux system Python uses the system C compiler, which is
usually the GNU C compiler gcc. This compiler is essentially part of
the system and thus almost always available. You may only have to
install the Python development files (package python3-dev for Debian
based GNU/Linux distributions).
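Before starting a build, both prerequisites can be checked with a short sketch like the following. The helper name build_prerequisites is hypothetical; the check for Python.h in the interpreter's include directory is a heuristic for "the Python development files are installed", not a guarantee.

```python
import os
import shutil
import sysconfig


def build_prerequisites():
    """Report whether gcc is on the PATH and whether the Python
    header file Python.h can be found for this interpreter."""
    include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    return {
        "gcc_on_path": shutil.which("gcc") is not None,
        "python_h_found": os.path.isfile(header),
        "include_dir": include_dir,
    }


if __name__ == "__main__":
    for key, value in build_prerequisites().items():
        print(f"{key}: {value}")
```

If python_h_found is False on a Debian based distribution, installing the python3-dev package is the usual remedy.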
On a Windows system Python commonly uses Microsoft Visual Studio C/C++, which therefore needs to be installed. Note that the Community Edition of this C compiler can be obtained (perfectly legally) free of charge.
Note generally (for GNU/Linux as well as for Microsoft Windows) that installing this extension module for all users may require root/administrator rights in order to copy the shared object/Python dynamic module to the standard extension module directory. Local installations (for individual users) are also possible.
On GNU/Linux (provided the Python development files are installed,
package python3-dev on Debian based distributions), you may also
install the extension module by simply calling
make all
in the source directory pyfim/src and copying the resulting shared
object fim.so to a directory that is on your PYTHONPATH (environment
variable).
On Windows you may also install the extension module by simply calling
nmake /f pyfim.mak all
in a command prompt of Microsoft Visual Studio C/C++ in the source
directory pyfim/src and copying the resulting dynamic module fim.pyd
to the extension module directory of your Python installation.
Should the compilation fail, check the definition of the variable
PYDIR in the files makefile (GNU/Linux) or pyfim.mak (Windows).
If you are using the Anaconda Python distribution, you may use the
special makefile pyfim_conda.mak, which is configured for
Anaconda 1.8.0 installed in the default path. If you have a different
version or installed to a non-standard path, you may have to adapt
the definitions of CONDAINC and CONDALIB in pyfim_conda.mak.
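To see at a glance what a makefile currently defines for these variables, a small sketch such as the following may help. The helper makefile_variables is hypothetical and only understands simple single-line definitions of the form NAME = value; the sample text and its paths are made up for illustration and are not taken from the PyFIM makefiles.

```python
import re


def makefile_variables(text, names):
    """Extract simple single-line definitions (NAME = value, also
    with :=, ?= or +=) for the given variable names."""
    found = {}
    for name in names:
        pattern = rf"^{re.escape(name)}[ \t]*[:?+]?=[ \t]*(.*)$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        if match:
            found[name] = match.group(1).strip()
    return found


# Made-up sample content; real makefiles will differ.
SAMPLE = """\
CC       = gcc
PYDIR    = /usr/include/python3.10
CONDAINC ?= /opt/example/include
"""

if __name__ == "__main__":
    print(makefile_variables(SAMPLE, ["PYDIR", "CONDAINC", "CONDALIB"]))
```

To inspect a real file, read it with open(...).read() and pass the text to makefile_variables, then compare the reported values with the directories of your Python installation.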
An overview of frequent item set mining in general and several specific algorithms can be found in the following paper:
More information about frequent item set mining, implementations of other algorithms as well as test data sets can be found at the Frequent Itemset Mining Implementations Repository.