accfim.pdf | (1582 kb) | Accretion and FIM result diagrams |
accfim.zip | (32 kb) | scripts and other source files |
accfim.tar.gz | (25 kb) | |
detect.pdf | (4006 kb) | surrogate filtering result diagrams |
detect.zip | (41 kb) | scripts and other source files |
detect.tar.gz | (35 kb) | |
The document accfim.pdf contains the result diagrams for the complete
set of experiments concerning Accretion and its possible extensions by
frequent item set mining (other statistical tests, subset conditions,
maximal versus closed frequent item sets etc.) that were conducted for
the paper Picado-Muiño et al. 2013 referenced below. Only a few of
these diagrams are included in the paper due to lack of space.
For the theory underlying the methods, please consult the paper.
The archives accfim.{zip,tar.gz} contain the scripts and other source
files with which the experiments were conducted and the document with
the result diagrams was created.
The document detect.pdf contains the diagrams for the complete set of
experiments concerning the surrogate-based assembly detection method
suggested in the paper Picado-Muiño et al. 2013 referenced below. Only
a few of these diagrams are included in the paper. This document also
contains results of various pattern set reduction methods that are
discussed in detail in the paper Torre et al. 2013 referenced below
(the diagrams here differ from those used in that paper). For the
theory underlying the methods, please consult the two papers.
The archives detect.{zip,tar.gz} contain the scripts and other source
files with which the experiments were conducted and the document with
the result diagrams was created.
Note that the scripts etc. were developed on and for a GNU/Linux system
and are thus directly executable on such a system or a similar one
(that is, some other GNU/Linux distribution). Although most, if not
all, of the Python scripts should also work on a Windows system (with
the possible exception of the parallelization scripts), most of the
other scripts (like the run script, which is the main control script,
and the makefile, which controls generating the diagrams from the
result data) may need porting to batch files or something similar.
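To see quickly whether a given system provides the external programs
the scripts rely on, a small check along the following lines may help.
This is only a sketch and not part of the archives: the tool names are
the ones mentioned on this page (bash, awk, tar, pdflatex, mptopdf),
while the function name missing_tools and the constant REQUIRED_TOOLS
are made up for illustration.

```python
import shutil

# Programs named on this page; extend the tuple if your setup needs more.
REQUIRED_TOOLS = ("bash", "awk", "tar", "pdflatex", "mptopdf")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on the PATH.

    `which` is injectable so the check can be exercised without
    depending on what is installed on the current machine.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("not found on PATH: " + ", ".join(missing))
    else:
        print("all listed tools found")
```

An empty result only says that the programs can be found, not that
their versions are suitable for the scripts.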
On a GNU/Linux system, the following software needs to be installed to run the experiments:
- [...] (the detect scripts will also work without this extension
  module, namely by falling back on a pure Python replacement, which,
  however, is slower by a factor of about 40 or more),
- [...] pdflatex program,
- [...] mptopdf command,
- [...] (bash, awk, tar etc.) and are easy to install otherwise.
On such a system the experiments can be run by simply calling
the main script run (in the directory accfim or detect, respectively)
on the command line, which does everything. The execution of the
experiments uses 4-fold parallelization and thus makes full use of the
quad-core processors found in many current computers. The progress of
the experiments can be followed on the command line, to which regular
progress messages are written. Once all experiments are completed, the
result diagrams are created and compiled into the final documents,
which are also directly available above. Even on a modern computer
system, the accfim scripts can take more than 40 hours, mainly because
of the huge number of individual experimental runs (hundreds of
thousands) and the high cost of Fisher's exact test used in some of
them; the detect scripts take about 60 minutes.
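The requirements list above mentions that the detect scripts fall back
on a pure Python replacement when the extension module is not
available. A common way to arrange such a fallback in Python is
sketched below. It is an illustration of the general pattern only, not
the code from the archives: the module name hypothetical_fast_ext, the
attribute name count and both function names are placeholders, and the
counting routine merely stands in for whatever the real module
computes.

```python
import importlib


def pure_python_count(transactions, item):
    """Slow stand-in routine: count the transactions that contain `item`."""
    return sum(1 for transaction in transactions if item in transaction)


def load_counter(module_name="hypothetical_fast_ext"):
    """Return the compiled routine if importable, else the Python one.

    `module_name` and the attribute `count` are placeholders; the real
    extension module and its interface are not reproduced here.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Extension module not installed: use the slower replacement.
        return pure_python_count
    return getattr(module, "count", pure_python_count)


counter = load_counter()
print(counter([{1, 2}, {2, 3}, {3}], 2))
```

With this arrangement the calling code does not need to know which
implementation it received; only the running time differs.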