ML Repository Additions
The following is a list of databases, etc. that have recently been
added to the UCI Machine Learning Repository.
Any comments or donations would be greatly appreciated
(ml-repository@ics.uci.edu).
Patrick M. Murphy (Librarian)
P.S. Check out our new repository home page:
http://www.ics.uci.edu/
mlearn/MLRepository.html
- Bach Chorales (time-series) database (donated by Darrell Conklin)
Sequential (time-series) domain. Single-line melodies of 100 Bach
chorales (originally 4 voices). Number of Instances: 100 chorales,
each with 45 events. Number of Attributes: 6 (nominal) per event.
Includes a grammar describing the chorale dataset.
- Page Blocks Classification database (donated by Donato Malerba)
The problem consists of classifying all the blocks of the page
layout of a document that have been detected by a segmentation
process. This is an essential step in document analysis, in order
to separate text from graphic areas. The five classes
are: text (1), horizontal line (2), picture (3), vertical line
(4) and graphic (5). The 5473 examples come from 54 distinct
documents. All attributes are numeric.
- converter.lisp (donated by Stefanos Manganaris)
This code reads UCI and C4.5 data files directly into LISP.
Instances are transformed into any form, as specified by a
user-defined LISP function. One typically has one such function
for each learner. Functions can also be written to extract
features or otherwise manipulate the data. (in utilities/)
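
For readers who do not use LISP, the same idea can be sketched in a few
lines of Python. This is not converter.lisp and does not reproduce its
interface; it is only a rough illustration of reading instances and
handing each one to a user-defined function. It assumes one
comma-separated instance per line, optionally ending with a period as in
C4.5 data files. The names read_instances and to_feature_label are
hypothetical, and the sample values are made up.

```python
import csv
import io

def read_instances(lines, transform=lambda fields: fields):
    """Parse comma-separated instances, one per line, and apply
    `transform` (a user-defined function) to each list of fields."""
    instances = []
    for row in csv.reader(lines, skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        # Assumption: an instance may end with a period (C4.5 style),
        # so strip trailing periods from the last field.
        row[-1] = row[-1].rstrip(".")
        instances.append(transform(row))
    return instances

def to_feature_label(fields):
    # Hypothetical learner-specific transform: numeric features,
    # with the last field taken as the class label.
    return ([float(x) for x in fields[:-1]], fields[-1])

# Made-up sample data, for illustration only.
sample = io.StringIO("5.1, 3.5, text\n4.9, 3.0, graphic.\n")
pairs = read_instances(sample, to_feature_label)
```

One would typically write one such transform per learner and pass it as
the second argument; omitting it returns the raw field lists.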