ML Repository Additions



The following is a list of databases, etc. that have recently been
added to the UCI Machine Learning Repository.

Any comments or donations would be greatly appreciated
(ml-repository@ics.uci.edu).



Patrick M. Murphy (Librarian)

P.S. Check out our new repository home page:

     http://www.ics.uci.edu/mlearn/MLRepository.html


- Bach Chorales (time-series) database (donated by Darrell Conklin)

  Sequential (time-series) domain.  Single-line melodies of 100 Bach
  chorales (originally 4 voices).  Number of Instances: 100 chorales,
  each with 45 events.  Number of Attributes: 6 (nominal) per event.
  Includes a grammar describing the chorale dataset.

- Page Blocks Classification database (donated by Donato Malerba)

  The problem consists of classifying all the blocks of a document's
  page layout that have been detected by a segmentation process.
  This is an essential step in document analysis, needed to separate
  text from graphic areas.  The five classes are: text (1),
  horizontal line (2), picture (3), vertical line (4) and graphic
  (5).  The 5473 examples come from 54 distinct documents.  All
  attributes are numeric.
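
  As a rough illustration of the class coding listed above, the
  following Python sketch maps the numeric class codes to their names
  and tallies a list of labels.  Only the five code/name pairs come
  from the description; the names PAGE_BLOCK_CLASSES, class_name and
  class_histogram are hypothetical, and the sketch assumes the labels
  have already been read from the data file as integers.

```python
from collections import Counter
from typing import Dict, Iterable

# Class codes as listed in the database description.
PAGE_BLOCK_CLASSES: Dict[int, str] = {
    1: "text",
    2: "horizontal line",
    3: "picture",
    4: "vertical line",
    5: "graphic",
}


def class_name(code: int) -> str:
    """Return the class name for a numeric class code (hypothetical helper)."""
    try:
        return PAGE_BLOCK_CLASSES[code]
    except KeyError:
        raise ValueError(f"unknown page-block class code: {code}") from None


def class_histogram(codes: Iterable[int]) -> Dict[str, int]:
    """Count how many blocks fall into each named class."""
    return dict(Counter(class_name(code) for code in codes))


if __name__ == "__main__":
    # Made-up labels for illustration only; not taken from the database.
    sample_labels = [1, 1, 2, 5, 1]
    print(class_histogram(sample_labels))
```

  A tally like this is a quick way to check how the examples are
  spread over the five classes before training a learner.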

- converter.lisp (donated by Stefanos Manganaris)

  This code reads UCI and C4.5 data files directly into LISP.
  Instances are transformed into any form, as specified by a 
  user-defined LISP function.  One typically has one such function 
  for each learner.  Functions can also be written to extract 
  features or otherwise manipulate the data.  (Located in utilities/)
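
  The idea of one reader plus one user-defined transformation per
  learner can be sketched in Python as follows.  This is not the
  interface of converter.lisp, only an analogous illustration: the
  names read_instances, to_feature_label and to_numbered_dict are
  hypothetical, and the sketch assumes one instance per line with
  comma-separated values and the class label in the last field.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def read_instances(
    lines: Iterable[str],
    transform: Callable[[List[str]], T],
) -> Iterator[T]:
    """Yield one transformed instance per non-empty line.

    Assumption: values are comma-separated, one instance per line.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = [field.strip() for field in line.split(",")]
        yield transform(fields)


def to_feature_label(fields: List[str]) -> Tuple[List[str], str]:
    """Transformation for a learner that wants (features, label) pairs.

    Assumption: the class label is the last field.
    """
    *features, label = fields
    return features, label


def to_numbered_dict(fields: List[str]) -> Dict[str, str]:
    """Transformation for a learner that wants attribute-name keys."""
    *features, label = fields
    record = {f"a{i}": value for i, value in enumerate(features, start=1)}
    record["class"] = label
    return record


if __name__ == "__main__":
    # Made-up data lines for illustration only.
    sample = ["sunny,hot,no", "", "rainy,mild,yes"]
    for instance in read_instances(sample, to_feature_label):
        print(instance)
    for instance in read_instances(sample, to_numbered_dict):
        print(instance)
```

  Passing the transformation in as a function keeps the file reading
  in one place, so supporting another learner only means writing
  another small function.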