
Speech databases collected at IRST on the telephone line

All such databases are listed here. For training purposes, we also use the APASCI and SPK databases, which were acquired at 16 kHz in a quiet environment and then digitally filtered.

 
name  description
PHONE1 PHONE1 files were collected between December 1995 and July 1996 by an automatic system that placed calls to speakers who had been notified in advance. In total, 281 speakers were recorded, for a total of 5210 files (including silences). Each speaker has 1 to 64 files (including silences); 223 speakers have more than 15 files.

Files are divided into higQ (3631 files), medQ (1191) and lowQ (388), according to their content (presence/absence of spontaneous speech phenomena, out-of-vocabulary words, etc.). This subdivision is only indicative, as some files were relabeled in later phases.

 1652 acoustic  phonetically rich sentences 
  422 city      info about speaker's origin 
 1671 digit     digit sequences 
  844 noise     silences 
  621 yes-no    confirmations 
 5210 total 
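
The class counts and the quality split above should both add up to the stated total of 5210 files. A minimal sketch of that consistency check follows; the helper `parse_class_table` and the variable names are hypothetical and only illustrate how a "count, class, description" listing like the ones on this page could be read. They are not part of any IRST tool.

```python
# Hypothetical helper: parse "count  class  description" rows and compare
# the per-class counts with the stated total. Figures are copied from the
# PHONE1 listing above.
PHONE1_TABLE = """\
 1652 acoustic  phonetically rich sentences
  422 city      info about speaker's origin
 1671 digit     digit sequences
  844 noise     silences
  621 yes-no    confirmations
 5210 total
"""


def parse_class_table(text):
    """Return (counts_by_class, stated_total) for a class listing."""
    counts = {}
    stated_total = None
    for line in text.splitlines():
        # Split into count, class name and (optional) description.
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        count, name = int(fields[0]), fields[1]
        if name == "total":
            stated_total = count
        else:
            counts[name] = count
    return counts, stated_total


counts, stated_total = parse_class_table(PHONE1_TABLE)

# Quality split quoted in the PHONE1 description.
quality_split = {"higQ": 3631, "medQ": 1191, "lowQ": 388}

print(sum(counts.values()) == stated_total)         # class counts vs. total
print(sum(quality_split.values()) == stated_total)  # quality split vs. total
```

The same helper can be applied unchanged to the other listings on this page, since they share the same three-column layout.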

See also a tech report which describes this database, and a paper containing some results. 

FIELD1 FIELD1 files were collected in December 1996 in the Exchange of Acilia, Roma.

Files are divided into the following classes, according to the state the system was in when they were collected. Their actual content may differ.

  15 digit      isolated digits 
 246 loopdigit  continuous digit sequences 
 626 yes-no-a   confirmations by user a 
 228 yes-no-b   confirmations by user b 
1115 total

See also a page and a paper describing FIELD1, 2 and 3 (with detailed information about the actual amount of usable data) and reporting results.

FIELD2 FIELD2 files were collected in January 1997 in the Exchange of Acilia, Roma.

Files are divided into the following classes, according to the state the system was in when they were collected. Their actual content may differ.

  24 digit      isolated digits 
 205 loopdigit  continuous digit sequences 
 654 yes-no-a   confirmations by user a 
 208 yes-no-b   confirmations by user b 
1091 total

See again the page and the paper describing FIELD1, 2 and 3.

FIELD3 FIELD3 files were collected from February 13 to February 26, 1997, in the Exchange of Acilia, Roma.

Files are divided into the following classes, according to the state the system was in when they were collected. Their actual content may differ.

1499 loopdigit  continuous digit sequences 
1301 yes-no-a   confirmations by user a 
1337 yes-no-b   confirmations by user b 
4137 total 

Other files are available (348 digit + 3689 yes-no-a), but they have not yet been transcribed.

See again the page and the paper describing FIELD1, 2 and 3.

DEMO1 DEMO1 files were collected in February 1997 at IRST (internal demo). Most of the calls were made by people working at IRST.

Files are divided into the following classes, according to the state the demo was in when they were collected. Their actual content may differ.

 133 surnames   surnames of people working at ITC/IRST 
  16 names      first names (when the surname is not unique) 
 350 digit      isolated digits 
  95 loopdigit  continuous digit sequences 
 543 yes-no     confirmations 
 143 cities     major Italian cities (capoluoghi) 
1280 total

PHONE2 PHONE2 files were collected in August 1997 by an automatic system that placed calls to speakers who had been notified in advance. In total, 269 speakers were recorded, for a total of 4034 files (including silences). Each speaker has 2 to 37 files (including silences); 121 speakers have more than 15 files.

Files are divided into higQ (3669 files), medQ (122) and lowQ (243), according to their content (presence/absence of spontaneous speech phenomena, out-of-vocabulary words, etc.). This subdivision is only indicative.

 1356 acoustic  phonetically rich sentences 
  307 city      info about speaker's origin 
  639 digit     digit sequences 
 1212 noise     silences collected during the system's introduction 
  520 yes-no    confirmations 
 4034 total

FIELD4 FIELD4 files were collected from October 14 to October 16, 1997, in the Exchange of Acilia, Roma.

Files are divided into the following classes, according to the state the system was in when they were collected. Their actual content may differ.

 419 loopdigit  continuous digit sequences
2255 yes-no-a   confirmations by user a 
 340 yes-no-b   confirmations by user b 
3014 total

CARITRO Dec '97 / May '98: field data under collection ... see this page for a description of the application.
 
 

  Last update 1/6/1998 - Maintainer Roberto Gretter