These scripts are designed to make use of the Enron email dataset (see http://www.cs.cmu.edu/~enron/). They try to retrieve contact information from the messages and create an address book (VCARD format) from them.
To start the whole process, just run './run.sh'. It needs wget, tar, gzip, cat, sort, uniq and PHP.
The following is a list of the scripts:
extract_contacts.php reads the mail files, analyzes their addressee information (names, company names, email addresses) and outputs it.
Takes about 4 minutes on a Pentium 4 2.4 GHz to read 128326 mail files.
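The extraction step can be illustrated with a minimal sketch. It is written in Python rather than PHP and is not the code of the script itself; the function name extract_contacts and the sample message are hypothetical. It only handles names and email addresses from the From/To/Cc/Bcc headers and leaves out company names.

```python
from email import message_from_string
from email.utils import getaddresses


def extract_contacts(raw_message):
    """Return sorted, unique (name, address) pairs from the addressee headers."""
    msg = message_from_string(raw_message)
    headers = []
    for field in ("From", "To", "Cc", "Bcc"):
        # get_all returns every occurrence of the header, or the default.
        headers.extend(msg.get_all(field, []))
    contacts = set()
    for name, address in getaddresses(headers):
        if address:
            # Lower-case the address so duplicates collapse into one entry.
            contacts.add((name.strip(), address.lower()))
    return sorted(contacts)


# Hypothetical sample message, not taken from the Enron dataset.
sample = (
    "From: Jane Doe <jane.doe@example.com>\n"
    "To: John Smith <john.smith@example.com>, info@example.org\n"
    "Subject: Test\n"
    "\n"
    "Body text.\n"
)
contacts = extract_contacts(sample)
for name, address in contacts:
    print(name, address)
```

Collecting the pairs in a set plays the same role here that sort and uniq could play on plain-text output.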
Reads the output from extract_contacts.php and creates a VCARD address book from it. Additional information, such as telephone numbers, birthdays and photographs, is randomly generated and added to the VCARDs.
Takes about 20 seconds on a Pentium 4 2.4 GHz to create 32072 VCARDs.
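As a rough illustration of this step, the following Python sketch builds one vCard 3.0 entry with randomly generated filler data. The function name make_vcard, the phone number pattern and the birth year range are assumptions made for the example, not details of the actual script. Photographs and escaping of special characters are omitted.

```python
import random


def make_vcard(name, email, rng):
    """Build a minimal vCard 3.0 entry; phone and birthday are random filler."""
    # Treat the last word as the family name (a simplification).
    given, _, family = name.rpartition(" ")
    phone = "+1-555-%04d" % rng.randint(0, 9999)
    birthday = "%04d-%02d-%02d" % (
        rng.randint(1940, 1985),  # arbitrary year range, assumed
        rng.randint(1, 12),
        rng.randint(1, 28),       # stay below 29 so every month is valid
    )
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        "FN:" + name,
        "N:%s;%s;;;" % (family, given),
        "EMAIL;TYPE=INTERNET:" + email,
        "TEL;TYPE=WORK:" + phone,
        "BDAY:" + birthday,
        "END:VCARD",
    ]
    # vCard lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"


rng = random.Random(0)  # fixed seed keeps the generated data reproducible
card = make_vcard("Jane Doe", "jane.doe@example.com", rng)
print(card)
```

Passing the random generator in as a parameter makes it possible to regenerate the same address book from the same seed.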
Randomly adds fake attachments to the email messages, in order to create a more realistic dataset.
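One way such a step could work is sketched below in Python. It is an assumption about the approach, not the script's actual code: the function name maybe_add_attachment, the probability parameter, the file name fake.bin and the attachment size are all hypothetical. A plain-text message is wrapped in a multipart message and a block of random bytes is attached to it.

```python
import random
from email import message_from_string
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Headers that the new multipart container sets itself.
SKIPPED = ("content-type", "content-transfer-encoding", "mime-version")


def maybe_add_attachment(raw_message, rng, probability=0.2, size=256):
    """With the given probability, return the message with a fake attachment."""
    original = message_from_string(raw_message)
    if original.is_multipart() or rng.random() >= probability:
        return raw_message  # leave the message untouched
    wrapped = MIMEMultipart()
    for key, value in original.items():
        if key.lower() not in SKIPPED:
            wrapped[key] = value
    # The original body becomes the first part of the new message.
    wrapped.attach(MIMEText(original.get_payload()))
    # Random bytes stand in for a real file.
    data = bytes(rng.getrandbits(8) for _ in range(size))
    blob = MIMEApplication(data)
    blob.add_header("Content-Disposition", "attachment", filename="fake.bin")
    wrapped.attach(blob)
    return wrapped.as_string()


# Hypothetical sample message, not taken from the Enron dataset.
sample = "From: jane.doe@example.com\nSubject: Test\n\nBody text.\n"
result = maybe_add_attachment(sample, random.Random(0), probability=1.0)
print(result)
```

Messages that are already multipart are skipped, so running the sketch twice does not nest attachments.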
(c) 2007 Robert Zwerus <email@example.com>