Apache SpamAssassin is an extensible email filter used to identify spam. Once spam is identified, the mail can optionally be tagged for later filtering. It provides a command-line tool to perform filtering, a client-server system to filter large volumes of mail, and Mail::SpamAssassin, a set of Perl modules that allow Apache SpamAssassin to be used in a wide variety of email systems.
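As a rough illustration of the "tag now, filter later" idea, the sketch below sorts already-scanned messages by looking for the `X-Spam-Flag: YES` header that SpamAssassin adds to mail it classifies as spam in its usual configuration. The function names `is_tagged_spam` and `sort_mail` are hypothetical and not part of SpamAssassin; this is only one way a downstream filter might consume the tag.

```python
from email import message_from_string


def is_tagged_spam(raw_message: str) -> bool:
    """Return True if the message carries an 'X-Spam-Flag: YES' header."""
    msg = message_from_string(raw_message)
    # Header lookup is case-insensitive; a missing header yields the default "".
    flag = msg.get("X-Spam-Flag", "")
    return flag.strip().upper() == "YES"


def sort_mail(raw_messages):
    """Split raw messages into (inbox, spam) lists based on the tag."""
    inbox, spam = [], []
    for raw in raw_messages:
        (spam if is_tagged_spam(raw) else inbox).append(raw)
    return inbox, spam
```

In practice this kind of rule usually lives in a mail delivery agent or mail client filter rather than in a script; the point is only that the tagging step and the filing step are separate.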
MH-sync is a small suite of command-line tools designed to allow an MH user to read mail offline by synchronizing a local set of MH folders with the "real" MH folders online at a remote site. Changes made using 'rmm' and 'refile' will be propagated back to the server site correctly without being affected by "folder -pack" or server-side message filing or removal. ssh is used as the transport, and no other ports need be open on the firewall.
Sitescooper automatically retrieves stories from news websites, trims off extraneous HTML, and converts them into Plucker, iSilo, or Palm DOC format for later reading on the move. It will avoid stories you've already read, and can handle virtually any news site on the Web. Support for over 300 sites is included. Even if you don't have a handheld, it's still handy for simple website-to-text conversion.
EtText is a simple plain-text format which allows conversion to and from HTML. It provides an easy-to-edit, easy-to-read and intuitive way to write HTML, based on plain-text markup conventions. Like most simple text markup formats (POD, setext, etc.), EtText markup handles the usual things: insertion of paragraph tags, header recognition, and markup. However, it also adds a powerful link markup system, and tries to generate code which conforms to XHTML.
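To make "insertion of paragraph tags" concrete, here is a minimal sketch of the general technique: blank-line-separated blocks of plain text are wrapped in `<p>` elements, with HTML-special characters escaped. This is a generic illustration and not EtText's actual syntax or implementation; the function name `paragraphs_to_html` is hypothetical.

```python
import html
import re


def paragraphs_to_html(text: str) -> str:
    """Wrap each blank-line-separated block of text in <p>...</p>."""
    # One or more blank (or whitespace-only) lines separate paragraphs.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        if not block.strip():
            continue
        # Collapse internal line breaks and runs of spaces, then escape &, <, >.
        body = html.escape(" ".join(block.split()))
        paragraphs.append(f"<p>{body}</p>")
    return "\n".join(paragraphs)
```

A real converter would also need to recognize headers, links, and inline markup before escaping, which is where formats like this differ from one another.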
WebMake is a simple Web site management system, allowing an entire site to be created from an optional set of text and markup files and one WebMake file. It requires no dynamic scripting capabilities on the server, and can be run entirely offline. It allows the separation of responsibilities between content editors, page designers, and the site architect. Only the site architect needs to edit the WebMake file itself, or know Perl or WebMake code. Perl scripts can be embedded and executed to build pages. Automatic dependency tracking means that pages will not be rebuilt unless necessary. Metadata support means that indexes etc. can be built automatically.
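The dependency-tracking idea can be sketched with a simple timestamp comparison: a page is rebuilt only if its output is missing or older than one of the files it depends on. This is an assumption about the general approach, in the style of make-like tools, and not a description of how WebMake itself tracks dependencies; `needs_rebuild` is a hypothetical helper.

```python
import os


def needs_rebuild(output_path, source_paths):
    """Return True if the output is missing or older than any source file."""
    if not os.path.exists(output_path):
        return True
    out_mtime = os.path.getmtime(output_path)
    # Rebuild when at least one source was modified after the output was written.
    return any(os.path.getmtime(src) > out_mtime for src in source_paths)
```

A build loop would call this once per output page and skip generation whenever it returns False, which is what keeps repeated offline builds fast.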
Spam Filters
Re: Interesting but incomplete
Hi -- nice study in general. But I agree that using SpamAssassin's sa-learn would have made a big difference. ;) The required amount for SA to start using bayes ...