<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Makura no Soshi &#187; CRM114</title>
	<atom:link href="http://mschuette.name/wp/category/projects/crm114-projects/feed/" rel="self" type="application/rss+xml" />
	<link>http://mschuette.name/wp</link>
	<description>枕草子</description>
	<lastBuildDate>Tue, 22 May 2012 08:32:27 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Text::AI::CRM114</title>
		<link>http://mschuette.name/wp/2012/05/textaicrm114/</link>
		<comments>http://mschuette.name/wp/2012/05/textaicrm114/#comments</comments>
		<pubDate>Tue, 22 May 2012 08:32:27 +0000</pubDate>
		<dc:creator>Martin</dc:creator>
				<category><![CDATA[CRM114]]></category>
		<category><![CDATA[E-Mail]]></category>
		<category><![CDATA[english]]></category>
		<category><![CDATA[Projects]]></category>
		<category><![CDATA[benchmark]]></category>
		<category><![CDATA[comparison]]></category>
		<category><![CDATA[cpan]]></category>
		<category><![CDATA[crm114]]></category>
		<category><![CDATA[libcrm114]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">https://mschuette.name/wp/?p=827</guid>
		<description><![CDATA[I released my first CPAN module. I finally played with libcrm114, a C library that implements several text classification algorithms. It is a potential replacement for the mailreaver.crm tool, which is the basis for my SpamAssassin plugin. Having a C library removes the need to fork a crm interpreter process (and in most cases also [...]]]></description>
			<content:encoded><![CDATA[<p>I released my first <a href="https://metacpan.org/module/Text::AI::CRM114">CPAN module</a>.</p>
<p>I finally played with <a href="http://crm114.sourceforge.net/">libcrm114</a>, a C library that implements several text classification algorithms. It is a potential replacement for the <code>mailreaver.crm</code> tool, which is the basis for my SpamAssassin <a href="/wp/crm114-spamassassin-plugin/">plugin</a>.</p>
<p>A C library removes the need to fork a crm interpreter process (and in most cases also the need to re-read the learned feature data) for every single classification. It also makes bindings for other languages possible: so far I know of modules for <a href="http://gymx.net/php-crm114/">PHP</a> and <a href="https://github.com/pmundkur/libcrm114">Python</a>, and mine seems to be the first one for Perl.</p>
<p>A first, informal benchmark against my mailbox confirms the expected performance improvement: classifying 9111 texts took 7.89 seconds with the library, compared with 100.86 seconds with the forking approach. I use <a href="https://metacpan.org/module/AI::CRM114">AI::CRM114</a> for the old fork-and-interpret model (<a href="/files/crm114/test_ai_crm114.pl">source</a>) and <a href="https://metacpan.org/module/Text::AI::CRM114">Text::AI::CRM114</a> for the new in-memory library model (<a href="/files/crm114/test_text_ai_crm114.pl">source</a>).</p>
<pre>~&gt; time perl test_ai_crm114.pl
classified 9111 texts in 100.86 seconds (11.070 millisec per text)
Spam Texts: 68
 Ham Texts: 9043
30.717u 64.212s 1:41.50 93.5%   200+8161k 9361+0io 0pf+0w

~&gt; time perl test_text_ai_crm114.pl
classified 9111 texts in 7.89 seconds (0.866 millisec per text)
Spam Texts: 68
 Ham Texts: 9043
6.719u 0.485s 0:08.41 85.4%     9+2691k 0+0io 0pf+0w</pre>
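<p>As a rough sanity check, the per-text times and the overall speedup can be recomputed from the totals. The following Python sketch is not part of the benchmark scripts; it only redoes the arithmetic with the numbers printed above, and the function name is made up for illustration.</p>
<pre># Figures copied from the two benchmark runs above.
TEXTS = 9111
FORK_SECONDS = 100.86     # AI::CRM114, forks a crm process per text
LIBRARY_SECONDS = 7.89    # Text::AI::CRM114, in-process libcrm114

def millisec_per_text(total_seconds, texts):
    """Average classification time per text in milliseconds."""
    return total_seconds / texts * 1000.0

fork_ms = millisec_per_text(FORK_SECONDS, TEXTS)
library_ms = millisec_per_text(LIBRARY_SECONDS, TEXTS)
speedup = FORK_SECONDS / LIBRARY_SECONDS

print("fork model:    %.3f millisec per text" % fork_ms)
print("library model: %.3f millisec per text" % library_ms)
print("speedup:       %.1fx" % speedup)</pre>
<p>This reproduces the 11.070 and 0.866 millisec per text reported by the scripts and gives a wall-clock speedup of about 12.8 for this one mailbox; other mail corpora and machines may behave differently.</p>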
]]></content:encoded>
			<wfw:commentRss>http://mschuette.name/wp/2012/05/textaicrm114/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CRM114-Plugin News</title>
		<link>http://mschuette.name/wp/2009/04/crm114-plugin-news/</link>
		<comments>http://mschuette.name/wp/2009/04/crm114-plugin-news/#comments</comments>
		<pubDate>Sat, 18 Apr 2009 11:39:05 +0000</pubDate>
		<dc:creator>Martin</dc:creator>
				<category><![CDATA[CRM114]]></category>
		<category><![CDATA[english]]></category>

		<guid isPermaLink="false">http://mschuette.name/wp/?p=280</guid>
		<description><![CDATA[This week brought great news for my CRM114 plugin: The upcoming amavisd-new version 2.6.3 will completely support CRM114 (either standalone or as an SA plugin) so no more patches are required to include custom headers. In addition Mark made several improvements to my plugin itself, so I am happy to release a new plugin version [...]]]></description>
			<content:encoded><![CDATA[<p>This week brought great news for my <a href="/wp/crm114-spamassassin-plugin/">CRM114 plugin</a>: The upcoming <a href="http://www.ijs.si/software/amavisd/">amavisd-new</a> version 2.6.3 will completely support <a href="http://crm114.sourceforge.net/">CRM114</a> (either standalone or as an SA plugin) so no more patches are required to include custom headers.</p>
<p>In addition, Mark made several improvements to my plugin itself, so I am happy to release a new plugin version 0.8 (see the <a href="/wp/crm114-spamassassin-plugin/">project page</a> for the <a href="http://mschuette.name/files/crm114.pm">module</a>, its <a href="http://mschuette.name/files/crm114.html">documentation</a> and additional notes).</p>
<p><strong>Update:</strong> I just noticed that CRM114&#8217;s stable versions (those from 2007) do not support the “<code>--report_only</code>” option. I therefore made a last-minute change after uploading and deactivated the option in line 653 (= line 607 in the SA3.3 version).</p>
]]></content:encoded>
			<wfw:commentRss>http://mschuette.name/wp/2009/04/crm114-plugin-news/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

