So you want to be an independent software developer Monday, June 19 2006

Recently I've been investigating the possibility of starting a micro independent software vendor (mISV, apparently).  This involves a heck of a lot of work which is not strictly related to getting your software finished and bug free.  Assuming you can pass that hurdle, here's some other stuff you might want to get a handle on:

Web site: You absolutely, positively need a web site to promote your software.  It will probably also be your number one sales channel.  Even if you have additional channels (for example, if you manage to hammer out a distribution deal with a portal), the fact that you'll get so much more money from sales at your own website makes it well worth your trouble to spend some time and money setting up the site.

Web hosting is a commodity business nowadays.  After a little poking around the Internet, it seems is about as good a choice as anyone.  You get your domain name for about $2 with a hosting contract (2-month minimum) and their cheapest hosting plan gets you 250 GB of transfer for $4 a month.  If you need CGI, for example for customer tracking or to implement your own web store, you'll want to upgrade to their next higher hosting plan for a whopping $7 a month.  Warning: When it comes to support and dispute resolution you probably get what you pay for here.  If your venture takes off you might want to move to a web host where you'd actually rate enough importance to get a live representative if you call in… but let's not get ahead of ourselves, shall we?

Web site development: "Hey, I can do this by myself in Notepad!"  Well, I suppose you could… if this were 1996.  In 2006, having a nice professional-looking website is a) not very expensive and b) worth its weight in gold when you're trying to convince someone to hand over their credit card number to you.  I have no particular recommendation on a web designer, but get somebody who can build you a nice extensible site with CSS so you can add additional pages in the same style yourself.  I suppose you could cut corners and use one of the default templates that comes with, say, OSS blogging software… but would YOU purchase something from a page like that?  Budget $100 as the bare minimum and likely a bit of a multiple of that.

Marketing: So you've got a web site hosted somewhere on the Internet.  That's great.  What are you missing?  Customers.  With no userbase to start out with you're cut off from the most important marketing tool: word of mouth.  Thus, to get the ball rolling, you're going to be relying on the world's largest marketing firm: .  First, you need to optimize your site so that it's maximally search-engine friendly.  There are gigabytes of good advice and sheer voodoo written about how to do this on the Internet.  I'll give you the five second summary: make sure everything on your site is tagged in machine-readable format (alts on your images, etc), never miss a chance to provide natural language information (filenames and URLs should be human readable and descriptive of content, your TITLE tag should include your keywords in addition to your company name), and have the keywords that your users will be searching for in your site (if, for example, your users are non-technical then make sure the non-technical gloss of what your software does appears somewhere in, say, a FAQ page).

Your other, likely more productive, option is Google AdWords.  Here's how it works: you come up with a text ad which points at your site, and you hand it to Google along with a list of keywords which you want that ad to appear with.  If someone searches a combination of your keywords, Google will hold an automated auction between you and the other people aiming for that keyword, and show ads for the folks offering the most money.  You pay per click (or per thousand impressions if you target a particular site; more on that later).  For a keyword with vanishingly little competition, AdWords is *dirt cheap*: a penny per click.  Keywords with lots of competition and/or high value ("online casino") are much more expensive (I hear some are in the $10+ range, mostly for legal services).  If you've chosen your software niche well, you'll be going up against moderate competition with your main phrases and, this is key, *no competition* on phrases folks search but haven't thought to market to yet.  Let me give you an example.  Say you make software which, to continue my example from below, teaches Japanese.  Yay for you.  "Japanese software" has lots of bidders.  "Teach yourself Japanese" has exactly one ad displayed for it.  "Japanese language game", "Teach yourself kanji", "Kanji game", "Japanese edutainment", "Japanese practice", etc, have NO ads displayed for them.  This means you can pick up these searches for a song (a penny per click).  Then your ad takes them right to your sales presentation, which has in big bold type the two actions you want them to take: downloading the demo and purchasing the software.

Software deployment: I originally thought of spending a couple of hundred dollars on InstallShield, which is an industry-standard installer program.  Here's the rub: why spend a couple hundred dollars when you can get something functionally identical and very professional looking for free?  And legally, too!  Take a gander at the Nullsoft Scriptable Install System, an open source piece of software released under the zlib/libpng license (don't worry, you can wrap commercial software with it without any license issues).  Wouldn't you want your installer to look something like one of these?  Yeah, thought so.

Incidentally, I'm a Java developer.  Java development can be risky for an mISV, because you can't be sure that folks will have the JRE that you need installed, and folks might not understand the whole "click on this jar" concept.  You can make this a much simpler concept for them by adding a native executable to wrap your JAR.  Ideally, this executable would detect if they had no JRE or if their JRE were insufficient (if, for example, you are addicted to Java 1.5 and can't write two lines of code without creating a Hashtable<String, String>) and then take them straight to Sun's site to fix the problem.  Haha, such a program also exists, and it's free!  Try out Launch4j, and watch your conversion rate soar as you take the hassle out of the critical "launching the bloody program" step.  There are some other nice features, such as being able to display a native splash screen while loading the JRE (nice for those of us mildly concerned that Java, while being a quite responsive language during runtime, can be a bit… slow… when loading), and it builds nicely from an Ant task.

Payment processing: OK, so you've got folks to your website, they've downloaded your demo, and they're ready to hand you money!  That's great.  How can they get you the money?

Well, option #1 is you get yourself a credit card merchant account.  This can be a bit on the expensive side for startup costs, you'll have to pay a monthly maintenance fee, and you'll need some web-facing and backend software to handle billing.  All in all, not a whole lot of fun.  On the plus side, this is absolutely the cheapest option in terms of the rake the payment processor will take (figure on ~$0.50 + 2%).

Your other option is working with a payment processor.  I've been able to locate three, which have very little functional differentiation from my perspective (since I'm just looking for the very simple "process my payments and mail out serial numbers, I'll handle the rest myself" service rather than any of their overpriced shopping cart/DRM/etc offerings).  Here are the links and prices: : $2.95 + 5% *or* min($2.50, 14.9%).  No setup fee. : 10% "trial" offer, scales to $15k yearly sales, then 15% and declining based on sales. min($3.00, 8.9%).  $10 setup fee.

You can do the math on which is cheapest at your price point and expected level of sales.  I personally will be launching something within the next few weeks at $15, and wins at that level by a significant amount, but if you're planning on doing significant levels of business (say, in the $30k per year range) on a more expensive program (say, $40) their 15% fee is going to lose to either of the other options.
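If you'd rather let the computer do that math, here is a minimal Python sketch of the comparison.  It rests on a few assumptions of mine: "min($2.50, 14.9%)" is read as a percentage fee with a minimum charge (and likewise for the $3.00 / 8.9% plan), setup fees and volume-based discounts are ignored, and the labels are purely descriptive since they are not tied to any particular processor.

```python
# Per-sale fee comparison, using only the fee figures quoted in the post.
# Assumption: "min($X, Y%)" means "Y% of the price, but never less than $X".
# Setup fees and sales-volume discounts are deliberately left out.

def flat_plus_percent(price, flat, pct):
    """Fee made of a flat charge plus a percentage of the sale price."""
    return flat + price * pct

def percent_with_minimum(price, pct, minimum):
    """Percentage fee that never drops below a minimum charge."""
    return max(minimum, price * pct)

def fee_table(price):
    """Return the per-sale fee for each quoted plan at a given price."""
    fees = {
        "$2.95 + 5%": flat_plus_percent(price, 2.95, 0.05),
        "14.9%, $2.50 minimum": percent_with_minimum(price, 0.149, 2.50),
        "10% trial rate": price * 0.10,
        "15% post-trial rate": price * 0.15,
        "8.9%, $3.00 minimum": percent_with_minimum(price, 0.089, 3.00),
    }
    return {label: round(fee, 2) for label, fee in fees.items()}

def cheapest(price, exclude=()):
    """Label of the lowest-fee plan, optionally skipping some plans."""
    table = {k: v for k, v in fee_table(price).items() if k not in exclude}
    return min(table, key=table.get)

if __name__ == "__main__":
    for price in (15, 40):
        print(f"${price}:", fee_table(price), "-> cheapest:", cheapest(price))
```

Pass `exclude=("10% trial rate",)` to `cheapest` to see what happens once the trial rate no longer applies.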

Note there is a bit of vendor lock-in here: while you won't strictly speaking be contractually obligated to only use their service, transferring your serial numbers, customer data, and registration schemes to another system after having used one of them for a while will burn up a LOT of your time.  The creator of Lux, a rather successful Risk clone, has some great words about this here.

An Indie Business Plan Thursday, June 15 2006

This is not strictly work-related but I was intrigued by Brian Green's challenge to come up with a business plan for an indie game. I am posting it here rather than in his comment section because of the length.

Let's start in reverse: How much money do we need to make this project worthwhile? We have 2 people working for equity. I'm going to assume they are working for the chance at a not-extravagant-but-comfortable salary for an educated knowledge worker. $30k per year is what I would consider the minimum to entertain an employment offer as a 24 year old engineer with no responsibilities, so let's use that as the baseline. 18 month development cycle * 2 employees = 3 man-years of labor, or $90,000. Plus we need to recoup our $50k initial investment. This means we'll need $140,000 in net sales. I'm going to assume we lose 50% of every sale to the download partner, which means we need gross sales of $280,000. (This will depend on what price point and contract terms you pick, but for ballparking let's run with 50%.)
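The arithmetic above is easy to re-run with different assumptions.  Here is a minimal Python sketch; the function and parameter names are mine, and the numbers plugged in are the ones from the paragraph above.

```python
# Work backwards from salaries and investment to the gross sales required.
# All names here are illustrative; the inputs come from the post's ballpark.

def required_sales(salary_per_year, people, months, investment, partner_cut):
    """Return (labor cost, net sales needed, gross sales needed)."""
    labor = salary_per_year * people * months / 12   # man-years times salary
    net_needed = labor + investment                  # what must reach our pockets
    gross_needed = net_needed / (1 - partner_cut)    # before the partner's share
    return labor, net_needed, gross_needed

if __name__ == "__main__":
    labor, net, gross = required_sales(
        salary_per_year=30_000, people=2, months=18,
        investment=50_000, partner_cut=0.50,
    )
    print(f"labor ${labor:,.0f}, net ${net:,.0f}, gross ${gross:,.0f}")
```

Changing `partner_cut` shows how sensitive the target is to the contract terms you negotiate.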


The Spam Community vs. The Anti-Spam Community Friday, May 12 2006

In the anti-spam community we spend an awful lot of time poring over headers, writing regular expressions to catch "ratware", and training Bayesian filters to do content analysis.  But, while we gripe about spammers in our mailing lists and blog posts, we don't often describe their operations in detail: spam, for our Internets and purposes, pops out of some ether when it arrives at our mail servers.  This strikes me as a poor foundation for reasoning about the problem.  So let's stop talking about spam for a second, and talk about spammers.

Spammers have a community, no less than anti-spammers do.  It is present in underground IRC channels, peer networks over ICQ, and web-based forums such as SpecialHam (where I lurked periodically and learned most of the following).  Spammers have an *infrastructure*: there are dozens of players involved in getting the latest offer for Mr. Wiggly pills to your mailbox.  This infrastructure has specialization by roles.  Just as the anti-spam community has a handful of deep thinkers like Paul Graham, who wrote the seminal A Plan for Spam essay which kicked off the modern Bayesian filter experiment, there are deep thinkers of spam.  A relatively tiny portion of both communities has the technical skill necessary to develop tools to further their interests, and these tools are both shared and sold.  A still small but larger portion of each community is expert with techniques to make maximum use out of their available tools, for example, writing regular expressions for SpamAssassin.  The majority of the community has no special level of technical skills and seeks turn-key solutions where you click two buttons and go.

Let's talk a little about the types of tools the spam community has a need for.

scrapers: A scraper is an Internet spider which collects email addresses.  It scans publicly available web pages, newsgroups, forums, etc to find more targets for spam.  In the anti-spam community, we've advised people to be circumspect with publishing their email address and to use obfuscation tricks like  It should come as no surprise that many scrapers have adapted to these tricks, for example by using regular expressions which recognize / (AT) / as equivalent to /@/.  There are other ways of getting email addresses which do not involve scrapers, which are described below.
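To see how little protection the (AT) trick buys you, here is a minimal Python sketch you can point at your own pages to check whether an obfuscated address is still trivially recoverable.  The patterns are my own illustrative assumptions, not taken from any real tool, and the address in the example is a placeholder.

```python
import re

# Illustrative assumptions: bracketed "at" / "dot" variants, case-insensitive.
AT_PATTERN = re.compile(r"\s*[\(\[]\s*at\s*[\)\]]\s*", re.IGNORECASE)
DOT_PATTERN = re.compile(r"\s*[\(\[]\s*dot\s*[\)\]]\s*", re.IGNORECASE)
# Deliberately simple address pattern; good enough for an audit, not RFC-complete.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def recoverable_addresses(text):
    """Return addresses that survive after undoing (AT)/(DOT) obfuscation."""
    normalized = AT_PATTERN.sub("@", text)
    normalized = DOT_PATTERN.sub(".", normalized)
    return EMAIL_PATTERN.findall(normalized)

if __name__ == "__main__":
    page = "Contact: alice (AT) example (DOT) com for details"
    print(recoverable_addresses(page))
```

If your published address comes back out of a few lines like these, assume it is no better hidden than a plain one.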

verifiers: A verifier takes as input a list of email addresses and returns as output a sublist of those email addresses which can actually be delivered to.  There are a variety of methods for accomplishing this, and many of them involve actually mailing the inboxes in question.  Many mail servers now will refuse connections if you deliver too many messages to invalid inboxes in a particular period, which makes this technique risky from the spammer's perspective (it will cost him use of his IP address; more on this later).  As a result, most verifiers tend to be custom-built pieces of software which are domain specific and, if they target a large domain, very valuable (prices range in the hundreds of US dollars).  One example of a strategy for verifying email addresses in a very valuable domain is, for, creating multiple dummy AIM accounts and, over the period of a week, noting which addresses on the list come online from an AOL client.  AOL of course has countermeasures in place, and these programs generally don't have a very long effective lifespan.  They also don't need it, as they're largely produced in Eastern Europe where the prospect of thousands of dollars of payoff for selling a successful script can motivate an awful lot of talent.

mailers: Then there are mail agents which actually send you the mail.  A spammer has several operational requirements which are not dissimilar to those of a person running a large mailing list: his software must purge bounced addresses from his list, generate an incredibly large number of emails in a short amount of time, and so forth.  There are other requirements imposed by being in the "biz": the mailer cannot be detectable by spam filters which look for telltale signs of "ratware" (the anti-spam community's derisive label for spamming software).  For example, ratware which imitates a genuine email program (such as Thunderbird) but which places its headers out of order will get its messages almost automatically bounced by systems employing SpamAssassin or similar rule-based techniques.  The mailer must also avoid anti-spam countermeasures such as RBLs (blacklists of known spammer IPs), which generally entails using rotating IP addresses from "bullet-proof hosting" (see below) or distributing the mailing across a botnet (see below).
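For readers who haven't used an RBL from the receiving side, here is a minimal Python sketch of the usual DNS-based lookup: reverse the octets of the connecting IPv4 address, append the blacklist's zone, and treat a successful resolution as "listed".  The zone `dnsbl.example.org` is a hypothetical placeholder, and the injectable `resolve` parameter is my own addition so the logic can be exercised without touching the network.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone="dnsbl.example.org"):
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def is_listed(ip, zone="dnsbl.example.org", resolve=socket.gethostbyname):
    """Return True if the blacklist zone has an entry for this IP.

    `resolve` is any callable that returns an address for a name and raises
    OSError when the name does not exist (socket.gethostbyname does this).
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:
        return False
    return True

if __name__ == "__main__":
    print(dnsbl_query_name("192.0.2.99"))
```

A real deployment would point `zone` at whichever blacklist you actually trust and would usually inspect the returned address as well, since lists often encode the listing reason there.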

botnet: Not technically a piece of software, a botnet is a network of computers which have been subverted by a trojan, virus, or other security exploit.  A computer so afflicted is called a zombie.  Botnets are generally controlled over dedicated IRC channels.  Spammers generally buy access to botnets from virus writers or from others in the spam community who take existing viruses and modify their payloads to include code capable of zombifying the machine.  There exist a variety of open source tools to include an arbitrary payload with a given exploit, greatly decreasing barriers to entry to this market: an example is Metasploit (which is, incidentally, aimed at "white hat" penetration testers; there are much nastier such packages lurking in dark corners of the Internet).

Stay tuned for our next installment, where we cover money, the driving force behind the spam community. 

A bit of an introduction Friday, May 12 2006

My name is Patrick McKenzie and I'm a member of the R&D team at Softopia Japan.  I'm currently working on a year-long project to develop something new and interesting in the field of spam filtering.  This blog is devoted to chronicling that effort and, perhaps, providing a useful resource to the anti-spam community by gathering together research and commentary from around the Internet.  It will also be published in Japanese.

Demo version of the Softopia blog Friday, April 28 2006


Friday, April 28 2006

Hello world! Friday, April 28 2006

Welcome to This is your first post. Edit or delete it and start blogging!