I’ve removed all the books from my Amazon wishlist. If you want to buy me books, please do it through Community Bookstore in Park Slope, Brooklyn. I’ll have details to follow.

Community Book Store Of Park Slope
143 7th Avenue
Brooklyn, NY
(718) 783-3075

I’m looking for
Silence on the Wire: A Field Guide to Passive Reconnaissance and Indirect Attacks by Michal Zalewski
ISBN: 1593270461

Rootkits: Subverting the Windows Kernel by Greg Hoglund, Jamie Butler
ISBN: 0321294319

Zero Day Exploit: Countdown to Darkness by Rob Shein, David Litchfield, Marcus Sachs
ISBN: 1931836094

Stealing the Network: How to Own an Identity by Raven Alder, Chris Hurley, Tom Parker, Ryan Russell, Jay Beale, Riley Eller, Brian Hatch, Jeff Moss
ISBN: 1597490067

I got this one for myself. =)

Just tell them it’s a gift for Francis Gulotta or post a comment about the book (anonymously?) or.. I’ll work out a way to make wish lists work with them. Just buy books from your local sellers, folks. >_< -Francis

The spam we get is of biblical proportions

Spam is a problem, and at my day job it seems to be an ever-increasing one. We have two main methods of blocking spam, but let’s start at the beginning.

Our server accepts an email to a valid address. (And tells the sending server to bugger off if there’s no account here by that name.) It’s scanned for viruses and email client exploits; we also block encrypted archives, scan inside archives, and scan for macros in Microsoft Office documents. This stops a great deal of mail but not the majority of it. Next it’s put through a gauntlet of checks to make sure it’s not spam. There are 12 modules that check the message, a collection of blacklists and whitelists (sorta greylisting) that can reject, accept, or pass on a message and send it down the chain. We have a large collection of manually added whitelisted addresses and automatically entered whitelisted addresses. The mail system scans both incoming and outgoing mail for both content (Bayesian) and email addresses, so if you email someone, the reply won’t get caught up as spam.
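To make the "reject, accept, or pass it down the chain" idea concrete, here is a minimal sketch in Python. It is not the actual mail system: the `Verdict` names, the two example checks, and the `note_outgoing` helper are all assumptions made up for illustration, and a message that every check passes on is simply accepted here.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"   # stop the chain, deliver the message
    REJECT = "reject"   # stop the chain, treat as spam
    PASS = "pass"       # no opinion, hand the message to the next check


def whitelist_check(whitelist):
    """Build a check that accepts mail from known-good senders."""
    def check(msg):
        return Verdict.ACCEPT if msg["sender"] in whitelist else Verdict.PASS
    return check


def blacklist_check(blacklist):
    """Build a check that rejects mail from known-bad senders."""
    def check(msg):
        return Verdict.REJECT if msg["sender"] in blacklist else Verdict.PASS
    return check


def note_outgoing(whitelist, recipient):
    """Hypothetical auto-whitelisting: remember who we emailed so replies get in."""
    whitelist.add(recipient)


def run_chain(msg, checks, default=Verdict.ACCEPT):
    """Run checks in order; the first one with an opinion decides."""
    for check in checks:
        verdict = check(msg)
        if verdict is not Verdict.PASS:
            return verdict
    return default


whitelist = {"friend@example.com"}
blacklist = {"spammer@example.net"}
checks = [whitelist_check(whitelist), blacklist_check(blacklist)]
```

Because the whitelist check sits first, an address added by `note_outgoing` is accepted before any later check gets a say, which is the "your reply won't get caught" behavior described above.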

We also employ real-time blacklists (or RBLs), which check the IP and hostname of the server sending the message (with a DNS lookup) against several databases (we’re using 3 different sources right now) of IPs flagged for spam and abuse. These services have vast networks that just receive spam and virus attacks, then log and identify them so other people can block them.
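For the curious, this is roughly what an RBL lookup looks like from the client side. By the usual DNS blacklist convention, you reverse the octets of the sender's IPv4 address, append the list's zone, and do an ordinary DNS lookup: an answer means "listed", no such name means "not listed". The zone `dnsbl.example.org` below is a placeholder, not one of the three sources mentioned above, and the function names are made up for this sketch.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up, e.g. 192.0.2.1 -> 1.2.0.192.<zone>."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a bad address
    reversed_octets = ".".join(str(b) for b in reversed(addr.packed))
    return reversed_octets + "." + zone


def is_listed(ip, zone):
    """Return True if the blacklist zone has a record for this IP.

    Any resolution failure (including a network problem) is treated as
    "not listed" in this sketch; a real filter would want to tell those apart.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except OSError:
        return False
    return True


# Placeholder zone; substitute whichever blacklist you actually use.
print(dnsbl_query_name("192.0.2.1", "dnsbl.example.org"))
```

Checking several lists is just a loop over zones, rejecting (or scoring) the message if any of them answers.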

We have a spam cache; we don’t send the spam to our users’ junk folders, since most of them don’t want to see it. Out of the past 12,746 messages, 7,246 were caught by the RBLs and 5,220 by the Bayesian content filtering. And yet, still many get through. We catch about 2,000 spam a day, and I think that’s only about 95% of it. Work that out over 70-some-odd users and that’s about 30 spam per person per day. Which sounds about average.
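The back-of-the-envelope math can be checked with a few lines of Python. The inputs are the figures quoted above; treating "about 95%" as exactly 0.95 and "70 some odd" as exactly 70 are assumptions of this sketch, so the outputs are ballpark numbers only.

```python
total_messages = 12_746
caught_by_rbl = 7_246
caught_by_bayes = 5_220

# Share of the 12,746 messages attributed to each filter.
rbl_share = caught_by_rbl / total_messages
bayes_share = caught_by_bayes / total_messages
# Messages not attributed to either filter in the figures above.
unaccounted = total_messages - caught_by_rbl - caught_by_bayes

caught_per_day = 2_000
catch_rate = 0.95   # assumption: "about 95%" taken literally
users = 70          # assumption: "70 some odd" taken as 70

caught_per_user = caught_per_day / users
estimated_total_per_day = caught_per_day / catch_rate
estimated_missed_per_day = estimated_total_per_day - caught_per_day

print(f"RBL share: {rbl_share:.1%}, Bayesian share: {bayes_share:.1%}")
print(f"Not attributed to either: {unaccounted}")
print(f"Caught per user per day: {caught_per_user:.1f}")
print(f"Estimated missed per day: {estimated_missed_per_day:.0f}")
```

So the RBLs account for a bit over half of the catches, the per-user figure comes out just under 30 a day, and a 95% catch rate would still leave on the order of a hundred spam a day slipping through.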

But biblical?


SUBJECT: And Saul as I have gone out, of Jephunneh.  And thy God, surely there
MESSAGE: to meet with water: Which the reward for thou, was under the land slew

SUBJECT: certain woman when the became a word shall be kept the flock,
MESSAGE: vessels: thereof, three hundred made before thee, and took Ishmael

SUBJECT: no; more, to his days and the great price.  But thou
MESSAGE: promised to the posts thereof, are all the men and when they have left

SUBJECT: Jekamiah, and is thou in peace from Assyria; have access
MESSAGE: shall the son that remain, in it If not on the rough wind into

That’s not even a little of it. Eventually they started coming in with the quote and image spam: pictures with text advertising drugstores and stock tips. The first messages were to soften up our spam filter to let the other ones through. Go figure.


P.S. I get about 150–200 messages a day over a handful of accounts, and my OS X Mail app misses about.. 20 or so. While its % caught is not very high, it’s a lot lower volume than at my 9-5.

Ikea Lamp

One of my favorite directors (and I don’t really care about directors) is Spike Jonze. He’s mostly done music videos and commercials. I just found out that he did an Ikea ad that I found quite funny. So in honor of that…

Ikea Lamp Ad

I was told that link doesn’t work (stupid QuickTime); it should work now (stupid AVI).
And while.. I have problems doing this, I can recognize that this will be the easiest way to watch the movie for most people…

And when searching for that, I found this sad spoof.

Hits, speed, space and bandwidth.

I’ve written about this before.. I should probably search for it..
And to clarify, hits means unique hits, different ips.

I had reported that I get about 250 unique hits a day, mostly from a video I posted. (About 4 out of 5 of the hits were for the video as opposed to the main page.) And so today I checked back, and I’m now getting on average 500 hits a day.
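Since "hits" here means distinct IPs, counting them boils down to collecting the client address from each request into a set. The sketch below assumes a web server access log where the first whitespace-separated field on each line is the client IP (as in the common log format); the sample lines and the `unique_hits` name are invented for illustration, not taken from the real stats.

```python
def unique_hits(log_lines):
    """Count distinct client IPs, assuming the IP is the first field of each line."""
    ips = set()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            ips.add(fields[0])
    return len(ips)


# Made-up sample: three requests from two different addresses.
sample_log = [
    '192.0.2.10 - - [01/Jan/2007:10:00:00 -0500] "GET /video HTTP/1.1" 200 1234',
    '192.0.2.10 - - [01/Jan/2007:10:00:05 -0500] "GET / HTTP/1.1" 200 567',
    '198.51.100.7 - - [01/Jan/2007:10:01:00 -0500] "GET /video HTTP/1.1" 200 1234',
]

print(unique_hits(sample_log))
```

Filtering the lines by requested path first would give the "hits for the video versus the main page" split the same way.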


Well, all this was prompted by me browsing my stats again (same thing happened last time), so let’s give some stats.

Emo Kid is posted on about 4 or 5 MySpace pages. That gets me about 30-40 hits a month.
People like to hear long lists of Hobo Names, about 30-40 of you a month.
I’m glad to see that someone found my favorite mashup of all time, the Toxic Loveshack, and shared it with their friends, who share it with about 20 different people a month.
That stupid car commercial is still up there at 91 hits a month.

The top two requested pages are the “cool” archive page and the robots.txt which is well.. unimpressive. In fact while my traffic may have doubled, now almost half of it is web robots!

So while there are more people viewing my site, I have a lot more robots visiting too. Indexing and categorizing and all sorts. And hopefully, determining that I’m just a stupid blog and only relevant to specific searches. The top search that finds roborooter is “volvo logo” followed by “ghosts” and then “emo kid”. (Wtf emo kid?) I take pride in not incessantly repeating other people’s posts and linking to them. Which is what most blogs seem to do these days.

The Gallery should run marginally faster now. If I get some free time I’m going to try out eAccelerator and compile it for DreamHost (they run my servers), as that should give it quite a boost and make it.. well, responsive. I also let Google back in. I had them blocked at the IP level because they were sending my processor usage through the roof while they indexed me. But I signed up with Google’s webmaster services and told them to take it slow, and that my gallery needs them to be sweet, kind and loving. Otherwise it will freak out and I’ll get emails saying I’m using too much of the processor. Also, the pages get cached now, but they have to be visited at least once to cache, and since there are so many pictures, when Google indexes me it hits thousands of pictures and generates a cache for each one. So when you go to visit that obscure photo later you’ll get the cache and it will load faster and not use much processor, but it doesn’t protect from Google until next time.
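The "has to be visited once before it's cached" behavior is the classic cache-on-first-request pattern. Here is a tiny sketch of it in Python; the gallery itself is not written this way, and `PageCache` and its `render` callback are hypothetical names used only to show why a crawler's first pass is expensive and later visits are cheap.

```python
class PageCache:
    """Cache rendered pages in memory, rendering each path only on first request."""

    def __init__(self, render):
        self._render = render   # the expensive function that builds a page
        self._pages = {}
        self.renders = 0        # how many times we actually did the expensive work

    def get(self, path):
        if path not in self._pages:
            # First visit: pay the full rendering cost and remember the result.
            self.renders += 1
            self._pages[path] = self._render(path)
        # Every later visit is served from the stored copy.
        return self._pages[path]


cache = PageCache(lambda path: "<html>photo page for " + path + "</html>")
cache.get("/gallery/obscure-photo")   # rendered (say, by a crawler)
cache.get("/gallery/obscure-photo")   # served from cache (a later visitor)
print(cache.renders)
```

A crawler walking thousands of never-visited pages hits the slow path every time, which is why the cache helps visitors afterwards but does nothing for the crawl itself.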

I also continue to have plenty of bandwidth and server storage to spare (thanks to DreamHost). I think I’m gonna try making another gimmick site (I had a link exchange for Kings of Chaos a while back, but I’m officially retiring that today), maybe “am I shoopdawoop or not” or something…

Anyway, thanks for partaking in another one of my meandering “Francis dumps his mind” posts. No plan, no process, just writing. Just like they taught me in school. (I swear to god.)