Curbing Junk Email with Ruby

In 2009, I wrote a little Ruby script called poppy.rb to clean out junk email from one of my POP3 accounts.  For reasons of my own, I prune the spam from this account manually rather than using the email provider’s built-in filtering.

The original version of poppy used a sender whitelist to keep emails and a sender blacklist to force immediate removal.  Emails that matched neither list were then deleted based on their size.  At the time, a large email was a pretty good indicator of unsolicited junk mail.
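The filtering order described above could be sketched roughly as follows. The 2009 code is not shown in this post, so the list entries, the size cutoff, and the method name classify are all assumptions made up for illustration.

```ruby
# Hypothetical reconstruction of the original filtering order.
# The addresses and the size threshold are illustrative assumptions.
WHITELIST = ["friend@example.com"].freeze
BLACKLIST = ["spammer@example.net"].freeze
MAX_SIZE  = 50_000 # bytes; an assumed cutoff, not the original value

# Returns :keep or :delete for one message.
def classify(sender, size)
  address = sender.downcase
  return :keep   if WHITELIST.include?(address)   # whitelist wins first
  return :delete if BLACKLIST.include?(address)   # then the blacklist
  size > MAX_SIZE ? :delete : :keep               # otherwise decide by size
end
```

Checking the whitelist first means a trusted sender is never removed, no matter how large the email is.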

Recently, I’ve been dealing with more spam in this same account.  The volume of junk mail grew so large that I decided to dust off the old poppy script to see if the bulk of it could be weeded out of my inbox.

The first order of business was to look for a particular characteristic of these emails.  What I found was that the “from” address domain always seemed to end with the string “.top>”.

Some time ago, I had already pared down poppy.rb so that it no longer used the whitelist / blacklist.  I changed the script so that instead of checking the size of each email, it checks whether the from address ends in “.top>”.
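In isolation, the check amounts to a single string test. This is a minimal sketch; the method name top_domain_sender? and the sample addresses are invented for illustration.

```ruby
# Returns true when a raw "From:" header line ends with ".top>".
def top_domain_sender?(from_line)
  from_line.strip.downcase.end_with?(".top>")
end

puts top_domain_sender?("From: Deals <offers@cheap-stuff.top>")  # true
puts top_domain_sender?("From: Jim <jim@example.com>")           # false
```

Because the pattern includes the closing angle bracket, it only matches addresses written in the `Name <address>` form. A bare address with no brackets would slip through.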

I commented out the logic that actually deletes the entry so that I could be certain no valid emails matched the pattern I was hoping to use.  A cursory run of the script indicated that the pattern in the from address was safe to use.

I was only interested in weeding out the newest spam emails for this account, so I put a safety valve in the code that limits the probing to the last 100 emails received.  The script iterates over the list of POP3 messages in reverse order (newest first).
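The range arithmetic behind that safety valve is easy to get off by one, so here it is as a standalone sketch. The method name newest_message_numbers and its limit parameter are hypothetical. It also assumes that higher POP3 message numbers are newer, which is typical of servers but not something the protocol promises.

```ruby
# Returns the POP3 message numbers to inspect, highest (newest) first.
# POP3 numbering starts at 1, so the lowest number is clamped to 1.
def newest_message_numbers(total, limit = 100)
  return [] if total <= 0
  lowest = [total - limit + 1, 1].max
  total.downto(lowest).to_a
end
```

With 250 messages in the mailbox this yields 250 down to 151, which is exactly 100 message numbers.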

The code now requires configuration entries in a file called poppy.yaml.  The script opens ./poppy.yaml, so it assumes you run it from the directory that contains both poppy.rb and poppy.yaml.

server: your-pop3-server-name
id: your-pop3-id
password: your-password
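A missing key in that file would only surface later as a confusing error while the login commands are being built. One possible guard is sketched below; parse_config and REQUIRED_KEYS are hypothetical names and are not part of poppy.rb.

```ruby
require 'yaml'

# Keys the script reads from the configuration.
REQUIRED_KEYS = %w[server id password].freeze

# Parses YAML text and raises ArgumentError if a required key is missing.
def parse_config(yaml_text)
  cfg = YAML.safe_load(yaml_text)
  raise ArgumentError, "config must be a mapping" unless cfg.is_a?(Hash)
  missing = REQUIRED_KEYS.reject { |key| cfg.key?(key) }
  raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?
  cfg
end

sample = "server: your-pop3-server-name\nid: your-pop3-id\npassword: your-password\n"
cfg = parse_config(sample)
puts cfg["server"]
```

A purely numeric id or password would be read by YAML as an Integer, so such values are safer wrapped in quotes in the file.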

The source code to poppy.rb can be downloaded from here:

http://www.mailsend-online.com/wp/poppy.zip

Here’s poppy.rb

# License: MIT / X11
# Copyright (c) 2017 by James K. Lawless
# http://jiml.us
# See: http://www.mailsend-online.com/license2017.php

   require 'socket'
   require 'yaml'

   def removeSpam
      # load_file opens and closes the file itself, so no handle is left open
      cfg=YAML.load_file("./poppy.yaml")

      sock=TCPSocket.new(cfg["server"],110)

      ["","USER "+cfg["id"], "PASS " + cfg["password"], "STAT", "QUIT" ].each do |msg|
         if msg.length != 0
            sock.send(msg+"\r\n",0)
         end
         s=sock.recv(4000)
         if s.downcase.split(" ")[0] != "+ok"
            # close the connection before giving up on an error reply
            sock.close
            return false
         end
         if msg=="STAT"
            lookThroughEmails(sock,s)
         end
      end
      sock.close
   end

   def lookThroughEmails(sock,status)
      numOfEmails = status.split(" ")[1].to_i
      # Limit the scan to the newest 100 messages (POP3 numbering starts at 1).
      if numOfEmails > 100
         lowEmailCount = numOfEmails - 99
      else
         lowEmailCount = 1
      end
      numOfEmails.downto(lowEmailCount) do |i|
         sock.send("LIST " + i.to_s + "\r\n",0)
         s=sock.recv(4000)
         sz=s.split(" ")[2].to_i
         puts i.to_s
         if checkForSpam(sock,i,sz)
            sock.send("DELE " + i.to_s + "\r\n",0)
            s=sock.recv(4000)
            puts s
            puts "(deleted)"
         end
      end
   end

   def checkForSpam(sock,i,sz)
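      sock.send("TOP " + i.to_s + " 0\r\n",0)
      subject=from=to=replyTo=""
      # Buffer the whole reply: a multi-line POP3 response ends with a line
      # holding only ".", and an error reply is a single "-ERR" line.
      buf=String.new
      until buf.end_with?("\r\n.\r\n") || (buf.start_with?("-") && buf.end_with?("\r\n"))
         s=sock.recv(8000)
         break if s.empty?   # server closed the connection
         buf << s
      end
      buf.split("\r\n").each do |lin|
         break if lin == ""   # blank line marks the end of the headers
         word = lin.downcase.split(" ")[0]
         if word == "subject:"
            subject=lin.downcase
         elsif word == "from:"
            from=lin.downcase
         elsif word == "to:"
            to=lin.downcase
         elsif word == "reply-to:"
            replyTo=lin.downcase
         end
      end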
      if from.end_with? '.top>'
         puts from
         puts subject
         return true
      else
         return false
      end
   end

   removeSpam
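One limitation of the header scan above is that it looks at one physical line at a time. Mail headers may be folded across several lines, so a From header whose address sits on a continuation line would not match. A minimal sketch of a parser that joins folded lines is shown below. The method name parse_headers is hypothetical, and the sample response is invented for illustration.

```ruby
# Parses the raw text of a POP3 "TOP n 0" response into a hash that maps
# lowercase header names to values. Folded (continuation) lines are joined.
def parse_headers(raw)
  headers = {}
  current = nil
  raw.split("\r\n").each_with_index do |line, index|
    next if index.zero? && line.start_with?("+OK")  # skip the status line
    break if line.empty? || line == "."             # end of the headers
    if line.start_with?(" ", "\t") && current
      headers[current] << " " << line.strip         # folded continuation
    elsif line.include?(":")
      name, value = line.split(":", 2)
      current = name.strip.downcase
      headers[current] = value.strip
    end
  end
  headers
end

raw = "+OK top follows\r\nFrom: Deals\r\n <offers@cheap-stuff.top>\r\n" \
      "Subject: Hello\r\n\r\n.\r\n"
headers = parse_headers(raw)
puts headers["from"]
```

With the folded lines joined, the same end_with? test can be applied to headers["from"] after downcasing it.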

Notes:

Use this script (and derivative works) at your own risk!  It may delete emails that are meant to be kept, but meet the junk criteria.

If you terminate the script before completion, none of the deletions take effect: a POP3 server only removes the messages marked with DELE once the session ends with QUIT.  This can be a safety mechanism if you see the script delete an email that was worth keeping.

I did not use a POP3 library when I originally wrote this script; it talks to the server over a plain TCP socket.

I have run this script successfully with Ruby 2.0.0 and JRuby 9.1.8.0.

About Jim Lawless

I've been programming computers for about 36 years ... 30 of that professionally. I've been a teacher, I've worked as a consultant, and have written articles here and there for publications like Dr. Dobbs Journal, The C/C++ Users Journal, Nuts and Volts, and others.