Development Plan
Phase-Requirements
Updated Feb 4, 2010 by mihai.parparita

Milestones

In Progress

M4

TODO

  • More code comments
  • Cache MessageInfo objects instead of raw IMAP replies, to speed up replay
  • Reduce memory consumption
  • Wiki page outlining basic design
  • Total unique recipients, senders, and lists
  • Refactor jwzthreading.py to not run into recursion limits
  • Combine recipients/senders based on --me input
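
For the jwzthreading.py recursion item above, one possible direction is to count a container's descendants with an explicit stack instead of a recursive __len__. The sketch below is only an illustration: the Container class is a simplified, hypothetical stand-in rather than the project's actual code, and it assumes the length is meant to be the node itself plus all of its descendants.

```python
class Container:
    """Simplified stand-in for a thread-tree node (hypothetical, not the project's class)."""

    def __init__(self, message=None):
        self.message = message
        self.parent = None
        self.children = []

    def add_child(self, child):
        # Compare against None explicitly so that truth-testing the parent
        # never has to call __len__ on it.
        if child.parent is not None:
            child.parent.children.remove(child)
        self.children.append(child)
        child.parent = self

    def __len__(self):
        # Walk the subtree with an explicit stack instead of recursing, so
        # the depth of a thread is not bounded by the interpreter's
        # recursion limit.
        count = 0
        stack = [self]
        while stack:
            node = stack.pop()
            count += 1  # assumption: every node counts, including this one
            stack.extend(node.children)
        return count
```

With this shape, a reply chain several thousand messages deep can be measured with len() without raising a recursion error. Raising the limit with sys.setrecursionlimit would be a quicker workaround, but it only moves the ceiling.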

DONE

  • Run on Enron corpus and upload results
  • Add JS obfuscation for printed email addresses
  • Tarball for downloads

Upcoming

M5

  • Top N tables of domains for senders, recipients
  • Mailbox size over time
  • Support non-Gmail servers (go through all mailboxes instead of just All Mail)
  • Split out sent mail, starred, etc.
  • Break down by all mail vs. label
  • X-mailer distribution
  • Attachment extension distribution

Done

M0

Finished on 12/25/2007

  • Fetch mail headers for all mail
  • Fetch labels for all mail
  • Record/replay support for FETCH to speed up development
  • Optimize StringScanner
  • Chart with messages by day of week
  • Chart with messages by time of day
  • Chart with messages per year
  • Chart with messages per month
  • Chart with messages per day
  • Column layout

M1

Finished on 1/1/2008

  • Table with top recipients (messages and bytes)
  • Add tabs (date, size, sender, recipient)
  • Table with top senders (messages and bytes)
  • Table with top list-ids (messages and bytes)
  • Title with total counts, date range
  • Size distribution
  • Table with top messages by size
  • Improve SubjectSenderFormatter (max length/clipping, better from name extraction, tooltip with email address)
  • Dividers between years in month drop-down
  • Skip over empty stats in stat collections (e.g. months with no data)

M2

Finished on 1/21/2008

  • Handle encoded names/subjects
  • Linkify messages/senders/recipients to searches
  • Normalize +addresses
  • Remove "All Mail" from all stat titles
  • Thread list stats
  • Instead of using longest name for an address, use the most common
  • Thread length stats
  • Thread sender stats
  • Construct threads from in-reply-to
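
As an illustration of the "Normalize +addresses" item above, here is a minimal sketch. It assumes that normalizing means dropping a "+tag" suffix from the local part and lowercasing the result, so that user+lists@example.com and user@example.com are counted as one address. The function name is hypothetical and is not taken from the project's source.

```python
def normalize_plus_address(address):
    """Drop a "+tag" suffix from the local part and lowercase the address."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep:
        # No "@" found: nothing sensible to split, so return it lowercased.
        return address.strip().lower()
    # Keep only the part of the local name before the first "+".
    base = local.split("+", 1)[0]
    return (base + "@" + domain).lower()
```

For example, normalize_plus_address("Jane.Doe+lists@Example.com") gives "jane.doe@example.com".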

M3

Finished on 3/16/2008

  • Table with top senders to me
  • Table with top recipients from me
  • Allow "me" email addresses to be specified
  • Allow things to be excluded
  • Filled graph of senders
  • Filled graph of recipients
  • Filled graph of lists
  • Split up stats.py
  • Add support for secure password entry (getpass module)
  • Split up large threads that rely purely on subjects
  • Getting started wiki page
  • Better progress in output (when fetching a chunk, say how many are left)
  • Link to SVN log feed
  • Distribution of senders to me
  • Distribution of recipients from me

Comment by diwakerg...@gmail.com, Mar 27, 2008

Why not abstract out the IMAP functionality into a generic MailSource abstraction? A lot of people archive mail locally in mbox or MailDir, and it will be much easier/faster to run this over local storage than going through IMAP.

Comment by unodew...@gmail.com, Mar 27, 2008

Is it not possible to get some kind of hosted, pre-installed version of this? It would be pretty cool, as I can't install Python and don't know the first thing about code. Thanks.

Comment by adewale, Mar 27, 2008

Can this be back-ported to Python 2.4 so that those of us with corporate machines can use it? Thanks.

Comment by beg...@gmail.com, Mar 28, 2008

A simple .exe would be amazing. I have played with Python, Cheetah, Monkey and Squirrel for hours. I have no idea what I am doing and can't get anything to work.

A simple version would spread like wildfire, I'm sure.

THANKS!!!!!

Comment by daviddu...@gmail.com, Mar 31, 2008

This is freaking cool... What a geek I am. Up and running in about 30 seconds on my Mac and almost done downloading now. :) Thanks.

Comment by oscar...@gmail.com, Apr 9, 2008

One thing that could be very nice is to create a web app with this functionality, for example using the brand new Google App Engine: a simple interface where you enter your email address, the same Python code behind it, and something like the Google Chart API to show the results (but I lack the knowledge to do it myself). Anyone??

Comment by adam.dav...@gmail.com, Apr 16, 2008

It would be great to track information about my average response times to emails. Specifically, this could be displayed as a graph of the distribution of bucketed response times (e.g. with buckets <1h, 1-4h, 4-8h, 8-24h, 1-3d, >3d).
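
The bucketing suggested in this comment could be sketched roughly as follows. This is a hypothetical illustration, not part of the project: the function names are made up, the delays are assumed to be already available as timedelta values, and a delay that falls exactly on a boundary is assumed to belong to the longer bucket.

```python
import bisect
from collections import Counter
from datetime import timedelta

# Boundaries between buckets, in increasing order, matching the labels
# proposed in the comment.
BOUNDS = [
    timedelta(hours=1),
    timedelta(hours=4),
    timedelta(hours=8),
    timedelta(hours=24),
    timedelta(days=3),
]
LABELS = ["<1h", "1-4h", "4-8h", "8-24h", "1-3d", ">3d"]


def bucket_for(delay):
    """Return the label of the bucket that a response delay falls into."""
    # bisect_right counts how many boundaries are <= delay, which is
    # exactly the index of the matching label.
    return LABELS[bisect.bisect_right(BOUNDS, delay)]


def response_histogram(delays):
    """Count how many response delays fall into each bucket."""
    return Counter(bucket_for(d) for d in delays)
```

Feeding the resulting counts, in LABELS order, into a bar chart would give the distribution the comment describes.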

Comment by mccormac...@gmail.com, Jun 23, 2008

Please, an OS X executable! I really want this to work!

Comment by nateja...@gmail.com, Dec 21, 2009

Hi Mihai- this is really a great script. I ran it a year ago and it worked fine, but I think now that my Gmail box is significantly larger, I'm running into the max recursion problem that you mentioned as being an "in progress" task for M4. Has there been any more progress on that?

Here is the traceback:

[2009-12-21 01:16:12,422] Logging out
[2009-12-21 01:16:12,798] Identifying "me" messages
[2009-12-21 01:17:15,812] 27905 messages are from "me"
[2009-12-21 01:17:15,812] 64515 messages are to "me"
[2009-12-21 01:17:15,837] Extracting threads
Traceback (most recent call last):
  File "main.py", line 254, in <module>
    threads = ExtractThreads(message_infos)
  File "main.py", line 162, in ExtractThreads
    thread_dict = jwzthreading.thread(thread_messages)
  File "/Users/nateaune/code/mail-trends/jwzthreading.py", line 180, in thread
    prev.add_child(container)
  File "/Users/nateaune/code/mail-trends/jwzthreading.py", line 32, in add_child
    if child.parent:
  File "/Users/nateaune/code/mail-trends/jwzthreading.py", line 54, in __len__
    count += len(c)

(this goes on for awhile...)

  File "/Users/nateaune/code/mail-trends/jwzthreading.py", line 54, in __len__
    count += len(c)
RuntimeError: maximum recursion depth exceeded while calling a Python object

Comment by rma...@gmail.com, Jan 18, 2010

Great program! Can you make it use my contacts to combine email addresses into a single user?

Comment by gokhanse...@gmail.com, Sep 8, 2010

Hey, could you create statistics for word count and longest post listings?

