EmailIntegration
Blueprint for Email Integration.
Phase-Design
Introduction
This document provides propositions for replacing the current request mechanism with a real email system. The goal is to store emails in Tryton and to be able to read them with most email clients.
Module Name: electronic_mail
MUA Incoming Protocol
Mail User Agent
According to Wikipedia, there are two popular protocols: POP3 and IMAP4. IMAP4 is chosen because it was designed to leave email on the server and it allows multiple clients to access the same mailbox. The IMAP4 server will be implemented using TwistedMail.
Authentication
For authentication, we need to know against which database the connection must be authenticated. We choose to store the database name in the login, like this: username@databasename.
MDA
Mail Delivery Agent
It will be a simple Python script that works like procmail, but stores the email in Tryton instead of a mailbox.
Storage Structure
Links
Update
This blueprint has been partially implemented as follows:
MDA: consider explicitly using the LMTP protocol to deliver the message from the MDA into your script. See: http://en.wikipedia.org/wiki/LMTP
Storage: add a sha1 for the incoming mail and keep the full original mail, to make sure you have an unmodified copy. Sometimes you want to store the PGP-encrypted original mail and the decrypted mail for working. That could also be achieved with a child-parent feature.
When your mail system grows, it's a must to store parts of a mail outside the database, for example the attachments or the body part. Take a look at single-instance storage to have only one copy of a mail. It's common these days to copy a mail to dozens of colleagues, sometimes with a big drawing, sometimes with that really funny presentation... I know it's bad when you have to work with a 300GB database only for mails.
Filtering: or take care to provide a hook so that a custom module can implement something like Sieve for mail filtering. Maybe that's a good processing language for assigning mails to parties, projects, invoices, ...
Archiving (I know it doesn't sound like a core requirement): if you are thinking about internal-only mails, there should be a way to send that kind of mail to an external archive server.
Again storage, maybe not directly mail-relevant; other modules could benefit from this kind of idea: having an external_id to reference a mail object which has been archived, for example file://path/to/email_blob.xml, http://web-archive/mail-id or net-rpc://other-tryton-server/database/mail/.., and the possibility to recreate the mail object.
Just my 2c, only thoughts for discussion.
tobias
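To make the single-instance storage idea above a bit more concrete, here is a minimal Python sketch. It is only an illustration and not part of the blueprint: the class name `SingleInstanceStore`, the choice of SHA-256 as the key, and the in-memory dict (standing in for a directory outside the database) are all assumptions.

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed store: identical payloads are kept only once.

    Hypothetical helper; a real store would write files outside the database.
    """

    def __init__(self):
        self._blobs = {}     # digest -> payload bytes
        self._refcount = {}  # digest -> number of mails referencing the blob

    def put(self, payload: bytes) -> str:
        """Store the payload if it is new and return its digest as the key."""
        digest = hashlib.sha256(payload).hexdigest()
        if digest not in self._blobs:
            self._blobs[digest] = payload
        self._refcount[digest] = self._refcount.get(digest, 0) + 1
        return digest

    def get(self, digest: str) -> bytes:
        """Return the payload stored under the given digest."""
        return self._blobs[digest]

    def blob_count(self) -> int:
        """Number of distinct payloads actually stored."""
        return len(self._blobs)
```

With this, the same big drawing sent to a dozen colleagues would be stored once, and each mail record would only keep the returned digest.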
LMTP: too complicated, and I don't see any benefit.
sha1: I don't understand.
sha1: to add a checksum to verify the integrity of the mail itself, like Mercurial does.
lmtp: as stated in RFC2033:
The LMTP protocol doesn't look that hard, and it was specially developed for the case where you want to use a command-line script. It's supported by a wide range of MTAs.
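For the sha1 point, a minimal sketch of what is meant, assuming the raw message is available as bytes exactly as it was received. The function names `mail_checksum` and `is_unmodified` are hypothetical and only serve the illustration.

```python
import hashlib


def mail_checksum(raw_mail: bytes) -> str:
    """Return the SHA-1 hex digest of the raw, unparsed mail."""
    return hashlib.sha1(raw_mail).hexdigest()


def is_unmodified(raw_mail: bytes, stored_checksum: str) -> bool:
    """Check a mail read back from storage against its stored checksum."""
    return mail_checksum(raw_mail) == stored_checksum
```

The checksum would be computed once at delivery time and stored next to the mail; any later change to the stored bytes makes `is_unmodified` return `False`.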
sha1: I don't find it useful.
lmtp: it is like SMTP without a queue but with multi-status. It will be more complex than a simple script. And it will always be possible to add an LMTP server on top of the script.
Just a comment: after the implementation of poweremail, something we identified is that the char fields had to be made into text fields in several places, because when there are several email IDs they usually need more space...
So I think to, cc and bcc should be text fields.
Don't worry: in Tryton, Char fields have no size limit by default.
I know a thing or two about imap servers on sql storage (dbmail.org).
I'm looking at the schema:
headers: one2many electronic_mail.header
This will quickly explode! All those 'Received' headers will swamp your database. Also, how many times do you think 'From: postmaster@...' will be in there? This table will grow O(n^2) or worse.
digest: char(32) md5 of email without header depending on receipt like 'X-Original-To', 'Delivered-To'
collision: integer
email: function(binary) email from data_path and header
Looks like you are planning on doing single-instance-storage. But the real meaning of these fields remains opaque. A digest will not prevent collisions on single-instance storage (see Birthday Attack).
Storing the address and subject fields is problematic: If you want to be able to search them, indexing requirements will limit their width. However, the RFCs state that their field content may be virtually unlimited. Also, in actual practice they will often contain binary (8bit) data even when the RFCs actually forbid this. This means blobs. Searching blobs sucks as you know.
You may want to take a good look at dbmail, especially the schema.
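One possible reading of the digest and collision fields quoted above, sketched in Python for discussion. This is an assumption about the intent, not the blueprint's implementation: the names `normalized_bytes` and `store_mail` are hypothetical, and a plain dict stands in for the real storage. The digest is computed after dropping the delivery-dependent headers, and the collision counter separates two different mails that happen to share a digest.

```python
import hashlib
from email import message_from_bytes

# Headers that differ per recipient and are excluded from the digest.
DELIVERY_HEADERS = ("X-Original-To", "Delivered-To")


def normalized_bytes(raw_mail: bytes) -> bytes:
    """Return the mail re-serialized without the delivery-dependent headers."""
    msg = message_from_bytes(raw_mail)
    for name in DELIVERY_HEADERS:
        del msg[name]  # removes every occurrence; no error if absent
    return msg.as_bytes()


def store_mail(store: dict, raw_mail: bytes) -> tuple:
    """Store a mail under (digest, collision) and return that key.

    If the digest already exists, the contents are compared byte for byte:
    identical content reuses the entry, different content gets the next
    collision number.
    """
    content = normalized_bytes(raw_mail)
    digest = hashlib.md5(content).hexdigest()
    collision = 0
    while (digest, collision) in store:
        if store[(digest, collision)] == content:
            return digest, collision
        collision += 1
    store[(digest, collision)] = content
    return digest, collision
```

Under this reading the digest alone is never trusted to identify a mail; the byte comparison is what decides, so a digest collision costs an extra entry rather than a lost message.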
About headers: It seems you have the same in dbmail: https://github.com/pjstevns/dbmail/blob/master/sql/postgresql/create_tables.pgsql#L236. But we can think about storing the raw part of the header in another field and just filling electronic_mail.header with the interesting headers. But indeed I don't see the issue.
About digest: It is the same design as for attachments.
About address and subject: indexing doesn't require a limit (except for MySQL, but there are already a lot of issues with it). So we will use a varchar (without limit). For the 8-bit data, I don't know; perhaps we could store it in base64.
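A minimal sketch of the base64 idea, assuming the raw header value is kept as bytes. The function names `encode_for_storage` and `decode_from_storage` are hypothetical.

```python
import base64


def encode_for_storage(raw_value: bytes) -> str:
    """Encode a raw (possibly 8-bit) header value as ASCII-only base64 text."""
    return base64.b64encode(raw_value).decode("ascii")


def decode_from_storage(stored: str) -> bytes:
    """Recover the original bytes from the stored base64 text."""
    return base64.b64decode(stored)
```

The round trip is lossless, but the stored text can no longer be searched by substring, so a searchable copy of the field would still be needed.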