Programmer's Ranch: fetch

Hello!

In the last article, "IMAP: Working with Folders", you may have noticed that hMailServer returns some data about folders you select. For example:

C: 0002 SELECT INBOX
S: * 3 EXISTS
S: * 0 RECENT
S: * FLAGS (\Deleted \Seen \Draft \Answered \Flagged)
S: * OK [UIDVALIDITY 1376229395] current uidvalidity
S: * OK [UNSEEN 2] unseen messages
S: * OK [UIDNEXT 4] next uid
S: * OK [PERMANENTFLAGS (\Deleted \Seen \Draft \Answered \Flagged)] limited
S: 0002 OK [READ-WRITE] SELECT completed

In this article we'll cover what these items mean, and learn how messages are organised in IMAP folders.

Let's start off with the UIDVALIDITY. The UIDVALIDITY is a 32-bit value associated with a folder when it is created. Each time a folder is selected, this UIDVALIDITY value is returned (as above). Normally, this value doesn't change. If it does, it means that the UIDs a client remembers for that folder can no longer be trusted (for example, the folder might have been deleted and recreated with the same name). IMAP clients check this UIDVALIDITY, and if it changes, they are supposed to discard the messages they downloaded for that folder and download the entire folder again from scratch.
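To make the UIDVALIDITY check concrete, here is a minimal Python sketch of how a client might guard its local cache. The cache layout (a dict keyed by folder name) and the function name refresh_cache are made up for illustration; real clients store this differently.

```python
def refresh_cache(cache, folder, server_uidvalidity):
    """Return the cached messages for a folder, discarding them if UIDVALIDITY changed.

    `cache` is a hypothetical dict of the form:
        folder name -> {"uidvalidity": int, "messages": {uid: data}}
    """
    entry = cache.get(folder)
    if entry is None or entry["uidvalidity"] != server_uidvalidity:
        # Unknown folder, or UIDVALIDITY changed: the cached UIDs are useless,
        # so start again with an empty cache for this folder.
        entry = {"uidvalidity": server_uidvalidity, "messages": {}}
        cache[folder] = entry
    return entry["messages"]


cache = {"INBOX": {"uidvalidity": 1376229395, "messages": {1: "first", 2: "second"}}}

kept = dict(refresh_cache(cache, "INBOX", 1376229395))     # same value: cache survives
emptied = dict(refresh_cache(cache, "INBOX", 1376229400))  # new value: cache is dropped
print(len(kept), len(emptied))
```

After the second call the client would have to download the folder again, exactly as described above.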

Email messages in a folder are identified by two different means: unique identifiers (UIDs) and sequence numbers. These are both 32-bit numbers but they work a bit differently. Let's say you have five messages in your INBOX folder:

Seq Number	1	2	3	4	5
UID	1	2	3	4	5

Initially, the sequence numbers and UIDs are the same. But look at what happens when the third message gets deleted:

Seq Number	1	2	3	4
UID	1	2	4	5

See, when a message gets deleted, the sequence numbers are reassigned to fill in the gap (sequence numbers above 3 are decreased by one in this case). So if there are n messages in a folder, sequence numbers run continuously from 1 to n without any gaps. On the other hand, message UIDs never change. If a message is deleted, its UID vanishes with it.
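The two tables above can be reproduced with a few lines of Python. This is just a toy model of a folder (a list of UIDs), not how a server actually stores messages; sequence_map is a name invented for this sketch.

```python
def sequence_map(uids):
    """Map sequence numbers (1..n, no gaps) to the UIDs of the messages in a folder."""
    return {seq: uid for seq, uid in enumerate(sorted(uids), start=1)}


uids = [1, 2, 3, 4, 5]
before = sequence_map(uids)

uids.remove(3)  # the message with UID 3 gets deleted
after = sequence_map(uids)

print(before)  # {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
print(after)   # {1: 1, 2: 2, 3: 4, 4: 5}
```

Notice that the UIDs 4 and 5 are untouched, while the sequence numbers have shifted down to close the gap.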

We have already seen the use of these identifiers in the FETCH command in the recent article "IMAP: Downloading emails". In a FETCH command like this...

0004 FETCH 1 BODY[]

...the '1' represents the sequence number of the message you want to retrieve. You can use UIDs instead, by sending a UID FETCH command:

0005 UID FETCH 1 BODY[]

Back to the SELECT response at the beginning of this article, some things should now be clear. The EXISTS part tells you how many emails are in the folder - it's also the value of the highest sequence number available. UIDNEXT is a value higher than the highest UID in the folder; it is the UID the server predicts (but does not guarantee) it will assign to the next message added to the folder. Section 2.3.1.1 of RFC 3501 explains it pretty well:

   The next unique identifier value is the predicted value that will be
   assigned to a new message in the mailbox.  Unless the unique
   identifier validity also changes (see below), the next unique
   identifier value MUST have the following two characteristics.  First,
   the next unique identifier value MUST NOT change unless new messages
   are added to the mailbox; and second, the next unique identifier
   value MUST change whenever new messages are added to the mailbox,
   even if those new messages are subsequently expunged.

        Note: The next unique identifier value is intended to
        provide a means for a client to determine whether any
        messages have been delivered to the mailbox since the
        previous time it checked this value.  It is not intended to
        provide any guarantee that any message will have this
        unique identifier.  A client can only assume, at the time
        that it obtains the next unique identifier value, that
        messages arriving after that time will have a UID greater
        than or equal to that value.
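As a rough illustration of the note above, a client could remember the UIDNEXT it saw last time and compare it with the current one. The sketch below assumes UIDVALIDITY has not changed in the meantime; the function name is invented, and the tag is left out of the command for brevity.

```python
def new_mail_command(previous_uidnext, current_uidnext):
    """Return a UID FETCH command covering messages added since the last check.

    Returns None if UIDNEXT has not moved, i.e. nothing new was delivered.
    Assumes UIDVALIDITY is unchanged between the two checks.
    """
    if current_uidnext <= previous_uidnext:
        return None
    # Anything new must have a UID greater than or equal to the old UIDNEXT.
    return f"UID FETCH {previous_uidnext}:* (UID FLAGS)"


print(new_mail_command(4, 4))  # None
print(new_mail_command(4, 6))  # UID FETCH 4:* (UID FLAGS)
```

Since new messages may already have been expunged by the time you ask, a careful client would still ignore any returned message whose UID is lower than the old UIDNEXT.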

Now, let's talk about flags. A message can be assigned a set of flags. RFC 3501 predefines the following system flags:

\Seen - if set, message is marked as read
\Answered - if set, message is marked as replied to
\Flagged - if set, message has a special status (e.g. flagged/starred/etc)
\Deleted - more on this in a second
\Draft - if set, message is marked as a draft; clients usually just create a folder for drafts and don't bother with this
\Recent - messages that have been added to a folder are marked as recent; this status is removed once the folder is selected again later

You'll notice that some of the functionality you're used to in Outlook, such as read/unread messages and messages being marked as answered, is supported in IMAP by flags. The \Deleted flag, however, deserves some special attention.

In IMAP, there is no Recycle Bin or Trash folder. Clients emulate Recycle Bin functionality by creating a Deleted Items folder (actual name varies between clients). When messages are deleted from another folder, they are moved to the Deleted Items folder. When they are deleted from the Deleted Items folder, they are deleted permanently.

Messages are deleted in IMAP in two stages. First, they are marked for deletion by setting the \Deleted flag. Then, an EXPUNGE command clears the entire folder of messages marked for deletion. Alternatively, a CLOSE command does the same as the EXPUNGE command but also deselects the folder.

Servers may optionally allow custom flags to be set. These are called keywords; they work like tags and don't start with a backslash (\). Servers that support keywords return a \* as part of the PERMANENTFLAGS line in the SELECT response - you can see this in Gmail:

S: * OK [PERMANENTFLAGS (\Answered \Flagged \Draft \Deleted \Seen \*)] Flags permitted.

Now that I've explained flags, we can understand the remainder of the SELECT response. The RECENT count shows how many messages are marked with the \Recent flag - this may help to indicate any new messages in the folder, although using UIDNEXT is more reliable. The UNSEEN line (optional) gives the sequence number of the first message in the folder that is not marked with the \Seen flag.

The FLAGS line tells you what flags are supported for the selected folder, and the PERMANENTFLAGS line tells you which of them the client can change permanently. If any flags are in FLAGS but not in PERMANENTFLAGS, then they can only be modified temporarily; the old value is seen when a new session is initiated.
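Putting the whole SELECT response together, here is a small Python sketch that pulls out the values discussed so far. It only understands the lines shown at the top of this article (with the S: prefixes removed) and is nowhere near a full IMAP parser; parse_select and the dictionary keys are names chosen for this example.

```python
import re

SELECT_RESPONSE = r"""* 3 EXISTS
* 0 RECENT
* FLAGS (\Deleted \Seen \Draft \Answered \Flagged)
* OK [UIDVALIDITY 1376229395] current uidvalidity
* OK [UNSEEN 2] unseen messages
* OK [UIDNEXT 4] next uid
* OK [PERMANENTFLAGS (\Deleted \Seen \Draft \Answered \Flagged)] limited
0002 OK [READ-WRITE] SELECT completed"""


def parse_select(text):
    """Extract counts, UID values and flag lists from a SELECT response."""
    info = {}
    for line in text.splitlines():
        m = re.match(r"\* (\d+) (EXISTS|RECENT)$", line)
        if m:
            info[m.group(2).lower()] = int(m.group(1))
            continue
        m = re.match(r"\* FLAGS \((.*)\)$", line)
        if m:
            info["flags"] = m.group(1).split()
            continue
        m = re.match(r"\* OK \[(UIDVALIDITY|UNSEEN|UIDNEXT) (\d+)\]", line)
        if m:
            info[m.group(1).lower()] = int(m.group(2))
            continue
        m = re.match(r"\* OK \[PERMANENTFLAGS \((.*?)\)\]", line)
        if m:
            info["permanentflags"] = m.group(1).split()
    return info


info = parse_select(SELECT_RESPONSE)
# Flags listed in FLAGS but not in PERMANENTFLAGS can only be changed temporarily.
session_only = set(info["flags"]) - set(info["permanentflags"])
print(info["exists"], info["unseen"], info["uidnext"], session_only)
```

For the hMailServer response above, session_only comes out empty because both lists contain the same five flags.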

Any data about an email message can be retrieved using the FETCH command, and that includes UIDs and flags:

C: 0009 FETCH 1:* (UID FLAGS)
S: * 1 FETCH (UID 1 FLAGS (\Seen))
S: * 2 FETCH (UID 2 FLAGS ())
S: * 3 FETCH (UID 3 FLAGS ())
S: 0009 OK FETCH completed

This FETCH command is similar to the ones we used before. Instead of a single number, we specified 1:*, which means all messages in the range from 1 to *. The * unintuitively resolves to the highest sequence number in the folder, in this case 3.

The BODY[] field we were using earlier is replaced by (UID FLAGS) here. Although we could fetch UID or FLAGS individually (with or without parentheses), we can specify several fields at once using an IMAP list - a parenthesised, space-delimited sequence of words. In the response lines, the numbers on the left (beside the asterisks) are the sequence numbers, and the items in the outer set of parentheses are key-value pairs.
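If you wanted to read these FETCH response lines in a program, a couple of regular expressions are enough for the simple UID and FLAGS case. This Python sketch is deliberately limited: it assumes a single-line response with the S: prefix removed, and it would not cope with BODY[] data. The name parse_fetch_line is made up for this example.

```python
import re

FETCH_LINE = re.compile(r"\* (\d+) FETCH \((.*)\)$")


def parse_fetch_line(line):
    """Parse a line such as '* 1 FETCH (UID 1 FLAGS (\\Seen))' into a dict."""
    m = FETCH_LINE.match(line)
    if m is None:
        return None
    items = m.group(2)
    # The key-value pairs may come in any order, so search for each one separately.
    uid = re.search(r"UID (\d+)", items)
    flags = re.search(r"FLAGS \((.*?)\)", items)
    return {
        "seq": int(m.group(1)),
        "uid": int(uid.group(1)) if uid else None,
        "flags": flags.group(1).split() if flags else [],
    }


first = parse_fetch_line(r"* 1 FETCH (UID 1 FLAGS (\Seen))")
second = parse_fetch_line(r"* 2 FETCH (UID 2 FLAGS ())")
print(first)
print(second)
```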

We can change flags using the STORE command:

C: 0010 STORE 1 +FLAGS (\Deleted)
S: * 1 FETCH (FLAGS (\Deleted \Seen) UID 1)
S: 0010 OK STORE completed

Like FETCH, the STORE command takes a sequence number, or can take a UID if preceded by the UID keyword. The second parameter (+FLAGS) describes the action to be taken. +FLAGS adds the flags in the last parameter; FLAGS replaces all flags with those in the last parameter; and -FLAGS removes the flags in the last parameter. For example:

C: 0011 STORE 2 FLAGS (\Draft \Flagged \Seen)
S: * 2 FETCH (FLAGS (\Flagged \Draft \Seen) UID 2)
S: 0011 OK STORE completed
C: 0012 STORE 1 -FLAGS (\Seen)
S: * 1 FETCH (FLAGS (\Deleted) UID 1)
S: 0012 OK STORE completed

So now the message flags look like this:

C: 0013 FETCH 1:* (UID FLAGS)
S: * 1 FETCH (UID 1 FLAGS (\Deleted))
S: * 2 FETCH (UID 2 FLAGS (\Draft \Flagged \Seen))
S: * 3 FETCH (UID 3 FLAGS ())
S: 0013 OK FETCH completed
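The three STORE actions behave just like set operations, which the following Python sketch tries to show. It replays the commands from this article on an in-memory model of the folder; apply_store and the folder dictionary are invented for illustration and nothing here talks to a server.

```python
def apply_store(current, action, flags):
    """Return the new flag set after a STORE with the given action."""
    current, flags = set(current), set(flags)
    if action == "+FLAGS":   # add the given flags
        return current | flags
    if action == "-FLAGS":   # remove the given flags
        return current - flags
    if action == "FLAGS":    # replace all flags
        return flags
    raise ValueError(f"unknown STORE action: {action}")


# Sequence number -> flags, as they were before command 0010.
folder = {1: {"\\Seen"}, 2: set(), 3: set()}

folder[1] = apply_store(folder[1], "+FLAGS", ["\\Deleted"])                      # 0010
folder[2] = apply_store(folder[2], "FLAGS", ["\\Draft", "\\Flagged", "\\Seen"])  # 0011
folder[1] = apply_store(folder[1], "-FLAGS", ["\\Seen"])                         # 0012

for seq, flags in folder.items():
    print(seq, sorted(flags))
```

The final state matches the FETCH response above: message 1 has only \Deleted, message 2 has the three flags we set, and message 3 has none.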

When you want to delete messages marked with the \Deleted flag, just send an EXPUNGE or CLOSE command:

C: 0014 EXPUNGE
S: * 1 EXPUNGE
S: 0014 OK EXPUNGE Completed

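The response gives you sequence numbers of deleted messages. These may include duplicates, because as messages are deleted, sequence numbers are decreased as explained earlier. For example, if messages 1 and 2 were both marked for deletion, the server could report * 1 EXPUNGE twice: once message 1 is gone, the old message 2 becomes the new message 1.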
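To see where the duplicate sequence numbers come from, here is a Python sketch that simulates an EXPUNGE on a list of messages. It assumes the server removes messages starting from the lowest sequence number, which is a common approach but an assumption of this sketch; the expunge function is, again, just a model.

```python
def expunge(messages):
    """Simulate EXPUNGE on a list of (uid, flags) tuples in sequence order.

    Returns the remaining messages and the untagged responses a server
    might send, assuming it works upwards from sequence number 1.
    """
    remaining = list(messages)
    responses = []
    seq = 1
    while seq <= len(remaining):
        uid, flags = remaining[seq - 1]
        if "\\Deleted" in flags:
            del remaining[seq - 1]
            responses.append(f"* {seq} EXPUNGE")
            # Don't advance: the next message has just moved into this slot.
        else:
            seq += 1
    return remaining, responses


inbox = [(1, {"\\Deleted"}), (2, {"\\Deleted"}), (3, set())]
remaining, responses = expunge(inbox)
print(responses)   # ['* 1 EXPUNGE', '* 1 EXPUNGE']
print(remaining)   # [(3, set())]
```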

That's all for today! This article explained the various metadata associated with IMAP folders and email messages. It explained the difference between UIDs and sequence numbers, and how to work with flags among other things. I hope you'll come back to learn more! :)

Hello friends!

In yesterday's article, "Email: Protocols and Background", I explained how to set up an email client and server to study the email protocols, and also how to use Wireshark to observe how the clients and servers actually talk to each other using these protocols.

Today we'll begin learning IMAP, and use it to access our inbox and download emails.

As I explained in yesterday's article, IMAP is a protocol made up of text commands. You can open a socket (see "C# Network Programming: Simple HTTP Client"), send commands, and interpret the responses. Without needing to write any code, you can use the telnet program to open a raw connection and work using the command line. Connect to your hMailServer IMAP server using a command like this:

telnet serverIP 143

Obviously, replace serverIP with the actual IP address or hostname of the computer where hMailServer is installed. 143 is the port on which the IMAP server listens by default.

If telnet is not installed, you can install it on Windows 7 as follows. In the Start search box, find Programs and Features and open it. Select "Turn Windows features on or off" (it's a link on the left hand side). In the Windows Features window, tick the "Telnet Client" checkbox and click OK.

With telnet installed, connect to the IMAP server as described above. You should see the initial greeting from the hMailServer IMAP server:

* OK IMAPrev1

Now, type the following command and press ENTER:

01 capability

The response consists of a number of words that describe what functionality the IMAP server supports:

Next, log in to the email account you created as part of yesterday's article. In my case the command looks like this:

02 login user@ranchtest.local pass

The response will tell you whether the login was successful or not.

In order to make this a little interesting, send a few emails to yourself via Thunderbird so you actually have something in your inbox. Next, access the inbox using the following command. The INBOX folder is standard in IMAP and always exists, even if you don't have anything in it.

03 select INBOX

You should be able to see a few things at this stage. First, whenever we send a command, we precede it with something like "01". This is called a tag, and can be any string (different clients use different formats ranging from numbers to random strings). When the server responds to a command, the last line always starts with the same tag as the command - that way a client knows that the response for that command was received.
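Here is a minimal Python sketch of how a client could use tags to know when a response is complete. It works on a plain list of lines rather than a real socket, and both function names are made up for this example.

```python
import itertools


def tag_generator():
    """Yield tags 0001, 0002, ... (any unique string would do)."""
    for n in itertools.count(1):
        yield f"{n:04d}"


def collect_response(lines, tag):
    """Read lines until the one starting with our tag; return (untagged, status)."""
    untagged = []
    for line in lines:
        if line.startswith(tag + " "):
            return untagged, line  # the tagged line ends the response
        untagged.append(line)
    raise ValueError(f"no tagged response found for {tag}")


tags = tag_generator()
next(tags)        # 0001, used for an earlier command
tag = next(tags)  # 0002
command = f"{tag} SELECT INBOX"

server_lines = ["* 3 EXISTS", "* 0 RECENT", "0002 OK [READ-WRITE] SELECT completed"]
untagged, status = collect_response(server_lines, tag)
print(command)
print(untagged, status)
```

In a real client the lines would come from the socket as they arrive, but the idea is the same: keep reading until you see your own tag.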

You'll probably also realise that telnet is a pain in the ass. If you make mistakes you can't backspace - that's because telnet sends everything you type, byte by byte. So each character is sent immediately and can't be undone. Last year I wrote a program called IMAPTalk which makes working with IMAP (and other protocols) much more convenient. Just download it and run it - no installation necessary. Just enter the hostname and connect:

If you turn on Auto-generate tags, you can actually leave out the tags and just type the commands - the tags will be filled in for you by IMAPTalk. If you repeat the commands we did above in IMAPTalk, it looks like this:

Better, no? If you look at the responses (in red), you'll notice there are a bunch of things I haven't explained yet. Don't worry about them just yet. The important thing you should take from the response to the SELECT command is this line:

* 3 EXISTS

That's telling us that there are 3 messages in the INBOX.

We can then retrieve them one by one using a FETCH command, while providing the message number:

04 fetch 1 BODY[]

The response is as follows:

You'll notice that there is a bunch of stuff in there, for such a short message. Part of it (the line starting with * 1 FETCH) is IMAP, as are the last two lines. The stuff in between is the full email in its raw (MIME) format. It consists of a header with a bunch of fields and values (you'll notice important stuff such as From, To, Date, etc), and after a blank line, the message itself. You can see this in Gmail by clicking on the arrow at the top-right of an opened email and clicking "Show original".
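To show the header/body split in practice, here is a short Python sketch that uses the standard library's email module to parse a raw message. The message itself is a made-up sample (only the address comes from this article); a real one fetched with BODY[] would have many more header fields.

```python
from email import message_from_string

# A made-up raw message: header fields, a blank line, then the body.
RAW_MESSAGE = (
    "From: user@ranchtest.local\r\n"
    "To: user@ranchtest.local\r\n"
    "Subject: Test\r\n"
    "\r\n"
    "Hello from the ranch!\r\n"
)

msg = message_from_string(RAW_MESSAGE)
sender = msg["From"]
subject = msg["Subject"]
body = msg.get_payload().strip()  # everything after the blank line

print(sender)
print(subject)
print(body)
```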

You can similarly retrieve the other messages in the inbox by replacing the 1 in the FETCH command with the message number. This is called the message sequence number, and can't be larger than the number of emails in the folder (in this case 3). If you provide a larger number you won't get an error, but you won't get any email data either.

So that's the easiest way to download emails manually using IMAP. We'll learn more about how messages are stored in IMAP folders, and more about the facilities offered by the IMAP protocol, in the coming articles. Come back again for more!

Programmer's Ranch

Gigi Labs

Monday, August 19, 2013

IMAP: Message and Folder Attributes

Monday, August 12, 2013

IMAP: Downloading emails