XiaoPaiShen posted on 2019-11-28 03:47:43

Fetching email bodies and attachments with imaplib

Last edited by XiaoPaiShen on 2019-11-29 06:12

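1. First define a class: Email_Checker
The class constructor connects to the mail server and logs in with the email address and password.
Selecting 'inbox' opens the inbox.
2. A message body fetched over IMAP can be stored in two formats: 'text/plain' and 'text/html'.
I keep both in the mail_content dictionary; if 'text/plain' is present that value is returned, otherwise the 'text/html' value is returned.
3. mail_messages holds every message that matches the filter condition.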
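The plain-or-HTML choice in step 2 could look roughly like the sketch below. The helper names collect_bodies and pick_body are made up for illustration and are not part of the class in this post; the demo message is built locally so the snippet runs without a mail server.

```python
from email.message import EmailMessage


def collect_bodies(message):
    """Return {'plain': ..., 'html': ...} with the first text part of each kind."""
    mail_content = {'plain': None, 'html': None}
    for part in message.walk():
        # Skip containers, non-text parts and anything marked as an attachment.
        if part.get_content_maintype() != 'text':
            continue
        if part.get_content_disposition() == 'attachment':
            continue
        subtype = part.get_content_subtype()  # e.g. 'plain' or 'html'
        if subtype in mail_content and mail_content[subtype] is None:
            payload = part.get_payload(decode=True)  # bytes after transfer decoding
            charset = part.get_content_charset() or 'utf-8'  # assumed fallback
            mail_content[subtype] = payload.decode(charset, errors='replace')
    return mail_content


def pick_body(mail_content):
    """Prefer 'text/plain'; fall back to 'text/html'."""
    if mail_content['plain'] is not None:
        return mail_content['plain']
    return mail_content['html']


# Build a small multipart/alternative message to try it on.
demo = EmailMessage()
demo['Subject'] = 'demo'
demo.set_content('hello in plain text')
demo.add_alternative('<p>hello in html</p>', subtype='html')

bodies = collect_bodies(demo)
print(pick_body(bodies))
```

If a message only carries an HTML part, pick_body returns that instead, which matches the fallback described in step 2.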

import imaplib
import email

from email.header import decode_header


class Email_Checker:
    def __init__(self, email='', password='', server='imap.gmail.com'):
        self.server = server
        self.email = email
        self.password = password
        # connect over SSL and log in
        self.mail = imaplib.IMAP4_SSL(self.server)
        self.mail.login(self.email, self.password)
        # open the inbox
        self.mail.select('inbox')
        # message body in both formats
        self.mail_content = {'plain': None, 'html': None}
        # messages that match the filter
        self.mail_messages = list()
decode_header from email.header is used to decode header values so that attachment file names display correctly.
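    def decode_content(self, content):
        if not content:
            return None
        parts = []
        # decode_header returns a list of (value, charset) pairs
        for value, charset in decode_header(content):
            if isinstance(value, bytes):
                value = value.decode(charset or 'ascii', errors='replace')
            parts.append(value)
        return ''.join(parts)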
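To see what decode_header actually hands back, here is a small standalone sketch. The function name decode_mime_header and the sample file name are made up for the demo; the 'ascii' fallback for chunks without a charset is an assumption.

```python
from email.header import Header, decode_header


def decode_mime_header(raw):
    """Decode an RFC 2047 header value into a plain str."""
    if not raw:
        return None
    pieces = []
    # decode_header returns a list of (value, charset) pairs, one per chunk.
    for value, charset in decode_header(raw):
        if isinstance(value, bytes):
            # charset is None for unencoded chunks of a mixed header
            value = value.decode(charset or 'ascii', errors='replace')
        pieces.append(value)
    return ''.join(pieces)


# An encoded file name, as it might appear in a Content-Disposition header.
encoded = Header('测试.txt', 'utf-8').encode()
print(encoded)
print(decode_mime_header(encoded))
print(decode_mime_header('report.pdf'))
```

A header with no encoded words comes back as a single (str, None) pair, which is why the isinstance check is needed before calling decode.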
Under IMAP, the uid method can be used to search for and fetch messages in the mailbox.
Using UIDs instead of volatile sequential ids
The imap search function returns sequence numbers, meaning number 5 is the 5th email in your inbox.
That means if a user deletes email 10, every email after it shifts down by one, so any sequence number above 10 that you saved earlier now points to the wrong email.
This is unacceptable.

Luckily we can ask the imap server to return a UID (unique id) instead.
The way this works is pretty simple: use the uid function, and pass the name of the command as the first argument. The rest behaves exactly the same.
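As a rough sketch of the uid calls, the two helpers below take any object that exposes imaplib's uid method. The names list_uids, fetch_raw and FakeMail are invented for this example; FakeMail only imitates the shape of the server replies so the snippet runs offline. With a real server you would pass a logged-in imaplib.IMAP4_SSL object that has a mailbox selected.

```python
def list_uids(mail):
    """Return the UIDs of all messages in the selected mailbox as bytes strings."""
    # Same arguments as mail.search(None, 'ALL'), but routed through uid().
    status, data = mail.uid('search', None, 'ALL')
    if status != 'OK':
        raise RuntimeError('UID SEARCH failed: {!r}'.format(data))
    # data is a one-element list such as [b'101 102 205'].
    return data[0].split()


def fetch_raw(mail, uid):
    """Fetch the full RFC822 source of one message by UID."""
    status, data = mail.uid('fetch', uid, '(RFC822)')
    if status != 'OK' or not data or data[0] is None:
        raise RuntimeError('UID FETCH failed for {!r}'.format(uid))
    # data[0] is a (response_header, raw_bytes) tuple.
    return data[0][1]


class FakeMail:
    """Hypothetical stand-in that mimics the reply shapes of imaplib's uid()."""

    def uid(self, command, *args):
        if command == 'search':
            return 'OK', [b'101 102 205']
        return 'OK', [(b'1 (UID 101 RFC822 {5}', b'hello'), b')']


mail = FakeMail()
uids = list_uids(mail)
print(uids)
print(fetch_raw(mail, uids[0]))
```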

Parsing Raw Emails
Raw emails pretty much look like gibberish. Luckily Python has a library for dealing with them called… email.
It can convert raw emails into a Message object (or an EmailMessage if you pass policy=email.policy.default).
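A quick way to try the parser without a server is to feed it a hand-written message. The addresses and text below are made up for the demo.

```python
import email
from email import policy

raw_email = (
    b'From: alice@example.com\r\n'
    b'To: bob@example.com\r\n'
    b'Subject: Weekly report\r\n'
    b'Content-Type: text/plain; charset="utf-8"\r\n'
    b'\r\n'
    b'See the attached file.\r\n'
)

# With the default (compat32) policy the parser returns email.message.Message.
message = email.message_from_bytes(raw_email)
print(type(message).__name__, message['Subject'])

# With policy.default it returns an EmailMessage, which adds helpers
# such as get_content() and decodes headers automatically.
modern = email.message_from_bytes(raw_email, policy=policy.default)
print(type(modern).__name__, modern.get_content().strip())
```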

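    def filter_mails(self, title):
        # uid('search', ...) returns (status, [b'1 2 3']): one bytes string of UIDs
        status, data = self.mail.uid('search', None, 'ALL')
        if status != 'OK':
            return
        email_uids = data[0].split()
        for uid in email_uids:
            status, content = self.mail.uid('fetch', uid, '(RFC822)')
            if status != 'OK' or not content or content[0] is None:
                continue
            # content[0] is a (response_header, raw_bytes) tuple
            raw_email = content[0][1]
            message = email.message_from_bytes(raw_email)
            mail_subject = message['subject']

            # note: a non-ASCII subject arrives encoded and must be decoded before comparing
            if mail_subject == title:
                self.mail_messages.append(message)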

Downloading attachments
    def download_attachment(self, message):
        for part in message.walk():
            # skip container parts and parts without a Content-Disposition header
            if part.get_content_maintype() == 'multipart':
                continue
            if part.get('Content-Disposition') is None:
                continue

            filename = part.get_filename()
            filename = self.decode_content(filename)
            if not filename:
                continue
            print('FileName: ', filename)

            # save attachment
            with open(filename, 'wb') as attach:
                data = part.get_payload(decode=True)
                attach.write(data)
                print("attachment {0} saved".format(filename))
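One thing worth adding when saving attachments: the file name comes from the sender, so it may contain path components. The sketch below is a standalone variant that keeps only the last path component and writes into a chosen directory. The function name save_attachments, the target_dir parameter and the demo message are all made up for illustration.

```python
import os
import tempfile
from email.message import EmailMessage


def save_attachments(message, target_dir):
    """Write every named attachment of message into target_dir; return the paths."""
    saved = []
    for part in message.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        filename = part.get_filename()
        if not filename:
            continue
        # Keep only the final path component so '../x' cannot leave target_dir.
        safe_name = os.path.basename(filename.replace('\\', '/'))
        if safe_name in ('', '.', '..'):
            continue
        path = os.path.join(target_dir, safe_name)
        with open(path, 'wb') as attach:
            attach.write(part.get_payload(decode=True))
        saved.append(path)
    return saved


# Build a message with one attachment to try it on.
demo = EmailMessage()
demo['Subject'] = 'files'
demo.set_content('see attachment')
demo.add_attachment(b'col1,col2\n1,2\n', maintype='application',
                    subtype='octet-stream', filename='../data.csv')

out_dir = tempfile.mkdtemp()
paths = save_attachments(demo, out_dir)
print(paths)
```

Here the attachment named '../data.csv' ends up as data.csv inside out_dir rather than one directory above it.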

Complete program:
**** Hidden Message *****

xyj1997 posted on 2021-5-21 08:01:01

Very good, this solved my problem.