Peeking into msg files - revisited
A reader asked how I knew that stream 53, mentioned in the diary entry "Peeking into msg files", contained the body of the email.
At that time, it was just trial and error. Since then, with the information posted by readers, I was able to make more sense of the different streams, and I developed a plugin for oledump to help with the analysis of MSG files: plugin_msg.py.
An MSG file is a "Compound File Binary Format" file, or what I like to call an OLE file. OLE files can be analyzed with my tool oledump.py:
The second column is the size of the stream. Back then, I just peeked into the larger streams (3, 15, 53, 54) and discovered that the email body was inside stream 53.
With the information posted by readers, I was able to make more sense of this data. The third column is the stream name. The hexadecimal number at the end of the stream name tells me what the stream contains and how it is encoded.
0x1000: Message body <- This is the message body
Stream 53 has name __substg1.0_1000001F, and with this I know that it contains the message body (1000) and that it is UNICODE text (001F).
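To illustrate how such a stream name splits into its two parts, here is a minimal Python sketch. It is not the code of plugin_msg.py: the function name describe_stream and the small lookup tables are assumptions made for this example, and the tables list only a handful of well-known values. The first four hexadecimal digits are treated as the property ID, the last four as the property type.

```python
import re

# Illustrative lookup tables (assumption: a tiny subset, not the plugin's list).
PROPERTY_IDS = {
    0x0037: "Subject",
    0x1000: "Message body",
    0x1013: "HTML body",
}
PROPERTY_TYPES = {
    0x001E: "8-bit text",
    0x001F: "UNICODE text (UTF-16LE)",
    0x0102: "Binary data",
}

# Matches names such as __substg1.0_1000001F, also when prefixed by a storage path.
STREAM_NAME = re.compile(r"__substg1\.0_([0-9A-Fa-f]{4})([0-9A-Fa-f]{4})$")


def describe_stream(name):
    """Return (property ID, property type, ID label, type label), or None."""
    match = STREAM_NAME.search(name)
    if match is None:
        return None
    prop_id = int(match.group(1), 16)
    prop_type = int(match.group(2), 16)
    # Unknown values get a question mark as description.
    return (
        prop_id,
        prop_type,
        PROPERTY_IDS.get(prop_id, "?"),
        PROPERTY_TYPES.get(prop_type, "?"),
    )


if __name__ == "__main__":
    print(describe_stream("__substg1.0_1000001F"))
```

For __substg1.0_1000001F, this yields property ID 0x1000 (message body) and property type 0x001F (UNICODE text).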
This information is used in plugin_msg to identify the different streams:
The plugin analyses the name of each stream, and presents the decoded information together with the beginning of the content of the stream.
To view just the output of the plugin, without the output of oledump, I use option -q:
Stream names that are not recognized by the plugin have a question mark (?) as description. To display only known streams, I use plugin option -k:
And with this output, it's easy to see that the message body is in stream 53, that it is UNICODE, and that it starts with "Dear Sir,".
I can now select stream 53 and display it as UNICODE, like this:
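Outside of oledump, the same decoding step can be sketched in a few lines of Python. This is an illustration, not the oledump command: the function names decode_unicode_stream and read_message_body are hypothetical, and read_message_body assumes that the third-party olefile package is installed. Streams of type 001F hold UTF-16 little-endian text, so that is the codec used here.

```python
def decode_unicode_stream(data):
    """Decode the raw bytes of a 001F stream as UTF-16LE text."""
    # Trailing NUL characters, if present, are dropped.
    return data.decode("utf-16-le", errors="replace").rstrip("\x00")


def read_message_body(msg_path, stream_name="__substg1.0_1000001F"):
    """Read and decode the message body stream of an MSG file.

    Assumption: the third-party olefile package is available.
    """
    import olefile  # imported here so the decoder above works without it

    ole = olefile.OleFileIO(msg_path)
    try:
        return decode_unicode_stream(ole.openstream(stream_name).read())
    finally:
        ole.close()


if __name__ == "__main__":
    # Bytes as they would appear at the start of such a stream.
    sample = "Dear Sir,".encode("utf-16-le")
    print(decode_unicode_stream(sample))
```

The stream is addressed by name rather than by the index 53, since the index is specific to oledump's listing of this particular file.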
Didier Stevens
Senior handler
Microsoft MVP
blog.DidierStevens.com DidierStevensLabs.com