How Microsoft Word Stores Hidden Information About You

Author: Jack Molisani

Warning: Microsoft® Word files contain hidden information about you, your company and the subject you are writing about. Anyone who receives a copy of your document can see this information.

Statistically, about half of all STC (Society for Technical Communication) members in the U.S. have a degree in English, Journalism, or some other writing-related field. The other half have degrees in other fields and transitioned into technical writing. I’m one of the “other” people: my degree is in Computer Engineering from Tulane University.

While my engineering background certainly aided me as a technical writer, it also enabled me to do some interesting side-jobs along the way. One of my professors from Tulane started a computer forensics consulting firm a few years ago, and recently she had more projects to do than people to do them.

(For those of you who don’t watch one of the many TV shows on crime scene investigation, “computer forensics” is the field in which a technician searches computers for evidence of wrongdoing.)

When she offered to train me on the latest tools and techniques in computer forensics so I could help her meet some delivery deadlines, I jumped on the opportunity. Somehow I knew this would be more interesting than typing in review comments from subject matter experts!

One of the first things I learned in computer forensics is that operating systems and software applications record incredible amounts of information about user activity. This information is stored in various places (operating system files, application data files, etc.), and the information can be viewed if you know where to look.

While it takes special forensic tools to access most of this information, some of it is in plain view and can be seen without special tools. This article is about one of the “plain view” instances: information Microsoft Word saves about you, your company and the topic you are writing about, all of which can be seen by anyone who has access to your document.

About Metadata

Microsoft Word® saves information about a document in addition to the actual contents of the document. This additional information (called metadata, from the Greek meta meaning “higher, beyond”) includes:

  • Who created the document
  • On which machine it was created
  • Who edited the document
  • Whether the document was saved under a different name

To see a simple example, open a Microsoft Word® file and select Properties from the File menu (File | Properties). A dialog will appear showing some of this information.

A document written on a corporate PC might display more information, such as the company name, the name of a corporate template (if any), etc.

Go ahead and try this and see what your documents contain.

Accessing Hidden Information

To see the rest of the hidden information stored in a Word® file, do the following:

1. In Microsoft Word®, select File | Open…. The Open dialog will appear.

2. From the Files of type drop-down list, select Recover Text from Any File (*.*) and then select and open a Word® document.

3. When the file opens, page down to see all the metadata:

In the example, you can see the name the document originally had (“Administrative details 305 198.doc”) and where it was located (on a machine named “Johnette Hassell9”).

It was then saved under a new name (“Administrative details 305.doc”) in a folder on a different machine (“E:\cs305.fall.01” on a machine named “hassell0”).

There is more information you can recover, but this gives a good example of the type of data Microsoft Word® stores.
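
If you would rather script this than page through Word, the same kind of scan can be approximated in a few lines of Python. This is a minimal sketch, not what Word does internally: the function name extract_strings and the six-character minimum are my own choices, and it assumes the older binary .doc format, in which text is stored uncompressed either as single-byte characters or as UTF-16LE (each character followed by a zero byte).

```python
import re


def extract_strings(data: bytes, min_len: int = 6) -> list[str]:
    """Return runs of printable text found in raw bytes, in file order.

    Looks for single-byte ASCII runs and for UTF-16LE runs (a printable
    character followed by a zero byte). This approximates a "strings"
    scan; it is not Word's own recovery logic.
    """
    ascii_run = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    utf16_run = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = []
    for match in ascii_run.finditer(data):
        found.append((match.start(), match.group().decode("ascii")))
    for match in utf16_run.finditer(data):
        found.append((match.start(), match.group().decode("utf-16-le")))

    found.sort(key=lambda item: item[0])  # put both kinds back in file order
    return [text for _, text in found]
```

To try it on a copy of one of your own documents, read the file with pathlib.Path("copy-of-document.doc").read_bytes() (a placeholder file name) and pass the result to extract_strings. Expect a lot of noise along with the author names, machine names and paths.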

Should You Care?

While you may not care if anyone knows how many times you saved a document or the name of the last printer on which it was printed (both are shown in the example), I’ll bet you can think of several examples of information that could be stored in metadata that you wouldn’t want competitors or others to have.

Let’s look at an example from a real forensic case.

A company suspected that an employee was taking documents home that contained trade secrets and selling the secrets to a competitor. The company was granted a court order to image (make an exact bit-by-bit copy of) the hard drive on the employee’s home computer, and the company turned the copy over to us for forensic analysis.

While the employee claimed he never took documents home, here is what we found:

On the employee’s home computer was a document named “How To, Chapter 1.doc”. We opened the document and saw the following metadata:

Employer\Iran Project\Process Manual\Section 1.doc
A:\New Manual
Bob Smith7C:)\Documents\Iran Project\User Manual\Section 1.doc
Preferred Customer
New Dell User
C:\Indonesia\How To, Chapter 1.doc

Each new piece of information is appended at the bottom of the metadata, so you read the history from the top down. Looking at the above, you can see:

1. The employee opened the document “Section 1.doc” on his machine at work. (Name changed to “Employer” for this article.)

2. He then saved the document to a diskette in drive A:

3. Then he saved the document on his home machine under the name “How To, Chapter 1.doc”. (A forensic tool showed that the employee never changed the default name on his home PC so it still showed “Preferred Customer.”)

Pretty incriminating, huh…?
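
The top-down reading above is easy to mimic once the strings are in a list. The sketch below is only an illustration: split_history is a hypothetical helper, and it rests on the assumption that any recovered entry containing a backslash is a path fragment, while everything else (user and company names, for example) is not.

```python
def split_history(recovered_lines):
    """Separate path-like entries from other recovered strings, keeping order."""
    paths, others = [], []
    for line in recovered_lines:
        entry = line.strip()
        if not entry:
            continue
        # Assumption: an entry containing a backslash is a Windows path fragment.
        (paths if "\\" in entry else others).append(entry)
    return paths, others


# The strings quoted in the case above, in the order they appeared.
recovered = [
    r"Employer\Iran Project\Process Manual\Section 1.doc",
    r"A:\New Manual",
    r"Bob Smith7C:)\Documents\Iran Project\User Manual\Section 1.doc",
    "Preferred Customer",
    "New Dell User",
    r"C:\Indonesia\How To, Chapter 1.doc",
]

paths, others = split_history(recovered)
for step, path in enumerate(paths, start=1):
    print(f"{step}. {path}")  # oldest location first
```

The printed list is the document's trail from the work machine to the home PC. The entries left in others are the registered user names.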

How to Protect Yourself, Your Employer

While there is not much you can do to keep Microsoft Word® from storing information in the document metadata, there are actions you can take to keep others from seeing it:

The easiest option is to just not share the original Word® document—save or print the document as a PDF file and send that. (Metadata is not printed to the PDF file.)

However, if you must send the document itself, save the document in RTF format and send the RTF file, or first save the document in RTF format, convert it back to Word® and then send the new Word® format. Converting a file to RTF (rich text format) saves the formatting in the document but not the metadata.

Note: Saving a file in RTF strips the document of metadata, but not the revision history if you are tracking revisions. I recommend always converting Word® documents to PDF, just to be sure.
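
Whichever format you send, it does not hurt to check the outgoing file for names you would rather keep private. The following is a rough sketch under stated assumptions: find_leaks is a hypothetical helper, the terms are whatever you choose to look for, and the search only sees text stored uncompressed as single-byte characters or UTF-16LE. A clean result on a compressed format (including many PDF files) proves nothing.

```python
def find_leaks(data: bytes, terms: list[str]) -> list[str]:
    """Return the terms that occur in the raw bytes, ignoring ASCII case.

    Each term is tried as single-byte text and as UTF-16LE, the two
    encodings this sketch assumes for strings inside the file.
    """
    haystack = data.lower()  # lowercases ASCII letters only
    leaked = []
    for term in terms:
        needle = term.strip().lower()
        if not needle:
            continue  # an empty term would match everything
        variants = (needle.encode("utf-8"), needle.encode("utf-16-le"))
        if any(variant in haystack for variant in variants):
            leaked.append(term)
    return leaked
```

A typical call would pass the bytes of the file you are about to send along with your login name, machine name and any project code names, and hold the file back if the returned list is not empty.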

Summary

While it is pretty hard to destroy information in a computer to the point where it cannot be found by a competent forensic investigator, you can at least control how much information is made available to recipients of your documents.

Good luck, and good writing!

About the Author

When Jack Molisani is not saving the world from cyber crime, he runs ProSpring Technical Staffing (www.prospring.net) and is producing LavaCon: The Third Annual Conference on Technical Communication Management to be held this September in Honolulu, Hawaii. (See www.lavacon.org for program information.)

Jack can be reached at 310-831-1929 or at jmolisani@ElectronicEvidenceRetrieval.com.