Text Files Versus Word Processing Files
In this post, I want to prove to you just how portable and practical plain text files really are. Today we're going to deal with a little bit of math. Oh don't worry, this won't be hard and yes, I will provide the answers so there is no need for you to go and figure things out on your own. It will all be presented here in this post.
Now before I get started, I need to clear the air on something, and that's the definition of a text file. People may call files created with word processors text files, but they're not, even though text is all you can see in them. That's because word processing files contain a lot of extra data beyond the plain text that you read on the screen. The text that you see on the screen from a word processed file is just the surface. It often doesn't account for even half of what is stored in the file.
You see, word processed files have a lot of codes and special characters in them that aren't meant for a human being to read. In a nutshell, all of that junk that you see when you load a word processing file into a plain text editor is formatting data written for the program, not for you. The word processing program that created the file understands that data and uses it to render the document on the screen each time it is opened.
So no, word processed documents are not considered text files, even though a lot of people would disagree on that point. OK, so let's get into what this post is about, and that is actual file sizes.
We're going to have a look at a very simple file that has only two words in it, but it will definitely shed some light on the matter. I saved a Hello World file, which contains only the words "hello world", in both plain text and Microsoft Word DOC format. Below are my findings on the sizes of both files.
Hello World.DOC is 791 bytes.
Hello World.TXT is 11 bytes.
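To put those two numbers side by side, here is a quick bit of arithmetic in Python. The variable names are my own; the sizes are the two listed above.

```python
# Sizes reported above, in bytes.
doc_size = 791
txt_size = 11

overhead = doc_size - txt_size          # bytes that are not the visible text
ratio = doc_size / txt_size             # how many times larger the DOC is
text_share = txt_size / doc_size * 100  # percent of the DOC that is visible text

print(f"Overhead: {overhead} bytes")
print(f"DOC is about {ratio:.1f} times larger")
print(f"Visible text is about {text_share:.1f}% of the DOC")
```

In other words, 780 of the 791 bytes are something other than the two words, and the DOC file is roughly 72 times the size of the TXT file.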
OK, so what's the difference between the two files? They both contain only two words, and they pretty much look the same. The Microsoft Word version of the file does not have any special font sizes or anything else in it that would render the text differently from how a plain text file would render it. So what gives? Why is the Word version of this file so much larger?
Well it has to do with the native word processor program itself. You see, when you create a DOC file in Microsoft Word, and then save it, you're not just saving the text itself. You're also saving all of the codes and commands that tell Microsoft Word how to display the document. Basically word processing documents are files that have been marked up so that when they're displayed, you can see all of the fonts and styling that went into creating that file.
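As a rough illustration of what markup does to size, here is a small Python sketch. It uses RTF rather than DOC, purely because RTF markup is readable text while DOC is a binary format. The wrapper below is a minimal RTF document that I wrote by hand for this example; it is not what Word or WordPad would actually save.

```python
text = "hello world"

# A hand-written, minimal RTF wrapper (an assumption, for illustration only).
# Real word processors write far more markup than this: font tables,
# page settings, style information and so on.
rtf = "{\\rtf1\\ansi " + text + "}"

print(len(text.encode("ascii")))  # bytes of plain text
print(len(rtf.encode("ascii")))   # bytes once even minimal markup is added
```

Even this bare-bones wrapper more than doubles the size of the two words, and a real word processor adds a great deal more than a single header.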
On the other hand, plain text files possess none of that specialized coding and are therefore much smaller than their word processed counterparts.
That is why the Hello World.TXT file is only 11 bytes: each of the ten letters takes one byte, and the space between the words also counts as a character and takes one more. 11 bytes is much smaller than 791 bytes, and most of those 791 bytes don't even represent the text itself. From the reader's point of view, all of that extra data is overhead. So as you can see, word processing files contain a whole lot more information than just the text they represent, whereas plain text files contain the text and nothing else.
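If you want to see where the 11 comes from, this short Python sketch lists each character next to its byte value. It assumes the text is stored as plain ASCII, which is what a simple editor will typically do for these characters.

```python
text = "hello world"

# In ASCII (and in UTF-8 for these characters), each character is one byte.
data = text.encode("ascii")

for ch, b in zip(text, data):
    print(repr(ch), b)  # the space shows up as ' ' with byte value 32

print(len(data))  # 11
```

One thing to watch for: if your editor adds a line break at the end of the file, or saves with a byte order mark, the file on disk will come out a few bytes larger than 11.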
This is why I am a huge fan of plain TXT files. Compared to word processing files, plain text files are far more compact. They aren't compressed in the technical sense; they simply carry nothing extra. That's because plain text never uses any extra characters or specialized codes to represent the text on the screen. This is also why plain text editors, no matter how large or small they are, can read any plain text file: plain text files are just that, plain text only.
Now if you don't believe the two Hello World examples shown above, simply create these files on your own computer using Notepad and WordPad. In WordPad, save the Hello World file as a standard DOC file. Then in Notepad, do the same thing: create the Hello World file and just save it as a plain text document. Once that's done, compare the two files to see what their sizes are. You will definitely see a difference in the file sizes for yourself.
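If you'd rather check the sizes with a script than with the file properties window, here is a minimal Python sketch. It writes the plain text file itself into a temporary folder and asks the operating system for its size. The DOC path is hypothetical: change it to wherever you saved your own word processor file, since Python can't create that one for you.

```python
import os
import tempfile

# Write "hello world" with no trailing newline, then ask the OS for the size.
folder = tempfile.mkdtemp()
txt_path = os.path.join(folder, "Hello World.txt")

with open(txt_path, "w", encoding="ascii", newline="") as f:
    f.write("hello world")

txt_size = os.path.getsize(txt_path)
print("TXT size in bytes:", txt_size)

# Hypothetical path: point this at the DOC file you saved yourself.
doc_path = os.path.join(folder, "Hello World.doc")
if os.path.exists(doc_path):
    print("DOC size in bytes:", os.path.getsize(doc_path))
else:
    print("No DOC file found at", doc_path)
```

The size you get for your own DOC file may not match mine exactly, since it depends on the program and version that saved it.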
I hope that this post has helped you to appreciate plain text files a bit more, because I really do believe that once you begin to use plain text, you won't turn back to those huge word processing programs.