Joel on Software

How to load words from a Word doc in C#

I would like to read all the words from a Word document in C# and load them into a string array. I am new to the .NET world, but I thought this would be a really straightforward process.

So I started reading articles here and there. All of them talked about automating Word. So I got as far as opening a Word file. Nice.

I discovered the Word.DocumentClass.Words collection, but now I can't figure out how to loop through its objects. I tried foreach(string word in aDoc.Words), but that's not a valid cast.

I inspected the Word.DocumentClass class in the debugger and found that Word.DocumentClass.Words has .First and .Last objects, which are effectively the first and last words of the document. But where are the rest? Has anyone tried this before?


Monday, October 18, 2004

I've been unable to find the Word.DocumentClass object... where is it located?

GD
Monday, October 18, 2004

Assuming you've set a reference to the Word COM object, this should get you started:

Microsoft.Office.Interop.Word.Document wordDoc = new Microsoft.Office.Interop.Word.GlobalClass().Application.ActiveDocument;
foreach(Microsoft.Office.Interop.Word.Range range in wordDoc.Words)
    Console.WriteLine(range.Text); // each item is a Range; Text holds the word

Note that the Document.Words collection does not contain strings; it contains Range objects. Use the "Text" property of each Range object to get the actual word. The example above assumes a Word document is already open when you run the code. You'll need to adjust it if you want the code itself to start Word and open a specific document.
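If you do want the code to start Word itself, a rough sketch might look like the following. It assumes C# 4 or later (so the optional "ref" COM arguments can be omitted) and a reference to Microsoft.Office.Interop.Word. WordReader and LoadWords are hypothetical names used only for illustration, and the file path is supplied by the caller.

```csharp
using System;
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

class WordReader
{
    // Hypothetical helper: starts Word, opens the file read-only, and
    // returns the Text of every Range in Document.Words as a string array.
    static string[] LoadWords(string path)
    {
        Word.Application app = new Word.Application();
        Word.Document doc = null;
        try
        {
            doc = app.Documents.Open(path, ReadOnly: true);

            List<string> words = new List<string>();
            foreach (Word.Range range in doc.Words)
            {
                // Trim each item and skip anything that is only whitespace.
                string text = range.Text == null ? "" : range.Text.Trim();
                if (text.Length > 0)
                    words.Add(text);
            }
            return words.ToArray();
        }
        finally
        {
            // The casts avoid the method/event name clash on Close and Quit.
            if (doc != null)
                ((Word._Document)doc).Close(false); // do not save changes
            ((Word._Application)app).Quit();
        }
    }

    static void Main(string[] args)
    {
        // args[0] should be a full path to the document.
        foreach (string word in LoadWords(args[0]))
            Console.WriteLine(word);
    }
}
```

As far as I know, Word counts punctuation and paragraph marks as separate items in Words, and each item's Text usually carries its trailing space, which is why the sketch trims and drops empty entries. You may want extra filtering if you only need actual words.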

Good luck.

Ryan LaNeve
Monday, October 18, 2004


It worked, thank you!

Wednesday, October 20, 2004
