Friday, May 23, 2008 1:48 PM
Andy Murray
How to Assign Process Datafields Using Data From MS Word 2007
Introduction
Many of our K2 blackpearl processes involve working with MS Word documents and also document libraries in MOSS 2007. Up until now it has been a rather fiddly task to extract data from MS Word documents and use that data in our K2 blackpearl processes.
Fortunately, MS Word 2007 uses a file format called Office Open XML: a .docx file is a ZIP package of XML parts, which makes extracting data relatively easy compared to previous versions of MS Word.
This post looks at how you can use the features of MS Word 2007 and a little bit of code (see sample below) to extract data from an MS Word document and assign it to process datafields in a K2 blackpearl process.
Preparing the MS Word 2007 Document
The first step in this project is to prepare the document. We do this by adding a table cell to the document, bookmarking that cell, and then adding a document property that is linked to the bookmark.
The screenshots below walk us through this stage.
Step 1 - Add a New Table Cell
The table cell is where we are going to type in our data and later use it in the process. The feature you need to use to insert a table into a document is found on the "Insert" tab of the "ribbon" in MS Word 2007. A bit basic I know but just in case no-one's ever done this before :)
Step 2 - Select the Cell
Once you've added a new cell to your table, put the cursor into the cell and then select the "Select Cell" option from the "Select" menu which is located on the "Layout" tab of the ribbon.
The screenshot below shows you how to select a table cell in MS Word 2007.

Step 3 - Add a Bookmark
Once we've selected our table cell we need to create a new bookmark which points to this cell. To create a new bookmark jump to the "Insert" tab on the ribbon and select the "Bookmark" option from within the "Links" section which is located in the middle of the ribbon.
Selecting the "Bookmark" option will cause the bookmark dialog box to open.
Type in a name for the bookmark; in this case I'm using "ApplicantName". We'll use the bookmark later on when we create our document property. Once you've typed in the name of the bookmark just click the "Add" button and you're done with this step.

Step 4 - Add a New Property - Part One
Now that we've added our bookmark to the table cell we need to create a new property in the document. MS Word 2007 is based on the Office Open XML format, and any custom properties we set on the document are stored as XML inside the document package (in the docProps/custom.xml part).
First jump to the "Properties" menu item and select it (in MS Word 2007 it sits under the Office Button, in the "Prepare" submenu). This will cause a new menu "Document Properties" to appear just beneath the ribbon.
The screenshot below shows you how to jump to the "Properties" menu item.
When the "Document Properties" menu appears select the "Advanced Properties" option. This will cause the document "Properties" dialog box to appear.

Step 5 - Add a New Property - Part Two
Now that we've navigated to the "Properties" dialog box we need to jump to the "Custom" tab in the dialog box. From here we will add a new property to the document itself.
Type in a name for your new property; in this case I'm using "ApplicantName". The screenshot below shows how to do this.

Step 6 - Add a New Property - Part Three
Once we've created a new property we need to link it to the bookmark that we created in a previous step.
To do this, check the "Link to content" checkbox and then select a value from the "Source" drop-down list - this list contains all of the bookmarks in the document.
You should see the name of the bookmark that you created in a previous step. Select the bookmark you want to link your property to and once you've done that click the "Add" button.
The screenshot below shows how to do this.

Next click "OK" and that's you done - you've successfully configured your document with a new property linked to a bookmark within a table cell. Later on, you can type data into this cell and whatever you type in will be extracted and used within the process. We'll talk about how that happens in the next section.
If you need to extract additional data from the document simply repeat the steps described above.
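To give a feel for what we'll be reading later, here is a small sketch of how a linked custom property looks once it is saved. The XML string is a hand-written sample in the shape I'd expect Word to write to docProps/custom.xml for our "ApplicantName" property, so do check it against one of your own documents. The value "Jane Doe" and the names CustomPropertySketch and ReadProperty are made up for illustration.

```csharp
using System;
using System.Xml;

public static class CustomPropertySketch
{
    // Hand-written sample of docProps/custom.xml; "Jane Doe" is illustrative data.
    public const string SampleXml =
        "<Properties xmlns=\"http://schemas.openxmlformats.org/officeDocument/2006/custom-properties\" " +
        "xmlns:vt=\"http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes\">" +
        "<property fmtid=\"{D5CDD505-2E9C-101B-9397-08002B2CF9AE}\" pid=\"2\" " +
        "name=\"ApplicantName\" linkTarget=\"ApplicantName\">" +
        "<vt:lpwstr>Jane Doe</vt:lpwstr>" +
        "</property>" +
        "</Properties>";

    // Returns the text of the named custom property, or null if it is absent.
    public static string ReadProperty(string customXml, string propertyName)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(customXml);

        // The <property> elements carry no prefix, so the plain tag name matches.
        foreach (XmlNode node in doc.GetElementsByTagName("property"))
        {
            XmlAttribute name = node.Attributes["name"];
            if (name != null && name.Value == propertyName)
            {
                // InnerText is the text of the typed value element (vt:lpwstr here).
                return node.InnerText;
            }
        }
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(ReadProperty(SampleXml, "ApplicantName"));
    }
}
```

The linkTarget attribute is what ties the property to the bookmark, and the property's value is the text that sits inside the bookmarked cell when the document is saved.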
Extracting the Data
In this section we'll look at how to extract the data from a document that has been prepared using the steps described above. Before we get to the code we'll first take a look at the process included in the sample project attached to this article.
Process Description
We start the process by uploading a document to a MOSS 2007 document library. From there the document is downloaded to a location on the file system using the standard K2 blackpearl wizards.
I chose to download the document to the file system because it made it easier to write the C# code that extracts the data. Feel free to play around with other methods of manipulating the document and see what you can come up with - I'm sure there are many other ways to do this!
Once the document is downloaded we can use some C# code to get at the data in the document XML and assign it to process data fields.
Steps One and Two below show this in more detail.
Step One - The Process
The screenshot below shows the process and you can see where we first download the document and then use a server (code) event to extract the data.
Step Two - The Code Listing
A section of the code is shown below.
// Needs: using System.Xml; using DocumentFormat.OpenXml.Packaging; (Open XML SDK)
{
    XmlDocument xmlProperties = new XmlDocument();
    // Open the downloaded document read-only and load its custom properties part.
    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(@"C:\XXX\Application.docx", false))
    {
        CustomFilePropertiesPart appPart = wordDoc.CustomFilePropertiesPart;
        xmlProperties.Load(appPart.GetStream());
    }
    // Each custom property is stored as a <property name="..."> element.
    XmlNodeList properties = xmlProperties.GetElementsByTagName("property");
    foreach (XmlNode node in properties)
    {
        if (node.Attributes["name"].InnerText == "ApplicantName")
        {
            string applicantName = node.InnerText;
            // Strip any trailing '.' characters before assigning the data field.
            K2.ProcessInstance.DataFields["ApplicantName"].Value = applicantName.TrimEnd('.');
        }
    }
}
You'll see that I've hard-wired the path on the file system (where the document was downloaded to) and I've also hard-wired the document name. You can also use dynamic values here, possibly from a SmartObject or process data fields, as you prefer - you choose, it's your project!
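As a rough sketch of the dynamic approach, the helper below builds the document path from a folder and a file name rather than a hard-wired string. The class and method names are hypothetical, and so are the data field names mentioned in the comment; substitute whatever your own process defines.

```csharp
using System;
using System.IO;

public static class DocumentPathSketch
{
    // Combines a download folder and a document name into a full path.
    // Path.GetFileName drops any directory portion from the supplied name.
    public static string BuildDocumentPath(string downloadFolder, string documentName)
    {
        if (string.IsNullOrEmpty(downloadFolder))
        {
            throw new ArgumentException("A download folder is required.", "downloadFolder");
        }
        if (string.IsNullOrEmpty(documentName))
        {
            throw new ArgumentException("A document name is required.", "documentName");
        }
        return Path.Combine(downloadFolder, Path.GetFileName(documentName));
    }

    public static void Main()
    {
        // In a server event the two values might come from process data fields, e.g.
        //   K2.ProcessInstance.DataFields["DownloadFolder"].Value.ToString()
        //   K2.ProcessInstance.DataFields["DocumentName"].Value.ToString()
        // (both field names are hypothetical). Here we use stand-in values.
        string path = BuildDocumentPath(Path.GetTempPath(), "Application.docx");
        Console.WriteLine(path);
    }
}
```

The resulting string can then be passed to WordprocessingDocument.Open in place of the literal path.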
If you do decide to download the file to the file system before parsing don't forget to write a bit of "clean up" code to delete the file once it's been parsed.
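One way to do the clean up, sketched here under my own assumptions, is to wrap the parsing step in a try/finally so the file is deleted even when parsing throws. CleanupSketch and ParseThenDelete are hypothetical names, and the parse delegate stands in for the extraction code shown earlier.

```csharp
using System;
using System.IO;

public static class CleanupSketch
{
    // Runs the supplied parsing step, then deletes the file even if parsing throws.
    public static string ParseThenDelete(string path, Func<string, string> parse)
    {
        try
        {
            return parse(path);
        }
        finally
        {
            // File.Delete does not throw if the file no longer exists.
            File.Delete(path);
        }
    }

    public static void Main()
    {
        // Stand-in for the downloaded document: a temp file with dummy content.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "dummy");

        string result = ParseThenDelete(path, p => File.ReadAllText(p));

        Console.WriteLine(result);
        Console.WriteLine(File.Exists(path));
    }
}
```

Whatever you pass in as the parsing step should close the document before it returns (the using block in the listing does this), otherwise the file may still be locked when the delete runs.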
Testing the Solution
Once you've got this far you're almost there. Deploy your process as normal and once you've set your process rights you can start the process off.
Further Reading
If you want to read more about Office Open XML then Microsoft have plenty of content on MSDN for you to explore.
Hope you've enjoyed this article - if you've any questions drop me a mail and I'll be happy to answer them.
Cheers and happy blackpearling..
Andy