Publishing Rich HTML without any limitations in SharePoint 2010


SharePoint 2010 ships with rich content editing capabilities. However, the set of allowed HTML tags is limited. Find out how to avoid these limitations when editing rich content in SharePoint 2010.

When working with Internet-facing websites, rich content allows us to make a difference and to add interaction and semantics to static pages. Out of the box, SharePoint 2010 ships with a Rich Text Editor (RTE) that allows us to enrich our content with HTML markup. Thanks to its integration with the Ribbon, content editors can easily add rich media and data tables, or change the presentation of the content using the Rich Text Editor.

Nowadays, however, rich content goes far beyond formatting and including media. We not only want to publish content but also want to track its value using web analytics. And to make the content more valuable, we want to make use of new capabilities such as microdata or HTML5. Unfortunately, things aren’t as easy as you might want them to be when working with SharePoint 2010.

Inconvenient Publishing HTML Field

The Publishing HTML Field Type in SharePoint 2010 has been designed to empower content editors with rich content editing capabilities. This flexibility, however, comes at a price. To keep things manageable and secure, the Publishing HTML Field has been equipped with a number of security gates which prevent content editors from using potentially harmful markup in web pages. Considering how dynamic the web is and how often new possibilities emerge, the Publishing HTML Field in SharePoint 2010 uses white lists of allowed HTML tags and attributes that content editors can use when writing web content. Everything that isn’t included in those white lists is removed when the content is saved.

Although this is a very powerful feature that helps you keep your website secure, it isn’t suitable as-is in all scenarios, and there are situations where you would want to control which markup is allowed. From that perspective the Publishing HTML Field is unfortunately limited. At this moment we can neither control the white lists used by the Publishing HTML Field nor choose whether the security validation is applied at all. Using the Content Editor Web Part, which has far fewer security limitations, might be an option to consider, but it’s not that convenient. And yet there is a solution to this challenge.

Edge Rich HTML Field a.k.a. What You Type Is What You Get

To avoid the limitations of the standard Publishing HTML Field, you could create a new Field Type that allows you to use any HTML markup you want. Using the extensibility capabilities of the SharePoint 2010 platform, you would make that custom Field Type use the standard SharePoint 2010 Rich Text Editor to benefit from its editing capabilities and Ribbon integration.

Let’s start off by creating the Field Definition for our new Field Type:

<?xml version="1.0" encoding="utf-8" ?>
<FieldTypes>
  <FieldType>
    <Field Name="TypeName">EdgeHTML</Field>
    <Field Name="ParentType">Note</Field>
    <Field Name="TypeDisplayName">Edge Rich HTML Field</Field>
    <Field Name="TypeShortDescription">Edge Rich HTML Field</Field>
    <Field Name="ShowOnColumnTemplateCreate">TRUE</Field>
    <Field Name="ShowOnListCreate">FALSE</Field>
    <Field Name="ShowOnDocumentLibraryCreate">FALSE</Field>
    <Field Name="ShowOnSurveyCreate">FALSE</Field>
    <Field Name="FieldTypeClass">$SharePoint.Type.c110e462-def3-4ceb-a24f-7459c94f97db.AssemblyQualifiedName$</Field>
  </FieldType>
</FieldTypes>
Field Definition highlighted in the SharePoint Project structure

Important: The fldtypes_edgehtml.xml file should be deployed to the TEMPLATE\XML mapped folder.

As you can see, the Field Type Definition for our Edge HTML field is very similar to the definition of the standard Publishing HTML field. The main difference is that we don’t explicitly define our field as a Rich HTML field, which allows us to bypass the security gates that remove unknown HTML markup.

The next step is to create a class for our Field Definition:

using System;
using System.Runtime.InteropServices;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Publishing.Internal.WebControls;
using Microsoft.SharePoint.WebControls;

namespace Mavention.SharePoint.EdgeRichHtmlField {
    [Guid("c110e462-def3-4ceb-a24f-7459c94f97db")]
    public class EdgeHtmlField : SPFieldMultiLineText {
        public override BaseFieldControl FieldRenderingControl {
            get {
                BaseFieldControl control = new Controls.EdgeRichHtmlField();
                control.FieldName = InternalName;
                return control;
            }
        }

        public EdgeHtmlField(SPFieldCollection fields, string fieldName)
            : base(fields, fieldName) {

        }

        public EdgeHtmlField(SPFieldCollection fields, string typeName, string displayName)
            : base(fields, typeName, displayName) {
        }

        // Required to support rendering Web Parts in content and reusable HTML
        public override string GetFieldValueAsHtml(object value) {
            // Required to properly support both display and edit mode
            // Must be set to true to avoid escaping HTML in display mode
            // Must not be set to avoid stripping HTML in edit mode
            if (SPContext.Current != null && SPContext.Current.FormContext.FormMode == SPControlMode.Display) {
                RichText = true;
            }

            string text = value as string;
            if (!String.IsNullOrEmpty(text)) {
                bool flag = true;
                return base.GetFieldValueAsHtml(HtmlEditorInternal.ConvertStorageFormatToViewFormat(text, out flag));
            }

            return base.GetFieldValueAsHtml(value);
        }
    }
}
Field Definition class highlighted in the SharePoint Project structure

In the FieldRenderingControl property (line 12) we create an instance of the Field Control, which we will create in the next step. The GetFieldValueAsHtml method overridden in line 28 allows us to correctly render the HTML in both Display and Edit modes and to support reusable content as well as Web Parts in content.

Important: Before you proceed, add a reference to the Microsoft.SharePoint.Publishing assembly, which is required by the HtmlEditorInternal class used in the GetFieldValueAsHtml method.

The last step is to define the Field Control for our custom Field Type:

using System;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Publishing.WebControls;
using Microsoft.SharePoint.Utilities;
using Microsoft.SharePoint.WebControls;

namespace Mavention.SharePoint.EdgeRichHtmlField.Controls {
    public class EdgeRichHtmlField : RichHtmlField {
        bool m_bIsFieldValueCached;
        string m_strCachedFieldValue;

        protected override Type ExpectedSPFieldClassType {
            get {
                return typeof(EdgeHtmlField);
            }
        }

        // required because the base type has reference to HtmlField which we don't use. Copy of HtmlField.OnInit
        protected override void OnInit(EventArgs e) {
            if (ControlMode == SPControlMode.Display && SPContext.Current.FieldControlCacheGetCallback != null) {
                m_bIsFieldValueCached = SPContext.Current.FieldControlCacheGetCallback(UniqueID, out m_strCachedFieldValue);
            }

            if (!InDesign && ((ControlMode == SPControlMode.New && List != null && !List.DoesUserHavePermissions(SPBasePermissions.AddListItems, true)) || (ControlMode == SPControlMode.Edit && ListItem != null && ItemId != 0 && !ListItem.DoesUserHavePermissions(SPBasePermissions.EditListItems)))) {
                SPUtility.HandleAccessDenied(new UnauthorizedAccessException());
            }

            RegisterFieldControl();
        }
    }
}
Field Control highlighted in the SharePoint Project structure

Important: Before you proceed, make sure you have a reference to the System.Web assembly, which is required by the UniqueID property used in the OnInit method (line 21). Additionally, you should add a SafeControls entry for the EdgeRichHtmlField control that will allow it to run correctly in SharePoint. Although you could do that manually, directly in the Package Manifest, I chose to use an Empty Element SPI called Controls and add the entry there.
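For orientation, here is a rough sketch of what the resulting SafeControl entry could look like once it ends up in web.config. The assembly name, version and public key token are placeholders I made up for illustration (only the namespace and type name come from the code above), so replace them with the values of your own signed assembly:

```xml
<!-- Sketch only: Assembly name, Version and PublicKeyToken are hypothetical placeholders -->
<SafeControl Assembly="Mavention.SharePoint.EdgeRichHtmlField, Version=1.0.0.0, Culture=neutral, PublicKeyToken=0000000000000000"
             Namespace="Mavention.SharePoint.EdgeRichHtmlField.Controls"
             TypeName="EdgeRichHtmlField"
             Safe="True" />
```

Registering only the EdgeRichHtmlField type rather than using a wildcard TypeName keeps the entry as narrow as possible, which seems sensible for a control that deliberately relaxes markup validation.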

As I mentioned before, all security gates responsible for removing unknown HTML markup are defined in the Publishing HTML Field Type, so in order to reuse the standard RTE in our custom Field Type it’s totally okay for us to inherit from the standard RichHtmlField class.

The only reason we have to create our own Field Control instead of referencing the standard RichHtmlField is that it has two references to the Publishing HTML Field Type. The first one is in the ExpectedSPFieldClassType property, which we override to point to our Edge Rich HTML Field Type (line 14). The second one is in the OnInit method, which we override in line 19 while making sure that all of the basic functionality is included.

With that you are ready to deploy the Edge Rich HTML Field Type. After the deployment succeeds, you should see the Edge Rich HTML Field Type in the list of available Field Types when creating a new Field:

Red arrow points to the Edge Rich HTML Field Type
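
If you prefer to provision the field declaratively instead of creating it through the UI, a Site Column definition along these lines should work. This is a minimal sketch under assumptions: the ID, Name, DisplayName and Group values are hypothetical placeholders, and only the Type value (EdgeHTML) comes from the Field Type Definition shown earlier:

```xml
<?xml version="1.0" encoding="utf-8" ?>
<Elements xmlns="http://schemas.microsoft.com/sharepoint/">
  <!-- Type must match the TypeName from fldtypes_edgehtml.xml -->
  <!-- ID, Name, DisplayName and Group are placeholders: generate your own GUID -->
  <Field ID="{00000000-0000-0000-0000-000000000001}"
         Name="EdgeContent"
         StaticName="EdgeContent"
         DisplayName="Edge Content"
         Type="EdgeHTML"
         Group="Custom Columns" />
</Elements>
```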

After adding the newly created field to your Page Layout you should be able to edit HTML markup without any limitations:

Editing HTML source including the html5 data attribute

Red arrows pointing to html5 markup properly saved and rendered by the Edge Rich HTML Field

Important

Although the approach I showed you allows you to use any HTML markup you wish, it also allows your content editors to use HTML markup that might be harmful. Therefore you should carefully analyze your requirements before implementing such a solution and educate your content editing team about the risks that it introduces.

Updated November 4, 2011

One more consequence of choosing this approach, as pointed out by Tyler Durham, is losing link management. When using the Publishing HTML field type, SharePoint keeps a list of all outgoing links to other assets within the Site Collection so that content editors can keep track of what is being used where on the site. Another benefit of link management is that whenever you rename or move an asset, such as an image, SharePoint will automatically modify the link to point to the updated location. None of that will work with the Edge Rich HTML Field.

Summary

SharePoint 2010 ships with rich content editing capabilities. For security reasons those capabilities have been limited to a specific set of HTML markup. Depending on your requirements, you may find the security validation that SharePoint 2010 executes every time the content changes limiting. One way to avoid losing HTML markup that is outside SharePoint’s white lists is to create a custom Field Type that allows you to use all HTML markup. With that you can use any HTML markup, including web analytics tracking codes, microdata and HTML5.
