Automatically marking up abbreviations and acronyms in SharePoint 2007


Accessibility is a broad term that reaches well beyond standards compliant code. In my view accessibility is a set of features that improve the understanding of information presented by an information system. I have to admit, though, that compliant and semantic HTML is a very important factor of accessibility, as it carries the information. Having recently solved the issue of standards compliant HTML in SharePoint 2007, I started looking for new challenges and accessibility improving solutions. Almost immediately I stumbled upon automatically marking up abbreviations in content.

I faced exactly the same challenge during the Rock My Website competition last year, when John, Martijn and I were building an accessible web site in ASP.NET. I wanted to implement a solution which would automatically mark up all known abbreviations in the content using some kind of dictionary. As we weren’t using any Content Management System, we would have needed to think of another way to maintain the abbreviations dictionary. Eventually we dropped the idea, but now we have SharePoint 2007.

Looking at the standard SharePoint 2007 features, I almost immediately came up with a solution for this challenge.

The requirements

First of all: provide a user friendly way to maintain the abbreviations dictionary. Storing it in a central location will decrease the amount of work required to maintain the dictionary and keep the definitions consistent. Then: mark up all the abbreviations found in the content, so that HTML becomes <abbr title="HyperText Markup Language">HTML</abbr>. Last but not least: replace only the first occurrence of an abbreviation on the page, as it will provide enough information for a visually impaired visitor.

The work

I thought the best way for storing and maintaining the abbreviation dictionary would be a custom list consisting of two columns: Term and Definition. Let’s call the list Abbreviations.

[Image: AbbreviationsList]

Then the replacement logic. As replacing the abbreviations could be done in numerous ways, I thought it would be easiest to customize the FieldValue web control and extend it with the required properties. You can find more information on this approach in one of my previous posts.

I have decided to add two properties to our extended FieldValue control: a boolean MarkupAbbreviations, to be able to turn the feature on and off easily, and AbbreviationsList, to pass the URL of the list as a parameter instead of hard coding it.

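using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Globalization;
using System.Text.RegularExpressions;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using Microsoft.SharePoint;

namespace Imtech.SharePoint.Compliancy.Controls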
{
  [ToolboxData("<{0}:FieldValue runat=\"server\" />")]
  public class FieldValue : WebControl
  {
    private Dictionary<string, string> abbreviations;

    private string _FieldName;
    [Bindable(true), Localizable(false)]
    public string FieldName
    {
      get { return _FieldName; }
      set { _FieldName = value; }
    }

    private bool _MarkupAbbreviations;
    [Bindable(true), Localizable(false)]
    public bool MarkupAbbreviations
    {
      get { return _MarkupAbbreviations; }
      set { _MarkupAbbreviations = value; }
    }

    private string _AbbreviationsList;
    [Bindable(true), Localizable(false)]
    public string AbbreviationsList
    {
      get { return _AbbreviationsList; }
      set { _AbbreviationsList = value; }
    }
  }
}

I have also added a dictionary field to hold the abbreviations read from the Abbreviations list.

Let’s load the available abbreviations now:

protected override void CreateChildControls()
{
  LoadAbbreviations();
  base.CreateChildControls();
}

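private void LoadAbbreviations()
{
  // Start empty so that Render can always enumerate the dictionary,
  // even when loading fails or the feature is turned off.
  abbreviations = new Dictionary<string, string>();
  if (!_MarkupAbbreviations || String.IsNullOrEmpty(_AbbreviationsList))
    return;

  try
  {
    // Split the list URL into the site URL and the list name.
    Regex regEx =
        new Regex("(?<SiteUrl>/.*)/?Lists/(?<ListName>[^/]+)");
    Match m = regEx.Match(_AbbreviationsList);

    using (SPWeb site = SPContext.Current.Site.OpenWeb(
        m.Groups["SiteUrl"].Value))
    {
      SPList list = site.Lists[m.Groups["ListName"].Value];
      // Title holds the term; "Comments" is the field read here
      // for the definition.
      foreach (SPListItem abbreviation in list.Items)
        abbreviations.Add(abbreviation.Title,
             abbreviation["Comments"].ToString());
    }
  }
  catch { } // on any failure the content is rendered with whatever was loaded
}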

I have decided to extract the URL of the site where the list resides, and the name of the list, using a regular expression. After opening the site I open the list, as we want to obtain all the items in it. Before walking through the items we also need to instantiate the abbreviations variable, which holds the abbreviations and their definitions in code. Adding the abbreviations to the dictionary is straightforward. In a real life scenario you might add an extra check to make sure that you don’t add the same abbreviation twice with different definitions.
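The URL splitting can be tried outside SharePoint. The following is a minimal sketch that uses the same pattern as LoadAbbreviations; the class name ListUrlParser and the method TryParse are hypothetical and only serve the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ListUrlParser
{
    // Same pattern as in LoadAbbreviations.
    private static readonly Regex ListUrl =
        new Regex("(?<SiteUrl>/.*)/?Lists/(?<ListName>[^/]+)");

    // Splits a server-relative list URL into the site URL and the list name.
    // Returns false when the URL contains no "Lists/<name>" segment.
    public static bool TryParse(string listUrl,
        out string siteUrl, out string listName)
    {
        siteUrl = null;
        listName = null;
        if (String.IsNullOrEmpty(listUrl))
            return false;

        Match m = ListUrl.Match(listUrl);
        if (!m.Success)
            return false;

        siteUrl = m.Groups["SiteUrl"].Value;
        listName = m.Groups["ListName"].Value;
        return true;
    }
}
```

For /sites/intranet/Lists/Abbreviations this yields the site URL /sites/intranet/ (the greedy .* keeps the trailing slash) and the list name Abbreviations. Keep in mind that site.Lists[name] looks a list up by its title, so the approach assumes that the title of the list equals the last segment of its URL.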

Now that we have the abbreviations available in code, we can proceed and do the replacing in the content.

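protected override void Render(HtmlTextWriter writer)
{
  string content = SPContext.Current.Item[_FieldName].ToString();

  // Skip the markup when it is turned off or nothing was loaded.
  if (_MarkupAbbreviations && abbreviations != null)
  {
    foreach (KeyValuePair<string, string> abbreviation in abbreviations)
    {
      // Escape the term and anchor it on both sides, so that
      // HTML matches neither XHTML nor HTMLElement.
      Regex regEx = new Regex(
                            @"\b" + Regex.Escape(abbreviation.Key) + @"\b",
                            RegexOptions.IgnoreCase);
      string markup = String.Format(
                            CultureInfo.CurrentCulture,
                            "<abbr title=\"{0}\">{1}</abbr>",
                            HttpUtility.HtmlAttributeEncode(abbreviation.Value),
                            HttpUtility.HtmlEncode(abbreviation.Key));
      // Double every $ so it is not read as a substitution;
      // the final 1 limits the replace to the first occurrence.
      content = regEx.Replace(content, markup.Replace("$", "$$"), 1);
    }
  }

  writer.Write(content);
}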

We do the replacing in the Render method. First of all we need the content of the chosen field. To do the replacing we use regular expressions again. This time we combine the abbreviation with a word boundary (\b): when looking for HTML, for example, we want to find occurrences of HTML but not of XHTML. The word boundary picks only the complete matches we want. The last thing to add is the 1, which tells the regular expression engine to replace only the first occurrence. That’s it. Let’s see how it works:
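The effect of the word boundary and of the replace count can also be checked in isolation, without SharePoint. The sketch below is one possible variant rather than the control's own code: it anchors the term on both sides and uses a MatchEvaluator so that the matched text keeps its original casing. The names AbbreviationMarkup and MarkFirst are hypothetical, and the definition is assumed to be HTML-encoded already.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AbbreviationMarkup
{
    // Wraps the first whole-word occurrence of term in an <abbr> element.
    // encodedDefinition is assumed to be safe for use in an HTML attribute.
    public static string MarkFirst(string content, string term,
        string encodedDefinition)
    {
        Regex regEx = new Regex(
            @"\b" + Regex.Escape(term) + @"\b",
            RegexOptions.IgnoreCase);

        // The evaluator is called for each match; the count of 1 stops
        // the replacing after the first one.
        return regEx.Replace(content,
            delegate(Match m)
            {
                return "<abbr title=\"" + encodedDefinition + "\">"
                    + m.Value + "</abbr>";
            },
            1);
    }
}
```

Given the content "XHTML builds on HTML. HTML is everywhere.", only the second word is wrapped: there is no word boundary between the X and the H of XHTML, and the third occurrence is left alone because of the count.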

[Image: Result]

In the preview example I have used Mozilla Firefox, as it underlines abbreviations with a dotted line and shows a tooltip with the definition on mouse hover. The solution works according to the requirements: we have an easily maintainable abbreviations dictionary and an automatic replace of only the first occurrence of an abbreviation. The downside is that the replace works per field. Should you have multiple fields with abbreviations on one page and still want to mark up only the first occurrence on the page, you would have to store the already replaced abbreviations in a place available to the whole page.
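One way to get such a page wide store is a per-request dictionary shared by all controls on the page. The sketch below takes any IDictionary; in ASP.NET, HttpContext.Current.Items is such a dictionary and lives for a single request. The class name, the method name and the key "MarkedAbbreviations" are assumptions made for the example, and the definition is again assumed to be HTML-encoded.

```csharp
using System;
using System.Collections;
using System.Text.RegularExpressions;

public static class PageWideAbbreviations
{
    // Hypothetical key under which the marked terms are remembered.
    private const string ItemsKey = "MarkedAbbreviations";

    // Marks the first occurrence of term, unless another field on the
    // same page (sharing pageItems) has already marked it.
    public static string MarkOncePerPage(IDictionary pageItems,
        string content, string term, string encodedDefinition)
    {
        Hashtable marked = pageItems[ItemsKey] as Hashtable;
        if (marked == null)
        {
            marked = new Hashtable(StringComparer.OrdinalIgnoreCase);
            pageItems[ItemsKey] = marked;
        }

        if (marked.ContainsKey(term))
            return content;

        Regex regEx = new Regex(
            @"\b" + Regex.Escape(term) + @"\b",
            RegexOptions.IgnoreCase);
        bool replaced = false;
        string result = regEx.Replace(content,
            delegate(Match m)
            {
                replaced = true;
                return "<abbr title=\"" + encodedDefinition + "\">"
                    + m.Value + "</abbr>";
            },
            1);

        // Remember the term only when it was actually found in this field.
        if (replaced)
            marked[term] = true;
        return result;
    }
}
```

Because the fields of a page render in order, the first field that contains the term gets the markup and every later field is left untouched.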

Another extra feature might be extending the abbreviations dictionary with an extra column holding the type of abbreviation, so that each one is spoken correctly by a screen reader. You would first define the CSS rules, for example:

<style type="text/css">
acronym {speak : normal;}
abbr.initialism {speak : spell-out;}
abbr.truncation {speak : normal;}
</style>

and then markup the various kinds of abbreviations:

<acronym title="North Atlantic Treaty Organisation">NATO</acronym>
<abbr title="HyperText Markup Language"
class="initialism">HTML</abbr>
<abbr title="Europe" class="truncation">Eur</abbr>
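If the list carried such a type column, choosing the element and the class could look roughly like the sketch below. The type values "acronym", "initialism" and "truncation" mirror the CSS rules above; the column itself, the class AbbreviationTypeMarkup and the method Build are hypothetical, and the definition is assumed to be HTML-encoded.

```csharp
using System;

public static class AbbreviationTypeMarkup
{
    // Builds the markup for one term based on the value of a
    // hypothetical "Type" column of the Abbreviations list.
    public static string Build(string term, string encodedDefinition,
        string type)
    {
        if (String.Equals(type, "acronym",
                StringComparison.OrdinalIgnoreCase))
        {
            return "<acronym title=\"" + encodedDefinition + "\">"
                + term + "</acronym>";
        }

        if (String.Equals(type, "initialism",
                StringComparison.OrdinalIgnoreCase)
            || String.Equals(type, "truncation",
                StringComparison.OrdinalIgnoreCase))
        {
            return "<abbr title=\"" + encodedDefinition + "\" class=\""
                + type.ToLowerInvariant() + "\">" + term + "</abbr>";
        }

        // Unknown or missing type: fall back to a plain abbr element.
        return "<abbr title=\"" + encodedDefinition + "\">"
            + term + "</abbr>";
    }
}
```

The fallback keeps existing list items working when the new column is still empty.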

You could also easily make a distinction between abbreviations and acronyms (abbreviations which can be pronounced as a word). I hope this solution shows how highly extensible SharePoint 2007 is and that it can support accessibility solutions as well. I would love to hear your ideas on improving the accessibility experience in SharePoint 2007.
