This is the seventh article in the series.

Guidelines For Taxonomy Maintenance

Irrespective of how carefully we develop the initial taxonomy, it will evolve as the information it describes, its users, and their use-cases evolve over time. New terms are introduced by industry analysts, new concepts emerge, terminology and usage change, and some terms go out of fashion or become obsolete as marketing spins new words.

It will, therefore, help to view your taxonomy as a living organism. Just like other living beings, a taxonomy that cannot evolve and adapt will perish, as the users won’t find value in using it.

The initial taxonomy design should be adaptable and the software that we use should make it easy to perform regular maintenance tasks.

The following are some common maintenance tasks that are needed to keep the taxonomy relevant at all times:

1. Addition of New Terms (tags)

Addition of new terms looks like a fairly simple task but it is one of the most common activities that damages the integrity of a taxonomy. A single term with a definition overlapping that of another term, or a misplaced hierarchy, can disrupt the whole taxonomy.

Therefore, no new term should be added to a taxonomy without formal review and approval, and it is important to define a process to nominate, review, and approve new terms.

Whenever an appropriate term cannot be found in the taxonomy, the user should have an option to nominate a new term as a candidate.

The reviewer (or owner of the taxonomy) who will approve the new terms should be clearly identified. The reviewer’s responsibility is to decide whether a new term should be added, the scope definition of an existing term should be expanded, or an existing term should be split to cover the new information.

Ideally, the candidate term shouldn’t be visible to the users. However, if your software doesn’t support this feature, you can use a simple hack to achieve a similar result: mark the candidate term with a special symbol or phrase. This hack lets the reviewer test and experiment with the new term on real content, and it also signals to the users that the term is unapproved and should be used with discretion. When the candidate term is approved, this special identifier must be removed.
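The marking hack can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `[candidate] ` prefix, the function names, and the use of a plain set as the taxonomy are all hypothetical and not taken from any particular taxonomy software.

```python
CANDIDATE_MARK = "[candidate] "  # hypothetical marker; any reserved symbol or phrase works


def nominate(taxonomy: set[str], term: str) -> str:
    """Add a nominated term under its marked name and return that name."""
    marked = CANDIDATE_MARK + term
    taxonomy.add(marked)
    return marked


def is_candidate(term: str) -> bool:
    """Tell users and reviewers whether a term is still unapproved."""
    return term.startswith(CANDIDATE_MARK)


def approve(taxonomy: set[str], marked: str) -> str:
    """Replace the marked candidate with the clean, approved term."""
    if marked not in taxonomy or not is_candidate(marked):
        raise ValueError(f"{marked!r} is not a pending candidate")
    approved = marked[len(CANDIDATE_MARK):]
    taxonomy.remove(marked)
    taxonomy.add(approved)
    return approved


taxonomy = {"Events", "Financial Results"}
pending = nominate(taxonomy, "Webinars")   # visible to users as an unapproved term
approved = approve(taxonomy, pending)      # marker removed after review
```

In a real system, any content already tagged with the marked name would need the same rename when the term is approved.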

After adding the new term, the archived content should be re-indexed (tagged) with the new term. If the software doesn’t support the re-tagging and re-indexing of archive content, the scope definition should mention the date when the new term was added. It helps in setting clear user expectations about the information that can be retrieved while searching or navigating using the new term. Otherwise, the user would complain that the new tag is not useful as it doesn’t include all the results.

New terms should be added with the utmost care. Careless additions are one of the main causes of confusion and of a bloated taxonomy, which derails the whole idea of the information organization system.

Tip: In addition to writing the scope note for the new term, update the scope definition of other terms that might be influenced by the new addition.

2. Deletion of Redundant/Obsolete Terms

Terms that are regularly tagged incorrectly and terms that are tagged infrequently are both potential candidates for modification or deletion.

Do not permanently delete a tag directly from the taxonomy. Just like the addition, deletion should also follow a review process. We should review how the tag is currently being used by the users. How many users’ saved searches and alerts refer to this tag?
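As a rough illustration of this review step, the sketch below counts how many documents and how many saved searches or alerts refer to a tag. The data shapes (dictionaries mapping a document or saved-search name to its set of tags) and the function name are assumptions made for the example; your software will store this differently.

```python
def usage_report(
    tag: str,
    documents: dict[str, set[str]],
    saved_searches: dict[str, set[str]],
) -> dict[str, int]:
    """Count the documents and saved searches that reference the tag."""
    return {
        "documents": sum(tag in tags for tags in documents.values()),
        "saved_searches": sum(tag in tags for tags in saved_searches.values()),
    }


# Hypothetical sample data
documents = {
    "doc1": {"Big Data", "Events"},
    "doc2": {"Data Analytics"},
    "doc3": {"Big Data"},
}
saved_searches = {
    "weekly-alert": {"Big Data"},
    "events-digest": {"Events"},
}

big_data_usage = usage_report("Big Data", documents, saved_searches)
```

A tag with few documents but many saved searches deserves extra caution, since deleting it would silently break those searches.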

Before deleting, give prior notice to your users. It’d be frustrating for users to find that their saved query has stopped working, or that an alert suddenly returns unexpected information, because the term associated with it has been deleted. Therefore, the term should first be retained in the taxonomy and marked for deletion.

After a term is marked for deletion, no new information should be tagged with it. The date for deletion and the reasons should be added to the scope note of the term.
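The marked-for-deletion state can be sketched as follows. This is a minimal example, assuming a hypothetical `Term` record with a scope note and an optional deletion date; the field and function names are illustrative only.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Term:
    name: str
    scope_note: str = ""
    deletion_date: Optional[date] = None  # set once the term is marked for deletion


def mark_for_deletion(term: Term, when: date, reason: str) -> None:
    """Flag the term and record the planned date and reason in its scope note."""
    term.deletion_date = when
    term.scope_note += f" Marked for deletion on {when.isoformat()}: {reason}"


def tag_content(tags: set[str], term: Term) -> None:
    """Tag new content, refusing terms that are marked for deletion."""
    if term.deletion_date is not None:
        raise ValueError(f"'{term.name}' is marked for deletion; choose another term")
    tags.add(term.name)


big_data = Term("Big Data", scope_note="Large-scale data processing.")
mark_for_deletion(big_data, date(2030, 1, 31), "overlaps with 'Data Analytics'")
```

Existing content keeps the tag until the deletion date, so saved searches and alerts continue to work during the notice period.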

Caution: Once a term is permanently deleted from the taxonomy, it is removed from all the archive content. If we need the term back, it will be almost impossible to re-tag the same content retrospectively.

3. Splitting Terms

A term is a potential candidate for splitting when:

– It is getting tagged with different kinds of information, or

– Deeper analysis of the term’s sub-categories is required

For example, the term ‘Business Expansion’ could get tagged with the information related to both new products and new office openings. It can, therefore, be split into two terms that are unambiguous and easy to understand.

Also, we should split a term into multiple sub-categories when deeper analysis is required to derive insights or to identify trends.

For example, the term ‘Events’ can capture all the information related to marketing events. But what about online events, ‘Webinars’?

We can add ‘Webinars’ as a sub-category to ‘Events’. Or, we can create two categories within events, ‘Online Events’ and ‘Offline Events’.

But what about investor calls? These are not marketing events. Should we make another category ‘Investor Events’ and make it a sub-category of ‘Financial Results’? Should ‘Offline Events’ have further sub-categories such as ‘Events Participated’ and ‘Events Sponsored’?

These are all difficult subjective questions with no right or wrong answers. And like all subjective questions, the answer is ‘it depends’. It depends on the use-cases of your taxonomy.

If we get lucky, the answer may come directly from the users. But for the most part, we have to draw out answers from users by asking the right questions. For example, the question is not whether they need separate ‘Events Participated’ and ‘Events Sponsored’ terms. The question is how they will use this information and what kind of insights they want to extract from analyzing it.

4. Merging Different Terms

This is the opposite of splitting a term. Potential candidates are terms that are applied inconsistently to the same kind of information, causing confusion and making it difficult to do meaningful analysis or derive insights.

The same information is tagged with either of the two tags and, therefore, the search (based on one of the tags) returns incomplete results.

This happens when two terms have overlapping scope definitions.

Usually, in the zeal of developing a ‘perfect’ taxonomy, we add too many terms. Separate tags are added for similar kinds of information, for example, ‘Data Analytics’ and ‘Big Data’, or ‘Corporate Governance’ and ‘Management Issues’.

When you have to combine different terms to get complete information for the analysis, it is a sign to consider merging those different terms.
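One rough way to spot this sign is to look at which terms users repeatedly combine in their searches. The sketch below is a simple heuristic, not an established method: it assumes each query is available as a set of tags, and the function name and the `min_count` threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def merge_candidates(queries: list[set[str]], min_count: int = 2) -> list[tuple[str, str]]:
    """Return pairs of tags that are combined in at least min_count queries."""
    pair_counts: Counter = Counter()
    for tags in queries:
        # Sort so that each pair is always counted under the same key
        for pair in combinations(sorted(tags), 2):
            pair_counts[pair] += 1
    return [pair for pair, count in pair_counts.most_common() if count >= min_count]


# Hypothetical query log
queries = [
    {"Big Data", "Data Analytics"},
    {"Data Analytics", "Big Data"},
    {"Events"},
    {"Big Data", "Events"},
]

candidates = merge_candidates(queries)
```

The output is only a shortlist for the reviewer; whether two terms should actually be merged still depends on their scope definitions and use-cases.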

5. Re-indexing

The decision to add, delete, split, or merge terms is only half the battle. These decisions affect the existing content that is already tagged and the tagging of new content going forward.

After a new term is added to the taxonomy, all the archive content that matches the scope of the new term should be tagged with it. Similarly, after a term is deleted completely, it should be removed from all the archive content. The same applies to the splitting and merging of terms.

All such changes require reindexing the entire content archive. This is a crucial step in the maintenance of the taxonomy and our software should support it.
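For a merge, the re-tagging of archive content can be sketched as below. This is a minimal example that assumes the archive is a dictionary mapping a document ID to its set of tags; the names are hypothetical, and real software would also need to rebuild its search index afterwards.

```python
def merge_terms(archive: dict[str, set[str]], old_terms: set[str], merged: str) -> int:
    """Replace any of old_terms with the merged term; return how many documents changed."""
    changed = 0
    for tags in archive.values():
        if tags & old_terms:
            tags.difference_update(old_terms)  # drop every old term
            tags.add(merged)                   # apply the surviving term
            changed += 1
    return changed


# Hypothetical archive
archive = {
    "doc1": {"Big Data", "Events"},
    "doc2": {"Data Analytics"},
    "doc3": {"Events"},
}

changed = merge_terms(archive, {"Big Data", "Data Analytics"}, "Data Analytics")
```

Merging and deleting can be automated this way because the mapping is mechanical. Adding or splitting a term is harder, since each archived document has to be re-examined against the new scope definitions.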

6. Review Process

A good taxonomy should remain useful, ideally, for the entire life of the organization.

Taxonomy, therefore, should be treated as an ‘institution’ and all changes should go through a review process. There shouldn’t be any ad-hoc reactions to user requests to add terms or knee-jerk reactions to remove any term. Based on our experience, the following are some of the best practices and guidelines for maintaining a healthy taxonomy.

– History Notes: It is common to forget the reasons for making changes to the taxonomy. Each term, therefore, should maintain the history notes. It should track reasons for modifications and record previous forms. A History Note is especially important to inform new users about when and how a term has changed. It may also include the date discontinued, the term that succeeded, and/or the term that preceded it. The term record should note the date of each change and also identify the individual responsible for it.

– Annual Review: In addition to the review process for individual changes, establish a formal process to review the entire taxonomy at least once a year. Even if individual changes are carefully reviewed, it is important to step back and review the integrity of the whole taxonomy as one single organism. To enable this, your software should allow the export of the taxonomy into a downloadable Excel spreadsheet, complete with history notes, tagging rules, descriptions, and other metadata.

– Communication Strategy: Define a process to communicate the taxonomy changes to all the users. Rather than sharing the changes in an ad-hoc manner, it is important to have a regular rhythm of sharing the updates – once a month or quarter. This update should include not only the ‘New Terms’, ‘Deleted Terms’, ‘Merged Terms’ and ‘Split Terms’, but also the changes under consideration for the coming months. For example, terms marked for deletion in the next month.
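The History Notes and export requirements above can be sketched together. This is an illustration under stated assumptions: the `HistoryNote` and `TermRecord` classes, their fields, and the CSV layout are hypothetical, and CSV is used here only because spreadsheet tools such as Excel can open it.

```python
import csv
import io
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class HistoryNote:
    changed_on: date   # date of the change
    changed_by: str    # individual responsible for it
    change: str        # what changed, including the previous form
    reason: str        # why it changed


@dataclass
class TermRecord:
    name: str
    scope_note: str = ""
    history: list[HistoryNote] = field(default_factory=list)

    def rename(self, new_name: str, changed_on: date, changed_by: str, reason: str) -> None:
        """Rename the term, keeping the previous form in the history."""
        change = f"renamed from '{self.name}' to '{new_name}'"
        self.history.append(HistoryNote(changed_on, changed_by, change, reason))
        self.name = new_name


def export_csv(terms: list[TermRecord]) -> str:
    """Return the taxonomy as CSV text, one row per term."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["term", "scope_note", "history"])
    for term in terms:
        notes = "; ".join(
            f"{n.changed_on.isoformat()} {n.changed_by}: {n.change} ({n.reason})"
            for n in term.history
        )
        writer.writerow([term.name, term.scope_note, notes])
    return buffer.getvalue()


term = TermRecord("Events", scope_note="Marketing events.")
term.rename("Marketing Events", date(2030, 3, 1), "reviewer",
            "investor calls moved to a separate term")
report = export_csv([term])
```

The same history records could also feed the monthly or quarterly update, since each note already carries the date, the change, and the reason.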

Key Idea: Taxonomies fail, not because they are not designed well, but because they are not maintained over time. Most users fail at the disciplined and diligent approach required to manage taxonomies. It is tempting to take the easy, ad-hoc route and make changes as and when required.

The processes that you institutionalize for taxonomy maintenance are critical for the long-term health and utility of your taxonomy. Therefore, the software that you use should be able to support the necessary maintenance functions.

Next, let’s review some of the core requirements that taxonomy software should support.