Imagine you’re writing a book: starting with a blank page is daunting. Implementing a metadata catalog in your organization can bring that same feeling. Facing an empty catalog, it can be overwhelming to see how much there is to do and to figure out where to start.
The following are some of the best practices we have seen data champions adopt in their organizations.
As with any prioritization task, it helps to think of the classic 2x2 matrix:
Make items in the top-right corner the highest priority. Make those in the lower-left corner the lowest priority.
Think about the task of documentation from two perspectives:
- Looking backward, to document existing assets and capture existing knowledge.
- Looking forward, to reduce the burden of looking backward over time.
When writing a book, you might start by outlining its key themes, events, and examples. The equivalent in our world is a backlog. If your organization has already identified assets with knowledge issues or documentation gaps, then you have a backlog.
Use the 2x2 matrix above to prioritize that backlog.
Another place to start is high-impact assets like BI reports and dashboards. How many times have you been challenged on a particular figure on these reports? Answering those challenges is difficult without a clear definition of the figure and the inputs used to calculate it. In this case, start by documenting these figures and the assets involved in their calculations.
Another dimension of this same approach is documenting the assets that are most used. For example, document tables on which users run the most queries.
Both approaches can be treated as “big bet” priorities (top-left corner of the prioritization matrix).
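As a rough illustration of the "most used assets" approach, the sketch below ranks tables by how many queries touch them. The `query_log` records and the table names are hypothetical. In practice the input would come from whatever query history your warehouse exposes, which this sketch does not assume anything about.

```python
from collections import Counter

# Hypothetical query log: one (user, table) pair per query that touched the table.
# The users and table names are made up for illustration.
query_log = [
    ("ana", "sales.orders"),
    ("ben", "sales.orders"),
    ("ana", "sales.customers"),
    ("cy", "sales.orders"),
    ("ben", "finance.ledger"),
    ("ana", "sales.customers"),
]


def rank_tables_by_usage(log):
    """Return (table, query_count) pairs, most queried first."""
    counts = Counter(table for _user, table in log)
    return counts.most_common()


ranking = rank_tables_by_usage(query_log)
for table, query_count in ranking:
    print(f"{table}: {query_count} queries")
```

Tables at the top of the ranking are reasonable first candidates for documentation. Query counts alone are a crude signal, so treat the ordering as a starting point rather than a final priority list.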
The “bus factor”
One more dimension to consider is the bus factor 🚌. If you have a single team member who knows everything about an asset, what happens if that person disappears tomorrow? For example, they could leave your company.
Prioritize assets for which knowledge is limited or held by a single team member rather than shared across the team. These would be “big bet” priorities.
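One way to make the bus factor concrete is to count how many people can explain each asset and flag those at or below a threshold. The sketch below assumes you have already gathered that information, for example through a team survey. The `asset_experts` mapping, its names, and the default threshold of one person are all hypothetical.

```python
# Hypothetical survey result: asset -> set of people who can explain it.
asset_experts = {
    "sales.orders": {"ana", "ben", "cy"},
    "finance.ledger": {"ben"},
    "legacy.etl_job": set(),
}


def at_risk_assets(experts_by_asset, threshold=1):
    """Return (asset, expert_count) pairs for assets known by at most
    `threshold` people, fewest experts first (ties sorted by asset name)."""
    risky = [
        (asset, len(people))
        for asset, people in experts_by_asset.items()
        if len(people) <= threshold
    ]
    return sorted(risky, key=lambda pair: (pair[1], pair[0]))


at_risk = at_risk_assets(asset_experts)
for asset, expert_count in at_risk:
    print(f"{asset}: known by {expert_count} team member(s)")
```

Assets with zero or one expert surface first, which matches the idea of documenting them before that knowledge walks out the door.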
Onboarding of new assets
In combination with the techniques above, use a template to streamline onboarding of new assets. This doesn’t clear an existing backlog, but it ensures that the backlog does not continue to grow with each new asset.
Focus on the documentation of new data products to avoid this pitfall.
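A template is easiest to enforce when it can be checked automatically. The sketch below assumes a hypothetical template with four required fields and reports which ones a new asset's documentation is still missing. The field names and the `new_asset` record are illustrative only, so substitute whatever your own template requires.

```python
# Hypothetical template: fields every new asset should fill in before release.
REQUIRED_FIELDS = ("description", "owner", "source", "refresh_schedule")


def missing_fields(asset_doc, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in `asset_doc`."""
    return [
        field
        for field in required
        if not str(asset_doc.get(field) or "").strip()
    ]


# Example record: "source" is blank and "refresh_schedule" is absent.
new_asset = {"description": "Daily order totals", "owner": "ana", "source": ""}
gaps = missing_fields(new_asset)
print(gaps)
```

A check like this could run as part of whatever review step new data products already go through, so that an asset with an incomplete template never reaches the catalog undocumented.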
It is still worth capturing shared knowledge when it exists in the team. For example, new people joining the team can use this captured knowledge to learn. If many team members are able to contribute, crowdsourcing should be an “incremental / filler task” priority.
Use gamification to further encourage and reward crowdsourced documentation.
You can use our documentation prioritization template to sort your documentation priorities: