My 7 cardinal rules for data collection

Nahtahna Cabanes
May 29, 2024

In the data world, there is a term called “dirty data,” which is a funny phrase to me because it feels like it should be the title of a Michael Jackson song parody. But it is a real term.

Dirty data refers to any data that is incomplete, inconsistent, inaccurate, duplicated, or erroneous in any way.

Last month, I shared my story of how I became a data activist and emphasized the importance of using data in the decision-making process of nonprofit planning.

Before I leave my data discourse, I feel obliged to emphasize that data-driven decision-making is only as good as the data it is based upon.

If the data is incomplete, inaccurate, or inconsistent, it creates strange outliers, skews any analysis, and potentially results in false conclusions.

In other words, poor data leads to poor decisions.

So allow me to underscore the need to ensure your data is, well, clean.

Clean data (which also happens to be a term in the data world) refers to data that is complete, free of errors, and accurate.

Having clean data requires close attention during the data-collection phase of a program. This is much easier said than done in the nonprofit space where the pace can be fast, priorities can be competing, staffing can be low, and funding can be scarce.

It is for this reason that I have worked to explore the “must-haves” of data collection and distilled the elements into my seven cardinal rules. I believe following such rules may help agencies achieve accurate, comprehensive, and useful data.

I share them here:

1. Your metrics should be based on the foundation of your theory of change.

An agency’s theory of change documents the philosophy of how the work leads to specific outcomes. The theory of change will determine the data metrics because it identifies what is to be measured and why. For example, a food pantry may have a theory of change that by addressing the immediate food needs of its clients, it is creating more food security in its community. Consequently, the relevant data points might include staff/volunteer hours, meals served, clients served, and any reported increase in food security among clients. If an organization’s theory of change isn’t clearly articulated, then its data collection will be just as disorganized. Take the time to develop your agency’s theory of change before the data-collection step.

2. Data collection requires leadership buy-in.

Too often, data collection is an add-on to staff job descriptions, rather than a specific component of the job. When leadership does not make data collection integral to the job requirement, data is ignored and missed. Managers must prioritize data collection as part of program execution to maximize the chances of program success.

3. Data collection must be timely.

Data should be recorded as close to real time as possible. A long delay between an event and the recording of it means that data entry becomes heavily reliant on human recall. Human recall is based on human memory, which is inherently flawed and potentially biased. Untimely data entry heightens the risk that the data misrepresents what has occurred. Timely data entry helps reduce that risk.
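For teams that keep their records in a spreadsheet or simple database, one way to keep an eye on timeliness is to compare the date something happened with the date it was entered. The sketch below is only an illustration: the function name `late_entries`, the two-day threshold, and the sample shifts are all hypothetical, and each agency would choose its own cutoff.

```python
from datetime import date

# Hypothetical threshold: how many days may pass between an event and its entry.
MAX_LAG_DAYS = 2


def late_entries(entries, max_lag_days=MAX_LAG_DAYS):
    """Return (label, lag_in_days) for entries recorded too long after the event."""
    late = []
    for label, event_day, entered_day in entries:
        lag = (entered_day - event_day).days
        if lag > max_lag_days:
            late.append((label, lag))
    return late


# Made-up sample data: (what was measured, when it happened, when it was entered).
entries = [
    ("Tuesday pantry shift", date(2024, 5, 7), date(2024, 5, 7)),
    ("Thursday pantry shift", date(2024, 5, 9), date(2024, 5, 20)),
]

flagged = late_entries(entries)
for label, lag in flagged:
    print(f"{label}: entered {lag} days after the event")
```

Entries that show up on a list like this are the ones most likely to have been filled in from memory, so they may deserve a second look.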

4. Estimations are okay, but consistency is key.

There will be times when exact numbers are impossible to collect. This is often true in instances such as crowd size or park visitors. In these cases, estimations may be the only option for identifying impact. That is okay. The caveat here is that if estimations are made, they should be well-defended and uniform across programs, staff, departments, and year over year.

5. If consistency is key to estimations, then training is key to consistency.

In small nonprofits, it is likely that all staff members will be responsible for data entry. If this is the case, it is imperative to provide training on how data should be entered. If you ask 25 team members to report on data metrics, you may get 25 interpretations of what the question is asking, 25 interpretations of how to answer the question, and 25 interpretations of what the answer means. Providing robust training on a) the information being sought, b) how to enter that information, and c) how to understand those responses will help keep the data consistent.

6. Take the time to clean up incomplete and inaccurate data.

“Spring-cleaning” your data establishes a foundation for analysis based on accurate numbers. It might feel time-consuming, but if done right, it is not only a time-saver during grant-reporting season but also a game-changer during the decision-making stage.
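A spring-cleaning pass can start small. As a minimal sketch, assuming records exported from a spreadsheet as rows with three hypothetical columns (`client_id`, `visit_date`, `meals_served`), the function below flags rows that are incomplete, duplicated, or clearly inaccurate. The column names, the rules, and the sample rows are illustrative assumptions, not a standard.

```python
# Hypothetical column names; replace with the fields your agency actually tracks.
REQUIRED_FIELDS = ("client_id", "visit_date", "meals_served")


def audit_records(records):
    """Return row indexes flagged as incomplete, duplicated, or inaccurate."""
    issues = {"incomplete": [], "duplicated": [], "inaccurate": []}
    seen = set()
    for i, row in enumerate(records):
        # Incomplete: any required field is missing or blank.
        values = [row.get(field) for field in REQUIRED_FIELDS]
        if any(v is None or str(v).strip() == "" for v in values):
            issues["incomplete"].append(i)
            continue
        # Duplicated: same client and same date already seen in an earlier row.
        key = (str(row["client_id"]).strip().lower(), str(row["visit_date"]).strip())
        if key in seen:
            issues["duplicated"].append(i)
        seen.add(key)
        # Inaccurate: meal count is not a whole number, or is negative.
        try:
            if int(row["meals_served"]) < 0:
                raise ValueError("negative count")
        except (TypeError, ValueError):
            issues["inaccurate"].append(i)
    return issues


# Made-up sample rows for illustration only.
records = [
    {"client_id": "A1", "visit_date": "2024-05-01", "meals_served": "3"},
    {"client_id": "A1", "visit_date": "2024-05-01", "meals_served": "3"},
    {"client_id": "B2", "visit_date": "", "meals_served": "2"},
    {"client_id": "C3", "visit_date": "2024-05-02", "meals_served": "-4"},
]

issues = audit_records(records)
print(issues)
```

A check like this only points to rows worth reviewing. Deciding whether a flagged row is truly an error still takes someone who knows the program.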

7. Start simple.

You don’t need a sophisticated database to begin data collection. A simple spreadsheet can provide many of the functions of data collection and data analysis. However, as you grow, consider upgrades to your data-collection plan, and explore software that offers nonprofit discounts and free versions.

It is important to state that these are not the cardinal rules for data collection, but simply my cardinal rules. Other data activists may prioritize other items, say, purchasing the best database over scheduling time for data clean-up. But I can safely say that most data activists would agree that the quality of your data will determine the quality of your decisions.

So let’s clean up that dirty data.

