Finding and using data that matters – 6 things to think about
Search any data-related posting and you’ll soon be up to your eyeballs in reports on the promise of the new data era, techniques to help build a better data engine or incorporate new data widgets, and infographics and visualizations showing the kinds of insights you can get with the right mix of techniques.
Got it. Data is big. Really big. In 2021, there were 79 zettabytes of data generated worldwide. By 2025, that number is expected to grow to 175 zettabytes of data needing analysis. That's seven sets of three zeros big. [IDC’s “Data Age 2025” White paper]
It’s also transforming business processes across the board. The big data analytics market is expected to grow to $103 billion by 2023. 97.2% of organizations are investing in big data as well as artificial intelligence, machine learning, and other applications to help them make sense of the massive troves of data they produce each day. It doesn't matter who they are or what they do - every organization is increasingly expected to take advantage of data, no matter what its size, mission, or budget. Data-driven metrics are now the norm, especially if an organization wants to show its value to decision makers and outside investors.
With limited resources, what can small and medium non-profits do with data? How can you choose an approach that combines the right mix of data and tools to provide insight relevant to your organization? And where is the data going to come from?
Finding the specific, relevant data you need, let alone getting it ready for use with the different tools and technologies that interest you, can become very costly, very quickly if you’re not careful.
But don't despair! There are ways to transform the way you do business without breaking your bank or your organization. Just make sure to consider the following BEFORE you invest time or money in any particular suite of tools or datasets:
Think about what you need your data to measure
For many organizations, it isn’t easy to build metrics that provide insight or tell the data-driven story their decision makers need to hear. Certainly, computer networks and analytic engines can spew out numbers, trend lines, percentage breakdowns, and other statistical reports as needed. But what in those numbers is really going to matter? Building good metrics that work with data-driven approaches is as much a knowledge challenge as a technical one. It involves understanding what in the data is important to capture, knowing what kinds of metrics you can get from different combinations of data sources and analytic tools, and building system platforms that will sustain them.
The best metrics will be designed in collaboration with members of technical and operational teams across your organization and will include the tacit knowledge and understanding of how business processes really work in your organization. They will measure the right mix of performance, effectiveness, usability, and impact, whether based on technical, financial, or other operational functions. And they will be based on an understanding of the data that you can get, the outputs and insights that different combinations of tools can give, and the impact on the users who provide and the users who receive your products and services.
Make sure your team members understand what different members mean by “data”
There are a lot of assumptions about data and what it means. Technically and mathematically inclined staff will likely think about data in terms of how machine-ready it is, and whether the contents of fields in a database or spreadsheet can be computationally manipulated through your computer systems to get multiple, automated jobs done. Operational and analytical staff will more likely think of data as pieces of information – whether it comes in the form of interactions with or observations of others, audio or video clips, images, words on a page, or numbers in a spreadsheet. This is information in terms of stuff they can research, collect, process, and analyze with their own brains using the frameworks, methodologies, and insight they learned as part of their professional practice and education. Many of them will not be aware that computers process data much more simply, and that the data has to be broken down into 1’s and 0’s before processing can occur.
The fundamentally different frameworks that people in your organization use to understand data can lead to a lot of confusion about what data-driven approaches may be appropriate to your business process. But you need the technical, operational, and analytical understanding of data, as well as the decision makers' perspective, if you are going to get full value out of any approach. And you’re also going to need to understand what is doable now and what will take time to develop.
To that end, it is important for everyone in your organization to understand data strengths and limitations from different viewpoints and to use that information to figure out what you need to do now, given your budget and resources, and what you can plan to implement in the future.
Think about how you already use data in your organization
No doubt you are already using data and have developed a series of techniques and methodologies to capture, analyze, and turn that data into information. Some of your processes and the data that supports them are likely well-documented, cleaned, and ready to ingest into your organization's systems. Others likely require significant cleaning or manual inputs, and may need more thought about how to formalize and better incorporate them into your approach. But whether it's ready for ingest or not, that body of data, data sources, and techniques already being used by the systems and people in your organization is likely going to be 85-90% of the source data you will need to transform into your new data-driven process.
Understanding what you already have – including which parts of your data process can be fully or partially automated, where you will need human inputs for higher-level decision making, and where you have data gaps – will go a long way towards helping you figure out what you can already leverage into a data-driven process before you invest.
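One way to make that stock-take concrete is a simple inventory of what you already hold. The sketch below is only an illustration: the source names, the `DataSource` fields, and the `summarize` helper are all hypothetical, and a shared spreadsheet would do the same job.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    name: str
    machine_ready: bool  # cleaned, documented, and ready to ingest
    automation: str      # "full", "partial", or "manual"


def summarize(sources, needed):
    """Group existing sources by readiness and list any data gaps."""
    have = {s.name for s in sources}
    return {
        "ready": sorted(s.name for s in sources if s.machine_ready),
        "needs_cleaning": sorted(s.name for s in sources if not s.machine_ready),
        "needs_human_input": sorted(s.name for s in sources if s.automation != "full"),
        "gaps": sorted(set(needed) - have),
    }


# Hypothetical inventory for a small non-profit.
inventory = [
    DataSource("donor_database", machine_ready=True, automation="full"),
    DataSource("volunteer_sign_in_sheets", machine_ready=False, automation="manual"),
    DataSource("program_surveys", machine_ready=False, automation="partial"),
]
# Hypothetical list of the data your planned metrics depend on.
needed = ["donor_database", "program_surveys", "regional_demographics"]

summary = summarize(inventory, needed)
for category, names in summary.items():
    print(f"{category}: {names}")
```

In this made-up example, the only gap is `regional_demographics`, which is the kind of item you would then go looking for outside your organization.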
Consider what other data might be out there that you can get to fill any data gaps
There are thousands of open data portals, subscription services, and proprietary data sources available to you. But the number of data sources that are relevant to your field, let alone to any specific function or project, will undoubtedly be limited. And depending on the type of data you need, you will likely find that a lot of the data you really need for decision making will need to be processed before it can be used with your tools.
If you’re looking to leverage government-funded open data, there is definitely plenty of machine-ready data available, although figuring out what is relevant and aligning it to your needs will take effort. If marketing, product, or financial data is what you want to use, that also tends to be structured and available, but expensive. Other data that you might need will be proprietary, firewalled, protected by privacy laws, or dependent on human collection, human production, or human interpretation.
Exploring what is out there that may be relevant to your decision making before you start will guide your choice of appropriate tools and techniques. It will also help you understand what is doable now with current tools, and what might be doable in the near future as innovations in data technologies come of age.
Think about how much it will cost to transform the data so you can use it with your tools of choice
Data isn’t free. Not even the free stuff. For every piece of data or record that you want to use, there is a cost to preparing data so it can be ingested into, managed by, accessed from, and output to services and tools within your IT system. Depending on the type and quantity you need, what it looks like, how you will need to collect and process it, and what licensing and intellectual property issues you must contend with, costs can add up very, very quickly. And this is before you consider any costs for the technical tools that you might use to exploit the data you need.
Luckily, there is a growing number of increasingly cheap tools you can use to get data into your system – e.g., data collection and mining tools, crowdsourcing options, drones, and opportunities to collect data from humans as they interact with tools and technology. And new technologies that make it easier to take advantage of data are emerging every day. Understanding what it will cost to extract, transform, and load that data into whatever tools you want to use will go a long way toward making sure the data-driven process you implement will work for your organization.
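A back-of-the-envelope calculation can show why even free data carries a cost. The sketch below is a rough illustration, not a costing method: the `preparation_cost` function, its parameters, and every figure in it are assumptions chosen for the example.

```python
def preparation_cost(records, acquisition_per_record, cleaning_minutes_per_record,
                     hourly_rate, licensing_fee=0.0):
    """Rough cost of getting a dataset ready for use (all inputs are estimates)."""
    acquisition = records * acquisition_per_record
    cleaning = records * cleaning_minutes_per_record / 60 * hourly_rate
    return acquisition + cleaning + licensing_fee


# Hypothetical comparison: "free" open data that needs manual cleaning
# versus purchased data that arrives ready to load.
free_open_data = preparation_cost(
    records=1_000, acquisition_per_record=0.0,
    cleaning_minutes_per_record=3, hourly_rate=30.0,
)
purchased_data = preparation_cost(
    records=1_000, acquisition_per_record=0.5,
    cleaning_minutes_per_record=0, hourly_rate=30.0, licensing_fee=200.0,
)

print(f"Free open data: ${free_open_data:,.2f}")
print(f"Purchased data: ${purchased_data:,.2f}")
```

With these invented numbers, staff time spent cleaning the free dataset costs more than buying the prepared one. Your own figures may point the other way, which is the reason to run the numbers before committing.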
Think big, pilot small, then scale up accordingly
In a world of limited resources, it makes sense to think big about possibilities and test different options on a small scale before you settle on any single approach. By starting small, you might find that the cheapest option is good enough as is or, alternatively, that it needs significant, costly pre-processing to make it usable. You might also find that many solutions that work at small scale will not be effective or cost efficient at larger scales. This may mean your organization's processes themselves will need to be adapted, or it may mean that you should look for another option. Working out the kinks before you invest will help you calculate the ultimate cost of data, tool licenses, system build-out, user training, and new protocols.
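To see how an option that wins at pilot scale can lose at full scale, here is a minimal sketch that assumes costs are a fixed setup fee plus a per-record charge. That linear model, the option names, and all the figures are hypothetical; real costs rarely scale this neatly.

```python
def total_cost(records, fixed_cost, cost_per_record):
    """Simple linear cost model: setup cost plus a per-record cost."""
    return fixed_cost + records * cost_per_record


# Hypothetical options measured during a pilot.
options = {
    "manual_cleanup": {"fixed_cost": 0.0, "cost_per_record": 2.0},
    "automated_pipeline": {"fixed_cost": 5_000.0, "cost_per_record": 0.25},
}


def cheapest(records):
    """Return the name of the lowest-cost option at a given volume."""
    return min(options, key=lambda name: total_cost(records, **options[name]))


pilot_choice = cheapest(500)      # pilot volume
scaled_choice = cheapest(50_000)  # projected full-scale volume

print(f"Cheapest at pilot scale: {pilot_choice}")
print(f"Cheapest at full scale:  {scaled_choice}")
```

Under these assumptions, manual cleanup is cheaper for the pilot, but the automated pipeline is cheaper once volume grows, so a pilot result should be projected to the scale you actually expect before you commit.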
These steps will take some time and resources to go through. However, a small investment at the beginning to make sure you get the right combination of data and tools to drive your process will save considerably more over the long term. And that is always good for your organization’s bottom line.
Anne is a data strategist and dot connector with 17+ years of experience helping defence, intelligence, and public-sector customers take advantage of existing and emerging data-driven technologies to make sense of their worlds. She's currently in start-up mode working on a prototype to help users quickly find quality datasets that are skill-level friendly, budget-friendly, and can provide meaningful outcomes. Find her at www.linkedin.com/in/anne-russell-data-strategist.