Data Quality Management
Checklists
Level One: No way to determine data quality.
Level Two: System manages data validation and authorisation workflow.
Level Three: System logs show origins of datasets and allows rollback.
Level Four: System manages depreciation over time.
Level Five: Data quality integrated into data systems.
Reference Resources
Principles of Data Quality
Chapman, A. D. 2005. Principles of Data Quality, version 1.0. Report for the Global Biodiversity Information Facility, Copenhagen.
Quality Data Capture
The quality of the data entered into a GIS directly affects the quality of the analysis at the other end: “rubbish data in, rubbish data out”.
Capturing quality data can be time-consuming and costly, depending on the level of accuracy required.
Human interaction can affect data quality, so the capture process needs to be carefully thought through.
If using community groups or other members of the public to collect data, the data collection process needs to be made very simple and the accuracy level may therefore need to be relaxed.
Data collected without GPS needs to be used very carefully, as an accuracy level can be difficult to assign after the fact. For example, a boundary hand drawn on an older map can appear as a very wide line (20m) once imported into a GIS. One way of dealing with this is to not import the data, but have the system refer the user to the original document.
Applying Relevancy to Data
Rating data according to its accuracy is an extremely useful long-term goal. The data’s original level of accuracy at the time of collection can be recorded, as well as an ongoing data age rating: the accuracy of data wanes over time, giving it a “data half-life”. For example, planning decisions based on old, inaccurate data can cause huge problems (and cost time and money).
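The “data half-life” idea can be illustrated with a short sketch. This is only one possible reading of the concept, not a documented method: the function name `current_rating`, the numeric rating scale and the exponential decay curve are all assumptions made for illustration.

```python
from datetime import date
from typing import Optional


def current_rating(initial_rating: float,
                   collected: date,
                   half_life_years: Optional[float],
                   today: date) -> float:
    """Return the rating of a dataset after it has aged.

    Assumption: the rating halves every `half_life_years`.
    A half-life of None means the data is treated as not deteriorating.
    """
    if half_life_years is None:
        return initial_rating
    if half_life_years <= 0:
        raise ValueError("half_life_years must be positive")
    # Age in years; data dated in the future is treated as brand new.
    age_years = max((today - collected).days, 0) / 365.25
    return initial_rating * 0.5 ** (age_years / half_life_years)
```

Under these assumptions, a dataset rated 80 when collected, with a half-life of ten years, would be rated about 40 ten years later, while a dataset with no half-life keeps its original rating.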
Environment Bay of Plenty has developed a set of data quality practices.
A summary of a work session on data quality management at the 2010 National Workshop is available to view.
EBoP System Manages Data Quality Ratings and Depreciation
Environment Bay of Plenty is about halfway through a project to build an integrated biodiversity suite. Jim Fretwell says the project aims to build a consistent set of systems that share a single spatial web interface. Where this is achieved, administrators can work on their own systems, while the users of the data, both within and outside the organisation, can access it via a single web interface.
One of the problems that the project addresses is the storage and use of information of differing quality. EBoP wants to store all the information that it has, even if it originates from informal observations, without compromising the integrity of high quality data. It has therefore developed a data quality rating and depreciation system. Data that is collected using highly controlled methodologies is given a high initial rating, and each type of data is given a relative depreciation rate (e.g. GPS data doesn't deteriorate, but the accuracy of data such as wetland extents will deteriorate over time).
The data ratings and depreciation system provides a framework that allows the comparison of information and the ability to present the most accurate/important information to the public or internal staff, while giving them the ability to drill into all information if they need to. It also facilitates the evaluation of information when data-sharing with other organisations.
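A minimal sketch of how such a framework might compare records is given below. It is not EBoP's implementation: the `Observation` class, the 0 to 100 rating scale, the compounding annual depreciation rate and all of the example figures are hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    source: str
    initial_rating: float       # hypothetical 0-100 scale, set at collection
    annual_depreciation: float  # assumed fraction of rating lost per year; 0.0 = no deterioration
    age_years: float

    def rating(self) -> float:
        # Assumption: depreciation compounds once per year of age.
        return self.initial_rating * (1.0 - self.annual_depreciation) ** self.age_years


def rank_by_rating(observations: List[Observation]) -> List[Observation]:
    """Order records from highest to lowest current rating."""
    return sorted(observations, key=lambda o: o.rating(), reverse=True)


# Illustrative records only; none of these figures come from EBoP.
records = [
    Observation("informal observation", 40.0, 0.05, 1.0),
    Observation("wetland extent survey", 90.0, 0.10, 8.0),
    Observation("GPS survey point", 90.0, 0.0, 8.0),
]

ranked = rank_by_rating(records)
best = ranked[0]     # the record presented first
others = ranked[1:]  # still stored, available when the user drills in
```

In this sketch nothing is discarded: lower-rated records stay in the list, so informal observations can be stored alongside high quality data without displacing it.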