Test data management: the hidden GDPR challenge
- Published: Tuesday, 14 November 2017 12:54
As organizational preparation for the upcoming General Data Protection Regulation (GDPR) legislation gathers speed, Dan Martland and Iain Finlayson highlight an area which can easily be overlooked.
Businesses across Europe face a significant challenge in preparing for the General Data Protection Regulation (GDPR), which takes effect in May 2018. While the headline-grabbing penalties have been widely understood, the regulation is complex and in many areas potentially open to interpretation. Even so, there are several areas for which businesses clearly need to make provision, and test data management is one of them.
We believe that poor test data management is potentially the single biggest threat to a business in terms of breaching the GDPR legislation. Test data management typically includes:
- The targeting and creation of non-production data sets that mimic actual data so that IT and the business can perform accurate and relevant tests before releasing a new service or product, or updating an existing one;
- The building of synthetic data where it is not possible or acceptable to use ‘real’ data;
- Ensuring the data can be shared across IT and business teams;
- Enabling data to ‘time travel’ to support complex business test scenarios;
- Allowing for effective backup and restore capabilities;
- Supporting the effective build and deployment of the corresponding test environments.
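As an illustration of the 'time travel' item in the list above, the following is a minimal sketch that shifts every date in a test data set by the same offset, so that the intervals between dates survive the move. The function name `time_travel` and the sample policy record are hypothetical, and a real tool would also need to handle timestamps held in other formats.

```python
from datetime import date, timedelta


def time_travel(records, date_fields, days):
    """Return copies of the records with each listed date field shifted by `days`."""
    offset = timedelta(days=days)
    shifted = []
    for record in records:
        copy = dict(record)  # leave the source records untouched
        for field in date_fields:
            if copy.get(field) is not None:
                copy[field] = copy[field] + offset
        shifted.append(copy)
    return shifted


# Hypothetical test record: a policy that renews one year after it starts.
policies = [{"policy_id": 1, "start": date(2017, 1, 1), "renewal": date(2018, 1, 1)}]

# Move the whole data set 365 days into the past to test a renewal scenario.
aged = time_travel(policies, ["start", "renewal"], days=-365)
```

Because every field moves by the same amount, a rule such as 'renewal falls 365 days after start' still holds in the shifted data.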
Here we highlight the key challenges around GDPR and how the testing industry can help to address them:
Transparency and consent
Individuals will need to provide specific and active consent covering the use of their data, and this demands a change in processes around the gathering and withdrawal of consent.
Companies will need to define and then manage legitimate data use and the length of storage time before archival and deletion. Test data management will be pivotal in supplying evidence to regulators that a business has due diligence in place, particularly for exceptions around legitimate business uses such as pursuing outstanding debts, or holding on to an address for a product under warranty.
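A retention rule of this kind can be expressed as a simple check. The sketch below is illustrative only: the retention periods, the field names and the `legitimate_hold` flag are assumptions made for the example, not values taken from the regulation.

```python
from datetime import date

# Hypothetical retention periods in days, keyed by the purpose of collection.
RETENTION_DAYS = {"marketing": 365, "warranty": 365 * 5}


def overdue_for_deletion(records, today):
    """Return the ids of records held past their retention period.

    Records flagged with a legitimate business hold (for example an
    outstanding debt) are skipped rather than reported.
    """
    overdue = []
    for record in records:
        if record.get("legitimate_hold"):
            continue
        limit = RETENTION_DAYS[record["purpose"]]
        if (today - record["collected"]).days > limit:
            overdue.append(record["id"])
    return overdue


# Hypothetical sample data.
records = [
    {"id": 1, "purpose": "marketing", "collected": date(2007, 1, 1), "legitimate_hold": False},
    {"id": 2, "purpose": "marketing", "collected": date(2007, 1, 1), "legitimate_hold": True},
    {"id": 3, "purpose": "warranty", "collected": date(2017, 6, 1), "legitimate_hold": False},
]
flagged = overdue_for_deletion(records, today=date(2017, 11, 14))
```

Running a check like this against every test environment, and keeping the output, is one way to build the evidence trail described above.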
While a copy of real information may have been used to test systems in the past, this simply cannot continue; individuals need to give explicit and informed consent that their data can be used for testing.
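In practice this means that any extract taken for testing has to be filtered on recorded consent. The following sketch assumes a hypothetical `Customer` record that carries a set of consented purposes; how consent is actually stored will differ from system to system.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    customer_id: int
    email: str
    # Purposes the individual has actively agreed to (hypothetical model).
    consents: set = field(default_factory=set)


def eligible_for_testing(customers, purpose="testing"):
    """Keep only customers who have explicitly consented to the given purpose."""
    return [c for c in customers if purpose in c.consents]


# Hypothetical sample data.
customers = [
    Customer(1, "a@example.com", {"marketing", "testing"}),
    Customer(2, "b@example.com", {"marketing"}),
    Customer(3, "c@example.com"),
]
test_extract = eligible_for_testing(customers)
```

A customer with no recorded consent is excluded by default, which matches the 'specific and active' wording: absence of a record is treated as a refusal.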
The GDPR also states that individuals have the right to data portability. It allows citizens to move, copy or transfer personal data easily from one IT environment to another in a safe and secure way.
Data portability will also require testing, and ensuring compliance for data in flight will be a major exercise for organizations that have high volumes of live data in non-protected environments.
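One basic portability test is a round trip: export an individual's data in a machine-readable format, import it again, and confirm nothing was lost or altered. The sketch below uses JSON as the format; the function names and the sample record are hypothetical, and a real export would cover every system that holds the individual's data.

```python
import json


def export_personal_data(record):
    """Serialize one individual's data to a structured, machine-readable string."""
    return json.dumps(record, sort_keys=True, ensure_ascii=False)


def import_personal_data(payload):
    """Load a previously exported record."""
    return json.loads(payload)


# Hypothetical record, including a non-ASCII name to exercise the encoding.
original = {
    "customer_id": 1001,
    "name": "Zoë Example",
    "addresses": [{"type": "home", "postcode": "EH1 1AA"}],
}
payload = export_personal_data(original)
restored = import_personal_data(payload)
```

A test suite would assert that `restored` equals `original` for a representative range of records, including edge cases such as empty fields and non-ASCII characters.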
All of the above will drive new risk-based test scenarios, which in turn will affect how late-phase testing such as User Acceptance Testing (UAT) is defined, planned and undertaken. This will also increase the number of ‘Must Tests’ that need to be executed within a UAT test window.
Data masking for anonymization
Customer data is critical when building new services, and regardless of how companies and other organizations are using that data, they will all now be facing the same GDPR challenge: how to either mask customer data or build accurate and usable synthetic data, while retaining referential integrity for testing and ensuring an audit trail is in place for compliance purposes.
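To show what retaining referential integrity means in practice, here is a minimal sketch that masks values with a keyed hash (HMAC-SHA256), so the same input always produces the same masked output and joins between tables still work. The key, table names and columns are hypothetical. Note also that keyed hashing is a form of pseudonymization rather than full anonymization, so whether it is sufficient for a given data set is a compliance decision, not a technical one.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would be held in a secrets manager.
SECRET_KEY = b"replace-with-a-managed-secret"


def mask_value(value, key=SECRET_KEY):
    """Deterministically replace a value with a keyed hash (truncated for readability)."""
    digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def mask_table(name, rows, columns, audit_log):
    """Mask the given columns and record what was masked, never the values themselves."""
    masked = [
        {k: mask_value(v) if k in columns else v for k, v in row.items()}
        for row in rows
    ]
    audit_log.append({"table": name, "columns": sorted(columns), "rows": len(rows)})
    return masked


# Hypothetical source tables that share a customer_id key.
customers = [{"customer_id": "C1001", "name": "Alice Example"}]
orders = [{"order_id": 1, "customer_id": "C1001", "total": 42.5}]

audit_log = []
masked_customers = mask_table("customers", customers, {"customer_id", "name"}, audit_log)
masked_orders = mask_table("orders", orders, {"customer_id"}, audit_log)
```

The masked `customer_id` is identical in both tables, so an order can still be joined to its customer, while the audit log records which columns were masked in which table.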
While the use of synthetic data - where the formatting stays the same, but values are altered to prevent identification - may be preferable, it is not always possible.
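The idea of keeping the format while altering the values can be sketched as follows. The function `synthesize_like` is a hypothetical helper that replaces each digit with a random digit and each letter with a random letter of the same case, leaving spaces and punctuation in place. It does not preserve checksums or other validation rules, which is one reason synthetic data is not always possible.

```python
import random
import string


def synthesize_like(value, rng):
    """Generate a random value with the same character layout as `value`."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        elif ch.islower():
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # keep separators such as spaces and hyphens
    return "".join(out)


# A seeded generator makes the synthetic data set reproducible between test runs.
rng = random.Random(2017)
synthetic_account = synthesize_like("AB12-3456 cd", rng)
```

Field lengths and separators match the source value, so input validation and screen layouts can still be exercised without any real value being present.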
Data masking strategies require powerful tools to protect the real source data, using techniques such as data snapshots, where users no longer work on the live database but on an anonymized snapshot of the data. Another strategy is dynamic anonymization, where the result of a query is anonymized in real time so there is no need to take a snapshot.
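Dynamic anonymization can be pictured as a thin layer between the tester and the database. The sketch below uses an in-memory SQLite database and a hypothetical `anonymized_query` wrapper that masks sensitive columns as results are read; the stored data is never changed. The column list and the masking rule are assumptions for illustration, and commercial tools apply this inside the database or a proxy rather than in application code.

```python
import sqlite3

# Hypothetical list of columns that must never reach a tester unmasked.
MASKED_COLUMNS = {"name", "email"}


def mask(value):
    """Keep the first character and replace the rest with asterisks."""
    text = str(value)
    return text[:1] + "*" * (len(text) - 1)


def anonymized_query(conn, sql, params=()):
    """Run a query and mask sensitive columns in the result rows."""
    cursor = conn.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    return [
        {
            col: mask(val) if col in MASKED_COLUMNS and val is not None else val
            for col, val in zip(columns, row)
        }
        for row in cursor
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, email TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Alice', 'alice@example.com')")

rows = anonymized_query(conn, "SELECT id, name, email FROM customers")
```

Because this wrapper matches on result column names, a query that renames a column with an alias would bypass it; a production approach would need to enforce masking at a level the tester cannot work around.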
Tools and data discovery
A retail client recently conducted a discovery exercise and found terabytes of customer data over ten years old. Under GDPR, the retailer needs to be able to find that data and justify holding it for ten years.
To avoid non-compliance, documentation of the use of personal data in all test environments is necessary, including backups and personal copies created by testers. A solid understanding of all real data sources and the current location of data is key to ensuring that no real personal data is exposed during software development, maintenance and test phases.
Testing teams can help search for the data using the same tools that they would typically use for test automation.
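A simple form of such a search is a pattern scan over the contents of test files and database extracts. The sketch below looks for email addresses and UK-style postcodes using regular expressions; the patterns are deliberately rough and hypothetical, will produce both false positives and misses, and are no substitute for a dedicated discovery tool.

```python
import re

# Rough, illustrative patterns; real discovery tools use far richer rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "uk_postcode": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}\b"),
}


def scan_text(text):
    """Return (line number, pattern label, matched text) for each suspected item."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((line_no, label, match.group()))
    return findings


# Hypothetical content of a test fixture file.
sample = "id,contact\n1,alice@example.com\n2,Ship to EH1 1AA\n3,n/a"
findings = scan_text(sample)
```

Each finding gives the line and the type of data suspected, which is a starting point for the documentation of personal data in test environments described above.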
In summary, GDPR compliance is one of the major IT challenges, with initial compliance followed by continuous testing to ensure the regulation is still being adhered to. A robust test data management strategy will save money – and indeed could save an organization, full stop.
Dan Martland and Iain Finlayson, Edge Testing Solutions. Edge Testing Solutions offers test data management and assessment of ongoing GDPR compliance services.