Business is driven by innovation. Competition thrives on companies racing to deliver the latest features and connected services to their customers. Yet, with a focus on providing the best customer experiences, such innovation demands huge volumes of data to provide the insights needed.

To stay innovative and competitive, increasing numbers of car manufacturers are branching out into the connected car space. When fitted with sensors connected to the Internet of Things (IoT), a connected car reveals valuable insight into customers. Indeed, a single vehicle can produce many gigabytes of data each year, ranging from technical details on vehicle health to information on how and where it is driven. With the right tools and analysis, this feedback then becomes fuel for innovation, helping manufacturers make better-performing vehicles for their customers.

In this environment, the General Data Protection Regulation (GDPR) has been a positive but disruptive force. While it protects consumers from those who would try to steal their information, the regulation also makes it more difficult for car companies to collect the very data that helps them continuously improve and innovate. The challenge for manufacturers is to ensure they can keep using that data without violating the data rights of their customers. In a post-GDPR world this challenge has become harder, but a new approach to data collection can square the circle.

Regulation meets reality

Privacy regulations distinguish between data that can be used for customer intelligence and data that constitutes personally identifiable information (PII). PII is information that can be used to identify an individual – though in practice, it is far more complicated. PII is a legal term, not a technical one. This means its scope can differ drastically by jurisdiction, and its definition is always changing. Depending on where a person lives, it can range from their name to their IP address.

For a car manufacturer, the most valuable data a connected car can provide concerns driver behaviour and usage. This shows how the user interacts with the components that make up the car – from how often they engage the windscreen wipers to how they use their hazard lights. In isolation, data from a single car does not say much. However, when that same data is pulled and analysed from potentially millions of vehicles, manufacturers can properly evaluate performance and drive improvements.

However, under GDPR, what can be defined as PII has broadened considerably, and now includes much of the data that connected car companies have been collecting for years. The reasoning is sound – anyone with access to the right databases can use even an anonymised vehicle identification number (VIN) to find out who the owner is.
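To see why an anonymised identifier can still lead back to an owner, consider a minimal Python sketch. It assumes, purely for illustration, that "anonymised" means replacing the VIN with an unsalted SHA-256 hash; the article does not say how manufacturers actually do it. The function names, the registry and the sample VINs are all hypothetical.

```python
import hashlib


def pseudonymise(vin: str) -> str:
    """Replace a VIN with an unsalted SHA-256 digest (assumed scheme)."""
    return hashlib.sha256(vin.encode("utf-8")).hexdigest()


def reidentify(pseudonym: str, registry: dict):
    """Recover the owner behind a pseudonym, given a VIN-to-owner registry.

    The hash is deterministic, so anyone holding a registry of real VINs
    can hash each one and match it against the pseudonym.
    """
    lookup = {pseudonymise(vin): owner for vin, owner in registry.items()}
    return lookup.get(pseudonym)


# Hypothetical registration database held by a third party.
registry = {
    "EXAMPLEVIN0000001": "Owner A",
    "EXAMPLEVIN0000002": "Owner B",
}

# A record in the data lake carries only the pseudonym, yet the owner
# can still be found by anyone with access to the registry.
token = pseudonymise("EXAMPLEVIN0000002")
owner = reidentify(token, registry)
```

The point of the sketch is only that a stable identifier, however scrambled, remains linkable once a matching database is available.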

GDPR, of course, does not prevent car companies from collecting this data, but it does put heavy restrictions on how they can use it and who has access. A customer must provide consent for you to collect and use their data, and they retain the right to have it deleted whenever they want. Companies of all kinds continue to experiment with how this can be achieved, yet for connected car companies there is a larger security issue at stake.

Time for a rethink

Under GDPR, you cannot allow unrestricted access to the bulk data you are collecting. Typically, connected vehicle companies will aggregate all their big data into one or several data lakes. This then allows the company’s data scientists to easily explore, analyse and seek insight using as much raw data as possible. The problem is that these environments contain both customer intelligence and PII, and companies too often make little attempt to distinguish between them or restrict access to data that’s protected under GDPR. This gives a potential bad actor plenty of opportunities to steal customer information and cause reputational and regulatory harm to the business.

However, connected car manufacturers can avoid this risk simply by aggregating their data in a different manner. Data should be stored by its classification – whether it’s personal or commercially valuable – and user permissions applied to ensure that PII cannot be accessed by just anyone. To heighten security further, they should also apply unique policies to personal data, ensuring that even those with access – or those who have acquired access through nefarious means – cannot do harm. The system could be set up, for example, to detect and stop anyone performing bulk data transfers or using unfamiliar third-party applications on the data source.
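The approach described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a real data lake implementation: the role names, the two classifications and the bulk-read threshold are all hypothetical, and a production system would enforce them in the storage platform itself.

```python
from dataclasses import dataclass, field
from enum import Enum


class Classification(Enum):
    PERSONAL = "personal"      # data protected under GDPR
    COMMERCIAL = "commercial"  # non-identifying usage and performance data


# Hypothetical mapping of roles to the classifications they may read.
ROLE_PERMISSIONS = {
    "data_scientist": {Classification.COMMERCIAL},
    "privacy_officer": {Classification.COMMERCIAL, Classification.PERSONAL},
}

# Hypothetical policy: largest number of personal records in one read.
MAX_PERSONAL_RECORDS_PER_READ = 100


class AccessDenied(Exception):
    """Raised when a read breaks a permission or a policy."""


@dataclass
class ClassifiedStore:
    # One partition per classification, so personal and commercial
    # records never sit in the same place.
    partitions: dict = field(
        default_factory=lambda: {c: [] for c in Classification}
    )

    def write(self, classification, record):
        self.partitions[classification].append(record)

    def read(self, role, classification, limit):
        # User permissions: unknown roles can read nothing.
        if classification not in ROLE_PERMISSIONS.get(role, set()):
            raise AccessDenied(f"{role} may not read {classification.value} data")
        # Extra policy on personal data: block bulk transfers even for
        # roles that do have access.
        if (classification is Classification.PERSONAL
                and limit > MAX_PERSONAL_RECORDS_PER_READ):
            raise AccessDenied("bulk read of personal data blocked by policy")
        return self.partitions[classification][:limit]
```

In this sketch a data scientist can read commercial records freely, while a request for personal data is refused outright, and even a privacy officer is stopped from pulling more than the threshold in a single read.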

Proper data classification and a greater use of user permissions allow companies to secure their data lakes without jeopardising their utility. Data scientists would still be able to view the data in aggregate and spot potentially valuable trends on the surface. Bad actors, however, would not be able to reach below the surface and steal data for their own purposes.
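One way to picture an aggregate-only view is the short Python sketch below. It is an assumption-laden illustration: the record fields, the function name and the minimum group size are hypothetical, and suppressing small groups is one common safeguard rather than something the approach above prescribes.

```python
from collections import defaultdict

# Hypothetical threshold: groups smaller than this are suppressed so that
# an average cannot be traced back to one or two vehicles.
MIN_GROUP_SIZE = 3


def aggregate_by_model(records, metric, min_group_size=MIN_GROUP_SIZE):
    """Return the mean of `metric` per vehicle model.

    Only the model name and the averaged metric leave this function;
    identifiers such as the VIN are never included in the result.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record["model"]].append(record[metric])
    return {
        model: sum(values) / len(values)
        for model, values in groups.items()
        if len(values) >= min_group_size
    }
```

A data scientist calling this function sees the trend per model, but has no path from the result back to an individual car or its owner.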

Those who fail to innovate will be left in the dust of those who do. Yet innovation begins by having the courage to take a hard look at current processes and change them for the better. Only then does an organisation realise what is truly possible. For many automotive companies, connected cars are the future, but data privacy demands a rethink of how they organise and analyse their data. With a fresh, more secure approach to data lakes, it becomes easy for manufacturers to innovate while keeping their customers and regulators happy.