How Do You Feed Your Digital Twin? Part 1: Data Strategy

Our role in the research project IoT4CPS (Trustworthy IoT for CPS) is to design and implement a demonstrator for automotive driving scenarios, addressing cybersecurity and privacy challenges related to the various stakeholders engaged along the entire lifecycle of connected cars.

The IoT4CPS demonstrator needs to be fed with a variety of data, both real-time and historical. The data needs to contain information related to cybersecurity and privacy issues; it may arrive as multi-tenant streams linked to smart cities, smart roads, smart manufacturing, smart driving, smart maintenance of the connected car, and other external systems. At the same time, the data needs to relate to the lifecycle stages of the connected car: the engineering of its IoT/CPS subsystems, its manufacturing, its driving and maintenance, and finally its disposal (end-of-life). The following figure captures the variety of perspectives to be addressed in the IoT4CPS demonstrator.

A conceptual view of the IoT4CPS automotive driving apps.

A current challenge in data-centric research projects like IoT4CPS is acquiring the data needed to analyse and interpret results and to turn them into effective decisions. As a metaphor for this decision-making intelligence, we use the term “Digital Twin”.

The design of the Digital Twin was a creative and joyful process. As soon as we faced early implementation, our hope of acquiring sufficient data started to fall apart. The automotive industry typically does not publish data due to its commercially sensitive nature! Hence, we looked at relevant open-source systems and public datasets that can be reused or used as a basis to construct specific data repositories of interest to the project’s business cases.

In this post, we present our data strategy to acquire the data and prepare it for analysis, visualisation and decision-making. The strategy consists of the following steps:

  • Search for relevant public data sources related to automotive lifecycle data: design, engineering and manufacturing of IoT devices and their CPSs, operational (driving) data, and maintenance and disposal data.
  • Search for relevant public data sources related to automotive cybersecurity, privacy and safety. This data needs to address the various cybersecurity lifecycle phases, from monitoring and discovery to incident response and mitigation management.
  • Analyse, normalise and fuse the public datasets to identify relationships, trends and anomalies that help in reacting to security- and/or safety-related vulnerabilities in the infrastructure.
  • Design methods to assimilate the diverse datasets in support of the decision-making processes.
  • Develop security, safety and privacy models and risk-based metrics to help security and safety analysts decide on relevant hypotheses.
  • Assess the value added to the overall security and safety of the system.
  • Assess how much improvement can be expected from the proposed approach.
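
To make the "normalise, fuse and look for anomalies" step more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the two tiny datasets, their field names (vehicle, ts, speed_kmh, event), and the z-score threshold are hypothetical and do not come from the IoT4CPS project or from any real public dataset. It only shows the general shape of the idea: map two differently structured sources onto one schema, join them on vehicle and time, and flag samples that deviate strongly from the rest.

```python
from statistics import mean, pstdev

# Hypothetical driving samples from one public source (illustrative schema).
driving_data = [
    {"vehicle": "car-1", "ts": 0, "speed_kmh": 50.0},
    {"vehicle": "car-1", "ts": 1, "speed_kmh": 52.0},
    {"vehicle": "car-1", "ts": 2, "speed_kmh": 51.0},
    {"vehicle": "car-1", "ts": 3, "speed_kmh": 49.0},
    {"vehicle": "car-1", "ts": 4, "speed_kmh": 50.0},
    {"vehicle": "car-1", "ts": 5, "speed_kmh": 120.0},
]

# Hypothetical security events from a second source with a different schema.
security_events = [
    {"vehicle_id": "car-1", "time_s": 5, "event": "can_bus_flood"},
]


def normalise_driving(record):
    """Map a driving record onto the common schema (vehicle_id, time_s)."""
    return {
        "vehicle_id": record["vehicle"],
        "time_s": record["ts"],
        "speed_kmh": record["speed_kmh"],
    }


def fuse(samples, events):
    """Attach to each driving sample the events sharing its vehicle and time."""
    index = {}
    for e in events:
        index.setdefault((e["vehicle_id"], e["time_s"]), []).append(e["event"])
    return [
        dict(s, events=index.get((s["vehicle_id"], s["time_s"]), []))
        for s in samples
    ]


def flag_anomalies(samples, threshold=2.0):
    """Return samples whose speed z-score exceeds the (assumed) threshold."""
    speeds = [s["speed_kmh"] for s in samples]
    mu, sigma = mean(speeds), pstdev(speeds)
    if sigma == 0:
        return []  # all speeds identical: nothing stands out
    return [s for s in samples if abs(s["speed_kmh"] - mu) / sigma > threshold]


fused = fuse([normalise_driving(r) for r in driving_data], security_events)
anomalies = flag_anomalies(fused)
for a in anomalies:
    print(a["vehicle_id"], a["time_s"], a["speed_kmh"], a["events"])
```

A simple z-score is used here only because it fits in a few lines; a real analysis would likely need per-vehicle baselines, tolerant time alignment between sources, and a more robust anomaly measure.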

In the next post (part 2), we will present our selection of relevant public datasets related to lifecycle stages of connected cars.

The project report is publicly available from here.
