Data Privacy Vocabulary (DPV)

🚀 Supercharge your legal compliance!
https://w3id.org/dpv
Represent information in machine-readable and interoperable standards
Create digital notices and documents and exchange them with partners
Build knowledge graphs to automate your legal compliance
Modular and Extensible support for EU GDPR and multiple jurisdictions
Rapid developments to match progress of AI technologies and the AI Act
Easily supports terminology and requirements of YOUR use-case with an extension

The Data Privacy Vocabulary (DPV) is a resource produced by the W3C Data Privacy Vocabularies and Controls Community Group (DPVCG) to represent information associated with processing of (personal and non-personal) data and use of technologies in a machine-readable and interoperable manner.

DPV provides an ontology of concepts that enable expressing information such as data and technologies involved, their purposes and legal basis, measures used for security, relevant laws and rights, and associated risks and impacts.

DPV also provides taxonomies for these concepts based on real-world applications so that the machine-readable representations are consistent and interoperable through the use of DPV concepts.

What's in the DPV?

DPV is the 'main' specification which provides the foundational framework upon which other 'extensions' are built. DPV contains the following concepts and taxonomies:

  1. Purposes e.g. Marketing, Service Provision, Compliance
  2. Processing operations e.g. Collect, Store, Use, Share, Delete
  3. Data e.g. Personal Data, Sensitive Data, Special Categories, Anonymised Data
  4. Technical Measures e.g. Encryption, Access Control
  5. Organisational Measures e.g. Notice, Policy, Assessments
  6. Legal Basis e.g. Consent - including types and status, Contract, Legal Obligation
  7. Context e.g. Location, Duration, Frequency, Necessity, Statuses
  8. Processing Context e.g. Automation, Human Involvement, Storage Conditions, Data Source
  9. Risk Assessment e.g. Risk and Mitigation Measure, Consequence and Impact
  10. Rights e.g. Data Subject Right, Rights Exercise, Rights Fulfilment and Non-fulfilment
  11. Rules e.g. Permission, Prohibition, Obligation

Extending these are the following extensions:

  1. Personal Data (PD) taxonomy with indication of Sensitive/Special Categories
  2. Locations (LOC) based on ISO 3166-2 for indicating Countries and Regions
  3. Risk Assessment and Management (RISK) concepts based on ISO 31000 series
  4. Technology (TECH) concepts to indicate Actors, Provision Method, Intended Use
  5. AI extends TECH with AI techniques, capabilities, lifecycle, risks, measures
  6. Justifications for explaining why something should be or cannot be done
  7. LEGAL concepts - laws, authorities, adequacy decisions from jurisdictions, e.g.
    1. Germany (DE)
    2. European Union (EU)
    3. United Kingdom of Great Britain and Northern Ireland (GB)
    4. Ireland (IE)
    5. India (IN)
    6. USA (US)
    with the following specific laws defined in their own extensions:
    1. EU GDPR
    2. EU Data Governance Act (DGA)
    3. EU Network and Information Security Directive (NIS2)
    4. EU AI Act
    5. EU Fundamental Rights

How does DPV enable interoperability?

DPV uses RDF and related semantic-web standards to define concepts and create interoperable data. Each concept is given a unique identifier, which enables its consistent representation across use-cases. For example, https://w3id.org/dpv#Purpose always refers to 'purpose' as a concept. Organisations that use DPV directly therefore share a consistent way to exchange and interpret data.

Even if organisations have differing internal terminology, they can be aligned by using DPV as a 'common' vocabulary. For example, CompanyA uses 'business purpose' as the term for what is 'purpose' in DPV, and CompanyB uses 'goal' as its term. If CompanyA and CompanyB want to exchange information, they can 'map' or 'align' their respective terms to DPV's 'purpose' so that each can correctly understand the other.
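The alignment described above can be sketched in a few lines of Python. This is only an illustration: the two mapping tables and the helper names (`to_dpv`, `translate`) are hypothetical and not part of DPV; the only thing taken from DPV is the identifier https://w3id.org/dpv#Purpose.

```python
DPV = "https://w3id.org/dpv#"

# Hypothetical internal vocabularies, each mapped to the shared DPV identifier.
COMPANY_A_TERMS = {"business purpose": DPV + "Purpose"}
COMPANY_B_TERMS = {"goal": DPV + "Purpose"}


def to_dpv(term, mapping):
    """Return the DPV identifier that an internal term is aligned to."""
    return mapping[term.lower()]


def translate(term, source, target):
    """Translate a term from one organisation's vocabulary to another's,
    using the DPV identifier as the common pivot."""
    iri = to_dpv(term, source)
    # Invert the target mapping so the DPV identifier leads back to its term.
    reverse = {concept: name for name, concept in target.items()}
    return reverse[iri]


# CompanyA's term, expressed in CompanyB's vocabulary.
translated = translate("business purpose", COMPANY_A_TERMS, COMPANY_B_TERMS)
```

Because both tables point at the same identifier, neither organisation needs to know the other's internal wording in advance.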

While the DPV uses RDF and semantic-web standards, this is not strictly necessary for the use of DPV. As long as the unique identifiers of DPV's concepts are retained, use-cases can use existing technologies to store and manage their information. For example, if a spreadsheet or a database stores a record of all data categories existing within the organisation, these can be annotated with DPV concepts to specify the category (e.g. sensitive personal data). Or an organisation can maintain a data dictionary mapping its internal terminology to DPV concepts so that interoperable records can be readily produced.
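As a rough sketch of the spreadsheet case, the snippet below annotates an inventory of data categories with DPV identifiers through a small data dictionary. The inventory contents, the column names, and the choice of which DPV concept each internal category maps to are assumptions made for illustration; a real mapping should be checked against the published DPV taxonomies.

```python
import csv
import io

DPV = "https://w3id.org/dpv#"

# Hypothetical data dictionary: internal category name -> DPV concept identifier.
DATA_DICTIONARY = {
    "customer email": DPV + "PersonalData",
    "health record": DPV + "SensitivePersonalData",
}

# Stand-in for a spreadsheet export; the rows are invented for this example.
INVENTORY_CSV = (
    "system,category\n"
    "crm,customer email\n"
    "clinic-db,health record\n"
    "logs,server metrics\n"
)


def annotate(csv_text, dictionary):
    """Add a 'dpv_concept' column; unmapped categories get an empty value."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["dpv_concept"] = dictionary.get(row["category"], "")
        rows.append(row)
    return rows


annotated = annotate(INVENTORY_CSV, DATA_DICTIONARY)
```

The existing storage format stays as it is; only the extra column carrying the DPV identifier is needed to make the records interoperable.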

What can I do with DPV?

The most basic function of DPV is to represent information in a machine-readable form. For example, the ISO/IEC TS 27560:2023 technical specification uses DPV as an example of machine-readable consent records and receipts. Other forms of organisational records and documents can also be represented using DPV e.g. privacy notices, records of processing activities, risk and impact assessments, data breach records, and how/which cloud services are being used. DPV can also be used in a 'personal' capacity e.g. to indicate privacy preferences, maintain consent records, and exercise rights.

The 'hierarchical taxonomies' in DPV also support responsible use of data and technologies. For example, the purpose taxonomy includes the concept 'Personalisation' - which by itself is vague as it does not indicate what the personalisation is about or for. DPV taxonomies expand this concept to define different kinds of personalisation, such as personalised recommendations as part of service provision, which is distinct from personalised advertising. By using such hierarchies, the most accurate purpose can be selected and indicated - thereby increasing transparency.

Such hierarchies also enable using a broader (but sufficiently clear) purpose such as 'service personalisation' to justify the different personalisation activities that can occur. For example, if consent is given to the 'broader' concept of service personalisation, then the further 'narrower' or 'specific' personalisation purposes in the hierarchy associated with events, products, and activities are also enabled through that consent.
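The broader/narrower reasoning above can be sketched as a walk up a parent table. The slice of the purpose hierarchy below is simplified and illustrative: the parent links are assumptions to be verified against the published DPV purpose taxonomy, and `is_covered_by` is a hypothetical helper, not a DPV API.

```python
# Simplified, illustrative slice of a purpose hierarchy: narrower -> broader.
# The parent links are assumptions for this sketch.
BROADER = {
    "dpv:ServicePersonalisation": "dpv:Personalisation",
    "dpv:PersonalisedAdvertising": "dpv:Personalisation",
    "dpv:ProvidePersonalisedRecommendations": "dpv:ServicePersonalisation",
    "dpv:ProvideEventRecommendations": "dpv:ProvidePersonalisedRecommendations",
    "dpv:ProvideProductRecommendations": "dpv:ProvidePersonalisedRecommendations",
}


def is_covered_by(purpose, consented):
    """Return True if 'purpose' equals 'consented' or is narrower than it."""
    current = purpose
    while current is not None:
        if current == consented:
            return True
        # Move one step up the hierarchy; None once the top is reached.
        current = BROADER.get(current)
    return False


# Consent was given for service personalisation.
events_ok = is_covered_by("dpv:ProvideEventRecommendations", "dpv:ServicePersonalisation")
ads_ok = is_covered_by("dpv:PersonalisedAdvertising", "dpv:ServicePersonalisation")
```

In this sketch, event recommendations fall under the consented purpose, while personalised advertising sits on a different branch and is not covered.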

How does DPV deal with jurisdictions/laws?

DPV itself is intended to be jurisdiction-agnostic - its concepts, though based on GDPR terminology, do not presume any particular law to be applicable. To indicate that specific jurisdictions and laws apply, DPV provides the explicit concepts hasJurisdiction and hasApplicableLaw. In addition, DPV uses the mechanism of 'extensions' - which are concepts defined in a separate namespace - to represent the concepts from different jurisdictions and laws. For example, the legal-eu extension represents the EU jurisdiction and the EU-GDPR extension represents the GDPR within the EU. Through this, DPV can support all laws and jurisdictions without overlaps between them.
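A minimal sketch of how hasJurisdiction and hasApplicableLaw might appear in a JSON-LD style record is shown below. The record shape and the `processing_record` helper are assumptions made for illustration, and the jurisdiction and law identifiers are deliberately placeholders under example.org; in practice they would be taken from the relevant extensions such as legal-eu and EU-GDPR.

```python
import json

# Placeholder identifiers for illustration only; real ones would come from
# the jurisdiction and law extensions.
EXAMPLE_JURISDICTION = "https://example.org/jurisdiction/EU"
EXAMPLE_LAW = "https://example.org/law/GDPR"


def processing_record(purpose, jurisdiction, law):
    """Build a JSON-LD style dictionary describing one processing activity."""
    return {
        "@context": {"dpv": "https://w3id.org/dpv#"},
        "@type": "dpv:Process",
        "dpv:hasPurpose": {"@id": purpose},
        "dpv:hasJurisdiction": {"@id": jurisdiction},
        "dpv:hasApplicableLaw": {"@id": law},
    }


record = processing_record("dpv:Marketing", EXAMPLE_JURISDICTION, EXAMPLE_LAW)
serialised = json.dumps(record, indent=2)
```

The core record stays jurisdiction-agnostic; swapping the two identifiers is enough to describe the same activity under a different jurisdiction or law.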

Who is using DPV?

DPV is used by several academic and industrial research projects, commercial and non-commercial organisations, and researchers. See the DPVCG Adoption wiki page for more information.

Is the DPV 'free' to use?

DPV is provided under the W3C Software and Document License, which permits use of DPV in all use-cases, with acknowledgement in published work.

How do I get involved?

The DPVCG is a W3C community group with open membership - anyone can join and is welcome to participate. Development happens in an open forum and is visible through the GitHub repo. If you are interested in using DPV, or are already using it, we strongly encourage you to join the DPVCG, as decisions are taken based on membership. Participation also provides a channel to suggest features and requirements, and to obtain assistance when and where needed.