Roles and Responsibilities of a Data Governance Team: The Minimum Structure for the AI Act

Most data governance problems don't have a technical origin. They have an organizational one: nobody knows who decides on the data, who maintains it, and who answers when something breaks. Data governance roles aren't titles on an org chart; they're the human infrastructure without which no framework, tool, or policy works. And with the AI Act in force, that human infrastructure has direct regulatory consequences.

Why Data Governance Roles Are Critical for the AI Act

The AI Act doesn't just regulate AI systems. It regulates the processes, documentation, and people responsible for those systems. For high-risk systems, Article 9 requires a risk management system maintained throughout the system's lifecycle. Article 10 requires data governance over training, validation, and testing datasets. Article 11 requires technical documentation drawn up before the system is placed on the market or put into service and kept up to date. Article 14 requires effective human oversight.

Each of those articles assumes there's a person or team with a clear mandate to execute those responsibilities. If that person doesn't exist in your organization — or exists on paper but without real time or authority — compliance with the Regulation is a statement of intent, not an auditable reality. And AESIA and the AEPD inspect realities, not intentions.

For a full look at the regulatory context, see AI Act Key Dates: What Your Company Must Do at Each Regulatory Milestone.

The Data Governance Lead: The Role That Makes Everything Possible

The Data Governance Lead — also called Chief Data Officer in large organizations, or Head of Data Governance in more operational settings — is the role that turns data governance from a project into an organizational capability. Without this profile holding a real executive mandate, the other roles exist in a vacuum.

Data Governance Lead Responsibilities

  • Defining and maintaining the data governance strategy aligned with business goals and regulatory requirements.
  • Chairing the Data Governance Committee and ensuring data decisions are actually followed through.
  • Assigning and managing Data Owners per domain, with formally documented responsibilities.
  • Being the main point of contact with leadership, legal, and, in the event of an inspection, AESIA or the AEPD.
  • Prioritizing critical data domains, quality projects, and catalog and lineage initiatives.
  • Managing the governance team's budget and resources.

What Profile Is Needed

It's not a purely technical or purely business profile. It's a hybrid profile that understands how data works technically, how the business uses it, and what the regulatory framework requires. In practice, the best profiles come from roles like Data Architect, Business Intelligence Manager, or Data Product Manager with exposure to regulatory compliance projects. The ability to communicate upward — to leadership and the board — and downward — to technical teams — is as important as technical knowledge.

In environments with multiple business units, such as groups with several subsidiaries, this role is replicated at two levels: a corporate Data Governance Lead who defines the standards and a local Data Governance Lead per entity who implements them. Without that federated structure, the corporate framework becomes something that "exists on paper but not in practice."

The Data Owner: The Business Owner of the Data

The Data Owner role is the most misunderstood in practice. It's not the technician managing the database table. It's the business owner accountable for the quality, correct use, and protection of data within their domain. In many organizations this role exists implicitly (someone "knows" which data they're responsible for) but without formalization or real authority, which makes it a decorative role.

Data Owner Responsibilities

  • Approving and revoking access to their domain's data, using business judgment.
  • Validating and approving business definitions for entities and metrics in their domain.
  • Taking formal accountability for data quality: if an executive report has incorrect data, the Data Owner is accountable.
  • Approving the use of their domain's data in AI projects, advanced analytics, or sharing with third parties.
  • Participating in the Data Governance Committee with a vote on decisions affecting their domain.
  • Signing off on dataset spec sheets for high-risk AI systems using their domain's data (AI Act Art. 10).

The Most Common Mistake With This Role

The most common mistake is naming Data Owners without telling them, without giving them dedicated time, and without giving them real authority over access decisions. The result is a nice-looking RACI in a presentation and a data domain with no real governance. The Data Owner needs to know they're a Data Owner, understand what that means, and have a real ability to say no when someone requests access they shouldn't have.

The Data Steward: The Day-to-Day Operator of Governance

If the Data Owner is the business owner, the Data Steward is the operational executor. The Data Steward's responsibilities cover everything that happens in the space between policy (defined by the Lead and the Committee) and technical implementation (executed by Data Engineering). It's the role that keeps governance alive day to day.

Data Steward Responsibilities

  • Maintaining the business glossary: definitions of entities, metrics, hierarchies, and relationships between domains.
  • Defining, documenting, and overseeing data quality rules for their domain.
  • Managing first-level access requests: verifying the request is consistent with the requester's business role before escalating it to the Data Owner.
  • Documenting business lineage in the catalog: what business transformation each field represents, what rules were applied, and who approved them.
  • Coordinating with Data Engineering on the technical implementation of quality rules and access controls.
  • Acting as the point of contact for any team with questions about the definition, origin, or correct use of their domain's data.
  • Keeping dataset spec sheets up to date for AI systems using their domain's data.
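
The first-level access check in the list above can be sketched as a small routing function. This is a minimal illustration, not a prescribed workflow: the `AccessRequest` fields, the `ALLOWED_ROLES_BY_DOMAIN` mapping, and the status strings are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical mapping: which business roles may plausibly request each domain.
ALLOWED_ROLES_BY_DOMAIN = {
    "sales": {"sales_analyst", "revenue_manager"},
    "finance": {"financial_controller"},
}


@dataclass(frozen=True)
class AccessRequest:
    requester: str
    business_role: str
    domain: str


def steward_first_level_check(request: AccessRequest) -> str:
    """Return the next step after the Steward's first-level check.

    The Steward only verifies that the request is consistent with the
    requester's business role; the approval itself stays with the Data Owner.
    """
    allowed_roles = ALLOWED_ROLES_BY_DOMAIN.get(request.domain, set())
    if request.business_role in allowed_roles:
        return "escalate_to_data_owner"
    return "returned_to_requester"
```

The point of the split is that the Steward filters out requests that make no business sense, so the Data Owner only spends time on decisions that need business judgment.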

How Many Data Stewards an Organization Needs

The answer depends on the number of critical domains and the governance workload in each. In practice, a Data Steward can manage between one and three domains if the workload is reasonable. Below that, the role is superficial. Above it, the Steward can't do their job well and ends up only firefighting.
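
As a rough sizing aid, the one-to-three-domains rule of thumb can be turned into a range. The function name and the default of three domains per Steward are assumptions for illustration; the actual number depends on each domain's workload.

```python
import math


def steward_range(critical_domains: int, max_domains_per_steward: int = 3) -> tuple[int, int]:
    """Return (fewest, most) Stewards under the one-to-three-domains rule of thumb."""
    if critical_domains < 0 or max_domains_per_steward < 1:
        raise ValueError("domain counts must be non-negative and the cap at least 1")
    # Fewest: every Steward carries the maximum number of domains.
    fewest = math.ceil(critical_domains / max_domains_per_steward)
    # Most: one dedicated Steward per critical domain.
    most = critical_domains
    return fewest, most
```

For example, `steward_range(7)` returns `(3, 7)`: seven critical domains need at least three Stewards and at most seven.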

In multi-entity environments, the model that works best is having local Stewards per entity — who know the local business and data — coordinated by a corporate Steward per domain who maintains consistency across master definitions.

The Data Custodian: The Technical Guardian of the Data

The Data Custodian is the technical profile responsible for storage, physical security, and infrastructure-level access. In many organizations, this role is held by the Data Engineering team or the data platform team, without anyone formally calling it Data Custodian. The name matters less than the clarity of responsibilities.

Data Custodian Responsibilities

  • Implementing and maintaining technical access controls: RBAC in Snowflake, data lake permissions, RLS in the semantic layer.
  • Managing the data's technical lifecycle: retention, archiving, deletion per privacy policies and regulation.
  • Ensuring data availability and integrity: backups, disaster recovery, pipeline monitoring.
  • Implementing the quality rules defined by the Data Steward in the data pipelines.
  • Maintaining the access and audit logs the AI Act (Art. 12) and GDPR require for regulated systems.
  • Executing technical changes resulting from access reviews: revoking permissions, adjusting roles, documenting changes.

Specialized Technical Roles: RBAC, Lineage, Quality and Audit

In organizations with higher maturity or larger data volumes, technical governance roles become specialized. These aren't always independent roles; in small teams, a single Data Engineering profile covers several. What matters is that each responsibility has a clear owner.

Access Governance Specialist (RBAC)

Designs and maintains the role-based access control model: which role has access to which data, on which platform, and under what conditions. In Snowflake and Power BI environments, this profile manages the sync between business roles defined by Data Owners and the technical implementation on the platforms. It also designs and maintains access request and approval workflows and runs quarterly reviews with documented evidence.

In environments like IAG, with thousands of users spread across multiple airlines and systems, this role is critical: a poorly designed RBAC model generates both over-privilege — access to data that shouldn't be granted — and operational friction — users who can't access what they need to do their job.
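
As an illustration of what a periodic access review might check, the sketch below flags grants whose user is no longer active and grants that have gone too long without review. The `Grant` structure and the 90-day interval are assumptions for the example; in practice the inputs would come from each platform's own grant and user metadata.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "quarterly" is approximated as 90 days.
REVIEW_INTERVAL = timedelta(days=90)


@dataclass(frozen=True)
class Grant:
    user: str
    role: str
    last_reviewed: date


def review_findings(grants, active_users, today):
    """Split grants into orphaned (user no longer active) and overdue for review."""
    orphaned = [g for g in grants if g.user not in active_users]
    overdue = [
        g for g in grants
        if g.user in active_users and today - g.last_reviewed > REVIEW_INTERVAL
    ]
    return orphaned, overdue
```

Keeping the two lists separate matters for evidence: orphaned grants call for revocation, while overdue grants call for a review decision by the Data Owner.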

Data Lineage Specialist

Maintains end-to-end data traceability: from the transactional source to the dashboard or AI model. In modern stacks with dbt and OpenMetadata, part of this work is automated. But business lineage — what business transformation each field represents, what rules were applied, and who approved them — requires a profile who understands both the technical data and the business context. This is the profile that ensures Article 10 of the AI Act has real substance, not just formal existence.
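
End-to-end traceability can be pictured as a walk over a dependency graph. The sketch below assumes a hand-written `UPSTREAM` mapping with hypothetical asset names; it does not use the API of dbt, OpenMetadata, or any other tool, which expose lineage in their own formats.

```python
# Hypothetical lineage: each asset maps to the assets it reads from directly.
UPSTREAM = {
    "dashboard.revenue": ["mart.bookings"],
    "model.demand_forecast": ["mart.bookings", "mart.calendar"],
    "mart.bookings": ["staging.bookings"],
    "staging.bookings": ["source.reservations"],
    "mart.calendar": [],
    "source.reservations": [],
}


def trace_upstream(asset: str) -> set[str]:
    """Return every asset that feeds `asset`, directly or indirectly."""
    seen: set[str] = set()
    stack = list(UPSTREAM.get(asset, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited through another path
        seen.add(current)
        stack.extend(UPSTREAM.get(current, []))
    return seen
```

Calling `trace_upstream("model.demand_forecast")` lists every dataset the model depends on. The graph only covers technical lineage; the business meaning of each transformation still has to be documented by a person.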

Data Quality Specialist

Defines, implements, and monitors quality rules in data pipelines. Works with dbt tests, Great Expectations, or Soda to translate quality thresholds agreed with Data Stewards into automated controls. Maintains quality dashboards visible to Data Owners and manages the remediation process when anomalies are detected. In environments with AI model training data, this profile is also responsible for documenting the quality metrics AI Act Art. 10 requires.
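
To make the idea of translating agreed thresholds into automated controls concrete, here is a tool-agnostic sketch in plain Python. It is not the API of dbt, Great Expectations, or Soda; the function names and the choice of null rate as the metric are assumptions for the example.

```python
def null_rate(rows, column):
    """Fraction of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def check_thresholds(rows, max_null_rate_by_column):
    """Return {column: observed null rate} for columns above the agreed threshold."""
    failures = {}
    for column, max_rate in max_null_rate_by_column.items():
        rate = null_rate(rows, column)
        if rate > max_rate:
            failures[column] = rate
    return failures
```

The thresholds dictionary is where the agreement with the Data Steward lives: the Steward sets the acceptable rate per column, and the pipeline only enforces it.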

Governance Audit and Reporting Specialist

Maintains the audit dashboards that make governance status visible to leadership, legal, and external auditors. Typical indicators include active access by role, last review dates, open quality incidents, catalog coverage by domain, and AI system documentation status. This profile turns governance into something measurable and, therefore, defensible to the regulator.
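
One of these indicators, catalog coverage by domain, can be computed with a few lines. The sketch assumes a hypothetical list of catalog entries and counts an asset as covered when it has both a description and a named owner; a real catalog may define coverage differently.

```python
from collections import defaultdict


def catalog_coverage(assets):
    """Return {domain: share of assets with a description and a named owner}."""
    totals = defaultdict(int)
    documented = defaultdict(int)
    for asset in assets:
        totals[asset["domain"]] += 1
        # Empty strings and missing keys both count as undocumented.
        if asset.get("description") and asset.get("owner"):
            documented[asset["domain"]] += 1
    return {domain: documented[domain] / totals[domain] for domain in totals}
```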

Signs Your Company Needs to Structure These Roles Now

These are the clearest symptoms that your data governance role structure is insufficient and that operational and regulatory risk is growing:

  • Nobody knows who owns a piece of data when something fails. The question "whose data is this?" generates silence or an email chain with no clear answer.
  • Access is approved by email or Slack. No formal workflow, no traceability, no periodic review. In an AEPD or AESIA inspection, this is indefensible.
  • The same KPI has different definitions depending on the team. A direct sign of missing Data Stewards with authority over business definitions.
  • Technical documentation for AI systems doesn't exist or isn't up to date. AI Act Art. 11 requires documentation prior to deployment; if that documentation doesn't exist, the risk is immediate.
  • The data catalog hasn't been updated in months. Without a Data Steward with dedicated time, the catalog dies within weeks. If it's dead, governance is nominal.
  • Nobody has a mandate to say no. If any access or data-use request is approved because "nobody has the authority to deny it," governance doesn't exist.
  • The company is about to deploy, or already deploys, a high-risk AI system without assigning who manages the datasets, who maintains the technical documentation, and who answers to the regulator.

What I've Seen in Complex Environments

Case 1 — RBAC Without an Owner in a Multi-Airline Group

In an environment with multiple airlines under the same holding company, Snowflake access roles had been configured in year one of the project by the engineering team. Three years later, nobody had reviewed which users were still active, which roles had grown beyond their original scope, and which access had been orphaned by departures or role changes. The problem wasn't technical: it was that no Data Custodian existed with an explicit mandate to keep that access model alive.

The solution wasn't technical at first: it was formally assigning that mandate, establishing a quarterly review process with documented evidence, and building an audit dashboard in Power BI that made access status visible to each domain's Data Owners. The tool was the same; what changed was who was responsible for maintaining it.

Case 2 — Data Stewards With No Time in Commercial BI Projects

In BI projects for commercial teams, the most common pattern is that the Data Steward role falls implicitly on the team's most senior analyst, who handles it among other responsibilities with no dedicated time or formal recognition. The result is a business glossary nobody updates, definitions that vary between reports, and a team spending 30% of its time resolving data questions instead of analyzing data.

Formalizing the role — with dedicated time, clear accountability criteria, and visibility in the catalog — turns that hidden cost into a visible asset. New analysts onboard faster, data validation meetings disappear, and trust in reports increases measurably.

Case 3 — AI Governance With No Owner in Regulated Environments

In environments with analytical systems supporting operational decisions — route optimization, dynamic pricing, demand forecasting — the question of who's responsible for the AI model usually falls on the team that built it, which has no governance mandate and doesn't maintain technical documentation beyond code comments. When that team changes or the system evolves, the documentation becomes obsolete.

The solution that works is extending the Data Governance Lead's mandate to explicitly cover AI systems: risk classification, assigning Data Owners for training datasets, periodic technical documentation review, and an approval process before any new production deployment. It's not a new team; it's an extension of the existing framework with explicit responsibilities.

Common Mistakes in Defining Data Governance Roles

  • Creating roles without assigning real time. The "part-time" Data Steward who spends 10% of their day on governance among ten other responsibilities can't do their job. Effective governance requires dedicated time, not leftover time.
  • Confusing the Data Owner with the database's technical administrator. The Data Owner is a business role. Assigning it to the DBA or Data Engineer creates confused responsibilities and technical decisions where there should be business decisions.
  • One profile for everything. Putting a single person as Data Governance Lead, Data Steward, Data Custodian, and quality specialist all at once guarantees nothing gets done well. The minimum viable team is small, but must have separate responsibilities.
  • Roles defined but no Data Governance Committee. Without a formal decision-making forum where Data Owners, the Lead, and legal meet periodically, data conflicts — and there will be some — have no path to resolution. They get resolved by hierarchy or by whoever shouts loudest, which is the most inefficient way to govern data.
  • Not documenting responsibilities formally. A RACI in a presentation nobody opens again doesn't count. Responsibilities should be documented in an accessible document, reviewed periodically, and known to everyone involved.
  • Treating AI Act governance roles as something separate. AI Act obligations around documentation, datasets, and human oversight aren't floating responsibilities: they must be assigned to specific people. If there's no explicit owner for Article 10, Article 10 isn't met.
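
One way to keep assignments from living only in a slide is to store them in a form that can be checked. The sketch below is a minimal example under stated assumptions: the responsibility names, the `dedicated_hours_per_week` field, and the rule that zero hours counts as unowned are all illustrative choices.

```python
# Illustrative set of responsibilities that must each have an owner.
REQUIRED = ("quality", "access", "lineage", "documentation", "audit")


def unowned_responsibilities(assignments):
    """Return required responsibilities lacking a named owner with dedicated time."""
    gaps = []
    for responsibility in REQUIRED:
        entry = assignments.get(responsibility)
        has_owner = bool(entry and entry.get("owner"))
        has_time = bool(entry and entry.get("dedicated_hours_per_week", 0) > 0)
        if not (has_owner and has_time):
            gaps.append(responsibility)
    return gaps
```

An empty result doesn't prove governance works, but a non-empty one shows exactly which responsibility has no real owner.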

Conclusion: Without Roles, the Framework Is Just Paper

An effective data governance team structure isn't measured by headcount or org-chart titles. It's measured by whether every critical responsibility — quality, access, lineage, documentation, audit — has an owner with the time, mandate, and real authority to carry it out. Without that, the best framework in the world is a nice-looking presentation that protects nobody and creates no value.

With the AI Act in force and its obligations phasing in, the absence of formalized roles isn't just an operational problem: it's a concrete regulatory risk. Organizations with their data governance responsibilities correctly assigned are the ones that can respond in minutes during an inspection. Those without them are the ones who improvise under pressure and generate evidence of non-compliance in the process.

To build the framework supporting these roles, see How to Implement an Effective Data Governance Framework in the AI Act Era.

Checklist: An Operational Data Governance Role Structure

  • Data Governance Lead appointed with an executive mandate and dedicated time.
  • Data Owners formalized per critical domain, with a documented RACI and communicated responsibilities.
  • Data Stewards assigned per domain with dedicated time and operational decision-making capacity.
  • Data Custodian or technical team with an explicit mandate over access, retention, and logs.
  • Active Data Governance Committee with periodic meetings and documented minutes.
  • RBAC specialist or owner with quarterly access review and evidence.
  • Data quality specialist or owner with defined rules and active alerts.
  • Lineage owner with end-to-end coverage documented in the catalog.
  • Audit and reporting owner with an active governance dashboard.
  • AI Governance Officer or extended Lead mandate to cover risk classification, technical documentation (Art. 11), and AI dataset management (Art. 10).
  • All responsibilities documented in an accessible, reviewed formal document.
  • Onboarding process for new owners of governance roles.

Frequently Asked Questions About Data Governance Roles

What are the essential data governance roles in an organization?

The minimum viable structure includes a Data Governance Lead with an executive mandate, Data Owners per business domain, Data Stewards with operational decision-making capacity, and at least one technical profile managing access, quality, and lineage. In AI Act-regulated environments, an AI Governance Officer is added, or the Data Governance Lead's mandate is extended to cover the Regulation's obligations.

What's the difference between a Data Owner and a Data Steward?

The Data Owner is the business owner of the data: they decide on its use, approve access, and take formal responsibility for its quality. The Data Steward is the operational executor: they maintain definitions, apply quality rules, and act as the day-to-day point of contact for data questions. The Owner decides; the Steward operates. Confusing the two roles — or assigning them to the same person without enough time — is one of the most common mistakes in practice.

What responsibilities does a Data Steward have?

The main Data Steward responsibilities are: keeping the business glossary up to date, defining and overseeing data quality rules, handling first-level access requests, documenting business lineage in the catalog, and coordinating with Data Owners on decisions about data use and classification. In environments with AI systems, they also maintain the dataset spec sheets Article 10 of the AI Act requires.

How does the AI Act affect the Data Governance role structure?

The AI Act adds specific responsibilities that must be formally assigned: AI system risk classification, maintaining Article 11 technical documentation, managing training datasets per Article 10, overseeing audit logs, and coordinating with AESIA and the AEPD during inspections. These responsibilities can fall on expanded existing roles or on a dedicated AI Governance Officer, depending on the size and complexity of the organization's AI systems.

How many people does an effective Data Governance team need?

There's no universal number, but the minimum structure for a mid-sized organization with cloud data is: a full- or part-time Data Governance Lead with an executive mandate, a Data Steward per critical domain (between two and five depending on the organization), and at least one technical Data Engineering profile. The most common mistake is trying to do everything with a single profile that ends up with no time or authority for anything. More than headcount, what determines effectiveness is the clarity of responsibilities and the real authority of each role.
