General Data Protection Regulation (GDPR) for Identity Architects

Published in FACILELOGIN, Aug 9, 2017

The EU General Data Protection Regulation (GDPR) is Regulation (EU) 2016/679 of the European Parliament and of the Council. It replaces the Data Protection Directive 95/46/EC and was designed to harmonize data privacy laws across Europe, to protect the personal data of all EU citizens and residents and give them more control over it, and to reshape the way organizations across the region approach data privacy. The regulation was passed on 27th April 2016 and will be effective from 25th May 2018. GDPR became quite prominent because of the heavy penalties it introduces for violators, which could be as much as 4% of annual global turnover or €20 million, whichever is greater.
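The "whichever is greater" rule is easy to misread, so here is a minimal sketch of the arithmetic. The function name is made up for illustration, and the result is only the upper bound mentioned above, not a prediction of any actual fine.

```python
# Illustrative only: upper bound of the higher GDPR penalty tier,
# i.e. 4% of annual global turnover or EUR 20 million, whichever is greater.
FLAT_CAP_EUR = 20_000_000


def max_fine_eur(annual_global_turnover_eur: int) -> int:
    """Return the maximum possible fine in whole euros."""
    four_percent = annual_global_turnover_eur * 4 // 100
    return max(four_percent, FLAT_CAP_EUR)


large = max_fine_eur(1_000_000_000)   # 4% exceeds the flat cap
small = max_fine_eur(100_000_000)     # the flat cap applies
```

For a turnover of €1 billion the 4% figure (€40 million) is the ceiling, while for €100 million the €20 million figure applies.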

If you would like to read more on the regulatory aspects of GDPR, the blog post I wrote a couple of days back, Understanding GDPR For Everyone Who Hates Reading Law!, will help. In this post I mostly talk about the technical aspects of GDPR that you should worry about as an identity architect.

User/Employee On-boarding

On-boarding, from an IAM perspective, means getting the user into the IAM infrastructure and making all the services ready to use without further delay. An organization performing at level 4 or 5 (Measured/Optimized) of the Forrester Identity Management Maturity Model is expected to automate the entire process, with minimal human involvement. The on-boarding process has multiple variations, depending on how complex the business requirements are.

Capture Only the Minimal Personal Data

To be compliant with GDPR, we need to make sure we gather only the minimal set of personal data during the on-boarding process, and nothing more. For example, if the user prefers email as the primary means of communication, then there is no point in keeping track of the user’s physical address. Likewise, if the user has opted out of SMS-based multi-factor authentication and SMS notifications, then we should not keep track of the user’s mobile number.

You must not capture personal data for anticipated future use. For example, you may be running a web site for your restaurant where customers can log in and pre-order food, either to eat in or for pick-up. You must not capture the customer’s home address at that point in anticipation of a home-delivery service that you may offer in the future.
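One way to make data minimization concrete is to tie every attribute to a declared purpose and drop anything that no enabled purpose needs. The sketch below assumes a hypothetical purpose-to-attribute mapping; the purpose names and attribute names are illustrative and do not come from any particular IAM product.

```python
# Hypothetical mapping from a processing purpose to the attributes
# that purpose genuinely needs. All names here are illustrative.
PURPOSE_ATTRIBUTES = {
    "account": {"username", "email"},
    "sms_mfa": {"mobile"},
    "postal_mail": {"street_address"},
}


def attributes_to_collect(enabled_purposes):
    """Return the union of attributes needed by the enabled purposes."""
    needed = set()
    for purpose in enabled_purposes:
        needed |= PURPOSE_ATTRIBUTES[purpose]
    return needed


def minimize(submitted, enabled_purposes):
    """Drop any submitted attribute that no enabled purpose requires."""
    allowed = attributes_to_collect(enabled_purposes)
    return {k: v for k, v in submitted.items() if k in allowed}


form = {
    "username": "alice",
    "email": "alice@example.com",
    "mobile": "+15550100",
    "street_address": "1 Main St",
}
# The user only enabled the basic account purpose, so the mobile
# number and the street address are never stored.
stored = minimize(form, ["account"])
```

With this shape, adding a future home-delivery service means adding a new purpose and asking for the address then, rather than collecting it up front.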

Record the User Consent on Privacy/Data Policy

Recording user consent on the usage of user attributes is a must under GDPR. If you use the user’s email address to send monthly newsletters, it has to be an opt-in option and must be done with the user’s consent. If you have any plans to share user attributes with third parties, that has to be clearly mentioned, and you must get the user’s consent. The bottom line is that you should never use personal data for anything other than the purpose it was originally collected for.
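As a rough illustration of what "recording consent" could look like at the data level, here is a minimal sketch. The `ConsentRecord` fields and the `may_use` helper are assumptions made for this example, not a schema defined by GDPR or by any specific identity provider.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _now():
    return datetime.now(timezone.utc)


@dataclass
class ConsentRecord:
    """Hypothetical consent record: who agreed to what, and when."""
    user_id: str
    purpose: str           # e.g. "monthly_newsletter"
    attributes: tuple      # attributes covered by this consent
    policy_version: str    # version of the policy the user saw
    granted_at: datetime = field(default_factory=_now)
    revoked_at: Optional[datetime] = None

    def is_active(self):
        return self.revoked_at is None

    def revoke(self):
        self.revoked_at = _now()


def may_use(records, user_id, purpose, attribute):
    """True only if an active consent covers this attribute for this purpose."""
    return any(
        r.user_id == user_id
        and r.purpose == purpose
        and attribute in r.attributes
        and r.is_active()
        for r in records
    )


records = [ConsentRecord("alice", "monthly_newsletter", ("email",), "2017-08")]
newsletter_ok = may_use(records, "alice", "monthly_newsletter", "email")
sharing_ok = may_use(records, "alice", "third_party_sharing", "email")
```

Keying the check on both the purpose and the attribute reflects the point above: consent for the newsletter does not imply consent for sharing the same email address with third parties.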

The safest option for any business is to prominently display the organization’s privacy/data policy and get the user’s consent at the point of on-boarding. The policy should clearly indicate how personal data is used. GDPR has a fairly broad definition of personal data, which also covers cookies, IP addresses, device identifiers, etc. Here is a very good example of a data policy by Facebook. The policy addresses the following aspects:

What kinds of information does Facebook collect?
How does Facebook use that information?
How is the information shared?
How can a user manage or delete information about himself/herself?
How does Facebook respond to legal requests or prevent harm?
How will Facebook notify its users of changes to the data policy?

Facebook also has another policy, which goes into the details, explaining its usage of cookies.

What are cookies?
Why does Facebook use cookies (the purpose of cookies)?
Where does Facebook store cookies?
Do other parties use cookies in connection with the Facebook Services?
How can you control Facebook’s use of cookies to show you ads?

Self-service User Portal

Once the user is signed up, the IAM infrastructure has to provide a portal where the user can view and update his/her profile. This is a common feature that most IAM vendors provide. Under GDPR it’s crucial that the user has this option to view and update his/her own personal data, but GDPR goes well beyond that. When designing a user portal in a GDPR-compliant manner, we need to think through the following aspects:

The user needs to have a way to view/update his/her personal data. The IAM infrastructure should have a mechanism to propagate such changes to other applications (service providers).
The user needs to have an option to delete his/her own personal data. The IAM infrastructure should have a mechanism to propagate such changes to other applications (service providers).
The user should have an option to request an export of his/her personal data. The role of the IAM infrastructure is limited here. We do not store/track business/transactional data at the IAM layer. For example, the IAM layer may track which service providers you log in to and when, but it does not track what personal data those service providers store against your account.
The user portal should provide an option to see which service providers the user has given consent to share personal data with, and under which conditions, and there should be a way to revoke that consent. Any revocation by the user has to be propagated to the corresponding service providers, so they can act upon it.
The user portal should provide an option for the user to restrict the usage of selected personal data. For example, the user should be able to say: stop using my physical address. Then the IAM infrastructure has to enforce that policy everywhere it shares user attributes with other parties. Whether during a login flow with SAML 2.0 or OIDC, in a SCIM API call, or even in an administrative dashboard or report, the physical address should not appear against that user.
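The last item in the list above, restricting the use of selected attributes, tends to work best when every outbound path goes through a single filter. The following is a minimal sketch under that assumption; `release_attributes` is a hypothetical helper, and a real deployment would call something like it when building SAML assertions, OIDC claims, SCIM responses and reports.

```python
def release_attributes(profile, requested, restricted):
    """Return only the requested attributes the user has not restricted.

    profile    -- dict of all attributes stored for the user
    requested  -- attribute names a service provider or report asks for
    restricted -- attribute names the user has asked us to stop using
    """
    return {
        name: profile[name]
        for name in requested
        if name in profile and name not in restricted
    }


profile = {
    "first_name": "Alice",
    "email": "alice@example.com",
    "street_address": "1 Main St",
}
# The user has restricted the physical address in the portal.
restricted = {"street_address"}

claims = release_attributes(profile, ["email", "street_address"], restricted)
```

Because the restriction is applied at the point of release rather than per protocol, the same rule covers login flows, provisioning calls and dashboards.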

Identity Federation and Single Sign On

Identity federation and single sign-on are two key functions provided by an IAM infrastructure. They can operate between multiple applications within the same trust domain (of the controller, as defined in GDPR) or between multiple trust domains. For example, if your employees use their corporate credentials to log in to the internal HR application, that interaction is within the same trust domain. But if your employees access Salesforce or Google Apps with their corporate credentials (let’s say using SAML), then the interaction happens between two trust domains.

Whenever the identity provider in your corporate IAM infrastructure shares personal data with other applications (within the same trust domain or outside), it has to do so with the user’s consent. Even inside the same trust domain, there can be applications that require attributes for a purpose different from the one for which the identity provider originally collected them.

Even in the same overall trust domain, the identity provider runs on a different trust plane. It’s quite important to understand the trust boundary between the identity provider and the service providers. Let me give you an example of how we (WSO2) handled this with one of the largest technology providers for electronic payment systems in the USA.

They had multiple applications running within the same domain, open to both their customers and employees. Different applications have different attribute requirements. One application may need first name, last name and email address, while another may need the phone number in addition. We didn’t want each application to collect attributes directly from the user and then update the central identity store. The corporate identity provider is on a much higher trust plane than the other applications, which in fact consume user attributes. We will not allow a write from a lower trust plane to a higher trust plane. The following lists, step by step, the approach we followed.

User visits the application he/she wants to access and clicks on login, which will take the user to the identity provider.
The identity provider (assuming the user has no login session) takes the user to a page which has both the login and signup options.
The user picks signup.
The identity provider, knowing which application the initial request came from, prompts the user to enter the attributes requested by that application. It also provides a link to the privacy/data policy of that application. The identity provider captures all this metadata about the application during the application on-boarding process.
Once the user signs up with the identity provider — providing the consent for both the identity provider’s and service provider’s data/privacy policies — the identity provider shares the requested user attributes with the service provider — and the user logs into the application.
The above completes the initial signup flow of a user. Now the same user tries to access another application, which needs more user attributes.
The user clicks on the login link of the new application, which will take him/her to the identity provider.
The identity provider (assuming user has a login session) takes the user to a page where he/she has to enter the missing attributes, required by this new application. At the same time it provides a link to the privacy/data policy of the corresponding application.
Once the user provides the missing attributes to the identity provider — with his/her consent for both the identity provider’s and service provider’s data/privacy policies — the identity provider shares the requested user attributes with the service provider — and the user logs into the application.

The above approach keeps control of user attributes central, while still decentralizing responsibility. Each service provider must be GDPR compliant in the way it handles personal data.
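The decision the identity provider makes in the steps above (prompt for missing attributes and consent, or share and log the user in) can be sketched roughly as follows. The `ServiceProvider` fields and the `next_step` function are hypothetical names for this example and do not describe how WSO2 actually implemented it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceProvider:
    """Metadata captured when the application is on-boarded (illustrative)."""
    name: str
    required_attributes: frozenset
    policy_url: str


def next_step(sp, profile, consented_sps):
    """Decide what the identity provider does when a user arrives from sp."""
    missing = sorted(sp.required_attributes - set(profile))
    if missing or sp.name not in consented_sps:
        # Collect the missing attributes at the identity provider and
        # show the application's own privacy/data policy for consent.
        return {"action": "prompt", "missing": missing, "policy_url": sp.policy_url}
    claims = {a: profile[a] for a in sorted(sp.required_attributes)}
    return {"action": "share", "claims": claims}


app1 = ServiceProvider(
    "app1", frozenset({"first_name", "last_name", "email"}),
    "https://app1.example.com/privacy")
app2 = ServiceProvider(
    "app2", frozenset({"first_name", "last_name", "email", "phone"}),
    "https://app2.example.com/privacy")

profile = {"first_name": "Alice", "last_name": "Doe", "email": "alice@example.com"}
consented = {"app1"}

first = next_step(app1, profile, consented)    # already signed up for app1
second = next_step(app2, profile, consented)   # app2 needs phone and consent
```

Note that only the identity provider writes to the profile; the applications receive claims and never update the central store, which matches the trust-plane rule described earlier.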

When you are working with a SaaS app, you do not have that luxury. It’s not just two trust planes but two separate trust domains. You can only get the user’s consent to share the attributes with the third-party SaaS app; the SaaS app itself then has to act as a GDPR-compliant controller. This may look like a gray area, and in that case it’s always better to consult your lawyers.

Identity Provisioning

Identity provisioning is another key feature that the identity provider offers in an IAM infrastructure. For example, during the on-boarding of an employee, you may need to provision him/her to other applications with the appropriate roles. You may also do selective provisioning: for example, you may provision everyone in your organization to Google Apps, but only the sales team to Salesforce. In such cases, you as the organization (and also as the GDPR controller) should make sure that all the dependent applications treat personal data in a GDPR-compliant manner, and get the user’s consent for attribute sharing during the on-boarding process itself. If you introduce new applications that require attribute sharing, the user’s consent can be taken via the user portal, and also by sending a notification to the user’s registered email address.
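Selective provisioning combined with consent can be thought of as two filters applied in sequence: a business rule per application, then the user’s consent for that application. The sketch below is illustrative only; the rule table and the `targets_for` helper are assumptions, not a real provisioning connector API.

```python
# Hypothetical provisioning rules: application name -> predicate on the user.
PROVISIONING_RULES = {
    "google_apps": lambda user: True,                       # everyone
    "salesforce": lambda user: "sales" in user["groups"],   # sales team only
}


def targets_for(user, consented_apps):
    """Applications the user should be provisioned to right now.

    An application qualifies only if its business rule matches AND the
    user has consented to attribute sharing with that application.
    """
    return sorted(
        app
        for app, rule in PROVISIONING_RULES.items()
        if rule(user) and app in consented_apps
    )


engineer = {"name": "alice", "groups": ["engineering"]}
seller = {"name": "bob", "groups": ["sales"]}

engineer_targets = targets_for(engineer, {"google_apps", "salesforce"})
seller_targets = targets_for(seller, {"google_apps", "salesforce"})
# A newly introduced application stays out until consent is recorded.
pending_targets = targets_for(seller, {"google_apps"})
```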

Just-in-time Provisioning (JIT)

JIT provisioning is another way of on-boarding users into the system. This is a common use case that we’ve found in most IAM deployments. Let me walk you through an example here.

You have multiple applications/service providers in your business, which are open to the public. Each service provider trusts the corporate identity provider and has enabled social login to the corresponding application via the corporate identity provider.
The user clicks on login in the application, which takes the user to the corporate identity provider, which in turn redirects the user to the social identity provider (let’s say Facebook) for login.
The user gives consent at Facebook for sharing attributes with the corporate identity provider and, upon successful login, Facebook redirects the user back to the corporate identity provider.
The corporate identity provider now JIT provisions the user to its own local identity store. This includes all the attributes fetched from Facebook. This provisioning may be required for reporting, for marketing, for any offline communication with the user, or even to define access control policies for the user on the application side.
To be GDPR compliant, the identity provider has to get the user’s consent for storing the attributes and for the purposes they are going to be used for. This again needs to be treated as a user on-boarding function, with the user’s consent obtained on the corporate/application data/privacy policies.
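The key point of the last step is that nothing from the social login should reach the local identity store before the user has agreed. A minimal sketch of that gate follows. It assumes the federated claims arrive as a plain dictionary with a `sub` identifier (as in OIDC); the function name and the idea of storing only the consented subset are choices made for this example.

```python
def jit_provision(store, federated_claims, consented_attributes):
    """Persist a federated user locally, limited to what was consented.

    store                -- dict acting as the local identity store
    federated_claims     -- attributes returned by the social identity provider
    consented_attributes -- attribute names the user agreed to have stored;
                            an empty collection means no consent was given
    """
    if not consented_attributes:
        return None  # no consent, so nothing is written
    user_id = federated_claims["sub"]
    record = {
        name: value
        for name, value in federated_claims.items()
        if name in consented_attributes
    }
    store[user_id] = record
    return record


store = {}
claims = {"sub": "fb-123", "email": "alice@example.com", "birthday": "1990-01-01"}

declined = jit_provision(store, claims, set())
store_after_decline = dict(store)
accepted = jit_provision(store, claims, {"email"})
```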

Identity Analytics

Identity analytics plays a key role in an IAM infrastructure. Let me list a few key scenarios where identity analytics comes in handy.

To track all the successful/failed login attempts — with the origin IP address.
To track all user login patterns against service providers and external identity providers.
To build a complete login profile of a given user for a given time period.
To build a complete login profile of a given service provider for a given time period.
To track all the active sessions in the system, average active sessions, average session length, etc.
To track all the administrative actions taken by IAM administrators. This may include add/update/delete of user/role/groups/service providers/identity providers/policies/etc.
To track and detect anomalous activities and fire notifications.

Most of this information is collected centrally at the corporate identity provider level and pushed into an analytics engine for further processing. Under GDPR, any of this information collected against a given user identity counts as personal data and must be handled with care. To be transparent with users, the corporate data policy must mention all of this information, and user consent must be obtained. For example, the following are a couple of extracts from Facebook’s data policy, under the section What kinds of information do we collect?

Device information.
We collect information from or about the computers, phones, or other devices where you install or access our Services, depending on the permissions you’ve granted. We may associate the information we collect from your different devices, which helps us provide consistent Services across your devices. Here are some examples of the device information we collect:
Attributes such as the operating system, hardware version, device settings, file and software names and types, battery and signal strength, and device identifiers.
Device locations, including specific geographic locations, such as through GPS, Bluetooth, or WiFi signals.
Connection information such as the name of your mobile operator or ISP, browser type, language and time zone, mobile phone number and IP address.
Your networks and connections.
We collect information about the people and groups you are connected to and how you interact with them, such as the people you communicate with the most or the groups you like to share with. We also collect contact information you provide if you upload, sync or import this information (such as an address book) from a device.

Pseudonymized Data

As we discussed before, the definition of personal data under GDPR has a broad scope. It’s a good practice to record any additional data that you collect apart from direct user attributes (like what we gather in identity analytics) against a pseudonym. Pseudonymous and anonymous carry two different meanings: anonymous data is pseudonymous data that can no longer be linked back to the user. Any data about the user apart from the core set of attributes, like user behaviors and access patterns, can be recorded against a pseudonym. The link between the pseudonym and the user can be maintained in a separate table, and to make it more secure, we can encrypt the data in this table. Once the user requests to delete his/her account, the mapping between the username and the pseudonym can be removed along with the user attributes. This makes all the data recorded against that pseudonym anonymous. Anonymous data is safe under GDPR, and you do not need to worry about erasing it.
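The separate mapping table described above can be sketched in a few lines. This is a minimal in-memory illustration; `PseudonymStore` and its methods are hypothetical names, and the encryption of the mapping table is only noted in a comment rather than implemented.

```python
import secrets


class PseudonymStore:
    """Keeps analytics events keyed by pseudonym, never by username."""

    def __init__(self):
        # The only link between a user and a pseudonym. In a real system
        # this would live in its own table and be encrypted at rest.
        self._user_to_pseudonym = {}
        self.events = []  # (pseudonym, event) pairs

    def pseudonym_for(self, username):
        if username not in self._user_to_pseudonym:
            self._user_to_pseudonym[username] = secrets.token_hex(16)
        return self._user_to_pseudonym[username]

    def record(self, username, event):
        self.events.append((self.pseudonym_for(username), event))

    def forget(self, username):
        """Remove the link; events already recorded become unlinkable."""
        self._user_to_pseudonym.pop(username, None)


analytics = PseudonymStore()
analytics.record("alice", "login_success")
analytics.record("alice", "login_failed")
old_pseudonym = analytics.pseudonym_for("alice")

analytics.forget("alice")
new_pseudonym = analytics.pseudonym_for("alice")  # a fresh, unrelated value
```

One caveat worth keeping in mind: this only helps if the recorded events themselves carry nothing that identifies the user. An event that embeds an email address or a stable IP address could still be linked back even after the mapping is gone.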

Then again, the right to be forgotten is a tricky requirement under GDPR. Some financial/tax regulations require the retention of certain data for a given period. When there is a conflicting regulation, most of the time the tax regulation will win. You may need to consult your lawyer to get more clarity on this.

Securing the IAM Infrastructure

The key objective of GDPR is the privacy of users’ personal data. All the rules in GDPR exist to achieve that goal. GDPR does not define how to do things, only what needs to be achieved. Properly securing the IAM infrastructure plays a key role in meeting GDPR objectives, as the corporate identity provider acts as the single source of truth. The following are a few things you need to worry about in securing the IAM infrastructure.

Enable strong authentication for IAM administrators and users. Look for identity provider solutions that support FIDO U2F, TOTP and SMS/email OTP.
Do not share the IAM infrastructure between customers (users) and corporate employees.
Enforce access control policies, and do an impact analysis on all changes to such policies.
Employ identity analytics to detect anomalous activities and fire notifications.
Audit all the actions in the IAM infrastructure — not just by IAM administrators — but also by the system administrators.
Have a way to protect the integrity of the audit logs and verify frequently.
Employ a Privileged Account Management (PAM) system to manage powerful/privileged/system accounts within the IAM infrastructure.
Protect the confidentiality and integrity of personal data, both at rest and in transit. Data at rest can be protected with disk-level or database-level encryption. TLS must be used everywhere personal data is shared over communication channels.
Have a plan for high-availability and disaster recovery.
Check out the recommendations in the ISO/IEC 27000 family of standards and BS 10012.
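For the audit-log integrity item in the list above, one common technique is a hash chain, where each entry includes the hash of the previous one so that any later modification breaks verification. The sketch below is a simplified illustration with made-up field names; it is not taken from any particular IAM product, and a production setup would also need to protect the latest hash itself (for example by storing it separately).

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(actor, action, prev_hash):
    body = {"actor": actor, "action": action, "prev": prev_hash}
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(log, actor, action):
    """Append an audit entry chained to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "actor": actor,
        "action": action,
        "prev": prev_hash,
        "hash": _digest(actor, action, prev_hash),
    })


def verify(log):
    """Recompute every hash and check that the chain is unbroken."""
    prev_hash = GENESIS
    for entry in log:
        expected = _digest(entry["actor"], entry["action"], entry["prev"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, "iam-admin", "update access policy")
append_entry(audit_log, "sysadmin", "restart identity provider")
intact = verify(audit_log)

# Simulate tampering with an earlier entry.
audit_log[0]["action"] = "nothing to see here"
tampered = verify(audit_log)
```

Running `verify` on a schedule is one way to meet the "verify frequently" part of that recommendation.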