GDPR Data Retention: What to Keep, When to Delete, and Why Most Teams Get It Wrong

GDPR doesn't tell you how long to keep data — it tells you to justify why you still have it. Here's a practical framework for deciding retention periods, choosing between anonymization and deletion, and avoiding the most common compliance mistakes.

The Retention Problem Nobody Plans For

GDPR doesn't hand you a spreadsheet with retention periods. It says something much harder to implement: keep data only as long as you have a valid reason, and get rid of it when that reason expires.

Most teams don't think about this until someone submits a "delete my data" request — and then they discover their user's email lives in 10 different tables, three third-party tools, and last Tuesday's backup. There's no single owner for the decision "how long do we keep this?" because nobody ever made that decision explicitly.

In May 2023, Ireland's Data Protection Commission fined Meta €1.2 billion for transferring EU users' data to the US without adequate safeguards. But it's not just tech giants. Small businesses across Europe also face fines, often in the range of €5,000–50,000, for missing retention policies or ignoring deletion requests. Estonia's Data Protection Inspectorate (Andmekaitse Inspektsioon) is active too: enforcement is real even for companies with a handful of employees.

Legal Basis: The Root of Every Retention Decision

You cannot define how long to keep data without knowing why you collected it in the first place. GDPR defines six legal bases for processing personal data. In practice, three come up in almost every project:

  • Consent — the user said yes. Data stays until they say no.
  • Contract — you need the data to deliver a service. Data stays while the contract is active.
  • Legitimate interest — you argue the business need yourself and define the scope and duration.

Here's where it gets tricky: the same email address can be processed under different legal bases in different contexts. Your billing system holds it under "contract." Your newsletter tool holds it under "consent." Your analytics might reference it under "legitimate interest." Each context has its own retention period — for the same field, in the same database, belonging to the same person.

This is why retention can't be a single rule applied globally. It's a decision tree rooted in why you have each piece of data.

Data Mapping: Before You Write Any Policy

You can't build a retention policy without knowing what you store and where. A data map doesn't need to be a 40-page document — but it does need to answer four questions for every type of personal data you hold:

  • What personal data is it? (name, email, IP address, purchase history…)
  • Where does it live? (which database table, which third-party service, which backup)
  • Under which legal basis do you process it?
  • Who owns the decision about its lifecycle?

That last question matters more than people expect. If nobody owns the retention decision, nobody makes it — and data accumulates forever by default. This isn't a technical exercise. It's a conversation between developers, product owners, and whoever handles legal compliance. The output is a shared understanding, not a config file.
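
Even though the real output is a shared understanding, it can help to record the four answers in a form a script can check. The following is a minimal sketch, not a prescribed format: the `DataMapEntry` fields, the table and tool names, and the owner titles are all hypothetical. The six values in `VALID_BASES` are the legal bases listed in GDPR Art. 6(1).

```python
from dataclasses import dataclass
from typing import List, Optional

# The six legal bases from GDPR Art. 6(1).
VALID_BASES = {
    "consent", "contract", "legal_obligation",
    "vital_interests", "public_task", "legitimate_interest",
}

@dataclass(frozen=True)
class DataMapEntry:
    data_type: str          # what personal data is it?
    location: str           # where does it live?
    legal_basis: str        # under which legal basis is it processed?
    owner: Optional[str]    # who owns the lifecycle decision?

def find_gaps(entries: List[DataMapEntry]) -> List[str]:
    """Return one message per entry that lacks an owner or a valid legal basis."""
    problems = []
    for e in entries:
        if not e.owner:
            problems.append(f"{e.data_type} @ {e.location}: no owner")
        if e.legal_basis not in VALID_BASES:
            problems.append(
                f"{e.data_type} @ {e.location}: unknown legal basis {e.legal_basis!r}"
            )
    return problems

# Hypothetical entries: the same email appears twice, once per context.
DATA_MAP = [
    DataMapEntry("email", "billing.customers", "contract", "finance lead"),
    DataMapEntry("email", "newsletter tool", "consent", "marketing lead"),
    DataMapEntry("ip_address", "access logs", "legitimate_interest", None),
]

for problem in find_gaps(DATA_MAP):
    print(problem)
```

Note that the email address gets one entry per context, which matches the point made earlier: the same field can sit under different legal bases, each with its own lifecycle.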

How to Set a Retention Period for Any Data Type

The logic is always the same: why did you collect it → when is that purpose fulfilled → is there an external requirement to keep it longer?

A few concrete examples:

  • Contact form submission — purpose is fulfilled when you've responded (or decided not to). A reasonable retention: 6 months, then delete.
  • Transaction record — tax law requires keeping financial records. In Estonia, that's 7 years under the Accounting Act. But "financial record" means the transaction date, amount, and tax category — not necessarily the buyer's full profile, shipping address, or phone number.
  • Marketing preferences — retained until the user withdraws consent. No consent = no data.
  • Server access logs — legitimate interest in security monitoring. 90 days is common; 12 months is the upper bound most DPAs consider reasonable without specific justification.

One important detail: different fields within the same database record can have different retention periods. An order record might keep the amount and date for 7 years (tax obligation) while the customer name and address get anonymized after 2 years (no longer needed for the original purpose). Treating an entire table as one retention unit is the most common shortcut — and the most common mistake.
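
Field-level retention can be sketched as a lookup from field name to retention period. The field names and the `expired_fields` helper below are hypothetical, and the periods simply mirror the order example above. Years are approximated as 365 days and counted from the order date, which is an assumption: statutory periods often start from a different date, such as the end of the financial year.

```python
from datetime import date
from typing import Dict, List

# Hypothetical per-field retention rules for an order record, in days.
RETENTION_DAYS: Dict[str, int] = {
    "amount": 7 * 365,            # tax obligation
    "order_date": 7 * 365,        # tax obligation
    "customer_name": 2 * 365,     # original purpose fulfilled
    "shipping_address": 2 * 365,  # original purpose fulfilled
}

def expired_fields(order_date: date, today: date,
                   rules: Dict[str, int] = RETENTION_DAYS) -> List[str]:
    """Return the fields whose retention period has passed, sorted by name."""
    age_days = (today - order_date).days
    return sorted(field for field, days in rules.items() if age_days > days)

# A three-year-old order: identity fields are due, financial fields are not.
print(expired_fields(date(2021, 1, 1), date(2024, 1, 1)))
```

A scheduled job could then anonymize or delete only the returned fields, leaving the rest of the record in place.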

Three Types of "Deletion" — and Why Soft Delete Isn't One

When someone says "we deleted the user," they usually mean one of three things. Only two of them count under GDPR:

  • Soft delete — a flag flips from 0 to 1. The data is hidden from the UI but still sits in the database, still appears in backups, still shows up in direct queries. Legally, this is not deletion. It's concealment. GDPR doesn't care about your UI — it cares about whether the data exists.
  • Anonymization — personal identifiers are removed irreversibly. The record still exists, but it can no longer be linked to a person. Legally, anonymized data is no longer personal data. GDPR stops applying to it.
  • Hard delete — the data is physically removed. Gone from the database, gone from indexes, eventually gone from backups as they rotate out.

The most common illusion: "we soft-delete users, so we're compliant." No. If the data is recoverable, it's not deleted. If your admin panel has a "restore" button for deleted users, you haven't deleted anything.
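
The difference is easy to demonstrate with an in-memory SQLite database. This is an illustrative sketch with a hypothetical `users` table and `deleted` flag. It only shows what remains queryable; backups and on-disk remnants are a separate concern that it does not address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, deleted INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO users (id, email) VALUES (?, ?)",
    [(1, "anna@example.com"), (2, "mart@example.com")],
)

# Soft delete: flip the flag for user 1.
conn.execute("UPDATE users SET deleted = 1 WHERE id = 1")

# What the UI sees (it filters on the flag).
visible = conn.execute("SELECT email FROM users WHERE deleted = 0").fetchall()

# What a direct query sees: the email is still stored.
stored_after_soft = conn.execute("SELECT email FROM users WHERE id = 1").fetchall()

# Hard delete: the row is removed from the table.
conn.execute("DELETE FROM users WHERE id = 1")
stored_after_hard = conn.execute("SELECT email FROM users WHERE id = 1").fetchall()

print(visible)            # only the non-deleted user
print(stored_after_soft)  # the "deleted" user's email is still there
print(stored_after_hard)  # empty
```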

Pseudonymization Is Not Anonymization

Replacing an email with a hash feels like anonymization. It isn't. GDPR draws a sharp line between the two:

  • Pseudonymization: personal data is transformed so it can't be attributed to a person without additional information (like a key or lookup table). The data is still personal data under GDPR because re-identification is possible.
  • Anonymization: the person can no longer be identified by any means reasonably likely to be used, even when the record is combined with other available data. Only then does GDPR stop applying.

The test isn't whether you can re-identify someone — it's whether anyone reasonably could. Nulling an email but leaving a rare first name + city + date of birth in the same record? That combination can identify a person. GDPR sees through it.

True anonymization requires checking the full combination of remaining fields, not just blanking out obvious identifiers one by one.
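
To see why a hashed email is only pseudonymized, consider anyone who holds a list of candidate addresses, such as a leaked mailing list. The sketch below uses unsalted SHA-256 purely for illustration; the `pseudonymize` and `reidentify` helpers and the addresses are hypothetical.

```python
import hashlib
from typing import Iterable, Optional

def pseudonymize(email: str) -> str:
    """Replace an email with its SHA-256 hex digest (unsalted, for illustration)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()

def reidentify(token: str, candidate_emails: Iterable[str]) -> Optional[str]:
    """Hash each candidate and return the one that matches the stored token."""
    for email in candidate_emails:
        if pseudonymize(email) == token:
            return email
    return None

# What the database stores after "anonymizing" the user.
stored_token = pseudonymize("anna@example.com")

# What an attacker with a candidate list can recover.
candidates = ["mart@example.com", "anna@example.com", "liis@example.com"]
print(reidentify(stored_token, candidates))
```

Adding a secret salt or key makes this lookup harder, but as long as someone holds that key the result is still pseudonymized data, not anonymized data.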

Anonymize or Delete: When to Use Which

The decision is straightforward once you ask the right question: do you need the record for anything after the person is gone?

  • Anonymize when the data serves analytics, reporting, or auditing purposes. An order record (date, amount, product category) without personal identifiers is still valuable for business intelligence.
  • Delete when the data exists solely to identify or contact a person. Names, emails, phone numbers, physical addresses — if there's no business purpose without the identity attached, remove it entirely.

One trap to avoid: partial anonymization without checking the remaining combination. You strip the name and email from an order, but leave the delivery address, the order date, and a niche product. In a small town, that combination might point to exactly one person. Anonymization is only real when the full remaining dataset passes the re-identification test.
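
One rough way to test the remaining combination is to count how many records share each combination of the fields you kept, in the style of a k-anonymity check. This is a sketch with hypothetical records and field names. Passing such a check does not prove legal anonymization, but a group of size one is a clear warning sign.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

def group_sizes(records: List[Dict[str, str]],
                quasi_identifiers: Sequence[str]) -> Counter:
    """Count how many records share each combination of the given fields."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def singled_out(records: List[Dict[str, str]],
                quasi_identifiers: Sequence[str]) -> List[Tuple[str, ...]]:
    """Return the combinations that appear exactly once (one person stands out)."""
    sizes = group_sizes(records, quasi_identifiers)
    return [combo for combo, n in sizes.items() if n == 1]

# Hypothetical orders after names and emails were stripped.
orders = [
    {"city": "Tallinn", "month": "2024-03", "product": "phone case"},
    {"city": "Tallinn", "month": "2024-03", "product": "phone case"},
    {"city": "Kärdla", "month": "2024-03", "product": "falconry glove"},
]

print(singled_out(orders, ["city", "month", "product"]))
print(singled_out(orders, ["month"]))
```

In this example the full combination singles out one order, while the month alone does not. Coarsening or dropping fields until no combination is unique is one possible remedy.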

Consent Withdrawal vs. Right to Erasure: Different Rights, Different Responses

These two requests arrive in the same inbox but trigger different processes:

Consent withdrawal ("I no longer want marketing emails") stops processing based on consent — but data collected under a different legal basis stays. If you also hold that email under a contract (because the user is an active customer), you don't delete it. You stop sending newsletters, but the email remains in your billing system.

Right to Erasure ("delete everything you have about me") is broader, but not absolute. You can decline for data you are legally obliged to retain (tax records, for example), for data needed to establish or defend legal claims, or, where processing rests on legitimate interest, if you can show overriding legitimate grounds. Either way, you must respond within one month (extendable by two further months for complex or numerous requests), explain what you did or why you declined, and document the decision.

Treating these as the same request is a common mistake. A consent withdrawal doesn't require a full data purge. An erasure request doesn't automatically override legal retention obligations. Each needs its own workflow and its own response template.
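
The two workflows can be sketched as two separate planning functions over the same inventory of holdings. Everything here is a simplified illustration: the `Holding` type, the system names, and the action labels are hypothetical, and a real erasure decision needs case-by-case review rather than a lookup on the legal basis alone.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class Holding:
    field: str
    system: str
    legal_basis: str  # e.g. "consent", "contract", "legal_obligation"

Plan = Dict[Tuple[str, str], str]

def plan_consent_withdrawal(holdings: List[Holding]) -> Plan:
    """Stop consent-based processing; data held on another basis stays."""
    return {
        (h.field, h.system): "stop_and_remove" if h.legal_basis == "consent" else "keep"
        for h in holdings
    }

def plan_erasure(holdings: List[Holding]) -> Plan:
    """Erase by default; retain legally required data; flag legitimate interest for review."""
    actions = {"legal_obligation": "retain_and_document", "legitimate_interest": "review"}
    return {(h.field, h.system): actions.get(h.legal_basis, "erase") for h in holdings}

# Hypothetical holdings for one person.
holdings = [
    Holding("email", "newsletter", "consent"),
    Holding("email", "billing", "contract"),
    Holding("invoice", "accounting", "legal_obligation"),
    Holding("ip_address", "access_logs", "legitimate_interest"),
]

withdrawal_plan = plan_consent_withdrawal(holdings)
erasure_plan = plan_erasure(holdings)
print(withdrawal_plan)
print(erasure_plan)
```

Keeping the two functions separate makes the difference visible: the withdrawal plan touches only the newsletter entry, while the erasure plan has to account for every holding and record why anything is retained.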

The Practical Takeaway

Retention isn't a single policy document you write once and forget. It's a set of decisions — one per data type, one per legal basis, one per storage location — that need to be made explicitly, documented clearly, and enforced automatically.

Start with the data map. Define retention periods based on legal basis and purpose. Choose anonymization or deletion for each data type. Build the workflow for consent withdrawal and erasure requests separately. And make sure someone owns each decision — not "the team," not "legal," but a specific person who can explain why this data is kept for this long.

In the next article, we'll take these principles into code: how to implement retention policies in Laravel with scheduled cleanup, cascade rules for related records, and a Right to Erasure workflow that actually works in production.
