Apr 08, 2026

GDPR Data Retention: What to Keep, When to Delete, and Why Most Teams Get It Wrong


GDPR doesn't tell you how long to keep data — it tells you to justify why you still have it. Here's a practical framework for deciding retention periods, choosing between anonymization and deletion, and avoiding the most common compliance mistakes.

The Retention Problem Nobody Plans For

GDPR doesn't hand you a spreadsheet with retention periods. It says something much harder to implement: keep data only as long as you have a valid reason, and get rid of it when that reason expires.

Most teams don't think about this until someone submits a "delete my data" request. Then they discover that the user's email lives in ten different tables, three third-party tools, and last Tuesday's backup. There's no single owner for the decision "how long do we keep this?" because nobody ever made that decision explicitly.

In 2023, Meta received a €1.2 billion fine from the Irish Data Protection Commission for transferring EU users' data to the US without adequate safeguards. But it's not just tech giants. Small businesses across Europe regularly face fines of €5,000–50,000 for missing retention policies or ignoring deletion requests. Estonia's Data Protection Inspectorate (Andmekaitse Inspektsioon) is active too: enforcement is real even for companies with a handful of employees.

Legal Basis: The Root of Every Retention Decision

You cannot define how long to keep data without knowing why you collected it in the first place. GDPR defines six legal bases for processing personal data. In practice, three come up in almost every project:

Consent — the user said yes. Data stays until they say no.
Contract — you need the data to deliver a service. Data stays while the contract is active.
Legitimate interest — you argue the business need yourself and define the scope and duration.

Here's where it gets tricky: the same email address can be processed under different legal bases in different contexts. Your billing system holds it under "contract." Your newsletter tool holds it under "consent." Your analytics might reference it under "legitimate interest." Each context has its own retention period — for the same field, in the same database, belonging to the same person.

This is why retention can't be a single rule applied globally. It's a decision tree rooted in why you have each piece of data.
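To make the "same field, different contexts" point concrete, here is a minimal sketch of how one email address might be recorded once per processing context. The system names (billing, newsletter, analytics) and the wording of the retention rules are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass
from enum import Enum


class LegalBasis(Enum):
    CONSENT = "consent"
    CONTRACT = "contract"
    LEGITIMATE_INTEREST = "legitimate_interest"


@dataclass(frozen=True)
class ProcessingContext:
    system: str          # hypothetical system name
    field: str           # the personal data field being processed
    basis: LegalBasis    # why this system is allowed to process it
    retention_rule: str  # human-readable rule, illustrative only


# The same field appears once per context, each with its own basis and rule.
EMAIL_CONTEXTS = [
    ProcessingContext("billing", "email", LegalBasis.CONTRACT,
                      "while the contract is active"),
    ProcessingContext("newsletter", "email", LegalBasis.CONSENT,
                      "until consent is withdrawn"),
    ProcessingContext("analytics", "email", LegalBasis.LEGITIMATE_INTEREST,
                      "for the documented scope and duration"),
]


def systems_for(field: str, basis: LegalBasis) -> list[str]:
    """Return the systems that process `field` under `basis`."""
    return [c.system for c in EMAIL_CONTEXTS
            if c.field == field and c.basis == basis]
```

Asking `systems_for("email", LegalBasis.CONSENT)` then answers the question "where do we have to stop processing when consent goes away?" without touching the billing copy.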

Data Mapping: Before You Write Any Policy

You can't build a retention policy without knowing what you store and where. A data map doesn't need to be a 40-page document — but it does need to answer four questions for every type of personal data you hold:

What personal data is it? (name, email, IP address, purchase history…)
Where does it live? (which database table, which third-party service, which backup)
Under which legal basis do you process it?
Who owns the decision about its lifecycle?

That last question matters more than people expect. If nobody owns the retention decision, nobody makes it — and data accumulates forever by default. This isn't a technical exercise. It's a conversation between developers, product owners, and whoever handles legal compliance. The output is a shared understanding, not a config file.
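Even though the output is a shared understanding rather than a config file, it can help to write the four answers down in a structured form so gaps become visible. The sketch below is one possible shape; the field names and sample entries are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataMapEntry:
    data_type: str         # what personal data is it?
    locations: list[str]   # where does it live?
    legal_basis: str       # under which legal basis is it processed?
    owner: Optional[str]   # who owns the lifecycle decision?


def unowned(entries: list[DataMapEntry]) -> list[str]:
    """Return the data types for which nobody owns the retention decision."""
    return [e.data_type for e in entries if not e.owner]


# Hypothetical sample entries.
DATA_MAP = [
    DataMapEntry("email", ["users table", "newsletter tool", "backups"],
                 "consent", "product owner"),
    DataMapEntry("ip_address", ["access logs"], "legitimate interest", None),
]
```

Here `unowned(DATA_MAP)` flags `ip_address`, which is exactly the kind of data that tends to accumulate by default.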

How to Set a Retention Period for Any Data Type

The logic is always the same: why did you collect it → when is that purpose fulfilled → is there an external requirement to keep it longer?

A few concrete examples:

Contact form submission — purpose is fulfilled when you've responded (or decided not to). A reasonable retention: 6 months, then delete.
Transaction record — tax law requires keeping financial records. In Estonia, that's 7 years under the Accounting Act. But "financial record" means the transaction date, amount, and tax category — not necessarily the buyer's full profile, shipping address, or phone number.
Marketing preferences — retained until the user withdraws consent. No consent = no data.
Server access logs — legitimate interest in security monitoring. 90 days is common; keeping them beyond 12 months usually needs a specific, documented justification.

One important detail: different fields within the same database record can have different retention periods. An order record might keep the amount and date for 7 years (tax obligation) while the customer name and address get anonymized after 2 years (no longer needed for the original purpose). Treating an entire table as one retention unit is the most common shortcut — and the most common mistake.
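Field-level retention can be sketched as a lookup from field name to retention period, applied per record. The periods below mirror the order example (7 years for amount and date, 2 years for name and address); the field names, and the choice to simply clear expired fields, are simplifying assumptions.

```python
from datetime import date

# Retention per field, in years. Values follow the order example in the text.
FIELD_RETENTION_YEARS = {
    "amount": 7,
    "order_date": 7,
    "customer_name": 2,
    "shipping_address": 2,
}


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def expired_fields(order_date: date, today: date) -> list[str]:
    """Return the fields whose retention period has ended, sorted by name."""
    return sorted(
        name for name, years in FIELD_RETENTION_YEARS.items()
        if today >= add_years(order_date, years)
    )


def apply_retention(order: dict, today: date) -> dict:
    """Return a copy of the order with expired fields cleared (set to None)."""
    expired = set(expired_fields(order["order_date"], today))
    return {key: (None if key in expired else value)
            for key, value in order.items()}
```

Once every field of a record has expired, the caller would delete the row rather than keep an empty shell. Clearing a field is only one step toward anonymization: the remaining fields still have to pass a re-identification check.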

Three Types of "Deletion" — and Why Soft Delete Isn't One

When someone says "we deleted the user," they usually mean one of three things. Only two of them count under GDPR:

Soft delete — a flag flips from 0 to 1. The data is hidden from the UI but still sits in the database, still appears in backups, still shows up in direct queries. Legally, this is not deletion. It's concealment. GDPR doesn't care about your UI — it cares about whether the data exists.
Anonymization — personal identifiers are removed irreversibly. The record still exists, but it can no longer be linked to a person. Legally, anonymized data is no longer personal data. GDPR stops applying to it.
Hard delete — the data is physically removed. Gone from the database, gone from indexes, eventually gone from backups as they rotate out.

The most common illusion: "we soft-delete users, so we're compliant." No. If the data is recoverable, it's not deleted. If your admin panel has a "restore" button for deleted users, you haven't deleted anything.
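The difference between the three is easy to see with a direct query. The sketch below uses an in-memory SQLite database with a hypothetical `users` table. The anonymization step is deliberately simplified: it only overwrites the direct identifiers, and none of this addresses backups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, email TEXT, name TEXT, deleted INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO users (id, email, name) VALUES (?, ?, ?)",
    [
        (1, "a@example.com", "Alice"),
        (2, "b@example.com", "Bob"),
        (3, "c@example.com", "Carol"),
    ],
)

# Soft delete: only a flag changes, the personal data is untouched.
conn.execute("UPDATE users SET deleted = 1 WHERE id = 1")

# Anonymization (simplified): direct identifiers are overwritten for good.
conn.execute("UPDATE users SET email = NULL, name = NULL WHERE id = 2")

# Hard delete: the row itself is removed.
conn.execute("DELETE FROM users WHERE id = 3")


def raw_row(user_id: int):
    """Read a user the way a direct query would, ignoring the deleted flag."""
    return conn.execute(
        "SELECT email, name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
```

A direct query for the soft-deleted user 1 still returns the email and name. User 2's row survives without identifiers, and user 3 returns nothing at all.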

Pseudonymization Is Not Anonymization

Replacing an email with a hash feels like anonymization. It isn't. GDPR draws a sharp line between the two:

Pseudonymization — personal data is transformed so it can't be attributed to a person without additional information (like a key or lookup table). The data is still personal data under GDPR because re-identification is possible.
Anonymization — nobody can re-identify the person by any means reasonably likely to be used, even when the record is combined with other available data (Recital 26). Only then does GDPR stop applying.
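A short sketch shows why an unsalted hash of an email is pseudonymization at best. Anyone holding a list of candidate addresses (a leaked list, a CRM export) can hash each one and compare. The function names are hypothetical; SHA-256 via `hashlib` is used only as a familiar example.

```python
import hashlib
from typing import Iterable, Optional


def pseudonymize(email: str) -> str:
    """Replace an email with its SHA-256 hex digest (normalized first)."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def reidentify(token: str, candidates: Iterable[str]) -> Optional[str]:
    """Recover the email behind a token by hashing known candidates."""
    for email in candidates:
        if pseudonymize(email) == token:
            return email
    return None
```

If `reidentify` can succeed with data an outsider could plausibly obtain, the hashed column is still personal data.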

The test isn't whether you can re-identify someone — it's whether anyone reasonably could. Nulling an email but leaving a rare first name + city + date of birth in the same record? That combination can identify a person. GDPR sees through it.

True anonymization requires checking the full combination of remaining fields, not just blanking out obvious identifiers one by one.
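One rough way to check a combination of remaining fields is to count how many records share each combination and flag those that occur only once. This is a minimal sketch under the assumption that the listed fields are the only quasi-identifiers. Passing it does not prove anonymization, but failing it is a clear warning sign.

```python
from collections import Counter


def unique_combinations(records: list[dict], quasi_identifiers: list[str]) -> list[dict]:
    """Return records whose combination of quasi-identifier values is unique."""
    keys = [tuple(r.get(q) for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return [r for r, key in zip(records, keys) if counts[key] == 1]


# Hypothetical records after the email column has been nulled.
RECORDS = [
    {"first_name": "Anna", "city": "Tallinn", "birth_year": 1990},
    {"first_name": "Anna", "city": "Tallinn", "birth_year": 1990},
    {"first_name": "Isolde", "city": "Kuressaare", "birth_year": 1990},
]
```

Checked on `birth_year` alone, no record stands out. Checked on the full combination, the third record is unique, so it can still point to one person.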

Anonymize or Delete: When to Use Which

The decision is straightforward once you ask the right question: do you need the record for anything after the person is gone?

Anonymize when the data serves analytics, reporting, or auditing purposes. An order record (date, amount, product category) without personal identifiers is still valuable for business intelligence.
Delete when the data exists solely to identify or contact a person. Names, emails, phone numbers, physical addresses — if there's no business purpose without the identity attached, remove it entirely.

One trap to avoid: partial anonymization without checking the remaining combination. You strip the name and email from an order, but leave the delivery address, the order date, and a niche product. In a small town, that combination might point to exactly one person. Anonymization is only real when the full remaining dataset passes the re-identification test.

Consent Withdrawal vs. Right to Erasure: Different Rights, Different Responses

These two requests arrive in the same inbox but trigger different processes:

Consent withdrawal ("I no longer want marketing emails") stops processing based on consent — but data collected under a different legal basis stays. If you also hold that email under a contract (because the user is an active customer), you don't delete it. You stop sending newsletters, but the email remains in your billing system.

Right to Erasure ("delete everything you have about me") is broader, but not absolute. You can decline for data you are legally obliged to retain (tax records, for example), for data you need to establish or defend legal claims, or, where processing rests on legitimate interest, if you can show overriding legitimate grounds. Either way, you must respond within one month (Art. 12(3), extendable by two further months for complex requests), explain what you did or why you declined, and document the decision.

Treating these as the same request is a common mistake. A consent withdrawal doesn't require a full data purge. An erasure request doesn't automatically override legal retention obligations. Each needs its own workflow and its own response template.
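The two workflows can be sketched as one planning function that splits a person's data into what the request affects and what stays. The holdings, the basis labels, and especially the rule that only `legal_obligation` data survives an erasure request are simplifying assumptions; the real list of exceptions has to come from legal review.

```python
from dataclasses import dataclass
from enum import Enum


class RequestType(Enum):
    CONSENT_WITHDRAWAL = "consent_withdrawal"
    ERASURE = "erasure"


@dataclass(frozen=True)
class Holding:
    field: str
    system: str   # hypothetical system name
    basis: str    # e.g. "consent", "contract", "legal_obligation"


# Assumption for this sketch: only legally mandated data survives erasure.
RETAIN_ON_ERASURE = {"legal_obligation"}


def plan(request: RequestType, holdings: list[Holding]):
    """Split holdings into (affected, retained) for the given request type."""
    if request is RequestType.CONSENT_WITHDRAWAL:
        # Only processing that relied on consent stops.
        affected = [h for h in holdings if h.basis == "consent"]
    else:
        # Erasure covers everything except documented exceptions.
        affected = [h for h in holdings if h.basis not in RETAIN_ON_ERASURE]
    retained = [h for h in holdings if h not in affected]
    return affected, retained


HOLDINGS = [
    Holding("email", "newsletter", "consent"),
    Holding("email", "billing", "contract"),
    Holding("invoice", "accounting", "legal_obligation"),
]
```

For a consent withdrawal only the newsletter copy is affected. For an erasure request the newsletter and billing copies go, while the accounting record is retained and the reason is documented in the response.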

The Practical Takeaway

Retention isn't a single policy document you write once and forget. It's a set of decisions — one per data type, one per legal basis, one per storage location — that need to be made explicitly, documented clearly, and enforced automatically.

Start with the data map. Define retention periods based on legal basis and purpose. Choose anonymization or deletion for each data type. Build the workflow for consent withdrawal and erasure requests separately. And make sure someone owns each decision — not "the team," not "legal," but a specific person who can explain why this data is kept for this long.

In the next article, we'll take these principles into code: how to implement retention policies in Laravel with scheduled cleanup, cascade rules for related records, and a Right to Erasure workflow that actually works in production.

