SAGE PEOPLE CLOUD ARCHIVE

    Sage People Cloud Archive — Queryable Parquet, Cents per GB

    Retire your Sage People org without losing a row of history. Columnar Parquet on your own S3 / Blob / GCS, queryable via Athena / Synapse / BigQuery, tiered hot-to-frozen automatically, UK GDPR / ICO compliant.

    5–10×
    Cheaper than read-only org
    Parquet
    Columnar, open format
    UK / EU
    Data residency controls
    Athena/Synapse/BQ
    SQL on archived data

    Why the Sage People cloud archive beats every other retention option

    Sage People is a Salesforce Platform org. Keeping it alive read-only after a migration burns Salesforce licences forever. Killing it loses the history. The Sage People cloud archive is the third option — and it wins on every dimension that matters.

    When a Sage People customer migrates to Oracle Fusion HCM, Workday, or any other target, the source org doesn't disappear. It sits there holding 5–10+ years of Worker__c, Employment_Record__c, Salary__c, Leave_Request__c, and Performance_Review__c history that you legally cannot throw away. UK GDPR Art. 5(1)(e) lets you keep personal data 'no longer than necessary', and for HR records that necessity typically runs 6–7 years post-leaver under HMRC and the Limitation Act, and longer for pension scheme members under TPR rules.

    The default answer most customers fall into — keeping the Salesforce org alive in read-only mode — costs £60–150k/year for a typical 2,000-employee firm, indefinitely, because Salesforce Platform licences are priced for active users not archive retention. The Sage People cloud archive replaces that with a £10–20k/year footprint: pennies-per-GB object storage in your own cloud tenant, plus the Syntra ETL query catalog. Same legal coverage. Better query performance for audit and SAR requests. Lower carbon footprint. And the data is yours, in your account, in an open columnar format — not locked inside a vendor's proprietary schema.

    The Sage People cloud archive isn't a feature bolted onto a migration tool. It's purpose-built for the Salesforce-platform shape of Sage People: extractors that respect Bulk API limits, schema converters that preserve custom-field extensions, deletion engines that honour UK GDPR right-to-erasure at the row level, and lifecycle policies tuned to UK HMRC, TPR, FCA SMCR, and ICO retention rules.

    What the Sage People cloud archive delivers

    1
    Open Parquet on your storage
    Data lives in your AWS / Azure / Google Cloud tenant in standard Apache Parquet: no vendor lock, no proprietary blob format, portable to any future tool.
    2
    SQL queries via Athena/Synapse/BQ
    Standard SQL over the entire Sage People history — Worker__c, Employment_Record__c, Salary__c, Leave_Request__c, custom fields — through your existing analytics stack.
    3
    Hot-to-frozen automatic tiering
    Lifecycle rules move partitions from Standard to IA to Glacier as they age — 80%+ storage cost reduction across the full retention horizon.
    4
    UK GDPR / ICO erasure at row level
    Right-to-erasure honoured by rewriting affected Parquet partitions; signed deletion certificates emitted for ICO and data-subject evidence.

    The six architectural choices that make the Sage People cloud archive different

    Every decision optimised for the specific shape of Sage People history and the regulatory framework UK and EMEA customers operate under.

    ☁️

    Your cloud, your data

    The Sage People cloud archive deploys into your AWS / Azure / Google Cloud account. Syntra never custodies your data. UK region pinning available (London, UK South) for post-Brexit residency.

    📦

    Open Parquet, not proprietary

    Apache Parquet with embedded schema and JSON Schema sidecars. Readable by any tool, portable to any future system. No vendor lock-in baked into the file format.

    🔍

    Athena / Synapse / BigQuery SQL

    Standard SQL queries against archived Sage People data through your existing analytics stack. No new dashboard tool to learn, no new BI licence to buy.

    🌡️

    Hot → warm → cold → frozen tiering

    Lifecycle policies move partitions automatically as they age. Hot data (the last 13 months) sits on Standard storage; frozen data (year 8+) sits on Deep Archive at sub-cents per GB-month.

    🗑️

    Row-level UK GDPR erasure

    Right-to-erasure under UK GDPR Art. 17 honoured by rewriting affected Parquet partitions. Signed deletion certificates emitted as ICO and data-subject evidence.

    🇬🇧

    UK regulatory lifecycle defaults

    Pre-configured retention rules for HMRC (6+ years post-leaver), TPR pension scheme retention, FCA SMCR (5 years), and ICO best-practice defaults.
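    As a rough illustration of how such retention defaults might be expressed, the sketch below computes the earliest date a leaver's records could be erased when several regimes apply. The rule table and the function names are hypothetical; the year counts are the minimums quoted above, and TPR is left out because no figure is given for it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionRule:
    regime: str
    years_after_leaving: int


# Hypothetical rule table. Year counts are the minimums quoted in the text;
# TPR is omitted because the text gives no figure for it.
DEFAULT_RULES = {
    "HMRC": RetentionRule("HMRC", 6),
    "FCA_SMCR": RetentionRule("FCA SMCR", 5),
}


def add_years(d: date, years: int) -> date:
    """Add whole years, mapping 29 Feb to 28 Feb in non-leap years."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def earliest_erasure_date(leaver_date: date, regimes: list[str]) -> date:
    """The longest applicable retention period wins."""
    longest = max(DEFAULT_RULES[r].years_after_leaving for r in regimes)
    return add_years(leaver_date, longest)


print(earliest_erasure_date(date(2020, 3, 31), ["HMRC", "FCA_SMCR"]))
```

    A real rule set would also need per-object overrides and legal-hold exceptions, which this sketch ignores.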

    Stand up the Sage People cloud archive in 6–10 weeks

    A repeatable deployment path tuned for Sage People's Salesforce-platform shape and UK / EMEA regulatory context. No surprises, no scope creep.

    1

    Cloud account provisioning — Week 1

    Your AWS / Azure / Google Cloud account is configured: storage buckets, IAM roles, query catalog (Glue / Purview / Data Catalog), KMS keys. Region pinning set per data-sovereignty requirement. Networking allow-list for the Syntra ETL writer role.

    2

    Sage People extract scope sign-off — Week 1–2

    Agree which custom objects are in scope (standard set plus your org's extensions), which custom fields are mandatory vs droppable, and the retention horizon for each object. Output: scope document signed by HR, legal, and audit.

    3

    Pilot extract & query validation — Week 2–4

    Pilot extract of one business unit's full Sage People history (Worker__c, Employment_Record__c, Salary__c, Leave_Request__c). Sample SAR and HMRC queries executed against the Sage People cloud archive to validate performance and accuracy before scale-up.

    4

    Full historical extract — Week 4–7

    Salesforce Bulk API 2.0 extraction of the complete Sage People history for every BU in scope. Output Parquet partitions written to hot tier. Row-level reconciliation against the live org. Reconciliation pack signed by HR and audit.

    5

    Incremental delta replay — Week 6–8

    SystemModstamp-based incremental capture stood up. Nightly delta runs replay live-org changes into the Sage People cloud archive. Replay log retained for SOX-style change evidence.

    6

    Cutover + lifecycle activation — Week 8–10

    Cut. The Sage People cloud archive is declared the durable record of historical HR data. Source org either decommissioned or reduced to minimum-licence read-only runout. Lifecycle tiering rules activated; hot/warm/cold/frozen transitions begin.
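    The row-level reconciliation in step 4 could look something like the sketch below. It is an assumption-laden illustration, not the Syntra ETL implementation: rows are plain dictionaries keyed by a Salesforce-style Id, and the function names are invented for the example.

```python
import hashlib
import json


def row_digest(row: dict) -> str:
    """Hash a row in a canonical form so field order does not matter."""
    canonical = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reconcile(source_rows, archive_rows, key="Id"):
    """Compare live-org rows with archived rows by key and content hash."""
    src = {r[key]: row_digest(r) for r in source_rows}
    arc = {r[key]: row_digest(r) for r in archive_rows}
    shared = src.keys() & arc.keys()
    return {
        "source_count": len(src),
        "archive_count": len(arc),
        "missing_in_archive": sorted(src.keys() - arc.keys()),
        "unexpected_in_archive": sorted(arc.keys() - src.keys()),
        "mismatched": sorted(k for k in shared if src[k] != arc[k]),
    }


# Made-up sample rows, purely to show the shape of the report.
live = [{"Id": "a1", "Name": "X"}, {"Id": "a2", "Name": "Y"}]
archived = [{"Id": "a1", "Name": "X"}, {"Id": "a2", "Name": "Z"}]
print(reconcile(live, archived))
```

    A report with empty "missing", "unexpected", and "mismatched" lists is the kind of evidence a reconciliation pack would summarise per object.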

    What the Sage People cloud archive does that read-only Salesforce can't

    The structural advantages of moving Sage People history into queryable Parquet on cheap object storage — versus leaving it inside the original Salesforce org.

    💰

    5–10× lower TCO

    The Sage People cloud archive at £10–20k/yr replaces £60–150k/yr in Salesforce read-only licences for a 2,000-employee customer, and the savings compound across the full 7–10 year retention horizon.

    Faster audit queries

    Columnar Parquet + Athena beats SOQL on a sleepy read-only org for any analytical workload — 5–10 year headcount trends, cross-BU salary analysis, audit-period aggregations.

    🔓

    No vendor lock-in

    Data lives in open Parquet on your storage. Switch query engines (Athena → Synapse, Synapse → BigQuery) without touching the underlying bytes. No proprietary export step needed ever.

    📈

    Scales without licence drama

    Adding a new HR analyst querying the Sage People cloud archive adds no licence cost: they just get an Athena role. Adding the same analyst to a read-only Salesforce org costs another £100–250/month.

    🛡️

    Better data-sovereignty story

    UK region pinning is straightforward. Bring-your-own-key (BYOK) KMS encryption is straightforward. Both are vastly easier than reasoning about Salesforce's shared-tenancy controls.

    🌱

    Lower carbon footprint

    Cold and frozen tier object storage uses a fraction of the compute Salesforce sandboxes burn keeping a barely-used org warm. ESG reporting picks up a clean reduction.

    Frequently asked questions

    What is the Sage People cloud archive and how does it differ from data archival?

    The Sage People cloud archive is a specific deployment shape of the Syntra ETL archive: a queryable, cost-tiered object-storage estate that holds extracted Sage People data as columnar Parquet files in your own AWS S3, Azure Blob, or Google Cloud Storage account. Data archival is the broader business activity: what you retain, for how long, under which regulation. The Sage People cloud archive is the technical product that delivers that activity at cents-per-GB economics, while keeping queries fast enough to satisfy an HMRC inquiry, a data-subject access request, or an internal audit lookup against Worker__c and Salary__c history without ever waking the original Salesforce org. Most customers run both: the policies and retention schedules sit in the archival programme, and the cloud archive is where the bytes physically live.

    Where exactly does the Sage People cloud archive store data?

    Inside your own cloud tenancy — not Syntra's. The Sage People cloud archive deploys into your AWS account (S3 + Athena + Glue Data Catalog), your Azure subscription (Blob Storage + Synapse Serverless + Purview), or your Google Cloud project (Cloud Storage + BigQuery external tables). Syntra ETL writes Parquet partitions, registers them in your catalog, and grants your IAM roles read access. We never custody the data, never hold a copy on Syntra-owned infrastructure, and never charge you a per-GB storage fee — your cloud provider invoices you directly. For UK-resident Sage People customers worried about data sovereignty post-Brexit, the cloud archive can be pinned to UK regions (London for AWS, UK South for Azure) with explicit residency controls.
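    To make the idea of writing Parquet partitions and registering them in a catalog more concrete, here is a minimal sketch of a Hive-style key=value partition prefix, a layout that Athena and Glue recognise. The prefix, the partition keys (object, bu, year, month), and the function name are assumptions for illustration; the actual layout used by Syntra ETL is not described here.

```python
from datetime import date


def partition_prefix(root: str, sobject: str, business_unit: str, period: date) -> str:
    """Build a Hive-style partition prefix (hypothetical key names)."""
    return (
        f"{root}/object={sobject}/bu={business_unit}"
        f"/year={period.year:04d}/month={period.month:02d}/"
    )


print(partition_prefix("sage_people", "Worker__c", "manchester", date(2021, 3, 15)))
```

    Partitioning by object, business unit, and period would let lifecycle rules and queries target a narrow slice of the history, which is why a layout along these lines is a common choice.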

    How is the Sage People cloud archive tiered for cost?

    Tiering follows access frequency. Hot tier (last 13 months of Worker__c, Salary__c, Leave_Request__c activity) lives on S3 Standard / Azure Hot Blob / GCS Standard — queries return in seconds. Warm tier (months 14–36) sits on S3 Standard-IA / Azure Cool / GCS Nearline — queries return in tens of seconds, storage cost drops ~40%. Cold tier (years 4–7) moves to S3 Glacier Instant Retrieval / Azure Cool-to-Archive lifecycle / GCS Coldline — queries return in minutes, storage cost drops 80%+. Frozen tier (year 8+, regulatory minimum retention) moves to S3 Glacier Deep Archive / Azure Archive / GCS Archive — sub-cents per GB-month with a 12-hour retrieval window suitable for once-a-decade HMRC investigations. The Sage People cloud archive moves partitions automatically based on lifecycle rules you set per object and per business unit.
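    The tier boundaries above can be sketched in a few lines. The first function maps a partition's age in months to a tier name; the second builds an S3 lifecycle rule as a plain dictionary in the shape boto3's put_bucket_lifecycle_configuration accepts. The day counts are approximate conversions of 13, 36, and 84 months, the function names are hypothetical, and note one simplifying assumption: S3 counts days from object creation, so this only matches the record-age tiers if partitions are written close to the period they cover.

```python
def tier_for_age(months_old: int) -> str:
    """Map partition age to the tier names used in the text."""
    if months_old <= 13:
        return "hot"
    if months_old <= 36:
        return "warm"
    if months_old <= 84:  # years 4-7
        return "cold"
    return "frozen"       # year 8 onwards


def s3_lifecycle_rule(prefix: str) -> dict:
    """Build one lifecycle rule for a partition prefix (a sketch, not a tuned policy)."""
    return {
        "ID": "tiering-" + prefix.strip("/").replace("/", "-"),
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": 396, "StorageClass": "STANDARD_IA"},    # after ~13 months
            {"Days": 1096, "StorageClass": "GLACIER_IR"},    # after ~36 months
            {"Days": 2557, "StorageClass": "DEEP_ARCHIVE"},  # after ~7 years
        ],
    }


rule = s3_lifecycle_rule("sage_people/object=Salary__c/")
print(tier_for_age(40), rule["ID"])
```

    With boto3 installed and credentials configured, the rule could be applied with s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration={"Rules": [rule]}); Azure and Google Cloud have their own lifecycle policy formats, which this sketch does not cover.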

    What kinds of queries does the Sage People cloud archive support?

    Anything you'd ask of the live Sage People org, against historical data, expressed in standard SQL. AWS Athena, Azure Synapse Serverless, and BigQuery all read the Parquet directly using their respective SQL dialects. Typical queries: 'show all salary changes for employee EMP-12345 in 2021' (subject access request); 'list every UK worker who was on SMP in tax year 2022–23' (HMRC inquiry); 'export the full Leave_Request__c history for the Manchester office, FY2019–FY2023' (litigation hold); 'rebuild headcount as at 31 March 2022 for the audit working paper' (statutory audit). Because Parquet is columnar, a query touching three columns reads only those columns (for example, roughly 3% of the bytes on an object with 100 similarly sized columns), so costs and latencies are dramatically lower than a Salesforce SOQL query against the same logical data.
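    Two of the queries above, the salary-change SAR and the headcount as at 31 March 2022, can be sketched as follows. To keep the example runnable anywhere, an in-memory SQLite database stands in for Athena, Synapse, or BigQuery; the table names, column names, and sample rows are all made up for illustration and do not reflect the real archive schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, flattened stand-ins for Salary__c and Employment_Record__c.
conn.executescript("""
CREATE TABLE salary (worker_id TEXT, effective_date TEXT, annual_amount REAL);
INSERT INTO salary VALUES
  ('EMP-12345', '2020-04-01', 41000),
  ('EMP-12345', '2021-04-01', 43000),
  ('EMP-12345', '2021-10-01', 45000),
  ('EMP-99999', '2021-06-01', 38000);
CREATE TABLE employment (worker_id TEXT, start_date TEXT, end_date TEXT);
INSERT INTO employment VALUES
  ('EMP-12345', '2018-01-08', NULL),
  ('EMP-99999', '2021-06-01', '2022-02-28'),
  ('EMP-55555', '2022-04-04', NULL);
""")

# Subject access request: all salary changes for one worker in 2021.
SAR_SALARY_CHANGES = """
SELECT effective_date, annual_amount
FROM salary
WHERE worker_id = ?
  AND effective_date >= '2021-01-01' AND effective_date < '2022-01-01'
ORDER BY effective_date
"""

# Point-in-time headcount: employed on the given date.
HEADCOUNT_AS_AT = """
SELECT COUNT(*)
FROM employment
WHERE start_date <= ?
  AND (end_date IS NULL OR end_date >= ?)
"""

sar_rows = conn.execute(SAR_SALARY_CHANGES, ("EMP-12345",)).fetchall()
headcount = conn.execute(HEADCOUNT_AS_AT, ("2022-03-31", "2022-03-31")).fetchone()[0]
print(sar_rows, headcount)
```

    Both queries select only the columns they need, which is the access pattern that lets a columnar engine skip most of the bytes in a wide object.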

    How does the Sage People cloud archive handle UK GDPR and DPA 2018 deletion rights?

    Right-to-erasure requests under UK GDPR Art. 17 and DPA 2018 are honoured at the row level. The Syntra ETL retention engine flags a worker record across every partition that contains it (Worker__c, Employment_Record__c, Salary__c, Leave_Request__c, Performance_Review__c, plus any custom-object extensions). On scheduled retention runs, the engine rewrites the affected Parquet partitions excluding the flagged rows, replaces them atomically, and updates the catalog. The deletion is permanent and verifiable — there is no soft-delete tombstone left behind. A signed deletion certificate (subject identifier, partitions affected, byte counts before/after, SHA-256 hashes) is emitted as evidence for the ICO and for the data subject. The Sage People cloud archive maintains a separate, ICO-compliant exception register for records subject to legal hold, where erasure is suspended pending resolution.
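    The rewrite-and-certify flow described above can be sketched with the standard library alone. Several things here are stand-ins and should be read as assumptions: a JSON Lines file plays the role of a Parquet partition, an HMAC plays the role of the signature, a set plays the role of the legal-hold register, and the worker_id field and the function names are hypothetical.

```python
import hashlib
import hmac
import json
import os
import tempfile


def _sha256(path: str) -> str:
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def erase_subject(partition_path, subject_id, signing_key, legal_hold=frozenset()):
    """Rewrite one partition without the subject's rows and return a certificate."""
    if subject_id in legal_hold:
        raise PermissionError(f"{subject_id} is under legal hold; erasure suspended")

    before_hash = _sha256(partition_path)
    before_bytes = os.path.getsize(partition_path)

    with open(partition_path, encoding="utf-8") as fh:
        rows = [json.loads(line) for line in fh if line.strip()]
    kept = [r for r in rows if r.get("worker_id") != subject_id]

    # Write the replacement next to the original, then swap it in atomically.
    directory = os.path.dirname(os.path.abspath(partition_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        for r in kept:
            fh.write(json.dumps(r, sort_keys=True) + "\n")
    os.replace(tmp_path, partition_path)

    cert = {
        "subject_id": subject_id,
        "partition": os.path.basename(partition_path),
        "rows_removed": len(rows) - len(kept),
        "bytes_before": before_bytes,
        "bytes_after": os.path.getsize(partition_path),
        "sha256_before": before_hash,
        "sha256_after": _sha256(partition_path),
    }
    payload = json.dumps(cert, sort_keys=True).encode("utf-8")
    cert["signature"] = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return cert
```

    On object storage the atomic swap would be an overwrite of the object key followed by a catalog update rather than os.replace, and bucket versioning would need handling so that older versions do not retain the erased rows.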

    How does the Sage People cloud archive compare to keeping the Salesforce org alive read-only?

    Keeping a retired Sage People org alive in read-only mode still requires Salesforce Platform licences (typically £100–£250 per user per month for the Integration / Read-only SKUs, multiplied by the user count that needs HR-history access), plus annual sandbox refresh costs, plus any third-party AppExchange app subscriptions that were used inside Sage People. For a 2,000-employee customer with ~50 HR/payroll/audit users who need historical access, the read-only org route runs £60–150k/year indefinitely. The Sage People cloud archive replaces that with a £2–6k/year object-storage bill plus £8–15k/year for query catalog and Syntra ETL platform, a 5–10× reduction sustained across the entire retention horizon. Archive savings alone typically pay back the migration project cost in 18–30 months.
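    The arithmetic behind those ranges can be checked with a short worked example. It uses only the figures quoted above (50 users, £100–£250 per user per month, and the two archive cost components) and deliberately leaves out sandbox refresh and AppExchange costs, for which no numbers are given. The variable names are illustrative.

```python
def annual_licence_cost(users: int, per_user_month: float) -> float:
    """Licence cost per year for a fixed number of named users."""
    return users * per_user_month * 12


readonly_low = annual_licence_cost(50, 100)    # 60,000
readonly_high = annual_licence_cost(50, 250)   # 150,000

archive_low = 2_000 + 8_000     # storage + catalog/platform, low end
archive_high = 6_000 + 15_000   # storage + catalog/platform, high end

ratio_low = readonly_low / archive_low      # low end against low end
ratio_high = readonly_high / archive_high   # high end against high end

print(readonly_low, readonly_high, archive_low, archive_high)
print(round(ratio_low, 1), round(ratio_high, 1))
```

    Comparing like with like (low end against low end, high end against high end) gives ratios of about 6× and 7×, inside the quoted 5–10× band; the component ranges sum to £10–21k, slightly above the rounded £10–20k headline.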

    How does cutover from live Sage People to the cloud archive work?

    Cutover is staged. Stage 1: full historical extract of every Sage People custom object (Worker__c, Employment_Record__c, Salary__c, Leave_Request__c, Position__c, plus your custom-object extensions) into the cloud archive, with row-level reconciliation against the live org. Stage 2: incremental delta capture via SystemModstamp watermarks while the org is still live, replayed nightly into the archive so it stays current. Stage 3: cut. The live Sage People org is either decommissioned outright (if you've migrated to a new HCM) or reduced to read-only with the licence count dropped to the minimum required for the runout period. Stage 4: ongoing operation — the Sage People cloud archive is the durable record; the live org (if any) is treated as transient. Most customers complete stages 1–3 in 6–10 weeks for a 2,000-employee org.
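    Stage 2's watermark-based delta capture can be sketched as below. This is a simplified illustration under stated assumptions: the helper names are invented, the field list is arbitrary, and rows are assumed to arrive with SystemModstamp already normalised to an ISO-8601 string ending in Z. The query string follows SOQL's unquoted datetime literal format; submitting it through the Bulk API is left out.

```python
from datetime import datetime, timezone


def soql_datetime(ts: datetime) -> str:
    """Format a timezone-aware datetime as an unquoted SOQL datetime literal."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def delta_query(sobject: str, fields: list[str], watermark: datetime) -> str:
    """Select rows modified after the last successfully replayed watermark."""
    return (
        f"SELECT {', '.join(fields)} FROM {sobject} "
        f"WHERE SystemModstamp > {soql_datetime(watermark)}"
    )


def next_watermark(rows: list[dict], current: datetime) -> datetime:
    """Advance to the newest SystemModstamp seen; keep the old one if no rows."""
    stamps = [
        datetime.fromisoformat(r["SystemModstamp"].replace("Z", "+00:00"))
        for r in rows
    ]
    return max(stamps, default=current)


wm = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(delta_query("Worker__c", ["Id", "SystemModstamp"], wm))
```

    A strict greater-than comparison can miss rows that share the watermark's exact timestamp, so a production loader might use greater-than-or-equal and de-duplicate on Id instead.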

    Can the Sage People cloud archive feed downstream systems and reports?

    Yes. The Sage People cloud archive is a first-class data source for any BI tool that reads Parquet, Athena, Synapse, or BigQuery — Tableau, Power BI, Looker, Qlik, ThoughtSpot, plus any custom Python/R notebook. Common downstream patterns: workforce analytics dashboards that need 5+ years of headcount and attrition trend (impossible to keep performant inside Sage People itself); finance reconciliation reports that join archived salary history to the GL; HRIS audit reports for SOC 2 / ISAE 3402 evidence collection; ML feature stores that need historical worker attributes for talent-prediction models. The Sage People cloud archive is also the standard source for downstream Oracle Fusion HCM if a future migration happens — eliminating a second extraction from a long-since-decommissioned Salesforce org.

    Ready to scope your Sage People cloud archive?

    Book a 30-minute architecture call. We'll walk through your Sage People org size, custom-object footprint, UK regulatory profile, and target cloud provider — and give you a sized cloud-archive deployment plan with concrete cost numbers.