OAuth 2.0 connected, Bulk API 2.0 powered, metadata-aware extraction for every Sage People custom object. Schedule-driven, API-limit-safe, Parquet/CSV/JSON output, SOX-grade audit logging.
Sage People is a Salesforce org. Extracting from it is a Salesforce platform exercise — and naive scripts hit API limits, miss custom fields, and leak credentials.
Most teams attempting Sage People data extraction reach first for the Salesforce Data Loader, a SOQL export script, or a hand-rolled Bulk API integration. All three approaches work for a one-off pull but fail as soon as scope grows beyond a handful of objects: API limits get exhausted, custom fields added by your org's admins get missed, sensitive field handling becomes manual, and audit evidence is ad-hoc at best.
The Syntra ETL Sage People data extraction tool is purpose-built for the Sage People object model. It knows the relationships between Worker__c, Employment_Record__c, Position__c, Salary__c, and Leave_Request__c. It reads your Salesforce org's Metadata API on every run, so customer-added custom fields appear in extracts automatically. It uses Bulk API 2.0 for large objects to stay under daily API limits. And it produces signed, manifest-tracked output suitable for SOX, IRS, and HMRC audit evidence.
Use it as a one-off for migration prep, on a recurring schedule for ongoing data warehouse refresh, or as the extraction layer behind an archival pipeline. Same tool, same output format, same audit trail — different downstream consumers.
Six engineering decisions that separate purpose-built from improvised.
OAuth 2.0 connected app or named credential authentication. Bulk API 2.0 for large objects, REST for reference data, Metadata API for schema. Standard Salesforce patterns your admin already knows.
Reads org Metadata API on every run. Custom fields added by your admins to Worker__c, Salary__c, etc. appear in extracts without config changes. Schema drift logged for awareness.
Cron-driven scheduler with full and SystemModstamp-driven incremental modes. Named windows (nightly, weekly, monthly), concurrency throttling, retry-with-backoff on transient failures.
Salary, bank account, National Insurance Number, date of birth flagged at extraction time. Configurable masking, hashing, or pass-through per field per consumer.
Parquet (default, columnar, compressed), CSV (for downstream tools), JSON (for REST consumers). Multiple formats from one extract. Schema sidecars for tooling integration.
Every run produces a signed manifest: rows extracted per object, hash totals, API calls used, runtime, schema fingerprint. SOX-grade evidence ready for auditors.
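A run manifest of this kind could be assembled and signed roughly as follows. This is a minimal sketch under stated assumptions, not the tool's actual implementation: the manifest field names, the helper functions, and the use of HMAC-SHA256 as the signature scheme are all hypothetical choices made for illustration.

```python
import hashlib
import hmac
import json


def build_manifest(run_id, row_counts, hash_totals, api_calls, runtime_s, schema_fields):
    """Assemble a manifest dict. All key names here are illustrative."""
    # Schema fingerprint: SHA-256 over the sorted field names, so the same
    # set of fields always yields the same fingerprint regardless of order.
    fingerprint = hashlib.sha256("\n".join(sorted(schema_fields)).encode("utf-8")).hexdigest()
    return {
        "run_id": run_id,
        "row_counts": row_counts,
        "hash_totals": hash_totals,
        "api_calls_used": api_calls,
        "runtime_seconds": runtime_s,
        "schema_fingerprint": fingerprint,
    }


def canonical_bytes(manifest):
    """Serialise deterministically so signer and verifier see identical bytes."""
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_manifest(manifest, key):
    """Return a hex HMAC-SHA256 signature over the canonical manifest."""
    return hmac.new(key, canonical_bytes(manifest), hashlib.sha256).hexdigest()


def verify_manifest(manifest, signature, key):
    """Constant-time comparison of the expected and supplied signatures."""
    return hmac.compare_digest(sign_manifest(manifest, key), signature)
```

An auditor holding the key can call `verify_manifest` on the stored manifest; any change to a row count or hash total after signing makes verification fail.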
From kick-off to first audited extract in production. Typical engagement: 5–10 business days.
You create a Salesforce connected app (OAuth) or Integration User (named credential) and grant the Syntra ETL tenant access with the explicit permission set we provide. We confirm connectivity, read the org metadata, and surface a discovery report of all Sage People objects and custom fields detected.
Choose which objects to extract (typically all Sage People custom objects + selected standard objects). Set sensitive-field handling rules. Choose output destination (cloud storage bucket, SFTP, etc.) and format (Parquet/CSV/JSON). Define retention policy for extracts.
A first full extract runs against the live org during a low-usage window. Output validated: row counts match Sage People-side queries, hash totals stable, manifest complete. API call consumption measured against your daily limit.
Recurring schedule configured (nightly/weekly/monthly). Monitoring webhooks set up (Slack, Teams, PagerDuty) for run-failures or schema drift. Audit log destination configured (SIEM, S3, etc.).
First production extract reconciled, manifest signed, audit pack delivered. Runbook handed to your data ops team. Tool is now running unattended on schedule.
Six artefacts produced per extract run, all hash-signed and timestamped.
Per-object output files partitioned by extract date and natural keys (BU, pay group). Parquet default; CSV/JSON on request. Schema embedded; column types preserved from Salesforce.
Signed JSON manifest: extract start/end timestamps, per-object row counts, hash totals, API calls used, runtime, schema fingerprint, sensitive-field handling applied. SOX evidence-ready.
Every row content-hashed (stable hash excluding system audit fields). Downstream consumers can verify load integrity by re-hashing post-load and comparing.
JSON Schema, Avro, BigQuery DDL, Snowflake DDL — generated alongside data so downstream tools have programmatic access to field types and constraints.
Apex class/trigger source, Flow definitions, Process Builder rules, validation rule logic — extracted via Metadata API and stored alongside data. Essential for migration and post-decommission evidence.
Every API call logged: user, timestamp, query, rows returned, sensitive fields accessed. Streamed to SIEM (Splunk, Datadog, etc.) or persisted to immutable log storage.
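The row-level content hashing described among the artefacts above can be illustrated with a short sketch. It assumes that "system audit fields" means the standard Salesforce audit columns listed in `SYSTEM_AUDIT_FIELDS`, and that rows arrive as plain dictionaries; both are assumptions for illustration, and the function names are hypothetical.

```python
import hashlib
import json

# Assumed exclusion list: the standard Salesforce audit columns.
SYSTEM_AUDIT_FIELDS = frozenset({
    "CreatedDate",
    "CreatedById",
    "LastModifiedDate",
    "LastModifiedById",
    "SystemModstamp",
})


def row_hash(row):
    """SHA-256 of a row's business fields, serialised with sorted keys."""
    business = {k: v for k, v in row.items() if k not in SYSTEM_AUDIT_FIELDS}
    canonical = json.dumps(business, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def hash_total(rows):
    """Order-independent total: hash the sorted list of per-row hashes."""
    digest = hashlib.sha256()
    for h in sorted(row_hash(r) for r in rows):
        digest.update(h.encode("ascii"))
    return digest.hexdigest()
```

A downstream consumer would recompute `hash_total` after loading and compare it with the manifest value. This only works if the consumer serialises values the same way the extractor did, so type conversions during load (for example decimal to float) need to be accounted for.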
It's a purpose-built extractor for the Sage People HCM platform (built on Salesforce). It runs as a managed service or as an installable agent inside your network and authenticates to the Sage People Salesforce org via OAuth 2.0 or a named-credential connection. It reads metadata to discover all standard and customer-added custom fields on Worker__c, Employment_Record__c, Salary__c, Leave_Request__c, Position__c and every other Sage People custom object, then extracts data via Salesforce Bulk API 2.0 (for large objects) and the REST API (for reference data). Output lands in cloud object storage as Parquet, CSV, or JSON, partitioned and hash-signed, ready for downstream migration, archival, or analytics consumption.
Two connection patterns are supported. OAuth 2.0 connected app: you create a Salesforce connected app with the right scopes (api, refresh_token, full where needed) and grant Syntra ETL's tenant an access token; usage is auditable in the Salesforce Connected Apps OAuth Usage page. Named credential with username-password flow: for orgs that prefer not to use OAuth, a dedicated Integration User account with explicit permission sets covering the relevant Sage People objects works equally well. Either way, the extractor logs every API call against the connection, and your Salesforce admin can revoke access at any time. No shared credentials, no service-account passwords floating in scripts.
Salesforce orgs have strict per-24-hour API request limits. The exact allowance depends on edition, licence count, and any API call add-ons; this page uses 15,000 per day as the reference figure. The Syntra ETL extractor uses three strategies to stay safe: (1) Bulk API 2.0 for large objects: a bulk query job consumes a small number of API calls (job creation, status polls, and paged result downloads) rather than one call per batch of records, so a 500,000-row Worker__c extract costs a small number of calls instead of thousands; (2) batching of REST calls into 200-record retrieves where Bulk isn't appropriate; (3) a configurable concurrency throttle and timed execution windows so extracts run during your lowest-usage period. A complete extraction for a 5,000-employee customer typically uses under 500 calls against that 15,000 daily reference limit.
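The arithmetic behind the 200-record batching and the daily budget check can be sketched as below. The function names and the idea of holding back a reserve fraction for other integrations are assumptions for illustration; the 200-record batch size and the 15,000 figure come from the text above.

```python
from math import ceil


def chunk_ids(ids, size=200):
    """Yield successive batches of at most `size` record IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def rest_calls_needed(record_count, batch_size=200):
    """Number of batched REST retrieves needed for `record_count` records."""
    return ceil(record_count / batch_size)


def within_budget(planned_calls, daily_limit, already_used, reserve_fraction=0.2):
    """True if the planned calls still leave `reserve_fraction` of the daily
    limit untouched for other integrations (hypothetical safety margin)."""
    reserve = int(daily_limit * reserve_fraction)
    return already_used + planned_calls <= daily_limit - reserve
```

For example, retrieving 5,000 reference records in 200-record batches needs 25 calls, and a 500-call run fits comfortably inside a 15,000-call day when only 1,000 calls have been used so far.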
Yes. The extraction tool ships with a built-in scheduler supporting cron expressions, named windows (nightly, weekly, monthly), and SystemModstamp-driven incremental extracts. Common patterns: daily incremental of Worker__c, Leave_Request__c, Salary__c for ongoing data warehouse refresh; weekly full extract of Position__c and reference data; monthly full snapshot of all objects for archive checkpointing. Each scheduled run produces a manifest (rows extracted, hash totals, API calls used, runtime) that's stored alongside the data — useful for SOX evidence and for spotting drift in extraction volume that might indicate Sage People configuration changes.
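A SystemModstamp-driven incremental extract boils down to a watermark query. The sketch below shows one way to build it; the function names are hypothetical, the strict `>` boundary is an assumption (a real extractor may prefer `>=` plus de-duplication on Id), and the timestamp format parsed in `next_watermark` is the form the Salesforce REST API typically returns.

```python
from datetime import datetime, timezone


def soql_datetime(ts):
    """Format an aware datetime as an unquoted SOQL datetime literal in UTC."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def incremental_query(sobject, fields, watermark=None):
    """Build a full-extract query, or an incremental one if a watermark is given."""
    select = f"SELECT {', '.join(fields)} FROM {sobject}"
    if watermark is None:
        return select  # full mode: no filter
    return f"{select} WHERE SystemModstamp > {soql_datetime(watermark)}"


def next_watermark(rows, previous):
    """Advance the watermark to the newest SystemModstamp seen in this run."""
    seen = [
        datetime.strptime(r["SystemModstamp"], "%Y-%m-%dT%H:%M:%S.%f%z")
        for r in rows
    ]
    return max(seen + [previous])
```

Persisting the value returned by `next_watermark` only after the run's manifest has been written means a failed run is simply retried from the old watermark.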
Defence in depth across five layers. Connection: OAuth 2.0 or named credential with rotation support; never embed passwords in config. Authorisation: the Integration User has only the explicit Object Permissions and Field-Level Security needed for the extract scope, typically Read on the Sage People custom objects and no write permission anywhere. Data at rest: all extracted output is encrypted with KMS-managed keys; sensitive fields (Salary__c.Annual_Salary__c, Worker__c.Bank_Account__c, National_Insurance_Number__c) are flagged and can be hashed or masked at extraction time. Data in transit: TLS 1.3 from Salesforce through to cloud storage. Audit: every API call logged with timestamp, user, query, row count.
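Per-field, per-consumer handling of sensitive values at extraction time might look like the following sketch. The consumer names, the policy table, and the choice of a keyed HMAC for hashing are assumptions for illustration; only the field names are taken from the text above.

```python
import hashlib
import hmac

# Hypothetical policy table: consumer -> field -> action ("pass", "mask", "hash").
POLICIES = {
    "warehouse": {
        "Annual_Salary__c": "pass",
        "Bank_Account__c": "mask",
        "National_Insurance_Number__c": "hash",
    },
    "analytics": {
        "Annual_Salary__c": "hash",
        "Bank_Account__c": "hash",
        "National_Insurance_Number__c": "hash",
    },
}


def mask_value(value, visible=4):
    """Replace all but the last `visible` characters with asterisks."""
    text = str(value)
    if len(text) <= visible:
        return "*" * len(text)
    return "*" * (len(text) - visible) + text[-visible:]


def hash_value(value, key):
    """Keyed hash so equal inputs still join, but values cannot be guessed offline."""
    return hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).hexdigest()


def apply_policy(row, policy, key):
    """Return a copy of `row` with each field passed through, masked, or hashed."""
    out = {}
    for field, value in row.items():
        action = policy.get(field, "pass")  # unlisted fields pass through
        if value is None or action == "pass":
            out[field] = value
        elif action == "mask":
            out[field] = mask_value(value)
        elif action == "hash":
            out[field] = hash_value(value, key)
        else:
            raise ValueError(f"unknown action {action!r} for field {field}")
    return out
```

Because the hash is keyed, two consumers given different keys cannot correlate hashed identifiers with each other, which is one reason to prefer it over a plain digest.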
Parquet is the default: columnar, compressed, embedded schema, partitioned by extract date and (where natural) by business unit or pay group. CSV is supported for tools that need it (typically Workday EIB, SuccessFactors HCI, and certain UK payroll providers). JSON is supported for REST API consumers. The tool can produce multiple formats from a single extract, for example writing Parquet to archive and CSV to a target payroll provider's SFTP drop in the same run. Schema metadata sidecars (JSON Schema, Avro, BigQuery DDL, Snowflake DDL) are generated alongside the data.
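As an illustration of a schema sidecar, the sketch below derives a JSON Schema document from Salesforce field descriptions. It assumes each field is a dictionary with `name`, `type`, and `nillable` keys, as in a Salesforce describe result; the type mapping and function names are hypothetical and cover only a few common field types.

```python
import json

# Assumed mapping from Salesforce field types to JSON Schema fragments.
SF_TO_JSON_SCHEMA = {
    "string": {"type": "string"},
    "id": {"type": "string"},
    "reference": {"type": "string"},
    "picklist": {"type": "string"},
    "boolean": {"type": "boolean"},
    "int": {"type": "integer"},
    "double": {"type": "number"},
    "currency": {"type": "number"},
    "date": {"type": "string", "format": "date"},
    "datetime": {"type": "string", "format": "date-time"},
}


def json_schema_sidecar(sobject, fields):
    """Build a JSON Schema dict for one object from its field descriptions."""
    properties = {}
    required = []
    for field in fields:
        # Unknown types fall back to string rather than failing the run.
        spec = dict(SF_TO_JSON_SCHEMA.get(field["type"], {"type": "string"}))
        if field.get("nillable", True):
            spec["type"] = [spec["type"], "null"]
        else:
            required.append(field["name"])
        properties[field["name"]] = spec
    return {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": sobject,
        "type": "object",
        "properties": properties,
        "required": required,
    }


def render_sidecar(schema):
    """Serialise the schema for writing next to the data files."""
    return json.dumps(schema, indent=2, sort_keys=True)
```

The same field descriptions could drive Avro or warehouse DDL output by swapping the mapping table.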
Yes. In addition to data, the extraction tool reads the Salesforce Metadata API to capture custom field definitions, validation rules, formula fields, record types, page layouts, Apex triggers (source + version), Process Builder flows, Salesforce Flows, Visualforce pages and Lightning Web Components scoped to the Sage People managed package. This metadata catalog is essential for a Fusion migration (you need to know what the source actually does before you can replicate it) and for SOX-grade post-decommission evidence. The metadata catalog is stored alongside the data with the same retention and access policies.
Sage People itself is delivered as a Salesforce managed package, which Sage Group upgrades roughly twice a year. New fields, new objects, modified validation rules — the extraction tool reads the Metadata API at each extract run, so new fields show up automatically in subsequent extracts without configuration changes. Schema drift (a field appearing or disappearing) is logged in the extract manifest, so your data team is alerted. Major version upgrades (new business object introductions) typically require a tool config update from us; we track Sage People release notes and ship matching extractor updates within 30 days of release.
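Schema drift detection between two runs can be sketched as a comparison of field-name-to-type maps. The function name and the shape of the snapshots are assumptions for illustration; a real run would build them from the metadata read at extract time.

```python
def detect_drift(previous, current):
    """Compare two {field_name: field_type} snapshots of one object.

    Returns the fields that appeared, disappeared, or changed type
    between the previous run and the current one.
    """
    prev_names = set(previous)
    curr_names = set(current)
    return {
        "added": sorted(curr_names - prev_names),
        "removed": sorted(prev_names - curr_names),
        "retyped": sorted(
            name for name in prev_names & curr_names
            if previous[name] != current[name]
        ),
    }


def has_drift(report):
    """True if any category in the drift report is non-empty."""
    return any(report.values())
```

The resulting report is small enough to embed directly in the extract manifest, and `has_drift` gives a simple condition for triggering a monitoring webhook.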
30-minute call. We'll walk through your Sage People org footprint, custom field profile, target output destinations, and audit requirements, then confirm a 5–10 business day on-ramp.