The SyntraETL historical reporting platform — Parquet on S3/Azure/GCS, Presto/Trino SQL, saved-search web UI, REST APIs, native OAC / Power BI / Tableau connectors. Sub-second response on multi-TB archives. Hash-signed immutable storage. Multi-tenant SaaS or single-tenant in customer cloud. Source-system security mirrored, every access logged, audit-defensible by default.
Most customers come to a historical reporting platform after the legacy ERP retirement decision is already made. The product question is whether the archive can actually do the job — performance, access controls, BI integration, audit-trail integrity, deployment flexibility.
The SyntraETL historical reporting platform — internally and in customer environments referred to as Archive Reports — is the product engine behind every archive reporting deliverable. Three architectural layers sit underneath: a Parquet-on-object-storage layer (S3 / Azure Blob / GCS with hash-signed immutable manifests, partition organisation by year / month / entity / domain, per-column compression and indexes), a query layer (Presto/Trino-backed SQL-92 with partition pruning, predicate push-down and bloom-filter point-lookups), and an access layer (saved-search web UI, OAuth-secured REST API, native connectors for Oracle Analytics Cloud, Microsoft Power BI and Tableau).
Performance on multi-TB archive reporting workloads is the design baseline, not the headline. The architecture is built so that the 95th-percentile historical reporting query returns in under a second on the kinds of TB-scale archives a typical 10-year EBS, PeopleSoft or SAP decommissioning produces. Columnar storage means a query that reads 3 of 200 columns scans roughly 1.5% of total bytes rather than full rows; partition pruning skips 99%+ of the archive before reading any bytes; bloom filters make point-lookups (specific document numbers, supplier codes, journal IDs) finish in single-digit milliseconds. Auditors notice immediately: the archive UI typically responds faster than the legacy ERP it replaced.
Deployment is flexible. Multi-tenant SaaS for fastest deployment and lowest cost per TB, with logical tenant isolation, separate KMS keys per tenant and segregated access controls. Single-tenant in customer cloud (AWS / Azure / GCP) for regulated industries that demand the historical reporting platform sit entirely within their own account, behind their identity provider, VPC controls and private endpoints. Many customers run hybrid: SaaS for general legacy data reporting, single-tenant for the most sensitive data domains. Either way, the same product engine, the same web UI, the same REST APIs and the same BI connectors.
Built once at the product level — deployed identically across every archived source system and every customer.
Per-column compression and indexes. Partition organisation by year / month / entity / domain. Hash-signed immutable manifests bind every Parquet file to the source extraction record. Tampering detectable at query time.
SQL-92 with extensions. Partition pruning at storage layer. Predicate push-down. Bloom-filter point-lookups in single-digit milliseconds. Sub-second P95 response on multi-TB archives.
Search by period, entity, document, counterparty. Drill from summary to detail. PDF / CSV / XLSX export. Scheduled reports and alerts. Evidence-pack export bundles multiple searches into one ZIP.
OAuth-secured endpoints: dataset metadata, structured query, document lookup, audit-trail query, legal-hold management, export job control. JSON / NDJSON output, per-tenant rate limiting.
Native subject-area connector for OAC. ODBC + Power Query for Power BI direct-query and import. Tableau native connector with live and extract modes. Semantic model push-down for filter pruning.
Multi-tenant SaaS with logical isolation per tenant and separate KMS keys. Or single-tenant in customer's own AWS / Azure / GCP account, behind customer IdP and VPC controls. Hybrid supported.
What actually happens between an auditor clicking 'search' and getting their result — six stages of the platform's runtime.
The user authenticates through the customer's identity provider (Okta, Azure AD/Entra ID, Ping, AWS IAM Identity Center, Google Workspace) with MFA enforced. Role-based access control resolves which archives, data domains and entity scopes the user can query. Source-system security mirrors apply automatically.
The query (from a web UI saved search, a REST API call or a BI connector) is parsed by the Presto/Trino query layer. Partition pruning identifies the minimum set of Parquet files needed. Predicate push-down passes filter clauses to the storage layer. Bloom filters resolve point-lookups before any data pages are read.
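The pruning step can be sketched in a few lines of Python. This is a minimal illustration, not the platform's planner: the `Partition` class, the `prune` helper, the Hive-style `year=/month=/entity=/domain=` path layout and the catalogue contents are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """One archive partition, keyed by year / month / entity / domain."""
    year: int
    month: int
    entity: str
    domain: str

    @property
    def path(self) -> str:
        # Hive-style layout is an assumption made for this sketch.
        return (f"year={self.year}/month={self.month:02d}/"
                f"entity={self.entity}/domain={self.domain}/")


def prune(partitions, *, year, months, entity, domain):
    """Keep only partitions whose keys satisfy every predicate."""
    return [
        p for p in partitions
        if p.year == year and p.month in months
        and p.entity == entity and p.domain == domain
    ]


# Hypothetical catalogue: 10 years x 12 months x 5 entities x 4 domains.
catalogue = [
    Partition(y, m, e, d)
    for y in range(2014, 2024)
    for m in range(1, 13)
    for e in ("DE-01", "DE-02", "FR-01", "UK-01", "US-01")
    for d in ("ap_invoices", "ar_invoices", "gl_journals", "payroll")
]

# Example predicate: AP invoices for Q3 2022 for legal entity DE-01.
selected = prune(catalogue, year=2022, months={7, 8, 9},
                 entity="DE-01", domain="ap_invoices")
skipped_pct = 100 * (1 - len(selected) / len(catalogue))
print(f"{len(selected)} of {len(catalogue)} partitions read")
```

In this toy catalogue only 3 of 2,400 partitions survive the predicates, so everything else is skipped before any file is opened.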
Parquet files pulled from hot / warm / cold tier (S3 Standard / IA / Glacier IR equivalent). Per-column scan reads only the columns referenced in the query. Hash signatures validated at read time — any integrity break aborts the query and triggers an alert.
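The read-time integrity check can be illustrated as follows. This is a sketch under stated assumptions: the manifest is modelled as a plain dictionary of path to SHA-256 hex digest, the signature over the manifest itself is not modelled, and `read_verified`, `IntegrityError` and the file path are hypothetical names.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when a file's content hash does not match its manifest entry."""


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def read_verified(path: str, data: bytes, manifest: dict) -> bytes:
    """Return data only if its SHA-256 matches the manifest entry for path."""
    expected = manifest.get(path)
    if expected is None:
        raise IntegrityError(f"{path}: not listed in manifest")
    if sha256_hex(data) != expected:
        raise IntegrityError(f"{path}: content hash mismatch")
    return data


PATH = "year=2022/month=07/part-0.parquet"      # hypothetical path
original = b"PAR1...column chunks...PAR1"       # stand-in bytes, not a real Parquet file
manifest = {PATH: sha256_hex(original)}         # recorded at write time

read_verified(PATH, original, manifest)         # untouched file passes

tampered = original.replace(b"column", b"COLUMN")
try:
    read_verified(PATH, tampered, manifest)
    aborted = False
except IntegrityError:
    # This is the point at which a query would abort and an alert would fire.
    aborted = True
```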
Aggregations computed in the query layer with push-down where supported by Parquet. Result set assembled and paginated. For BI connector queries, results are streamed back over ODBC / native protocol; for web UI queries, results render in the saved-search interface; for REST API queries, NDJSON streams back.
Access event written to the immutable audit log: user, timestamp, query SQL, partition set scanned, rows returned, BI tool or API caller, source IP, request ID. Log forwarded to customer SIEM (Splunk / Sentinel / Chronicle / QRadar) for cross-platform correlation if configured.
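A possible shape for such an access event is sketched below. The field names follow the list above, but the `AccessEvent` class and the hash-chaining scheme are assumptions: chaining each entry to the previous entry's hash is one common way to make a log tamper-evident, and the source does not say how immutability is implemented.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AccessEvent:
    user: str
    timestamp: str              # ISO 8601, UTC
    query_sql: str
    partitions_scanned: tuple
    rows_returned: int
    caller: str                 # BI tool or API caller
    source_ip: str
    request_id: str


GENESIS = "0" * 64              # hash used before the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append_event(log: list, event: AccessEvent) -> dict:
    """Append an event, chaining it to the previous entry's hash."""
    prev_hash = log[-1]["entry_hash"] if log else GENESIS
    record = asdict(event)
    entry = {"event": record, "prev_hash": prev_hash,
             "entry_hash": _entry_hash(prev_hash, record)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["entry_hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["entry_hash"]
    return True


log = []
append_event(log, AccessEvent(
    user="auditor@example.com", timestamp="2024-05-01T09:30:00Z",
    query_sql="SELECT * FROM ap_invoices WHERE doc_no = 'INV-1001'",
    partitions_scanned=("year=2022/month=07/entity=DE-01/domain=ap_invoices/",),
    rows_returned=1, caller="web-ui", source_ip="203.0.113.10",
    request_id="req-0001"))
append_event(log, AccessEvent(
    user="tax@example.com", timestamp="2024-05-01T09:31:12Z",
    query_sql="SELECT count(*) FROM gl_journals WHERE year = 2022",
    partitions_scanned=(), rows_returned=1, caller="rest-api",
    source_ip="203.0.113.11", request_id="req-0002"))
```

Each entry serialises to one JSON object, which suits line-oriented forwarding to a SIEM.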
Optional: user packages result as PDF / CSV / XLSX export for evidence submission, or as an evidence-pack ZIP bundling multiple related searches with chain-of-custody attribution. Export action is itself logged. Auditor or tax authority receives audit-defensible artifact.
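An evidence pack of this kind could be assembled roughly as follows. This is a sketch only: the `build_evidence_pack` helper, the `MANIFEST.json` file name, the manifest fields and the sample CSV contents are assumptions, and the platform's actual pack format is not described in the source.

```python
import hashlib
import io
import json
import zipfile


def build_evidence_pack(searches: dict, prepared_by: str) -> bytes:
    """Bundle named search exports (bytes) into one ZIP with a hash manifest."""
    manifest = {"prepared_by": prepared_by, "files": {}}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as pack:
        for name, content in sorted(searches.items()):
            pack.writestr(name, content)
            # Per-file hash lets the recipient verify each export independently.
            manifest["files"][name] = hashlib.sha256(content).hexdigest()
        pack.writestr("MANIFEST.json",
                      json.dumps(manifest, indent=2, sort_keys=True))
    return buffer.getvalue()


pack_bytes = build_evidence_pack(
    {
        "ap_invoices_q3_2022.csv": b"doc_no,amount\nINV-1001,1200.00\n",
        "gl_journals_q3_2022.csv": b"journal_id,account,amount\nJ-77,4000,1200.00\n",
    },
    prepared_by="auditor@example.com",
)
```

The recipient can recompute each file's SHA-256 and compare it with the manifest entry to confirm nothing changed after export.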
The product integrates upward into BI, sideways into compliance tooling, and downward into source-system security models.
Historical reporting appears as a federated subject area inside Oracle Analytics Cloud — reports blend live Fusion balances with archived historical context in one canvas. Single Sign-On from OAC into the archive.
Power BI gateway connects via ODBC driver for direct-query mode. Power Query connectors for import mode. Semantic model push-down means Power BI filters prune partitions at the archive layer.
Tableau live-connection or extract-based mode. Extract refresh schedules align with archive partition updates. Tableau workbooks consume archive data identically to live data sources.
Splunk, Sentinel, Chronicle, QRadar forwarding. ServiceNow GRC and RSA Archer for retention policy management. Relativity, Logikcull, Onna for litigation scope synchronisation.
Okta, Azure AD/Entra ID, Ping, AWS IAM Identity Center, Google Workspace. SSO with MFA enforcement. SCIM-based lifecycle automation. Group-based role assignment.
AWS KMS, Azure Key Vault, Google Cloud KMS. Customer-managed keys for at-rest encryption. Per-tenant key separation in SaaS. Customer-controlled key rotation for single-tenant deployments.
The SyntraETL historical reporting platform — also known as Archive Reports — is the product-layer engine that delivers audit-grade query access to multi-TB archives of retired ERP, HCM and operational data. Architecturally it combines columnar Parquet storage on S3 / Azure Blob / GCS, a Presto/Trino-backed SQL query layer, a saved-search web UI, REST API endpoints, and pre-built BI connectors for Oracle Analytics Cloud, Microsoft Power BI and Tableau. Hash-signed immutable storage guarantees archive integrity from the point of extraction through any query result. The historical reporting platform is the product CFOs, Controllers, Audit Leads and Compliance Officers actually log into for archive reporting after a legacy ERP is decommissioned — and the engine BI tools call into for legacy data reporting in dashboards.
Three-layer architecture. Storage layer: Parquet files on S3 / Azure Blob / GCS with per-column compression, columnar indexes and partition organisation (year / month / entity / data-domain). Hash-signed manifests bind every Parquet file to the source extraction record. Query layer: Presto/Trino cluster providing SQL-92 with extensions, partition pruning, bloom-filter point lookups and predicate push-down to the storage layer. Multi-tenant or single-tenant deployment depending on customer choice. Access layer: web UI for human users (search, drill-down, export), REST API for programmatic access (OAuth-secured, throttled per tenant), and BI connectors (ODBC, Power Query, Tableau native). Every layer instrumented with access logs and integrity checks — auditor-grade by design.
Three architectural choices working together. Columnar Parquet storage with per-column compression: queries reading 3 of 200 columns scan ~1.5% of total bytes. Partition pruning at the storage layer: archives are partitioned by year, month, entity and data-domain so queries against 'AP invoices for Q3 2022 for legal entity DE-01' skip 99%+ of the archive before reading any bytes. Bloom filters on high-cardinality columns (document numbers, supplier codes, customer IDs, employee numbers, journal numbers): point-lookups return in single-digit milliseconds even on multi-TB datasets. Combined, these deliver sub-second response on the 95th percentile of historical reporting queries — the kind of performance most legacy ERPs never delivered even when live.
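Two of these mechanisms are easy to illustrate. The sketch below first reproduces the column-pruning arithmetic, assuming all 200 columns are roughly the same size on disk, and then shows a toy Bloom filter used to decide which files a point-lookup needs to open. The `BloomFilter` class, its sizing and the file names are hypothetical and are not the Parquet or Trino implementation.

```python
import hashlib

# Column pruning: reading 3 of 200 columns, assuming equal-sized columns.
scanned_fraction = 3 / 200      # 0.015, i.e. about 1.5% of bytes


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, rare false positives."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 5):
        self.size = size_bits               # must be a multiple of 8
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key: str):
        # Double hashing: derive k bit positions from one SHA-256 digest.
        digest = hashlib.sha256(key.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.k)]

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


# Hypothetical per-file filters built over document numbers.
files = {
    "part-0.parquet": ["INV-1001", "INV-1002", "INV-1003"],
    "part-1.parquet": ["INV-2001", "INV-2002", "INV-2003"],
}
filters = {}
for name, doc_numbers in files.items():
    bf = BloomFilter()
    for doc in doc_numbers:
        bf.add(doc)
    filters[name] = bf

# Only files whose filter answers "maybe" need to be opened.
candidates = [name for name, bf in filters.items()
              if bf.might_contain("INV-2002")]
```

A filter can answer "maybe" for a key that is absent, so the file must still be checked, but it never answers "no" for a key that is present. That guarantee is what makes skipping files safe.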
The web UI is the primary access surface for auditors, tax teams, finance, HR and legal. Capabilities: search by period, entity, document type, counterparty, document number, GL account, supplier, customer, employee, or any indexed field; drill from summary balances down to individual journal lines or document detail; filter and pivot on any column; export results to PDF (formatted), CSV (raw) or XLSX (formatted with totals); save common searches as scheduled reports that email on cadence; subscribe to alerts when specific data appears (litigation hold scenarios, fraud monitoring, regulatory review triggers); export evidence packs (multiple related searches bundled into a single ZIP for audit submission). Every action is logged with user, timestamp, query and rows accessed — chain-of-custody by default.
Every archived dataset is queryable through OAuth-secured REST endpoints. Patterns supported: dataset-level metadata queries (list available archives, retention policies, last refresh); structured queries (POST a SQL-like query payload, receive paginated JSON or NDJSON); document-level lookups (retrieve a specific journal, invoice, voucher or payroll record by document number); audit-trail queries (retrieve the access log for a specific record over a time window); legal-hold management (apply, lift, query hold scope for litigation matters); export job control (kick off a long-running export, poll status, download result). Rate limiting is per-tenant and per-API-key. The API is the integration point for custom dashboards, legal discovery pipelines, compliance automation and reconciliation flows.
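A structured-query call might look like the sketch below. Everything specific here is hypothetical: the base URL, the `/query` path, the `query` and `page_size` payload fields, the `application/x-ndjson` media type and the response rows are placeholders, since the source does not publish the API schema. The request is only constructed, not sent.

```python
import json
import urllib.request

BASE_URL = "https://archive.example.com/api/v1"     # hypothetical endpoint


def build_query_request(sql: str, token: str,
                        page_size: int = 500) -> urllib.request.Request:
    """Build (but do not send) an OAuth bearer-authenticated query request."""
    payload = json.dumps({"query": sql, "page_size": page_size}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/query",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/x-ndjson",
        },
    )


def parse_ndjson(body: str) -> list:
    """NDJSON is one JSON object per line; blank lines are ignored."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]


request = build_query_request(
    "SELECT doc_no, amount FROM ap_invoices WHERE entity = 'DE-01'",
    token="test-token",
)

# Stand-in for a response body; a real call would use urllib.request.urlopen.
sample_body = '{"doc_no": "INV-1001", "amount": 1200.0}\n{"doc_no": "INV-1002", "amount": 80.5}\n'
records = parse_ndjson(sample_body)
```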
Three patterns supported. Oracle Analytics Cloud (OAC): native subject-area connector — the historical reporting platform appears as a federated subject area alongside live Fusion data, so reports can blend current Fusion balances with archived historical context in one canvas. Microsoft Power BI: ODBC driver for direct query mode, plus Power Query connectors for import mode; semantic model push-down passes Power BI filter selections back to the archive's query layer for partition pruning. Tableau: native connector with both live-connection and extract-based modes; extract refresh schedules align with archive partition updates. In all three cases, archive reporting flows into the same BI tools finance, audit and operations already use — historical reports look and feel like live-system reports, just sourced from a much larger time window.
Both models are supported. Multi-tenant SaaS: SyntraETL operates the historical reporting platform in shared cloud infrastructure (AWS / Azure / GCP regions of customer choice), with logical isolation between tenants, separate KMS keys per tenant, and segregated access controls. Fastest to deploy, cheapest per TB. Single-tenant in customer cloud: the historical reporting platform is deployed entirely within the customer's own AWS / Azure / GCP account, behind the customer's identity provider and VPC controls. Customer retains direct control over the storage and queries. Common for regulated industries (financial services, healthcare, government, defence). Some customers run hybrid: SaaS for general historical reporting, single-tenant for the most sensitive data domains.
Role-based access control aligned with the customer's identity provider (Okta, Azure AD / Entra ID, Ping Identity, AWS IAM Identity Center, Google Workspace). Roles typically map to: finance read-only (current and prior 7 years), audit read-only (full retention window), tax read-only (statutory-relevant data only), legal hold (litigation-specific scope), HR read-only (personnel data only with PHI/PII overlays), compliance (PII access with audit logs reviewed quarterly), platform admin (configuration but not data). Crucially, source-system security can be mirrored: if a user couldn't see ledger ABC in EBS, they can't see archived ledger ABC in the historical reporting platform. Every read access is logged with user, timestamp, query and rows accessed — supporting SOX access reviews, HIPAA access tracking and GDPR DSAR response.
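The interaction between role scope and mirrored source-system security can be sketched as a single check. The `Role` class, the domain names and the `can_read` helper are hypothetical, and the set of ledgers a user could see in the source system is assumed to have been captured at extraction time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Role:
    name: str
    domains: frozenset              # data domains the role may read
    max_age_years: Optional[int]    # None means the full retention window


# Two of the roles described above; the domain lists are assumptions.
FINANCE_RO = Role("finance-read-only", frozenset({"gl", "ap", "ar"}), 7)
AUDIT_RO = Role("audit-read-only", frozenset({"gl", "ap", "ar", "payroll"}), None)


def can_read(role: Role, source_ledgers: set, *, domain: str,
             ledger: str, record_age_years: int) -> bool:
    """Allow a read only if the role AND the mirrored source security agree."""
    if domain not in role.domains:
        return False
    if role.max_age_years is not None and record_age_years > role.max_age_years:
        return False
    # Mirroring: a ledger invisible in the source system stays invisible here.
    return ledger in source_ledgers


# This user could see ledger XYZ, but not ABC, in the source system.
user_ledgers = {"XYZ"}
print(can_read(FINANCE_RO, user_ledgers, domain="gl", ledger="ABC", record_age_years=3))
print(can_read(FINANCE_RO, user_ledgers, domain="gl", ledger="XYZ", record_age_years=3))
```

The mirrored ledger set acts as an additional filter on top of the role, so granting a broader archive role never widens access beyond what the user had in the source system.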
Encryption in transit (TLS 1.3 enforced) and at rest (AES-256-GCM with customer-managed KMS keys in AWS KMS, Azure Key Vault or Google Cloud KMS). Hash-signed immutable storage: every Parquet file is signed at write time with a SHA-256 content hash and a per-set Merkle root; tampering is detectable at query time. Multi-factor authentication enforced for any external access. Service-account read patterns are explicitly excluded from production scope unless approved through change management. Network controls: VPC-only access for single-tenant deployments, IP allow-listing for SaaS, private endpoint support (AWS PrivateLink, Azure Private Endpoint). Audit logs are immutable and exportable to the customer's SIEM (Splunk, Sentinel, Chronicle, QRadar) for cross-platform correlation.
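A per-set Merkle root over SHA-256 file hashes can be computed as in the sketch below. The tree convention used here (duplicating the last node on odd-sized levels) is an assumption, since the source does not specify one, and the file contents are stand-ins.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaf_hashes: list) -> bytes:
    """Fold a list of leaf hashes pairwise into a single root hash."""
    if not leaf_hashes:
        raise ValueError("at least one leaf is required")
    level = list(leaf_hashes)               # copy so the input is not mutated
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])         # assumed convention for odd levels
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


# Stand-in file contents for one partition set.
file_contents = [b"part-0 bytes", b"part-1 bytes", b"part-2 bytes"]
leaves = [sha256(content) for content in file_contents]
root = merkle_root(leaves)

# Changing any single file changes the root.
tampered_leaves = [sha256(b"part-0 bytes"), sha256(b"part-1 BYTES"),
                   sha256(b"part-2 bytes")]
tampered_root = merkle_root(tampered_leaves)
```

Signing one root per set means a single signature covers every file in the set, while the individual content hashes still pinpoint which file changed.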
Two billing components. Storage: per-TB-per-month based on tier (hot / warm / cold) — typically $25/TB/month hot, $12/TB/month warm, $4/TB/month cold. Query: per-million-rows-scanned for ad-hoc queries; scheduled and pre-built reports are bundled. Typical pricing for a 10-year, 7 TB archive (1 TB hot, 2 TB warm, 4 TB cold) with moderate query volume is $1,200–$2,400/month all-in — versus $40K–$200K/month for keeping the legacy ERP alive. Enterprise pricing is committed per-program with discounting at the 20+ TB tier. Single-tenant deployments are committed-capacity priced rather than per-query. Multi-system archives (multiple retired ERPs in one platform) get sub-linear pricing as the platform overhead is amortised.
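The storage component of the 7 TB example can be worked out directly from the tier rates quoted above. The sketch assumes those typical rates apply as stated and that the all-in range covers only the two billing components named; the source does not itemise the all-in figure, so the remainder shown is simply what is left after storage.

```python
# Typical per-TB monthly rates and the example footprint, as quoted above.
RATE_PER_TB = {"hot": 25, "warm": 12, "cold": 4}    # USD per TB per month
FOOTPRINT_TB = {"hot": 1, "warm": 2, "cold": 4}     # 7 TB in total

# 1 * 25 + 2 * 12 + 4 * 4 = 65
storage_per_month = sum(RATE_PER_TB[t] * FOOTPRINT_TB[t] for t in RATE_PER_TB)

ALL_IN_LOW, ALL_IN_HIGH = 1_200, 2_400              # USD per month, as quoted

# Whatever is not storage is attributed to query charges in this sketch.
remainder_low = ALL_IN_LOW - storage_per_month
remainder_high = ALL_IN_HIGH - storage_per_month

print(f"storage: ${storage_per_month}/month")
print(f"non-storage remainder: ${remainder_low} to ${remainder_high}/month")
```

At these rates storage is a small share of the quoted all-in range, so query volume is the main cost driver to estimate during scoping.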
Yes, and this is the common steady state. Customers commonly accumulate 3–8 retired systems in the platform over a 2–4 year decommissioning program: Oracle EBS from one acquisition, PeopleSoft from another, an old SAP ECC tenant, a Workday instance from a divested business unit, a Microsoft Dynamics deployment from a regional subsidiary. The historical reporting platform is multi-archive by design: each archived system has isolated storage, isolated access controls and isolated retention policies, but a unified search experience across the estate. Cross-system queries (e.g., 'find every payment to vendor ACME across all retired systems for the IRS audit') work natively. The unified retired-system reporting view becomes the system of record for archived data.
Retention policies are configured per data domain per legal entity per jurisdiction. A typical configuration: GL postings 10 years (HGB) or 7 years (US SOX), AP/AR documents 7 years (IRS) or 6 years (HMRC), payroll records 4+ years (US IRS) or 50 years (specific French cases), HR personnel records per employment-law jurisdiction, FDA Part 11 batch records per the longer of 7 years or 1 year past lot expiry. The platform enforces retention automatically: data within the retention window is preserved and queryable; data past retention is moved to a quarantine state for review; verified-eligible data is deleted with a cryptographically signed deletion certificate. Legal hold overrides automatic deletion. GDPR Article 17 right-to-erasure overrides retention except where a conflicting statutory obligation applies, and the conflict resolution is logged.
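The enforcement flow (retain, quarantine for review, delete once verified, with legal hold overriding) can be sketched as a small state function. The `State` enum, the policy keys and the helper names are hypothetical. The sketch also assumes the retention clock starts at the record date, whereas real regimes often count from the end of a fiscal or calendar year, and it leaves out the GDPR erasure path.

```python
from datetime import date
from enum import Enum


class State(Enum):
    RETAINED = "retained"
    QUARANTINED = "quarantined"
    DELETED = "deleted"


# Retention periods in years, taken from the figures quoted above.
RETENTION_YEARS = {
    ("gl_postings", "HGB"): 10,
    ("gl_postings", "US SOX"): 7,
    ("ap_ar_documents", "IRS"): 7,
    ("ap_ar_documents", "HMRC"): 6,
}


def add_years(d: date, years: int) -> date:
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February landing in a non-leap year falls back to 28 February.
        return d.replace(year=d.year + years, day=28)


def retention_state(record_date: date, today: date, domain: str, regime: str,
                    *, legal_hold: bool = False,
                    deletion_approved: bool = False) -> State:
    """Classify one record under the configured policy."""
    if legal_hold:
        return State.RETAINED           # legal hold overrides deletion
    expiry = add_years(record_date, RETENTION_YEARS[(domain, regime)])
    if today < expiry:
        return State.RETAINED           # still inside the retention window
    # Past retention: quarantine first, delete only after review approves it.
    return State.DELETED if deletion_approved else State.QUARANTINED


today = date(2024, 1, 1)
print(retention_state(date(2012, 3, 15), today, "gl_postings", "HGB"))
print(retention_state(date(2012, 3, 15), today, "gl_postings", "HGB",
                      legal_hold=True))
```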
A custom data lake (S3 plus Athena, or Azure Data Lake plus Synapse) can technically hold archived ERP data — but lacks the application-specific features that make a historical reporting platform actually useful. Missing pieces: source-system security mirroring (custom data lakes have no concept of EBS responsibilities or PeopleSoft permission lists); hash-signed integrity from extraction (custom data lakes are unsigned by default); pre-built audit reports (custom data lakes start empty); BI connectors that understand archive schemas (custom data lakes expose raw tables); retention policy enforcement with jurisdictional logic (custom data lakes have no concept of HGB or SAF-T); legal hold management (custom data lakes have no holds primitive); GDPR DSAR support with PII tagging. Building these features costs $1.5M–$4M and 12–18 months of engineering — SyntraETL ships them on day one.
Yes, the platform integrates with existing compliance and security tooling. Standard integrations include: SIEM forwarding (Splunk, Microsoft Sentinel, Google Chronicle, IBM QRadar) for cross-platform access log correlation; identity provider deep integration (Okta, Azure AD/Entra ID, Ping) for SSO, MFA enforcement and lifecycle automation; GRC platform integration (ServiceNow GRC, RSA Archer) for retention policy management and compliance evidence; ticketing integration (ServiceNow ITSM, Jira) for access requests, legal hold tickets and DSAR workflow; legal hold platform integration (Relativity, Logikcull, Onna) for litigation scope synchronisation. The REST API also supports custom integrations for niche compliance and audit tooling. Webhook delivery for access-event-driven workflows is standard.
Three-step path. Step 1: 30-minute discovery call covering current legacy ERP estate, retention obligations, audit-team access patterns and BI tool footprint. Step 2: 2-week paid assessment producing the data inventory, retention policy design, identity-provider integration plan and pilot scope. Step 3: 4–6 week pilot extracting a meaningful subset of legacy data (typically one legal entity or one fiscal year) into a customer-accessible historical reporting platform instance; auditors and finance teams validate access patterns directly. Pilot output is a production go/no-go with hard timeline and budget for full rollout. Most customers move from pilot to production rollout within 60 days because the value is immediately visible to internal audit and the CFO.
Book a 30-minute platform demo. We will walk through the web UI, API surface, BI connectors, deployment models and pricing — and answer the architecture questions your security and compliance teams will ask.