Question 1

What is an SAP S/4HANA data extraction tool, and how does Syntra ETL compare to alternatives?

Accepted Answer

An SAP S/4HANA data extraction tool reads data out of the S/4HANA stack — through HANA SQL, CDS views, OData services, or BAPI/RFC — and writes it to a destination (typically cloud object storage, a data warehouse, or downstream applications) without disrupting the source system. Alternatives include SAP Data Services (BODS — heavy, ABAP-stack-dependent), SAP Datasphere / SAC Live Data (good for cloud reporting but not bulk extraction), SLT (real-time but operationally complex), and home-grown ABAP downloads (brittle, slow, no governance). Syntra ETL is a purpose-built extraction layer with pre-configured table-and-view definitions for every delivered S/4HANA object, parallelism out of the box, Parquet output, and zero ABAP-stack footprint.

Question 2

Which SAP interfaces does the Syntra ETL extraction tool use?

Accepted Answer

All four: direct HANA SQL (highest throughput; requires a HANA DB user on on-premise or BTP-private S/4HANA), CDS views (the strategic, SAP-endorsed access layer for RISE and Cloud editions where DB access is locked down, with thousands of pre-built I_* and C_* views), OData services (RESTful access via SAP Gateway, well suited to incremental and event-driven extraction), and BAPI/RFC (legacy access where CDS coverage gaps exist, or for specific function-module-driven extractions such as HR-PA). For each table or business object, Syntra ETL automatically picks the highest-throughput interface available in your environment; a configuration override lets you prefer a different one for policy reasons.
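
The selection rule described above can be sketched in a few lines. This is an illustrative sketch only, not Syntra ETL's actual code: the interface names, the `pick_interface` function, and the priority order after HANA SQL (taken from the order in which the answer lists the interfaces) are all assumptions.

```python
from typing import Optional

# Assumed priority: HANA SQL is stated to be the fastest; the rest follow
# the order in which the answer lists them (an assumption, not a benchmark).
INTERFACE_PRIORITY = ["hana_sql", "cds_view", "odata", "bapi_rfc"]


def pick_interface(available: set[str], override: Optional[str] = None) -> str:
    """Return the override if it is usable, else the first available
    interface in priority order."""
    if override is not None:
        if override not in available:
            raise ValueError(f"override {override!r} is not available here")
        return override
    for name in INTERFACE_PRIORITY:
        if name in available:
            return name
    raise LookupError("no supported extraction interface is available")
```

For example, an environment that exposes only CDS views and OData would resolve to `cds_view` unless the configuration explicitly asks for `odata`.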

Question 3

How does the extraction tool handle SAP S/4HANA Cloud where direct database access is forbidden?

Accepted Answer

S/4HANA Cloud Public Edition allows no direct HANA database access, so extraction must go through SAP's published cloud APIs. Syntra ETL targets the published CDS view catalog (I_* and C_* views are documented in the SAP Business Accelerator Hub, formerly the SAP API Business Hub) and the OData services exposed through SAP Gateway. For RISE with SAP Cloud Private Edition, the situation is similar but more flexible: some customers have CDS view access, some have OData only, and very few have HANA DB access. Syntra ETL auto-detects the available interfaces during the discovery phase and configures the extraction layer accordingly. Throughput is lower than with direct HANA SQL but remains viable in production for migration, archival, and analytical use cases.

Question 4

Can the extraction tool extract from custom Z-tables and CDS view extensions?

Accepted Answer

Yes. Custom tables (Z*/Y* tables in the ABAP dictionary, registered in TADIR) are auto-discovered during the inventory phase and added to the extractable-object catalog. CDS view extensions (custom views, view extensions on standard SAP views, ABAP Cloud RAP services) are similarly discovered via the CDS catalog metadata. Extraction config for custom objects is generated automatically: Syntra ETL reads the data-dictionary definition (column names, datatypes, key fields), constructs the extraction SQL or OData query, and stages output to Parquet with the schema preserved. Custom objects participate in the same reconciliation, hashing, and audit-log workflow as delivered tables.
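
To make the "read the dictionary definition, construct the extraction SQL" step concrete, here is a minimal sketch. It assumes the column metadata has already been read into plain Python objects; the `Column` class, the `build_select` function, the schema name `SAPHANADB`, and the table `ZSALES_LOG` are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    datatype: str
    is_key: bool = False


def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_select(schema: str, table: str, columns: list[Column]) -> str:
    """Build a SELECT over all columns, ordered by the key fields so that
    repeated runs read rows in a stable order."""
    if not columns:
        raise ValueError("table definition has no columns")
    col_list = ", ".join(quote_ident(c.name) for c in columns)
    keys = [quote_ident(c.name) for c in columns if c.is_key]
    sql = f"SELECT {col_list} FROM {quote_ident(schema)}.{quote_ident(table)}"
    if keys:
        sql += " ORDER BY " + ", ".join(keys)
    return sql


# Hypothetical custom table with two key fields and one amount column.
zsales = [
    Column("MANDT", "CLNT", is_key=True),
    Column("DOCNR", "CHAR", is_key=True),
    Column("AMOUNT", "CURR"),
]
query = build_select("SAPHANADB", "ZSALES_LOG", zsales)
```

Quoting every identifier keeps the generated statement valid even when a custom column name contains characters that would otherwise need escaping.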

Question 5

What output formats does the SAP S/4HANA data extraction tool produce?

Accepted Answer

Default output is Parquet — columnar, compressed, schema-embedded, optimal for downstream query (Athena, BigQuery, Snowflake, Spark) and for long-term archival storage. Optional outputs include CSV for legacy consumers, JSON for document-store destinations, and direct loading into Oracle Fusion via FBDI/HDL/REST for migration use cases. The Parquet output is partitioned by configurable keys (typically fiscal year + period + company code for FI tables; year + plant for MM tables; year + sales org for SD tables) so downstream consumers get partition-pruning performance automatically.
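
As an illustration of how partition keys map onto storage, the sketch below groups rows under Hive-style `key=value` directory paths, which is the layout most of the query engines named above can prune on. The layout itself and the function names are assumptions; the answer does not state how Syntra ETL names its partitions. BKPF with GJAHR, MONAT and BUKRS is used as the example FI table.

```python
from collections import defaultdict


def partition_path(table: str, partition_keys: list[str], row: dict) -> str:
    """Return a Hive-style path such as BKPF/GJAHR=2024/MONAT=03/BUKRS=1000."""
    parts = [f"{key}={row[key]}" for key in partition_keys]
    return "/".join([table, *parts])


def group_by_partition(table: str, partition_keys: list[str], rows: list[dict]) -> dict:
    """Group rows by the partition directory they would be written to."""
    groups = defaultdict(list)
    for row in rows:
        groups[partition_path(table, partition_keys, row)].append(row)
    return dict(groups)


rows = [
    {"GJAHR": "2024", "MONAT": "03", "BUKRS": "1000", "BELNR": "0100000001"},
    {"GJAHR": "2024", "MONAT": "03", "BUKRS": "1000", "BELNR": "0100000002"},
    {"GJAHR": "2024", "MONAT": "04", "BUKRS": "2000", "BELNR": "0100000003"},
]
groups = group_by_partition("BKPF", ["GJAHR", "MONAT", "BUKRS"], rows)
```

A query that filters on fiscal year and company code then only has to open the directories whose path matches, instead of scanning every file.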

Question 6

How does parallelism work in the extraction tool?

Accepted Answer

Each extractable object has a natural partition key (RBUKRS company code for ACDOCA, BUKRS for BKPF, MANDT + WERKS for MARC, etc.). The extraction engine splits the source range across N parallel workers, each pulling a non-overlapping partition, hashing rows as they're read, and writing to its own Parquet shard. The reconciliation engine then validates that all shards together cover the expected range (no gaps, no overlaps) and that row-count and sum-total invariants hold. Typical parallelism is 4–16 workers per object; throughput scales close to linearly until either the HANA source becomes the bottleneck or the output object store does. For RISE/Cloud where API rate limits exist, parallelism is auto-throttled to respect SAP's published thresholds.
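
The split-and-reconcile idea can be sketched as follows. For simplicity the example assumes a numeric key range (for instance a document-number interval) rather than a list of company codes, and all function names are illustrative rather than taken from the product.

```python
import hashlib


def split_range(low: int, high: int, workers: int) -> list[tuple[int, int]]:
    """Split the half-open range [low, high) into at most `workers`
    contiguous, non-overlapping chunks of near-equal size."""
    if workers < 1 or high < low:
        raise ValueError("need workers >= 1 and high >= low")
    base, extra = divmod(high - low, workers)
    chunks, start = [], low
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # fewer keys than workers: skip empty chunks
        chunks.append((start, start + size))
        start += size
    return chunks


def covers_exactly(chunks: list[tuple[int, int]], low: int, high: int) -> bool:
    """Reconciliation check: chunks tile [low, high) with no gaps or overlaps."""
    cursor = low
    for start, end in sorted(chunks):
        if start != cursor or end <= start:
            return False
        cursor = end
    return cursor == high


def row_hash(row: tuple) -> str:
    """SHA-256 over the row's values, joined with a unit-separator byte."""
    payload = "\x1f".join(str(v) for v in row).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


chunks = split_range(0, 10, 4)
```

Each worker would receive one chunk, hash rows as it reads them, and write its own shard; `covers_exactly` is the kind of check that runs afterwards over the chunk boundaries actually processed.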

Question 7

Does the extraction tool impact SAP S/4HANA production performance?

Accepted Answer

Not when configured appropriately. Three controls limit impact: (1) a read-only HANA user or read-only CDS view authorisation, so no write contention is possible; (2) configurable concurrency limits and statement timeouts, so HANA workload management can prioritise online users; (3) optional scheduling against a read-enabled HANA system-replication secondary (an Active/Active read-enabled setup) for zero impact on the primary. For RISE-hosted S/4HANA, where customer Basis access is restricted, throttling is managed via API rate limits and time-window scheduling. Customers routinely run multi-terabyte extracts during business hours with no detectable impact on online TPS.

Question 8

How does the extraction tool fit into compliance audits (SOX, German HGB, BaFin)?

Accepted Answer

Every extraction operation produces a signed audit log: timestamp, source system, object extracted, row count, sum totals, hash of result-set, identity of the requesting user/service. Logs are immutable, retained per configurable retention policy, and exportable as evidence packs. For SOX, the audit log proves data lineage from S/4HANA to downstream system. For German HGB/AO §147 retention, the extraction log proves the archive was a true and complete copy at extraction time. For BaFin in regulated financial services, the audit log proves no unauthorised modifications were possible during extraction. Auditors typically sign off on the extraction process itself after seeing one full run-through, then sample-test subsequent runs.
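
A signed audit record of this shape can be sketched with an HMAC over a canonical JSON encoding. The answer does not say which signature scheme Syntra ETL uses, so HMAC-SHA256 with a shared key is an assumption made for brevity (a production system might well use asymmetric signatures instead), and the field names and values are hypothetical.

```python
import hashlib
import hmac
import json


def _canonical(record: dict) -> bytes:
    """Serialise with sorted keys so the same record always signs the same."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, key: bytes) -> dict:
    """Return a copy of the record with an HMAC-SHA256 signature attached."""
    signature = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return {**record, "signature": signature}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the signature over every field except `signature`."""
    record = {k: v for k, v in signed.items() if k != "signature"}
    expected = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed.get("signature", ""))


key = b"demo-key-not-for-production"
entry = sign_record(
    {
        "timestamp": "2024-03-31T22:15:00Z",
        "source_system": "PRD-100",
        "object": "ACDOCA",
        "row_count": 1250000,
        "result_hash": "ab12cd34",
        "requested_by": "svc-extract",
    },
    key,
)
```

Any later change to a field such as `row_count` makes `verify_record` return `False`, which is the property an auditor relies on when sampling log entries.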

SAP S/4HANA Data Extraction Tool — HANA, CDS, OData, BAPI

What an SAP S/4HANA data extraction tool needs to do — and what most don't

Extraction interfaces, ranked by typical throughput

What the SAP S/4HANA extraction tool ships with on day one

Finance (FI/CO)

AP / AR / Banking

Materials / MM

Sales / SD

Production / PP & EAM

HR / Payroll (when present)

How a typical SAP S/4HANA extraction job runs

Source discovery — 5 minutes

Job configuration — 15 minutes

Authorisation provisioning — Same-day

First-run smoke test — 30 minutes

Full extract — Hours to days

Reconciliation & evidence — Automatic

Operational characteristics that matter at scale

Throughput

Restartable

Sensitive-data masking

Incremental & delta

Audit logging

Job observability

Frequently asked questions
