Question 1

What is a Cornerstone OnDemand data extraction tool?

Accepted Answer

A Cornerstone OnDemand data extraction tool is software that programmatically pulls user, OU, Custom Field, Learning Object, transcript, certification, performance review, goal, succession-plan, requisition and compensation data from your Cornerstone tenant for use in migration, archival, reporting or analytics outside the platform. Cornerstone exposes three primary surfaces: Cornerstone Edge APIs (REST and GraphQL) for transactional access; Reporting 2.0 for canned report exports; and RDW (Reporting Data Warehouse) for SQL access to the analytical replica. A capable extraction tool authenticates with OAuth2 client credentials under minimal scopes, respects rate limits, paginates through complete result sets, and produces hash-signed output that can be reconciled against the source. Syntra ETL is built to do this and comes pre-configured for each of the Cornerstone data domains listed above.
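
As a rough illustration of the pagination requirement, here is a minimal sketch of an offset-based page loop. The `fetch_page` callable is a hypothetical stand-in for an endpoint wrapper; Cornerstone's actual paging parameters are not described in this answer and may differ.

```python
from typing import Callable, Iterator, List


def paginate(fetch_page: Callable[[int, int], List[dict]],
             page_size: int = 100) -> Iterator[dict]:
    """Yield every record from an offset-paged source.

    fetch_page(offset, limit) is a hypothetical callable that returns
    at most `limit` records starting at `offset`.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        # A short (or empty) page means the result set is exhausted.
        if len(page) < page_size:
            break
        offset += page_size
```

Stopping on a short page rather than on a reported total avoids trusting a count that may change while the extract is running.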

Question 2

Why not just use Cornerstone Reporting 2.0 to export the data?

Accepted Answer

Reporting 2.0 works well for canned, finance-friendly exports, and it remains the right surface for ad-hoc report extraction. But three structural limits make it the wrong primary tool for large-scale extraction: per-report row caps (which truncate the largest transcript exports), no programmatic delta watermark, and limited concurrency. For migration-scale extraction (15+ years of transcripts, the full SCORM/xAPI content library, the complete Custom Field and OU catalog, every active and expired certification) you need API-level access with watermarking, parallel requests and rate-limit awareness. The Syntra ETL Cornerstone OnDemand data extraction tool uses Cornerstone Edge REST/GraphQL plus RDW SQL where appropriate, and falls back to Reporting 2.0 only for specific finance-friendly extracts.

Question 3

How does the Syntra ETL extraction tool handle Cornerstone Edge API rate limits?

Accepted Answer

Cornerstone Edge enforces tenant-level rate limits across REST and GraphQL endpoints. Hammering the API risks 429 throttling and operational impact on live training delivery. Syntra ETL's extraction tool implements adaptive concurrency: it starts at a conservative concurrent-request count, monitors response time and 429 frequency, and dials concurrency up or down to maximize throughput while staying inside the safe operating envelope. RDW SQL is preferred for bulk historical extraction because it queries the analytical replica without touching the operational Edge surface. The combined approach typically completes a multi-decade transcript extract in 2–4 days without disrupting live operations.
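
The adaptive-concurrency idea can be sketched as an additive-increase, multiplicative-decrease controller. This is an illustrative assumption about how such a loop might work, not Syntra ETL's actual implementation; the starting limit, bounds and latency target below are made-up defaults.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveConcurrency:
    limit: int = 2                 # conservative starting point (assumed)
    min_limit: int = 1
    max_limit: int = 32            # assumed ceiling
    target_latency_s: float = 1.0  # assumed latency target

    def record_window(self, throttled: int, total: int,
                      avg_latency_s: float) -> int:
        """Adjust the limit after a window of requests; return the new limit."""
        if total == 0:
            return self.limit
        if throttled > 0 or avg_latency_s > 2 * self.target_latency_s:
            # Any 429 or a badly degraded latency: halve the concurrency.
            self.limit = max(self.min_limit, self.limit // 2)
        elif avg_latency_s <= self.target_latency_s:
            # Healthy window: probe upward by one worker.
            self.limit = min(self.max_limit, self.limit + 1)
        # Latency between 1x and 2x the target: hold steady.
        return self.limit
```

Backing off multiplicatively and recovering additively means a throttled tenant sheds load quickly, while throughput climbs back only as fast as the API shows it can tolerate.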

Question 4

Can the Cornerstone OnDemand data extraction tool capture incremental deltas?

Accepted Answer

Yes. Every Cornerstone Edge endpoint with a last-modified or similar timestamp is wrapped with a watermark-aware extractor that captures only records changed since the previous run. Watermarks are persisted in a partition-aware state store, so a re-run after a network interruption resumes from the last good watermark rather than re-extracting from scratch. This is essential for the parallel-run window during cutover: Cornerstone continues live operation, deltas are captured at a configurable interval (minutes to hours), and they are replayed into the Fusion target system through HDL incremental loads or REST API endpoints. The tool also detects physically deleted records via tombstone comparison against a hash-signed snapshot.
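
A minimal sketch of the watermark and tombstone logic follows. It assumes a JSON file as the state store, a `modified` field holding same-format ISO 8601 UTC timestamps (so string comparison orders them correctly), and a hypothetical `fetch_since` callable; none of these names come from Cornerstone or Syntra ETL.

```python
import json
from pathlib import Path

EPOCH = "1970-01-01T00:00:00Z"  # watermark used on the very first run


def load_watermark(state_path, key):
    path = Path(state_path)
    if not path.exists():
        return EPOCH
    return json.loads(path.read_text()).get(key, EPOCH)


def save_watermark(state_path, key, value):
    path = Path(state_path)
    state = json.loads(path.read_text()) if path.exists() else {}
    state[key] = value
    path.write_text(json.dumps(state))


def extract_delta(fetch_since, state_path, key):
    """Fetch records changed since the stored watermark, then advance it.

    fetch_since(watermark) is a hypothetical callable returning dicts
    with a "modified" timestamp later than the watermark.
    """
    watermark = load_watermark(state_path, key)
    records = list(fetch_since(watermark))
    if records:
        # Advance only after the whole batch was fetched, so a failed
        # run resumes from the previous good watermark.
        save_watermark(state_path, key, max(r["modified"] for r in records))
    return records


def find_tombstones(snapshot_ids, current_ids):
    """IDs present in the earlier snapshot but missing now were deleted."""
    return set(snapshot_ids) - set(current_ids)
```

Timestamps alone cannot reveal hard deletes, which is why the ID-set comparison against the previous snapshot is kept as a separate step.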

Question 5

Does the Cornerstone OnDemand data extraction tool capture SCORM and xAPI content packages?

Accepted Answer

Yes. The extraction tool handles three content surfaces: SCORM 1.2/2004 packages (downloaded as the original .zip with imsmanifest.xml intact), xAPI (Tin Can) content with its tincan.xml descriptor, and AICC/cmi5 packages. Each package is downloaded with its full file tree, hash-signed at the package level and at the individual SCO level, and indexed by the original Cornerstone Learning Object ID. The xAPI statement archive (the per-user activity stream for xAPI content) is extracted separately through the LRS endpoint and stored in raw form for compliance retention. This matters because the Cornerstone content library is often the largest single data domain in the migration.
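
To make the two-level hashing concrete, here is a small sketch that hashes a package as a whole and then each file inside it. Per-file SHA-256 is used as a stand-in for per-SCO hashing, since the mapping from files to SCOs depends on the manifest; the function name and output layout are assumptions for illustration.

```python
import hashlib
import io
import zipfile


def hash_package(zip_bytes: bytes) -> dict:
    """Return a package-level digest plus one digest per contained file."""
    files = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Hash the uncompressed content of each member.
            files[info.filename] = hashlib.sha256(zf.read(info)).hexdigest()
    return {
        # Digest of the .zip exactly as downloaded.
        "package_sha256": hashlib.sha256(zip_bytes).hexdigest(),
        # SCORM packages are expected to carry this file at the root.
        "has_manifest": "imsmanifest.xml" in files,
        "files": files,
    }
```

The package-level digest proves the archive was stored verbatim, while the per-file digests let a reviewer pinpoint which file differs if two copies of a package disagree.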

Question 6

What output formats does the extraction tool produce?

Accepted Answer

Output formats are governed by downstream use. For Fusion-target migration: HDL bundle source (CSV/JSON) ready for the Cornerstone data conversion stage; Parquet for analytical staging and reconciliation; SCORM .zip bundles preserved verbatim. For long-term archival: Parquet partitioned by user, BU and fiscal year for transcript and certification archive; original SCORM/xAPI packages stored in cloud object storage with hash-signed manifests; xAPI statement archive in JSON-LD for regulator-friendly export. Every output carries a manifest with row counts, sum totals, hash signatures and source-extract timestamp for reconciliation.
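
A reconciliation manifest of the kind described above might be built roughly as follows. The field names and the choice of a canonical JSON serialization for hashing are assumptions made for this sketch, not the tool's documented format.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_manifest(rows, sum_field, extracted_at=None):
    """Summarize an extract: row count, a control total, a content hash."""
    # Serialize each row with sorted keys so the hash does not depend
    # on the key order the source happened to return.
    canonical = "\n".join(json.dumps(r, sort_keys=True) for r in rows)
    return {
        "row_count": len(rows),
        "sum_total": sum(r[sum_field] for r in rows),
        "sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "extracted_at": extracted_at
        or datetime.now(timezone.utc).isoformat(),
    }
```

Comparing the row count and control total against the source catches dropped or duplicated records cheaply, and the hash catches any remaining value-level drift.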

Question 7

How does the Cornerstone OnDemand data extraction tool handle the Saba / EdCast / SumTotal data heritage?

Accepted Answer

Mature Cornerstone tenants often carry data with multiple lineages: original Cornerstone-native data, Saba data migrated in after the 2020 merger, EdCast learning-experience content migrated in after the 2022 acquisition, and SumTotal data brought in after Cornerstone acquired SumTotal from Skillsoft. The extraction tool's discovery engine identifies the heritage of every user, course and transcript record by looking at the originating system tag, the import-batch metadata and the ID format. Records are tagged with heritage in the output, so downstream conversion can apply heritage-specific rules. For example, Saba-origin courses often need additional metadata normalization, while EdCast-origin learning-experience content routes to Fusion Learn's video-content path rather than the SCORM path.
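
The three signals named above (system tag, import-batch metadata, ID format) could be combined in a simple precedence order, as in this sketch. Every field name and ID prefix here is hypothetical; real tenants will have their own conventions, and treating unmatched records as Cornerstone-native is an assumption.

```python
import re

# Hypothetical ID prefixes, for illustration only.
ID_PATTERNS = {
    "saba": re.compile(r"^SABA-"),
    "edcast": re.compile(r"^EC-"),
    "sumtotal": re.compile(r"^ST-"),
}


def classify_heritage(record: dict) -> str:
    """Tag a record with its originating system."""
    # 1. An explicit originating-system tag wins when present.
    tag = record.get("source_system")
    if tag:
        return tag.lower()
    # 2. Otherwise look for a heritage name in the import-batch label.
    batch = (record.get("import_batch") or "").lower()
    for heritage in ID_PATTERNS:
        if heritage in batch:
            return heritage
    # 3. Finally fall back to the shape of the record ID.
    record_id = str(record.get("id", ""))
    for heritage, pattern in ID_PATTERNS.items():
        if pattern.match(record_id):
            return heritage
    # Assumption: anything unmatched is treated as native data.
    return "cornerstone-native"
```

Checking the most explicit signal first keeps the weaker ID-format heuristic from overriding metadata that was recorded at import time.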

Question 8

How does the Syntra ETL extraction tool authenticate to Cornerstone?

Accepted Answer

The extraction tool authenticates via OAuth2 client credentials issued by a Cornerstone Edge administrator. The pattern follows the principle of least privilege: a dedicated read-only client per extract project, with scopes restricted to only the APIs needed for the in-flight extraction (e.g., Learning scope for transcripts and Learning Objects, Performance scope for review data, Admin scope only when crawling the Custom Field and OU catalog). Tokens are rotated on a schedule, no admin credentials are ever embedded in extraction code, and all token usage is logged for SOC 2 audit. Credentials are stored in a secrets manager (typically AWS Secrets Manager, Azure Key Vault or HashiCorp Vault) and pulled at runtime.
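
For orientation, a generic OAuth2 client-credentials token request (in the form-encoded shape described by RFC 6749) can be assembled as below. The token URL, scope names and body format are placeholders: Cornerstone's actual endpoint and request shape should be taken from its own API documentation, and the secret would come from a secrets manager rather than a literal.

```python
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret, scopes):
    """Build (but do not send) a client-credentials token request."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Request only the scopes this extract actually needs.
        "scope": " ".join(scopes),
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical usage; the URL and scope name are illustrative only.
# req = build_token_request("https://example.invalid/oauth2/token",
#                           client_id, client_secret, ["learning:read"])
# with urllib.request.urlopen(req) as resp:
#     token = resp.read()
```

Passing the scope list in explicitly, per extract project, keeps the least-privilege decision visible in the calling code instead of buried in a shared default.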

Cornerstone OnDemand Data Extraction Tool — Edge + RDW Native

What a real Cornerstone OnDemand data extraction tool actually needs to do

What the extraction tool produces

Why teams choose Syntra ETL as their Cornerstone OnDemand data extraction tool

Adaptive concurrency

Edge + RDW dual stream

Watermark-aware deltas

SCORM / xAPI packager

Heritage-aware extraction

OAuth2 hardening

How the Cornerstone OnDemand data extraction tool runs: operational flow

Provision OAuth2 client — Day 0

Discovery sweep — Days 1–2

Structured extract (current) — Days 2–4

Bulk historical sweep (RDW) — Days 3–7

Content package download — Days 5–9

Reconciliation manifest — Day 10

Every Cornerstone domain pre-wired in the extraction tool

Users, OUs, Custom Fields

Learning Objects

Transcripts (Edge + RDW)

Certifications

Performance, Succession

Recruit, Compensation, Engagement

Frequently asked questions

Need a production-grade Cornerstone OnDemand data extraction tool?