Building an Enterprise Data Warehouse for Retail Banking on Oracle
A leading retail bank faced fragmented data across PostgreSQL transactional systems, with no unified view for regulatory reporting, near real-time analytics, or governance. OptiSol architected a full-stack Oracle Data Lake and Enterprise Data Warehouse, integrating Qlik Replicate for real-time CDC, dimensional modelling, and Power BI reporting layers — transforming the bank's data ecosystem into a governed, AI-ready foundation for decision-making.
Key Outcomes
Challenges and Solutions
Fragmented Data Landscape
Core banking and transactional data resided in isolated PostgreSQL systems. Data silos prevented enterprise-level reporting and analytics, with no unified layer available for business or compliance users.
Implemented an Oracle Data Lake as the single source of truth, replicating all PostgreSQL sources via Qlik Replicate's real-time CDC — consolidating fragmented systems into a governed, unified data platform.
Reporting Limitations
Operational and regulatory reports required complex aggregations that the application database could not support. Reports were either non-existent or severely degraded in performance, limiting business visibility.
Built subject-oriented Data Marts with conformed dimensions and fact tables (Transactions, Loans, Deposits) atop the Oracle EDW, with business-specific aggregated views optimized for Power BI reporting across management hierarchies.
Lack of Near Real-Time Insights
Significant data latency — even for operational reports — left business users without timely visibility into day-to-day banking operations, delaying critical decisions.
Deployed Qlik Replicate to capture transaction log changes (PostgreSQL WAL files) in near real-time with minimal source system impact, feeding the Oracle Data Lake with continuous incremental updates.
Data Quality and Governance Gaps
Inconsistent data definitions across systems, limited data lineage, and no auditability created compliance exposure and eroded trust in reported metrics across the organization.
Established a data governance framework with defined ownership, lineage tracking, metadata management, and automated reconciliation between source and target — with full audit trails maintained via CDC history.
Our approach
Source System Assessment
Mapped all PostgreSQL banking modules — Customers, Accounts, Transactions, Loans, Deposits, Regions — to understand data volumes, change frequencies, and integration dependencies.
CDC Pipeline via Qlik Replicate
Configured non-intrusive real-time replication from PostgreSQL to the Oracle Data Lake using WAL-based change capture, handling inserts, updates, and deletes automatically with zero source system load.
Oracle Data Lake Build
Established a raw, partitioned Oracle Data Lake as the primary source of truth — maintaining full historical CDC changes and enabling incremental data storage with built-in scalability.
Dimensional Warehouse Design
Designed and built the Enterprise Data Warehouse using star-schema dimensional modelling with SCD Type 2 on key dimensions (Customer, Product, Branch, Account, DateTime) and fact tables for all banking events.
ETL Orchestration via Airflow
Implemented Oracle Stored Procedures for fact and dimension population, scheduled via Apache Airflow DAGs — ensuring reliable, automated refresh cycles with performance-optimized execution.
Power BI Reporting Layer
Built granular operational reports and summarized strategic dashboards targeting multiple management tiers — sourced from aggregated Oracle views for wider, faster, and more accurate business insights.
