## Date Dim **Table Name:** date_dim **Table Type:** Dimension **Description:** Each row in this table represents one calendar day. **Load Frequency:** Static (load once) **Source:** Loaded once when the DW is deployed from a DDL file. | Target Schema: | Target Table: | Target Column: | Target Datatype: | Key: | SCD Type: | Transformation: | Note: | | -------------- | ------------- | ------------------ | ---------------- | ---- | --------- | ----------------- | --------------------------------------------------------------------------------------------------------------- | | dw\_mystore | date\_dim | date\_key | date | PK | N/A | | Using the date datatype for the key as it will allow range restricted scans on fact tables. | | dw\_mystore | date\_dim | day\_date | date | NK | N/A | | Source column is repeated here for user convenience as primary keys are often hidden in BI presentation layers. | | dw\_mystore | date\_dim | month\_seq | int | | N/A | | | | dw\_mystore | date\_dim | week\_seq | int | | N/A | | | | dw\_mystore | date\_dim | quarter\_seq | int | | N/A | | | | dw\_mystore | date\_dim | year | int | | N/A | | | | dw\_mystore | date\_dim | dow | int | | N/A | | | | dw\_mystore | date\_dim | moy | int | | N/A | | | | dw\_mystore | date\_dim | dom | int | | N/A | | | | dw\_mystore | date\_dim | qoy | int | | N/A | | | | dw\_mystore | date\_dim | fy\_year | int | | N/A | | | | dw\_mystore | date\_dim | fy\_quarter\_seq | int | | N/A | | | | dw\_mystore | date\_dim | fy\_week\_seq | int | | N/A | | | | dw\_mystore | date\_dim | day\_name | varchar(9) | | N/A | | | | dw\_mystore | date\_dim | quarter\_name | varchar(6) | | N/A | | | | dw\_mystore | date\_dim | holiday | varchar(1) | | N/A | | | | dw\_mystore | date\_dim | weekend | varchar(1) | | N/A | | | | dw\_mystore | date\_dim | following\_holiday | varchar(1) | | N/A | | | | dw\_mystore | date\_dim | first\_dom | int | | N/A | | | | dw\_mystore | date\_dim | last\_dom | int | | N/A | | | | dw\_mystore | date\_dim | same\_day\_ly | int | | N/A | | | | dw\_mystore | date\_dim | same\_day\_lq | int | | N/A | | | | dw\_mystore | date\_dim | current\_day | varchar(1) | | N/A | | | | dw\_mystore | date\_dim | current\_week | varchar(1) | | N/A | | | | dw\_mystore | date\_dim | current\_month | varchar(1) | | N/A | | | | dw\_mystore | date\_dim | current\_quarter | varchar(1) | | N/A | | | | dw\_mystore | date\_dim | current\_year | varchar(1) | | N/A | | | | dw\_mystore | date\_dim | dw\_insert\_date | timestamp | | N/A | SYSDATE on Insert | SYSDATE on Insert | | dw\_mystore | date\_dim | dw\_update\_date | timestamp | | N/A | SYSDATE on Update | SYSDATE on Update | ## Time Dim **Table Name:** time_dim **Description:** Each row in this table represents one second. **Load Frequency:** Static (load once) **Source:** Loaded once when the DW is deployed from a DDL file. | Target Schema: | Target Table: | Target Column: | Target Datatype: | Key: | SCD Type: | Transformation: | | -------------- | ------------- | ---------------- | ---------------- | ---- | --------- | ----------------- | | dw\_mystore | time\_dim | time\_key | int | PK | N/A | | | dw\_mystore | time\_dim | hour | int | NK | N/A | | | dw\_mystore | time\_dim | minute | int | NK | N/A | | | dw\_mystore | time\_dim | second | int | NK | N/A | | | dw\_mystore | time\_dim | am\_pm | varchar(2) | | N/A | | | dw\_mystore | time\_dim | shift | varchar(20) | | N/A | | | dw\_mystore | time\_dim | sub\_shift | varchar(20) | | N/A | | | dw\_mystore | time\_dim | meal\_time | varchar(20) | | N/A | | | dw\_mystore | time\_dim | dw\_insert\_date | timestamp | | N/A | SYSDATE on Insert | | dw\_mystore | time\_dim | dw\_update\_date | timestamp | | N/A | SYSDATE on Update | ## Customer Dim **Table Name:** customer_dim **Table Type:** Dimension **Description:** Each row in this table represents a customer. **Load Frequency:** Dynamic **Source Tables:** store_customer, store_customer_address **Source Table Joins:** `FROM stg_mystore.v_store_customer c LEFT OUTER JOIN stg_mystore.v_store_customer_address a ON c.address_id = a.address_id` **Transformation Notes:** All the columns from the store_customer table are treated as SCD1 in the target. All the columns from the store_customer_address table are treated as SCD2 in the target. Null source values are transformed to 'Unknown' or '#' in the target, depending on the target column width. Null source dates/timestamps are transformed to '01/01/1000'. The source data lake tables are HUDI tables. HUDI returns timestamps in the Unix time (epoch) format. The Unix time columns are converted via a SQL function and are encapsulated in views: v_store_customer and v_store_customer_address. | Source Data Lake Layer: | Source Database: | Source Table: | Source Column: | Source Datatype: | Target Schema: | Target Table: | Target Column: | Target Datatype: | Key: | SCD Type: | Transformation: | Note: | | ----------------------- | ------------------------------- | ------------------------ | ------------------ | ---------------- | -------------- | ------------- | -------------------------- | ---------------- | ---- | --------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------- | | | | | | | dw\_mystore | customer\_dim | customer\_key | int | PK | N/A | Create new surrogate key for new records and SCD2 changes. | | | clean | ara\_clean\_data\_ | store\_customer | customer\_id | string | dw\_mystore | customer\_dim | customer\_id | varchar(16) | NK | N/A | | | | clean | ara\_clean\_data\_ | store\_customer | salutation | string | dw\_mystore | customer\_dim | salutation | varchar(10) | | 1 | | The full history is updated on SCD1
 columns for a change in the source. | | clean | ara\_clean\_data\_ | store\_customer | first\_name | string | dw\_mystore | customer\_dim | first\_name | varchar(20) | | 1 | | | | clean | ara\_clean\_data\_ | store\_customer | last\_name | string | dw\_mystore | customer\_dim | last\_name | varchar(30) | | 1 | | | | clean | ara\_clean\_data\_ | store\_customer | birth\_country | string | dw\_mystore | customer\_dim | birth\_country | varchar(20) | | 1 | | | | clean | ara\_clean\_data\_ | store\_customer | email\_address | string | dw\_mystore | customer\_dim | email\_address | varchar(50) | | 1 | | | | clean | ara\_clean\_data\_ | store\_customer | birth\_date | date | dw\_mystore | customer\_dim | birth\_date | date | | 1 | | | | clean | ara\_clean\_data\_ | store\_customer | gender | string | dw\_mystore | customer\_dim | gender | varchar(10) | | 1 | if 'M' then 'Male'
 if 'F' then 'Female'
 else gender | | | clean | ara\_clean\_data\_ | store\_customer | marital\_status | string | dw\_mystore | customer\_dim | marital\_status | varchar(10) | | 1 | if 'D' then 'Divorced'
 if 'M' then 'Married'
 if 'S' then 'Single'
 if 'U' then 'Unknown'
 if 'W' then 'Widowed'
 else marital\_status | | | clean | ara\_clean\_data\_ | store\_customer | education\_status | string | dw\_mystore | customer\_dim | education\_status | varchar(10) | | 1 | | | | clean | ara\_clean\_data\_ | store\_customer | purchase\_estimate | bigint | dw\_mystore | customer\_dim | purchase\_estimate | int | | 1 | | | | clean | ara\_clean\_data\_ | store\_customer | credit\_rating | string | dw\_mystore | customer\_dim | credit\_rating | varchar(10) | | 1 | | | | clean | ara\_clean\_data\_ | store\_customer | buy\_potential | string | dw\_mystore | customer\_dim | buy\_potential | varchar(10) | | 1 | | | | clean | ara\_clean\_data\_ | store\_customer | vehicle\_count | bigint | dw\_mystore | customer\_dim | vehicle\_count | int | | 1 | | | | clean | ara\_clean\_data\_ | store\_customer | lower\_bound | bigint | dw\_mystore | customer\_dim | income\_band\_lower\_bound | int | | 1 | | | | clean | ara\_clean\_data\_ | store\_customer | upper\_bound | bigint | dw\_mystore | customer\_dim | income\_band\_upper\_bound | int | | 1 | | | | clean | ara\_clean\_data\_ | store\_customer | customer\_datetime | date | dw\_mystore | customer\_dim | start\_date | date | | 1 | f\_from\_unixtime(customer\_datetime) | | | clean | ara\_clean\_data\_ | store\_customer\_address | address\_id | string | dw\_mystore | customer\_dim | address\_id | varchar(16) | | 2 | | | | clean | ara\_clean\_data\_ | store\_customer\_address | address\_datetime | string | dw\_mystore | customer\_dim | address\_date | timestamp | | 2 | f\_from\_unixtime(address\_datetime) | | | clean | ara\_clean\_data\_ | store\_customer\_address | street | string | dw\_mystore | customer\_dim | street | varchar(20) | | 2 | | | | clean | ara\_clean\_data\_ | store\_customer\_address | city | string | dw\_mystore | customer\_dim | city | varchar(60) | | 2 | | | | clean | ara\_clean\_data\_ | store\_customer\_address | county | string | dw\_mystore | customer\_dim | county | varchar(30) | | 2 | | | | clean | ara\_clean\_data\_ | store\_customer\_address | state | string | dw\_mystore | customer\_dim | state | varchar(2) | | 2 | | | | clean | ara\_clean\_data\_ | store\_customer\_address | zip | string | dw\_mystore | customer\_dim | zip | varchar(10) | | 2 | | | | clean | ara\_clean\_data\_ | store\_customer\_address | country | string | dw\_mystore | customer\_dim | country | varchar(20) | | 2 | | | | clean | ara\_clean\_data\_ | store\_customer\_address | gmt\_offset | string | dw\_mystore | customer\_dim | gmt\_offset | numeric(5) | | 2 | | | | clean | ara\_clean\_data\_ | store\_customer\_address | location\_type | string | dw\_mystore | customer\_dim | location\_type | varchar(20) | | 2 | | | | | | | | | dw\_mystore | customer\_dim | scd\_start\_date | timestamp | | N/A | On SCD2 change:
\- store\_customer.customer\_datetime on first record.

\- store\_address.address\_datetime on the new record | | | | | | | | dw\_mystore | customer\_dim | scd\_end\_date | timestamp | | N/A | On SCD2 change:
 - store\_address.address\_datetime on the existing current record
 - '31/12/2999' on the new record | | | | | | | | dw\_mystore | customer\_dim | scd\_current\_flag | varchar(1) | | N/A | On SCD2 change:
 - 'N' on existing record
 - 'Y' on new record | | | | | | | | dw\_mystore | customer\_dim | dw\_insert\_date | timestamp | | N/A | SYSDATE on Insert | | | | | | | | dw\_mystore | customer\_dim | dw\_update\_date | timestamp | | N/A | SYSDATE on Update | | ## Sale Fact **Table Name:** sale_fact **Table Type:** Fact **Description:** Each row in this table represents a single line item from a sale. **Load Frequency:** Dynamic **Source Tables:** Lake: store_sale DW: date_dim, time_dim, customer_dim | Source Data Lake Layer: | Source Database: | Source Table: | Source Column: | Source Datatype: | Target Schema: | Target Table: | Target Column: | Target Datatype: | Key: | Transformation: | Note: | | ----------------------- | ---------------------------------- | ------------- | -------------------- | ---------------- | -------------- | ------------- | -------------------- | ---------------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- | | N/A | Sourced from DW dw\_mystore schema | date\_dim | date\_key | date | dw\_mystore | sale\_fact | sold\_date\_key | date | | Surrogate key lookup on:

 SELECT NVL(dd.date\_key, TO\_DATE('01-JAN-1900', 'DD-MON-YYYY'))
 FROM clean\_store\_sale ss
 LEFT OUTER JOIN dw\_mystore.date\_dim dd
 ON TRUNC(ss.sale\_datetime) = dd.day\_date | | | N/A | Sourced from DW dw\_mystore schema | time\_dim | time\_key | int | dw\_mystore | sale\_fact | sold\_time\_key | int | | Surrogate key lookup:

 SELECT NVL(td.time\_key, -1)
 FROM store\_sale ss
 LEFT OUTER JOIN dw\_mystore.time\_dim td
  ON EXTRACT(hour from ss.sale\_datetime) = td."hour"
 AND EXTRACT(min from ss.sale\_datetime) = td."minute"
 AND EXTRACT(sec from ss.sale\_datetime) = td."second" | | | clean | ara\_clean\_data\_ | store\_sale | sale\_datetime | timestamp | dw\_mystore | sale\_fact | sold\_date | timestamp | PK | | | | N/A | Sourced from DW dw\_mystore schema | customer\_dim | customer\_key | int | dw\_mystore | sale\_fact | customer\_key | int | | Surrogate key lookup:

 SELECT NVL(cd.customer\_key, -1)
 FROM store\_sale ss
 LEFT OUTER JOIN dw\_mystore.customer\_dim cd
 ON ss.customer\_id = cd.customer\_id
 AND ss.sold\_datetime >= cd.scd\_start\_date
 AND ss.sold\_datetime < cd.scd\_end\_date | | | clean | ara\_clean\_data\_ | store\_sale | ticket\_id | bigint | dw\_mystore | sale\_fact | ticket\_id | varchar(20) | PK | | | | clean | ara\_clean\_data\_ | store\_sale | item\_id | bigint | dw\_mystore | sale\_fact | item\_id | int | PK | | | | clean | ara\_clean\_data\_ | store\_sale | quantity | bigint | dw\_mystore | sale\_fact | quantity | int | | | | | clean | ara\_clean\_data\_ | store\_sale | wholesale\_cost | decimal(15,2) | dw\_mystore | sale\_fact | wholesale\_cost | decimal(15,2) | | | | | clean | ara\_clean\_data\_ | store\_sale | list\_price | decimal(15,2) | dw\_mystore | sale\_fact | list\_price | decimal(15,2) | | | | | clean | ara\_clean\_data\_ | store\_sale | sales\_price | decimal(15,2) | dw\_mystore | sale\_fact | sales\_price | decimal(15,2) | | | | | clean | ara\_clean\_data\_ | store\_sale | ext\_discount\_amt | decimal(15,2) | dw\_mystore | sale\_fact | ext\_discount\_amt | decimal(15,2) | | | | | clean | ara\_clean\_data\_ | store\_sale | ext\_sales\_price | decimal(15,2) | dw\_mystore | sale\_fact | ext\_sales\_price | decimal(15,2) | | | | | clean | ara\_clean\_data\_ | store\_sale | ext\_wholesale\_cost | decimal(15,2) | dw\_mystore | sale\_fact | ext\_wholesale\_cost | decimal(15,2) | | | | | clean | ara\_clean\_data\_ | store\_sale | ext\_list\_price | decimal(15,2) | dw\_mystore | sale\_fact | ext\_list\_price | decimal(15,2) | | | | | clean | ara\_clean\_data\_ | store\_sale | ext\_tax | decimal(15,2) | dw\_mystore | sale\_fact | ext\_tax | decimal(15,2) | | | | | clean | ara\_clean\_data\_ | store\_sale | coupon\_amt | decimal(15,2) | dw\_mystore | sale\_fact | coupon\_amt | decimal(15,2) | | | | | clean | ara\_clean\_data\_ | store\_sale | net\_paid | decimal(15,2) | dw\_mystore | sale\_fact | net\_paid | decimal(15,2) | | | | | clean | ara\_clean\_data\_ | store\_sale | net\_paid\_inc\_tax | decimal(15,2) | dw\_mystore | sale\_fact | net\_paid\_inc\_tax | decimal(15,2) | | | | | clean | ara\_clean\_data\_ | store\_sale | net\_profit | decimal(15,2) | dw\_mystore | sale\_fact | net\_profit | decimal(15,2) | | | | | | | | | | | customer\_dim | dw\_insert\_date | timestamp | | SYSDATE on Insert | | | | | | | | | customer\_dim | dw\_update\_date | timestamp | | SYSDATE on Update | |