|
What does a complete ETL lifecycle look like?
This is a real-life example and may need to be adjusted if you are using MERGE updates, in which case you will not drop tables etc. You also would have data warehouse labels as an option for providing consistency of data presentation between ETL runs for your users if you do not have to drop tables each ETL run. This example does not reflect that scenario.
A clear out and pull all data cycle is illustrated below.
1. SET SCHEMA to your existing data to be updated.
SET SCHEMA 'TESTING';2. DROP TABLE on the old stuff. This will also delete old indexes you had on the tables.
3. Compact the database on disk.
ALTER SYSTEM DEALLOCATE OLD;4. Get rid of your old foreign schema
DROP SCHEMA MYSQL_TESTING CASCADE;5. Get rid of your old foreign server connection definition
DROP SERVER jdbc_link_testing CASCADE;6. Recreate a foreign server connection definition
create server jdbc_link_testing
foreign data wrapper sys_jdbc
options(
driver_class 'com.mysql.jdbc.Driver',
url 'jdbc:mysql://172.27.0.204:3306/testing?useCursorFetch=true',
user_name 'sg',
password 'Complex38403BlahPass',
login_timeout '10',
fetch_size '1000',
validation_query 'select 1',
schema_name 'testing',
table_types 'TABLE');
7. Give LucidDB a schema where it can show the foreign tables
CREATE SCHEMA MYSQL_TESTING;
import foreign schema "testing" limit to ("FACT_ANSWERS","DIM_QUESTION","DIM_SUBMISSION_TYPE","FACT_VBP_CENSUS",
"FACT_SUBMISSION_RESULTS","DIM_PATIENT","DIM_HOSPITAL","DIM_DISEASE",
"DIM_READMISSION_REASON","DIM_DOCTOR")
from server jdbc_link_testing into MYSQL_TESTING;8. Switch active schema to the destination for your imported tables
SET SCHEMA 'TESTING';9. Perhaps use applib.create_table to help replicate the foreign tables from source to destination tables and schema
CALL applib.create_table_as(null,'DIM_SUBMISSION_TYPE','select * from "MYSQL_TESTING"."DIM_SUBMISSION_TYPE"',true);
CALL applib.create_table_as(null,'FACT_VBP_CENSUS','select * from "MYSQL_TESTING"."FACT_VBP_CENSUS"',true);
.....
10. Create indexes on all the columns involved in joining and filtering in all the tables.
11. Tell LucidDB to calculate statistics about indexes and all tables for the query planner.
....
ANALYZE TABLE DIM_QUESTION COMPUTE STATISTICS FOR ALL COLUMNS;
ANALYZE TABLE FACT_ANSWERS COMPUTE STATISTICS FOR ALL COLUMNS;
... more tables
12. Grant access to users for the new schema you created during the process if necessary
CALL APPLIB.GRANT_SELECT_FOR_SCHEMA( 'TESTING', 'pentaho_lucid' );
|
|