Diagram the following query:
SELECT ...
FROM Customers C, ZIP_Codes Z, ZIP_Demographics D, Regions R
WHERE C.ZIP_Code=Z.ZIP_Code
AND Z.Demographic_ID=D.Demographic_ID
AND Z.Region_ID=R.Region_ID
AND C.Active_Flag='Y'
AND C.Profiled_Flag='N'
AND R.Name='SOUTHWEST'
AND D.Name IN ('YUPPIE', 'OLDMONEY');Make the usual assumptions about primary-key names, except that the primary key of ZIP_Codes is simply ZIP_Code, and note that the Name columns of both REGIONS and ZIP_Demographics are also uniquely indexed. You have 5,000,000 rows in Customers, 250,000 rows in Zip_Codes, 20 rows in ZIP_Demographics, and 5 rows in Regions. Assume all foreign keys are never null and always point to valid primary keys. The following query returns 2,000,000 rows:
SELECT COUNT(*) FROM Customers C WHERE Active_Flag='Y' AND Profiled_Flag='N';
Diagram the following query:
SELECT ...
FROM Regions R, Zip_Codes Z, Customers C, Customer_Mailings CM,
Mailings M, Catalogs Cat, Brands B
WHERE R.Region_ID(+)=Z.Region_ID
AND Z.ZIP_Code(+)=C.ZIP_Code
AND C.Customer_ID=CM.Customer_ID
AND CM.Mailing_ID=M.Mailing_ID
AND M.Catalog_ID=Cat.Catalog_ID
AND Cat.Brand_ID=B.Brand_ID
AND B.Name='OhSoGreen'
AND M.Mailing_Date >= SYSDATE-365
GROUP BY... ORDER BY ...Start with the same assumptions and statistics as in Exercise 1. Customer_Mailings contains 30,000,000 rows. Mailings contains 40,000 rows. Catalogs contains 200 rows. Brands contains 12 rows and has an alternate unique key on Name. The following query returns 16,000 rows:
SELECT COUNT(*) FROM Mailings M WHERE Mailing_Date >= SYSDATE-365;
Diagram the following query:
SELECT ...
FROM Code_Translations SPCT, Code_Translations TRCT, Code_Translations CTCT,
Products P, Product_Lines PL, Inventory_Values IV, Brands B,
Product_Locations Loc, Warehouses W, Regions R,
Inventory_Taxing_Entities ITx, Inventory_Tax_Rates ITxR, Consignees C
WHERE W.Region_ID=R.Region_ID
AND Loc.Warehouse_ID=W.Warehouse_ID
AND W.Inventory_Taxing_Entity_ID=ITx.Inventory_Taxing_Entity_ID
AND ITx.Inventory_Taxing_Entity_ID= ITxR.Inventory_Taxing_Entity_ID
AND ITxR.Effective_Start_Date <= SYSDATE
AND ITxR.Effective_End_Date > SYSDATE
AND ITxR.Rate>0
AND P.Product_ID=Loc.Product_ID
AND Loc.Quantity>0
AND P.Product_Line_ID=PL.Product_Line_ID(+)
AND P.Product_ID=IV.Product_ID
AND P.Taxable_Inventory_Flag='Y'
AND P.Consignee_ID=C.Consignee_ID(+)
AND P.Strategic_Product_Code=SPCT.Code
AND SPCT.Code_Type='STRATEGIC_PRODUCT'
AND P.Turnover_Rate_Code=TRCT.Code
AND TRCT.Code_Type='TURNOVER_RATE'
AND P.Consignment_Type_Code=CTCT.CODE
AND CTCT.Code_Type='CONSIGNMENT_TYPE'
AND IV.Effective_Start_Date <= SYSDATE
AND IV.Effective_End_Date > SYSDATE
AND IV.Unit_Value>0
AND P.Brand_ID=B.Brand_ID
AND B.Name='2Much$'
AND ITX.Tax_Day_Of_Year='DEC31'
GROUP BY... ORDER BY ...Start with the same assumptions and statistics as in Exercises 1 and 2, except that W.Inventory_Taxing_Entity_ID points to a valid taxing entity only when it is not null, which is just 5% of the time. The counts for table rows are as follows:
Code_Translations has a two-part primary key: Code_Type, Code.
Inventory_Values and Inventory_Tax_Rates have a time-dependent primary key consisting of an ID and an effective date range, such that any given date falls in a single date range for any value of the key ID. Specifically, the join conditions to each of these tables are guaranteed to be unique by the Effective_Start_Date and Effective_End_Date conditions, which are part of the joins, not separate filters. (Unfortunately, there is no convenient way to enforce that uniqueness through an index; it is a condition created by the application.) The following queries return the rowcounts shown in the lines that follow each query:
Q1: SELECT COUNT(*) A1 FROM Inventory_Taxing_Entities ITx
WHERE ITx.Tax_Day_Of_Year='DEC31'
A1: 2
Q2: SELECT COUNT(*) A2 FROM Inventory_Values IV
WHERE IV.Unit_Value>0
AND IV.Effective_Start_Date <= SYSDATE
AND IV.Effective_End_Date > SYSDATE
A2: 7,400
Q3: SELECT COUNT(*) A3 FROM Products P
WHERE P.Taxable_Inventory_Flag='Y'
A3: 8,300
Q4: SELECT COUNT(*) A4 FROM Product_Locations Loc
WHERE Loc.Quantity>0
A4: 123,000
Q5: SELECT COUNT(*) A5 FROM Inventory_Tax_Rates ITxR
WHERE ITxR.RATE>0
AND ITxR.Effective_Start_Date <= SYSDATE
AND ITxR.Effective_End_Date > SYSDATE
A5: 4
Q6: SELECT COUNT(*) A6 FROM Inventory_Values IV
WHERE IV.Effective_Start_Date <= SYSDATE
AND IV.Effective_End_Date > SYSDATE
A6: 8,500
Q7: SELECT COUNT(*) A7 FROM INVENTORY_TAX_RATES ITxR
WHERE ITxR.Effective_Start_Date <= SYSDATE
AND ITxR.Effective_End_Date > SYSDATE
A7: 4
Q8: SELECT COUNT(*) A8 FROM Code_Translations SPCT
WHERE Code_Type = 'STRATEGIC_PRODUCT'
A8: 3
Q9: SELECT COUNT(*) A9 FROM Code_Translations TRCT
WHERE Code_Type = 'TURNOVER_RATE'
A9: 2
Q10: SELECT COUNT(*) A10 FROM CTCT
WHERE Code_Type = 'CONSIGNMENT_TYPE'
A10: 3Fully simplify the query diagram for Exercise 1. Try starting from the queries and the query statistics, rather than from the full query diagrams. Then, compare your result with what you get when you start from the full query diagrams that you already did.
Fully simplify the query diagram for Exercise 2, following the guidelines in Exercise 4.
Fully simplify the query diagram for Exercise 3, following the guidelines in Exercise 4.