6.6 EXERCISES
Patient data was collected concerning the diagnosis of cold or flu (Table 6.21).
- Calculate the Euclidean distance (replacing None with 0, Mild with 1 and Severe with 2) using the variables: Fever, Headaches, General aches, Weakness, Exhaustion, Stuffy nose, Sneezing, Sore throat, Chest discomfort, for the following pairs of patient observations from Table 6.21:
- 1326 and 398
- 1326 and 1234
- 6377 and 2662
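The distance calculation in this exercise can be sketched in a few lines of Python. The helper names (`encode`, `euclidean`) and the two patients below are hypothetical and are not rows from Table 6.21; only the None/Mild/Severe encoding comes from the exercise itself.

```python
import math

# Ordinal encoding stated in the exercise.
SEVERITY = {"None": 0, "Mild": 1, "Severe": 2}

def encode(symptoms):
    """Map a list of severity labels to their numeric codes."""
    return [SEVERITY[s] for s in symptoms]

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Made-up patients for illustration only (NOT taken from Table 6.21).
# Variable order: Fever, Headaches, General aches, Weakness, Exhaustion,
# Stuffy nose, Sneezing, Sore throat, Chest discomfort.
patient_a = encode(["None", "Mild", "None", "None", "None",
                    "Mild", "Severe", "Severe", "Mild"])
patient_b = encode(["Severe", "Severe", "Mild", "Severe", "Severe",
                    "None", "None", "Severe", "Severe"])

# Squared differences sum to 20, so the distance is sqrt(20).
print(euclidean(patient_a, patient_b))
```

Substituting the actual rows for each pair of patient identifiers gives the requested distances.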
- The patient observations described in Table 6.21 are being clustered using agglomerative hierarchical clustering. The Euclidean distance is used to calculate the distance between observations using the following variables: Fever, Headaches, General aches, Weakness, Exhaustion, Stuffy nose, Sneezing, Sore throat, Chest discomfort (replacing None with 0, Mild with 1 and Severe with 2). The average linkage joining rule is being used to create the hierarchical clusters. During the clustering process observations 6377 and 2662 are already grouped together. Calculate the distance from observation 398 to this group.
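As a rough sketch of the average linkage rule, the distance between two clusters can be taken as the mean of the Euclidean distances over every cross-cluster pair of observations; for a single observation against a two-member group, this is the mean of two distances. The function names and the encoded vectors below are hypothetical and do not come from Table 6.21.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(cluster_a, cluster_b):
    """Mean pairwise distance between members of two clusters."""
    total = sum(euclidean(p, q) for p in cluster_a for q in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))

# Made-up, already-encoded observations (NOT rows from Table 6.21).
group = [
    [0, 1, 0, 0, 0, 1, 2, 2, 1],
    [0, 0, 0, 1, 0, 2, 2, 1, 0],
]
outsider = [2, 2, 1, 2, 2, 0, 0, 2, 2]

# A single observation is treated as a cluster of size one.
distance = average_linkage([outsider], group)
print(distance)  # mean of sqrt(20) and sqrt(27)
```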
- A candidate rule has been extracted using the associative rule method from Table 6.21:
If Exhaustion = None AND
Stuffy nose = Severe
THEN Diagnosis = cold
Calculate the support, confidence, and lift for this rule.
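The three measures can be sketched as follows, assuming the commonly used definitions: support is the fraction of observations matching both the IF and THEN parts, confidence is that count divided by the number matching the IF part, and lift is confidence divided by the overall fraction matching the THEN part. The book's exact definitions may differ in detail, and the function name `rule_metrics` and the rows below are hypothetical, not data from the table.

```python
def matches(row, conditions):
    """True if the row satisfies every variable = value condition."""
    return all(row.get(k) == v for k, v in conditions.items())

def rule_metrics(rows, antecedent, consequent):
    """Return (support, confidence, lift) for IF antecedent THEN consequent.

    Assumes at least one row matches the antecedent and at least one
    matches the consequent; otherwise a division by zero occurs.
    """
    n = len(rows)
    n_if = sum(matches(r, antecedent) for r in rows)
    n_then = sum(matches(r, consequent) for r in rows)
    n_both = sum(matches(r, antecedent) and matches(r, consequent)
                 for r in rows)
    support = n_both / n
    confidence = n_both / n_if
    lift = confidence / (n_then / n)
    return support, confidence, lift

# Made-up observations for illustration only.
rows = [
    {"Exhaustion": "None", "Stuffy nose": "Severe", "Diagnosis": "cold"},
    {"Exhaustion": "None", "Stuffy nose": "Severe", "Diagnosis": "cold"},
    {"Exhaustion": "None", "Stuffy nose": "Severe", "Diagnosis": "flu"},
    {"Exhaustion": "Severe", "Stuffy nose": "None", "Diagnosis": "flu"},
    {"Exhaustion": "Mild", "Stuffy nose": "Mild", "Diagnosis": "cold"},
    {"Exhaustion": "Severe", "Stuffy nose": "Mild", "Diagnosis": "flu"},
    {"Exhaustion": "None", "Stuffy nose": "Mild", "Diagnosis": "cold"},
    {"Exhaustion": "Severe", "Stuffy nose": "None", "Diagnosis": "flu"},
]

support, confidence, lift = rule_metrics(
    rows,
    antecedent={"Exhaustion": "None", "Stuffy nose": "Severe"},
    consequent={"Diagnosis": "cold"},
)
print(support, confidence, lift)
```

With these made-up rows, 2 of 8 observations match the full rule, 3 match the IF part, and 4 are diagnosed as cold, so support is 2/8, confidence is 2/3, and lift is (2/3) / (4/8).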
- Table 6.21 is to be used to build a decision tree to classify whether a patient has a cold or flu. As part of this process the Fever column is being considered as a splitting point. Two potential splitting values are being considered:
- Where the ...