PACKAGES

CHAID | partykit | ggparty

Decision Tree: CHAID

chaid: CHi-squared Automated Interaction Detection

r.chaid uses the CHAID algorithm (CHi-squared Automated Interaction Detection [1]) to build a predictive decision-tree classification model. Only nominal or ordinal variables are accepted.
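As a rough illustration of the statistic that gives the algorithm its name, the following minimal sketch computes Pearson's chi-squared statistic for a contingency table of a predictor against the response. The helper name chi_squared and the counts are hypothetical and are not part of r.chaid or of the CHAID package.

```ruby
# Pearson chi-squared statistic for an r x c contingency table.
# table: array of rows, each row an array of observed counts.
# NOTE: chi_squared is a hypothetical helper, not part of r.chaid.
def chi_squared(table)
  row_totals = table.map(&:sum)
  col_totals = table.transpose.map(&:sum)
  total = row_totals.sum.to_f

  statistic = 0.0
  table.each_with_index do |row, i|
    row.each_with_index do |observed, j|
      # Expected count under independence of rows and columns
      expected = row_totals[i] * col_totals[j] / total
      next if expected.zero?
      statistic += (observed - expected)**2 / expected
    end
  end

  df = (table.size - 1) * (col_totals.size - 1)
  { statistic: statistic, df: df }
end

# Made-up counts: rows = overtime (yes/no), columns = attrition (yes/no)
result = chi_squared([[30, 70], [20, 180]])
puts format("chi2 = %.3f, df = %d", result[:statistic], result[:df])
```

A large statistic relative to its degrees of freedom suggests that the predictor and the response are associated, which is what makes a predictor a candidate for splitting a node.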

Arguments: [2]

  • varname => varlist: a hash mapping the dependent variable to the list of independent variables
  • :scale => varlist: the individual variables that must be treated as ordinal
  • :weight => varname: the name of a weighting variable
  • :tree => :party: the method used to draw the tree plot (default: :party)
  • alpha2: significance level used for merging predictor categories (step 2)
  • alpha3: if set to a positive value $< 1$, the significance level used for splitting previously merged categories of the predictor (step 3); otherwise step 3 is omitted (the default)
  • alpha4: significance level used for splitting a node on the most significant predictor (step 5)
  • minsplit: number of observations in the split response at which no further split is desired
  • minbucket: minimum number of observations in terminal nodes
  • minprob: minimum frequency of observations in terminal nodes
  • stump: only root node splits are performed
  • maxheight: maximum height of the tree
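To give an idea of how alpha2 acts in the merging step (step 2), here is a minimal sketch of a single merge pass for a nominal predictor and a binary response. It is an illustration under simplifying assumptions, not the CHAID package's implementation: the helper names, the counts, and the value 0.05 are hypothetical, the p-value formula is valid only for one degree of freedom (binary response), and no Bonferroni adjustment is applied.

```ruby
# Chi-squared statistic for a 2x2 table [[a, b], [c, d]] (shortcut formula).
def chi2_2x2(a, b, c, d)
  n = a + b + c + d
  denom = (a + b) * (c + d) * (a + c) * (b + d)
  return 0.0 if denom.zero?
  n * (a * d - b * c)**2 / denom.to_f
end

# Upper-tail p-value of a chi-squared variable with 1 degree of freedom.
def p_value_df1(chi2)
  Math.erfc(Math.sqrt(chi2 / 2.0))
end

# One merge pass: find the pair of categories whose response distributions
# are least significantly different and merge it if its p-value exceeds alpha2.
# counts: { category => [n_positive, n_negative] }
def merge_once(counts, alpha2)
  pair = counts.keys.combination(2).max_by do |x, y|
    p_value_df1(chi2_2x2(*counts[x], *counts[y]))
  end
  return counts if pair.nil?

  x, y = pair
  p_val = p_value_df1(chi2_2x2(*counts[x], *counts[y]))
  return counts if p_val <= alpha2 # the pair differs significantly: keep both

  merged = counts.reject { |k, _| k == x || k == y }
  merged["#{x}+#{y}"] = counts[x].zip(counts[y]).map(&:sum)
  merged
end

# Made-up counts of attrition [yes, no] per marital status
counts = {
  "single"   => [40, 60],
  "married"  => [20, 80],
  "divorced" => [21, 79]
}
merged = merge_once(counts, 0.05)
puts merged.inspect
```

With these counts, "married" and "divorced" have nearly identical attrition rates, so they are merged into one category, while "single" stays separate. A second pass finds no pair with a p-value above alpha2 and leaves the categories unchanged. A larger alpha2 makes merging harder and therefore keeps more categories apart.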

Available tables:

  • :analysis: summary of the analysis parameters
  • :nodes: table of the generated nodes
  • :varimp: relative importance of the variables

Available plots:

  • :tree: the tree plot
  • :varimp: relative importance of the variables

Warning

If the CHAID package fails to install, install it directly from R with:

install.packages("CHAID", repos="http://R-Forge.R-project.org")
# Independent variables (predictors) used to model attrition
deplist = ["businesstravel", "department", "education", "educationfield", "environmentsatisfaction",
           "gender", "jobinvolvement", "joblevel", "jobrole", "jobsatisfaction", "maritalstatus", "numcompaniesworked",
           "overtime", "performancerating", "relationshipsatisfaction", "stockoptionlevel", "trainingtimeslastyear", "worklifebalance"]
# Build the CHAID tree for attrition, limiting the tree height to 3
r.chaid :attrition => deplist, :maxheight => 3
_images/chaid_1.png
_images/chaid_tree.png
_images/chaid_2.png
_images/chaid_imp.png
_images/chaid_3.png


Notes

[1] G. V. Kass (1980). An Exploratory Technique for Investigating Large Quantities of Categorical Data. Journal of the Royal Statistical Society, Series C (Applied Statistics), Vol. 29, No. 2, pp. 119-127. Wiley.
[2] See Analisi (Analysis) for the list of general parameters.