The model has been explored and tested for multiple parameters on real and simulated datasets. The research follows the outline below, with a separate notebook for each part.
Notebook Outline:
Introduction Notebook (current)
As prefaced earlier, the Geographically Weighted Regression (GWR) model in PySAL can currently estimate Gaussian, Poisson and Logistic models, whereas the Multiscale extension of GWR (MGWR) is currently limited to Gaussian models. This part of the project aims to extend the MGWR model to nonlinear local spatial regression where the response outcome is binomial (a Logit model). This will provide a richer, more holistic local statistical modeling framework for modeling multi-scale process heterogeneity for the open source community.
A conventional Logistic regression model with $x_1, x_2, ... ,x_k$ as predictors, a binary (Bernoulli) response variable $y$, and $l$ denoting the log-odds of the event that $y=1$, can be written as:
\begin{align} l = \log_b \left( \frac{p}{1-p} \right) = \beta_0 + \sum_{k} \beta_k x_{k,i} \end{align}
where $x_{k,i}$ is the $k$th explanatory variable at location $i$, $\beta_0$ is the intercept, the $\beta_k$ are the parameters, and $p$ is the probability such that $p = P(Y = 1)$.
By exponentiating the log-odds:
$p / (1-p) = b^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}$
It follows that the probability that $Y = 1$ is:
$p = \frac{b^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}{b^{\beta_0 + \beta_1 x_1 + \beta_2 x_2} + 1} = \frac{1}{1 + b^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2)}}$
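As a quick numerical check of the relationship between the log-odds and the probability, the short sketch below evaluates both forms of $p$ for a two-predictor model. It assumes the natural base ($b = e$), and the coefficient and predictor values are purely illustrative, not taken from the project. Note that in the second form the whole linear predictor is negated in the exponent.

```python
import numpy as np

def log_odds(beta, x):
    """Linear predictor l = beta_0 + beta_1 * x_1 + ... + beta_k * x_k."""
    return beta[0] + np.dot(beta[1:], x)

def prob_from_log_odds(l):
    """Inverse logit with base e: p = 1 / (1 + exp(-l))."""
    return 1.0 / (1.0 + np.exp(-l))

# Illustrative (hypothetical) values for a two-predictor model
beta = np.array([-1.0, 0.5, 2.0])   # beta_0, beta_1, beta_2
x = np.array([1.2, 0.3])            # x_1, x_2

l = log_odds(beta, x)
p = prob_from_log_odds(l)                   # 1 / (1 + e^{-l})
p_alt = np.exp(l) / (np.exp(l) + 1.0)       # e^{l} / (e^{l} + 1)

# Both forms agree, and taking the log-odds of p recovers l
print(p, p_alt, np.log(p / (1.0 - p)), l)
```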
Following the technique of Hastie & Tibshirani (1986) for logistic generalized additive models, the model was estimated using the local scoring algorithm as follows:
These steps are repeated until the relative change in the fitted coefficients and the functions is below a tolerance threshold (1e-05 in this case).
Reference for these equations: http://ugrad.stat.ubc.ca/~nancy/526_2003/projects/kazi2.pdf
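The iteration described above can be sketched for the simplest case, in which every term is linear. Under that assumption, local scoring reduces to iteratively reweighted least squares (IRLS): form an adjusted response and weights from the current fit, refit by weighted least squares, and repeat until the relative change in the coefficients falls below the tolerance. This is a minimal illustration and not the project's MGWR implementation; in a generalized additive or multiscale setting, the weighted least squares step would be replaced by backfitting of smoothers. The function name `fit_logistic_irls`, the simulated data, and the "true" coefficients are all hypothetical.

```python
import numpy as np

def fit_logistic_irls(X, y, tol=1e-5, max_iter=50):
    """Fit a logistic regression by IRLS (local scoring with linear terms)."""
    X = np.column_stack([np.ones(len(y)), X])   # prepend intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        eta = X @ beta                           # current linear predictor
        p = 1.0 / (1.0 + np.exp(-eta))           # current fitted probabilities
        w = np.clip(p * (1.0 - p), 1e-10, None)  # working weights
        z = eta + (y - p) / w                    # adjusted (working) response
        XtW = X.T * w                            # X^T W without forming W
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        # Relative change in the fitted coefficients
        change = np.linalg.norm(beta_new - beta) / (np.linalg.norm(beta_new) + 1e-12)
        beta = beta_new
        if change < tol:
            break
    return beta

# Simulated check with assumed (illustrative) coefficients
rng = np.random.default_rng(0)
n = 5000
X_sim = rng.normal(size=(n, 2))
true_beta = np.array([-0.5, 1.0, -2.0])
p_sim = 1.0 / (1.0 + np.exp(-(true_beta[0] + X_sim @ true_beta[1:])))
y_sim = rng.binomial(1, p_sim)

beta_hat = fit_logistic_irls(X_sim, y_sim)
print(beta_hat)
```

The default tolerance of `1e-5` mirrors the threshold mentioned above. The clipping of the weights is a defensive choice to avoid division by zero when fitted probabilities approach 0 or 1.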
In Monte Carlo tests with simulated data, the parameter estimates from the model are close to their expected values. Further exploration is required to justify the model theoretically in the context of spatial data models, especially MGWR.
As an exploration, this work includes results from calibrating the model both with and without a stochastic error term. Results for both are shown in the notebooks below.
Initial module changes and univariate model check
Monte Carlo Simulation Visualization
Monte Carlo Simulation Visualization