Automatic Rationalization with Initial Seed Location
ARiSeL, proposed by [Duque_Church2004] , aggregates N areas into P spatially contiguous regions while minimizing intra-regional heterogeneity (measured as the within-cluster sum of squares from each area to the attribute centroid of its cluster). This algorithm is a modification of Openshaw’s AZP-tabu [Openshaw_Rao1995]. In ARISeL the construction of a initial feasible solution is repeated several times (inits) before running Tabu Search algorithm [Glover1977].
Duque and Church argue that:
- constructing and initial feasible solution is computationally less expensive than performing local search.
- local search by moving bordering areas between region do not allow an extensive search in the solution space and it is computationally expensive.
Based on those two ideas, the authors propose to generate as many different initial feasible solutions and run Tabu search on the best initial solution obtained so far.
The initial solution follows a “growing regions” strategy. It starts with a initial set of seeds (as many seed as regions) selected using the K-means++ algorithm. From those seeds, other neighbouring areas are assigned to its closest (in attribute space) growing region. This strategy has proven better results.
Layer.cluster('arisel',vars,regions,<wType>,<std>,<inits>,<initialSolution>,<convTabu>,<tabuLength>,<dissolve>,<dataOperations>)
| Parameters: |
|
|---|
The dictionary structure must be as showed bellow.
>>> X = {}
>>> X[variableName1] = [function1, function2,....]
>>> X[variableName2] = [function1, function2,....]
Where functions are strings wich represents the name of the functions to be used on the given variableName. Functions could be,’sum’,’mean’,’min’,’max’,’meanDesv’,’stdDesv’,’med’, ‘mode’,’range’,’first’,’last’,’numberOfAreas. By deffault just ID variable is added to the dissolved map.
Automatic Zoning Procedure (AZP)
AZP is a mildly steepest descent algorithm that aggregates N zones (areas) into M regions. “The M output regions should be formed of internally connected, contiguous, zones.” ([Openshaw_Rao1995] pp 428).
In [Openshaw_Rao1995] the objective function is not defined because AZP can be applied to any function, F(Z). “F(Z) can be any function defined on data for the M regions in Z, and Z is the allocation of each of N zones to one of M regions such that each zone is assigned to only one region” ([Openshaw_Rao1995] pp 428).” In clusterPy we Minimize F(Z), where Z is the within-cluster sum of squares from each area to the attribute centroid of its cluster.
NOTE: The original algorithm proposes to start from a random initial feasible solution. Previous computational experience showed us that this approach leads to poor quality solutions. In clusterPy we started from an initial solution that starts with a initial set of seeds (as many seed as regions) selected using the K-means++ algorithm. From those seeds, other neighbouring areas are assigned to its closest (in attribute space) growing region. This strategy has proven better results.
Layer.cluster('azp',vars,regions,<wType>,<std>,<initialSolution>,<dissolve>,<dataOperations>)
| Parameters: |
|
|---|
The dictionary structure must be as showed bellow.
>>> X = {}
>>> X[variableName1] = [function1, function2,....]
>>> X[variableName2] = [function1, function2,....]
Where functions are strings wich represents the name of the functions to be used on the given variableName. Functions could be,’sum’,’mean’,’min’,’max’,’meanDesv’,’stdDesv’,’med’, ‘mode’,’range’,’first’,’last’,’numberOfAreas. By deffault just ID variable is added to the dissolved map.
Simulated Annealing variant of Automatic Zoning Procedure (AZP-SA)
AZP-SA aggregates N zones (areas) into M regions. “The M output regions should be formed of internally connected, contiguous, zones.” ([Openshaw_Rao1995] pp 428).
AZP-SA is a variant of the AZP algorithm that incorporates a seach process, called Simulated Annealing algorithm [Kirkpatrick_Gelatt_Vecchi1983]. Simulated annealing algorithm “permits moves which result in a worse value of the objective function but with a probability that diminishes gradually, through iteration time” ([Openshaw_Rao1995] pp 431).
In Openshaw and Rao (1995) the objective function is not defined because AZP-Tabu can be applied to any function, F(Z). “F(Z) can be any function defined on data for the M regions in Z, and Z is the allocation of each of N zones to one of M regions such that each zone is assigned to only one region” ([Openshaw_Rao1995] pp 428).” In clusterPy we Minimize F(Z), where Z is the within-cluster sum of squares from each area to the attribute centroid of its cluster.
In order to make the cooling schedule robust the units of measure of the objective function, we set the Boltzmann’s equation as: R(0,1) < exp((-(Candidate Solution - Current Solution) / Current Solution)/T(k)). The cooling schedule is T(k) = 0.85 T(k-1) ([Openshaw_Rao1995] pp 431), with an initial temperature T(0)=1.
NOTE: The original algorithm proposes to start from a random initial feasible solution. Previous computational experience showed us that this approach leads to poor quality solutions. In clusterPy we started from an initial solution that starts with a initial set of seeds (as many seed as regions) selected using the K-means++ algorithm. From those seeds, other neighbouring areas are assigned to its closest (in attribute space) growing region. This strategy has proven better results.
layer.cluster('azpSa',vars,regions,<wType>,<std>,<initialSolution>,<maxit>,<dissolve>,<dataOperations>)
| Parameters: |
|
|---|
The dictionary structure must be as showed bellow.
>>> X = {}
>>> X[variableName1] = [function1, function2,....]
>>> X[variableName2] = [function1, function2,....]
Where functions are strings wich represents the name of the functions to be used on the given variableName. Functions could be,’sum’,’mean’,’min’,’max’,’meanDesv’,’stdDesv’,’med’, ‘mode’,’range’,’first’,’last’,’numberOfAreas. By deffault just ID variable is added to the dissolved map.
Tabu variant of Automatic Zoning Procedure (AZP-Tabu)
AZP-Tabu aggregates N zones (areas) into M regions. “The M output regions should be formed of internally connected, contiguous, zones.” ([Openshaw_Rao1995] pp 428).
AZP-Tabu is a variant of the AZP algorithm that incorporates a search process, called Tabu algorithm [Glover1977]. Tabu “allows the search process to escape from local optima whilst avoiding cyclical behaviour.” ([Openshaw_Rao1995] pp 432).
In Openshaw and Rao (1995) the objective function is not defined because AZP-Tabu can be applied to any function, F(Z). “F(Z) can be any function defined on data for the M regions in Z, and Z is the allocation of each of N zones to one of M regions such that each zone is assigned to only one region” ([Openshaw_Rao1995] pp 428).” In clusterPy we Minimize F(Z), where Z is the within-cluster sum of squares from each area to the attribute centroid of its cluster.
Openshaw and Rao (1995) do not specify a value for the two most important parameters of Tabu seach: length of the tabu list and convergence criteria. See [Duque_Anselin_Rey2010] for an evaluation of the performance of tabu search within the context of spatial clustering.
NOTE: The original algorithm proposes to start from a random initial feasible solution. Previous computational experience showed us that this approach leads to poor quality solutions. In clusterPy we started from an initial solution that starts with a initial set of seeds (as many seed as regions) selected using the K-means++ algorithm. From those seeds, other neighbouring areas are assigned to its closest (in attribute space) growing region. This strategy has proven better results.
layer.cluster('azpTabu',vars,regions,<wType>,<std>,<initialSolution>,<convTabu>,<tabuLength>,<dissolve>,<dataOperations>)
| Parameters: |
|
|---|
The dictionary structure must be as showed bellow.
>>> X = {}
>>> X[variableName1] = [function1, function2,....]
>>> X[variableName2] = [function1, function2,....]
Where functions are strings which represents the name of the functions to be used on the given variableName. Functions could be,’sum’,’mean’,’min’,’max’,’meanDesv’,’stdDesv’,’med’, ‘mode’,’range’,’first’,’last’,’numberOfAreas. By default just ID variable is added to the dissolved map.
Reactive tabu variant of Automatic Zoning Procedure (AZP-R-Tabu)
AZP-R-Tabu aggregates N zones (areas) into M regions. “The M output regions should be formed of internally connected, contiguous, zones.” ([Openshaw_Rao1995] pp 428).
AZP-R-Tabu is a variant of the AZP algorithm that incorporates a seach process, called Reactive Tabu Search algorithm [Battiti_Tecchiolli1994]. The main difference between the reactive tabu and the tabu search, devised by [Glover1977] , is that the former does not require to define the number of times a reverse move is prohibited (tabuLength). This parameter is dynamically adjusted by the algorithm.
In [Openshaw_Rao1995] the objective function is not defined because AZP-Tabu can be applied to any function, F(Z). “F(Z) can be any function defined on data for the M regions in Z, and Z is the allocation of each of N zones to one of M regions such that each zone is assigned to only one region” ([Openshaw_Rao1995] pp 428).” In clusterPy we Minimize F(Z), where Z is the within-cluster sum of squares from each area to the attribute centroid of its cluster.
NOTE: The original algorithm proposes to start from a random initial feasible solution. Previous computational experience showed us that this approach leads to poor quality solutions. In clusterPy we started from an initial solution that starts with a initial set of seeds (as many seed as regions) selected using the K-means++ algorithm. From those seeds, other neighbouring areas are assigned to its closest (in attribute space) growing region. This strategy has proven better results.
layer.cluster('azpRTabu',vars,regions,<wType>,<std>,<initialSolution>,<convTabu>,<dissolve>,<dataOperations>)
| Parameters: |
|
|---|
The dictionary structure must be as showed bellow.
>>> X = {}
>>> X[variableName1] = [function1, function2,....]
>>> X[variableName2] = [function1, function2,....]
Where functions are strings wich represents the name of the functions to be used on the given variableName. Functions could be,’sum’,’mean’,’min’,’max’,’meanDesv’,’stdDesv’,’med’, ‘mode’,’range’,’first’,’last’,’numberOfAreas. By deffault just ID variable is added to the dissolved map.
Generate random regions
This algorithm aggregates, at random, a set of areas into a predefined number of spatially contiguous regions.
layer.cluster('random',vars,regions,<wType>,<dissolve>,<dataOperations>)
| Parameters: |
|
|---|
The dictionary structure must be as showed bellow.
>>> X = {}
>>> X[variableName1] = [function1, function2,....]
>>> X[variableName2] = [function1, function2,....]
Where functions are strings which represents the name of the functions to be used on the given variableName. Functions could be,’sum’,’mean’,’min’,’max’,’meanDesv’,’stdDesv’,’med’, ‘mode’,’range’,’first’,’last’,’numberOfAreas. By default just ID variable is added to the dissolved map.