Endogenous number of regions (defined within the algorithm)

Max-p Tabu [Duque_Anselin_Rey2010]

clusterpy0_9_9.clusterpy.core.toolboxes.cluster.maxpTabu.execMaxpTabu(y, w, threshold=100.0, maxit=2, tabuLength=5, typeTabu='exact')

Max-p-regions model (Tabu)

The max-p-regions model, devised by [Duque_Anselin_Rey2010], clusters a set of geographic areas into the maximum number of homogeneous regions such that the value of a spatially extensive regional attribute is above a predefined threshold value. In clusterPy, heterogeneity is measured as the within-cluster sum of squared distances from each area to the attribute centroid of its cluster.
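The two quantities in this definition can be illustrated with a short, self-contained sketch. It is not clusterPy's code: the function names `within_cluster_ss` and `is_feasible` are hypothetical, and treating "above the threshold" as "greater than or equal to" is an assumption based on the threshold being described as a minimum value.

```python
def within_cluster_ss(values, labels):
    """Sum of squared distances from each area to its region's attribute centroid.

    values: list of attribute vectors, one per area.
    labels: list of region ids, one per area (same order as values).
    """
    # Group the attribute vectors by region.
    regions = {}
    for vec, lab in zip(values, labels):
        regions.setdefault(lab, []).append(vec)

    total = 0.0
    for members in regions.values():
        n_vars = len(members[0])
        # Attribute centroid: per-variable mean over the region's areas.
        centroid = [sum(m[j] for m in members) / float(len(members))
                    for j in range(n_vars)]
        for m in members:
            total += sum((m[j] - centroid[j]) ** 2 for j in range(n_vars))
    return total


def is_feasible(extensive, labels, threshold):
    """True if every region's total of the extensive attribute reaches the threshold.

    Assumption: ">=" is used, i.e. the threshold is read as a minimum value.
    """
    totals = {}
    for val, lab in zip(extensive, labels):
        totals[lab] = totals.get(lab, 0.0) + val
    return all(t >= threshold for t in totals.values())


# Four areas, one attribute, two regions (made-up numbers).
sar1 = [[1.0], [3.0], [10.0], [14.0]]
pop = [60, 50, 70, 40]
labels = [0, 0, 1, 1]
wss = within_cluster_ss(sar1, labels)
feasible = is_feasible(pop, labels, threshold=100)
```

A partition with more regions is preferred as long as `is_feasible` holds; among partitions with the same number of regions, a lower `wss` means more homogeneous regions.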

The max-p-regions algorithm is composed of two main blocks:

  • construction of an initial feasible solution.
  • local improvement.

There are three methods for local improvement: Greedy (execMaxpGreedy), Tabu (execMaxpTabu), and Simulated Annealing (execMaxpSa). A detailed explanation of each method can be found in Duque, Anselin and Rey (2010) [Duque_Anselin_Rey2010].

For this version, the tabu search algorithm will stop after max(10,N/maxP) nonimproving moves.
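The stopping rule can be sketched as follows. This is an illustration only: it assumes that N is the number of areas, that maxP is the number of regions in the current solution, that the division is an integer division, and that the non-improving moves are counted consecutively. None of these details is stated here, and the function names are hypothetical.

```python
def non_improving_limit(n_areas, max_p):
    """Number of non-improving moves tolerated: max(10, N/maxP).

    Assumption: integer division.
    """
    return max(10, n_areas // max_p)


def run_until_stalled(objective_values, n_areas, max_p):
    """Walk a sequence of candidate objective values and stop at the limit.

    Returns the best (lowest) value seen and the number of moves examined.
    Assumption: the counter is reset by every improving move.
    """
    limit = non_improving_limit(n_areas, max_p)
    best = float("inf")
    stalled = 0
    examined = 0
    for value in objective_values:
        examined += 1
        if value < best:
            best = value
            stalled = 0  # an improving move resets the counter
        else:
            stalled += 1
        if stalled >= limit:
            break
    return best, examined


# 30 areas in 5 regions: the limit is max(10, 6) = 10.
best, examined = run_until_stalled([5.0, 4.0] + [4.5] * 50, n_areas=30, max_p=5)
```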

layer.cluster('maxpTabu',vars,<threshold>,<wType>,<std>,<maxit>,<tabuLength>,<typeTabu>,<dissolve>,<dataOperations>)
Parameters:
  • vars (list) – Area attribute(s). Important: the last variable in vars corresponds to the spatially extensive attribute that will be constrained to be above the predefined threshold value (e.g. [‘SAR1’, ‘SAR2’, ‘POP’])
  • threshold (integer) – Minimum value of the constrained variable at regional level. Default value threshold = 100.
  • wType (string) – Type of first-order contiguity-based spatial matrix: ‘rook’ or ‘queen’. Default value wType = ‘rook’.
  • std (binary) – If = 1, then the variables will be standardized.
  • maxit (integer) – Number of times that the construction phase is repeated. The larger the value the higher the possibility of getting a large number of regions. Default value maxit = 2.
  • tabuLength (integer) – Number of times a reverse move is prohibited. Default value tabuLength = 85 (note that the execMaxpTabu signature above lists tabuLength=5).
  • typeTabu (string) – Type of tabu search: (a) “exact”: chooses the best neighbouring solution for evaluation (this implies enumerating all the neighbouring solutions at each iteration); (b) “random”: evaluates a neighbouring solution selected at random (see Ricca and Simeone (2008) for more on the difference between exact and random tabu). Default value typeTabu = “exact”.
  • dissolve (binary) – If = 1, then you will get a “child” instance of the layer that contains the new regions. Default value = 0. Note: Each child layer is saved in the attribute layer.results. The first algorithm that you run with dissolve=1 will have a child layer in layer.results[0]; the second algorithm that you run with dissolve=1 will be in layer.results[1], and so on. You can export a child as a shapefile with layer.results[<0,1,2,...>].exportArcData(‘filename’)
  • dataOperations (dictionary) – Dictionary which maps a variable to a list of operations to run on it. The dissolved layer will contain in its data all the variables specified in this dictionary. Be sure to check the input layer’s fieldNames before using this utility.

The dictionary structure must be as shown below.

>>> X = {}
>>> X[variableName1] = [function1, function2,....]
>>> X[variableName2] = [function1, function2,....]

Here each function is a string naming the operation to apply to the given variableName. The available functions are ‘sum’, ‘mean’, ‘min’, ‘max’, ‘meanDesv’, ‘stdDesv’, ‘med’, ‘mode’, ‘range’, ‘first’, ‘last’ and ‘numberOfAreas’. By default, only the ID variable is added to the dissolved map.
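As a concrete illustration, the sketch below builds such a dictionary for the example fields POP and SAR1 and applies it region by region. The `aggregate` helper, the `OPERATIONS` table and the output key format are hypothetical stand-ins written for this example; they are not clusterPy's implementation, and only a subset of the operation names is covered.

```python
# Hypothetical stand-ins for a few of the operation names.
OPERATIONS = {
    'sum': sum,
    'mean': lambda xs: sum(xs) / float(len(xs)),
    'min': min,
    'max': max,
    'range': lambda xs: max(xs) - min(xs),
    'first': lambda xs: xs[0],
    'last': lambda xs: xs[-1],
    'numberOfAreas': len,
}


def aggregate(data, labels, data_operations):
    """Apply each requested operation to each variable, region by region.

    data: dict mapping a field name to a list of area values.
    labels: region id of each area.
    data_operations: dict mapping a field name to a list of operation names.
    Returns {region: {"<operation>_<field>": value}} (illustrative key format).
    """
    out = {}
    for field, ops in data_operations.items():
        # Collect the field's values for each region.
        by_region = {}
        for value, lab in zip(data[field], labels):
            by_region.setdefault(lab, []).append(value)
        for lab, vals in by_region.items():
            for op in ops:
                out.setdefault(lab, {})[op + '_' + field] = OPERATIONS[op](vals)
    return out


# The dictionary itself follows the structure described in the text.
X = {}
X['POP'] = ['sum', 'numberOfAreas']
X['SAR1'] = ['mean', 'range']

# Made-up data: four areas assigned to two regions.
data = {'POP': [60, 50, 70, 40], 'SAR1': [1.0, 3.0, 10.0, 14.0]}
labels = [0, 0, 1, 1]
summary = aggregate(data, labels, X)
```

In clusterPy itself the dictionary `X` would be passed as the dataOperations argument together with dissolve=1; the aggregation is then performed by the library.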

SOM [Kohonen1990]

clusterpy0_9_9.clusterpy.core.toolboxes.cluster.som.originalSOM(y, w, nRows=10, nCols=10, iters=1000, alphaType='linear', initialDistribution='Uniform', wType='rook', fileName=None)

Self Organizing Map (SOM)

SOM is an unsupervised neural network proposed by [Kohonen1990] which adjusts its weights to represent, on a regular lattice, the distribution of a data set.

In [Kohonen1990] the neighbourhood of the Best Matching Unit (BMU) is defined in a general form, but in this algorithm it could be any contiguity matrix available for a Layer object (rook, queen, custom).

The original algorithm is commonly used with the output network layer represented by a regular hexagonal or rectangular lattice. In clusterPy we use a regular rectangular lattice (see [Schimidt_Rey_Skupin2010] for the effects of using different output layer topologies in SOM). Finally, the adaptive parameter is taken from the scalar version suggested by [Kohonen1990].

Additionally, in clusterPy we use a contiguity-based neighbourhood for the weight-updating process. For more information see [Kohonen2001].
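A single training step with a contiguity-based neighbourhood can be sketched as below. This is a minimal illustration under stated assumptions, not clusterPy's code: the helper names are hypothetical, the distance is squared Euclidean, rook contiguity is built by hand for a rectangular lattice, and the BMU and its contiguous units are all moved by the same factor alpha.

```python
def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest (squared Euclidean) to x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def rook_neighbours(n_rows, n_cols):
    """Rook contiguity on a rectangular lattice: units that share an edge."""
    neigh = {}
    for r in range(n_rows):
        for c in range(n_cols):
            idx = r * n_cols + c
            neigh[idx] = [rr * n_cols + cc
                          for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                          if 0 <= rr < n_rows and 0 <= cc < n_cols]
    return neigh


def update(weights, x, neighbours, alpha):
    """Move the BMU and its contiguous units a fraction alpha towards x.

    Assumption: the same alpha is applied to the BMU and to its neighbours.
    """
    bmu = best_matching_unit(weights, x)
    for i in [bmu] + neighbours[bmu]:
        weights[i] = [wi + alpha * (xi - wi) for wi, xi in zip(weights[i], x)]
    return bmu


# 2 x 2 lattice with one attribute per unit (made-up starting weights).
weights = [[0.0], [1.0], [2.0], [3.0]]
w = rook_neighbours(2, 2)
bmu = update(weights, [3.0], w, alpha=0.5)
```

After this step the winning unit (index 3) is unchanged because it already matches the input, while its two rook neighbours have moved halfway towards it.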

layer.cluster('som',vars,<nRows>,<nCols>,<wType>,<iters>,<alphaType>,<initialDistribution>,<fileName>)
Parameters:
  • vars (list) – Area attribute(s)
  • nRows (integer) – Number of rows in the lattice. Default value nRows = 10.
  • nCols (integer) – Number of columns in the lattice. Default value nCols = 10.
  • wType (string) – Type of first-order contiguity-based spatial matrix: ‘rook’ or ‘queen’. Default value wType = ‘rook’.
  • iters (integer) – Number of iterations for the SOM algorithm. Default value iters = 1000.
  • alphaType (string) – Name of the scalar-valued decreasing function which maps iterations onto (0,1) float values. This function defines how much the areas in the BMU neighbourhood are modified. In clusterPy there are two possible functions: ‘linear’ (linear decreasing function) or ‘quadratic’ (quadratic decreasing function). Default value alphaType = ‘linear’.
  • initialDistribution (string) – Data generator process used to initialize the neural weights. Default value initialDistribution = ‘uniform’.
  • fileName (string) – Parameter used to export neural output layer topology as a shapefile. Default value fileName = None.
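To make the alphaType options concrete, the sketch below shows one possible pair of decreasing schedules that map iterations onto (0,1). The exact formulas used by clusterPy are not given here, so both expressions, and the function name `alpha`, are assumptions chosen only to satisfy the description (decreasing, values strictly between 0 and 1).

```python
def alpha(t, iters, alpha_type='linear'):
    """Learning factor for iteration t (0-based) out of iters iterations.

    Assumed formulas, not taken from clusterPy:
      linear:    1 - (t + 1) / (iters + 1)
      quadratic: (1 - (t + 1) / (iters + 1)) ** 2
    Both stay strictly inside (0, 1) for t in 0..iters-1.
    """
    frac = 1.0 - (t + 1.0) / (iters + 1.0)
    if alpha_type == 'linear':
        return frac
    if alpha_type == 'quadratic':
        return frac ** 2
    raise ValueError("alpha_type must be 'linear' or 'quadratic'")


# Compare the two schedules over a short run of 4 iterations.
linear = [alpha(t, 4, 'linear') for t in range(4)]
quadratic = [alpha(t, 4, 'quadratic') for t in range(4)]
```

Under these assumed formulas the quadratic schedule drops faster than the linear one, so later iterations adjust the BMU neighbourhood less.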

IMPORTANT NOTE:

Since this algorithm does not guarantee spatial contiguity of the resulting regions, clusterPy does not provide the dissolve option. To obtain the solution vector you will need to export the layer with the command “Layer.exportArcData”. The exported shapefile will have an additional variable with the solution vector (i.e., the ID of the region to which each area has been assigned).

Geo SOM [Bacao_Lobo_Painho2004]

clusterpy0_9_9.clusterpy.core.toolboxes.cluster.geoSOM.geoSom(iLayer, iVariables, nRows=10, nCols=10, iters=1000, wType='rook', alphaType='linear', initialDistribution='Uniform', fileName=None)

Geo Self Organizing Map (geoSOM)

GeoSOM is an unsupervised neural network proposed by [Bacao_Lobo_Painho2004], which adjusts its weights to represent, on a regular lattice, the distribution of a data set. The difference between the algorithm suggested by [Bacao_Lobo_Painho2004] and the one suggested by [Kohonen1990] is that the former uses the geographical location of the output network layer to organize the values given in the input Layer.
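One way to read this difference is as a two-step search for the winning unit: first pick the unit that is geographically closest to the input area, then compare attribute values only among that unit and its contiguous units. The sketch below follows that reading. It is an assumption about the general geoSOM idea rather than clusterPy's implementation, and the function name and the restriction to first-order neighbours are hypothetical.

```python
def geo_bmu(unit_xy, unit_weights, neighbours, area_xy, area_values):
    """Two-step winner search (assumed reading of the geoSOM idea).

    unit_xy: geographic coordinates of each output unit.
    unit_weights: attribute weight vector of each output unit.
    neighbours: dict mapping a unit index to its contiguous unit indices.
    area_xy, area_values: location and attributes of the input area.
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Step 1: the unit geographically closest to the input area.
    geo_winner = min(range(len(unit_xy)),
                     key=lambda i: sq_dist(unit_xy[i], area_xy))
    # Step 2: among that unit and its neighbours, the closest in attribute space.
    candidates = [geo_winner] + list(neighbours[geo_winner])
    return min(candidates, key=lambda i: sq_dist(unit_weights[i], area_values))


# Three units in a row (made-up coordinates and weights).
unit_xy = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
unit_weights = [[5.0], [1.0], [9.0]]
neighbours = {0: [1], 1: [0, 2], 2: [1]}

# An area located near unit 0 whose attribute value matches unit 2.
winner = geo_bmu(unit_xy, unit_weights, neighbours, (0.1, 0.0), [9.0])
```

Here a search over attributes alone would select unit 2, whereas the geographically restricted search selects unit 0, which keeps areas that are close in space mapped to nearby units.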

layer.cluster('geoSom',vars,nRows,nCols,<wType>,<iters>,<alphaType>,<initialDistribution>,<fileName>)
Parameters:
  • vars (list) – Area attribute(s)
  • nRows (integer) – Number of rows in the lattice. Default value nRows = 10.
  • nCols (integer) – Number of columns in the lattice. Default value nCols = 10.
  • wType (string) – Type of first-order contiguity-based spatial matrix: ‘rook’ or ‘queen’. Default value wType = ‘rook’.
  • iters (integer) – Number of iterations for the SOM algorithm. Default value iters = 1000.
  • alphaType (string) – Name of the scalar-valued decreasing function which maps iterations onto (0,1) float values. This function defines how much the areas in the BMU neighbourhood are modified. In clusterPy there are two possible functions: ‘linear’ (linear decreasing function) or ‘quadratic’ (quadratic decreasing function). Default value alphaType = ‘linear’.
  • initialDistribution (string) – Data generator process used to initialize the neural weights. Default value initialDistribution = ‘uniform’.
  • fileName (string) – Parameter used to export neural output layer topology as a shapefile. Default value fileName = None.

IMPORTANT NOTE:

Since this algorithm does not guarantee spatial contiguity of the resulting regions, clusterPy does not provide the dissolve option. To obtain the solution vector you will need to export the layer with the command “Layer.exportArcData”. The exported shapefile will have an additional variable with the solution vector (i.e., the ID of the region to which each area has been assigned).