Topological descriptors

 

List of topological descriptors calculated by DRAGON

 

Topological descriptors are based on a graph representation of the molecule. They are numerical quantifiers of molecular topology obtained by the application of algebraic operators to matrices representing molecular graphs and whose values are independent of vertex numbering or labelling. They can be sensitive to one or more structural features of the molecule such as size, shape, symmetry, branching and cyclicity and can also encode chemical information concerning atom type and bond multiplicity.

 

Many topological descriptors calculated by DRAGON are derived from a H-depleted molecular graph and can be divided into different logical blocks.

 

The first block of topological indices calculated by DRAGON is derived from a molecular graph quantity called the vertex degree, which is the number of vertices (non-hydrogen atoms) connected to a given vertex, or from a modified vertex degree, called the valence vertex degree, which takes into account all the valence electrons of the atom. The vertex degree of an atom is the corresponding row sum of the adjacency matrix, which collects information on pairs of connected atoms in a H-depleted molecular graph. These molecular descriptors are mainly related to molecular branching. They are briefly explained below.
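
As a minimal sketch of the definition above, the following Python builds the adjacency matrix of a H-depleted graph from a bond list and takes its row sums as vertex degrees. The function names, the bond-list input and the 0-based atom numbering are illustrative assumptions, not DRAGON's implementation; the valence vertex degree is not covered here.

```python
def adjacency_matrix(n_atoms, bonds):
    """Adjacency matrix of a H-depleted graph; bonds is a list of (i, j) pairs."""
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix


def vertex_degrees(matrix):
    """Vertex degree of each atom = row sum of the adjacency matrix."""
    return [sum(row) for row in matrix]


# 2-methylbutane: chain 0-1-2-3 with a methyl group (atom 4) on atom 1
bonds_2mb = [(0, 1), (1, 2), (2, 3), (1, 4)]
print(vertex_degrees(adjacency_matrix(5, bonds_2mb)))  # [1, 3, 2, 1, 1]
```

The branched atom 1 has the highest degree (3), which is why indices built on vertex degrees respond to branching.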

 

 

The second block of topological indices is derived by applying different algebraic operators to the distance matrix, which collects the topological distances between pairs of atoms. The topological distance between two atoms is the length (i.e. the number of bonds) of the shortest path between them. The distance degree of an atom is the corresponding row sum of the distance matrix, i.e. the sum of the topological distances from that atom to all other atoms. Topological indices based on topological distances are described below.
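
The distance matrix and the distance degrees can be sketched with a breadth-first search from every atom. This is only an illustration under stated assumptions: the graph is connected, atoms are numbered from 0, and the function names are hypothetical.

```python
from collections import deque


def distance_matrix(n_atoms, bonds):
    """Topological distances (number of bonds on the shortest path)."""
    neighbors = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)
    matrix = []
    for source in range(n_atoms):
        dist = [None] * n_atoms  # None = not reached yet (connected graph assumed)
        dist[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[v] is None:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        matrix.append(dist)
    return matrix


def distance_degrees(matrix):
    """Distance degree of each atom = row sum of the distance matrix."""
    return [sum(row) for row in matrix]


# n-butane: chain 0-1-2-3
D = distance_matrix(4, [(0, 1), (1, 2), (2, 3)])
print(D[0])                 # [0, 1, 2, 3]
print(distance_degrees(D))  # [6, 4, 4, 6]
```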

 

 

 

where dij is the topological distance between two atoms and nSK is the number of non-hydrogen atoms [A.T. Balaban, Pure & Appl.Chem. 1983, 55, 199-206].

 

 

where W is the Wiener index, nSK is the number of non-hydrogen atoms and σ refers to vertex distance degrees.

 

 

where nk is the number of graph vertices having the same atom eccentricity, the sum runs over all the different atom eccentricities and nSK is the number of non-H atoms [D. Bonchev, Information Theoretic Indices for Characterization of Chemical Structures, Research Studies Press, Chichester (UK), 1983].

 

Weighted distance matrices are modified distance matrices that simultaneously account for the presence of heteroatoms and multiple bonds in the molecule, defined as:

 

 

where wC is the property of the carbon atom, wi is the property of the ith atom, p* is the conventional bond order (i.e. 1 for a single bond, 2 for a double bond, 3 for a triple bond and 1.5 for an aromatic bond), the sum runs over all bonds involved in the shortest path between vertices i and j, dij being the topological distance (i.e. the length of the shortest path), and the subscripts b(1) and b(2) denote the two vertices incident to bond b. When more than one shortest path exists between a pair of vertices, the rule adopted by DRAGON is to take the path with the minimum sum of the edge weights.

DRAGON calculates 5 weighted distance matrices using the following atomic properties w: atomic number (Z), atomic mass (m), atomic van der Waals volume (v), atomic Sanderson electronegativity (e), and atomic polarizability (p). 

The matrix weighted by atomic numbers Z is usually known as the Barysz distance matrix [M. Barysz, G. Jashari, R.S. Lall, A.K. Srivastava, N. Trinajstic, On the Distance Matrix of Molecules Containing Heteroatoms in Chemical Applications of Topology and Graph Theory, R.B. King (Ed.), Elsevier, Amsterdam (The Netherlands), pp. 222-230, 1983].
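
The sketch below illustrates a weighted distance matrix of this kind. It assumes the form usually given in the literature for the Barysz matrix, generalised to any property w: each bond b contributes (1/p*) · wC² / (wb(1) · wb(2)), and the diagonal entry of atom i is 1 − wC/wi. That formula, the function name and the input layout are assumptions made for illustration and are not taken from this page. Among paths with the same minimal number of bonds, the one with the smallest weight sum is kept, following the rule stated above.

```python
from collections import deque


def weighted_distance_matrix(weights, bonds, w_carbon):
    """weights: atomic property per atom; bonds: list of (i, j, bond_order).

    Assumed edge weight: (1 / bond_order) * w_carbon**2 / (w_i * w_j).
    Assumed diagonal:    1 - w_carbon / w_i.  Connected graph assumed.
    """
    n = len(weights)
    edge = {i: {} for i in range(n)}
    for i, j, order in bonds:
        w = (1.0 / order) * w_carbon ** 2 / (weights[i] * weights[j])
        edge[i][j] = w
        edge[j][i] = w
    matrix = [[0.0] * n for _ in range(n)]
    for src in range(n):
        hops = {src: 0}    # number of bonds on the shortest path
        cost = {src: 0.0}  # smallest weight sum among shortest paths
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v, w in edge[u].items():
                if v not in hops:
                    hops[v] = hops[u] + 1
                    cost[v] = cost[u] + w
                    queue.append(v)
                elif hops[v] == hops[u] + 1:
                    # another shortest path reaches v: keep the lighter one
                    cost[v] = min(cost[v], cost[u] + w)
        for v in range(n):
            matrix[src][v] = cost[v] if v != src else 1.0 - w_carbon / weights[src]
    return matrix


# C-C-O skeleton weighted by atomic number Z (carbon Z = 6), single bonds only
M = weighted_distance_matrix([6, 6, 8], [(0, 1, 1), (1, 2, 1)], w_carbon=6)
print(M[0][2], M[2][2])  # 1.75 0.25
```

With w = Z, carbon atoms have zero diagonal entries and C–C single bonds have weight 1, so for a saturated hydrocarbon this construction reduces to the ordinary distance matrix.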

 

 

Some topological indices calculated by DRAGON are derived both from the adjacency matrix and the distance matrix representing a H-depleted molecular graph.

 

 

 

where nSK is the number of non-hydrogen atoms.

This molecular descriptor measures the combined influence of valence, adjacency and distance for each comparable set of vertices. The SMTIV index is calculated in the same way using the valence vertex degree in place of the simple vertex degree.

 

 

where δ refers to vertex degrees, nSK to the number of non-hydrogen atoms and dij to the topological distance between two atoms [I. Gutman, J.Chem.Inf.Comput.Sci. 1994, 34, 1087-1089]. The GMTIV index is obtained in the same way as the GMTI index using the valence vertex degree in place of the simple vertex degree.

 

 

where nSK is the number of non-hydrogen atoms, δ is the vertex degree and σ the vertex distance degree [B. Ren, J.Chem.Inf.Comput.Sci. 1999, 39, 139-143]. It was proposed as a highly discriminating molecular descriptor accounting for molecular size and branching.

 

The Laplacian matrix is a square symmetric matrix representing a H-depleted molecular graph. Its diagonal entries are the vertex degrees of the atoms; its off-diagonal entries are –1 for pairs of bonded atoms and 0 otherwise.
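
A minimal sketch of this construction, assuming a bond list with 0-based atom numbers (the function name is illustrative):

```python
def laplacian_matrix(n_atoms, bonds):
    """Laplacian of a H-depleted graph: vertex degrees on the diagonal,
    -1 for bonded pairs, 0 elsewhere."""
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        matrix[i][j] = -1
        matrix[j][i] = -1
        matrix[i][i] += 1  # each bond raises the degree of both atoms
        matrix[j][j] += 1
    return matrix


# propane: chain 0-1-2
for row in laplacian_matrix(3, [(0, 1), (1, 2)]):
    print(row)
# [1, -1, 0]
# [-1, 2, -1]
# [0, -1, 1]
```

Every row sums to zero, which is why the Laplacian always has a zero eigenvalue.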

 

 

               

 

where QW is the quasi-Wiener index, nBO and nSK are the number of non-H bonds and non-H atoms, respectively, and λnSK–1 is the first non-zero eigenvalue [N. Trinajstic, D. Babic, S. Nikolic, D. Plavsic, D. Amic, Z. Mihalic, J.Chem.Inf.Comput.Sci. 1994, 34, 368-376].

 

The distance-path matrix is a square symmetric matrix representing a H-depleted molecular graph, whose off-diagonal entry i-j is the count of all paths of any length that are included in the shortest path from vertex vi to vertex vj [M.V. Diudea, J.Chem.Inf.Comput.Sci. 1996, 36, 535-540]; the diagonal entries are zero.
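
One way to read this definition: a shortest path of length d passes through d + 1 vertices, and every pair of distinct vertices along it delimits exactly one sub-path, so the path contains d(d + 1)/2 paths, itself included. The sketch below applies that closed form to a topological distance matrix; the function name and the choice to take a precomputed distance matrix as input are assumptions.

```python
def distance_path_matrix(distances):
    """Entry i-j = number of paths contained in a shortest path of length d_ij."""
    return [[d * (d + 1) // 2 for d in row] for row in distances]


# n-butane (chain 0-1-2-3): topological distance matrix
butane = [
    [0, 1, 2, 3],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [3, 2, 1, 0],
]
print(distance_path_matrix(butane)[0])  # [0, 1, 3, 6]
```

The diagonal stays zero because d = 0 there.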

 

 

The detour matrix is a square symmetric matrix representing a H-depleted molecular graph, whose entry i-j is the length of the longest path from vertex vi to vertex vj [F. Buckley, F. Harary, Distance Matrix in Graphs, Addison-Wesley, Redwood City (CA), 1990; O. Ivanciuc, A.T. Balaban, MATCH (Comm.Math.Comp.Chem.) 1994, 30, 141-152]. The detour-path matrix, defined analogously to the distance-path matrix, is a square symmetric matrix whose off-diagonal entry i-j is the count of all paths of any length that are included within the longest path from vertex vi to vertex vj [M.V. Diudea, J.Chem.Inf.Comput.Sci. 1996, 36, 535-540]; the diagonal entries are zero.

 

 

The distance/detour quotient matrix, derived from detour and distance matrices, is a square symmetric matrix representing a H-depleted molecular graph, whose off-diagonal entries are the ratios of the lengths of the shortest to the longest path between any pair of vertices [M. Randic, J.Chem.Inf.Comput.Sci. 1997, 37, 1063-1071].
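
The following sketch obtains the shortest and longest path lengths by enumerating all simple paths, then forms their ratio. It is an illustration only: exhaustive enumeration is exponential in the worst case and suits small graphs, the graph is assumed connected, diagonal entries are set to zero by assumption, and all names are hypothetical.

```python
def shortest_and_longest_paths(n_atoms, bonds):
    """Return (distance matrix, detour matrix) by enumerating simple paths."""
    neighbors = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)
    shortest = [[0] * n_atoms for _ in range(n_atoms)]
    longest = [[0] * n_atoms for _ in range(n_atoms)]

    def extend(start, current, visited, length):
        if current != start:
            if shortest[start][current] == 0 or length < shortest[start][current]:
                shortest[start][current] = length
            if length > longest[start][current]:
                longest[start][current] = length
        for nxt in neighbors[current]:
            if nxt not in visited:  # a path never repeats a vertex
                visited.add(nxt)
                extend(start, nxt, visited, length + 1)
                visited.remove(nxt)

    for start in range(n_atoms):
        extend(start, start, {start}, 0)
    return shortest, longest


def quotient_matrix(shortest, longest):
    """Off-diagonal entries: shortest / longest path length; diagonal: 0."""
    n = len(shortest)
    return [[shortest[i][j] / longest[i][j] if i != j else 0.0
             for j in range(n)] for i in range(n)]


# cyclopropane: every pair is 1 bond apart, or 2 bonds the long way round
dist, detour = shortest_and_longest_paths(3, [(0, 1), (1, 2), (2, 0)])
print(quotient_matrix(dist, detour)[0])  # [0.0, 0.5, 0.5]
```

In an acyclic graph there is only one path between any two atoms, so every off-diagonal quotient is 1; values below 1 appear only when rings are present.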

 

 

A walk in a molecular graph is a sequence of pairwise adjacent edges leading from one vertex to another; any edge can be traversed several times. A path is a walk without any repeated vertices or edges. The walk or path length is the number of edges traversed by the walk or path.

 

The α parameter used to calculate the Kier shape indices is derived from the ratio of the covalent radius Ri of the ith atom to that of the sp3 carbon atom:

 

 

 

The only non-zero contributions to α are given by heteroatoms or carbon atoms with a hybridization state different from sp3.

 

Atom / Hybrid    R (Å)    Atom / Hybrid    R (Å)
Csp3             0.77     Psp3             1.10
Csp2             0.67     Psp2             1.00
Csp              0.60     Ssp3             1.04
Nsp3             0.74     Ssp2             0.94
Nsp2             0.62     F                0.72
Nsp              0.55     Cl               0.99
Osp3             0.74     Br               1.14
Osp2             0.62     I                1.33
B                0.822    Ni               1.30
Al               1.26     Cu               1.33
Si               1.17     Zn               1.29
Fe               1.34     Sn               1.42
Co               1.23     Gd               1.79
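
As a worked illustration of the Kier shape parameter (usually written α), the sketch below assumes the form commonly given in the literature, a sum over non-H atoms of (Ri / RCsp3 − 1). That formula is an assumption consistent with the description above rather than a transcription from this page; the radii are those tabulated above, and the function name and atom-type labels are illustrative.

```python
# Covalent radii in angstrom, copied from the table above
COVALENT_RADII = {
    "Csp3": 0.77, "Csp2": 0.67, "Csp": 0.60,
    "Nsp3": 0.74, "Nsp2": 0.62, "Nsp": 0.55,
    "Osp3": 0.74, "Osp2": 0.62,
    "Psp3": 1.10, "Psp2": 1.00, "Ssp3": 1.04, "Ssp2": 0.94,
    "F": 0.72, "Cl": 0.99, "Br": 1.14, "I": 1.33,
    "B": 0.822, "Al": 1.26, "Si": 1.17, "Fe": 1.34, "Co": 1.23,
    "Ni": 1.30, "Cu": 1.33, "Zn": 1.29, "Sn": 1.42, "Gd": 1.79,
}


def kier_alpha(atom_types):
    """Assumed form: sum over non-H atoms of (R_i / R_Csp3 - 1)."""
    r_ref = COVALENT_RADII["Csp3"]
    return sum(COVALENT_RADII[a] / r_ref - 1.0 for a in atom_types)


print(kier_alpha(["Csp3", "Csp3", "Csp3"]))            # 0.0 (propane)
print(round(kier_alpha(["Csp3", "Csp3", "Osp3"]), 3))  # -0.039 (ethanol)
```

Under this form sp3 carbons contribute exactly zero, in line with the statement that only heteroatoms and non-sp3 carbons contribute.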

 

 

 

 

where nSK is the number of non-H atoms and Ag is the number of topologically equivalent atoms in the gth class. Each equivalence class consists of all atoms having the same electrotopological state.

 

 

where nSK is the number of non-H atoms [L.B. Kier, Quant.Struct.-Act.Relat. 1989, 8, 221-224].

 

The Kier benzene-likeliness index (BLI) is calculated by dividing the first-order valence connectivity index X1V by the number of non-H bonds (nBO) of the molecule and then normalising on the benzene molecule [L.B. Kier, L.H. Hall, Molecular Connectivity in Structure-Activity Analysis, Research Studies Press - Wiley, Chichester (UK), 1986]. It was proposed as a measure of molecular aromaticity.

 

The electrotopological state indices of Kier and Hall [L.B. Kier, L.H. Hall, Pharm.Res. 1990, 7, 801-807] are atomic indices calculated from a H-depleted molecular graph as:

 

 

where Ii is the intrinsic state of the ith atom and ΔIi is the field effect on the ith atom, calculated as the perturbation of the intrinsic state of the ith atom by all other atoms in the molecule; dij is the topological distance between the ith and the jth atoms; A is the number of non-hydrogen atoms in the molecule. The exponent k is a parameter that modifies the influence of distant or nearby atoms for particular studies. In DRAGON, k = 2. The intrinsic state of the ith atom is calculated by:

 

 

where L is the principal quantum number, δv is the number of valence electrons (valence vertex degree) and δ is the number of sigma electrons (vertex degree) of the ith atom in the H-depleted molecular structure.
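
The sketch below illustrates the calculation using the forms published by Kier and Hall, which are assumed here because the equations are not reproduced on this page: the intrinsic state is ((2/L)² · valence vertex degree + 1) / vertex degree, and the field effect on atom i is the sum over all other atoms j of (Ii − Ij) / (dij + 1)^k. Function names and input layout are illustrative.

```python
def intrinsic_state(principal_quantum_number, valence_degree, degree):
    """Assumed Kier-Hall form: ((2/L)^2 * valence degree + 1) / degree."""
    return ((2.0 / principal_quantum_number) ** 2 * valence_degree + 1.0) / degree


def estate_indices(intrinsic, distances, k=2):
    """E-state of each atom = intrinsic state + field effect (k = 2 as in DRAGON)."""
    n = len(intrinsic)
    states = []
    for i in range(n):
        field = sum(
            (intrinsic[i] - intrinsic[j]) / (distances[i][j] + 1) ** k
            for j in range(n) if j != i
        )
        states.append(intrinsic[i] + field)
    return states


# ethanol, H-depleted: CH3 (0) - CH2 (1) - OH (2); all atoms have L = 2
I = [
    intrinsic_state(2, 1, 1),  # CH3: valence degree 1, degree 1
    intrinsic_state(2, 2, 2),  # CH2: valence degree 2, degree 2
    intrinsic_state(2, 5, 1),  # OH:  valence degree 5, degree 1
]
D = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
S = estate_indices(I, D)
print([round(s, 3) for s in S])  # [1.681, 0.25, 7.569]
```

Because each pair of atoms contributes equal and opposite perturbations, the E-state values sum to the same total as the intrinsic states (9.5 here).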

 

 

 

 

where nBO is the number of non-H bonds, nCIC the number of rings in the molecule, Si and Sj the electrotopological state indices for the two atoms incident to the bth bond. 

Note that the formula of TIE implemented in DRAGON has been modified with respect to the original one in order to obtain more well-founded values for all molecules.

 

The Balaban centric index (BAC) is derived for a H-depleted molecular graph based on the pruning of the graph, a stepwise procedure for removing all the terminal vertices, i.e. vertices with a vertex degree of one, and the corresponding incident edges. The vertices removed at the kth step are nk and the Balaban centric index is calculated as the sum of the squares of nk numbers over the total number of steps to remove all vertices [A.T. Balaban, Theor.Chim.Acta 1979, 53, 355-375]. This index provides a measure of molecular branching: the higher the value of BAC, the more branched the graph. It is called centric index because it reflects the topology of the graph as viewed from the centre.

Note that whereas the original Balaban centric index was defined only for acyclic graphs, in DRAGON this index is extended to any molecular graph.
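
The pruning procedure can be sketched as follows for acyclic graphs only, i.e. the original definition; how DRAGON extends it to cyclic graphs is not described here, so the sketch simply rejects them. A single vertex left over at the end is counted as removed in the final step, which is an assumption about the convention. Function names are illustrative.

```python
def pruning_sequence(n_atoms, bonds):
    """Number of vertices removed at each pruning step (acyclic graphs only)."""
    neighbors = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)
    remaining = set(range(n_atoms))
    sequence = []
    while remaining:
        # terminal vertices: degree 1, or the single vertex left at the centre
        terminal = [v for v in remaining if len(neighbors[v]) <= 1]
        if not terminal:
            raise ValueError("cyclic graph: not handled by this sketch")
        for v in terminal:
            remaining.discard(v)
            for u in neighbors[v]:
                neighbors[u].discard(v)  # delete the incident edge
            neighbors[v].clear()
        sequence.append(len(terminal))
    return sequence


def balaban_centric_index(n_atoms, bonds):
    """Sum of the squared numbers of vertices removed at each step."""
    return sum(n_k ** 2 for n_k in pruning_sequence(n_atoms, bonds))


pentane = [(0, 1), (1, 2), (2, 3), (3, 4)]
methylbutane = [(0, 1), (1, 2), (2, 3), (1, 4)]
neopentane = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(pruning_sequence(5, pentane), balaban_centric_index(5, pentane))            # [2, 2, 1] 9
print(pruning_sequence(5, methylbutane), balaban_centric_index(5, methylbutane))  # [3, 2] 13
print(pruning_sequence(5, neopentane), balaban_centric_index(5, neopentane))      # [4, 1] 17
```

The three C5 isomers show the trend stated above: the index grows from 9 to 17 as branching increases.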

 

The lopping centric index (Lop) is calculated as the mean information content derived from the pruning partition of a graph:

 

 

where nk is the number of terminal vertices removed at the kth step and nSK the number of non-H atoms [A.T. Balaban, Theor.Chim.Acta 1979, 53, 355-375].
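
A minimal sketch of this index, assuming the usual mean information content form, −Σ (nk / nSK) · log2(nk / nSK), with base-2 logarithms. The formula and the function name are assumptions; the input is the list of nk values obtained from the pruning steps.

```python
import math


def lopping_centric_index(pruning_counts):
    """Mean information content of the pruning partition (base-2 logarithm assumed)."""
    n_atoms = sum(pruning_counts)
    return -sum(
        (n_k / n_atoms) * math.log2(n_k / n_atoms) for n_k in pruning_counts
    )


# n-pentane is pruned in three steps removing 2, 2 and 1 vertices
print(round(lopping_centric_index([2, 2, 1]), 4))  # 1.5219
# n-butane is pruned in two steps removing 2 vertices each
print(lopping_centric_index([2, 2]))               # 1.0
```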