Theory and practise of the
g-index
by
L. Egghe (1)(*), (2)
(1) Universiteit Hasselt (UHasselt), Campus Diepenbeek, Agoralaan, B-3590 Diepenbeek, Belgium
(2) Universiteit Antwerpen (UA), Campus Drie Eiken, Universiteitsplein 1, B-2610 Wilrijk, Belgium
[email protected]
ABSTRACT
The g-index is introduced as an improvement of the h-index of Hirsch to measure the global
citation performance of a set of articles. If this set is ranked in decreasing order of the number
of citations that they received, the g-index is the (unique) largest number such that the top g
articles received (together) at least $g^2$ citations. We prove the unique existence of g for any
set of articles and we have that g ≥ h.
The general Lotkaian theory of the g-index is presented and we show that
\[ g = \left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-1}{\alpha}} T^{\frac{1}{\alpha}} \]
where α > 2 is the Lotkaian exponent and where T denotes the total number of sources.
We then present the g-index of the (still active) Price medallists for their complete careers
(going back to 1972) and compare it with the h-index. It is shown that the g-index inherits all
the good properties of the h-index and, in addition, better takes into account the citation scores
of the top articles. This yields a better distinction between, and ordering of, the scientists from
the point of view of visibility.
(*) Permanent address.
Key words and phrases: g-index, h-index, Lotka, citation performance, Price medallist.
Acknowledgement: The author is grateful to Drs. M. Goovaerts for the preparation of the citation data of the
Price medallists (January 2006).
I. Introduction
Recently the physicist Hirsch (see Hirsch (2005)) introduced the so-called h-index – see also
Ball (2005), Braun, Glänzel and Schubert (2005), Glänzel (2006a,b), Egghe and Rousseau
(2006). For any general “set of papers” one can arrange these papers in decreasing order of
the number of citations they received. The h-index is then the largest rank h = r such that the
paper on this rank (and hence also all papers on rank 1,…,h) has h or more citations. Hence
the papers on ranks h + 1, h + 2, … have no more than h citations.
Although introduced by a physicist, this new science indicator has been well received in
scientometrics (informetrics). In the above-mentioned references it was argued that the
h-index is a simple single number incorporating publication as well as citation data (hence
comprising quantitative as well as qualitative or visibility aspects) and hence has an advantage
over numbers such as “number of significant papers” (which is arbitrary) or “number of
citations to each of the (say) q most cited papers” (which again is not a single number). The
h-index is also robust in the sense that it is insensitive to a set of uncited (or lowly cited) papers,
but it is also insensitive to one or several outstandingly highly cited papers. This last aspect
can be considered as a drawback of the h-index. Let us discuss this point further.
Highly cited papers are, of course, important for the determination of the value h of the
h-index. But once a paper is selected to belong to the top h papers, this paper is not “used” any
more in the determination of h, as a variable over time. Indeed, once a paper is selected to the
top group, the h-index calculated in subsequent years is not at all influenced by this paper’s
received citations further on: even if the paper doubles or triples its number of citations (or
even more) the subsequent h-indexes are not influenced by this. We think it is an advantage of
the h-index not to take into account the “tail” papers (with low number of citations) but it
should (being a measure of overall citation performance) take into account the citation
evolution of the most cited papers!
In order to overcome this disadvantage, whilst keeping the advantages of the h-index, we
make the following remark: by definition of the h-index, the papers on rank 1,…,h each have
at least h citations, hence these h papers together have at least $h^2$ citations. But it could well
be (see examples further on) that the first h + 1 papers have together $(h+1)^2$ or more citations
(here we use the fact that, most probably, the top papers have much more than h citations) and
the same might be true for rank h + 2 (the top h + 2 papers having together at least
$(h+2)^2$ citations) or even higher.
Therefore, in the Letter Egghe (2006a) we introduced a simple variant of the h-index: the
g-index.
Definition I.1:
A set of papers has a g-index g if g is the highest rank such that the top g papers have,
together, at least $g^2$ citations. This also means that the top g + 1 papers have less than
$(g+1)^2$ citations.
The following proposition (also remarked in Egghe (2006a)) is trivial.
Proposition I.2:
In all cases one has that
\[ g \geq h \tag{1} \]
Proof:
Since h satisfies the requirement that the top h papers have at least $h^2$ citations and since g is
the largest number with this property, it is clear that g ≥ h.
An example shows the easy calculation of the h-index and the g-index. The data are the
author’s own citation data derived from the Web of Knowledge (WoK). It must be underlined,
however, that the real citation data can be much higher due to several reasons:
- only source journals, selected by Thomson ISI, are used,
- unclear citations (even to source journals, e.g. “to appear” etc.) are not counted in the
  WoK.
In the table below TC stands for the total number of citations for each paper on rank r = 1, 2,...
and ΣTC stands for the cumulative number of citations to the papers on rank 1,..., r (for each
r). The highlighted rows give the explanation for the h-index h = 13 and the g-index
g = 19. Indeed h = 13 is the highest rank such that all papers on rank 1,..., h have at least 13
citations (and hence the papers on rank 14 or higher have not more than 13 citations). Also
g = 19 is the highest rank such that the top 19 papers have at least $19^2 = 361$ citations (here
381 > 361); on rank 20 we have $392 < 20^2 = 400$ citations.
Table 1. Ranking of the papers of L. Egghe according to their
number of citations received (source: WoK).

  TC   r   ΣTC   r^2
  47   1    47     1
  42   2    89     4
  37   3   126     9
  36   4   162    16
  21   5   183    25
  18   6   201    36
  17   7   218    49
  16   8   234    64
  16   9   250    81
  16  10   266   100
  15  11   281   121
  13  12   294   144
  13  13   307   169   h = 13
  13  14   320   196
  13  15   333   225
  12  16   345   256
  12  17   357   289
  12  18   369   324
  12  19   381   361   g = 19
  11  20   392   400
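The calculation behind Table 1 can be reproduced with a few lines of code. The following is a minimal sketch, not part of the original study: the function names `h_index` and `g_index` are our own, and the g-index is evaluated only over the papers that are listed (no fictitious uncited papers are appended).

```python
from itertools import accumulate


def h_index(citations):
    """Largest rank h such that the paper on rank h has at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, tc in enumerate(ranked, start=1):
        if tc >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest rank g such that the top g papers together have at least g^2 citations."""
    ranked = sorted(citations, reverse=True)
    g = 0
    for rank, cumulative in enumerate(accumulate(ranked), start=1):
        if cumulative >= rank * rank:
            g = rank
    return g


# TC column of Table 1 (ranks 1 to 20).
egghe_tc = [47, 42, 37, 36, 21, 18, 17, 16, 16, 16,
            15, 13, 13, 13, 13, 12, 12, 12, 12, 11]

print(h_index(egghe_tc), g_index(egghe_tc))  # prints "13 19"
```

Because the list stops at rank 20, the sketch relies on the fact that rank 20 already fails the test (392 < 400); on a longer list the same functions apply unchanged.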
In the last section of this article we will compare the h- and g-indexes of the (active) Price
medallists (updated calculations of the h-index as in Glänzel and Persson (2005) and new
calculations of the g-index) showing the advantage of the g-index over the h-index but in the
next section we will give the mathematical theory of the g-index based on Lotka’s law
\[ f(j) = \frac{C}{j^{\alpha}} \tag{2} \]
j ≥ 1, C > 0, α > 2 (it will turn out that, if we let j be arbitrarily large – which we assume
here for the sake of simplicity – we need to take α > 2). In case of (2) we will show that (T =
total number of sources (= papers here))
\[ g = \left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-1}{\alpha}} T^{\frac{1}{\alpha}} \tag{3} \]
hence by Glänzel (2006b) or Egghe and Rousseau (2006), since it was shown there that
$h = T^{\frac{1}{\alpha}}$, we have
\[ g = \left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-1}{\alpha}} h > h \tag{4} \]
Also the relation of g with the total number A of items (= citations here) is given. Before this
theory is developed we will, firstly, show the general existence theorem for the g-index: for
any set of papers we always have that the g-index exists and is unique.
Note, cf. Braun, Glänzel and Schubert (2005), Egghe (2006a), Egghe and Rousseau (2006),
that any set of papers can be taken here, e.g. the papers of a scientist but also a year’s
production (articles) in a journal can be used.
II. Mathematical theory of the g-index
First we will give a mathematically exact definition of the g-index in continuous variables.
II.1 Mathematical definition of the g-index
Let f(j) (j ≥ 1) denote the general size-frequency function of the system (which can be more
general than the papers-citation relation: we can work in general information production
processes (IPPs) where we have sources that produce items – cf. Egghe and Rousseau (1990),
Egghe (2005)). We do not suppose f to be Lotkaian at this moment. Let g(r) (r ∈ [0,T]) denote
the general rank-frequency function (the function g(r) should not be confused with the
g-index; we keep the f(j) and g(r) notation since this has been done in all previous articles and
books on this topic – throughout the text it will be clear whether we deal with the function
g(r) or with the g-index g). The general (defining) relation between the functions f(j) and
g(r) is as follows:
\[ r = g^{-1}(j) = \int_j^{\infty} f(j')\,dj' \tag{5} \]
Indeed, if $r = g^{-1}(j)$ (the inverse of the function g(r)) then g(r) = j and there are r sources
with an item density value larger than or equal to j. Denote
\[ G(r) = \int_0^r g(r')\,dr' \tag{6} \]
the cumulative number of items in the sources up to rank r (i.e. the top r sources).
Definition:
The rank r is the g-index: r = g of this system if r is the highest value such that
\[ G(r) \geq r^2 \tag{7} \]
Note that this is the exact formulation of the g-index as proposed in Section I in practical
systems.
II.2 Existence theorem for the g-index
Theorem II.2.1
Every general system has a unique g-index.
Proof:
Define, for all r ∈ ]0,T]
\[ H(r) = \frac{G(r)}{r} \tag{8} \]
and we define $H(0) = \lim_{r \to 0^+} H(r) = g(0)$. We first prove that H strictly decreases on [0,T]. Indeed
\[ H'(r) = \frac{r\,g(r) - G(r)}{r^2} < 0 \]
since
\[ r\,g(r) < G(r) = \int_0^r g(r')\,dr' \]
since the function g is strictly decreasing (by (5)) for all values of r ∈ ]0,T]. Since
$H(0) = \lim_{r \to 0^+} H(r)$ we hence have that H strictly decreases on [0,T]. If H(T) ≥ T then $G(T) \geq T^2$
and since this is the largest possible value, we have the unique g-index g = T. Suppose now
that H(T) < T. Define
\[ F(r) = H(r) - r = \frac{G(r)}{r} - r \tag{9} \]
Since H(0) > 0 (since the function g strictly decreases (by (5)) and by (8)) and H(T) < T we
have F(0) > 0 and F(T) < 0. Hence, since F is continuous, there is a value r such that
F(r) = 0. By (8) and (9) we hence have the existence of a value r such that
\[ G(r) = r^2 \]
Note that this satisfies (7) and that it is the highest possible value that satisfies (7): indeed, H
strictly decreases, so, for every value r' > r we have
\[ H(r') < H(r) \]
By (8):
\[ \frac{G(r')}{r'} < \frac{G(r)}{r} = r \]
\[ G(r') < r\,r' < r'^2 \]
contradicting (7). Hence this unique r value is the g-index: r = g. Note that, except if
$G(T) \geq T^2$, we can prove that the g-index always satisfies (7) with an equality sign instead of
≥.
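The existence proof is constructive enough to be turned into a numerical procedure: F changes sign exactly once on ]0,T], so bisection locates the g-index. The sketch below is only an illustration under stated assumptions. The function name `continuous_g_index` is ours, and the exponential rank-frequency function g(r) = a·exp(−r/b), with a = 50, b = 20 and T = 100, is a hypothetical example chosen because g(0) is finite; it is not taken from the data of this paper.

```python
import math


def continuous_g_index(G, T, tol=1e-10):
    """Solve G(r) = r^2 on ]0, T] by bisection on F(r) = G(r)/r - r.

    G is the cumulative item function of (6). If G(T) >= T^2 the g-index is T,
    as in the proof of Theorem II.2.1. Otherwise F(T) < 0 and, since H(0) > 0,
    F is positive close to 0, so the root is bracketed.
    """
    if G(T) >= T * T:
        return T

    def F(r):
        return G(r) / r - r

    lo, hi = tol, T
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) > 0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies to the left of (or at) mid
    return 0.5 * (lo + hi)


# Hypothetical strictly decreasing rank-frequency function g(r) = a * exp(-r / b).
a, b, T = 50.0, 20.0, 100.0


def G_example(r):
    # Closed form of the integral of a * exp(-r'/b) from 0 to r.
    return -a * b * math.expm1(-r / b)


g = continuous_g_index(G_example, T)
print(round(g, 4), round(G_example(g) - g * g, 6))
```

For these illustrative parameters the root lies between r = 27 and r = 28, and at the returned value G(g) and g² agree up to the tolerance of the bisection.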
Now we will give formulae for the g-index in terms of parameters that appear in Lotkaian
informetrics.
II.3 Formulae for the g-index in Lotkaian systems
If $G(T) \leq T^2$ then we know from the proof of Theorem II.2.1 that the g-index satisfies (7)
with an equality sign:
\[ G(g) = g^2 \tag{10} \]
Otherwise (if $G(T) > T^2$) we take g = T.
We have the following theorem.
Theorem II.3.1:
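Given the law of Lotka
\[ f(j) = \frac{C}{j^{\alpha}} \tag{11} \]
j ≥ 1, C > 0, α > 2, we have that the g-index equals
\[ g = \left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-1}{\alpha}} T^{\frac{1}{\alpha}} \tag{12} \]
if $\left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-1}{\alpha}} T^{\frac{1}{\alpha}} \leq T$ and g = T if $\left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-1}{\alpha}} T^{\frac{1}{\alpha}} > T$.
Here T denotes the total number of sources.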
Proof:
First proof:
The first (cf. (5))
\[ r = g^{-1}(j) = \int_j^{\infty} f(j')\,dj' = \frac{C}{\alpha-1}\, j^{1-\alpha} \tag{13} \]
sources yield a total number of items (since α > 2)
\[ \int_j^{\infty} j' f(j')\,dj' = \frac{C}{\alpha-2}\, j^{2-\alpha} \tag{14} \]
(cf. also Egghe (2005), Chapter II).
So, by (10) we have r = g if
\[ \frac{C}{\alpha-2}\, j^{2-\alpha} = g^2 \tag{15} \]
and if this g satisfies g ≤ T (otherwise take g = T).
By (13):
\[ j = \left(\frac{(\alpha-1)\,r}{C}\right)^{\frac{1}{1-\alpha}} \tag{16} \]
(16) (for r = g) in (15) yields
\[ g = \left[\frac{C}{\alpha-2}\left(\frac{\alpha-1}{C}\right)^{\frac{\alpha-2}{\alpha-1}}\right]^{\frac{\alpha-1}{\alpha}} \]
\[ g = \left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-1}{\alpha}} T^{\frac{1}{\alpha}} \]
using that $T = \frac{C}{\alpha-1}$ as follows from (13) by taking j = 1. This value is taken as the
g-index if it is ≤ T and we take g = T if it is strictly larger than T.
Second proof:
Now we work directly with formula (10). Note that Lotka’s law (11) is equivalent with
Zipf’s law
\[ g(r) = \frac{B}{r^{\beta}} \tag{17} \]
B, β > 0, r ∈ ]0,T] where we have the relations
\[ B = \left(\frac{C}{\alpha-1}\right)^{\frac{1}{\alpha-1}} \tag{18} \]
\[ \beta = \frac{1}{\alpha-1} \tag{19} \]
(cf. Egghe (2005), Exercise II.2.2.6 but see also the Appendix in Egghe and Rousseau
(2006) where a proof is given).
Note that by (19) α > 2 is equivalent with 0 < β < 1. If that is the case, (10) gives
\[ \int_0^g \frac{B}{r^{\beta}}\,dr = g^2 \]
hence, since 0 < β < 1
\[ \frac{B}{1-\beta}\, g^{1-\beta} = g^2 \]
Hence
\[ g = \left(\frac{B}{1-\beta}\right)^{\frac{1}{\beta+1}} \tag{20} \]
Now (18) and (19) in (20) again yield formula (12).
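Both proofs can be checked numerically. The sketch below evaluates (12) and (20) for the same system and verifies that the result satisfies G(g) = g² as in (10). The function names and the values α = 3, T = 1000 are illustrative assumptions of ours, not data from this paper.

```python
def g_lotka(alpha, T):
    """Formula (12): g as a function of the Lotka exponent alpha and of T, capped at T."""
    if alpha <= 2:
        raise ValueError("the theory requires alpha > 2")
    value = ((alpha - 1) / (alpha - 2)) ** ((alpha - 1) / alpha) * T ** (1 / alpha)
    return min(value, T)


def g_zipf(alpha, T):
    """Formula (20), with B and beta obtained from (18) and (19) and C = (alpha - 1) T."""
    C = (alpha - 1) * T
    B = (C / (alpha - 1)) ** (1 / (alpha - 1))
    beta = 1 / (alpha - 1)
    value = (B / (1 - beta)) ** (1 / (beta + 1))
    return min(value, T)


def G_zipf(r, alpha, T):
    """Cumulative number of items G(r) for Zipf's law (17): B / (1 - beta) * r^(1 - beta)."""
    beta = 1 / (alpha - 1)
    B = T ** beta
    return B / (1 - beta) * r ** (1 - beta)


alpha, T = 3.0, 1000.0  # illustrative values only
g = g_lotka(alpha, T)
print(g, g_zipf(alpha, T), G_zipf(g, alpha, T), g * g)
```

For α = 3 formula (12) reduces to g = 2^(2/3) · T^(1/3), so the printed values can also be checked by hand.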
Corollary II.3.2:
If g is the g-index and h is the h-index of a Lotkaian system with exponent α > 2, then
\[ g = \left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-1}{\alpha}} h \tag{21} \]
(if this value is ≤ T; otherwise g = T).
Proof:
This follows readily from (12) and the fact that $h = T^{\frac{1}{\alpha}}$, see Egghe and Rousseau (2006) (also
proved, approximately, in Glänzel (2006b)).
Taking j = 1 in (13) and (14) we see that the total number of sources T equals $\frac{C}{\alpha-1}$ and that
the total number of items A equals $\frac{C}{\alpha-2}$, hence
\[ \mu = \frac{A}{T} = \frac{\alpha-1}{\alpha-2} \tag{22} \]
equals the average number of items per source (cf. also Egghe (2005), Chapter II). Hence we
have the following corollary.
Corollary II.3.3:
If μ is as above we have in case of (21)
\[ g = \mu^{\frac{\alpha-1}{\alpha}}\, h \tag{23} \]
The g-index as a function of α and A is given in Corollary II.3.4.
Corollary II.3.4:
We have
\[ g = \left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-2}{\alpha}} A^{\frac{1}{\alpha}} \tag{24} \]
(if this value is ≤ T).
Proof:
This follows readily from (22) and (12).
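The three corollaries express the same quantity in different parameters, which is easy to confirm numerically. The following sketch assumes α > 2 and that the resulting value does not exceed T; the function name `lotka_indexes` and the values α = 2.5, T = 500 are hypothetical and serve only as an illustration.

```python
def lotka_indexes(alpha, T):
    """Evaluate (21)-(24) for a Lotkaian system (assumes alpha > 2 and g <= T)."""
    mu = (alpha - 1) / (alpha - 2)                  # average items per source, (22)
    A = mu * T                                      # total number of items
    h = T ** (1 / alpha)                            # h-index in the Lotkaian model
    g_from_h = mu ** ((alpha - 1) / alpha) * h      # (21), equivalently (23)
    g_from_A = mu ** ((alpha - 2) / alpha) * A ** (1 / alpha)  # (24)
    return {"mu": mu, "A": A, "h": h, "g_from_h": g_from_h, "g_from_A": g_from_A}


result = lotka_indexes(alpha=2.5, T=500.0)  # illustrative values only
for name, value in result.items():
    print(name, value)
```

With these values μ = 3 and A = 1500, and the two expressions for g coincide up to rounding error.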
We can also determine the item density j for which we have r = g. In practical cases this
means the number of items in the source at rank g. Note that this is h for the h-index r = h, by
definition of the Hirsch index.
For the g-index we have: if the value in (12) is > T we have j = g(T) = 1 and if the value in
(12) is ≤ T we substitute r = g in (17), using (18) and (19) and the fact that $T = \frac{C}{\alpha-1}$,
yielding
\[ j = \left(\frac{\alpha-2}{\alpha-1}\, T\right)^{\frac{1}{\alpha}} \tag{26} \]
We immediately see that j < h which is logical since g > h and the item density is h in case
r = h. Formula (26) presents a concrete formula for the item density cut-off place.
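For completeness, the substitution leading to (26) can be written out as follows. This is our own worked version of the step described in the text, using only (12) and (17)-(19) together with $B = T^{\frac{1}{\alpha-1}}$ (which follows from (18) and $T = \frac{C}{\alpha-1}$):

\[
\begin{aligned}
j = g(r)\big|_{r=g} &= \frac{B}{g^{\beta}} = \left(\frac{T}{g}\right)^{\frac{1}{\alpha-1}}, \\
\frac{T}{g} &= \left(\frac{\alpha-2}{\alpha-1}\right)^{\frac{\alpha-1}{\alpha}} T^{\frac{\alpha-1}{\alpha}} \quad \text{(by (12))}, \\
j &= \left(\frac{\alpha-2}{\alpha-1}\right)^{\frac{1}{\alpha}} T^{\frac{1}{\alpha}} = \left(\frac{\alpha-2}{\alpha-1}\, T\right)^{\frac{1}{\alpha}}.
\end{aligned}
\]

Since $h = T^{\frac{1}{\alpha}}$ and $\frac{\alpha-2}{\alpha-1} < 1$, the last line also makes the inequality j < h explicit.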
Note that, although α can be any value α > 2, we do not have $\lim_{\alpha \to 2^+} j = 0$. Indeed since the
validity of (12) is limited to g ≤ T we have, by (12) that
\[ \left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-1}{\alpha}} T^{\frac{1}{\alpha}} \leq T \]
from which it follows that
\[ \mu = \frac{A}{T} = \frac{\alpha-1}{\alpha-2} \leq T \tag{27} \]
This implies in (26) that
\[ j = \left(\frac{T^2}{A}\right)^{\frac{1}{\alpha}} \geq 1 \]
by (27).
The case
\[ \left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-1}{\alpha}} T^{\frac{1}{\alpha}} > T \tag{28} \]
(hence where we take g = T) occurs in the following case: from (28) it follows that
\[ \frac{\alpha-1}{\alpha-2} > T \tag{29} \]
By (22) we have
\[ \mu = \frac{A}{T} > T \]
hence
\[ A > T^2 \tag{30} \]
Equivalently, (29) gives the condition in α:
\[ \alpha < \frac{2T-1}{T-1} \tag{31} \]
Note that
\[ \frac{2T-1}{T-1} > 2 \]
so that (31) can occur (with the condition α > 2).
Remark II.3.5:
It might seem strange that g = T is possible in this Lotkaian model. Note however that (22)
implies that
\[ \alpha = \frac{2A-T}{A-T} \]
So, for every T fixed, if we let A → ∞ we have that α → 2 (but α > 2) so that we are within
the limitations of our theory. In this case we have
\[ A = G(T) > T^2 \]
and hence g = T.
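The two regimes (g < T versus g = T) can be read off directly from A and T. The sketch below is a small illustration of (30), (31) and Remark II.3.5; the function names and the numerical values are hypothetical and are not taken from the data in this paper.

```python
def alpha_from_totals(A, T):
    """Lotka exponent implied by (22): alpha = (2A - T) / (A - T), valid for A > T."""
    if A <= T:
        raise ValueError("A > T is needed for a finite exponent alpha > 2")
    return (2 * A - T) / (A - T)


def g_equals_T(A, T):
    """Condition (30): the g-index is capped at T exactly when A > T^2."""
    return A > T * T


# Illustrative system with few sources and many items, so that A > T^2.
A, T = 150.0, 10.0
alpha = alpha_from_totals(A, T)
threshold = (2 * T - 1) / (T - 1)  # right-hand side of (31)
print(alpha, threshold, g_equals_T(A, T))
```

In this example α lies between 2 and the bound in (31), in agreement with the equivalence of (30) and (31).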
In the next section we will apply the g-index to the publications and citations of the (still
active) Price medallists and compare these g-indexes with the h-indexes of these same data.
III. Calculation and comparison of the h- and g-indexes of the (still active) Price medallists
In Glänzel and Persson (2005), the h-indexes for the (still active) Price medallists are
calculated. We could use these numbers and compare them with the g-index defined here.
However, for this we would need to extend the tables in Glänzel and Persson (2005) (since g ≥ h)
and it is hardly possible to do this since we would have to do it for the maximal citing time
August 2005 (when the tables in Glänzel and Persson (2005) were produced). So the
easiest thing to do is to remake these tables for the present time (January 2006) and make
them long enough so that, on the same tables, the h- as well as the g-index can be calculated.
We have opted not to limit the publication year to 1986 or later (as was the case in Glänzel
and Persson (2005)). Indeed, the h-indexes in Glänzel and Persson (2005) seemed a bit
unnatural in several senses. E. Garfield did not have the highest h-index (which we, normally,
could expect) and H. Small scored lowest of all the Price medallists. The major reason for
these observations is that, by limiting the publication year to 1986 or later, one cuts away
most publications (and perhaps the most highly cited ones) of the relatively older scientists. Since
we want to make a comparison of scientists (and not to draw conclusions on informetrics
fields) we decided not to limit the publication year (except for the evident limit 1972, since
before that date the ISI (now Thomson ISI) data do not exist).
For the same reason we count all publications even if scientists have published in different
domains (e.g. T. Braun in chemistry and L. Egghe in mathematics). These publications were
not used in the Glänzel and Persson study.
Of course, by not limiting the publication period and the publication field, one might argue
that there is a bias towards the older scientists. This is true but, with the h- and g-indexes, we
want to indicate the “overall performance (visibility)” of the scientists as they are viewed today
(in the sense of “lifetime achievement”).
We base ourselves on the Web of Knowledge (WoK) and hence we are limited to the
Thomson ISI data. This means that no citations to non-source journals or conference
proceedings articles or books are counted. In addition, no citations to incomplete references
are counted even if they are to source journal articles (e.g. a citation to JASIST, 2001, to
appear): these are not collected in the WoK “times cited” data. So the actual h- and g-indexes
can be somewhat higher but this effect plays for every scientist so that comparisons are still
possible and also these limitations do not jeopardise the possibility to compare the h- and
g-index.
The tables of citation data of the (still active) medallists are found in the Appendix. Each table
stops one line below the g-index since this is all we need. The number r denotes the rank of
the publication and TC denotes the total number of citations to the paper on rank r. The
number ΣTC denotes the cumulative number of citations to the first r ranked papers. Finally,
also the table of $r^2$ values is presented as well as the publication year (PY) of the article on
rank r. The h- and g-index determination is highlighted in the tables in the Appendix. Table 2
gives the results in decreasing order of h and g.
Table 2. h- and g-indexes of Price medallists
in decreasing order

Name          h-index     Name          g-index
Garfield      27          Garfield      59
Narin         27          Narin         40
Braun         25          Small         39
Van Raan      19          Braun         38
Glänzel       18          Schubert      30
Moed          18          Glänzel       27
Schubert      18          Martin        27
Small         18          Moed          27
Martin        16          Van Raan      27
Egghe         13          Ingwersen     26
Ingwersen     13          White         25
Leydesdorff   13          Egghe         19
Rousseau      13          Leydesdorff   19
White         12          Rousseau      15
We leave the detailed (subjective) interpretation of Table 2 to the reader but it is clear that the
g-index column is more in line with intuition and with the raw data in the Appendix than the
h-index column. In other words, the g-index, as simple as the h-index (a single measure,
containing publication and citation elements), contains more comparative information from
the raw data than the h-index and resembles more the overall feeling of “visibility” or “life
time achievement”.
A possible interesting measure is g/h, i.e. the relative increase of g with respect to h. The result
is presented in Table 3, in decreasing order of g/h. Here we see remarkable order changes with
respect to the h- or g-orderings.
Table 3. g/h-values of Price medallists in decreasing order

Name          g/h
Garfield      2.19
Small         2.17
White         2.08
Ingwersen     2.00
Martin        1.69
Schubert      1.67
Braun         1.52
Glänzel       1.50
Moed          1.50
Narin         1.48
Egghe         1.46
Leydesdorff   1.46
Van Raan      1.42
Rousseau      1.15
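The ratios in Table 3 follow directly from the h- and g-values of Table 2. As a minimal sketch (the dictionary layout and variable names are our own), they can be recomputed and sorted as follows:

```python
# (h-index, g-index) per medallist, copied from Table 2.
price_medallists = {
    "Garfield": (27, 59), "Narin": (27, 40), "Braun": (25, 38),
    "Van Raan": (19, 27), "Glänzel": (18, 27), "Moed": (18, 27),
    "Schubert": (18, 30), "Small": (18, 39), "Martin": (16, 27),
    "Egghe": (13, 19), "Ingwersen": (13, 26), "Leydesdorff": (13, 19),
    "Rousseau": (13, 15), "White": (12, 25),
}

# g/h rounded to two decimals, in decreasing order as in Table 3.
ratios = sorted(
    ((name, round(g / h, 2)) for name, (h, g) in price_medallists.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, ratio in ratios:
    print(f"{name:<12} {ratio:.2f}")
```

The first and last lines of the output are Garfield (2.19) and Rousseau (1.15), as in Table 3; medallists with equal ratios appear in the order in which they are listed in the dictionary.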
Appendix
Tables of TC, r, ΣTC, r^2 and PY for each of the (still active) Price medallists and
determination of the h- and g-index.

Garfield E.
  TC   r   ΣTC   r^2    PY
 625   1   625     1  1972
 149   2   774     4  1980
 138   3   912     9  1977
 132   4  1044    16  1983
 132   5  1176    25  1981
 129   6  1305    36  1979
 127   7  1432    49  1996
 111   8  1543    64  1978
 109   9  1652    81  1975
 108  10  1760   100  1985
 107  11  1867   121  1984
 105  12  1972   144  1982
 104  13  2076   169  1986
 101  14  2177   196  1976
  96  15  2273   225  1973
  91  16  2364   256  1976
  89  17  2453   289  1974
  88  18  2541   324  1986
  87  19  2628   361  1987
  85  20  2713   400  1979
  80  21  2793   441  1985
  67  22  2860   484  1988
  63  23  2923   529  1999
  41  24  2964   576  1980
  29  25  2993   625  1990
  28  26  3021   676  1987
  27  27  3048   729  1987   h = 27
  26  28  3074   784  1976
  26  29  3100   841  1992
  23  30  3123   900  1978
  23  31  3146   961  1990
  20  32  3166  1024  1990
  19  33  3185  1089  1998
  19  34  3204  1156  1998
  18  35  3222  1225  1985
  18  36  3240  1296  1979
  18  37  3258  1369  1996
  16  38  3274  1444  1979
  15  39  3289  1521  1990
  14  40  3303  1600  1976
  13  41  3316  1681  1973
  13  42  3329  1764  1973
  13  43  3342  1849  1973
  13  44  3355  1936  1998
  13  45  3368  2025  1990
  12  46  3380  2116  1973
  12  47  3392  2209  2000
  12  48  3404  2304  1998
  12  49  3416  2401  1997
  12  50  3428  2500  1996
  11  51  3439  2601  1998
  11  52  3450  2704  1997
  10  53  3460  2809  1985
  10  54  3470  2916  1984
   9  55  3479  3025  1984
   9  56  3488  3136  1975
   9  57  3497  3249  1972
   9  58  3506  3364  2002
   9  59  3515  3481  1998   g = 59
   9  60  3524  3600  1990

Braun T.
  TC   r   ΣTC   r^2    PY
 125   1   125     1  1978
 124   2   249     4  1989
  78   3   327     9  1986
  66   4   393    16  1975
  57   5   450    25  1974
  57   6   507    36  1990
  55   7   562    49  1974
  51   8   613    64  1989
  43   9   656    81  1992
  42  10   698   100  1974
  38  11   736   121  1983
  37  12   773   144  1995
  37  13   810   169  1994
  35  14   845   196  1980
  35  15   880   225  1999
  33  16   913   256  1988
  32  17   945   289  1995
  31  18   976   324  1975
  31  19  1007   361  1995
  28  20  1035   400  1977
  27  21  1062   441  1973
  27  22  1089   484  1988
  27  23  1116   529  1987
  26  24  1142   576  2000
  26  25  1168   625  1994   h = 25
  25  26  1193   676  1973
  25  27  1218   729  1972
  23  28  1241   784  1978
  23  29  1264   841  1973
  23  30  1287   900  1994
  23  31  1310   961  1987
  22  32  1332  1024  1983
  22  33  1354  1089  1982
  22  34  1376  1156  1980
  22  35  1398  1225  1987
  21  36  1419  1296  1973
  21  37  1440  1369  1973
  20  38  1460  1444  1982   g = 38
  20  39  1480  1521  1982

Small H.
  TC   r   ΣTC   r^2    PY
 305   1   305     1  1973
 239   2   544     4  1974
 127   3   671     9  1978
 109   4   780    16  1974
  86   5   866    25  1977
  80   6   946    36  1985
  77   7  1023    49  1985
  75   8  1098    64  1985
  67   9  1165    81  1999
  49  10  1214   100  1979
  44  11  1258   121  1980
  36  12  1294   144  1980
  26  13  1320   169  1981
  26  14  1346   196  1986
  25  15  1371   225  1976
  22  16  1393   256  1997
  22  17  1415   289  1993
  18  18  1433   324  1974   h = 18
  18  19  1451   361  1994
  15  20  1466   400  1999
  12  21  1478   441  1986
  10  22  1488   484  1989
   9  23  1497   529  1975
   8  24  1505   576  1998
   8  25  1513   625  1987
   7  26  1520   676  1989
   6  27  1526   729  1998
   5  28  1531   784  1977
   5  29  1536   841  1974
   5  30  1541   900  1999
   3  31  1544   961  1979
   3  32  1547  1024  1995
   2  33  1549  1089  1975
   2  34  1551  1156  2004
   2  35  1553  1225  2003
   1  36  1554  1296  1973
   1  37  1555  1369  2004
   1  38  1556  1444  1997
   1  39  1557  1521  1996   g = 39
   1  40  1558  1600  1992

Van Raan A.F.J.
  TC   r   ΣTC   r^2    PY
 108   1   108     1  1985
  51   2   159     4  1996
  49   3   208     9  1991
  41   4   249    16  1985
  35   5   284    25  1991
  32   6   316    36  1973
  31   7   347    49  1990
  30   8   377    64  1990
  25   9   402    81  1993
  25  10   427   100  1974
  23  11   450   121  1995
  22  12   472   144  1998
  22  13   494   169  1997
  21  14   515   196  2000
  20  15   535   225  2001
  19  16   554   256  1998
  19  17   573   289  1998
  19  18   592   324  1994
  19  19   611   361  1994   h = 19
  18  20   629   400  1998
  18  21   647   441  1993
  17  22   664   484  1993
  17  23   681   529  1985
  17  24   698   576  1980
  15  25   713   625  1993
  14  26   727   676  2001
  14  27   741   729  1994   g = 27
  14  28   755   784  1991

Martin B.
  TC   r   ΣTC   r^2    PY
 156   1   156     1  1983
  74   2   230     4  1997
  52   3   282     9  1985
  38   4   320    16  1983
  35   5   355    25  2001
  33   6   388    36  1987
  33   7   421    49  1985
  30   8   451    64  1995
  29   9   480    81  1996
  28  10   508   100  1984
  24  11   532   121  1988
  23  12   555   144  1981
  22  13   577   169  1999
  20  14   597   196  1984
  19  15   616   225  1985
  18  16   634   256  1986   h = 16
  16  17   650   289  1996
  16  18   666   324  1986
  16  19   682   361  1985
  16  20   698   400  1984
  14  21   712   441  1991
  14  22   726   484  1984
  11  23   737   529  1986
   9  24   746   576  1994
   9  25   755   625  1989
   9  26   764   676  1987
   6  27   770   729  1982   g = 27
   4  28   774   784  1992

Narin F.
  TC   r   ΣTC   r^2    PY
 112   1   112     1  1997
  95   2   207     4  1987
  86   3   293     9  1976
  82   4   375    16  1976
  73   5   448    25  1977
  71   6   519    36  1991
  70   7   589    49  1972
  63   8   652    64  1985
  59   9   711    81  1992
  55  10   766   100  1978
  55  11   821   121  1973
  53  12   874   144  1975
  52  13   926   169  1991
  52  14   978   196  1981
  44  15  1022   225  1977
  41  16  1063   256  1980
  38  17  1101   289  2000
  37  18  1138   324  1980
  35  19  1173   361  1999
  33  20  1206   400  1989
  33  21  1239   441  1987
  29  22  1268   484  1994
  28  23  1296   529  1996
  28  24  1324   576  1977
  28  25  1352   625  1976
  27  26  1379   676  1984
  27  27  1406   729  1983   h = 27
  26  28  1432   784  1988
  24  29  1456   841  1988
  23  30  1479   900  1995
  20  31  1499   961  1998
  19  32  1518  1024  1994
  18  33  1536  1089  1980
  18  34  1554  1156  1979
  17  35  1571  1225  1978
  14  36  1585  1296  1996
  13  37  1598  1369  1983
  12  38  1610  1444  1986
  10  39  1620  1521  1977
  10  40  1630  1600  1972   g = 40
   9  41  1639  1681  1983

Schubert A.
  TC   r   ΣTC   r^2    PY
 124   1   124     1  1989
  90   2   214     4  2002
  78   3   292     9  1986
  59   4   351    16  1978
  57   5   408    25  1990
  40   6   448    36  1979
  33   7   481    49  1988
  32   8   513    64  1983
  27   9   540    81  1988
  27  10   567   100  1987
  27  11   594   121  1984
  26  12   620   144  2000
  26  13   646   169  1994
  23  14   669   196  1994
  23  15   692   225  1987
  22  16   714   256  1987
  19  17   733   289  1986
  18  18   751   324  2000   h = 18
  18  19   769   361  1993
  18  20   787   400  1986
  18  21   805   441  1984
  17  22   822   484  2001
  17  23   839   529  1988
  17  24   856   576  1982
  16  25   872   625  1982
  15  26   887   676  2002
  14  27   901   729  1993
  14  28   915   784  1989
  14  29   929   841  1985
  13  30   942   900  1992   g = 30
  12  31   954   961  1996

Glänzel W.
  TC   r   ΣTC   r^2    PY
 124   1   124     1  1989
  54   2   178     4  1988
  33   3   211     9  1988
  32   4   243    16  1995
  32   5   275    25  1983
  31   6   306    36  1995
  28   7   334    49  1995
  27   8   361    64  1988
  27   9   388    81  1987
  27  10   415   100  1984
  26  11   441   121  1994
  24  12   465   144  2001
  23  13   488   169  1994
  23  14   511   196  1987
  22  15   533   225  2002
  22  16   555   256  1987
  20  17   575   289  1994
  19  18   594   324  1986   h = 18
  18  19   612   361  1994
  18  20   630   400  1993
  18  21   648   441  1986
  18  22   666   484  1984
  17  23   683   529  2001
  17  24   700   576  1988
  16  25   716   625  1999
  16  26   732   676  1996
  15  27   747   729  1997   g = 27
  15  28   762   784  1996

Moed F.H.
  TC   r   ΣTC   r^2    PY
 108   1   108     1  1985
  56   2   164     4  1995
  54   3   218     9  1996
  54   4   272    16  1995
  49   5   321    25  1991
  41   6   362    36  1985
  35   7   397    49  1991
  31   8   428    64  1990
  26   9   454    81  2002
  26  10   480   100  1989
  24  11   504   121  1996
  23  12   527   144  1999
  23  13   550   169  1998
  22  14   572   196  2002
  22  15   594   225  1991
  20  16   614   256  2001
  20  17   634   289  1999
  18  18   652   324  1989   h = 18
  17  19   669   361  1985
  15  20   684   400  1999
  15  21   699   441  1993
  13  22   712   484  1998
  13  23   725   529  1993
  12  24   737   576  2002
  12  25   749   625  1993
   9  26   758   676  1999
   9  27   767   729  1996   g = 27
   9  28   776   784  1996

Leydesdorff L.
  TC   r   ΣTC   r^2    PY
  79   1    79     1  2000
  32   2   111     4  1998
  26   3   137     9  1986
  24   4   161    16  1989
  23   5   184    25  1990
  22   6   206    36  1987
  19   7   225    49  1989
  17   8   242    64  1996
  17   9   259    81  1991
  16  10   275   100  1997
  15  11   290   121  1994
  13  12   303   144  1994
  13  13   316   169  1993   h = 13
  13  14   329   196  1989
  11  15   340   225  2000
  11  16   351   256  1993
  11  17   362   289  1992
  10  18   372   324  1998
  10  19   382   361  1997   g = 19
   9  20   391   400  1992

Egghe L.
  TC   r   ΣTC   r^2    PY
  47   1    47     1  1990
  42   2    89     4  1985
  37   3   126     9  2000
  36   4   162    16  1992
  21   5   183    25  1992
  18   6   201    36  1991
  17   7   218    49  1986
  16   8   234    64  1995
  16   9   250    81  1988
  16  10   266   100  1986
  15  11   281   121  1993
  13  12   294   144  1996
  13  13   307   169  1996   h = 13
  13  14   320   196  1990
  13  15   333   225  1988
  12  16   345   256  2000
  12  17   357   289  1994
  12  18   369   324  1988
  12  19   381   361  1987   g = 19
  11  20   392   400  2000

Rousseau R.
  TC   r   ΣTC   r^2    PY
  25   1    25     1  1996
  18   2    43     4  2003
  18   3    61     9  1991
  16   4    77    16  1995
  16   5    93    25  1988
  15   6   108    36  1987
  15   7   123    49  1992
  14   8   137    64  1994
  13   9   150    81  2002
  13  10   163   100  1999
  13  11   176   121  1996
  13  12   189   144  1996
  13  13   202   169  1993   h = 13
  12  14   214   196  2000
  12  15   226   225  2000   g = 15
  12  16   238   256  1990

Ingwersen P.
  TC   r   ΣTC   r^2    PY
 120   1   120     1  1996
  93   2   213     4  1998
  83   3   296     9  1997
  79   4   375    16  1982
  52   5   427    25  2001
  37   6   464    36  1997
  31   7   495    49  1984
  29   8   524    64  1997
  29   9   553    81  1987
  19  10   572   100  2000
  17  11   589   121  1984
  15  12   604   144  1996
  14  13   618   169  1997   h = 13
  10  14   628   196  2001
  10  15   638   225  1992
   8  16   646   256  1999
   7  17   653   289  1999
   7  18   660   324  1995
   6  19   666   361  2000
   6  20   672   400  1993
   5  21   677   441  2000
   3  22   680   484  2001
   3  23   683   529  2000
   3  24   686   576  2000
   3  25   689   625  1994
   3  26   692   676  1994   g = 26
   3  27   695   729  1992

White H.D.
  TC   r   ΣTC   r^2    PY
 128   1   128     1  1981
 106   2   234     4  1998
 103   3   337     9  1989
  45   4   382    16  1997
  37   5   419    25  1982
  28   6   447    36  1981
  22   7   469    49  1983
  21   8   490    64  1987
  20   9   510    81  2001
  15  10   525   100  1987
  14  11   539   121  1986
  14  12   553   144  1985   h = 12
  12  13   565   169  2003
  12  14   577   196  1996
  12  15   589   225  1981
  12  16   601   256  1981
  11  17   612   289  2003
  10  18   622   324  1990
   8  19   630   361  1986
   6  20   636   400  2001
   5  21   641   441  2004
   5  22   646   484  1986
   5  23   651   529  1984
   5  24   656   576  1977
   4  25   660   625  2003   g = 25
   4  26   664   676  1986

IV. Conclusions and open problems
In this paper we studied the g-index as an improvement of the h-index. The g-index g is
the largest rank (where papers are arranged in decreasing order of the number of citations they
received) such that the first g papers have (together) at least $g^2$ citations. We show that g ≥ h
and that g always uniquely exists. We present formulae for g in Lotkaian informetrics. We
show that
\[ g = \left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-1}{\alpha}} T^{\frac{1}{\alpha}} \]
\[ g = \left(\frac{\alpha-1}{\alpha-2}\right)^{\frac{\alpha-1}{\alpha}} h \]
if these values are ≤ T; otherwise g = T.
Here α is the Lotka exponent and T denotes the total number of sources (in the citation
application this means the total number of ever cited papers).
We then calculate the h- and g-indexes of the (still active) Price medallists. Unlike in
Glänzel and Persson (2005) we do not limit the publication period (except for the fact that we
do not use papers published before 1972, since ISI has no data for them)
nor do we limit the topic to informetrics, hence the complete careers (going back to 1972) of the Price
medallists are considered. It is found that the ranked g-index column resembles more the
overall feeling of “visibility” or “life time achievement” than does the ranked h-index column.
We leave open the further exploration of the g-index, including the establishment of the
g-index as a function of time. In Egghe (2006b) we were able to do this for the h-index based on
the cumulative n-th citation distribution (see Egghe and Rao (2001)) and in a forthcoming
paper we will do the same for the g-index based on a time-dependent Lotkaian theory.
We also leave open the construction of other h- or g-like indexes and the comparison of these
new indexes with the h- and g-index. It would also be interesting to work out more practical
cases (in other fields) of h- and g-index comparisons. Such case studies can teach us a lot about the
advantages and/or disadvantages of the h-index and the g-index.
References
Ball P. (2005). Index aims for fair ranking of scientists. Nature 436, 900.
Braun T., Glänzel W. and Schubert A. (2005). A Hirsch-type index for journals. The Scientist
19(22), 8.
Egghe L. (2005). Power Laws in the Information Production Process: Lotkaian Informetrics.
Elsevier, Oxford (UK).
Egghe L. (2006a). An improvement of the h-index: the g-index. Preprint.
Egghe L. (2006b). Dynamic h-index: the Hirsch index in function of time. Preprint.
Egghe L. and Rao I.K.R. (2001). Theory of first-citation distributions and applications.
Mathematical and Computer Modelling 34(1-2), 81-90.
Egghe L. and Rousseau R. (1990). Introduction to Informetrics. Quantitative Methods in
Library, Documentation and Information Science. Elsevier, Amsterdam (the
Netherlands).
Egghe L. and Rousseau R. (2006). An informetric model of the Hirsch index. Preprint.
Glänzel W. (2006a). On the opportunities and limitations of the H-index. Science Focus 1, to
appear.
Glänzel W. (2006b). On the H-index – A mathematical approach to a new measure of
publication activity and citation impact. Scientometrics 67(2), to appear.
Glänzel W. and Persson O. (2005). H-index for Price medallist. ISSI Newsletter 1(4), 15-18.
Hirsch J.E. (2005). An index to quantify an individual’s scientific research output.
arxiv:physics/0508025.
