as a PDF
Transcription
as a PDF
1 Theory and practise of the g-index by L. Egghe1(*),2 1 Universiteit Hasselt (UHasselt), Campus Diepenbeek, Agoralaan, B-3590 Diepenbeek, Belgium 1 2 Universiteit Antwerpen (UA), Campus Drie Eiken, Universiteitsplein 1, B-2610 Wilrijk, Belgium [email protected] ABSTRACT The g-index is introduced as an improvement of the h-index of Hirsch to measure the global citation performance of a set of articles. If this set is ranked in decreasing order of the number of citations that they received, the g-index is the (unique) largest number such that the top g articles received (together) at least g 2 citations. We prove the unique existence of g for any set of articles and we have that g ≥ h . The general Lotkaian theory of the g-index is presented and we show that ⎛ α −1 ⎞⎟ g = ⎜⎜ ⎜⎝ α − 2 ⎠⎟⎟ (*) α−1 α 1 Tα Permanent address. Key words and phrases: g-index, h-index, Lotka, citation performance, Price medallist. Acknowledgement: The author is grateful to Drs. M. Goovaerts for the preparation of the citation data of the Price medallists (January 2006). 2 where α > 2 is the Lotkaian exponent and where T denotes the total number of sources. We then present the g-index of the (still active) Price medallists for their complete careers up to 1972 and compare it with the h-index. It is shown that the g-index inherits all the good properties of the h-index and, in addition, better takes into account the citation scores of the top articles. This yields a better distinction between and order of the scientists from the point of view of visibility. I. Introduction Recently the physicist Hirsch (see Hirsch (2005)) introduced the so-called h-index – see also Ball (2005), Braun, Glänzel and Schubert (2005), Glänzel (2006a,b), Egghe and Rousseau (2006). For any general “set of papers” one can arrange these papers in decreasing order of the number of citations they received. The h-index is then the largest rank h = r such that the paper on this rank (and hence also all papers on rank 1,…,h) has h or more citations. Hence the papers on ranks h +1, h + 2, … have not more than h citations. Although introduced by a physicist, this new science indicator has been well-received in scientometrics (informetrics). In the above mentioned references it was argued that the hindex is a simple single number incorporating publication as well as citation data (hence comprising quantitative as well as qualitative or visibility aspects) and hence has an advantage over numbers such as “number of significant papers” (which is arbitrary) or “number of citations to each of the (say) q most cited papers” (which again is not a single number). The hindex is also robust in the sense that it is insensitive to a set of uncited (or lowly cited) papers but also it is insensitive to one or several outstandingly highly cited papers. This last aspect can be considered as a drawback of the h-index. Let us discuss this point further. Highly cited papers are, of course, important for the determination of the value h of the hindex. But once a paper is selected to belong to the top h papers, this paper is not “used” any more in the determination of h, as a variable over time. Indeed, once a paper is selected to the top group, the h-index calculated in subsequent years is not at all influenced by this paper’s 3 received citations further on: even if the paper doubles or triples its number of citations (or even more) the subsequent h-indexes are not influenced by this. We think it is an advantage of the h-index not to take into account the “tail” papers (with low number of citations) but it should (being a measure of overall citation performance) take into account the citation evolution of the most cited papers! In order to overcome this disadvantage, whilst keeping the advantages of the h-index, we make the following remark: by definition of the h-index, the papers on rank 1,…,h each have at least h citations, hence these h papers together have at least h 2 citations. But it could well be (see examples further on) that the first h + 1 papers have together (h + 1) or more citations 2 (here we use the fact that, most probably, the top papers have much more than h citations) and the same might be true for ranks h + 2 (the top (h + 2) papers having together at least (h + 2) citations) or even higher. 2 Therefore, in the Letter Egghe (2006a) we introduced a simple variant of the h-index: the gindex. Definition I.1: A set of papers has a g-index g if g is the highest rank such that the top g papers have, together, at least g 2 citations. This also means that the top g + 1 papers have less than (g + 1) 2 papers. The following proposition (also remarked in Egghe (2006a)) is trivial. Proposition I.2: In all cases one has that g≥h (1) Proof: Since h satisfies the requirement that the top h papers have at least h 2 papers and since g is the largest number with this property, it is clear that g ≥ h . 4 An example shows the easy calculation of the h-index and the g-index. The data are the author’s own citation data derived from the Web of Knowledge (WoK). It must be underlined, however, that the real citation data can be much higher due to several reasons: - only source journals, selected by Thomson ISI are used, - unclear citations (even to source journals, e.g. “to appear” etc.) are not counted in the WoK. In the table below TC stands for the total number of citations for each paper on rank r = 1, 2,... and ΣTC stands for the cumulative number of citations to the papers on rank 1,..., r (for each r). The bold face typed numbers give the explanation for the h-index h = 13 and the g-index g = 19 . Indeed h = 13 is the highest rank such that all papers on rank 1,..., h have at least 13 citations (and hence the papers on rank 14 or higher have not more than 13 citations). Also g = 19 is the highest rank such that the top 19 papers have at least 192 = 361 citations (here 381 > 361 ); on rank 20 we have 392 < 202 = 400 citations. Table 1. Ranking of the papers of L. Egghe according to their number of citations received (source: WoK). TC r ΣTC r2 47 42 37 36 21 18 17 16 16 16 15 13 1 2 3 4 5 6 7 8 9 10 11 12 47 89 126 162 183 201 218 234 250 266 281 294 1 4 9 16 25 36 49 64 81 100 121 144 13 13 13 12 12 12 13 14 15 16 17 18 307 320 333 345 357 369 169 196 225 256 289 324 12 11 19 20 381 392 361 400 5 In the last section of this article we will compare the h- and g-indexes of the (active) Price medallists (updated calculations of the h-index as in Glänzel and Persson (2005) and new calculations of the g-index) showing the advantage of the g-index above the h-index but in the next section we will give the mathematical theory of the g-index based on Lotka’s law f ( j) = C jα (2) j ≥ 1 , C > 0 , α > 2 (it will turn out that, if we let j to be arbitrary large – which we assume here for the sake of simplicity – we need to take α > 2 ). In case of (2) we will show that (T = total number of sources (= papers here)) ⎛ α −1 ⎞⎟ g = ⎜⎜ ⎜⎝ α − 2 ⎠⎟⎟ α−1 α 1 Tα (3) , hence by Glänzel (2006b) or Egghe and Rousseau (2006), since one showed there that 1 h = T α , we have ⎛ α −1 ⎞⎟ g = ⎜⎜ ⎜⎝ α − 2 ⎠⎟⎟ α−1 α h>h (4) Also the relation of g with the total number A of items (= citations here) is given. Before this theory is developed we will, firstly, show the general existence theorem for the g-index: for any set of papers we always have that the g-index exists and is unique. Note, cf. Braun, Glänzel and Schubert (2005), Egghe (2006a), Egghe and Rousseau (2006), that any set of papers can be taken here, e.g. the papers of a scientist but also a year’s production (articles) in a journal can be used. 6 II. Mathematical theory of the g-index First we will give a mathematically exact definition of the g-index in continuous variables. II.1 Mathematical definition of the g-index Let f ( j) ( j ≥ 1) denote the general size-frequency function of the system (which can be more general than the papers-citation relation: we can work in general information production processes (IPPs) where we have sources that produce items – cf. Egghe and Rousseau (1990), Egghe (2005)). We do not suppose f to be Lotkaian at this moment. Let g (r ) (r ∈ [0,T ]) denote the general rank-frequency function (the function g (r ) should not be confused with the gindex; we keep the f ( j) and g (r ) notation since this has been done in all previous articles and books on this topic – throughout the text it will be clear whether we deal with the function g (r ) or with the g-index g). The general (defining) relation between the functions f ( j) and g (r ) is as follows: r = g−1 ( j) = ∫ ∞ j f ( j') dj ' (5) Indeed, if r = g−1 ( j) (the inverse of the function g (r ) ) then g (r ) = j and there are r sources with an item density value larger than or equal to j. Denote r G (r ) = ∫ g (r ') dr ' 0 (6) the cumulative number of items in the sources up to rank r (i.e. the top r sources). Definition: The rank r is the g-index: r = g of this system if r is the highest value such that G (r ) ≥ r 2 (7) 7 Note that this is the exact formulation of the g-index as proposed in Section I in practical systems. II.2 Existence theorem for the g-index Theorem II.2.1 Every general system has a unique g-index. Proof: Define, for all r ∈ ]0,T ] H (r ) = G (r ) (8) r and we define H (0) = lim H (r ) = g (0) . We first prove that H strictly decreases on [0,T ] . Indeed r →0 > H '( r ) = rg (r ) − G ( r ) r2 <0 since r rg (r ) < G ( r ) = ∫ g ( r ') dr ' 0 since the function g is strictly decreasing (by (5)) for all values of r ∈ ]0,T ] . Since H (0) = lim H (r ) we hence have that H strictly decreases on [0,T ] . If H (T ) ≥ T then G (T ) ≥ T 2 r →0 > and since this is the largest possible value, we have the unique g-index g = T . Suppose now that H (T) < T . Define F(r ) = H (r ) − r F(r ) = G (r ) r −r (9) 8 Since H (0) > 0 (since the function g strictly decreases (by (5)) and by (8)) and H (T) < T we have F(0) > 0 and F(T ) < 0 . Hence, since F is continuous, there is a value r such that F(r ) = 0 . By (8) and (9) we hence have the existence of a value r such that G (r ) = r 2 Note that this satisfies (7) and that it is the highest possible value that satisfies (7): indeed, H strictly decreases, so, for every value r ' > r we have H ( r ') < H (r ) By (8): G ( r ') r' < G (r ) r =r G ( r ') < rr ' < r '2 contradicting (7). Hence this unique r value is the g-index: r = g . Note that, except if G (T ) ≥ T 2 , we can prove that the g-index always satisfies (7) with an equality sign instead of ≥. Now we will give formulae for the g-index in terms of parameters that appear in Lotkaian informetrics. II.3 Formulae for the g-index in Lotkaian systems If G (T ) ≤ T 2 then we know from the proof of Theorem II.2.1 that the g-index satisfies (7) with an equality sign: G (g ) = g 2 (10) 9 Otherwise (if G (T ) > T 2 ) we take g = T . We have the following theorem. Theorem II.3.1: Given the law of Lotka f ( j) = C jα (11) j ≥ 1 , C > 0 , α > 2 , we have that the g-index equals ⎛ α −1 ⎞⎟ g = ⎜⎜ ⎜⎝ α − 2 ⎠⎟⎟ ⎛ α −1 ⎞⎟ if ⎜⎜⎜ ⎟ ⎝ α − 2 ⎠⎟ α−1 α ⎛ α −1 ⎞⎟ T ≤ T and g = T if ⎜⎜ ⎜⎝ α − 2 ⎠⎟⎟ 1 α α−1 α α−1 α 1 Tα (12) 1 Tα > T . Here T denotes the total number of sources. Proof: First proof: The first (cf. (5)) r = g−1 ( j) = ∫ = ∞ j f ( j') dj ' C 1−α j α −1 (13) sources yield a total number of items (since α > 2 ) ∫ ∞ j j'f ( j') dj' = C 2−α j α−2 (14) 10 (cf. also Egghe (2005), Chapter II). So, by (10) we have r = g if C 2−α j = g2 α−2 (15) and if this g satisfies g ≤ T (otherwise take g = T ). By (13): 1 ⎛(α −1) r ⎞⎟1−α ⎟ j = ⎜⎜⎜ ⎜⎝ C ⎠⎟⎟ (16) (16) (for r = g ) in (15) yields α −2 ⎤ ⎡ ⎢ C ⎛⎜ α −1⎞⎟ α−1 ⎥ g=⎢ ⎟ ⎥ ⎜ ⎢ α − 2 ⎜⎝ C ⎠⎟ ⎥ ⎢⎣ ⎥⎦ ⎛ α −1 ⎞⎟ g = ⎜⎜ ⎜⎝ α − 2 ⎠⎟⎟ using that T = α−1 α α−1 α 1 Tα C as follows from (13) by taking j = 1 . This value is taken as the gα −1 index if it is ≤ T and we take g = T if it is strictly larger than T. Second proof: Now we work directly with formula (10). Note that Lotka’s law (11) is equivalent with Zipf’s law g (r ) = B rβ (17) 11 B, β > 0 , r ∈ ]0,T ] where we have the relations 1 ⎛ C ⎞⎟α−1 B = ⎜⎜ ⎜⎝ α −1⎠⎟⎟ β= (18) 1 α −1 (19) (cf. Egghe (2005), Exercise II.2.2.6 but see also the Appendix in Egghe and Rousseau (2006) where a proof is given.). Note that by (19) α > 2 is equivalent with 0 < β < 1 . If that is the case, (10) gives ∫ g 0 B dr = g 2 rβ hence, since 0 < β < 1 B 1−β g = g2 1− β Hence 1 ⎛ B ⎞⎟β+1 ⎟ g = ⎜⎜ ⎜⎝1 − β ⎠⎟⎟ (20) Now (18) and (19) in (20) again yield formula (12). Corollary II.3.2: If g is the g-index and h is the h-index of a Lotkaian system with exponent α > 2 , then ⎛ α −1 ⎞⎟ g = ⎜⎜ ⎜⎝ α − 2 ⎠⎟⎟ α−1 α h (21) 12 (if this value is ≤ T ; otherwise g = T ). Proof: 1 This follows readily from (12) and the fact that h = T α , see Egghe and Rousseau (2006) (also proved, approximatively, in Glänzel (2006b)). Taking j = 1 in (13) and (14) we see that the total number of sources T equals the total number of items A equals C and that α −1 C , hence α−2 μ= A α −1 = T α−2 (22) equals the average number of items per source (cf. also Egghe (2005), Chapter II). Hence we have the following corollary Corollary II.3.3: If μ is as above we have in case of (21) g=μ α−1 α (23) h The g-index in function of α and A is as in Corollary II.3.4. Corollary II.3.4: We have ⎛ α −1 ⎞⎟ g = ⎜⎜ ⎜⎝ α − 2 ⎠⎟⎟ (if this value is ≤ T ). α−2 α 1 Aα (24) 13 Proof: This follows readily from (22) and (12). We can also determine the item density j for which we have r = g . In practical cases this means the number of items in the source at rank g. Note that this is h for the h-index r = h , by definition of the Hirsch index. For the g-index we have: if the value in (12) is > T we have j = g (T ) = 1 and if the value in (12) is ≤ T we substitute r = g in (17), using (18) and (19) and the fact that T = C , α −1 yielding 1 ⎛ α − 2 ⎞⎟α T j = ⎜⎜ ⎜⎝ α −1 ⎠⎟⎟ (26) We immediately see that j < h which is logical since g > h and the item density is h in case r = h . Formula (26) presents a concrete formula for the item density cut-off place. Note that, although α can be any value α > 2 , we do not have lim j = 0 . Indeed since the α→2 > validity of (12) is limited to g ≤ T we have, by (12) that ⎛ α −1 ⎞⎟ ⎟ ⎜⎜⎜ ⎝ α − 2 ⎠⎟ α−1 α 1 Tα ≤ T from which it follows that μ= This implies in (26) that A α −1 = ≤T T α−2 (27) 14 1 ⎛ T 2 ⎞α j = ⎜⎜ ⎟⎟⎟ ≥ 1 ⎜⎝ A ⎠⎟ , by (27). The case ⎛ ⎞ ⎜⎜ α −1 ⎟⎟ ⎜⎝ α − 2 ⎠⎟ α−1 α 1 Tα > T (28) (hence where we take g = T ) occurs in the following case: from (28) it follows that α −1 >T α−2 (29) By (22) we have μ= A >T T hence A > T2 (30) Equivalently, (29) gives the condition in α : α< 2T −1 T −1 Note that 2T −1 >2 T −1 (31) 15 so that (31) can occur (with the condition α > 2 ). Remark II.3.5: It might seem strange that g = T is possible in this Lotkaian model. Note however that (22) implies that α= 2A − T A−T So, for every T fixed, if we let A → ∞ we have that α → 2 (but α > 2 ) so that we are within the limitations of our theory. In this case we have A = G (T ) > T 2 and hence g = T . In the next section we will apply the g-index to the publications and citations of the (still active) Price medallists and compare these g-indexes with the h-indexes of these same data. III. Calculation and comparison of the h- and gindexes of the (still active) Price medallists In Glänzel and Persson (2005), the h-indexes for the (still active) Price medallists are calculated. We could use these numbers and compare them with the here defined g-index. However for this we need to extend the tables in Glänzel and Persson (2005) (since g ≥ h ) and it is hardly impossible to do this since we should do this for the maximal citing time August 2005 (since then the tables in Glänzel and Persson (2005) were produced). So the easiest thing to do is to remake these tables for the present time (January 2006) and make them long enough so that, on the same tables, the h- as well as the g-index can be calculated. 16 We have opted not to limit the publication year to 1986 or higher (as was the case in Glänzel and Persson (2005)). Indeed, the h-indexes in Glänzel and Persson (2005) seemed a bit unnatural in several senses. E. Garfield did not have the highest h-index (which we, normally, could expect) and H. Small scored lowest of all the Price medallists. The major reason for these observations is that, by limiting the publication year to 1986 or higher, one cuts away most publications (and perhaps the highest cited ones) of the relatively older scientists. Since we want to make a comparison of scientists (and not to draw conclusions on informetrics fields) we decided not to limit the publication year (except to the evident limit 1972 since before that date the ISI (now Thomson ISI) data do not exist. For the same reason we count all publications even if scientists have published in different domains (e.g. T. Braun in chemistry and L. Egghe in mathematics). These publications were not used in the Glänzel and Persson study. Of course, by not limiting the publication period and the publication field, one might argue that there is a bias towards the older scientists. This is true but, with the h- and g-indexes, we want to indicate the “overall performance (visibility)” of the scientists as they are viewn today (in the sense of “lifetime achievement”). We base ourselves on the Web of Knowledge (WoK) and hence we are limited to the Thomson ISI data. This means that no citations to non-source journals or conference proceedings articles or books are counted. In addition, no citations to incomplete references are counted even if they are to source journal articles (e.g. a citation to JASIST, 2001, to appear): these are not collected in the WoK “times cited” data. So the actual h- and g-indexes can be somewhat higher but this effect plays for every scientist so that comparisons are still possible and also these limitations do not jeopardise the possibility to compare the h- and gindex. The tables of citation data of the (still active) medallists are found in the Appendix. The table stops one line below the g-index since this is all we need. The number r denotes the rank of the publication and TC denotes the total number of citations to the paper on rank r. The number ΣTC denotes the cumulative number of citations to the first r ranked papers. Finally, also the table of r 2 values is presented as well as the publication year (PY) of the article on 17 rank r. The h- and g-index determination is highlighted in the tables in the Appendix. Table 2 gives the results in decreasing order of h and g. Table 2. h- and g-indexes of Price medallists in decreasing order Name h-index Name g-index Garfield Narin 27 27 Garfield 59 Narin 40 Braun 25 Small 39 Van Raan 19 Braun 38 Glänzel Moed Schubert Small 18 18 18 18 Schubert 30 Martin 16 Glänzel Martin Moed Van Raan 27 27 27 27 Egghe Ingwersen Leydesdorff Rousseau 13 13 13 13 Ingwersen 26 White 25 White 12 Egghe Leydesdorff 19 19 Rousseau 15 We leave the detailed (subjective) interpretation of Table 2 to the reader but it is clear that the g-index column is more in line with intuition and with the raw data in the Appendix than the h-index column. In other words, the g-index, as simple as the h-index (a single measure, containing publication and citation elements), contains more comparative information from the raw data than the h-index and resembles more the overall feeling of “visibility” or “life time achievement”. A possible interesting measure is g , i.e. the relative increase of g with respect to h. The result h is presented in Table 3, in decreasing order of respect to the h- or g-orderings. g . Here we see remarkable order changes with h 18 Table 3. g -values of Price medallists in decreasing order h Name g h Garfield 2.19 Small 2.17 White 2.08 Ingwersen 2.00 Martin 1.69 Schubert 1.67 Braun 1.52 Glänzel Moed 1.50 1.50 Narin 1.48 Egghe Leydesdorff 1.46 1.46 Van Raan 1.42 Rousseau 1.15 19 Appendix Tables of TC, r, ΣTC , r 2 and PY for each of the (still active) Price medallists and determination of the h- and g-index. Garfield E. TC r ΣTC r2 PY 625 149 138 132 132 129 127 111 109 108 107 105 104 101 96 91 89 88 87 85 80 67 63 41 29 28 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 625 774 912 1044 1176 1305 1432 1543 1652 1760 1867 1972 2076 2177 2273 2364 2453 2541 2628 2713 2793 2860 2923 2964 2993 3021 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 324 361 400 441 484 529 576 625 676 1972 1980 1977 1983 1981 1979 1996 1978 1975 1985 1984 1982 1986 1976 1973 1976 1974 1986 1987 1979 1985 1988 1999 1980 1990 1987 27 26 26 23 23 20 19 19 18 18 18 27 28 29 30 31 32 33 34 35 36 37 3048 3074 3100 3123 3146 3166 3185 3204 3222 3240 3258 729 784 841 900 961 1024 1089 1156 1225 1296 1369 1987 1976 1992 1978 1990 1990 1998 1998 1985 1979 1996 20 16 15 14 13 13 13 13 13 12 12 12 12 12 11 11 10 10 9 9 9 9 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 3274 3289 3303 3316 3329 3342 3355 3368 3380 3392 3404 3416 3428 3439 3450 3460 3470 3479 3488 3497 3506 1444 1521 1600 1681 1764 1849 1936 2025 2116 2209 2304 2401 2500 2601 2704 2809 2916 3025 3136 3249 3364 1979 1990 1976 1973 1973 1973 1998 1990 1973 2000 1998 1997 1996 1998 1997 1985 1984 1984 1975 1972 2002 9 9 59 60 3515 3524 3481 3600 1998 1990 TC r ΣTC r2 PY 125 124 78 66 57 57 55 51 43 42 38 37 37 35 35 33 32 31 31 28 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 125 249 327 393 450 507 562 613 656 698 736 773 810 845 880 913 945 976 1007 1035 1062 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 324 361 400 441 1978 1989 1986 1975 1974 1990 1974 1989 1992 1974 1983 1995 1994 1980 1999 1988 1995 1975 1995 1977 1973 Braun T. 21 27 27 26 22 23 24 1089 1116 1142 484 529 576 1988 1987 2000 26 25 25 23 23 23 23 22 22 22 22 21 21 25 26 27 28 29 30 31 32 33 34 35 36 37 1168 1193 1218 1241 1264 1287 1310 1332 1354 1376 1398 1419 1440 625 676 729 784 841 900 961 1024 1089 1156 1225 1296 1369 1994 1973 1972 1978 1973 1994 1987 1983 1982 1980 1987 1973 1973 20 20 38 39 1460 1480 1444 1521 1982 1982 TC r ΣTC r2 PY 305 239 127 109 86 80 77 75 67 49 44 36 26 26 25 22 22 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 305 544 671 780 866 946 1023 1098 1165 1214 1258 1294 1320 1346 1371 1393 1415 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 1973 1974 1978 1974 1977 1985 1985 1985 1999 1979 1980 1980 1981 1986 1976 1997 1993 18 18 15 12 10 9 8 8 18 19 20 21 22 23 24 25 1433 1451 1466 1478 1488 1497 1505 1513 324 361 400 441 484 529 576 625 1974 1994 1999 1986 1989 1975 1998 1987 Small H. 22 7 6 5 5 5 3 3 2 2 2 1 1 1 26 27 28 29 30 31 32 33 34 35 36 37 38 1520 1526 1531 1536 1541 1544 1547 1549 1551 1553 1554 1555 1556 676 729 784 841 900 961 1024 1089 1156 1225 1296 1369 1444 1989 1998 1977 1974 1999 1979 1995 1975 2004 2003 1973 2004 1997 1 1 39 40 1557 1558 1521 1600 1996 1992 Van Raan A.F.J. TC r ΣTC r2 PY 108 51 49 41 35 32 31 30 25 25 23 22 22 21 20 19 19 19 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 108 159 208 249 284 316 347 377 402 427 450 472 494 515 535 554 573 592 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 324 1985 1996 1991 1985 1991 1973 1990 1990 1993 1974 1995 1998 1997 2000 2001 1998 1998 1994 19 18 18 17 17 17 15 14 19 20 21 22 23 24 25 26 611 629 647 664 681 698 713 727 361 400 441 484 529 576 625 676 1994 1998 1993 1993 1985 1980 1993 2001 14 14 27 28 741 755 729 784 1994 1991 23 Martin B. TC r ΣTC r2 PY 156 74 52 38 35 33 33 30 29 28 24 23 22 20 19 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 156 230 282 320 355 388 421 451 480 508 532 555 577 597 616 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 1983 1997 1985 1983 2001 1987 1985 1995 1996 1984 1988 1981 1999 1984 1985 18 16 16 16 16 14 14 11 9 9 9 16 17 18 19 20 21 22 23 24 25 26 634 650 666 682 698 712 726 737 746 755 764 256 289 324 361 400 441 484 529 576 625 676 1986 1996 1986 1985 1984 1991 1984 1986 1994 1989 1987 6 4 27 28 770 774 729 784 1982 1992 TC r ΣTC r2 PY 112 95 86 82 73 71 70 63 59 1 2 3 4 5 6 7 8 9 112 207 293 375 448 519 589 652 711 1 4 9 16 25 36 49 64 81 1997 1987 1976 1976 1977 1991 1972 1985 1992 Narin F. 24 55 55 53 52 52 44 41 38 37 35 33 33 29 28 28 28 27 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 766 821 874 926 978 1022 1063 1101 1138 1173 1206 1239 1268 1296 1324 1352 1379 100 121 144 169 196 225 256 289 324 361 400 441 484 529 576 625 676 1978 1973 1975 1991 1981 1977 1980 2000 1980 1999 1989 1987 1994 1996 1977 1976 1984 27 26 24 23 20 19 18 18 17 14 13 12 10 27 28 29 30 31 32 33 34 35 36 37 38 39 1406 1432 1456 1479 1499 1518 1536 1554 1571 1585 1598 1610 1620 729 784 841 900 961 1024 1089 1156 1225 1296 1369 1444 1521 1983 1988 1988 1995 1998 1994 1980 1979 1978 1996 1983 1986 1977 10 9 40 41 1630 1639 1600 1681 1972 1983 TC r ΣTC r2 PY 124 90 78 59 57 40 33 32 27 27 27 26 1 2 3 4 5 6 7 8 9 10 11 12 124 214 292 351 408 448 481 513 540 567 594 620 1 4 9 16 25 36 49 64 81 100 121 144 1989 2002 1986 1978 1990 1979 1988 1983 1988 1987 1984 2000 Schubert A. 25 26 23 23 22 19 13 14 15 16 17 646 669 692 714 733 169 196 225 256 289 1994 1994 1987 1987 1986 18 18 18 18 17 17 17 16 15 14 14 14 18 19 20 21 22 23 24 25 26 27 28 29 751 769 787 805 822 839 856 872 887 901 915 929 324 361 400 441 484 529 576 625 676 729 784 841 2000 1993 1986 1984 2001 1988 1982 1982 2002 1993 1989 1985 13 12 30 31 942 954 900 961 1992 1996 TC r ΣTC r2 PY 124 54 33 32 32 31 28 27 27 27 26 24 23 23 22 22 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 124 178 211 243 275 306 334 361 388 415 441 465 488 511 533 555 575 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 1989 1988 1988 1995 1983 1995 1995 1988 1987 1984 1994 2001 1994 1987 2002 1987 1994 19 18 18 18 18 17 17 18 19 20 21 22 23 24 594 612 630 648 666 683 700 324 361 400 441 484 529 576 1986 1994 1993 1986 1984 2001 1988 Glänzel W. 26 16 16 25 26 716 732 625 676 1999 1996 15 15 27 28 747 762 729 784 1997 1996 Moed F.H. TC r ΣTC r2 PY 108 56 54 54 49 41 35 31 26 26 24 23 23 22 22 20 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 108 164 218 272 321 362 397 428 454 480 504 527 550 572 594 614 634 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 1985 1995 1996 1995 1991 1985 1991 1990 2002 1989 1996 1999 1998 2002 1991 2001 1999 18 17 15 15 13 13 12 12 9 18 19 20 21 22 23 24 25 26 652 669 684 699 712 725 737 749 758 324 361 400 441 484 529 576 625 676 1989 1985 1999 1993 1998 1993 2002 1993 1999 9 9 27 28 767 776 729 784 1996 1996 Leydesdorff L. TC r ΣTC r2 PY 79 32 26 24 23 1 2 3 4 5 79 111 137 161 184 1 4 9 16 25 2000 1998 1986 1989 1990 27 22 19 17 17 16 15 13 6 7 8 9 10 11 12 206 225 242 259 275 290 303 36 49 64 81 100 121 144 1987 1989 1996 1991 1997 1994 1994 13 13 11 11 11 10 13 14 15 16 17 18 316 329 340 351 362 372 169 196 225 256 289 324 1993 1989 2000 1993 1992 1998 10 9 19 20 382 391 361 400 1997 1992 TC r ΣTC r2 PY 47 42 37 36 21 18 17 16 16 16 15 13 1 2 3 4 5 6 7 8 9 10 11 12 47 89 126 162 183 201 218 234 250 266 281 294 1 4 9 16 25 36 49 64 81 100 121 144 1990 1985 2000 1992 1992 1991 1986 1995 1988 1986 1993 1996 13 13 13 12 12 12 13 14 15 16 17 18 307 320 333 345 357 369 169 196 225 256 289 324 1996 1990 1988 2000 1994 1988 12 11 19 20 381 392 361 400 1987 2000 Egghe L. Rousseau R. TC r ΣTC r2 PY 25 1 25 1 1996 28 18 18 16 16 15 15 14 13 13 13 13 2 3 4 5 6 7 8 9 10 11 12 43 61 77 93 108 123 137 150 163 176 189 4 9 16 25 36 49 64 81 100 121 144 2003 1991 1995 1988 1987 1992 1994 2002 1999 1996 1996 13 13 202 169 1993 12 14 214 196 2000 12 12 15 16 226 238 225 256 2000 1990 Ingwersen P. TC r ΣTC r2 PY 120 93 83 79 52 37 31 29 29 19 17 15 1 2 3 4 5 6 7 8 9 10 11 12 120 213 296 375 427 464 495 524 553 572 589 604 1 4 9 16 25 36 49 64 81 100 121 144 1996 1998 1997 1982 2001 1997 1984 1997 1987 2000 1984 1996 14 10 10 8 7 7 6 6 5 3 3 3 3 13 14 15 16 17 18 19 20 21 22 23 24 25 618 628 638 646 653 660 666 672 677 680 683 686 689 169 196 225 256 289 324 361 400 441 484 529 576 625 1997 2001 1992 1999 1999 1995 2000 1993 2000 2001 2000 2000 1994 3 3 26 27 692 695 676 729 1994 1992 29 White H.D. TC r ΣTC r2 PY 128 106 103 45 37 28 22 21 20 15 14 1 2 3 4 5 6 7 8 9 10 11 128 234 337 382 419 447 469 490 510 525 539 1 4 9 16 25 36 49 64 81 100 121 1981 1998 1989 1997 1982 1981 1983 1987 2001 1987 1986 14 12 12 12 12 11 10 8 6 5 5 5 5 12 13 14 15 16 17 18 19 20 21 22 23 24 553 565 577 589 601 612 622 630 636 641 646 651 656 144 169 196 225 256 289 324 361 400 441 484 529 576 1985 2003 1996 1981 1981 2003 1990 1986 2001 2004 1986 1984 1977 4 4 25 26 660 664 625 676 2003 1986 IV. Conclusions and open problems In this paper we studied the g-index being an improvement of the h-index. The g-index g is the largest rank (where papers are arranged in decreasing order of the number of citations they received) such that the first g papers have (together) at least g 2 citations. We show that g ≥ h and that g always uniquely exists. We present formulae for g in Lotkaian informetrics. We show that 30 ⎛ α −1 ⎞⎟ g = ⎜⎜ ⎜⎝ α − 2 ⎠⎟⎟ α−1 α ⎛ α −1 ⎞⎟ g = ⎜⎜ ⎜⎝ α − 2 ⎠⎟⎟ T α−1 α 1 α h if these values are ≤ T ; otherwise g = T . Here α is the Lotka exponent and T denotes the total number of sources (in the citation application this means the total number of ever cited papers). We then calculate the h- and g-indexes of the (still active) Price medallists. Different than in Glänzel and Persson (2005) we do not limit the publication period (except for the fact that we do not use papers older than published in 1972 due to the fact that ISI has no data for them) nor do we limit the topic to informetrics, hence the complete careers (up to 1972) of the Price medallists are considered. It is found that the ranked g-index column resembles more the overall feeling of “visibility” or “life time achievement” than does the ranked h-index column. We leave open the further exploration of the g-index, including the establishment of the gindex in function of time. In Egghe (2006b) we were able to do this for the h-index based on the cumulative nnt citation distribution (see Egghe and Rao (2001)) and in a forthcoming paper we will do the same for the g-index based on a time-dependent Lotkaian theory. We also leave open the construction of other h- or g-like indexes and the comparison of these new indexes with the h- and g-index. It would also be interesting to work out more practical cases (in other fields) of h- and g-index comparisons. Such case studies can learn a lot on the advantages and/or disadvantages of the h-index and the g-index. 31 References Ball P. (2005). Index aims for fair ranking of scientists. Nature 436, 900. Braun T., Glänzel W. and Schubert A. (2005). A Hirsch-type index for journals. The Scientist 19(22), 8. Egghe L. (2005). Power Laws in the Information Production Process: Lotkaian Informetrics. Elsevier, Oxford (UK). Egghe L. (2006a). An improvement of the h-index: the g-index. Preprint. Egghe L. (2006b). Dynamic h-index: the Hirsch index in function of time. Preprint. Egghe L. and Rao I.K.R. (2001). Theory of first-citation distributions and applications. Mathematical and Computer Modelling 34(1-2), 81-90. Egghe L. and Rousseau R. (1990). Introduction to Informetrics. Quantitative Methods in Library, Documentation and Information Science. Elsevier, Amsterdam (the Netherlands). Egghe L. and Rousseau R. (2006). An informetric model of the Hirsch index. Preprint. Glänzel W. (2006a). On the opportunities and limitations of the H-index. Science Focus 1, to appear. Glänzel W. (2006b). On the H-index – A mathematical approach to a new measure of publication activity and citation impact. Scientometrics 67(2), to appear. Glänzel W. and Persson O. (2005). H-index for Price medallist. ISSI Newsletter 1(4), 15-18. Hirsch J.E. (2005). An index to quantify an individual’s scientific research output. arxiv:physics/0508025.