Observation and assessment of international research and development activities
by means of frequency analyses of the world's largest bibliographic databases

 

3   Methodology for analysing very large bibliographic database systems

 
The objects of analysis are the sets of document descriptions contained in the bibliographic databases ("document" is used here as a synonym for "publication").

Within scientometrics, the analysis of publication sets is divided into citation analysis (i.e. analysis of authors' citation behavior) and frequency analysis. Citation analysis is not discussed in this paper.

Frequency analyses of bibliographic databases yield a quantitative measure of the productivity of the authors of a country, an institution, a research centre, etc.
For this purpose, the total number of document descriptions relating to a defined topic is first determined in the leading databases. Then the frequencies of identical text elements in the individual fields of these document descriptions are analysed.
The most suitable fields of the document descriptions are: Controlled Terms (so-called descriptors and non-descriptors), Classification Codes, Publication Year, Country of Author, Corporate Source, and Type of Publication. Other description elements can also be used.
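
The counting step described above can be sketched in a few lines of Python. This is only an illustration: the record layout, the field names (`publication_year`, `country_of_author`, `type_of_publication`, `controlled_terms`) and the sample records are assumptions made for the example and do not come from any of the databases named in this text.

```python
from collections import Counter

# Hypothetical document descriptions; field names and values are illustrative only.
records = [
    {"publication_year": 1994, "country_of_author": "USA",
     "type_of_publication": "journal article",
     "controlled_terms": ["nanotechnology", "thin films"]},
    {"publication_year": 1994, "country_of_author": "USA",
     "type_of_publication": "research report",
     "controlled_terms": ["nanotechnology"]},
    {"publication_year": 1995, "country_of_author": "Germany",
     "type_of_publication": "journal article",
     "controlled_terms": ["nanotechnology", "thin films"]},
]


def field_frequencies(records, field):
    """Count how often each value occurs in one field across all records."""
    counts = Counter()
    for record in records:
        value = record.get(field)
        if value is None:
            continue  # the field is missing in this document description
        # Multi-valued fields (e.g. controlled terms) are counted once per record.
        values = value if isinstance(value, (list, tuple, set)) else [value]
        counts.update(set(values))
    return counts


by_country = field_frequencies(records, "country_of_author")
by_term = field_frequencies(records, "controlled_terms")
print(by_country.most_common())
print(by_term.most_common())
```

The same function works for any field, so one pass per field gives the frequency distribution by year, country, corporate source, or publication type.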

Example of a practical task
The frequencies of publications in the subject area "Nanotechnology" are determined separately for
 - a particular year (e.g. all publications in 1996);
 - a group of specific institutions (e.g. all publications of US universities);
 - a given type of publication (e.g. all research reports).

 
By combining descriptive elements (fields), combined analysis results can be obtained.
Example of a practical task
The frequency of research reports on "Nanotechnology" published by US universities in 1994 is determined.
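
A combined query of this kind amounts to a logical AND over several fields. The following minimal sketch assumes records held as Python dictionaries. The field names, the helper `count_matching`, and the sample data are hypothetical and serve only to illustrate the principle.

```python
# Hypothetical document descriptions; field names and values are illustrative only.
sample_records = [
    {"publication_year": 1994, "country_of_author": "USA",
     "corporate_source": "university", "type_of_publication": "research report"},
    {"publication_year": 1994, "country_of_author": "USA",
     "corporate_source": "university", "type_of_publication": "journal article"},
    {"publication_year": 1996, "country_of_author": "USA",
     "corporate_source": "university", "type_of_publication": "research report"},
    {"publication_year": 1994, "country_of_author": "Germany",
     "corporate_source": "university", "type_of_publication": "research report"},
]


def count_matching(records, **criteria):
    """Count records whose fields equal every given criterion (logical AND)."""
    return sum(
        all(record.get(field) == wanted for field, wanted in criteria.items())
        for record in records
    )


# Research reports published by US universities in 1994.
hits = count_matching(
    sample_records,
    publication_year=1994,
    country_of_author="USA",
    corporate_source="university",
    type_of_publication="research report",
)
print(hits)
```

Dropping one criterion widens the query. For example, leaving out `publication_year` counts the research reports of US universities for all years.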

 
By comparing publication frequencies across different periods, frequency trends become recognizable and can be presented in graphical form (graph).
By fitting trend curves, the further development of publication output can be forecast for a short period (1-2 years).
Example of a practical task
The changes in the frequency of publications in the area of Nanotechnology are shown, broken down by publication year and publication type, for the years 1990 - 1996
      - at US universities,
      - at German universities.
A trend curve gives a forecast (an estimate) of the further development of the publication frequencies for the years 1997 and 1998.
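
The text does not say which kind of trend curve is used. As a minimal sketch, the following example assumes the simplest case, a straight line fitted by ordinary least squares. For input it takes the China column (1989 - 1994) from the data table on "Thin organic films" given later in this section and extrapolates two years ahead. The function name `linear_trend` is chosen for this illustration.

```python
def linear_trend(years, counts):
    """Fit counts = slope * year + intercept by ordinary least squares."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Publication counts for China, taken from the data table in this section.
years = [1989, 1990, 1991, 1992, 1993, 1994]
china = [2, 7, 18, 25, 22, 30]

slope, intercept = linear_trend(years, china)

# Short-term forecast for the two following years (an estimate, not a measurement).
forecast = {year: slope * year + intercept for year in (1995, 1996)}
for year, value in forecast.items():
    print(year, round(value, 1))
```

With only six data points, such a forecast is a rough estimate, which is in line with the restriction to a horizon of 1-2 years stated above.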
The analyses take place in a database cluster. This cluster (or complex) unites those databases which, with respect to the topic of analysis, are the largest and most representative bibliographic databases in the world.
This means that the analysis is performed in a single pass on a much larger quantity of publications than an analysis using separate searches in individual databases.

In this way a clearly higher number of hits, i.e. document descriptions that identify relevant publications, can be achieved. Duplicates, which may occur if a relevant publication is recorded in more than one database of the cluster, are found and discarded by a program routine.
This method yields a more accurate picture of the actual frequency distribution of publications than searches in individual databases can.
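
The text does not describe how the duplicate-removal routine works. The following sketch shows one plausible approach under stated assumptions: two document descriptions are treated as the same publication if their normalized title, publication year, and first author agree. The matching key, the field names, and the sample records are hypothetical.

```python
import re


def dedup_key(record):
    """Build a comparison key from normalized title, year, and first author."""
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return (title, record["year"], record["first_author"].strip().lower())


def merge_cluster(databases):
    """Merge the hits of several databases, keeping one record per publication."""
    merged = {}
    for db_name, records in databases.items():
        for record in records:
            key = dedup_key(record)
            if key not in merged:
                # First occurrence: keep the record and remember where it came from.
                merged[key] = {**record, "sources": [db_name]}
            else:
                # Duplicate: only note the additional source database.
                merged[key]["sources"].append(db_name)
    return list(merged.values())


# Hypothetical hits from two databases of the cluster.
cluster_hits = {
    "INSPEC": [
        {"title": "Thin Organic Films: Growth and Structure", "year": 1993,
         "first_author": "Smith, J."},
        {"title": "Optical properties of organic films", "year": 1994,
         "first_author": "Tanaka, K."},
    ],
    "CA": [
        {"title": "Thin organic films - growth and structure", "year": 1993,
         "first_author": "smith, j."},
    ],
}

unique_records = merge_cluster(cluster_hits)
print(len(unique_records))
for record in unique_records:
    print(record["title"], record["sources"])
```

A key based on exact agreement after normalization is deliberately conservative: it misses duplicates whose titles are transliterated or abbreviated differently, so a production routine would likely need fuzzier matching.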

Example
Frequency analyses of publications on the subject “thin organic films” (a topical research field in solid-state physics), carried out by retrievals in the databases CA, BIOSIS, MEDLINE, COMPENDEX, EMBASE, INSPEC, and JICST-EPLUS, give the following results:
 
                              Analysis of the single databases                     Analysis of the database cluster
Number of analysed records    between 2.3 million (JICST) and 12.3 million (CA)    48.3 million
Number of hits                between 100 (EMBASE) and 1300 (INSPEC)               3000

Data as of May 1995

 
Graph 1: The example "Thin organic films" shows the difference between separate analyses of the individual databases and a joint analysis of the same databases combined into a database complex (database cluster).

Graph 1

Comment: The database cluster in graph 1 comprises all seven databases given in table 1 (part 2 of this text). For clarity, only three of these databases are shown separately in graph 1. Duplicate document descriptions have been removed from the hits found in the database cluster.
The results of database analyses are prepared as text, list, data table, and graphic representation. These results are offered to users as a combination of all four types.
Example of a text
"In the research field “Thin Organic Films”, a joint analysis of the largest and most relevant international online databases for the period 1985-1994 shows a leading position of the USA, whose publication frequencies exceed those of other high-tech countries several times over. The ranking of the next six countries is: Japan, China, Germany, France, Great Britain, Russia. A trend worth noting is the steep rise in the number of publications by Chinese authors, beginning in the early 1990s.  ... "
Example of a list
Ranking of Chinese research institutes by number of publications on the subject “Thin organic films”

Research institute                                     Publications
Academy of Sciences, Physics Institute, Beijing        96
Beijing University, Physics Institute                  58
Tsinghua Univ. Beijing, Materials Science Institute    17
Nanking University, Physics Institute                  16

Data as of May 1995
Example of a data table
Publication frequencies on the subject “Thin organic films”

Year     USA    Japan    China    Germany    France    GB
1989      50       13        2          7         7     6
1990      46       24        7         10         4     5
1991      56       19       18          3        10     6
1992      89       31       25          9        10     4
1993      49       26       22         11         7     6
1994      67       26       30         20         8    10
Total    357      139      104         60        46    37

Data as of May 1995
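
As a small worked check, the Total row of the table can be recomputed from the yearly rows, and each country's share of all listed publications can be derived from it. The figures below are copied from the table. The variable names are chosen for this illustration.

```python
countries = ["USA", "Japan", "China", "Germany", "France", "GB"]

# Yearly publication counts, copied from the data table above.
table = {
    1989: [50, 13, 2, 7, 7, 6],
    1990: [46, 24, 7, 10, 4, 5],
    1991: [56, 19, 18, 3, 10, 6],
    1992: [89, 31, 25, 9, 10, 4],
    1993: [49, 26, 22, 11, 7, 6],
    1994: [67, 26, 30, 20, 8, 10],
}

# Column sums reproduce the "Total" row.
totals = dict(zip(countries, (sum(column) for column in zip(*table.values()))))

# Share of each country among the six countries listed.
grand_total = sum(totals.values())
shares = {country: count / grand_total for country, count in totals.items()}

for country in countries:
    print(f"{country:8s} {totals[country]:4d} {shares[country]:6.1%}")
```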
Example of a graphic representation
Publications in the research field “Thin organic films”, ordered by country of author and year of publication (graph 2).

Graph 2


All figures refer exclusively to publications that have “Thin organic films” as their main topic.