Espirales. Revista multidisciplinaria de investigación científica, vol. 5, No. 36

January-March 2021. e-ISSN 2550-6862. pp. 1-16

DOI: 10.31876/er.v5i36.775

Received: March 3, 2020

Approved: October 17, 2021

Cite this:

Morales, C., Radicelli, C., & Pomboza, M. (2021). Data analysis tools for the study of scientific citations. Espirales. Revista Multidisciplinaria de investigación científica, 5(36), 1-16.

Data analysis tools for the study of scientific citations*

Herramientas de análisis de datos para el estudio de las citaciones científicas

Cristian Morales Alarcón**, Ciro Radicelli García***, Margarita Pomboza Floril****

Abstract

This research studied scientific citations with the aim of identifying aspects that may influence the citations received by a higher education institution. We analyzed 219 records of publications from the National University of Chimborazo and 10304 records of manuscripts from Ecuador. The work had a qualitative approach and a systemic design. The results show that the impact of scientific publications is reflected in the number of citations received by the documents that higher education institutions publish; in this sense, the most cited publications are not related to the number of authors or to the volume of the journal in which they appear, but are supported by quality research and correspond mostly to applied sciences.

Keywords: scientific publications, data analysis, research methodology, higher education, quality in education.

* Original article derived from the project: “Design of strategies for continuous improvement in academic and research management at UNACH, using data mining techniques.”

** Master in Information Systems Management and Business Intelligence. Agricommerce Cía. Ltda., Riobamba, Ecuador. E-mail: cristianmorales18m@gmail.com. ORCID: 0000-0002-0197-0581.

*** PhD in Telecommunications. Universidad Nacional de Chimborazo, Riobamba, Ecuador. E-mail: cradicelli@unach.edu.ec. ORCID: 0000-0001-9188-0514.

**** PhD in Design, Manufacturing and Management of Industrial Projects. Universidad Nacional de Chimborazo, Riobamba, Ecuador. E-mail: margaritapomboza@unach.edu.ec. ORCID: 0000-0002-4820-493X.

Resumen

Este trabajo de investigación realizó un estudio de citaciones científicas con el objetivo de identificar aspectos que puedan influir en las citaciones de una institución de educación superior. Se analizaron 219 registros de publicaciones de la Universidad Nacional de Chimborazo y 10304 registros de manuscritos del Ecuador. Este trabajo tuvo un enfoque cualitativo y un diseño sistémico. Como resultado se obtuvo que el impacto de las publicaciones científicas se ve reflejado por el número de citas que tienen los documentos publicados por las instituciones de educación superior; en este sentido las publicaciones con mayores citas no se encuentran relacionadas al número de autores ni al volumen de la revista publicada sino a una investigación de calidad y corresponden en su mayoría a ciencias aplicadas.

Palabras clave: publicaciones científicas, análisis de datos, metodología de investigación, educación superior, calidad en la educación.

Introduction

At present, the explosion, assimilation and intensive use of knowledge have led to what has been called the knowledge society, in which the management of information, documentation and knowledge is emerging as a strategic component of higher education institutions (HEI). In Ecuador, HEI have carried out self-evaluation, evaluation and institutional recategorization processes since 2003, directed by the Council for Evaluation, Accreditation and Quality Assurance of Higher Education (CEAACES, for its acronym in Spanish), now the Higher Education Quality Assurance Council (CACES, for its acronym in Spanish). These processes have led to quality measurements in different areas, among them the scientific production of both teachers and students belonging to research groups. Specifically, they analyze the number of publications in journals with high global impact, the production of regional impact, and the publication of books and book chapters (CEAACES, 2018), as contemplated in the Institutional Evaluation Model of Universities and Polytechnic Schools.

High-impact scientific publications serve as a quality indicator (Radicelli et al., 2018), which in turn is measured by the number of publications in journals indexed in the ISI Web of Knowledge and SCImago Journal Rank scientific databases (Ganga, Paredes, & Pedraja, 2015).

In addition, the number of citations received at a given time by the scientific articles published in a specific journal is considered. To measure this quality, entities such as the Institute for Scientific Information (ISI), attached to the company Thomson Reuters, use the Journal Citation Reports (JCR), a database that presents detailed figures on publications and their citations. Besides the JCR, other databases measure the quality of published documents, such as Scopus, which is attached to Elsevier and is mainly run by the SCImago research group of Spain (Valderrama, 2012), and which uses the SCImago Journal and Country Rank (SJR) and the SCImago Institutions Ranking (SIR) as indicators.

The volumes of data stored in databases allow the information to be processed completely, in phases that comprise pre-processing, data mining itself and post-processing. To facilitate the retrieval and delivery of information by personnel who work with large volumes of data, such as librarians, the field has opened up to other professions that are called on to cooperate: systems designers, data providers, publishers, vendors, archivists, engineers and specialists in electronic text encoding, among others. Their opinions and experience will allow the development of adequate interfaces that facilitate the location, manipulation, retrieval and use of digital information.

In this regard, Valcárcel (2004) states that data mining (minería de datos) is the process of extracting knowledge from databases in order to discover anomalous or interesting situations, as well as trends, patterns and sequences in the data. For their part, Molina and Ribiero (2001) clarify that data mining integrates a set of areas whose purpose is to identify knowledge in databases that can guide decision-making. Likewise, Molina (2002) indicates that data mining is the non-trivial process of identifying valid, novel, potentially useful and understandable patterns hidden in the data.

For this purpose, new tools have been created to facilitate access to the information that accumulates daily. One of the most widely used is text mining, which makes it possible to explore large amounts of unstructured text, establish patterns and extract useful knowledge. Text mining thus refers to the examination of a collection of documents in order to discover information that is not explicit in the analyzed text (Nasukawa, Kawano, & Arimura, 2001).

The importance of text mining lies in the effectiveness of its predictive models, which have saved time and money and have improved the capacity to respond to the needs of interested parties. The use of computer tools for discovering and processing information will therefore improve the knowledge management process.

It is also important to mention that, under current conditions, information is an economic resource that is highly valued not only for its intrinsic properties but also because it improves the use of the other resources of an organization. The search for regularities or patterns in texts, based on machine learning techniques, is therefore of great help in discovering knowledge that is not present in any single text but arises when the content of several texts is related.

All these applications are perfectly transferable, for example, to the management of information within HEI libraries, which are called on to redefine their role both inside and outside the institution. There are many approaches to defining text mining as a knowledge management tool. In general, it applies machine learning techniques, considered one of the many branches of computational linguistics, to find the patterns mentioned above in the unstructured texts that organizations commonly use, such as reports, emails and meeting minutes.

This work focused on analyzing scientific publications and their citations using the information registered in the Scopus database, both for the Universidad Nacional de Chimborazo (UNACH) and for scientific publications produced in Ecuador, over a period of approximately six years (2013 to 2019). Data analytics and text mining tools were used to identify aspects that may influence the citations of an HEI.

Materials and Methods

This work had a qualitative approach because significant research areas or topics were determined, and data collection and analysis were carried out first in order to discover, refine and answer the research questions. It followed a systemic design based on the CRISP-DM data mining methodology, which prescribes a sequence of steps that are followed in order until the desired end is reached. In the context of this research, the methodological process detailed below was followed:

(i) Data collection and analysis: The Scopus scientific database was used to obtain the UNACH publications as well as the scientific publications produced in Ecuador for the period 2013 to 2019. Only journal articles and book chapters were considered. The search strings used are shown below:

AFFILORG (Chimborazo) AND (LIMIT-TO (DOCTYPE, “ar”) OR LIMIT-TO (DOCTYPE, “ip”) OR LIMIT-TO (DOCTYPE, “ch”)) AND (LIMIT-TO (AF-ID, “Universidad Nacional de Chimborazo” 60108604) OR LIMIT-TO (AF-ID, “National University of Chimborazo” 114160995) OR LIMIT-TO (AF-ID, “National University of Chimborazo” 118104741) OR LIMIT-TO (AF-ID, “Universidad Nacional de Chimborazo” 119728963)).

AFFILCOUNTRY (“Ecuador”) AND (LIMIT-TO (DOCTYPE, “ar”) OR LIMIT-TO (DOCTYPE, “ch”) OR LIMIT-TO (DOCTYPE, “ip”)) AND (LIMIT-TO (PUBYEAR, 2020) OR LIMIT-TO (PUBYEAR, 2019) OR LIMIT-TO (PUBYEAR, 2018) OR LIMIT-TO (PUBYEAR, 2017) OR LIMIT-TO (PUBYEAR, 2016) OR LIMIT-TO (PUBYEAR, 2015) OR LIMIT-TO (PUBYEAR, 2014) OR LIMIT-TO (PUBYEAR, 2013)) AND (EXCLUDE (PUBYEAR, 2020)).
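As a rough illustration, the second search string restricts the results to Ecuadorian affiliations, three document types and the years 2013 to 2019 (2020 is first included and then excluded). A minimal Python sketch of the same filter, applied to records that have already been exported, might look as follows. The record layout and the field names `country`, `doctype` and `year` are assumptions made for this example and are not taken from the Scopus export format.

```python
# Document type codes and years are taken from the search string above.
DOCTYPES = {"ar", "ch", "ip"}
# LIMIT-TO covers 2013..2020, then EXCLUDE removes 2020.
YEARS = set(range(2013, 2021)) - {2020}


def matches(record):
    """Return True when a record passes the filter expressed by the query."""
    return (
        record["country"] == "Ecuador"
        and record["doctype"] in DOCTYPES
        and record["year"] in YEARS
    )


# Hypothetical records, invented for illustration only.
records = [
    {"country": "Ecuador", "doctype": "ar", "year": 2018},
    {"country": "Ecuador", "doctype": "cp", "year": 2018},
    {"country": "Ecuador", "doctype": "ar", "year": 2020},
]

selected = [r for r in records if matches(r)]
print(len(selected))
```

Only the first hypothetical record is kept: the second has a document type outside the three listed codes and the third falls in the excluded year.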

The analytical and synthetic methods were also used, because analyzing the information collected makes it possible to synthesize the behavior of the phenomenon under study.

(ii) Data preparation: This was carried out through exploratory research, because specific aspects related to citations were analyzed both for UNACH in particular and for Ecuador in general. In addition, the fields needed for the respective analyses were chosen, and data quality problems (incomplete, missing or erroneous data) were corrected.

(iii) Preliminary descriptive analysis of the data: Descriptive research was used to represent the data and observe their behavior through tables and graphs. A deductive method was also applied because, after a stage of repeated observation, analysis and classification of the particular facts, generic computational models were obtained for future application. To analyze the data on the scientific publications of UNACH and Ecuador, the QlikView software, a tool for advanced data visualization, was used together with the text mining tools Andatos and WordStat.

(iv) Correlational analysis of the quantitative variables: The datasets returned by Scopus were examined for this purpose.

(v) Text mining analysis: This was applied to the data related to the citations of UNACH in particular and of Ecuador in general.

(vi) Scientific induction: By analyzing the data obtained for particular cases, methodological aspects were derived in order to increase the number of citations. Explanatory research was also used, since the work seeks to find the causes that give rise to the phenomenon.

Results

Once the data had been collected from the Scopus scientific database, the following phases were carried out: (i) transformation, in which the fields are filtered and copied; (ii) cleaning, in which erroneous values are eliminated and subsequently replaced; and (iii) generation, in which new variables useful for the study are created.
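The three phases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`Title`, `Authors`, `Year`, `Cited by`, `Volume`, `EID`), the sample records and the replacement rules (an empty citation count becomes 0, a non-numeric volume becomes `None`) are hypothetical and are not described in the study.

```python
# Hypothetical raw records; field names and values are assumptions.
RAW = [
    {"Title": "Paper A", "Authors": "Ruiz, A.; Vega, B.", "Year": "2015",
     "Cited by": "12", "Volume": "7", "EID": "x1"},
    {"Title": "Paper B", "Authors": "Mora, C.", "Year": "2018",
     "Cited by": "", "Volume": "n/a", "EID": "x2"},
]

KEEP = ("Title", "Authors", "Year", "Cited by", "Volume")


def transform(record):
    """Transformation: filter and copy only the fields needed later."""
    return {key: record.get(key, "") for key in KEEP}


def to_int(value, default):
    """Parse an integer, falling back to `default` for unusable values."""
    try:
        return int(str(value).strip())
    except ValueError:
        return default


def clean(record):
    """Cleaning: replace erroneous or missing values (assumed rules)."""
    out = dict(record)
    out["Year"] = to_int(record["Year"], None)
    out["Cited by"] = to_int(record["Cited by"], 0)   # empty -> no citations
    out["Volume"] = to_int(record["Volume"], None)    # non-numeric -> unknown
    return out


def generate(record):
    """Generation: derive new variables useful for the analysis."""
    out = dict(record)
    names = [n for n in record["Authors"].split(";") if n.strip()]
    out["n_authors"] = len(names)
    out["is_cited"] = out["Cited by"] > 0
    return out


prepared = [generate(clean(transform(r))) for r in RAW]
for row in prepared:
    print(row["Title"], row["n_authors"], row["Cited by"], row["is_cited"])
```

Keeping each phase as a separate function mirrors the order described in the text and makes it easy to inspect the data after every step.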

Descriptive analysis

With the databases ready for the study, a descriptive analysis was carried out with the QlikView software on the 219 records corresponding to the scientific manuscripts published by UNACH research staff in the period 2013 to 2019: 217 journal articles and only two book chapters.

Figure 1 shows the number of citations of UNACH in Scopus by type of document. Journal articles account for 99.68 % of the historical citations of the University, whereas book chapters account for only 0.32 %.

Figure 1. Number of citations of UNACH in Scopus by type of document. Source: author’s own elaboration.

Figure 2 shows the history of the scientific publications produced by UNACH. The number of works follows a growing trend: the most productive year was 2018, with 70 publications, in contrast to 2013, with only one published manuscript. It should be noted that 19 papers from 2019 had already been indexed in Scopus at the time of the study.

Figure 2. History of the number of scientific publications of UNACH in Scopus. Source: author’s own elaboration.

In contrast to the number of scientific publications produced by UNACH, the historical number of citations does not follow a growing trend. Figure 3 shows that the number of citations peaked at 327 in 2015, whereas 2013 had only 17 citations. The graph also suggests that the number of citations depends on the quality of the publications rather than on their quantity. For example, the 13 publications of 2014 obtained 72 citations, while the 70 publications of 2018 obtained only 19.
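The comparison between 2014 and 2018 can be made explicit as a citations-per-publication ratio. The short Python sketch below uses only the two pairs of figures reported in the text; the function name and the dictionary layout are illustrative choices.

```python
# (publications, citations) as reported in the text for two years.
reported = {2014: (13, 72), 2018: (70, 19)}


def citations_per_publication(publications, citations):
    """Average number of citations received by each publication."""
    if publications <= 0:
        raise ValueError("publications must be positive")
    return citations / publications


ratios = {
    year: citations_per_publication(pubs, cites)
    for year, (pubs, cites) in reported.items()
}

for year, ratio in sorted(ratios.items()):
    print(f"{year}: {ratio:.2f} citations per publication")
```

With these inputs the ratio is about 5.54 for 2014 and about 0.27 for 2018.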

Figure 3. History of the number of citations of the scientific publications of UNACH. Source: author’s own elaboration.

Of the 219 scientific publications of UNACH, 58.90 % have not been cited even once; that is, more than half of the institution's manuscripts indexed in Scopus have not generated the expected interest in the worldwide research community. The remaining 41.10 %, corresponding to 90 publications, have been cited at least once. This is represented in Figure 4.

Figure 4. Percentage of UNACH publications that have obtained citation. Source: author’s own elaboration.

Correlational analysis

Next, the RapidMiner software was used to carry out a correlational analysis of the quantitative variables that could influence the number of citations of the documents. First, the number of citations was compared with the number of authors (Figure 5) in order to find out whether self-citations could have influenced the number of citations; however, there is no relevant correlation between these two variables. The correlation was verified with the data of all the scientific publications of Ecuador (10304 records). Two kinds of atypical values were also found: first, publications whose citations stand out from the rest of the manuscripts; and second, records with zero citations whose number of authors reaches up to 2,314.
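The text does not state which coefficient was computed in RapidMiner. Assuming it was Pearson's correlation coefficient, the calculation for two variables such as number of authors and number of citations can be sketched in plain Python as follows. The sample values are invented for illustration and are not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / sqrt(var_x * var_y)


# Hypothetical values: one record has many authors and no citations.
n_authors = [1, 2, 3, 4, 5, 6, 40]
n_citations = [0, 12, 0, 3, 0, 7, 0]

r = pearson(n_authors, n_citations)
print(f"r = {r:.3f}")
```

A coefficient close to 0 would be consistent with the absence of a relevant linear relationship, while single records with extreme author counts, like the last one here, can noticeably pull the value.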

Figure 5. Correlation between the variables, number of citations vs. number of authors. Source: author’s own

elaboration.

In addition, the number of citations was correlated with the volume number of the journal in which the publication appeared (Figure 6), on the assumption that the volume can affect the visibility of research papers and therefore the number of citations. This correlation was verified with the data of all the scientific publications of Ecuador, and no relevant relationship was observed, because in a large number of volumes (9386) there are zero citations.

Figure 6. Correlation between the variables, number of citations vs. number of the journal volume. Source:

author’s own elaboration.

Text mining analysis

Through the WordStat tool, a word cloud was generated in which it is observed that the topics

on which the academic staff of UNACH have published most frequently are education and

health, followed by studies related to physical activity and computer systems.

Figure 7 shows that education tops the list with the highest number of publications, and also has a high number of papers without any citation (23 manuscripts). Only two of its publications have a high number of citations, 11 and 13 respectively, out of a total of 35 citations; that is, 4 % of the scientific papers on this topic have generated 69 % of the citations. Similarly, of the 219 UNACH publications analyzed, only 19 papers have more than 10 citations each; together they account for 416 citations out of a total of 617, that is, 9 % of the publications have generated 67 % of the citations.
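The concentration of citations in a few papers can be checked with a small calculation. The Python sketch below first reproduces the rounded percentages from the aggregate figures given in the text (19 of 219 papers, 416 of 617 citations) and then shows a hypothetical helper, `concentration`, that computes the same two shares from a list of per-paper citation counts. The sample list is invented for illustration.

```python
def concentration(citations, threshold):
    """Share of papers above `threshold` and share of citations they hold."""
    if not citations:
        raise ValueError("citations must not be empty")
    top = [c for c in citations if c > threshold]
    total = sum(citations)
    share_papers = len(top) / len(citations)
    share_citations = sum(top) / total if total else 0.0
    return share_papers, share_citations


# Aggregate figures reported in the text.
share_papers = 19 / 219
share_citations = 416 / 617
print(f"{share_papers:.1%} of papers hold {share_citations:.1%} of citations")

# Hypothetical per-paper citation counts, for illustration only.
sample = [0, 0, 0, 1, 2, 11, 13]
print(concentration(sample, threshold=10))
```

The aggregate figures give roughly 8.7 % and 67.4 %, which round to the 9 % and 67 % quoted above.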

Figure 7. Crosstab of the topics with the highest occurrence of publications and number of citations. Source:

author’s own elaboration.

Figure 8 shows the relationship between the number of publications (cases) and the number of citations; some values stand out from the topics that one might expect to appear most frequently. In this sense, Table 1 shows the keywords with the highest citation counts, which also relate to applied sciences. Table 2 shows the most cited scientific publications of UNACH, while Table 3 shows the most cited publications of Ecuador.

Figure 8. Topics found with the highest citation vs. the number of appearances in the cases. Source: author’s

own elaboration.

Table 1. Ranking of words by number of citations

Topic | Number of citations
Properties | 57
Oxidative | 55
Mechanical | 52
Fabrics | 50
Fuel | 50
Microbial | 50
Testing | 50
Textiles | 50
Natural | 49
Cell | 42

Source: author’s own elaboration.
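A ranking of this kind can be approximated by adding, for every word, the citations of the publications in which it appears. The Python sketch below illustrates the idea; it is not a description of how WordStat works internally. The stop-word list, the `rank_words` helper and the sample titles with their citation counts are all hypothetical.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not WordStat's list).
STOPWORDS = {"a", "an", "and", "as", "for", "in", "of", "on", "the", "to"}


def rank_words(records, top=3):
    """Rank words by the summed citations of the titles containing them."""
    totals = Counter()
    for title, citations in records:
        # Count each word once per title, ignoring case and stop words.
        words = set(re.findall(r"[a-z]+", title.lower())) - STOPWORDS
        for word in words:
            totals[word] += citations
    # Sort by citations (descending), then alphabetically for stable output.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


# Hypothetical (title, citations) pairs, invented for illustration.
records = [
    ("Mechanical properties of natural fabrics", 20),
    ("Oxidative stability of emulsions", 15),
    ("Properties of microbial fuel cells", 10),
    ("Teaching strategies in higher education", 0),
]

for word, citations in rank_words(records):
    print(word, citations)
```

In this toy data the word "properties" ranks first because it appears in two cited titles, while the words from the uncited education title contribute nothing.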

Table 2. Ranking of articles in the publications of UNACH

Nº | Article title | Number of citations
1 | Strawberry as a health promoter: An evidence-based review | 94
2 | Development of durable cementitious composites using sisal and flax fabrics for reinforcement of masonry structures |
3 | Physical and oxidative stability of whey protein oil-in-water emulsions produced by conventional and ultra high-pressure homogenization: Effects of pressure and protein concentration on emulsion characteristics |
4 | Effects of fabric parameters on the tensile behaviour of sustainable cementitious composites |
5 | Flax and polyparaphenylene benzobisoxazole cementitious composites for the strengthening of masonry elements subjected to eccentric loading |
6 | Municipal waste liquor treatment via bioelectrochemical and fermentation (H2+CH4) processes: Assessment of various technological sequences |
7 | Relative importance of phenotypic trait matching and species’ abundances in determining plant–Avian seed dispersal interactions in a small insular community |
8 | Lipophilic antioxidants prevent lipopolysaccharide-induced mitochondrial dysfunction through mitochondrial biogenesis improvement |
9 | Single chamber microbial fuel cell (SCMFC) with a cathodic microalgal biofilm: A preliminary assessment of the generation of bioelectricity and biodegradation of real dye textile wastewater |
10 | Uncontacted Waorani in the Yasuní Biosphere Reserve: Geographical Validation of the Zona Intangible Tagaeri Taromenane (ZITT) |

Source: author’s own elaboration.

Table 3. Ranking of articles in publications from all over Ecuador

Nº | Article title | Number of citations
1 | The International Classification of Headache Disorders, 3rd edition (beta version) | 3595
2 | Trends in adult body-mass index in 200 countries from 1975 to 2014: A pooled analysis of 1698 population-based measurement studies with 19.2 million participants | 1100
3 | Global surveillance of cancer survival 1995-2009: Analysis of individual data for 25 676 887 patients from 279 population-based registries in 67 countries (CONCORD-2) | 841
4 | Global, regional, and national incidence, prevalence, and years lived with disability for 328 diseases and injuries for 195 countries, 1990-2016: A systematic analysis for the Global Burden of Disease Study 2016 | 630
5 | Global conservation outcomes depend on marine protected areas with five key features | 545
6 | Worldwide trends in body-mass index, underweight, overweight, and obesity from 1975 to 2016: a pooled analysis of 2416 population-based measurement studies in 128·9 million children, adolescents, and adults | 422
7 | Global, regional, and national comparative risk assessment of 84 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990-2016: A systematic analysis for the Global Burden of Disease Study 2016 | 407
8 | Hyperdominance in the Amazonian tree flora | 407
9 | Uptake of pre-exposure prophylaxis, sexual practices, and HIV incidence in men and transgender women who have sex with men: A cohort study | 382
10 | Global, regional, and national disability-adjusted life-years (DALYs) for 333 diseases and injuries and healthy life expectancy (HALE) for 195 countries and territories, 1990-2016: A systematic analysis for the Global Burden of Disease Study 2016 | 354

Source: author’s own elaboration.

Discussion

Data mining is an exploitation mechanism that consists of searching for valuable information in large volumes of data, known as Big Data; its algorithms aim to extract valuable information for decision-making (Botta-Ferret & Cabrera-Gato, 2007). If it is also considered that organizations hold mostly unstructured rather than structured data, text mining is justified by the advantages it can provide in improving the productivity of the data obtained (Pérez & Cardoso, 2010).

Although data mining has been used mainly in technical or business areas, in the view of the authors of this research education in general will undoubtedly have to use, and is in fact already using, data analysis methods to improve the efficiency and effectiveness of its processes. One of the most important and widely used methods for this purpose is precisely data mining, which aims to make sense of the large amount of information that is currently stored (Riquelme, Ruiz, & Gilbert, 2006). Data mining was initially used in fields such as computing (Jaramillo, Cardona, & Fernández, 2015), health (González & Pérez, 2013), business (Gordillo, Martínez, & Stephens, 2012) and construction (Castro et al., 2014), and even on issues as specific as the detection and prevention of money laundering and the financing of terrorism. Today its use in the educational field has become very popular and, as Márquez, Romero and Ventura (2012) describe it, it is a “very promising solution […] in education” (p. 45).

Studies of data mining in education range from the investigation of learning patterns (Ballesteros, Sánchez, & García, 2013) and the extraction of student dropout profiles (Timarán, Calderón, & Jiménez, 2013) to topics as particular as the use of data mining in the enrollment process of private higher education institutions (Estrada et al., 2016).

Although the correlation analysis of the number of citations against the number of authors and the journal volume showed no relevant relationship, it revealed publications that stood out with a high number of citations, which suggests that certain areas of study and publications are more sought after for citation purposes.

In addition, the text mining carried out on the keywords of UNACH publications corroborated that there are factors that determine the number of citations: several areas had a high number of citations, especially those belonging to applied sciences.

Conclusions

Scientic publications are part of the educational and especially university work, where the

impact of said publications is reected by the number of citations that scientic documents

published in HEI have, thus the increase in publications in UNACH and in Ecuador in general is

considerable; however this increase alone is not enough, and should be accompanied by the

number of citations, that is, the published works generate expectations in the world research

community. In this analysis, it was possible to observe research works that have never been

cited despite the fact that they were published a few years ago. Thus, the publications with

the highest citations are not related to the number of authors or the volume of the published

journal, but are supported by quality research of several years and correspond mostly to

applied sciences.

In addition, for an academic work to generate the desired interest, HEI in Ecuador should implement strategies that allow the development of research supported by a well-defined structure, which in turn facilitates the creation of projects focused on solving the main problems of society and the generation of base knowledge for future research.

References

Ballesteros, A., Sánchez, D., & García, R. (2013). Minería de datos educativa: una herramienta para la investigación de patrones de aprendizaje sobre un contexto educativo. Revista Latinoamericana de Física Educativa, 7(4), 662-668.

Botta-Ferret, E., & Cabrera-Gato, J. (2007). Minería de textos: una herramienta útil para mejorar la gestión del bibliotecario en el entorno digital. ACIMED, 14(4), 1-6.

Castro, A. et al. (2014). Use of Data Mining in Managing Geographical Information. Información Tecnológica, 25(5), 95-102.

CEAACES. (2018). Modelo de evaluación institucional de universidades y escuelas politécnicas 2018. Retrieved from http://uisrael.edu.ec/wp-content/uploads/2019/03/Modelo-evaluacion-preliminar-universidades-escuelas-politecnicas2018-min.pdf?x23864.

Estrada, R. et al. (2016). Contributions to the Enrollment Process with Data Mining in Private Higher Education Institutions. Revista Electrónica Educare, 20(3), 1-21.

Ganga, F., Paredes, L., & Pedraja, L. (2015). The importance of academic publications: Some problems and recommendations to keep in mind. Idesia, 33(4), 111-119.

González, L., & Pérez, Y. (2013). Spatial data mining and its application in health and epidemiology studies. Revista Cubana de Información en Ciencias de la Salud, 24(4), 1-13.

Gordillo, J., Martínez, E., & Stephens, C. (2012). Inferring Market Strategies: Applying Data-Mining to Analysis of Financial Markets. Computación y Sistemas, 16(2), 221-231.

Jaramillo, S., Cardona, S., & Fernández, A. (2015). Data Mining Streams of Social Networks, A Tool to Improve the Library Services. Información, Cultura y Sociedad, 33, 63-74.

Márquez, C., Romero, C., & Ventura, S. (2012). Predicting of school failure using data mining techniques. IEEE-RITA, 7(3), 109-117.

Molina, L., & Ribiero, S. (2001). Descubrimiento conocimiento para el mejoramiento bovino usando técnicas de data mining. In IV Congreso Catalán de Inteligencia Artificial, Societat Catalana de Comunicació, Barcelona, España.

Molina, L. (2002). Data mining: torturando a los datos hasta que confiesen. Retrieved from https://www.uoc.edu/web/esp/art/uoc/molina1102/molina1102.html.

Nasukawa, T., Kawano, H., & Arimura, H. (2001). Base technology for text mining. Journal of Japanese Society for Artificial Intelligence, 16(2), 201-211.

Pérez, M., & Cardoso, C. (2010). Minería de texto para la categorización automática de documentos. Cuadernos de la Facultad, 5, 11-45.

Radicelli, C. et al. (2018). Análisis del ranking SCImago de universidades ecuatorianas: el caso de la Universidad Nacional de Chimborazo. In Aplicaciones, experiencias y desafíos de la TIC en el Ecuador (pp. 15-34). Riobamba, Ecuador: Universidad Nacional de Chimborazo.

Riquelme, J., Ruiz, R., & Gilbert, K. (2006). Minería de datos: conceptos y tendencias. Revista Iberoamericana de Inteligencia Artificial, 10(29), 11-18.

Timarán, R., Calderón, A., & Hidalgo, A. (2017). Aplicación de los árboles de decisión en la identificación de patrones de lesiones fatales por causa externa en el municipio de Pasto, Colombia. Universidad y Salud, 19(3), 359-365.

Valcárcel, V. (2004). Data Mining and Knowledge Discovery. Revista de la Facultad de Ingeniería Industrial, 7(2), 83-86.

Valderrama, J. (2012). SCImago. Formación Universitaria, 5(5), 1.