LEXRANK WITH THRESHOLD TO IDENTIFY EF-ISF NAVIGATION PATTERNS IN XMOOC OF OPEN EDX

Ancona Anacona, Fabián Andrés; Solarte Sarasty, Mario Fernando; Ramírez González, Gustavo Adolfo

doi:10.22395/rium.v20n39a5

Revista Ingenierías Universidad de Medellín

Print version ISSN 1692-3324; On-line version ISSN 2248-4094

Rev. ing. univ. Medellín vol.20 no.39 Medellín July/Dec. 2021 Epub June 28, 2022

https://doi.org/10.22395/rium.v20n39a5

LEXRANK WITH THRESHOLD TO IDENTIFY EF-ISF NAVIGATION PATTERNS IN XMOOC OF OPEN EDX ^*

LEXRANK CON UMBRAL PARA IDENTIFICAR PATRONES DE NAVEGACIÓN EF-ISF EN XMOOC DE OPEN EDX

Fabián Andrés Ancona Anacona^**
http://orcid.org/0000-0002-8962-7293

Mario Fernando Solarte Sarasty^***
http://orcid.org/0000-0002-2475-0688

Gustavo Adolfo Ramírez González^****
http://orcid.org/0000-0002-8406-0777

^{^**} Mg Telematics Engineering, Telematics Engineering Group, University of Cauca, Popayán, Colombia. E-mail: fanacona@unicauca.edu.co. Orcid: https://orcid.org/0000-0002-7518-257X.

^{^***} PhD Telematics Engineering, Telematics Engineering Group, University of Cauca, Tulcán sector, office 404. Popayán, Colombia. Tel: (+57) 2-809800, ext 2175. E-mail: msolarte@unicauca.edu.co. Orcid: https://orcid.org/0000-0002-3600-7592.

^{^****} PhD Telematics Engineering, Telematics Engineering Group, University of Cauca, Tulcán sector, office 405. Popayán, Colombia. Tel: (+57) 2-809800, ext 2127. E-mail: gramirez@unicauca.edu.co. Orcid: https://orcid.org/0000-0002-1338-8820.

Abstract

The use of the Open edX platform by universities around the world to offer xMOOC courses has led to growing student participation. As students interact with an xMOOC they generate a set of navigation patterns, which are recorded in the tracking.log file. So far, no study has identified the EF-ISF navigation patterns within this set; this article therefore proposes the use of the LexRank with Threshold algorithm to identify them.

Keywords: LexRank with threshold; navigation; EF-ISF navigation pattern; tracking.log; xMOOC

Resumen

El uso de la plataforma Open edX para ofrecer cursos xMOOC por diferentes universidades del mundo ha llevado a un crecimiento en la participación de los estudiantes en los cursos, generando de esta forma un conjunto de patrones de navegación al interactuar con los xMOOC que se registran en el archivo tracking.log. Hasta el momento no hay un estudio que identifique patrones de navegación EF-ISF del conjunto de patrones. Por esta razón, en este documento se propone el uso del algoritmo LexRank con Umbral para la identificación de los patrones de navegación EF-ISF.

Palabras clave: LexRank con Umbral; navegación; patrón de navegación EF-ISF; tracking.log; xMOOC

INTRODUCTION

Massive Open Online Courses (MOOCs) are a proposal to universalize education by offering it free of charge and with quality to people anywhere in the world [¹, ²].

xMOOCs are offered on the Open edX platform. The platform has a static structure and publishes learning material developed largely by the teacher(s) in charge of the course. In xMOOCs, some students navigate freely without necessarily following what the content creators or the structure of the platform suggest, while others prefer to navigate as imposed by a teacher or the online learning environment. Together, these student navigations form navigation patterns, which the Open edX platform records in a text file called tracking.log [³-⁶].

The xMOOCs offered by different universities in the world have led to a growth in the participation of students in the courses [⁷], generating an increase in the navigation patterns registered in the tracking.log file. So far there is no study that identifies the navigation patterns according to the weighting of the relative frequency of an event (EF-ISF) [³-⁴, ⁸]. For this reason, this article proposes the use of the LexRank with Threshold algorithm to identify the EF-ISF navigation patterns registered in the tracking.log file of the xMOOCs of the Open edX platform.

This article is structured as follows: section 1 gives a general description of the structure of the tracking.log file; section 2 presents the representation of students with the vector space model and the EF-ISF weighting; section 3 describes the LexRank with Threshold algorithm; section 4 gives an example of the identification of EF-ISF navigation patterns; and section 5 presents the conclusions and future work.

1. TRACKING.LOG OF THE OPEN EDX PLATFORM

The tracking.log is a JavaScript Object Notation (JSON) file that saves a record each time an activity or event is performed by the student(s) in the xMOOC course of the Open edX platform. The file has a general structure common to all events, which is shown in figure 1 [³-⁴, ⁶].

Source: [⁴]

Figure 1 General structure of the Open edX tracking.log file
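As an illustration of how such records might be read, the following minimal Python sketch parses tracking.log lines, assuming the file holds one JSON object per line. The field names follow those shown in table 1 of this article; the sample record, the helper names `parse_tracking_line` and `read_tracking_log`, and the example URL are hypothetical and are not taken from a real log.

```python
import json

# Hypothetical record: field names follow table 1, values are invented.
sample_record = {
    "username": "anonymous",
    "name": "pause_video",
    "time": "2017-05-19T03:23:26.966429+00:00",
    "referer": "http://example.org/courses/course-v1:Org+Course+Run/courseware/aaa/bbb/",
    "event": "{}",
}
sample_line = json.dumps(sample_record)


def parse_tracking_line(line):
    """Parse one JSON record and keep only the fields used in this article."""
    entry = json.loads(line)
    wanted = ("username", "name", "time", "referer", "event")
    return {key: entry.get(key) for key in wanted}


def read_tracking_log(lines):
    """Yield parsed records, skipping blank or malformed lines (an assumption)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield parse_tracking_line(line)
        except json.JSONDecodeError:
            continue


parsed = list(read_tracking_log([sample_line, "", "not json"]))
print(parsed[0]["name"], parsed[0]["time"])
```

Skipping malformed lines silently is a design choice of this sketch; a real analysis may prefer to count or report them.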

2. STUDENT REPRESENTATION

This article uses the Vector Space Model to relate the navigation events of the students in an xMOOC course, as registered in the tracking.log file of the Open edX platform. To assign a value to each navigation event, the weighting based on the relative frequency of an event (EF-ISF) is used; it gives the weight of the i-th event of a student, as can be seen in equation (1) [³-⁴, ⁹, ¹⁰].

(1)

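The similarity between two students $s_j$ and $s_l$ is calculated with the cosine similarity, as shown in equation (2), where $w_{ij}$ denotes the weight of the i-th event for student $s_j$ and $m$ is the number of events [³, ⁹, ¹⁰].

$$\mathrm{sim}(s_j, s_l) = \frac{\sum_{i=1}^{m} w_{ij}\, w_{il}}{\sqrt{\sum_{i=1}^{m} w_{ij}^{2}}\;\sqrt{\sum_{i=1}^{m} w_{il}^{2}}} \tag{2}$$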

The set of student vectors is represented in the multidimensional space with the Matrix of Events by Students (Matrix EF-ISF, of size m×n), and the similarity between them is represented with the cosine similarity matrix (MatrixOfSimilarity, of size n×n); the elements ij of both matrices ∈ R [³, ¹⁰].
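The step from the events-by-students matrix to the similarity matrix can be sketched as follows. This is a minimal pure-Python illustration of the standard cosine similarity; the function names and the small weight matrix are hypothetical, and the weights are made-up numbers rather than values computed with equation (1).

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two weight vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(weights):
    """Build the n x n similarity matrix from an m x n events-by-students matrix."""
    n = len(weights[0])
    # Each student is one column of the events-by-students matrix.
    columns = [[row[j] for row in weights] for j in range(n)]
    return [
        [cosine_similarity(columns[i], columns[j]) for j in range(n)]
        for i in range(n)
    ]


# Illustrative weights only: 3 events (rows) by 2 students (columns).
weights = [
    [1.0, 1.0],
    [1.0, 0.0],
    [0.0, 1.0],
]
sim = similarity_matrix(weights)
for row in sim:
    print(row)
```

The resulting matrix is symmetric and has values close to 1 on its diagonal, since every student is maximally similar to himself.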

3. LEXRANK WITH THRESHOLD ALGORITHM

The LexRank with Threshold algorithm is used to automatically generate summaries of one or multiple documents [³, ¹⁰]. This article proposes the use of the algorithm for the identification of EF-ISF navigation patterns.

The LexRank with Threshold is based on the concept of prestige in social networks. A social network is a map of relationships between entities (students, organizations) that interact. Social networks are commonly represented in the form of graphs, where the nodes represent the entities and the links represent the relationships between the nodes [³, ¹⁰-¹¹].

A set of students can be seen as a network of related students; some are more similar to each other, while others may share little information with the rest of the students. If a student is very similar to the other students, that student can be considered the most central or representative. This definition of centrality rests on two key points: first, how to define the similarity between two students, and second, how to calculate the global centrality of a student given his similarity with other students [³, ¹⁰-¹¹].

To define similarity, the set of students is first represented in the vector space model, using the EF-ISF weighting of events from equation (1). The similarity between two students is defined by the cosine similarity shown in equation (2); the set of students is then represented as a graph through an adjacency matrix (MatrixOfSimilarity), in which each value corresponds to the cosine similarity between two students [³, ¹⁰-¹¹].

Subsequently, to go beyond degree centrality, each student node must take into account both the votes it receives from other nodes and where those votes come from. This can be modeled by giving each node a centrality value that is distributed between the node itself and its neighbors, as shown in equation (3) [³, ¹⁰-¹¹].

$$p(u) = \sum_{v \in adj[u]} \frac{p(v)}{\deg(v)} \tag{3}$$

Where p(u) is the centrality of node u, adj[u] is the set of nodes that are adjacent to u, and deg(v) is the degree of node v. This equation can be written in matrix notation as follows, see equation (4) [³, ¹⁰-¹¹].

$$p = B^{T} p \quad \text{or, equivalently,} \quad p^{T} B = p^{T} \tag{4}$$

Where matrix B is obtained from the adjacency matrix A of the similarity graph by dividing each element by the sum of the corresponding row, see equation (5) [³, ¹⁰-¹¹].

$$B(i, j) = \frac{A(i, j)}{\sum_{k} A(i, k)} \tag{5}$$

The sum of a row is equal to the degree of the corresponding node and, since each student is at least similar to himself, the row sums are different from zero. Equation (4) states that pᵀ is the left eigenvector of the matrix B with corresponding eigenvalue 1. To guarantee that such an eigenvector exists and can be uniquely identified and calculated, the following needs to be taken into account [³, ¹⁰-¹¹].

A stochastic matrix X is the transition matrix of a Markov chain; an element X(i, j) specifies the probability of transition from state i to state j. By the probability axioms, all rows of a stochastic matrix must sum to 1. Xⁿ(i, j) is the probability of reaching state j from state i in n transitions. A Markov chain with stochastic matrix X converges to a stationary distribution if equation (6) holds [³, ¹⁰-¹¹].

$$\lim_{n \to \infty} X^{n} = \mathbf{1}^{T} r \tag{6}$$

Where 1 = (1, 1, ..., 1), and the vector r is called the stationary distribution of the Markov chain; each element of r gives the asymptotic probability of ending up in the corresponding state in the long run, regardless of the starting state. A Markov chain is irreducible if any state is accessible from any other state, that is, for all i, j there is an n such that Xⁿ(i, j) ≠ 0. A Markov chain is aperiodic if for all i, gcd{n : Xⁿ(i, i) > 0} = 1. By the Perron-Frobenius theorem, an irreducible and aperiodic Markov chain converges to a unique stationary distribution [³, ¹⁰-¹¹].

Since the similarity matrix B in equation (4) satisfies the properties of a stochastic matrix, it can be treated as a Markov chain. The centrality vector p corresponds to the stationary distribution of B. However, it must be ensured that the similarity matrix is irreducible and aperiodic. To achieve this, a low probability is reserved for jumping to any node in the graph, which makes the graph irreducible and aperiodic. Assigning a uniform probability of jumping to any node in the graph gives the following modified version of equation (3), which is known as the LexRank with Threshold algorithm, see equation (7) [³, ¹⁰-¹¹].

$$p(u) = \frac{d}{N} + (1 - d) \sum_{v \in adj[u]} \frac{p(v)}{\deg(v)} \tag{7}$$

Where N is the total number of nodes in the graph and d is a “damping factor”, which is usually chosen in the range [0.1, 0.2]. Equation (7) can be written in matrix form as shown in equation (8) [³, ¹⁰-¹¹].

$$p = [\,dU + (1 - d)B\,]^{T} p \tag{8}$$

Where U is a square N×N matrix with all elements equal to 1/N. The transition kernel [dU + (1 - d)B] of the resulting Markov chain is a mixture of two kernels, U and B. A random walker in this Markov chain chooses one of the adjacent states of the current state with probability 1 - d, or jumps to any state in the graph, including the current state, with probability d [³, ¹⁰-¹¹].

Algorithm 1 gives the pseudocode of the LexRank with Threshold algorithm for the identification of EF-ISF navigation patterns in a set of students [³, ¹⁰-¹¹].

Note: Adapted from [¹¹].

Algorithm 1 Score calculation LexRank with Threshold
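The matrix construction described by equations (5) and (8) can be sketched in Python as follows. This is an illustration based on the description in this section, not a transcription of Algorithm 1: the function name `build_transition_kernel`, the strict `>` comparison against the threshold, the fallback for a node without edges, and the sample similarity values are all assumptions. The stationary distribution of the resulting kernel would then be obtained with the Power Method.

```python
def build_transition_kernel(similarity, threshold, damping):
    """Return the kernel dU + (1 - d)B for an n x n similarity matrix.

    An edge is kept only where the similarity exceeds the threshold; each
    row is then divided by its degree (equation 5) and mixed with the
    uniform matrix U using the damping factor d (equation 8).
    """
    n = len(similarity)
    kernel = []
    for i in range(n):
        adjacency = [1.0 if similarity[i][j] > threshold else 0.0 for j in range(n)]
        degree = sum(adjacency)
        if degree == 0.0:
            # Assumption of this sketch: a node with no edges jumps uniformly.
            row = [1.0 / n] * n
        else:
            row = [damping / n + (1.0 - damping) * a / degree for a in adjacency]
        kernel.append(row)
    return kernel


# Illustrative similarity values, not taken from the article.
similarity = [
    [1.0, 0.6],
    [0.6, 1.0],
]
kernel = build_transition_kernel(similarity, threshold=0.9, damping=0.15)
for row in kernel:
    print(row, "row sum:", sum(row))
```

Each row of the kernel sums to 1, so the result is a stochastic matrix, and every entry is positive, which is what makes the chain irreducible and aperiodic.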

The Power Method describes how to calculate the Stationary Matrix of a Markov chain; it is shown in Algorithm 2.

Note: Adapted from [¹¹].

Algorithm 2 Power Method
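A minimal Python sketch of the power method is given below, assuming the usual formulation: start from the uniform distribution and repeatedly multiply by the transpose of the stochastic matrix until two successive vectors differ by less than the tolerance. The function name, the Euclidean norm, the `max_iterations` safeguard and the 2×2 sample matrix are assumptions of this sketch and are not taken from Algorithm 2.

```python
import math


def power_method(matrix, tolerance, max_iterations=1000):
    """Approximate the stationary distribution p with p = M^T p."""
    n = len(matrix)
    p = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iterations):
        # One step of the chain: updated[j] = sum_i matrix[i][j] * p[i].
        updated = [sum(matrix[i][j] * p[i] for i in range(n)) for j in range(n)]
        delta = math.sqrt(sum((a - b) ** 2 for a, b in zip(updated, p)))
        p = updated
        if delta < tolerance:
            break
    return p


# Illustrative stochastic matrix (rows sum to 1), not taken from the article.
matrix = [
    [0.9, 0.1],
    [0.5, 0.5],
]
stationary = power_method(matrix, tolerance=1e-10)
print(stationary)
```

For this sample matrix the balance condition 0.1·p₀ = 0.5·p₁ gives the stationary distribution (5/6, 1/6), which the iteration approaches.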

4. IDENTIFICATION OF EF-ISF NAVIGATION PATTERNS WITH THE LEXRANK WITH THRESHOLD ALGORITHM

This section shows an example of the process of identifying EF-ISF navigation patterns with the LexRank with Threshold algorithm. The example uses two events generated by a student of the Everyday Astronomy course, Group B, offered by the University of Cauca in the first academic period of 2017 [⁴].

For the example the following parameters are used: two events of a student, threshold = 0.9, damping factor = 0.15 and tolerance error = 1.

The events are obtained from the tracking.log file of the Selene platform; some fields will be taken from them and the student will be called anonymous; this is shown in table 1.

Table 1 Parts of the tracking.log file of two events of a student

First event

username: anónimo

name: pause_video

time: 2017-05-19T03:23:26.966429+00:00

referer: http://selene.unicauca.edu.co/courses/course-v1:Unicauca+AstronomiaCotidianaGrupoB+2017-I/courseware/9ee2d4e6ba4f4c8cb5a1aea3b66220a8/83d11edf15c446a5be18be0014144fcb/event:

Second event

username: anónimo

name: load_video

time: 2017-05-19T03:21:00.319791+00:00

referer: http://selene.unicauca.edu.co/courses/course-v1:Unicauca+AstronomiaCotidianaGrupoB+2017-I/courseware/9ee2d4e6ba4f4c8cb5a1aea3b66220a8/83d11edf15c446a5be18be0014144fcb/event:

Source: [⁴].

Seven fields are taken from the fragments of the record and they will be called events, as shown in table 2 [⁴].

Table 2 Log events

event1 = pause_video

event2 = AstronomiaCotidianaGrupoB

event3 = 9ee2d4e6ba4f4c8cb5a1aea3b66220a8

event4 = 83d11edf15c446a5be18be0014144fcb

event5 = 0xIv1RoSXNk

event6 = load_video

event7 = P2uUPX2y8Ks

Source: [⁴].

Applying equation (1), the following EF-ISF Matrix is obtained, see Matrix 1 [⁴].

Source: [⁴]

Matrix 1 EF-ISF Matrix

With the EF-ISF Matrix and equation (2), the cosine similarity matrix is calculated. Applying the threshold = 0.9 and distributing the centrality of each student yields the following Stochastic Matrix, see Matrix 2.

Source: own elaboration

Matrix 2 Stochastic Matrix

The Stochastic Matrix is transformed, with damping factor = 0.15, into the Matrix X, which is irreducible and aperiodic, see Matrix 3.

Source: own elaboration

Matrix 3 Matrix X

The Stationary Matrix is then calculated with tolerance error = 1, as shown in Matrix 4.

Source: own elaboration

Matrix 4 Stationary Matrix

Based on the results of the Stationary Matrix, the student’s two EF-ISF navigation patterns have the same value, from which one can conclude that both are equally important. The EF-ISF navigation patterns determined with the LexRank with Threshold algorithm are shown in table 3.

Table 3 EF-ISF navigation patterns of the anonymous student

EF-ISF navigation pattern 1: AstronomiaCotidianaGrupoB -> 9ee2d4e6ba4f4c8cb5a1aea3b66220a8 -> 83d11edf15c446a5be18be0014144fcb -> pause_video -> 0xIv1RoSXNk

EF-ISF navigation pattern 2: AstronomiaCotidianaGrupoB -> 9ee2d4e6ba4f4c8cb5a1aea3b66220a8 -> 83d11edf15c446a5be18be0014144fcb -> load_video -> P2uUPX2y8Ks

Source: own elaboration.
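The steps of this example can be approximated with the short Python sketch below. It uses the events of table 2 and the parameters stated above (threshold = 0.9, damping factor = 0.15, tolerance error = 1), but it makes one important assumption: raw event counts stand in for the EF-ISF weights of equation (1), so the intermediate matrices are not claimed to match Matrix 1 to Matrix 4. The variable names are hypothetical.

```python
import math

# The two navigation patterns, written with the events of table 2.
patterns = [
    ["AstronomiaCotidianaGrupoB", "9ee2d4e6ba4f4c8cb5a1aea3b66220a8",
     "83d11edf15c446a5be18be0014144fcb", "pause_video", "0xIv1RoSXNk"],
    ["AstronomiaCotidianaGrupoB", "9ee2d4e6ba4f4c8cb5a1aea3b66220a8",
     "83d11edf15c446a5be18be0014144fcb", "load_video", "P2uUPX2y8Ks"],
]
threshold, damping, tolerance = 0.9, 0.15, 1.0

# Assumption: raw counts replace the EF-ISF weights of equation (1).
vocabulary = sorted({event for pattern in patterns for event in pattern})
vectors = [[float(p.count(event)) for event in vocabulary] for p in patterns]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


n = len(vectors)
similarity = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]

# Threshold, row normalization and damping: kernel = dU + (1 - d)B.
kernel = []
for i in range(n):
    adjacency = [1.0 if similarity[i][j] > threshold else 0.0 for j in range(n)]
    degree = sum(adjacency)  # at least 1: each pattern is similar to itself
    kernel.append([damping / n + (1.0 - damping) * a / degree for a in adjacency])

# Power method, starting from the uniform distribution.
scores = [1.0 / n] * n
while True:
    updated = [sum(kernel[i][j] * scores[i] for i in range(n)) for j in range(n)]
    delta = math.sqrt(sum((a - b) ** 2 for a, b in zip(updated, scores)))
    scores = updated
    if delta < tolerance:
        break

print("similarity between the two patterns:", similarity[0][1])
print("scores:", scores)
```

Under this assumption the two patterns share three of their five events, so their similarity is about 0.6 and falls below the threshold; each pattern keeps only its self-loop and both receive the same score, which agrees with the observation that the two patterns have equal importance.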

5. CONCLUSIONS AND FUTURE WORK

The LexRank with Threshold algorithm can determine the EF-ISF navigation patterns within the set of student navigation patterns recorded in the tracking.log file of an xMOOC on the Open edX platform.

The largest value (∈ R) in the Stationary Matrix identifies a single EF-ISF navigation pattern, which represents the whole set of EF-ISF navigation patterns obtained from the student navigation patterns recorded in the xMOOC tracking.log file of the Open edX platform.

In the Stationary Matrix, each EF-ISF navigation pattern is represented by a value ∈ R. This value indicates the importance of that pattern within the set of EF-ISF navigation patterns.

As future work, we propose implementing the LexRank with Threshold algorithm to identify EF-ISF navigation patterns in the set of student navigation patterns recorded in the xMOOC tracking.log file of the Open edX platform.

ACKNOWLEDGEMENTS

The authors are grateful for the support received, for the implementation and dissemination of the results set out in this article, from the project MOOC-Maker: Construction of Management Capacities of MOOCs in Higher Education (561533-EPP-1-2015-1-ESEPPKA2-CBHE-JP), funded by the European Commission through the Erasmus+ Programme.

They would also like to thank the VRI 49694 MOOCMENTES project, Capacity Building for MOOC Management for Vocational Training, Rural Development and New Generations of Rural Students in Improving their Transit to Higher Education, co-financed within the framework of rural partnerships by the Ministry of National Education of Colombia.

REFERENCES

[1] J. DeBoer and L. Breslow, “Work in Progress: Student Behaviors Using Feedback in a Blended Physics Undergraduate Classroom,” in Proceedings of the Third (2016) ACM Conference on Learning@Scale, pp. 229-232. doi: 10.1145/2876034.2893421.

[2] M. Solarte, G. A. Ramírez, and D. A. Jaramillo, “Hábitos de ingreso y resultados en las evaluaciones en cursos en línea masivos con reconocimiento académico,” Ingeniería e Innovación, vol. 5, no. 1, pp. 51-59, 2017. doi: 10.21897/23460466.1105.

[3] F. Anacona, M. Solarte, and G. Ramírez, “Descubrimiento de patrones de navegación en Open edx-una aproximación arquitectónica,” Ingeniería e Innovación, vol. 5, no. 1, pp. 43-50, 2017. doi: 10.21897/23460466.1103.

[4] F. Anacona, M. Solarte, and G. González, “Modelo de Espacio Vectorial con ponderación basada en frecuencia relativa de eventos de navegación en una instancia de Open edX para caracterización del estudiantado,” in Segunda Conferencia Internacional MOOC-MAKER, Medellín, 2018, vol. 2224, pp. 87-95.

[5] P. J. Guo and K. Reinecke, “Demographic differences in how students navigate through MOOCs,” in Proceedings of the first ACM conference on Learning@Scale conference, pp. 21-30, 2014. doi: 10.1145/2556325.2566247.

[6] R. K. Yadav, “Understanding Logs in edX for Monitoring Student Progress,” Ph.D. dissertation, Indian Institute of Technology, Bombay, 2014.

[7] C. Alario-Hoyos, M. Pérez-Sanagustín, M. Morales, C. D. Kloos, R. Hernández-Rizzardini, M. Román, et al., “MOOC-Maker: Tres Años Construyendo Capacidades de Gestión de MOOCs en Latinoamérica,” in Segunda Conferencia Internacional MOOC-MAKER, Medellín, vol. 2224, pp. 5-14, 2018.

[8] M. Burbano, F. Anacona, M. Solarte, and G. Ramírez, “Informe sobre tecnologías Web Semántica y Social en cursos MOOC,” MOOC-Maker, vol. WDP 1.14, pp. 1-17, 2016.

[9] R. M. Alguliev, R. M. Aliguliyev, M. S. Hajirahimova, and C. A. Mehdiyev, “MCMR: Maximum coverage and minimum redundant text summarization model,” Expert Systems with Applications, vol. 38, no. 12, pp. 14514-14522, 2011. doi: 10.1016/j.eswa.2011.05.033.

[10] F. Anacona, C. Cobos, and M. Mendoza, “Algoritmo para generación automática de resúmenes extractivos genéricos de múltiples documentos basados en consensos,” B.S. Thesis, Univ. del Cauca, Popayán, 2015.

[11] G. Erkan and D. R. Radev, “LexRank: graph-based lexical centrality as salience in text summarization,” Journal of Artificial Intelligence Research, vol. 22, pp. 457-479, 2004. doi: 10.1613/jair.1523.

* Paper Type: Research (Paper in extended version of the Seminario en Innovaciones Educativas Sinnem. Oct 18th, 2018). The research was funded by the MOOC-MAKER Project (561533-EPP-1-2015-1-ESEPPKA2-CBHE-JP) and the MOOCMENTES project of the VRI 49694.

How to cite: Ancona Anacona, F. A., Solarte Sarasty, M. F., & Ramírez González, G. A. (2021). LexRank with threshold to identify EF-ISF navigation patterns in xMOOC of Open edX. Revista Ingenierías Universidad De Medellín, 20(39), 85-96. https://doi.org/10.22395/rium.v20n39a5

Received: October 25, 2019; Accepted: May 28, 2020

This is an open-access article distributed under the terms of the Creative Commons Attribution License
