<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>0123-3033</journal-id>
<journal-title><![CDATA[Ingeniería y competitividad]]></journal-title>
<abbrev-journal-title><![CDATA[Ing. compet.]]></abbrev-journal-title>
<issn>0123-3033</issn>
<publisher>
<publisher-name><![CDATA[Facultad de Ingeniería, Universidad del Valle]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S0123-30332024000100018</article-id>
<article-id pub-id-type="doi">10.25100/iyc.v26i1.13230</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Human action detection for inventory control using computer vision]]></article-title>
<article-title xml:lang="es"><![CDATA[Detección de acción humana para el control de inventario utilizando visión por computadora]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Bernal-Baquero]]></surname>
<given-names><![CDATA[Francisco]]></given-names>
</name>
<xref ref-type="aff" rid="Af1"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Martínez]]></surname>
<given-names><![CDATA[Darwin E.]]></given-names>
</name>
<xref ref-type="aff" rid="Af1"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Sergio Arboleda University, Engineering Department]]></institution>
<addr-line><![CDATA[Bogotá ]]></addr-line>
<country>Colombia</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>04</month>
<year>2024</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>04</month>
<year>2024</year>
</pub-date>
<volume>26</volume>
<numero>1</numero>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.co/scielo.php?script=sci_arttext&amp;pid=S0123-30332024000100018&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.co/scielo.php?script=sci_abstract&amp;pid=S0123-30332024000100018&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.co/scielo.php?script=sci_pdf&amp;pid=S0123-30332024000100018&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Abstract Computer vision (CV) can facilitate several inventory management tasks. It enables continuous analysis of an inventory, keeping a record of every movement and delivering an instant report when required. This improves security: with strict control of the items in the inventory, it is possible to know whether an item belongs to the inventory and when an item is removed or added. This need motivates the design of an intelligent system that facilitates inventory control. By combining two frameworks, an algorithm was created that identifies and counts objects and also identifies the hand, in order to determine when a person manipulates the inventory. To achieve this objective, two tools were used: MediaPipe for hand detection, and YOLOv5 combined with the COCO dataset to identify and count objects. Testing showed that MediaPipe hand recognition had an accuracy of 96%, while object detection and classification with YOLO reached 43.7%. The main challenges for the algorithm were overlapping objects, occlusion and self-occlusion of objects, and loss of focus on items due to the sensor.]]></p></abstract>
<abstract abstract-type="short" xml:lang="es"><p><![CDATA[Resumen La visión por computadora (VC) puede ser un proceso que facilite algunas tareas en la gestión de inventarios, por medio de este proceso se puede realizar un análisis permanente de un inventario y así mantener registro de todos los movimientos realizados, entregando un reporte instantáneo cuando sea requerido. Esto supone una mejora en la seguridad, ya que al mantener un control estricto de los elementos existentes en el inventario se puede saber si un elemento pertenece o no a un inventario o cuando se retira o agrega un elemento, tras esta necesidad de control de inventario, surge la necesidad de diseñar un sistema inteligente que pueda facilitar el control de inventarios. Mediante la combinación de 2 frameworks, se realiza la creación de un algoritmo capaz de realizar la identificación y conteo de objetos, así como la identificación de la mano para determinar cuándo se realiza una manipulación humana al inventario. Para lograr este objetivo, se utilizaron dos algoritmos: MediaPipe y YOLOv5 combinado con el dataset de COCO, el primero se usó para la detección de manos y el segundo identifica y cuenta los objetos. Después de las pruebas realizadas al algoritmo se determina que el reconocimiento de manos de MediaPipe tuvo una precisión del 96% y la detección y clasificación de objetos usando YOLO fue de 43.7%. Teniendo como retos el algoritmo la superposición, la oclusión/auto oclusión de los objetos, o la pérdida de foco de los elementos debido al sensor.]]></p></abstract>
<kwd-group>
<kwd lng="es"><![CDATA[Conteo de objetos]]></kwd>
<kwd lng="es"><![CDATA[Detección de manos]]></kwd>
<kwd lng="es"><![CDATA[Detección de objetos]]></kwd>
<kwd lng="es"><![CDATA[MediaPipe]]></kwd>
<kwd lng="es"><![CDATA[YOLO]]></kwd>
<kwd lng="en"><![CDATA[Hand detection]]></kwd>
<kwd lng="en"><![CDATA[Object counting]]></kwd>
<kwd lng="en"><![CDATA[Object detection]]></kwd>
<kwd lng="en"><![CDATA[MediaPipe]]></kwd>
<kwd lng="en"><![CDATA[YOLO]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Voulodimos]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Doulamis]]></surname>
<given-names><![CDATA[N]]></given-names>
</name>
<name>
<surname><![CDATA[Doulamis]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Protopapadakis]]></surname>
<given-names><![CDATA[E]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Deep Learning for Computer Vision: A Brief Review]]></article-title>
<source><![CDATA[Computational Intelligence and Neuroscience]]></source>
<year>2018</year>
<volume>2018</volume>
<page-range>13</page-range></nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Christopher Bradley]]></surname>
<given-names><![CDATA[K]]></given-names>
</name>
</person-group>
<source><![CDATA[Computer Vision for Inventory Management]]></source>
<year>2020</year>
<publisher-name><![CDATA[Louisiana Tech University]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Felipe]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Leaned]]></surname>
<given-names><![CDATA[Q]]></given-names>
</name>
<name>
<surname><![CDATA[Frank]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[David]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Enrique]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Software component for weapons recognition in X-ray images]]></article-title>
<source><![CDATA[]]></source>
<year>2017</year>
<volume>11</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>162</page-range></nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="">
<collab>MediaPipe team</collab>
<source><![CDATA[MediaPipe Framework]]></source>
<year></year>
</nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[F]]></given-names>
</name>
<name>
<surname><![CDATA[Bazarevsky]]></surname>
<given-names><![CDATA[V]]></given-names>
</name>
<name>
<surname><![CDATA[Vakunov]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Tkachenka]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Sung]]></surname>
<given-names><![CDATA[G]]></given-names>
</name>
<name>
<surname><![CDATA[Chang]]></surname>
<given-names><![CDATA[CL]]></given-names>
</name>
</person-group>
<source><![CDATA[MediaPipe Hands: On-device Real-time Hand Tracking]]></source>
<year>2020</year>
</nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Redmon]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Divvala]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Girshick]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Farhadi]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<source><![CDATA[You Only Look Once: Unified, Real-Time Object Detection]]></source>
<year>2015</year>
</nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Szegedy]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[W]]></given-names>
</name>
<name>
<surname><![CDATA[Jia]]></surname>
<given-names><![CDATA[Y]]></given-names>
</name>
<name>
<surname><![CDATA[Sermanet]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
<name>
<surname><![CDATA[Reed]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Anguelov]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
</person-group>
<source><![CDATA[Going Deeper with Convolutions]]></source>
<year>2014</year>
</nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Jocher]]></surname>
<given-names><![CDATA[G]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[why do I need to train from the pt model you have trained? · Issue #2990 · ultralytics/yolov5]]></article-title>
<source><![CDATA[GitHub]]></source>
<year></year>
</nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bambach]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Crandall]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Yu]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
</person-group>
<source><![CDATA[EgoHands Object Detection Dataset]]></source>
<year></year>
</nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ahuja]]></surname>
<given-names><![CDATA[N]]></given-names>
</name>
<name>
<surname><![CDATA[Todorovic]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
</person-group>
<source><![CDATA[Learning the Taxonomy and Models of Categories Present in Arbitrary Images]]></source>
<year>2007</year>
<conf-name><![CDATA[11th International Conference on Computer Vision]]></conf-name>
<conf-date>2007</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ming Jin]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Zaid]]></surname>
<given-names><![CDATA[O]]></given-names>
</name>
<name>
<surname><![CDATA[Mohamed]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A review of hand gesture and sign language recognition techniques]]></article-title>
<source><![CDATA[International Journal of Machine Learning and Cybernetics]]></source>
<year>2017</year>
<volume>10</volume>
<page-range>131-53</page-range></nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Cordone]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
<name>
<surname><![CDATA[Miramond]]></surname>
<given-names><![CDATA[B]]></given-names>
</name>
<name>
<surname><![CDATA[Thierion]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
</person-group>
<source><![CDATA[Object Detection with Spiking Neural Networks on Automotive Event Data]]></source>
<year>2022</year>
</nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Felzenszwalb]]></surname>
<given-names><![CDATA[PF]]></given-names>
</name>
<name>
<surname><![CDATA[Girshick]]></surname>
<given-names><![CDATA[RB]]></given-names>
</name>
<name>
<surname><![CDATA[McAllester]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Ramanan]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Object Detection with Discriminatively Trained Part-Based Models]]></article-title>
<source><![CDATA[IEEE Transactions on Pattern Analysis and Machine Intelligence]]></source>
<year>2010</year>
<volume>32</volume>
<numero>9</numero>
<issue>9</issue>
<page-range>1627-45</page-range></nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Fidler]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Leonardis]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<source><![CDATA[Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts]]></source>
<year>2007</year>
<conf-name><![CDATA[Conference on Computer Vision and Pattern Recognition]]></conf-name>
<conf-date>2007</conf-date>
<conf-loc>Minneapolis, MN, USA</conf-loc>
<page-range>1-8</page-range></nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Arpita]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
<name>
<surname><![CDATA[Akshit]]></surname>
<given-names><![CDATA[T]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Real-time Vernacular Sign Language Recognition using MediaPipe and Machine Learning]]></article-title>
<source><![CDATA[]]></source>
<year>2021</year>
<volume>2</volume>
<numero>5</numero>
<issue>5</issue>
<page-range>9-17</page-range></nlm-citation>
</ref>
<ref id="B16">
<label>16</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Nathasia]]></surname>
<given-names><![CDATA[F]]></given-names>
</name>
<name>
<surname><![CDATA[Michael]]></surname>
<given-names><![CDATA[V]]></given-names>
</name>
<name>
<surname><![CDATA[Seto]]></surname>
<given-names><![CDATA[B]]></given-names>
</name>
<name>
<surname><![CDATA[Abdul]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Hand Gesture Recognition as Signal for Help using Deep Neural Network]]></article-title>
<source><![CDATA[International Journal of Emerging Technology and Advanced Engineering]]></source>
<year>2022</year>
<volume>12</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>37-47</page-range></nlm-citation>
</ref>
</ref-list>
</back>
</article>
