The study of phonological proximity makes it possible to establish a basis for future decision-making in the treatment of sign languages. Knowing how close a set of signs are allows the interested party to decide more easily its study by clustering, as well as the teaching of the language to third parties based on similarities. In addition, it lays the foundation for strengthening disambiguation modules in automatic recognition systems. To the best of our knowledge, this is the first study of its kind for Costa Rican Sign Language (LESCO, for its Spanish acronym), and forms the basis for one of the modules of the already operational system of sign and speech editing called the International Platform for Sign Language Edition (PIELS). A database of 2665 signs, grouped into eight contexts, is used, and a comparison of similarity measures is made, using standard statistical formulas to measure their degree of correlation. This corpus will be especially useful in machine learning approaches. In this work, we have proposed an analysis of different similarity measures between signs in order to find out the phonological proximity between them. After analyzing the results obtained, we can conclude that LESCO is a sign language with high levels of phonological proximity, particularly in the orientation and location components, but they are noticeably lower in the form component. We have also concluded as an outstanding contribution of our research that automatic recognition systems can take as a basis for their first prototypes the contexts or sign domains that map to clusters with lower levels of similarity. As mentioned, the results obtained have multiple applications such as in the teaching area or the Natural Language Processing area for automatic recognition tasks.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited