Co-Creation for Sign Language Processing and Translation Technology

Lepp, Lisa; Shterionov, Dimitar; De Sisto, Mirella; Chrupała, Grzegorz

doi:10.3390/info16040290

Department of Cognitive Science and Artificial Intelligence, Tilburg University, 5037 AB Tilburg, The Netherlands

* Authors to whom correspondence should be addressed.

Information 2025, 16(4), 290; https://doi.org/10.3390/info16040290

Submission received: 3 February 2025 / Revised: 7 March 2025 / Accepted: 11 March 2025 / Published: 4 April 2025

(This article belongs to the Special Issue Human and Machine Translation: Recent Trends and Foundations)

Abstract

Sign language machine translation (SLMT), the task of automatically translating between sign and spoken languages or between sign languages, is a complex task within the field of NLP. Its multi-modal and non-linear nature requires the joint efforts of sign language (SL) linguists, technical experts, and SL users. Effective user involvement is a challenge that can be addressed through co-creation. Co-creation has been formally defined in many fields, e.g., business, marketing, and education; however, in NLP, and in particular in SLMT, there is no formal, widely accepted definition. Starting from the inception and evolution of co-creation across various fields over time, we develop a relationship typology to address the collaboration between deaf, hard of hearing, and hearing researchers and the co-creation with SL users. We compare this new typology to the guiding principles of participatory design for NLP. We then assess 111 articles from the perspective of involvement of SL users and highlight the lack of involvement of the sign language community or users in decision-making processes required for effective co-creation. Finally, we derive formal guidelines for co-creation for SLMT which take the dynamic nature of co-creation throughout the life cycle of a research project into account.

Keywords:

sign language machine translation; co-creation; sign language community; relationship typology; techniques; sign languages; machine translation

1. Introduction

Machine translation (MT) has found a dominant place in the workflow of (professional and non-professional) translators and translation services. Since its inception in the 1950s, MT has evolved rapidly and reached unprecedented quality thanks to advances in AI, deep learning, and other related (sub)fields, innovations in the core processing technology (CPUs being replaced by GPUs), the rising volume and quality of data, and other factors. These advances have had a significant impact on text-to-text translation; however, when it comes to sign languages (SLs), used primarily by deaf and hard of hearing (HoH) individuals, the advances are very limited. Historically, researchers have used distinctive categories such as “hearing”, “HoH”, or “deaf” in lowercase rather than capitalized, as in “Deaf”, which signals the cultural and social gains of being deaf. As we try to place this article within a historical framework, we keep these words in lowercase, while we acknowledge that the division should originally be between signers and non-signers. Within the group of signers, different sub-categories can be present. Being hearing does not imply that a person cannot sign, just as being deaf does not automatically mean that a person can sign. However, as we show in the current paper, researchers without knowledge of or skills in SLs are making decisions that run counter to what the sign language community (SLC) needs or benefits from. These limitations can be attributed to many technical factors, e.g., data, processing capacity, and methodological gaps, but a large one is the human factor. Research on and with SLs, especially in the context of developing SL technology (SLT), has not only been predominantly hearing-led, but has also involved deaf people in an ineffective and often unethical way.

While this trend is changing, there is a need for a framework and guiding principles for effective and mutually beneficial collaboration between researchers and users: hearing, deaf, and HoH individuals. Such a framework should be able to answer two main questions. First, how does one establish a relationship between the researcher/developer and the citizen/end-user/community that allows the active, ethical, and fair involvement of both parties in the ideation, design, production, and/or delivery of research and development activities for the accomplishment of a common goal? Second, how does one maintain such a relationship so that both parties (as well as society in general) benefit? The latter relates to more granular decisions, such as how to determine who owns the data, how to ensure privacy during data collection and utilization, who decides what proper SL use is, which groups of SL users need to be involved, how much diversity is included, what effect such projects have on the SLC, and, in the end, who really benefits from such projects. In a large volume of prior work, the involvement of users (or customers, consumers, etc.) in a project’s life cycle has been framed as participatory research (users being involved as domain experts), co-design (users participating in identifying/defining solutions), and co-creation (a mutually beneficial engagement between different participants that facilitates future opportunities). This work stems predominantly from the fields of marketing and business (as discussed in Section 2). In NLP, and even more specifically in MT, there is very limited research on the topic of co-creation, with the work of Caselli et al. [1] being the most elaborate and indicative on the topic.

Harder et al. [2] define a participation relationship typology which links the degree of involvement of different participants (researchers and users) in a collaborative project to the degrees of societal impact and (mutual) benefit. We adopt this typology and adapt it to the specifics of SL processing (SLP) and SLMT with respect to the involved participants: deaf, HoH, and hearing. We follow its layered, multidimensional structure to assess the degree of involvement of SLCs in SLP and SLMT projects and then to formally define co-creation for SLP and SLMT research and development. To illustrate its applicability in the field of NLP and MT, we align it with the guidelines for participatory research set in [1].

The proposed definition sets a foundation that allows researchers and practitioners in the field to establish the right communication and collaboration protocols and engage in effective co-operation. We provide an initial framework for the development of unified standards in the field of SLMT and beyond. As a derivative of the typology of Harder et al. [2], our framework can (and should) be adopted and adapted to other fields.

This article is structured as follows: in Section 2, we first provide an overview of co-creation as defined and used in many different fields. In Section 3, we discuss the relationship between the researcher and the end-user, focusing on the typology of [2]. In Section 4, we discuss which adaptations are necessary for a proper usage of the existing typology for the SL user; in Section 5, we review 111 articles on SLMT and manually classify them based on the new typology. We wrap up the article in Section 6 by proposing formal guidelines for co-creation in the context of linguistic (theoretical) research and (practical) development in the field of SLMT.

2. Co-Creation

Co-creation, a term that has gained significant attention recently, has been used in many fields, including business, marketing, and education. Prahalad et al. [3] first used this term to denote the process of involving consumers in the creation of the project value and their competition for value [4]. Co-creation evolved from cooperative design, a movement in the 1960s in Sweden and Denmark in which workers participated in the design of IT systems that impacted their work [5]. Later, in the 1970s, a similar methodology was adopted in the US under the name of participatory design, and in the 1980s urban planners adopted the concept in the form of collaborative planning. In 2010, co-creation was the focus of the book “The Power of Co-creation” by Ramaswamy [6], a business-oriented book which describes co-creation as involving people in a collaborative process wherein businesses and customers work together to create value, products, or experiences, which in turn boost the overall value and the benefits of the (collaboration) network involved. In the current article, we adopt the definition of co-creation penned by Frow et al. [7]:

Definition 1.

Co-creation is a process which establishes and maintains a dialogue between users and developers not only in the research phase, but also in the phases of ideation, prototyping, and implementation.

As with many other methodologies, co-creation is not an “all or nothing” phenomenon (either you co-create or you do not), but is multi-layered, with varying levels of user involvement. In the context of SLMT research, some SLMT projects involve the SLC for evaluation of models [8,9,10,11,12], for data collection [13,14,15], in experimental situations [16,17,18], or in workshops [16] and interviews [19]. Others attempt to cooperate with or integrate the SLC in their research [20,21,22,23]. However, even when users are involved, the size and diversity of the population are important, and often the included population is not representative of the SLC or of the variation within it. For example, in [24], it is not clear whether the participants were deaf, and in [25] the participants were not native signers.

To define a co-creation methodology for SLP and in particular SLMT, we review works related to different forms of co-creation, as defined and used in various fields. We review works on participatory research, participatory or collaborative design (co-design), and co-creation. Later, we establish a co-creative approach to SLP research that is in line with existing works, fits the use cases of SLP, and can be positioned among these works.

2.1. Marketing/Business Perspective

Traditionally, innovators, developers, businesses, etc., produce goods or services for customers to buy or use with little to no involvement of the target user groups during the development process. This obsolete approach evolved to allow iterative dialogues between producer and consumer that can create value for both parties [26], and it has been referred to as co-production in the business literature (see [27] for a detailed overview of fundamental literature related to co-production) and later, as remarked in [28], as co-creation. As noted by Vargo and Lusch [29], the latter term better aligns with the concept of service-dominant logic, a marketing framework suggested by Prahalad et al. [3], which sees customers as actors who create and compete for value [3]. Consumers as a source of competence bring their knowledge and skills, their willingness to learn and experiment and engage in active dialogues. This competence, in addition, brings the consumer a competitive advantage, allowing them to judge and negotiate terms and prices, therefore extracting value from enterprises [3].

2.2. Social Domain

Within the social domain, co-creation refers to the collaboration between a variety of actors actively joining forces to tackle jointly defined challenges [30]. Actors might belong to various sectors of society, often spanning academia, government, industry, and societal partners.

2.3. Community-Based Participatory Research (CBPR)

The main goal of CBPR is to bring academics and community members into research partnerships [31]. In several fields like healthcare, the public domain, the social domain, and the public sector, working together with the citizen or customer became the new, modern norm [32]. The goal of cooperation with the community in the public domain, for example, is to meet societal needs by fundamentally altering the relationships, roles, and rules among the stakeholders involved, through an open process of participation, exchange, and collaboration with relevant parties, including end-users, thus transcending organizational boundaries and jurisdictions [32].

2.4. Responsible Research and Innovation (RRI)

Science and innovation outputs resonate across direct and indirect stakeholders spreading over whole societies. To increase positive impacts and lower the negative ones, the concept of RRI implies a shift “from solutions developed internally within the research community and only tolerated passively by society towards ones that are taking citizens and other actors actively into consideration as part of the development of solutions that are more apt to achieve desirable results with a high impact” [33].

2.5. In NLP and MT

Involving user communities is also important in the field of NLP. However, in NLP, the term “co-creation” is often used in the context of human–AI collaborations for effective content creation. Examples include the recent work by Sharma et al. [34], Konen et al. [35], and Ding et al. [36], among others, who investigate how large language models (LLMs) and (human) content creators can collaborate most effectively. Co-creative approaches have also been employed for specific tasks such as poetry generation [37], literature synthesis [38], interpreting [39], and others (an ACL Anthology search on 7 October 2024 returned 146 relevant works).

Another form of co-creation in NLP involves users as a knowledge resource for training models. However, to refer to this process as co-creation, following the definition of co-creation as an ongoing dialogue, researchers should consider works that continuously involve users in the various model, tool, or project development stages. Caselli et al. [1] highlight that too often language data are gathered in a way that neither considers nor involves language users. This is a broad problem in NLP, which is not limited to language minorities but also affects well-represented languages. Caselli et al. [1] call for shifting the perspective from language as data to language as people, given that language is produced by humans and has a deep function in human life. They propose nine guiding principles inspired by participatory design. These principles underline that design is a continuous process, recognize the importance of communication with communities and of creating understanding and trust, and aim to steer NLP towards producing what the communities need rather than “replacing humans”, including community feedback in the NLP pipeline. In addition, the work acknowledges that making one checklist that is valid for every community and every circumstance is impossible. These guidelines are presented in detail in Appendix A.

It is worth noting that participatory design was adopted in the early work of Ciravegna et al. [40], which involves engineers and users in a “linguistic engineering” process through a tool, called GEPPETTO, designed to support both roles. However, the embedded practices did not extend beyond the scope of that work; parts of the developed tool were integrated in the work of [41].

2.6. Participatory Research, Co-Design, and Co-Creation

While focusing on co-creation, the above also covers definitions related to participatory research and participatory or collaborative design (or co-design). It is important to note that these are not the same, and despite their overlaps, we ought to acknowledge their differences. Participatory research is the process of involving users as domain experts at all levels of the research process. Participatory design is about engaging with end-users as experts in their individual experiences. The contrast between participatory research and participatory design (or co-design) is the purpose of involvement: in the former, end-users aid the research and development process, while in the latter they contribute to identifying possible solutions. Co-creation, as defined above, is a form of engagement between businesses and end-users that is also about identifying new and emergent value opportunities for both parties (more details on the differences and similarities between these three terms can be found in Josh Morrow’s post [42]).

3. The Essential Elements of Co-Creation

Section 2 presents different definitions from different fields and for different purposes often revolving around several goals. This indicates that the formalization as well as application of co-creation are not universal but strongly depend on the involved stakeholders and use cases.

We identify two common threads: (i) a shared/common goal and (ii) a formal and sustainable relationship between end-users or consumers and researchers, developers, or businesses. The shared goal is the starting point of every cooperative research project, as clearly pointed out by Stier and Smit [30]. Each partner or stakeholder needs to define their problem to make sure that the objectives of all parties are aligned. Academics typically refer to an issue as a research problem, whereas a civil servant might describe it as a political or health problem [30]. Stier and Smit [30] also stress the importance of being frank, early in the collaboration process, about what one sees as one’s role in the co-creation. In doing so, it may be necessary to convey the objectives and underpinning values of one’s work and organization. Unclear roles can place responsibilities on the wrong participants, raising ethical dilemmas. Building a sustainable relationship with the citizen/end-user/actor/community and trying to involve them actively in the design, production, or delivery, i.e., not only as a passive participant but as an actively contributing and valuable partner with a joint responsibility, is key to co-creation.

Harder et al. [2] set up a framework of a participation relationship between researcher (A) and society (B) spread over three dimensions (depth, breadth, and scope of participation) and organized in a typology of six levels. These capture the typical processes, attitudes, assumptions (of A), and actions (of A) towards the users or society (B). Table 1 gives an overview of the typology in [2] of the relationships between A and B in terms of the typical processes.

In Levels −1 (Denigration) and 0 (Neglect), there is no involvement of the user community; only researchers are involved in development and decision-making. Society’s interests or needs are neither investigated nor taken into account. While Level 0 covers various forms of negligence of the researchers towards the actual needs and wishes of the users, Level −1 points towards a disruptive attitude with negative impact. On Level 1 (Learning About), the knowledge of society is acknowledged, but society (the user community) has no influence on the developed product or the development process. Level 2 (Learning From) involves active participation, but without influence on major decision points. Although this level number indicates positive processes to include society, it is still on the lower part of the ladder: substantial decision-making power remains in the hands of the researcher(s), and society has less power because it cannot influence the decision-making process. On Level 3 (Learning Together), discussions about relevant topics are held between A and B, there is consensus about the outcomes, and most decisions are jointly made. Level 4 (Learning As One) represents a full partnership where each stakeholder is entirely involved, and their skills, knowledge, and life experience are valued and employed to succeed in a shared goal.
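As a reading aid, the six levels described above can be written down as an ordered enumeration. The following Python sketch is ours and purely illustrative: the class name `ParticipationLevel` and the two helper functions are hypothetical and are not part of the typology of [2]; they only encode the statements made in the preceding paragraph.

```python
from enum import IntEnum


class ParticipationLevel(IntEnum):
    """Levels of the participation typology, as summarised in the text."""

    DENIGRATION = -1       # no involvement; disruptive attitude, negative impact
    NEGLECT = 0            # no involvement; users' needs are not investigated
    LEARNING_ABOUT = 1     # users' knowledge acknowledged, but no influence
    LEARNING_FROM = 2      # active participation, no say in major decisions
    LEARNING_TOGETHER = 3  # discussion, consensus, most decisions made jointly
    LEARNING_AS_ONE = 4    # full partnership towards a shared goal


def has_no_user_involvement(level: ParticipationLevel) -> bool:
    """Levels -1 and 0: only researchers develop and decide (hypothetical helper)."""
    return level <= ParticipationLevel.NEGLECT


def shares_decisions(level: ParticipationLevel) -> bool:
    """Levels 3 and 4: decisions are made jointly by A and B (hypothetical helper)."""
    return level >= ParticipationLevel.LEARNING_TOGETHER
```

Using an ordered enumeration reflects the "ladder" reading of the levels; it does not imply that the distances between levels are equal.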

In the context of these levels, different dimensions are clearly defined as follows:

Depth refers to the extent of control over decision-making by the involved participants. It also refers to the amount of power that each party has in the project. Harder et al. [2] first use the term depth in an educational setting, where they refer to the lower and higher status of the stakeholders.
Breadth refers to the extent of diversity that the group covers. Who are the stakeholders? Participants can be divided into different groups, such as leaders, wider society, advisers, technical teams, and so on. The idea behind this division into several groups is to include as much diversity as possible. However, this does not mean that participants from one category cannot “join” other groups. These boundaries need to be discussed with the society itself.
Scope refers to the various stages of the decision-making. This aspect of participation covers the initiation, the planning of the design, its implementation, the reflection, the communication, and the expected outcomes, both short-term and long-term.
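The three dimensions can be pictured as a simple record describing one project. The sketch below is an assumption-laden illustration, not an instrument proposed in [2]: the class `ParticipationProfile`, the numeric `user_decision_share` used for depth, the stage identifiers, and the coverage ratio are all hypothetical choices made only to show how the dimensions differ from one another.

```python
from dataclasses import dataclass
from typing import Set

# Stages listed under "scope" in the text; the identifiers are illustrative.
SCOPE_STAGES = (
    "initiation",
    "planning",
    "implementation",
    "reflection",
    "communication",
    "outcomes",
)


@dataclass
class ParticipationProfile:
    """Hypothetical description of one project along the three dimensions."""

    user_decision_share: float   # depth: assumed 0..1 share of decision power held by users
    stakeholder_groups: Set[str]  # breadth: which groups take part
    stages_with_users: Set[str]   # scope: stages in which users are involved

    def scope_coverage(self) -> float:
        """Fraction of the listed stages in which users take part."""
        covered = self.stages_with_users & set(SCOPE_STAGES)
        return len(covered) / len(SCOPE_STAGES)


profile = ParticipationProfile(
    user_decision_share=0.5,
    stakeholder_groups={"advisers", "technical team", "wider society"},
    stages_with_users={"initiation", "planning", "reflection"},
)
print(profile.scope_coverage())
```

Reducing depth to a single number is a strong simplification; in practice the distribution of power is negotiated and qualitative.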

Harder et al. [2] provide a generic framework that highlights the different involvement of users and the degree of collaboration between actors A and B, as well as the power (or influence) they have over the decision-making process(es). These are expressed along multiple dimensions, which can directly be used as a guide to enhance participants’ positions and leverage each other’s strengths. The typology of Harder et al. [2] covers three dimensions: breadth, depth, and scope. Our adaptation adds two new dimensions, directness of impact and growth, mostly expressed within Level −1 and Level 4, respectively. Motivated to reduce the historical disconnect between researchers in the field of MT and MT users, we adapt this framework for the specific case of SLMT, leading to the definition of co-creation (in SLMT) and guiding principles for its application.

4. Co-Creation for SLMT Research

In this section, we define co-creation for SLMT through (i) the definition of a common goal; (ii) an adaptation of the typology of Harder et al. [2] into basic and advanced versions for SLC participation; and (iii) the embedding of the guidelines of [1].

4.1. Common Goal

As noted above, co-creation is a process which revolves around a common goal. Ultimately, such a goal acts as a gauge for co-creative activities. For example, a co-creative development of a translation application for Indian languages, e.g., Kannada, Hindi, and Tamil, should involve L1 speakers of these languages as well as speakers of all three, whereas L1 speakers of, e.g., Bulgarian (with no knowledge of these languages) would have little to contribute. Alternatively, a co-creative development of an SL learning support application for hearing people should involve both hearing people as users and signers as experts.

In Definition 2, we formulate a description of a high-level common goal broad enough to cover the plethora of existing and potential future SL translation technological solutions:

Definition 2.

The research and development of technological solutions to the understanding and processing of sign languages and the translation between sign and spoken languages (and by proxy between sign and sign languages) to facilitate or improve the communication between hearing, HoH, and deaf individuals as well as to improve the access and dissemination of multilingual, multi-modal (video, audio and text) information.

4.2. Basic Relationship Typology for SLC Participation

In SLMT research and development, there has been a significant separation between researchers and users when it comes to hearing, HoH, and deaf individuals. Putting it into the frame of the typology of Harder et al. [2], Role A has been predominantly assumed by hearing researchers, sometimes with little to no background in sign languages; Role B has been often, but not always, assumed by signers. We explicitly address the involvement of SLCs, placing deaf, HoH, and hearing participants in the corresponding roles (A or B) and acknowledging their differences, strengths, and weaknesses in the specific use case, i.e., SLMT.

Both groups of participants, A and B, can generally involve deaf, HoH, and hearing individuals. Actor A, as used by [2], refers to the academics who conduct the research. In our case, these are professionals who conduct research in the field of SLMT, SLNLP, and SLP technology, a role which has predominantly been assumed by hearing researchers, as is evident from the literature review in Section 5.2. Therefore, in our typology, Role A is assumed by the hearing researcher. As noted earlier, in general, A should involve all three categories of actors. That is, researchers (A) do not necessarily need to be hearing, and in an ideal situation deaf, HoH, and hearing researchers work together. There is a (slow) ongoing shift in which more and more HoH and deaf researchers participate in SLMT and SLP research and, although still rarely, assume leading roles.

Similarly, Actor B can also be any user, signer or non-signer. With our primary objective of developing a framework for involving SLCs in SLP and SLMT research, we set Actor B to include the SL user and any SLC member, including HoH and deaf academic researchers (not involved as Actor A). We ought to stress that while the term Sign Language Community might raise the idea that an SLC covers all variations in SL fluency, equal access to a visual language, and educational possibilities, this is not the case, and such communities can be quite diverse (see clarifications by the US National Association of the Deaf [43]). Therefore, we place members of SLCs and SL users as Actor B so that SL users outside of an SLC, e.g., (Hearing) Children of Deaf Adults (CODAs), are acknowledged alongside individuals who belong to an SLC. (Similar to spoken languages, SLs are living languages with dialects and regional variations, particularly in vocabulary. Additionally, there may be homesigns, familysigns, villagesigns, and individual signs, among others. To date, most existing SLMT models, methods, and databases are based on standardized SL, which makes it difficult to recognize and process these variations. Since we advocate for co-creation to leverage the strengths of individual perspectives, it is crucial to consider these intersigner variations when developing SLMT models, methods, and databases.) However, we acknowledge that our typology does not dive into these differences within SLCs and among SL users. We leave this to future work.

After setting up Actors A and B, we ought to address the fact that the original typology cannot clearly quantify the amount of power distributed between hearing and HoH or deaf researchers, as the focus is mainly on co-creation. It is worth noting that the distribution of power, diversity, and involvement in different stages ranges from one extreme, where all decision-making power is in the hands of the hearing researchers (Actors A), who compose the complete consortium or collaboration network and are solely involved in all stages of the project, to a balance point, where consortia are built from deaf and hearing researchers (Actors A) who collaborate on equal terms and involve the SL user (Actor B) in all stages of the project, and where all stakeholders (Actors A and B) can exert decision-making power. A collaboration set in Level 3 or Level 4 implies that a conversation is held with the SL users, the SLCs, and their national associations of the deaf (NADs), ensuring that the SL users and the SLCs are adequately represented and involved.

We adapt the typology of Harder et al. [2] to take the aforementioned points into account; our adapted typology is shown in Table 2.

4.3. Advanced Relationship Typology for SLC Participation

Level 4 as presented in Table 2 does not capture the long-term potential for impact that such a collaboration may have on the various participants. To align with Principles 2, 3, and 9 of Caselli et al. [1] (see Section 4.4), we need the relationship typology to better reflect the notion of dynamics. To achieve this, we propose an extension of the typology with another dimension, growth:

Growth refers to the impact of a project on the development (potential) of the different actors and actor subcategories as professionals or in society.

To cover this dimension, we split Level 4 into three categories: Level 4.a, Learning as one; Level 4.b, Growing as one; and Level 4.c, Working as one. These new levels assume that the hearing, HoH, and deaf researchers and the SL users have equal positions, work on an equal basis with a shared amount of participation over the complete depth, breadth, and scope of the research life cycle, and can maintain a mutually beneficial collaborative network beyond the scope of a project.

Level 4.a. Learning as one. Establishing collaboration between researchers and SL users from the beginning of the project can maximize the knowledge exchange (e.g., seminars on different topics from both communities). When driven by a common goal, such collaboration has the potential to produce outputs that are beneficial and relevant to all stakeholders, who can, in parallel, acquire cross-disciplinary knowledge and expertise. In terms of scope, all types of stakeholder should be involved in all stages of the decision-making process. However, in terms of depth, this level does not distinguish how much power each stakeholder has in the process. Furthermore, it is unclear how diverse the consortium is, i.e., its breadth, and who the (SL) user is. Above all, it is unclear what the knowledge transfer flow is; that is, there is the pitfall that knowledge flows only to the (hearing) researchers, without reciprocal transfer to the larger user community (or society in general). Regardless, ensuring and supporting the participation, engagement, and development of deaf and hard of hearing (DHH) researchers helps mitigate this and supports a much-needed pipeline for DHH experts in this field.
Level 4.b. Growing as one. This level suggests that, in addition to learning as one, as in 4.a, potential avenues for creating value arise. These opportunities emerge as a result of changes, particularly in the power dimension, that present new possibilities for innovation, profit, and improvement. Thus, DHH and hearing researchers and SL users work together on an equal basis, are both integrated into the scope of the research cycle, and are presented with opportunities to grow (professionally and societally); however, the SL user is not involved in the execution of every relevant step and/or the societal diversity is not representative.
Level 4.c. Working as one. DHH and hearing researchers and SL users have a full consensus about the practices, the design is a continuous and reciprocal process, and both the hearing and the SL users are equally integrated into the scope, depth, and breadth of the research project.

Another aspect that needs a more nuanced representation, linked to the depth and breadth of a research project, relates to the direct and indirect impact of the project. NLP and MT projects address a very socially relevant and practically applied problem: the problem of language. Following our preliminary literature analysis, we notice that the articles that do not involve the community either have a direct impact or use case, or have an indirect impact or no immediate use case. In general, any research output impacts the user community to a varying extent, which is even more so for the SLCs, as, for example, video recordings of signers lack privacy, and the use of 3D avatars raises ethical issues. Work that does not involve the users can either have a very significant direct impact, e.g., being part of a bigger research effort, or have an indirect and/or small impact. The development, without the deaf community, of sign language translation gloves, which have a specific use case, is an example of the former case, while the development of a new method for sign language recognition (based on existing corpora) for the purposes of MT is an example of the latter case. These variations should be acknowledged in a realistic, practically applicable typology, as failing to do so would create a misguided classification or organization. To achieve this in our typology, we propose splitting Level −1 in two: Level −1.a covers the lack of involvement, or unwillingness to involve, a diverse, representative user group when the product has a direct impact; Level −1.b covers the lack of involvement of a diverse, representative user group when the product has no direct impact or is part of a bigger project. As per Harder et al. [2], −1.a may have a detrimental impact on society, while −1.b does not.

This advanced typology is summarised in Table 3.

4.4. Alignment with Participatory Design Guiding Principles

As outlined in Section 2, Caselli et al. [1] call for better involvement of language users in NLP research. They advocate for change in perspective from language as data to language as people. To achieve this, Caselli et al. [1] propose nine guidelines inspired by participatory design. These are presented in detail in Appendix A and briefly listed here:

1.: Participatory design is about consensus and conflict.
2.: Design is an inherently disordered and unfinished process.
3.: Communities are often not determined a priori.
4.: Data and communities are not separate things.
5.: Community involvement is not scraping.
6.: Never stop designing.
7.: Text is a means rather than an end.
8.: The thin red line between consent and intrusion.
9.: The need to combine research goals, funding and societal political dynamics.

We align these guidelines with our four-level typology. We aim to show how our framework embeds them and can be used to judge the extent to which these are considered in the assessment of a project.

1.: Consensus and conflict are embedded in the communication between A and B throughout Levels 1 to 4, where on Level 1 there is barely any consensus and conflicts remain unresolved while on Level 4 consensus is achieved and conflicts are resolved.
2.: To capture the concept of a continuous, reflexive, and ongoing design process, our typology assumes a frequency and volume of knowledge exchange and user community expansion. Levels −1 and 0 are at one far end, where such exchange is nonexistent, while Level 4 assumes exchange of various types of knowledge that cover the plethora of expertise and expansion of the community with the growth of the project.
3.: As per the previous point, the complete set of user communities is not determined a priori, but rather through the development process (an interesting example is the involvement of deaf–blind participants in the SignON project (https://signon-project.eu/ accessed on 3 February 2025), which was not defined at the start of the project).
4.: The assumption that communities are only data providers raises the question of where the separation line between SL user and researchers is or in which cases the SL user indeed only provides data. In the last case, we can categorize this on Level 2. Levels 3 and 4 imply that the community can be involved in data production but users can take other roles, too. With Levels −1 to 1, the community is not involved in data generation.
5.: According to the community involvement is not scraping principle, ethical, equal, respectful, and reciprocal social interactions are necessary for the creation or development of a tool for a specific community. Ethical engagement and expectation management should be a process conducted on Level 3 (as learning from each other’s needs) and Level 4 (in discussion with each other). We further split Level 4 into three categories. Our Level 4.b. and Level 4.c assume the necessity of working together as equals, with clear ethical practices already described; Level 4.a. assumes these are still to be developed and set in place. Ideally, working on equal levels is the most desired arrangement; however, in most of the current SLMT projects, this step is neither implemented nor discussed (evident from our analysis presented in Section 5).
6.: As acknowledged above, the interaction with the community should be continuous and frequent in order to never stop designing for a better solution. By including SLCs, technical and resource issues can be decreased and participant effort can be recognized as labor.
7.: We stress the original formulation of the 7th principle of [1] Text is a means rather than an end. In order to capture different modalities of language, e.g., text, audio, video, we rephrase this principle as Language is a means rather than an end. This principle can be reflected in Levels 2 to 4. Within Level 1 and below, the lack of communication and developing solutions without the involvement of the SLC utilize language data without reflecting on its impact on the community. This principle is most prominent on Level 4b (growing as one) and Level 4c (working as one). We ought to note that in most of the current SLMT work, this principle is comparable with Level 2, as the researchers need the SLC for a switch in perspective, or Level 3, wherein both parties have a discussion and consensus about which perspective is followed.
8.: The thin red line between consent and intrusion is a principle already embedded in the lower levels (−1 and 0). At these levels the line is crossed, as the development of technology without proper involvement could be considered intrusion (there are plenty of examples of intrusive technology, such as SL gloves, that are not accepted by SLCs); from Level 1 onward, as soon as some form of recognition of language as people is formed, this principle is considered in its positive form.
9.: The complex dynamics of funding (for projects that support co-creation with the community) as well as goal formation for the research projects, and the community itself impact collaboration. Until recently, the majority of SLMT projects were not supported by national or international grants, and thus were localized within a research team. As such, they fall on Level 1 or Level 2. For active and effective collaboration, e.g., Level 4, this principle should transcend our typology and be adopted by funding bodies and agencies. In our typology, we do not specifically integrate this principle. However, we acknowledge the need for forming collaborative teams, for which a common framework with sufficient financial, societal, and political support is a prerequisite for Level 3 and 4 collaborations.

4.5. Definition of Co-Creation in SLMT

The definition of the shared goal in Section 4.1, the adaptation of the basic typology of [2] in Section 4.3, and the embedded guidelines of [1] in Section 4.4 lead to the following definition of co-creation for SLMT:

Definition 3.

A collaboration between SL user and/or SLC members and researchers focusing on achieving a common goal, on identifying new and emerging value opportunities for both parties through continuous, reflexive, and iterative knowledge exchange, discussions and consensus-building across the various stages of the research project.

4.6. Assessment Criteria

We present specific evaluation criteria that aid in determining whether a research project fits on a particular level.

1.: Level −1: If no SL user is involved in any research stage, yet the work directly impacts the SL user and/or the SLCs, it is categorized as Level −1a. An example at this level is the study in which hearing participants learned ASL in a 3-hour-long tutorial and then produced these signs as data for the development of MT systems [44]. This work has a direct societal impact, especially when considering the potential issues with inaccurate data being captured, the overlooking of the complexity of and variations in SLs, and the needs of the SLCs. In contrast, when a project has no direct impact on the SL user or the SLCs, it is classified as Level −1b. An example of work on Level −1b could be a project on the development of a new SL recognition model which can be used in an SLMT pipeline: if the work itself does not involve SL users, then it is regarded as Level −1; the fact that its potential impact is restricted to its application in an SLMT project places it in subcategory b, i.e., Level −1b.
2.: Level 0: If the SL user is involved but only to a limited extent, e.g., one stage, of the MT research life cycle, with no evidence of integrating the views of the user in the project or with evidence of ignoring these views or neglecting the wider community (e.g., focusing on a very limited subsample of users), the project is classified as Level 0. For example, a project that involves a single deaf participant who translates a written text into SL is problematic, not only because a translation of written (spoken) content is not original SL data but mostly because the limited number of participants assumes that this individual is a perfect representative of a large group of SL users or of an SLC, therefore neglecting their differences and diverse views.
3.: Level 1: When SL users are involved in two or more stages of the MT research life cycle but are not involved in the decision-making process, it is categorized as Level 1. For example, in the study by [18], researchers developed a bilingual corpus annotated and verified by SL linguists and involved deaf students in the evaluation process. This approach ensures the SLC is engaged in one or more tasks across the breadth and scope of the project. However, these participants were not involved in the decision-making process and had less influence compared to the researchers.
4.: Level 2: If SL users and/or SLCs are involved in multiple or all phases of the MT research life cycle and their ideas, opinions and/or views partially influence the decision-making process but the leading researchers have the final say, then the project is classified as Level 2. For example, deaf participants may be contacted through Deaf Studies programs to provide translation input, offer advice on SL grammar and linguistics, and evaluate the translated content.
5.: Level 3: If SL users and SLCs are involved in most or all stages of the project, provide ideas, opinions, and views which are taken into account, and most of the decisions are made with their consensus, then such work is assigned to Level 3.
6.: Level 4: If a consortium includes hearing, HoH, and deaf researchers/developers with complementary and necessary skills who jointly contribute to achieving the common objective, address relevant issues together, continuously exchange knowledge, and engage the SL users and the SLCs at various stages of the project and on a regular basis, with their input being considered, discussed, and integrated in the project (i.e., lead to or make a decision), it is generally classified as Level 4. When the collaboration is implemented with no view for future opportunities (and user involvement), we regard it as Level 4a. When SL users and SLCs collaborate on equal footing with hearing researchers, opening potential avenues for creating value (i.e., there is a notable shift in the power dynamic), reflecting new opportunities for innovation, profit, and progress, but there are cases of misrepresentation or lack of diversity, then such a project is regarded as Level 4b. In Level 4c, in comparison to the criteria for Level 4b, a project should have a wider span of users covering all nuances and diversity of SL users and SLCs (or at least be open to and provide the possibility for such wider user community integration).
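
The criteria above can be read as a decision procedure. The following is a minimal sketch in Python, not part of the assessment carried out in this article: the `Project` fields, the `Influence` enum, and the order of the checks are hypothetical simplifications that reduce each criterion to a single value, whereas the actual assessment relies on a close reading of each paper.

```python
from dataclasses import dataclass
from enum import Enum


class Influence(Enum):
    """How far SL users' views shape decisions (hypothetical encoding)."""
    NONE = 0       # views are not integrated in decision-making
    PARTIAL = 1    # views partially count; leading researchers have the final say
    CONSENSUS = 2  # most decisions are made with the users' consensus


@dataclass
class Project:
    stages_with_sl_users: int   # number of research stages that involve SL users
    direct_impact: bool         # does the work directly impact SL users/SLCs?
    influence: Influence = Influence.NONE
    mixed_consortium: bool = False            # hearing, HoH, and deaf researchers
    future_value_opportunities: bool = False  # collaboration opens new value
    diverse_user_coverage: bool = False       # wide, diverse span of users


def assess_level(p: Project) -> str:
    """Map a project description to a typology level (simplified)."""
    if p.stages_with_sl_users == 0:
        # Level -1: no SL user involved; split by direct vs. indirect impact.
        return "-1a" if p.direct_impact else "-1b"
    if p.mixed_consortium and p.influence is Influence.CONSENSUS:
        # Level 4: mixed consortium with user input integrated in decisions.
        if not p.future_value_opportunities:
            return "4a"
        return "4c" if p.diverse_user_coverage else "4b"
    if p.influence is Influence.CONSENSUS:
        return "3"
    if p.influence is Influence.PARTIAL:
        return "2"
    # No influence on decisions: Level 1 needs two or more stages.
    return "1" if p.stages_with_sl_users >= 2 else "0"


print(assess_level(Project(stages_with_sl_users=0, direct_impact=True)))
print(assess_level(Project(stages_with_sl_users=2, direct_impact=True)))
```

Under these assumptions, a project without any SL user involvement but with direct impact maps to Level −1a, and a project that involves SL users in two stages without any decision-making influence maps to Level 1.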

5. Literature Review

In 2023 and 2024, two review articles of SLMT were published [45,46]. These presented an overview of a large volume of the literature on SLMT focusing on the technological solutions, different approaches, and historical advancements. These two review articles provide an extensive technical overview of SLMT, but little is noted on the inclusion of the deaf community (and the user community in general). To the best of our knowledge, these are the most complete overviews of related work to date. That is why we conducted an additional overview and analyzed the SLMT-related papers that were considered in articles [45,46] from the perspective of SL user involvement. We then categorized them according to our newly proposed typology.

5.1. Selection and Filtering Criteria

In total, we reviewed 127 papers from [46] and [45] (57 and 70, respectively).

The initial analysis yielded a combined number of 193 articles. Next, we screened these papers by reading the (sub)titles, abstract, and participants, and we filtered out 66 papers. The remaining 127 articles were thoroughly read to identify to what extent the SLC has been involved in the research life cycle. From our final analysis, we excluded 16 more articles, resulting in 111 articles. Our criteria were the following:

The paper needs to be mentioned in [45] or [46];
The paper needs to be open-access;
The study should focus on SLMT;
The study should focus on SLs or on the translation from SLs to SpLs or in reverse, but not only on spoken languages;
It must be clear in how much and to what extent the SL user was involved.

After these exclusion steps, the remaining 111 papers, 57.5% of the original 193 papers, were considered in the following discussion. We list the 111 papers in Supplementary Materials.
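
The screening numbers reported in this section can be re-derived with a few lines of arithmetic. The sketch below uses only the counts stated above; the variable names are illustrative.

```python
# Counts as reported in Section 5.1.
initial = 193             # combined number of articles from [45] and [46]
removed_screening = 66    # filtered out after reading (sub)titles, abstract, participants
removed_full_read = 16    # excluded after thorough reading

after_screening = initial - removed_screening
final = after_screening - removed_full_read
share = round(100 * final / initial, 1)

# The 127 thoroughly read papers split into 57 and 70 across the two reviews.
assert after_screening == 57 + 70

print(after_screening, final, share)
```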

In our analysis, we looked to what extent SL users were involved in the different stages of the research and development described in these articles and how this involvement spread out over the newly proposed typology, following the assessment criteria in Section 4.6. Our review unveiled that SL users have not been involved beyond Level 2. Table 4 shows the total amount of reviewed papers categorized over the different levels; in Section 5.2, we present a more detailed analysis of the literature review.

5.2. Distribution of Articles over Levels of Involvement

We analyzed the SL users’ involvement in the different stages of the projects described in 111 papers. We ought to note that despite the negative connotation of Levels −1 and 0, our typology merely assesses the process of co-creation and does not question the validity or the robustness of the conducted research, except if explicitly noted so. Table 4 offers an overview of the total amount of reviewed papers categorized over the different participation levels. In general, what we observe is that the SL users have not been involved in Level 3 and Level 4.

Below, we outline how the articles were categorized according to the typological levels, provide examples of common assumptions made to explain this categorization, and discuss how the SLC and SL users can be engaged in a more ethical and efficient manner.

Articles presenting work that only focuses on the MT process or on the comparison of models and techniques without the involvement of SL users (nor assessing the impact on the user community in general) and is conducted only by hearing researchers were categorized at Level −1. As described in Section 4, this level includes hearing researchers who make decisions without the SLC being involved (and sometimes contrary to the SLCs’ interests). In 75% (i.e., 83 out of 111 articles) of the reviewed articles, SL users were not involved in any of the tasks or stages of the research life cycle. An example of this is study [44], in which 11 hearing participants were asked to learn American Sign Language (ASL) signs in a 3-hour tutorial; these were then recorded and used in the development of the proposed method and its analysis. The implementation of the proposed method and its analysis not only relies on data that are not representative of natural ASL, but also overlooks the complexity of SLs and the needs of SL communities. The assumption that watching a 3-hour tutorial is sufficient to learn a new language in a different modality brings into question the validity and robustness of the research results. (Despite the limited scope of the presented work, SLs have extensive vocabulary, complex grammar, and, even more, operate on different simultaneously produced parameters, e.g., hands, face, torso, etc.) Additionally, it raises ethical concerns, suggesting the potential marginalization and discrimination of the SLC. This example is therefore categorized as Level −1a. As discussed in Section 4.3, work that does not involve the users can have a direct impact or an indirect impact on the user community (and on society in general): Level −1a or Level −1b. From the 83 articles categorized under Level −1, 4 are labeled as −1a and 79 as −1b (see Table 5).

In about 4.8% (i.e., 4 of the 83 articles) of the same cases, there are human participants (that are not the researchers themselves) involved who are not representative of the primary user group, i.e., the SL user and/or SLCs. An example of this is [47], wherein the authors created the RWTH-Phoenix-weather database based on hearing interpreter services. We categorize this under Level −1b as (1) the data are significantly influenced by the written text, with the researchers selecting the appropriate participants; (2) the outputs of the data are used in further formalizations and systems of different MT topics related to recognition and/or production, which makes further research based on natural language processing less reliable; (3) the recordings should have been provided by a combination of deaf and hearing interpreters, as the translation was mostly based on written text; and (4) because of the latter points, the impact on the SLC is indirect.

At Level 0, we consider research that involves SL users, but their participation is limited to tasks such as data collection, recording, or annotation, without contributing to ideation, research, or development. SL users are not involved in the decision-making; the research team comprises only non-signers. Based on this criterion, we conclude that 13 out of the 111 reviewed articles, i.e., 11.7%, are at Level 0. An example is Morrissey’s article [48], which includes one native Irish SL-signer for data collection—for the recordings of the dialogue and its manual translation. While the concept of recording a native signer to create a corpus is valuable, as it captures natural language production, the number of participants raises concerns. With only one participant, it is difficult to capture language variation, not only dialectal variants but also variations related to diversity and educational level of the participants. Another example is the work of Massó and Badia [49]. While they create a corpus involving a native deaf signer, their corpus creation approach is problematic. First, it is a written text (that is, originally created in spoken language) which is then translated into sign language; thus, the signed content is not original, which, according to [50], is a suboptimal situation. Second, they involve only one person, and therefore do not take into account any kind of variation. And third, there is no information about whether this person was familiar with the domain of translation, which can impact the translation output. Additionally, their experiments do not include human evaluation, which further undermines the validity of their results. For this reason, we categorize their paper at Level 0.

The criterion for Level 1 is deaf signer involvement in both content/data creation, e.g., recordings or annotations, and in system assessment and feedback on the system (without their direct involvement in the development process). There are 13 articles, i.e., 11.7%, which fall in this category. In these articles, the authors have different approaches: some contacted students who are native signers, i.e., individuals who were born deaf [9], to grade the utility of the proposed approach, while others combined hearing evaluators who are experienced with a signed language and native signers [12] for the evaluation of the translated sentences. In [18], Su and Wu develop a bilingual corpus annotated and verified by SL linguists, involving 10 deaf students in the evaluation process. What these examples show is that the SLC is involved in one or more different tasks over the breadth and scope of the research cycle. Even more important, it is an example of the power imbalance in the relationship between the participants, i.e., students evaluating content for their professors.

Level 2 builds on Level 1 with the addition of the criterion that different National Associations of the Deaf (NADs) are involved. These organizations are specialized in their SLCs, and therefore can provide better expertise, guidance, and communication with, for, and about the SLC. Only 2 out of the 111 papers align with this criterion, i.e., about 1.8%. In both [15,51], NADs are involved. Approaches differ: in [15], deaf colleagues and members of the SLCs were contacted via Deaf Studies university programs to choose a domain for SLT, to cooperate on the human translation, to advise on SL grammar and linguistics, and to evaluate the translated output. In Jantunen et al. [51], a section is also devoted to co-engineering, participation, and culture.

For an article to be classified at Level 3 or at Level 4, the research should involve the SLCs or SL users in the decision-making process for an agreement based on discussion (Level 3) and have a balanced research team or consortium composed of signers and non-signers (Level 4). None of the articles we reviewed fulfilled these criteria.
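
The shares quoted in this section follow directly from the reported counts. A short check in Python is sketched below; the dictionary keys are illustrative labels, and the counts are those stated in the text.

```python
# Number of reviewed articles per typology level, as reported in Section 5.2.
counts = {"-1": 83, "0": 13, "1": 13, "2": 2, "3": 0, "4": 0}
total = sum(counts.values())

# Percentage of the reviewed articles at each level, rounded to one decimal.
shares = {level: round(100 * n / total, 1) for level, n in counts.items()}

# Split of Level -1 into its two subcategories.
level_minus_1 = {"-1a": 4, "-1b": 79}
share_1a = round(100 * level_minus_1["-1a"] / counts["-1"], 1)

print(total, shares, share_1a)
```

The rounded share for Level −1 is 74.8%, which the text reports as 75%.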

6. Proposal of Formal Guidelines for Adopting Co-Creation in SLMT Projects

Following the review of co-creation applied in different fields (including NLP) summarized in Section 2, the newly proposed typology (Section 4), the definition of co-creation (Definition 3), and assessment criteria (Section 4.6), as well as the review of SLMT-related literature in Section 5, here, we set up co-creation guidelines for the adoption of co-creative activities for SLP and SLMT projects to reach Level (4a), Level (4b), or even Level (4c).

6.1. Challenges

Positionality and privileges of hearing, non-signing researchers. (There are other forms of biases that should be avoided. However, these are not in the scope of our work and therefore not discussed here.) As we noted in our literature review (Section 5), sign language projects have been led by hearing, non-signing individuals, creating bias in the landscape of SLMT research. Although established traditions and legacy educational outcomes for DHH people in the field are still favoring the aforementioned group of researchers, we are noticing a shift towards more inclusive research, which should be promoted and needs to become the default practice. Similar to [52], our study shows that while some individuals show curiosity about and deep knowledge and understanding of the existing biased and systematic oppression, there is still a strong imbalance where technical, (primarily) hearing researchers create new or maintain existing power structures, as shown in our review of the 111 papers. This leads to exclusion or diminished inclusion of the SLC members and SL users in the research process.
Inclusion for the sake of inclusion. Including SL users in predominantly hearing-led projects without genuinely considering their unique perspectives is both ineffective and unethical. We categorize this approach as Level 0. As Holcomb et al. [52] note,
- deafness is earnestly viewed as a benefit and a valuable contribution to the world, a concept known as “Deaf Gain”. In other words, it is argued that comparing hearing people to deaf people should be understood as comparing apples to oranges. Each has its own unique advantages and disadvantages, but both are valuable, can thrive in environments that support their natures, and can enrich the human experience in positive ways [52].
Size of the user population. SLCs, and SL users in general, are a small population. As such, many individuals are requested to participate in such projects over and over again. This results in research fatigue, where the same population is repeatedly asked to participate in technical projects without receiving any tangible benefits from their contributions. As a result, SL users become fatigued by these constant requests [53].
Adhering to ethical protocols. While in our work we address the topic of co-creation (with SL users and the SLC), we do not delve into the topic of ethical protocols for research in or with the SLC, such as the guiding principles outlined by the Sign Language Linguistics Society (SLLS) in their Ethics Statement (https://slls.eu/slls-ethics-statement/, accessed on 3 February 2025). We acknowledge that such protocols need to be in place for any research involving human participants, and therefore, SLMT projects require the assessment of ethical committees prior to their commencement so that ethical, fair, transparent, and sustainable collaboration with linguistic and cultural minority groups is ensured.

6.2. Proposals

We support the suggestion of Harder et al. [2] and encourage researchers to tailor our newly proposed typology to their specific use case by defining the initial participants in Groups A and B while remaining flexible in adjusting these groups as the project progresses.
We propose that ongoing and planned activities include regular self-evaluations based on the proposed typology to assess the level at which their work is categorized.
SLP requires multi-disciplinary research, bringing together researchers with diverse backgrounds and expertise. To conduct research in such a multi-disciplinary environment, which imposes difficulties on aligning objectives, agreeing on methodologies, and so on, we recommend that all participants start the conversation with SL users and SLCs in the ideation phase and build a trans-disciplinary network with other (deaf) researchers or disciplines, such as, for example, Deaf Studies and NADs.
As mentioned in Section 4, the original typology of [2] does not allow us to show the proportions or power between hearing and HoH or deaf researchers, as the focus is mainly on co-creation with the society. Our typology is based on the history of technological hearing-led projects, and we categorize deaf and Hard-of-Hearing researchers under the concept of SLCs. This is obviously incorrect, as shown in our typology in Table 3 by the distinction of Level 2 and Level 3 and Level 3 and Level 4, wherein we slightly shift from “tasks” in-between the levels and the implementation of HoH and deaf researchers. The addition of Levels 4b and 4c (along with 4a) is also significant. The “learning as one” level still implicates the categorization of HoH/deaf versus hearing researchers and the SL users, while the level of “growing as one” and further puts these three categories (hearing, HoH, and deaf researchers) into one box with the SLC as an opposite. We suggest creating consortia with fair involvement of these types of researchers in strategic positions, moving beyond the predominantly hearing consortia. However, the included participants should have an active role, avoiding the challenge of “inclusion for the sake of inclusion”.
We propose including hearing, HoH, and deaf individuals (researchers and users) in the discussion and the establishment of common goals to allow all participants to benefit on an equal basis.
We propose establishing communication and dissemination protocols from the beginning. This would involve hiring interpreters, and therefore budget should be provisioned from the inception of a project. Furthermore, communication and dissemination should be conducted in a language in which the participants are fluent to reduce miscommunication and misunderstandings. Communicating timelines can also manage expectations, leading to achievable goals.
We propose expanding the breadth, depth, and scope of the project during its evolution through including more users and advancing the technology to address their needs.
We propose ensuring that research activities involving users have received ethical approvals from research ethics committees before these activities begin. It is important to value the privacy of participants and respect their wishes (e.g., in the case of participation withdrawal).
We emphasize that co-creation is a dynamic process and changes should be welcome.
Continuous assessment is beneficial for expectation management and alignment of participants and goals.
We ought to stress that, while co-creation should be executed as a dynamic and continuous process, researchers should take into account the time and personal constraints of users and be aware of research fatigue.

7. Conclusions

This study investigates the topic of co-creation for SLMT research. After reviewing existing work on co-creation as defined in different fields and using the work of Caselli et al. [1] and Harder et al. [2], we develop a participation typology for the involvement of SLCs in SLNLP and SLMT research. We then apply this typology to assess 111 existing articles on the topic of SLMT. Our study shows that the articles we reviewed are (mainly) centered around a technical perspective and that language is seen as data, in contrast to the proposal in [1] to see language as people rather than data; most of the 111 reviewed articles are placed on Level −1 (denigration), Level 0 (neglect), or Level 1 (learning from). That is, none of the reviewed articles involve SL users and deaf researchers at Level 3 or Level 4 of our proposed typology. Furthermore, the number of papers decreases with the increase in the typology level (i.e., Level −1: 83 articles, Level 0: 13 articles, Level 1: 13 articles, Level 2: 2 articles). This is an indication that a change has been set in motion, in which collaboration with the SLC is increasing but that the tools for a fair, ethical, and responsible collaboration and co-creation are missing. We also note that systemic barriers to participation and advancement in research careers remain a legacy obstacle that higher education and research-active institutions must address with SLCs and other parts of the educational ecosystem.

Guided by the advanced typology, we first present assessment criteria and then propose a set of nine guidelines to improve the SLMT research landscape (from a co-creative perspective); we note four overarching challenges that these guidelines tackle. The co-creation principles discussed in this paper could be adapted to other domains beyond SLMT. These principles should be followed in parallel with Caselli et al.’s [1].

Finally, it is worth stressing the dynamic nature of co-creation. As projects evolve, so do communities and their requirements. We do not explore this temporal aspect in detail and leave it for future work, recognizing the importance of considering changing requirements. In the future, we also plan to investigate the manner in which the different levels of participation and different users correlate and dig deeper into assessing the participation of users at different phases of an MT project (or ML/DL in general).

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/info16040290/s1.

Author Contributions

Conceptualization: L.L., M.D.S. and D.S.; Methodology: L.L., M.D.S. and D.S.; Validation: D.S.; Formal analysis: L.L.; Investigation: L.L., M.D.S. and D.S.; Resources: L.L., M.D.S. and D.S.; Data curation: L.L., M.D.S. and D.S.; Writing—original draft: L.L., M.D.S. and D.S.; Writing—review and editing: L.L., M.D.S., D.S. and G.C.; Visualization: L.L. and D.S.; Supervision: M.D.S., D.S. and G.C.; Project administration: M.D.S., D.S. and G.C.; Funding acquisition: D.S. and M.D.S.; L.L., M.D.S. and D.S. contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research has been partially funded by a Starter Grant according to the Administrative Agreement signed by the Minister of OC&W, the Netherlands Association of Universities of Applied Sciences, and the Association of Universities of the Netherlands, as well as by.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

We would like to express our sincere gratitude to Lorraine Leeson for her thoughtful and constructive feedback which greatly contributed to improving the quality of our work.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this paper:

CBPR	Community-Based Participatory Research
CPUs	Central Processing Units
GPUs	Graphics Processing Units
HoH	Hard of Hearing
LLMs	Large Language Models
MT	Machine Translation
NADs	National Associations for the deaf
NLP	Natural Language Processing
PD	Participatory Design
SLs	Sign Languages
SLCs	Sign Language Communities
SLMT	Sign Language Machine Translation
SLNLP	Sign Language Natural Language Processing
SLP	Sign Language Processing
SpLs	Spoken Languages

Appendix A. The Nine Guidelines of Caselli et al. [1]

1. PD is about consensus and conflict. The design of co-creation should be conducted in discussion and alignment between the involved parties.
2. Design is an inherently disordered and unfinished process. The design should be a continuous, reflexive, and ongoing process (Principles 2 and 6 of [1] and Level 4c of our proposed typology in Table 3). Ref. [1] mentions that the term community needs to be defined in a reflexive and adaptable manner that reflects its continuous changes. Ref. [2] assumes that this definition is a fixed format, based on the amount of power of different researchers (i.e., hearing, HoH, or deaf) to define the SLC.
3. Communities are often not determined a priori.
4. Data and communities are not separate things. Principle 4 of [1] rests on the expectation that communities have a prominent role in the development of NLP systems, whereas until now they have most often functioned only as language data providers. This raises the question of where the dividing line between SL users and researchers lies, and in which cases the SL user indeed only provides data. The latter case can be assigned to Level 2 of [2].
5. Community involvement is not scraping. Principle 5 describes social interactions as necessary for the creation or development of a tool for a specific community, and states that ethical engagement, equity, reciprocity, and respect should also be discussed. Levels 4b and 4c assume working together in equality, with clear ethical practices already described, so this principle is also hard to assign to one level. Ideally, working on an equal level is the highest possible achievement, although in most current SLMT projects this step is not implemented or discussed. The development of expectations and ethical engagement belongs at Level 3 (as this part is meant as learning from each other's needs) or Level 4 (in discussion with each other); once this has been discussed and decided, the principle can be divided into Levels 4b and 4c for the execution. Even in this case, an attitude of reciprocity is needed for reflection on and adaptation of the execution.
6. Never stop designing. Principle 6 states that when an NLP tool is based on PD, there should be awareness of the needs of the SLC, and they should be included in the design stage. By including them, technical and resource issues can be decreased, and participant effort can be recognized as labor.
7. Language is a means rather than an end (note that the original principle in [1] is “text is a means rather than an end”, which we have specified as language in this article). Principle 7 refers to a switch in perspective from language as data to language as people, wherein the main focus should be to serve people’s needs instead of trying to copy people’s language use. This principle can ideally be compared with Level 4b (growing as one) or Level 4c (working as one), but in most current SLMT projects it is comparable with Level 2, as the researchers need the SLC for this perspective switch, or with Level 3, wherein both parties discuss and reach consensus about which perspective is followed.
8. The thin red line between consent and intrusion. Principle 8 can already be part of some of the lower levels, as soon as some form of recognition of language as people is formed, so this principle can be seen as “learning about” (Level 1) or “learning from” (Level 2).
9. The need to combine research goals, funding, and societal political dynamics. The last principle, Principle 9, refers to the complex dynamics of funding (for projects that support co-creation with the community), the goals of the research projects, and the community itself. As most SLMT projects are not supported by a grant that covers such adaptations, this principle can be compared to Level 1 or Level 2.

References

1. Caselli, T.; Cibin, R.; Conforti, C.; Encinas, E.; Teli, M. Guiding Principles for Participatory Design-inspired Natural Language Processing. In Proceedings of the 1st Workshop on NLP for Positive Impact, Online, 5 August 2021; pp. 27–35.
2. Harder, M.K.; Hoover, E.M.; Burford, G. What is participation? Design Leads the way to a Cross-Disciplinary Framework. Des. Issues 2013, 29, 41–57.
3. Prahalad, C.K.; Ramaswamy, V. Co-opting customer competence. Harv. Bus. Rev. 2000, 78, 79–90.
4. Prahalad, C.K.; Ramaswamy, V. Co-creation experiences: The next practice in value creation. J. Interact. Mark. 2004, 18, 5–14.
5. Göransdotter, M.; Redström, J. Design Methods and Critical Historiography: An Example from Swedish User-Centered Design. Des. Issues 2018, 34, 20–30.
6. Ramaswamy, V. The Power of Co-Creation: Build It with Them to Boost Growth, Productivity, and Profits; Simon and Schuster: New York, NY, USA, 2010.
7. Frow, P.; Nenonen, S.; Payne, A.; Storbacka, K. Managing Co-Creation Design: A Strategic Approach to Innovation. Br. J. Manag. 2015, 26, 463–483.
8. Almeida, I.; Coheur, L.; Candeias, S. Coupling natural language processing and animation synthesis in portuguese sign language translation. In Proceedings of the 4th Workshop on Vision and Language, Lisbon, Portugal, 18 September 2015.
9. Chiu, Y.; Wu, C.H.; Su, H.Y.; Cheng, C.J. Joint optimization of word alignment and epenthesis generation for Chinese to Taiwanese sign synthesis. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 29, 28–39.
10. Foong, O.M.; Low, T.J.; La, W.W. V2s: Voice to Sign Language Translation System for Malaysian Deaf People; Springer: Berlin/Heidelberg, Germany, 2009.
11. Stein, D.; Bungeroth, J.; Ney, H. Morpho-Syntax Based Statistical Methods for Automatic Sign Language Translation. In Proceedings of the European Association for Machine Translation Conferences/Workshops, Sheffield, UK, 24–27 June 2024.
12. Wu, C.H.; Su, H.Y.; Chiu, Y.H.; Lin, C.H. Transfer-based statistical translation of Taiwanese sign language using PCFG. ACM Trans. Asian Lang. Inf. Process. (TALIP) 2007, 6, 1227851.
13. Guo, D.; Zhou, W.; Li, A.; Li, H.; Wang, M. Hierarchical recurrent deep fusion using adaptive clip summarization for sign language translation. IEEE Trans. Image Process. 2019, 29, 1575–1590.
14. Ko, S.K.; Kim, C.J.; Jung, H.; Cho, C. Neural sign language translation based on human keypoint estimation. Appl. Sci. 2019, 9, 2683.
15. Morrissey, S.; Way, A. Joining hands: Developing a sign language machine translation system with and for the deaf community. In Proceedings of the Conference and Workshop on Assistive Technologies for People with Vision and Hearing Impairments: Assistive Technology for All Ages, Granada, Spain, 21–30 August 2007.
16. López-Ludeña, V.; González-Morcillo, C.; López, J.C.; Ferreiro, E.; Ferreiros, J.; San-Segundo, R. Methodology for developing an advanced communications system for the Deaf in a new domain. Knowl. Based Syst. 2014, 56, 240–252.
17. Mazzei, A.; Lesmo, L.; Battaglino, C.; Vendrame, M.; Bucciarelli, M. Deep natural language processing for italian sign language translation. In Congress of the Italian Association for Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2013.
18. Su, H.Y.; Wu, C.H. Improving Structural Statistical Machine Translation for Sign Language With Small Corpus Using Thematic Role Templates as Translation Memory. IEEE Trans. Audio Speech Lang. Process. 2009, 17, 1305–1315.
19. David, B.; Bouillon, P. Prototype of automatic translation to the sign language of French-speaking Belgium. Evaluation by the deaf community. Model. Meas. Control C 2018, 79, 162–167.
20. Baldassarri, S.; Cerezo, E.; Royo-Santas, F. Automatic Translation System to Spanish Sign Language with a Virtual Interpreter; Springer: Berlin/Heidelberg, Germany, 2009.
21. Bauer, B.; Nießen, S.; Hienz, H. Towards an automatic sign language translation system. In Proceedings of the 1st International Workshop on Physicality and Tangibility in Interaction: Towards New Paradigms for Interaction Beyond the Desktop, Siena, Italy, 20 October 1999.
22. Morrissey, S.; Way, A. Manual labour: Tackling machine translation for sign languages. Mach. Transl. 2013, 27, 25–64.
23. Sagawa, H.; Ohki, M.; Sakyiama, T.; Oohira, E.; Ikeda, H.; Fujisawa, H. Pattern recognition and synthesis for a sign language translation system. J. Vis. Lang. Comput. 1996, 7, 109–127.
24. Mistry, J.; Inden, B. An approach to sign language translation using the intel realsense camera. In Proceedings of the 2018 10th Computer Science and Electronic Engineering (CEEC), Essex, UK, 19–21 September 2018.
25. Adnan, N.; Wan, K.; AB, S.; Bakar, J. Learning and manipulating human’s fingertip bending data for sign language translation using pca-bmu classifier. CREAM Curr. Res. Malays. 2013, 3, 361–372.
26. Ballantyne, D. Dialogue and its role in the development of relationship specific knowledge. J. Bus. Ind. Mark. 2004, 19, 114–123.
27. Bendapudi, N.; Leone, R.P. Psychological Implications of Customer Participation in Co-Production. J. Mark. 2003, 67, 14–28.
28. Vargo, S.L.; Lusch, R.F. Service-Dominant Logic: What It Is, What It Is Not, What It Might Be. In The Service-Dominant Logic of Marketing; Routledge: London, UK, 2006; 14p.
29. Vargo, S.L.; Lusch, R.F. Evolving to a New Dominant Logic for Marketing. J. Mark. 2004, 68, 1–17.
30. Stier, J.; Smit, S. Co-creation as an innovative setting to improve the uptake of scientific knowledge: Overcoming obstacles, understanding considerations and applying enablers to improve scientific impact in society. J. Innov. Entrep. 2021, 10, 35.
31. Stack, E.; McDonald, K. We Are “Both in Charge, the Academics and Self-Advocates”: Empowerment in Community-Based Participatory Research. J. Pract. Intellect. Disabil. 2018, 15, 80–89.
32. Voorberg, W.H.; Bekkers, V.J.J.M.; Tummers, L.G. A Systematic Review of Co-Creation and Co-Production: Embarking on the social innovation journey. Public Manag. Rev. 2015, 17, 1333–1357.
33. Owen, R.; Pansera, M. Responsible Innovation and Responsible Research and Innovation; Edward Elgar Publishing: Cheltenham, UK, 2019.
34. Sharma, A.; Rao, S.; Brockett, C.; Malhotra, A.; Jojic, N.; Dolan, B. Investigating Agency of LLMs in Human-AI Collaboration Tasks. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, St. Julian’s, Malta, 17–22 March 2024; Volume 1, pp. 1968–1987.
35. Konen, K.; Jentzsch, S.; Diallo, D.; Schütt, P.; Bensch, O.; El Baff, R.; Opitz, D.; Hecking, T. Style Vectors for Steering Generative Large Language Models. In Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, St. Julian’s, Malta, 17–22 March 2024; pp. 782–802.
36. Ding, Z.; Smith-Renner, A.; Zhang, W.; Tetreault, J.; Jaimes, A. Harnessing the power of LLMs: Evaluating human-AI text co-creation through the lens of news headline generation. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, 6–10 December 2023; pp. 3321–3339.
37. Gonçalo Oliveira, H.; Mendes, T.; Boavida, A. Co-PoeTryMe: A Co-Creative Interface for the Composition of Poetry. In Proceedings of the 10th International Conference on Natural Language Generation, Santiago de Compostela, Spain, 4–7 September 2017; pp. 70–71.
38. Manjavacas, E.; Karsdorp, F.; Burtenshaw, B.; Kestemont, M. Synthetic Literature: Writing Science Fiction in a Co-Creative Process. In Proceedings of the Workshop on Computational Creativity in Natural Language Generation (CC-NLG 2017), Santiago de Compostela, Spain, 4 September 2017; pp. 29–37.
39. Nakaguchi, T.; Otani, M.; Takasaki, T.; Ishida, T. Combining Human Inputters and Language Services to provide Multi-language support system for International Symposiums. In Proceedings of the 3rd International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT2016), Osaka, Japan, 1–11 December 2016; pp. 28–35.
40. Ciravegna, F.; Lavelli, A.; Petrelli, D.; Pianesi, F. Participatory Design for Linguistic Engineering: The Case of the GEPPETTO Development Environment. In Proceedings of the Computational Environments for Grammar Development and Linguistic Engineering, Madrid, Spain, 12 July 1997.
41. Lavelli, A.; Pianesi, F.; Maci, E.; Prodanof, I.; Dini, L.; Mazzini, G. SiSSA—An Infrastructure for NLP Application Development. In Proceedings of the ACL 2001 Workshop on Sharing Tools and Resources, Toulouse, France, 7 July 2001.
42. Morrow, J. Co-Creation, Participatory Research, Co-Design, or Participatory Design… Which Is It? 2022. Available online: https://medium.com/@Josh.Morrow.1/co-creation-participatory-research-co-design-or-participatory-design-which-is-it-fa14a7f542c1 (accessed on 3 February 2025).
43. NAD—Community and Culture—Frequently Asked Questions. Available online: https://www.pa.gov/content/dam/copapwp-pagov/en/dli/documents/individuals/disability-services/odhh/odhh-resources/documents/community%20and%20culture.pdf (accessed on 3 February 2025).
44. Fang, B.; Co, J.; Zhang, M. DeepASL: Enabling Ubiquitous and Non-Intrusive Word and Sentence-Level Sign Language Translation; Association for Computing Machinery: New York, NY, USA, 2017.
45. Núñez-Marcos, A.; Perez-de Viñaspre, O.; Labaka, G. A Survey on Sign Language Machine Translation; Elsevier: Amsterdam, The Netherlands, 2023.
46. De Coster, M.; Shterionov, D.; Van Herreweghe, M.; Joni, D. Machine Translation from Signed to Spoken Languages: State of the Art and Challenges; Springer: Berlin/Heidelberg, Germany, 2024.
47. Forster, J.; Schmidt, C.A.; Hoyoux, T.; Koller, O.; Zelle, U.; Piater, J.H.; Ney, H. RWTH-PHOENIX-Weather: A Large Vocabulary Sign Language Recognition and Translation Corpus. In Proceedings of the International Conference on Language Resources and Evaluation, Istanbul, Turkey, 21–27 May 2012.
48. Morrissey, S. Assessing three representation methods for sign language machine translation and evaluation. In Proceedings of the 15th Annual Meeting of the European Association for Machine Translation, Leuven, Belgium, 30–31 May 2011.
49. Massó, G.; Badia, T. Dealing with Sign Language Morphemes in Statistical Machine Translation. In Proceedings of the LREC2010 4th Workshop on the Representation and Processing of Sign Languages: Corpora and Sign Language Technologies, Valletta, Malta, 17–23 May 2010; pp. 154–157.
50. De Sisto, M.; Vandeghinste, V.; Santiago Egea Gómez, M.D.C.; Shterionov, D.; Saggion, H. Challenges with Sign Language Datasets for Sign Language Recognition and Translation. In Proceedings of the Language Resources and Evaluation Conference, Marseille, France, 20–25 June 2022; pp. 2478–2487.
51. Jantunen, T.; Rousi, R.; Rainó, P.; Turunen, M.; Valipoor, M.M.; Narsico, G. Is There Any Hope for Developing Automated Translation Technology for Sign Languages? In Multilingual Facilitation; University of Helsinki: Helsinki, Finland, 2021.
52. Holcomb, L.; Hall, W.C.; Gardiner-Walsh, S.J.; Scott, J. Challenging the “norm”: A critical look at deaf-hearing comparison studies in research. J. Deaf. Stud. Deaf. Educ. 2024, 30, enae048.
53. Meulder, M.D.; Landuyt, D.V.; Omardeen, R. Lessons in co-creation: The inconvenient truths of inclusive sign language technology development. arXiv 2024, arXiv:2408.13171.

Table 1. A typology of relationships of participation between researchers (A) and society (B) [2].

Level (−1)	Level (0)	Level (1)	Level (2)	Level (3)	Level (4)
Denigration	Neglect	Learning About	Learning from	Learning Together	Learning as One
A makes decisions without B’s involvement (sometimes contrary to B’s interests).	A makes decisions without B’s involvement: ignorant or dismissive of B’s interests.	A asks B’s opinions, but does not feel obliged to take them into account: A makes the final decisions.	A asks B’s opinions and considers B’s contribution seriously. A still makes the final decision.	Major issues are negotiated through discussion between A and B. Most decisions are made jointly, e.g., by consensus-building.	A–B consortium discusses relevant issues by focusing on the ideas themselves, rather than the source of ideas.

Table 2. A typology of relationships of participation between deaf, hearing, and HoH participants—both as users and as researchers/developers.

Level (−1)	Level (0)	Level (1)	Level (2)	Level (3)	Level (4)
Denigration	Neglect	Learning About	Learning from	Learning Together	Learning as One
Hearing researchers make decisions without the involvement of the SLCs (neither HoH nor deaf researchers), contrary to the SLCs’ interests.	Hearing researchers make decisions without the involvement of the SLCs (neither HoH nor deaf researchers), ignorant or dismissive of the SLCs’ interests.	Hearing researchers ask for the opinions of the SLC and the users (and/or HoH or deaf researchers), but do not necessarily take them into account: the hearing researchers make the final decisions.	Hearing researchers ask for the opinions of the SLC and the users and consider them seriously. Hearing researchers still make the final decision based on this information; HoH and deaf researchers are asked for evaluation, but not included in the process.	Major objectives and issues are discussed/negotiated jointly, involving hearing, HoH, and deaf researchers, as well as SL users. Most decisions are made jointly, e.g., by consensus-building.	A jointly built consortium that includes hearing, HoH, and deaf researchers, as well as SLC members, discusses relevant issues through knowledge exchange (e.g., seminars on different topics from all involved communities).

Table 3. Advanced typology of relationships of participation between deaf, hearing, and HoH participants—both as users and as researchers/developers.

Level (−1)	Level (−1)	Level (0)	Level (1)	Level (2)	Level (3)	Level (4)	Level (4)	Level (4)
Denigration Direct Impact	Denigration Indirect Impact	Neglect	Learning About	Learning from	Learning Together	Learning as One	Growing as One	Working as One
Hearing researchers make decisions without the involvement of the SLC (neither HoH nor deaf researchers), contrary to the SLC’s interests, producing outputs with direct impact on the SLC.	Hearing researchers make decisions without the involvement of the SLC (neither HoH nor deaf researchers), contrary to or unaware of the SLC’s interests, producing outputs with no direct impact on the SLC.	Hearing researchers make decisions without the involvement of the SLC (neither HoH nor deaf researchers), ignorant or dismissive of the SLC’s interests.	Hearing researchers ask for the opinions of the SLCs and the users (and/or HoH or deaf researchers), but do not necessarily take them into account: the hearing researchers make the final decisions.	Hearing researchers ask for the opinions of the SLCs and the users and consider them seriously. Hearing researchers still make the final decision based on this information; HoH and deaf researchers are asked for evaluation, but not included in the process.	Major objectives and issues are discussed/negotiated jointly, involving hearing, HoH, and deaf researchers, as well as SL users. Most decisions are made jointly, e.g., by consensus-building.	A jointly built consortium that includes hearing, HoH, and deaf researchers, as well as SLC members, discusses relevant issues through knowledge exchange (e.g., seminars on different topics from all involved communities).	Hearing, HoH, and deaf researchers, as well as SL users, work together on an equal basis and are all integrated into the scope of the research cycle, but the SL user is not involved in the execution of each step and/or the societal diversity is not representative.	Hearing, HoH, and deaf researchers, as well as SL users, have a full consensus about the practices, the design is a continuous process, and both the hearing researchers and the SL users are equally integrated into the scope, depth, and breadth of the research project.
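
As an illustration only, the nine columns of Table 3 can be written down as an ordered data structure, which may be convenient when annotating articles or project phases with a level. The Python sketch below is not part of the original study: the class name `ParticipationLevel`, the helper `by_code`, and the short codes are hypothetical. The codes −1a/−1b and 4b/4c follow the labels used in Table 5 and Appendix A, whereas the code 4a for "Learning as One" and the assignment of a/b to direct/indirect impact are assumptions based on column order.

```python
from enum import Enum


class ParticipationLevel(Enum):
    """Columns of Table 3, listed from lowest to highest participation.

    Each member carries a short code and the column label. The codes
    "4a" and the a/b split of Level -1 are assumptions (see lead-in).
    """

    DENIGRATION_DIRECT = ("-1a", "Denigration Direct Impact")
    DENIGRATION_INDIRECT = ("-1b", "Denigration Indirect Impact")
    NEGLECT = ("0", "Neglect")
    LEARNING_ABOUT = ("1", "Learning About")
    LEARNING_FROM = ("2", "Learning from")
    LEARNING_TOGETHER = ("3", "Learning Together")
    LEARNING_AS_ONE = ("4a", "Learning as One")
    GROWING_AS_ONE = ("4b", "Growing as One")
    WORKING_AS_ONE = ("4c", "Working as One")

    def __init__(self, code, label):
        self.code = code
        self.label = label

    @property
    def base_level(self):
        """Level number as in Tables 1 and 2, ignoring the sub-level letter."""
        return int(self.code.rstrip("abc"))


def by_code(code):
    """Return the level whose short code matches, e.g. "4b"."""
    for level in ParticipationLevel:
        if level.code == code:
            return level
    raise ValueError(f"unknown level code: {code!r}")


if __name__ == "__main__":
    for level in ParticipationLevel:
        print(f"Level {level.code:>3}  {level.label}")
```

Keeping the sub-levels as separate members while exposing `base_level` lets the same annotation be read against either the basic typology (Tables 1 and 2) or the advanced one (Table 3).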

Table 4. The number of reviewed articles per level.

Typological Level	Number of Articles
Level −1	83
Level 0	13
Level 1	13
Level 2	2
Level 3	0
Level 4	0
Total	111

Table 5. The categorization of Level −1 over the reviewed articles.

Typological Level	Number of Articles
Level −1a	4
Level −1b	79
Total	83
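
The counts in Tables 4 and 5 can be cross-checked and turned into shares with a few lines of Python. This is a minimal sketch rather than part of the original analysis: the variable and function names are hypothetical, and only the counts themselves are taken from the two tables.

```python
# Counts copied from Table 4 (articles per typological level).
ARTICLES_PER_LEVEL = {"-1": 83, "0": 13, "1": 13, "2": 2, "3": 0, "4": 0}

# Counts copied from Table 5 (split of Level -1 into sub-levels).
LEVEL_MINUS_ONE_SPLIT = {"-1a": 4, "-1b": 79}


def shares(counts):
    """Return each count as a percentage of the total of all counts."""
    total = sum(counts.values())
    return {key: 100 * n / total for key, n in counts.items()}


# Consistency checks: Table 4 sums to the 111 reviewed articles,
# and Table 5 sums to the Level -1 row of Table 4.
assert sum(ARTICLES_PER_LEVEL.values()) == 111
assert sum(LEVEL_MINUS_ONE_SPLIT.values()) == ARTICLES_PER_LEVEL["-1"]

if __name__ == "__main__":
    for level, pct in shares(ARTICLES_PER_LEVEL).items():
        print(f"Level {level:>2}: {pct:5.1f}%")
```

With these counts, Level −1 accounts for roughly 74.8% of the 111 reviewed articles, and no article reaches Level 3 or Level 4.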

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
