Ideophones and Realia in a Santome / Portuguese Bilingual Dictionary

Abstract: In this work, we discuss how Araujo & Hagemeijer's Santome/Portuguese bilingual dictionary defines and describes ideophones and realia lemmata. We show that ideophones were listed individually along with their expression counterparts. Realia lemmata (words and expressions for culture-specific items) or specialized lexical units were presented in their Santomean forms, followed by a description of their endemic specificities. Many realia items from Santome can also be found in Portuguese. We conclude that the authors contribute to the lexicographic record of ideophones, lexical items that do not exist in Portuguese but are relevant to the language and culture of Santome. On the other hand, with the documentation of realia entries, they contribute to the validation of lexical units (originating in Santome) in the local vernacular variety of São Tomé and Príncipe's Portuguese, a common historical practice in Portuguese lexicography.


Introduction
Santome, or Forro (cri 1 ), is a Portuguese-based creole language spoken by about 35,000 people in the Democratic Republic of São Tomé and Príncipe in the Gulf of Guinea (INE 2012). The goal of this article is to lexicographically discuss the definition and description of two linguistic units in Araujo and Hagemeijer's (2013) bilingual 2 Santome/Portuguese dictionary: the ideophone and realia lemmata. We will show that the documentation of these lexical units in the Santome/Portuguese dictionary may help the description of the lexicon of some vernacular varieties of São Tomé and Príncipe Portuguese.
Four Portuguese-based creole languages emerged in the Gulf of Guinea in the sixteenth century: Santome and Angolar on the island of São Tomé, Lung'Ie on Príncipe, and Fa d'Ambô on Annobón (Ferraz 1979; Hagemeijer 2009; Bandeira 2017). All four languages are descendants of the Portuguese-based Gulf of Guinea Proto-Creole, which was created at the beginning of the sixteenth century during contact between Portuguese colonists and the African populations brought as slaves to the island of São Tomé. Isolation, migration of certain groups from the island, and linguistic contributions from the African languages by way of the constant renewal of the enslaved population contributed to the proto-creole's speciation. Santome developed in the colonial centers on the island of São Tomé, while Angolar (aoa) was the language of the descendants of runaway slaves who escaped from the plantations and founded maroon communities. Proto-Creole speakers were taken to the islands of Príncipe and Annobón, where local conditions contributed to diversification and gave rise to Lung'Ie (pri) and Fa d'Ambô (fab), respectively (Bandeira et al. 2019).
In recent years, these languages have undergone grammatization processes 3 thanks to the production of dictionaries such as the Dicionário livre do santome-português (Araujo and Hagemeijer 2013). Santome and its sister languages (Angolar and Lung'Ie) have autonomous linguistic structures and are mutually unintelligible, even though they share a Portuguese lexical base and a common origin. However, all these languages have ideophones 4 (Bartens 2000), and due to the nature of their ecolinguistic systems, they also have realia items or specialized lexical units, that is, words and expressions for culture-specific items related to endemic fauna and flora, culture, and technology (Vlahov and Florin 1969, p. 432; Bergenholtz and Tarp 1995, p. 15). In general, understanding the linguistic nature of ideophones and realia items in any minority language allows the lexicographer to address these phenomena in a way that encompasses linguistic facts, respecting the characteristics of the working languages and the scientific accuracy of the dictionary.
The Santome-Portuguese bilingual dictionary's audience consists of readers who are monolingual in Portuguese, readers who are bilingual in Portuguese and Santome, and scholars. Portuguese has been the official language of São Tomé and Príncipe since 1975 and is presently spoken by more than 98% of the population (Araujo 2020). Although São Tomé and Príncipe is the smallest of all Portuguese-speaking countries, Portuguese is currently spoken by the majority of its 200,000 inhabitants. It has been reported that 80% of people under 20 years of age may speak only Portuguese (Bouchard 2019; Araujo 2020, p. 193).
In most Western languages with an alphabetic tradition, the development of dictionaries in print, unlike digital media (software and applications), has promoted the use of an access structure based on a systematic order. For this reason, classifying each lemma is crucial in assisting a user who is looking for lexical items in printed lexicographical works. In digital media, automated search and cross-referencing systems make this issue less salient. There is little discussion of complex units and encyclopedic lemmata in Portuguese dictionaries in the literature (Bacelar do Nascimento 2006; Martins 2013; Bacelar do Nascimento et al. 2006). However, in the history of lexicography, the recording of complex lexical items in both monolingual and bilingual dictionaries underwent distinct phases. In Portuguese lexicography, such items are treated as main lemmata and sometimes as subentries, confined to encyclopedic or specialized dictionaries, or simply ignored. In Bluteau's (1712-1728) Vocabulário portuguez e latino, áulico, anatômico, architectonico, etc., for example, realia lemmata in Brazilian Portuguese without equivalents in European Portuguese were classified as 'Brazilian words' (brasileirismos), creating a new and prolific category (see Frankenberg-Garcia 2017).
The current text is organized as follows. In Section 1, we define ideophones, exemplify their usage in Santome, and discuss the solutions for presenting such lexical units in a bilingual dictionary. In Section 2, we approach the realia lemmata in Santome and describe how they were documented. The final section presents concluding remarks.

Definition
The term ideophone has been used in the literature to define a lexical unit with a high degree of syntactic rigidity formed by a noun and a qualitative, a verb and a predicate, or an adverb commonly related to colors, sounds, smells, actions, states, or intensity (Araujo 2009; Bartens 2000; Costa 2017; Doke 1935; Voeltz 1971; Westermann 1907). Bartens (2000, p. 14) argued that ideophones typically present sounds and combinations of sounds not found in the phonological inventory of the language. This may be the case with the ideophones analyzed here because they may not respect some phonotactic characteristics of their languages. Costa (2017, p. 8), for example, mentions some special suprasegmental features such as vowel lengthening in Santome. Nevertheless, ideophones in Santome have been translated into Portuguese in the literature (Araujo 2009; Araujo and Hagemeijer 2013; Costa 2017; Ferraz 1979) using superlatives or the formula 'very x' or 'x-ish,' where 'x' is the first lexical item of the ideophone lexical unit. In Portuguese or English translations, adverbs such as 'genuinely', 'strongly', and 'extremely' or lexical items indicating intensity or quality of excess or repetition have been traditionally used.
3 Auroux (1992): grammatization "is the process by which all the fluxes and flows [flux] through which symbolic (that is, also, existential) acts are linked, can be discretized, formalized and reproduced. The most well-known of this process is the writing of a language" (Stiegler 2011, p. 172).
4 The term 'ideophone' here differs from sound symbolism, as used in Bantu linguistics. See Bartens (2000) for a comprehensive analysis of ideophones in Creole languages.
Languages 2020, 5, 56
In addition, even though there are ideophones whose origin may be onomatopoeic, the etymological origin of this phenomenon may be related to simple lexical items in African languages (Ferraz 1979; Costa 2017; Hagemeijer and Ogie 2011). However, due to the historical process, speakers do not have any intuition about the historical link, and the original terms themselves have been changed through morpho-phonological reinterpretation or loss and addition of segmental and suprasegmental material (Ferraz 1979). Finally, ideophones should always occur in expressions or in connection with their nouns or verbs. Therefore, speakers do not utter ideophones without a context or without anaphoric syntactic relations in the discourse (Araujo 2009; Costa 2017). An ideophone and its related lexical unit are called an 'expression,' and the second part itself is the ideophone, as proposed by Araujo and Hagemeijer (hereinafter, A & H 2013 in the examples). In (1) [...]
Furthermore, ideophones occur in settings restricted to only one lexical item, such as in (3), and rarely form units with more than one word. 8 When this occurs, the lexical items related to the ideophone are cognates or semantically related, as in (4). 9
(3) moli mogogogo 10 [ˈmɔli mɔˈgɔmɔˈgɔ] (expr.) Molíssimo. (A & H 2013, s. v.).
(4) a. blanku fenene 11 [ˈbl [...]
Some authors (Brindle 2011; Friesen 2016; Voeltz 1971; Yakpo 2019) have stressed the onomatopoeic character of ideophones, but in Santome, there are linguistic motivations other than the mimetic character of a sound. Nevertheless, some ideophones, though not all, may be related to onomatopoeia:
Ferraz (1979, pp. 75-78) was the first to address ideophones in Santome. Although the author does not effectively offer a comprehensive description, Ferraz stated that the ideophone is an element of a category that groups any word for which the modifying form of a verb or noun is repeated or duplicated (Araujo 2009, p. 25). In addition, Ferraz (1979) listed as ideophones reduplicated words that do not behave as expressions, such as leve-leve 'more or less'. However, Ferraz's observation about partial or full reduplication of syllables is generally attested, with few exceptions, as shown by the following Santome examples.
5 All examples are from Araujo and Hagemeijer (2013). In the text, they appear exactly as in the dictionary. English simplified equivalents are presented in the footnotes.
6 pletu lululu 'very black'. See pletu 'black'.
7 kota kla 'to cut in half'. See kota 'to cut'.
8 The present work does not aim to provide an in-depth discussion of ideophones. For this, see Bartens (2000) for ideophones in several languages. For ideophones in Santome, see Araujo (2009) and Costa (2017). All examples were translated from Portuguese.
9 Although rare, it is not impossible for an ideophone to be used with semantically unrelated lexical units, as can be seen in kulu dĩĩĩ 'very dark, deep night' and da son din 'falling flat' (Costa 2017, p. 51).

Ideophones in Santome
Araujo (2009) argued that not every ideophone in Santome contains a reduplicated form because there are monosyllabic ideophones (7a), ideophones in which all syllables are different (7b), ideophones with all syllables repeated (7c), with only the two initial syllables repeated (7d), or with only the two final syllables repeated (7e), and monosyllabic ideophones with a long vowel (7f). Costa (2017, p. 68) stated that 54% of ideophones contain repeated or reduplicated parts. In the next section, we will address how Araujo and Hagemeijer included lemmata with ideophones in their dictionary.
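The six shapes listed above can be read as a small decision procedure. The following Python sketch is only an illustration of that reading, not a tool used by the authors: the function name, the category labels, and the hand-made syllabifications are assumptions introduced here, and vowel length is passed in as a flag because spelling alone may not show it.

```python
from typing import List


def classify_ideophone(syllables: List[str], long_vowel: bool = False) -> str:
    """Assign an ideophone to one of the shapes described in the text.

    `syllables` is a hand-made syllabification (an assumption of this sketch);
    `long_vowel` marks a lengthened vowel in a monosyllabic ideophone.
    """
    if not syllables:
        raise ValueError("an ideophone needs at least one syllable")
    n = len(syllables)
    distinct = len(set(syllables))
    if n == 1:
        return "monosyllabic with long vowel" if long_vowel else "monosyllabic"
    if distinct == n:
        return "all syllables different"
    if distinct == 1:
        return "all syllables repeated"
    # From here on n >= 3: look for exactly one repeated pair at an edge,
    # with every other syllable unique.
    if syllables[0] == syllables[1] and len(set(syllables[1:])) == n - 1:
        return "two initial syllables repeated"
    if syllables[-1] == syllables[-2] and len(set(syllables[:-1])) == n - 1:
        return "two final syllables repeated"
    return "other pattern"


if __name__ == "__main__":
    # Ideophones cited in this article, with assumed syllable boundaries.
    samples = {
        "lululu": ["lu", "lu", "lu"],
        "kla": ["kla"],
        "fenene": ["fe", "ne", "ne"],
        "bligidi": ["bli", "gi", "di"],
    }
    for form, syls in samples.items():
        print(form, "->", classify_ideophone(syls))
```

Forms that match none of the six shapes fall into a residual "other pattern" class, which keeps the sketch from forcing a label onto data the typology does not cover.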

Ideophones in Araujo and Hagemeijer's Work
Araujo and Hagemeijer (2013) is a lexicographic work that consists of a Santome-Portuguese bilingual dictionary and a Portuguese-Santome list. In this dictionary, the entries, arranged in alphabetical order, contain the lemma (presented in the official orthography for this language) followed by the phonetic transcription, word class, and equivalents. Some lemmata, such as the functional lexicon, are accompanied by examples. Lemmata with ideophones and dialectal variation forms are connected to full expressions, including the ideophone itself and a basic form, respectively. The lemmata on fauna and flora contain their scientific names wherever possible. In the Portuguese-Santome reverse list, words and phrases in Portuguese only refer to their equivalent in Santome without a word class, phonetic form, etc.
The listing of any lexical item in a dictionary requires a reflection on its nature, which includes not only its meaning or equivalents in the case of bilingual dictionaries but also its form and function. Consequently, ideophones pose a challenge for lexicographers because they are part of lexical units with a compulsory association to another lexical item. Therefore, they are not free forms carrying meaning but a combined form. Thus, in the dictionary, ideophones must be listed twice: individually and alongside their lexical counterpart. In the first case, even if the user does not know the meaning of the ideophone or its lexical counterpart, the individual lemma of an ideophone allows the user to look for it in alphabetical order and relate it to its counterpart. If the ideophone is not listed individually, a user who does not know it or is unaware of its full expression could not perform the alphabetical search. Additionally, the inclusion of the multilexical lemma allows the user to know the ideophone's unique or limited association pattern. Araujo and Hagemeijer (2013) opted to list ideophones individually, as in (8), referring the user to the compound unit, as in (8).
In (8), the structure of the entry contains the lemma (in bold), a phonetic transcription, and the abbreviation of the word class (id.) for the ideophone. Subsequently, there is the abbreviation Cf. (compare or see), followed by the complete multilexical unit containing the related item and its ideophone. Therefore, the isolated ideophone presents a cross-reference to the full expression. In turn, the data in (8) contains the complete multilexical lemma and the phonetic transcription. Additionally, the word class is (expr.), that is, an expression, followed by its equivalents.
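The double listing described above can be pictured as two records joined by a Cf. link. The Python sketch below is a hypothetical model, not the dictionary's actual data format: the `Entry` class, its field names, and the `look_up` helper are assumptions made for illustration. The expression moli mogogogo and its equivalent are taken from example (3); the existence of an isolated mogogogo entry is assumed here following the design described in the text, and phonetic transcriptions are left empty.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entry:
    lemma: str
    word_class: str                      # e.g. "id." or "expr."
    transcription: Optional[str] = None  # left empty in this sketch
    equivalents: List[str] = field(default_factory=list)
    cf: Optional[str] = None             # lemma of the full expression, if any


def build_index(entries: List[Entry]) -> Dict[str, Entry]:
    """Index entries by lemma, as an alphabetical search would reach them."""
    return {entry.lemma: entry for entry in entries}


def look_up(index: Dict[str, Entry], lemma: str) -> Entry:
    """Return the entry that carries the equivalents, following one Cf. link."""
    entry = index[lemma]
    if entry.cf is not None:
        return index[entry.cf]
    return entry


# Isolated ideophone pointing to its expression, plus the expression itself.
ENTRIES = [
    Entry("mogogogo", "id.", cf="moli mogogogo"),
    Entry("moli mogogogo", "expr.", equivalents=["Molíssimo."]),
]
INDEX = build_index(ENTRIES)

if __name__ == "__main__":
    print(look_up(INDEX, "mogogogo"))
```

In this model, a user who knows only the ideophone reaches the equivalents through the cross-reference, while a user who knows the whole expression reaches them directly.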
Many African languages have ideophones, and they are represented in lexicography in different ways. For example, in Brindle's (2011, 2017) dictionary and grammatical outline of Chakali (cli), a Gur language spoken in Ghana, it is stated that the majority of ideophones in that language function like qualifiers, intensifiers, or adjunct adverbials. However, Brindle does not describe ideophones as members of multilexical units. In general, all ideophones are listed as basic lemmata with a full description and an example, as in (9). In Brindle's English-Chakali dictionary, shown in (10), a simpler co-reference list is provided.
(9) felfel [félfél] ideo. manner of movement, as a lightweight entity, applicable to leaves, animals and humans • [...]
Additionally, Friesen (2016) presented a description of Moloko (mlw), a Chadic language spoken in Cameroon, with an English-Moloko and a Moloko-English Lexicon. Ideophones in Moloko are described, following Doke (1935, p. 188), as sound symbolism, where they "evoke the 'idea' of a sensation or sensory perception (action, movement, color, sound, smell, or shape). As such they are often onomatopoeic" (Friesen 2016, p. 115). In the Moloko-English Lexicon, as in (11), ideophones are described with sensory images.
(11) abəlgamay id. n. the way a sick person walks. (Friesen 2016, p. 405).
bakaka id. spicy hot taste. (Friesen 2016, p. 406).
raw id. idea of cutting something through the middle. (Friesen 2016, p. 411).
In the English-Moloko Lexicon, ideophones are listed as 'ideas of' something (12).
(12) idea of the way a sick person walks abəlgamay.
Pichi (fpe), an English-based creole spoken in the Republic of Equatorial Guinea, is another African language with ideophones, as described by Yakpo (2009, 2019). The author claims that, in Pichi, ideophones "are words with expressive semantics and particular structural characteristics" (Yakpo 2019, p. 443). Furthermore, Yakpo adds: "It is therefore difficult to ascertain how widespread the use of these ideophones is, and whether some of them are sound symbolic ad hoc creations, whether they are carried over from other languages used by the speaker, or whether they form part of the lexicon of Pichi" (Yakpo 2019, p. 443). Even though ideophones are documented in the Pichi-English list in Yakpo's work, as in (13), many are not present in the English-Pichi word list, such as bwa and gbin. When ideophones are listed, they have a simple co-reference item. In this sense, the search for an ideophone in Yakpo's printed word lists may not be an easy task.
Considering that the grammatical category of ideophones does not exist in the Portuguese language, the solution presented by Araujo and Hagemeijer (2013), as in example (8), is convenient for allowing users to find both the ideophone alone and the complete unit in the minority language. Therefore, it is a useful solution, especially to readers of Portuguese or other languages with no ideophones. However, isolated ideophones are not listed in the Portuguese/Santome list because they do not belong to the Portuguese local vernacular lexicon. Nonetheless, lexical items and their ideophones are listed with their full Portuguese meanings, as in (15).
20 ba bligidi 1. 'to collapse'. 2. 'to plummet'.
21 liku sonosono 'very rich'.
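The policy for the Portuguese/Santome list can be sketched as a filter over the main dictionary: full expressions are reversed into Portuguese keys, while isolated ideophones are skipped. The Python code below is a hedged illustration under assumptions introduced here (the tuple layout, the `reverse_list` function, and the normalization of Portuguese keys are all hypothetical); the only data used is the expression from example (3).

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (Santome lemma, word class, Portuguese equivalents): an assumed layout.
Row = Tuple[str, str, List[str]]


def reverse_list(rows: List[Row]) -> Dict[str, List[str]]:
    """Map each Portuguese equivalent to the Santome lemmata that carry it."""
    reverse: Dict[str, List[str]] = defaultdict(list)
    for lemma, word_class, equivalents in rows:
        if word_class == "id.":
            # Isolated ideophones have no place in the Portuguese-side list.
            continue
        for portuguese in equivalents:
            key = portuguese.strip().rstrip(".").lower()
            reverse[key].append(lemma)
    return dict(reverse)


ROWS: List[Row] = [
    ("mogogogo", "id.", []),
    ("moli mogogogo", "expr.", ["Molíssimo."]),
]
REVERSE = reverse_list(ROWS)

if __name__ == "__main__":
    print(REVERSE)
```

The reverse list carries only the Santome lemma for each Portuguese key, in line with the description of that list as lacking word class and phonetic form.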

Realia
The formation of the Portuguese-based Proto-Creole of the Gulf of Guinea and its later speciation into four languages (Santome, Lung'Ie, Angolar, and Fa d'Ambô) in the sixteenth century are related to the colonial system that was implemented in the then uninhabited islands of São Tomé and Príncipe and Annobón. 24 The thousands of African slaves kidnapped from the mainland, mainly from the regions of the Niger Delta, Congo, and Angola, and their coexistence with Portuguese settlers of European origin promoted the emergence of the proto-creole. However, the dozens of mother tongues of the slaves (including their unique ways of naming the world), as well as the Portuguese language of the European settlers, their cultures, and the linguistic agency of members of those communities, associated with the very nature of the new island environments and their endemic fauna and flora, promoted the naming of a new world without a parallel in the Portuguese language. Thus, like any human language, Santome has lexical items that reflect its unique specificities, known as realia or 'cultureme' (Vlahov and Florin 1969; Xatara and Seco 2014) or defined as specialized lexical units (Bergenholtz and Tarp 1995).
Therefore, Araujo and Hagemeijer (2013) applied a solution for realia lemmata that correlates a lexical item (simple or compound) to an identical word in local Portuguese or to a full description when the realia item in Santome does not circulate in the vernacular Portuguese. Thus, their dictionary also documents lexical items in the minority language related to the vernacular Portuguese of São Tomé and Príncipe. In fact, most realia items from Santome already circulate in the local Portuguese varieties. In (16), for example, Araujo and Hagemeijer (2013, p. 41) present a lexical lemma naming a species of pepper endemic to the region of West Africa without an equivalent in Portuguese. Consequently, the lexicographers chose to repeat the name of the plant in italics, thus establishing a 'Santomean Portuguese word' (a language fact proper to the Portuguese influenced by the Santome language or the ethnic group Santome/Forro), followed by its scientific name (listed whenever possible in Araujo and Hagemeijer's work) written in bold and italics. Thus, in (16a), a name is listed for a flora item, an endemic tree, as an example of realia. If a local fauna or flora item has a Portuguese word, the authors simply documented it, as in (16b), 'pau-sabrina'. However, (16c) and (16d) show an interesting contrast: (16c) is a realia item in Santome with the same equivalent in vernacular Portuguese, whereas the equivalent of (16d) is a combination of a Santomean-originated realia item and a Portuguese expression (do-mato), creating a new item in Portuguese. Nonetheless, in (17), the name of a traditional therapist dedicated to the sensory examination of urine (through smell, texture, taste, color, and impurities) of patients with kidney diseases or diabetes is a realia item. However, unlike a fauna or flora item, this word required a proposed definition.
In these three cases, the Santome term was repeated in the dictionary because the lexical item is itself the equivalent word in Portuguese. Therefore, São Tomé and Príncipe's vernacular varieties of Portuguese have already borrowed many words from Santome and other national languages. Moreover, in daily work with the informants, it was often difficult to separate the items from one another given the increasing use of Portuguese. Users were fully aware of the influence of Portuguese and were therefore sometimes confused about the source of many endemic items. However, working with the community is crucial for the acceptance of the dictionary by the local residents. The materials in this Santome dictionary may thereby feed other lexicographical works focusing on African countries that use the Portuguese language (see Bacelar do Nascimento 2013) with unique items from the linguistic reality of São Tomé and Príncipe. The documentation of a 'Santomean Portuguese word' (a language fact proper to the Portuguese influenced by the Santome language or its ethnic group) has a parallel in the Luso-Brazilian lexicography tradition, just as 'Brazilian Portuguese words' have been populating the Portuguese language (Bluteau 1712-1728; Frankenberg-Garcia 2017).
24 Territory of the present Republic of Equatorial Guinea.
25 pyadô-zawa 'traditional therapist that examined urine'. See pyadô 'observer'.
Additionally, the Santome/Portuguese dictionary contains lexical items that refer to festivals, religious rituals, flora, fauna, technologies, etc., which require a description in the dictionary because they are unique to São Tomé and Príncipe's linguistic and cultural environment. The example in (18) refers to a traditional medicine given to women ready to give birth. In (19), sôwô is the name of a local dish prepared with breadfruit, cassava, plantain, fish, and aromatic herbs, served with manioc flour or baked bananas.
thought, but that it was indeed a new species (Ceriaco et al. 2017). Thus, Agostinho and Araujo's (in press) Lung'Ie/Portuguese bilingual dictionary already includes the new scientific name of that species endemic to the island of Príncipe, improving the previous work. 32
Therefore, all realia items in the Santome/Portuguese dictionary may be a source for lexicographers who prepare bilingual and monolingual dictionaries in Portuguese.

Final Remarks
In this text, based on the case study of one bilingual dictionary of the Portuguese-based creole language Santome, we presented how the authors dealt with the documentation of ideophones and realia items. The task of documenting Santome, a minority and endangered language spoken in São Tomé and Príncipe, is urgent. Thus, Santome can benefit from the production of bilingual dictionaries, just as any threatened minority language can (Auroux 1992; Ogilvie 2010). We demonstrated that ideophones can be documented in isolation and with reference to the other members of their multilexical unit. Thus, the user can search for the ideophone per se and for its full expression in the dictionary. Realia lemmata, in turn, are necessary to describe the endemic characteristics of local lexical items. They simultaneously allow the documentation of lexical units in Portuguese, which may in the future be classified as Santomean Portuguese words.
The publication of bilingual dictionaries of threatened languages is a step towards the documentation and grammatization of minority languages. Moreover, it enables the lexicographic and scientific discussion of phenomena that do not exist or were not explored in Portuguese, especially in the lesser-known vernacular varieties.