Research

This page gives short descriptions of papers and presentations on Toki Pona. At this time it only includes those by the site author, but research from others will be added in the future.

As presented by Magda Kitano at Vocab@Maryland 2025, University of Maryland School of Languages, Literatures, and Cultures, June 17, 2025

Construction of the Corpus

Corpus building software AntConc (Anthony, 2024) was utilized to create a corpus of written Toki Pona. In total, there were 118 files and 286,09 tokens.

Included are all materials available up to May 2025 in the following categories:

  • Magazine issues (Lipu Tenpo, Lipu Monsuta, Lipu Kule)
  • YouTube songs that include the lyrics in the description. Both original songs and translated covers are included.
  • Translations (The Wonderful Wizard of Oz (Sonja Lang), Beatrix Potter collection (toki soweli), Fingtam Languages books)
  • Original writings (lipu pi wawa lili, Micro Stories)

Not included:

  • Toki Pona learning materials
  • materials in other languages talking about Toki Pona
  • social network posts
  • spoken dialog

Finding 1: A quick look at a Zipf’s plot

According to Zipf’s law, the frequency of any word is inversely proportional to its rank in a frequency table, and this has been found to be consistent for most languages. As Toki Pona has only 140 words, are its word frequencies consistent with those in natural languages, or is a higher proportion of the lexis used more frequently, giving a flatter curve?
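The relationship can be checked on any tokenized text with a short script. The following is a minimal sketch, not the procedure used in this study (which relied on AntConc): it ranks words by frequency and estimates the slope of the log-log rank-frequency line by least squares, where a slope near -1 matches the classic Zipfian prediction. The function names and the toy sentences are assumptions made for illustration.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, count) tuples, most frequent word first."""
    counts = Counter(tokens)
    # Sort by descending count, then alphabetically so ties are stable.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


def zipf_slope(table):
    """Least-squares slope of log(count) against log(rank)."""
    xs = [math.log(rank) for rank, _, _ in table]
    ys = [math.log(count) for _, _, count in table]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy input only; a real check would read the corpus files instead.
sample = "mi moku e kili . soweli li moku e kala . mi lukin e soweli".split()
words = [t for t in sample if t != "."]
table = rank_frequency(words)
slope = zipf_slope(table)
```

With so few tokens the slope is not meaningful; the sketch only shows the shape of the calculation. A flatter curve, as hypothesized for a small lexis, would show up as a slope closer to zero.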

Skotarek (2020) also examined Toki Pona in regard to Zipf’s law, comparing the text of an original French book with the Toki Pona translation of that book.

In that study, the distributions in both texts correlated with the Zipfian prediction, with Toki Pona being even closer to the prediction than French was.

The current study produced a curve similar to that of Toki Pona in the Skotarek study, dropping off after 100 words as it reaches the end of the language’s limited vocabulary.

Toki Pona therefore does not seem to act differently from natural languages in terms of word frequency.

Finding 2: Word combinations whose frequencies warrant status as part of the lexis

Combining words is central to Toki Pona to communicate concepts not included in its 140 words. But word combinations are meant to be fluid – that is, one does not memorize lists of word combinations. Rather, speakers combine words as they see fit to the particular circumstance. However, when one starts to use the language, it is clear that there are certain word combinations that are generally used and understood. The corpus in this study was utilized to check whether there are word combinations that could be considered fixed, and which should be learned by beginners.

Word frequency was compared between single words and combinations, or n-grams. Colligations and collocations, such as nouns and verbs that are often used together, were omitted from the list, leaving a total of 61 word combinations that are as frequent as single words. (Word combinations that have frequencies equal to or lower than the lesser-used of the nimi sin are not included.) See this page for the complete list. As seen in the graph below, one combination, toki pona, the name of the language itself, was particularly high in frequency, around the same as that of kalama. The others in the list fell in the range between mu and lanpan.
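The comparison between single words and combinations can be illustrated with a small n-gram count. This is a hedged sketch only: the study itself used AntConc, and the function names, the choice of reference word, and the toy sentences below are assumptions made for illustration. It counts bigrams within each sentence and keeps those at least as frequent as a chosen single word.

```python
from collections import Counter


def count_ngrams(sentences, n=2):
    """Count n-grams without letting them span sentence boundaries."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def combinations_at_least(sentences, reference_word, n=2):
    """Return n-grams whose count is >= the count of reference_word."""
    unigrams = count_ngrams(sentences, n=1)
    threshold = unigrams[reference_word]
    ngrams = count_ngrams(sentences, n=n)
    return {gram: c for gram, c in ngrams.items() if c >= threshold}


# Toy sentences, not drawn from the corpus.
sentences = [
    "mi kama sona e toki pona",
    "toki pona li pona",
    "jan li kama sona",
]
frequent = combinations_at_least(sentences, reference_word="li")
```

A count like this only produces candidates. Separating lexicalized combinations from colligations and collocations, as was done for the list of 61, still requires manual judgment.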

So, can word combinations be called fluid in Toki Pona? While this study identified word combinations that could be said to have been lexicalized, many more concepts are communicated by free combination. An example is the variety of ways that coffee can be translated. Another is how Sonja Lang used different expressions for the word lion in her translation of The Wonderful Wizard of Oz (2024). As the Cowardly Lion grows, he is referred to as soweli suli, then soweli wawa, then finally soweli lawa. Within this context of fluid word combination, the 61 word combinations identified in this study should be considered important for beginners to learn.

References

Anthony, L. (2024). AntConc (Version 4.3.1) [Computer Software]. Tokyo, Japan: Waseda University. https://www.laurenceanthony.net/software/AntConc
Baum, L. F. (2024). The Wonderful Wizard of Oz (S. Lang, Trans.). Sonja Lang. (Original work published 1900).
Skotarek, D. J. (2020). Zipf’s law in Toki Pona. ExLing 2020: Proceedings of 11th International Conference of Experimental Linguistics. https://doi.org/10.36505/ExLing-2020/11/0047/000462

As presented by Magda Kitano at the 11th Language Creation Conference, University of Maryland, April 12, 2025

While teaching Toki Pona to university students in Japan in an informal club setting, it was observed that although Toki Pona has many similarities with Japanese, these similarities were not benefiting the students in acquiring the language.

Finding 1: Mistakes students were making seemed to be influenced by their second language, English.

In cross-linguistic influence, the morphological congruency hypothesis states that if a grammatical structure is present in your native language, (1) it is easier to notice it in a second language, and (2) it is easier to become fluent in using it. For example, if your native language makes plural forms of nouns by adding one sound to the end of the word, it’s easier for you than others to learn to use plurals in English when learning it as a foreign language. However, when teaching Toki Pona to Japanese university students, it was found that similarities between Toki Pona and Japanese were not aiding them in acquiring Toki Pona.

Toki Pona and Japanese both have particles that come after the subject of a sentence, but the Japanese students often neglected to use the particle li when writing sentences.

Toki Pona and Japanese both have particles that indicate a direct object. The Toki Pona e comes before the noun, and the Japanese wo comes after it. However, the Japanese students would often put the direct object right after the verb without any particle at all, as in English.

The animal is eating a fish.
What is the animal eating?

To make questions in both Toki Pona and Japanese, the word for what is substituted into the sentence in the place of the information desired. There is no inversion or change in word order. However, when forming sentences, the students would often try to start with “What…” as English sentences do.

Toki Pona and Japanese both have a context marker which designates a noun or phrase as being the context of the sentence, although it is used similarly only in one kind of usage. For example, the sentence to the left would be “Regarding today, the weather is good.” Students still had a hard time grasping this usage.

In all of these cases, the students seemed to be influenced not by their first language, but by their second language, English. A clue to what may be happening here comes from Falk, Lindqvist, & Bardel (2015), who found that the learner’s second language has more influence on third language learning when the learners have low metalinguistic knowledge of their first language. This makes a lot of sense, probably to any second language learner. One often does not have a firm grasp of the grammar of their own language, but a second language is usually taught through grammar lessons. We therefore end up with a better understanding of the grammar of our second language, even if we do not speak it as fluently as our first. In the case of the participants of the current study, this may be especially so. Many were training to become English teachers in Japanese schools. They therefore were working closely with English grammar and thinking about how to teach it.

The students did, however, benefit from other similarities between Toki Pona and Japanese, such as the absence of plural markers and determiners. Similarity of phonemes helped them with pronunciation. The Toki Pona hieroglyphic writing system may have seemed less foreign to students who have logographic characters in their language, but they did not seem very interested in learning or using that form of writing for Toki Pona.

Finding 2: Although Toki Pona and Japanese are both high-context languages, this did not aid Japanese students in acquiring Toki Pona.

A high-context culture is one where there is a high level of shared context, and implicit knowledge is relied upon in communication. Meyer (2014) writes that the Japanese language is a high-context language because of its many homonyms and words with multiple meanings. However, Japanese does not come near the level of Toki Pona in this regard. I would say that the high-context aspect of these two languages differs greatly.

In Japanese, the subject and sometimes even other nouns central to the message are omitted. This makes understanding the statement reliant upon the shared experience of the speaker and listener. Such omission is particularly rare because the Japanese verb form does not indicate the speaker. Languages where the subject is often omitted usually indicate the speaker through verb conjugation.

Familiarity with this kind of high context communication does not prepare learners for the high context aspect of Toki Pona, where listeners have to deal with several ways in which words are not precise regarding meaning. First, each word has many meanings, and one must decipher which is being used. Pona can mean good, friendly, simple, correct, peace, fix, improve, heal, or thanks. Second, general concepts are often used instead of specific labels. One usually refers to an animal as just soweli, animal, instead of saying dog or cat. Third, word combinations are fluid, so one needs to be in tune with what the speaker wants to say in order to know what they are referring to when using a new word combination.

According to Meyer (2014), most intercultural miscommunications arise not between people from two different types of cultures, a mix of high- and low-context cultures, but rather between people from two different high-context cultures. What is implied in a statement and how that is communicated differs between high-context cultures and languages. Not only did the students in this study gain no advantage from their Japanese background in dealing with the high-context aspect of Toki Pona, but that background may even have made it harder for them.

References

Falk, Y., Lindqvist, C., & Bardel, C. (2015). The role of L1 explicit metalinguistic knowledge in L3 oral production at the initial state. Bilingualism: Language and Cognition, 18(2), 227-235.
Meyer, E. (2014). The Culture Map. PublicAffairs.

As presented by Magda Kitano at EdYouFest Tokyo, on August 19, 2025, at Meiji Gakuin University.

In this short workshop, participants learned and started using the basic words and grammar of Toki Pona. Then, possible usages of Toki Pona in language research and language learning were discussed. Points that came up included:

  • Research: words introduced in the workshop were given in two batches. The first followed recommendations for vocabulary learning, and the other did not. For example, the second batch included words that had similar spellings or sounds, opposites, and lexical sets. Participants could then reflect on their own experiences under these two treatments. The benefits of using Toki Pona for such experiments were discussed. [lack of time due to logistics prevented this part of the workshop from being fully appreciated]
  • Language learning: while learning a third language can aid in an understanding of languages, the time and effort required to do so can be daunting. However, the participants were soon making their own word combinations and sentences, and it is said that only 30 hours is required to master Toki Pona. The language can therefore be a way to give language learners a new perspective on languages without committing to learn a natural language from scratch.
  • Teacher training: several participants were teacher trainers, and Toki Pona could be used for their trainees’ teaching practice. Preparing mock lessons for classmates in the language that they are all preparing to teach does not give trainees practice in presenting a language to those new to it. Toki Pona could be a way for them to experience teaching beginners.

As presented by Magda Kitano at JALT2023: 49th Annual International Conference on Language Teaching and Learning on November 27, 2023, at the Tsukuba International Congress Center.

While teaching Toki Pona to a group of university students in Japan, the author recorded observations in a reflective teaching journal and administered a student questionnaire at the end of the program. The following was observed:

  • Students showed a higher awareness of part of speech. Because all Toki Pona words can be used as any part of speech without changes in form, students ended up discussing with each other how words were being used.
  • Students had a hard time with the high context nature of communication in Toki Pona.
  • When asked at the end of the program, students agreed that they felt better able to paraphrase words that they do not know in a foreign language, as found by Coluzzi (2022).
  • Japanese English education tries to avoid straight drilling of sentences in favor of putting new words and expressions into conversational situations. However, due to the nature of Toki Pona’s words and grammar, our lessons were basically sentence practice and translation. The students did not find this to be undesirable. Rather, they responded that they appreciated practice in the new grammar structure, and feeling confident in one grammar point before going on to the next.
  • Participants training to be English teachers appreciated the deeper understanding of languages that this experience gave them.

Access to the full paper on this presentation:
Kitano, M. L. (2024). Teaching Toki Pona in Japan. In B. Lacy, R. P. Lege, & P. Ferguson (Eds.), Growth Mindset in Language Education. JALT. https://doi.org/10.37546/JALTPCP2023-27

References

Coluzzi, P. (2022). How learning Toki Pona may help improving communication strategies in a foreign or second language. Language Problems and Language Planning, 46(1), 78-98. https://doi.org/10.1075/lplp.00086.col