
Generating Vocabulary

Here's another conundrum I spent many hours figuring out. How can I generate a vocabulary, or lexicon, without it taking YEARS?

There are a few different schools of thought on this.
Some people feel that each word needs to sound like what it means, within the confines of their phonology. In other words, you think about and create each word individually. This is very abstract, but you just might come out of it actually able to remember a lot of your words, and maybe even able to speak your conlang. (Remember, VERY few conlangers are fluent in their own language, and the ones who claim to be are suspect, because who can really judge them?) Plus, you're guaranteed to get a conlang that sounds the way you want it to.

The opposite extreme is to randomly generate your vocabulary after keying your phonology into a word generator program. The advantage is that you get a big vocabulary quickly. The downside is that you won't know any of the words off the top of your head until after some studying, and some of the words may not be to your taste.
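To make the random approach concrete, here is a minimal sketch of what a word generator does under the hood. It is not how LangMaker or any particular tool works; the consonants, vowels, and syllable shapes below are made-up placeholders, so swap in your own phonology.

```python
import random

# Hypothetical phonology: replace with your own sound inventory.
CONSONANTS = ["p", "t", "k", "m", "n", "s", "l", "r"]
VOWELS = ["a", "e", "i", "o", "u"]
# Allowed syllable shapes: C = consonant slot, V = vowel slot.
SYLLABLE_SHAPES = ["CV", "CVC", "V"]


def make_syllable(rng):
    """Build one syllable by filling a randomly chosen shape."""
    shape = rng.choice(SYLLABLE_SHAPES)
    return "".join(
        rng.choice(CONSONANTS if slot == "C" else VOWELS) for slot in shape
    )


def make_word(rng, max_syllables=3):
    """Join between one and max_syllables random syllables into a word."""
    count = rng.randint(1, max_syllables)
    return "".join(make_syllable(rng) for _ in range(count))


def generate_words(n, seed=None):
    """Return n distinct random words, sorted alphabetically."""
    rng = random.Random(seed)
    words = set()
    while len(words) < n:
        words.add(make_word(rng))
    return sorted(words)


if __name__ == "__main__":
    for word in generate_words(10):
        print(word)
```

Passing a seed to generate_words gives you the same batch every time, which is handy if you want to regenerate a list you liked. Words that aren't to your taste can simply be thrown out and replaced from another batch.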

I started out wanting to randomly generate my lexicon, but found LangMaker and word generators like the one linked above to be inadequate, at least at first. I had quite a time figuring out a good word list to use. I started with Ogden's Basic English, which has about 850 words. However, it is a list designed for teaching basic English, not for creating a conlang. Some words in the Basic English list might be "covered" differently in the word list of another language.

But I found another list that I thought was better, mostly because it was much shorter: the Swadesh List. Only about 200 words there!

At the second Language Creation Conference, John Clifford spoke a little about semantic primes, which aren't "words" so much as they are blocks of meaning. It's a different way of thinking, but a little reading here can also help you develop a word list of your own. I found a word list, called the Universal Language Dictionary, that groups words together according to concepts, which may help you if you want to create a derivational morphology or something similar. The ULD at least partially embraces the semantic prime idea, and can be another good resource for developing, building, or copying a word list for lexicon generation.

So, with a short list, you CAN use the first, word-by-word method, or you can randomly generate your words, then swap in better-sounding ones as you find them and add to the lexicon as you translate phrases. Long lists may be more cumbersome, but they can be worth the time and headache if you plan on doing a lot of translating, since you won't have to stop to create a lot of new words each time.
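The mixed workflow described above can be sketched roughly like this: keep any words you've already made by hand, and fill the rest of the word list with random ones. Everything here is an assumption for illustration. The glosses are just a tiny sample (a real list like Swadesh has about 200), and the syllable pool and the handmade word "mila" are invented.

```python
import random

# Tiny sample word list; substitute your full list of glosses.
GLOSSES = ["I", "you", "water", "fire", "sun", "eat", "big", "small"]

# Hypothetical pool of syllables allowed by your phonology.
SYLLABLES = ["ka", "mi", "to", "ne", "su", "la", "ri", "po"]


def random_word(rng):
    """Make a word from one to three random syllables."""
    return "".join(rng.choice(SYLLABLES) for _ in range(rng.randint(1, 3)))


def build_lexicon(glosses, handmade=None, seed=None):
    """Map each gloss to a unique word, keeping handmade words as they are."""
    rng = random.Random(seed)
    lexicon = dict(handmade or {})
    used = set(lexicon.values())
    for gloss in glosses:
        if gloss in lexicon:
            continue  # you already crafted this one yourself
        word = random_word(rng)
        while word in used:  # avoid giving two glosses the same word
            word = random_word(rng)
        used.add(word)
        lexicon[gloss] = word
    return lexicon


if __name__ == "__main__":
    lexicon = build_lexicon(GLOSSES, handmade={"water": "mila"})
    for gloss, word in lexicon.items():
        print(f"{gloss}: {word}")
```

When you later decide on a better-sounding word, just overwrite that entry in the dictionary, and when a translation calls for a new concept, add its gloss to the list and run it again with your current lexicon passed in as handmade.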

Comments

Unknown said…
I stumbled across this entry randomly and it turned out to be exactly what I didn't know I was looking for! Very useful, thank you.
Matthew Shields said…
Glad to hear it. Comments like that keep me working on this blog!
Kristian Hart said…
Nice, exactly what I needed for an initial word set.
Kartov said…
Спасиво вла постa цу!

My conlang.

"Thank you for posting this!"
