
Morphology 101

Today I want to start talking about morphology, which means how words are structured. If you've been reading MakeALang for a while, you may remember that I posted a little about phonotactics last year. Phonotactics = phon (sound) + tact (touch). Phonotactics is about which sounds can touch other sounds in a language. Example: in English, s and r cannot be next to each other at the start of a word. Sri Lanka is obviously a foreign name to us because we just know that s and r aren't supposed to be together.

Morphology is different. Morphology is not about the sounds that make up words, but about the structure of words. It's about what a word is in your conlang, and how it works to convey meaning. This is actually a huge subject (for me, at least) and I've been struggling for MONTHS to break it down to a point where it's digestible. Well that, and my wife and I had a baby boy at the end of September. :D

I won't be covering all morphology concepts in this post, but there will probably be a Morphology 202 post later. The first concepts to digest are Lexemes vs. Word Forms. A lexeme is a unit of meaning, inasmuch as rock and rocks have almost the same meaning. A word form can be considered another form, or sub-meaning, of a lexeme, so rocks is a pluralized word form of the lexeme rock.

Morpheme vs. lexeme vs. word-based morphologies: There are three main approaches to studying morphology, and you can keep these in mind as you develop your word structure. Most of what follows for the next few paragraphs is pretty much copied and pasted from the Wikipedia article on Morphology, because I think it's already pretty easy to understand.

Morpheme-based word forms are analyzed as arrangements of morphemes. A morpheme is defined as the minimal meaningful unit of a language. In a word like independently, we say that the morphemes are in-, depend, -ent, and -ly; depend is the root and the other morphemes are, in this case, derivational affixes. In a word like dogs, we say that dog is the root, and that -s is an inflectional morpheme. This way of analyzing word forms as if they were made of morphemes put after each other like beads on a string is called Item-and-Arrangement.
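
If it helps to see the "beads on a string" idea in action, here is a minimal Python sketch of it. The MORPHEMES table and the assemble and gloss helpers are names I made up for illustration; they are not from the Wikipedia article or any library, and the table only covers the two example words above.

```python
# Hypothetical toy lexicon: every morpheme is listed with its role.
# Hyphens mark the side on which an affix attaches.
MORPHEMES = {
    "in-": "derivational prefix",
    "depend": "root",
    "-ent": "derivational suffix",
    "-ly": "derivational suffix",
    "dog": "root",
    "-s": "inflectional suffix",
}


def assemble(morphemes):
    """String the morphemes together in order, dropping the affix hyphens."""
    return "".join(m.strip("-") for m in morphemes)


def gloss(morphemes):
    """Pair each morpheme with its role from the toy lexicon."""
    return [(m, MORPHEMES[m]) for m in morphemes]


print(assemble(["in-", "depend", "-ent", "-ly"]))  # independently
print(assemble(["dog", "-s"]))                     # dogs
print(gloss(["dog", "-s"]))
```

Notice that the word form is nothing more than its parts in a fixed order, which is exactly the assumption the other two approaches push back on.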

The morpheme-based approach is the first one that beginners to morphology usually think of, and which laymen tend to find the most obvious. This is so to such an extent that very often beginners think that morphemes are an inevitable, fundamental notion of morphology, and many five minute explanations of morphology are, in fact, five minute explanations of morpheme-based morphology. This is, however, not so. The fundamental idea of morphology is that the words of a language are related to each other by different kinds of rules. Analyzing words as sequences of morphemes is a way of describing these relations, but is not the only way. In actual academic linguistics, morpheme-based morphology certainly has many adherents, but is by no means the dominant approach.

Lexeme-based morphology is (usually) an Item-and-Process approach. Instead of analyzing a word form as a set of morphemes arranged in sequence, a word form is said to be the result of applying rules that alter a word form or stem in order to produce a new one. An inflectional rule takes a stem, changes it as is required by the rule, and outputs a word form; a derivational rule takes a stem, changes it as per its own requirements, and outputs a derived stem; a compounding rule takes word forms, and similarly outputs a compound stem.
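
A rough Python sketch of the Item-and-Process idea: each rule is a function that takes a stem (or word forms) and returns something new. The function names and the very simplified English spelling rules are my own assumptions for illustration, not a full description of English.

```python
def pluralize(stem):
    """Inflectional rule (simplified English plural): stem -> word form."""
    if stem.endswith(("s", "x", "z", "ch", "sh")):
        return stem + "es"
    return stem + "s"


def agentive(stem):
    """Derivational rule: verb stem -> derived noun stem ('one who does X')."""
    if stem.endswith("e"):
        return stem + "r"
    return stem + "er"


def compound(*words):
    """Compounding rule: word forms -> compound stem."""
    return "".join(words)


print(pluralize("rock"))                    # rocks
print(pluralize(agentive("hunt")))          # hunters
print(compound("fire", agentive("fight")))  # firefighter
```

The output of one rule can feed the next, so hunt becomes hunter (derivation) and then hunters (inflection) without ever listing -er or -s as separate "pieces" of the word.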

Word-based morphology is (usually) a Word-and-Paradigm approach. This theory takes paradigms as a central notion. Instead of stating rules to combine morphemes into word forms, or to generate word forms from stems (stems meaning the root word), word-based morphology states generalizations that hold between the forms of inflectional paradigms. The major point behind this approach is that many such generalizations are hard to state with either of the other approaches. The examples are usually drawn from fusional languages, where a given "piece" of a word, which a morpheme-based theory would call an inflectional morpheme, corresponds to a combination of grammatical categories, for example, "third person plural." Morpheme-based theories usually have no problems with this situation, since one just says that a given morpheme has two categories. Item-and-Process theories, on the other hand, often break down in cases like these, because they all too often assume that there will be two separate rules here, one for third person and the other for plural, but the distinction between them turns out to be artificial. Word-and-Paradigm approaches treat these as whole words that are related to each other by analogical rules. Words can be categorized based on the pattern they fit into. This applies both to existing words and to new ones. Application of a pattern different from the one that has been used historically can give rise to a new word, such as older replacing elder (where older follows the normal pattern of adjectival comparatives) and cows replacing kine (where cows fits the regular pattern of plural formation). While a Word-and-Paradigm approach can explain this easily, other approaches have difficulty with phenomena such as these.
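
The analogy idea can be sketched as a proportion: dog is to dogs as cow is to what? Here is a small Python sketch under a big simplifying assumption, namely that the pattern only adds a suffix. The analogize function is a made-up name for illustration, and real Word-and-Paradigm analyses compare whole paradigms, not single pairs.

```python
def analogize(known_base, known_form, new_base):
    """Solve known_base : known_form :: new_base : ? for suffixing patterns."""
    if not known_form.startswith(known_base):
        # Irregular pairs such as cow : kine do not fit a suffixing pattern.
        raise ValueError("this sketch only handles patterns that add a suffix")
    suffix = known_form[len(known_base):]
    return new_base + suffix


print(analogize("dog", "dogs", "cow"))     # cows
print(analogize("tall", "taller", "old"))  # older
```

Both cows and older fall out of matching a new word to a whole-word pattern that other words already follow, which is how the regular forms crowded out kine and elder.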

Now that I've thrown all that at you, I want to condense it a bit by emphasizing this: word-building is usually analyzed with a three-way distinction:

Derivation: adding affixes to roots or compound stems to get new stems with different meanings; depend vs. dependent vs. independent.

Composition or compounding: joining two or more prior members (i.e., already extant words) to create a new word. Sometimes the members are written as one word, like bookkeeping, and sometimes they stay separate. The meaning of hunter can change a lot by adding a word - deer hunter vs. bargain hunter.

Inflection: Adding affixes to roots or stems that alter grammaticalized categories, as opposed to altering meaning (as with derivation); categories like person, number, gender, tense, mode/mood, aspect, etc. We do this in English - we pluralize something by adding an s at the end. This is an "inflectional rule."

One last concept - Isolating Morphology. This is basically a lack of morphology. Every word has its own meaning and there is no morphology. You could not have words like "amusement" or "firefighter" in an isolating language. You could have a word like "trolsh" that MEANT amuse or fire, and another word "im" that meant -ment or fighter, but words cannot be derived in an isolating language, so you just wouldn't combine them into "trolshim". Also, because words are not marked by morphology showing their role in the sentence, word order tends to carry a lot of importance in isolating languages. Isolating languages are common in Southeast Asia, if you want to look at examples of how this might work.

I want to give props and thanks to Jeff Burke, who helped me with this post. Please check out his blog - he's writing a novel just like me! Also special thanks to Baalak, who recommended that I add something about isolating languages.

Ama pos tulonu sa taka oma so! This post is already too long so that's it!

Comments

Anonymous said…
Congratulations! I hope your son is healthy and happy. I'm also glad to see you coming back to the table and making a post which seems to be trying to make up for lost time. Way to go.

You've created a detailed, well-informed, and, had I not already done much of the research you yourself did for this entry, informative post on the subject. I was glad to read it. However, I think you've overlooked something important, and I'd like to bring it to your attention.

The morphology you described is descriptive of a synthetic language, but I feel it is important to point out that it fails to describe isolating tongues. As a matter of fact, I've heard isolating languages described as lacking any morphology at all, an interesting concept for those of us raised with synthetic and polysynthetic languages to wrap our heads around.

I feel it is important to state that everything which is encoded in a language with morphology is also encodable solely with syntax; though that seems to be something for another post to describe.

Welcome back to your blag. I hope to see lots more from you. Hope you had a merry Christmas, and have a happy new year.
Matthew Shields said…
Yeah, I had thought about putting in something about isolating morphology, but trimmed it out, and some other things, for simplicity's sake. I think I'll edit the post to include it now though. Keep on conlangin' Baalak!
