The Malayo-Polynesian languages are a subgroup of the Austronesian languages, with approximately 385.5 million speakers. They are spoken by the Austronesian people of the island nations of Southeast Asia and the Pacific Ocean, with a smaller number in continental Asia. Malagasy is a geographic outlier, spoken on the island of Madagascar, off the eastern coast of Africa in the Indian Ocean. The language family shows a strong influence of Sanskrit and, later, Arabic, as the region has been a stronghold of Buddhism, Hinduism and, since the 10th century, Islam.
Two morphological characteristics of the Malayo-Polynesian languages are a system of affixation and reduplication (repetition of all or part of a word, such as wiki-wiki) to form new words. Like other Austronesian languages, they have simple phonologies, so a text contains few distinct sounds, each occurring frequently. The majority also lack consonant clusters (such as [str] or [mpt] in English). Most also have only a small set of vowels, five being a common number.
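The mechanics of reduplication described above can be sketched in a few lines of Python. This is only an illustration of string copying, not a model of any particular language: the function names, the hyphen separator, and the two-character prefix are assumptions made for the example, and the partial form it produces is not claimed to be an attested word.

```python
def reduplicate_full(stem: str, sep: str = "-") -> str:
    """Full reduplication: repeat the whole stem (hypothetical helper)."""
    return f"{stem}{sep}{stem}"


def reduplicate_partial(stem: str, length: int = 2) -> str:
    """Partial reduplication: copy only the first `length` characters.

    Assumption: the copied part is a fixed-length prefix, which is a
    simplification of how real languages choose the copied syllable.
    """
    return stem[:length] + stem


full_form = reduplicate_full("wiki")        # whole stem repeated
partial_form = reduplicate_partial("wiki")  # illustrative only
print(full_form, partial_form)
```

The full form reproduces the wiki-wiki example from the text; the partial form shows the "part of a word" case in the same definition.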
The Sunda–Sulawesi languages are spoken by about 230 million people and include Malay (Indonesian and Malaysian), Sundanese, Javanese, Balinese, Acehnese, Chamorro (of the Mariana Islands), and Palauan.
The Philippine languages are spoken by 90 million people and include Tagalog (Filipino), Cebuano, Ilokano, Hiligaynon, Bikolano, Waray-Waray, and Kapampangan, each with at least three million speakers.
The Central–Eastern branch includes the Oceanic languages, with 2 million speakers, mainly in the Western Oceanic, Southern Oceanic and Central Pacific (Polynesian and Fijian) groups. Examples include Kuanua, Gilbertese, Hawaiian, Māori, Samoan, Tahitian, and Tongan.
The Malayo-Polynesian languages share several phonological and lexical innovations with the eastern Formosan languages, including the leveling of proto-Austronesian *t, *C to /t/ and *n, *N to /n/, a shift of *S to /h/, and vocabulary such as *lima "five", none of which are attested in other Formosan languages. However, Malayo-Polynesian does not align with any one Formosan branch. A 2008 analysis of the Austronesian Basic Vocabulary Database suggests the closest connection is with Paiwan, though it assigns that connection only a 75% confidence level.
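The three sound changes listed above amount to a small substitution table, which can be sketched as follows. The mapping is taken from the text (*C merges with *t, *N merges with *n, *S shifts to /h/); the function name and the input string are hypothetical, and the input is a made-up sequence in the usual capital-letter notation, not a reconstructed proto-form.

```python
# Mergers and shift named in the text; all other segments are left unchanged.
SOUND_CHANGES = {"C": "t", "N": "n", "S": "h"}


def apply_sound_changes(proto_form: str) -> str:
    """Apply the substitutions to one form (hypothetical helper).

    A leading asterisk, the conventional marker of a reconstructed form,
    is dropped before the substitution.
    """
    segments = proto_form.lstrip("*")
    return "".join(SOUND_CHANGES.get(seg, seg) for seg in segments)


# "*SaCuN" is an invented string chosen to exercise all three rules.
result = apply_sound_changes("*SaCuN")
print(result)
```

Because plain *t and *n map to themselves, forms that already contain them pass through unchanged, which is what "leveling" means here: two proto-sounds end up as one.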
Malayo-Polynesian consists of a large number of small local language clusters, with the one exception being Oceanic, the only large group which has been reconstructed and is indisputably valid. All other large groups within Malayo-Polynesian are disputed. The family has traditionally been divided into Western ("Hesperonesian"), Central, and Eastern branches. However, there is little support for these groups. Central MP languages are distinctive because they are typologically Melanesian due to substratum effects of the Papuan languages of eastern Indonesia, and the same is true of the Eastern MP languages. The Western branch simply comprises the branches which have not undergone such extensive contact-induced change.
Wouk and Ross (2002) proposed a Nuclear Malayo-Polynesian branch, based on a consistent simplification of the Austronesian alignment in the syntax of the proto-Malayo-Polynesian language, which is found throughout Indonesia apart from much of Borneo and the north of Sulawesi. Because Nuclear MP included some Western MP languages along with Central–Eastern MP, Wouk and Ross split Western MP into an "Inner" group on Sulawesi and the Sunda Islands, which together with Central–Eastern formed Nuclear Malayo-Polynesian, and an "Outer" group on Borneo and the Philippines. Both are remnant groups with negative definitions: Outer WMP (Borneo–Philippines) are those Malayo-Polynesian languages which are not Nuclear, while Inner WMP (Sunda–Sulawesi) are those Nuclear languages which are not Central–Eastern, which is itself a dubious group. Although Nuclear MP was defined using syntactic data, it finds moderate support from lexical data.
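The negative definitions in the preceding paragraph are set differences, which the following sketch makes explicit. The language samples are a small illustrative selection drawn from the lists earlier in this article, not a full membership list, and the variable names are assumptions made for the example.

```python
# Illustrative samples only; each set stands in for a much larger group.
philippine = {"Tagalog", "Cebuano"}        # part of Borneo-Philippines
sunda_sulawesi = {"Malay", "Javanese"}
central_eastern = {"Hawaiian", "Samoan"}   # Oceanic examples

malayo_polynesian = philippine | sunda_sulawesi | central_eastern

# Nuclear MP is the positively defined group in the Wouk and Ross proposal.
nuclear = sunda_sulawesi | central_eastern

# Outer WMP: Malayo-Polynesian languages that are not Nuclear.
outer_wmp = malayo_polynesian - nuclear

# Inner WMP: Nuclear languages that are not Central-Eastern.
inner_wmp = nuclear - central_eastern

print(sorted(outer_wmp), sorted(inner_wmp))
```

Neither remnant group is identified by a shared innovation of its own; each is simply what is left after subtracting another group, which is why the text calls them remnant groups.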
Austronesian Basic Vocabulary Database (2008) 
The 2008 analysis found three branches of Malayo-Polynesian with full support of the lexical data. These were the Philippine languages, including some languages of northern Sulawesi; Sama–Bajaw, of the Sulu Archipelago between the Philippines and Borneo; and the Indo-Melanesian languages, being all the rest. It found moderate (75%) support for Sama–Bajaw forming a unit with the Philippine languages. Within Indo-Melanesian, it found moderate (75%) support for Nuclear Malayo-Polynesian, and lesser (65%) support for the Bornean languages as a valid group.
Thus the internal structure of Malayo-Polynesian suggested by the 2008 study is:
- Malayo-Polynesian (100%)
- Fay Wouk and Malcolm Ross (eds.), The history and typology of western Austronesian voice systems. Australian National University, 2002.
- Austronesian Basic Vocabulary Database, 2008.