Mariah Abdul Rahman, Nor Samsiah Sani, Rusnita Hamdan, Zulaiha Ali Othman, Azuraliza Abu Bakar.
Abstract
The Multidimensional Poverty Index (MPI) is an income-based poverty index which measures multiple deprivations alongside other relevant factors to determine and classify poverty. The implementation of a reliable MPI is one of the significant efforts by the Malaysian government to improve measures for alleviating poverty, in line with the recent policy for the Bottom 40 Percent (B40) group. However, using this measurement, only 0.86% of Malaysians are regarded as multidimensionally poor, and the measurement has been claimed to be irrelevant for Malaysia as a country with rapid economic development. Therefore, this study proposes a B40 clustering-based K-Means with cosine similarity architecture to identify the right indicators and dimensions that will provide a data-driven MPI measurement. To evaluate the approach, this study conducted extensive experiments on the Malaysian Census dataset. A series of data preprocessing steps were implemented, including data integration, attribute generation, data filtering, data cleaning, data transformation and attribute selection. The clustering model produced eight clusters of the B40 group. The study included a comprehensive clustering analysis to meaningfully understand each of the clusters. The analysis discovered seven indicators of multidimensional poverty from three dimensions encompassing education, living standard and employment. Out of the seven indicators, this study proposed six indicators to be added to the current MPI to establish a more meaningful picture of the current poverty trend in Malaysia. The outcomes from this study may help the government in properly identifying the B40 group that suffers from financial burden and may currently be misclassified.
Year: 2021 PMID: 34339480 PMCID: PMC8328299 DOI: 10.1371/journal.pone.0255312
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Income classification for Malaysia.
| Term | Description | Monthly Income Threshold, 2016 (RM) | Monthly Income Threshold, 2020 (RM) |
|---|---|---|---|
| T20 | Top 20 percent | ≥9,620 | ≥10,960 |
| M40 | Middle 40 percent | 4,360–9,619 | 4,850–10,959 |
| B40 | Bottom 40 percent | <4,360 | <4,850 |
Source: Household income and basic amenities survey
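The thresholds in the table above can be read as a simple lookup rule. The following is a minimal sketch, not the authors' code: the function name `income_group` and the `THRESHOLDS` dictionary are hypothetical, and the amounts are assumed to be monthly household income in RM.

```python
# Thresholds copied from the income classification table.
# Each entry is (B40 upper bound, T20 lower bound); amounts are assumed to be RM.
THRESHOLDS = {
    2016: (4360, 9620),
    2020: (4850, 10960),
}


def income_group(monthly_income, year=2016):
    """Return 'B40', 'M40' or 'T20' for a monthly household income."""
    b40_upper, t20_lower = THRESHOLDS[year]
    if monthly_income < b40_upper:
        return "B40"
    if monthly_income < t20_lower:
        return "M40"
    return "T20"
```

For example, an income of 4,500 falls in M40 under the 2016 thresholds but in B40 under the 2020 thresholds.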
Fig 1. The workflow for the B40 clustering model.
A set of attributes from person source file.
| No | Attributes | Description | Data type |
|---|---|---|---|
| 1 | Household ID | ID number for household | string |
| 2 | Living Quarter ID | ID number for living quarters | string |
| 3 | Household Member ID | ID number for household member | string |
| 4 | State | States in Malaysia | string |
| 5 | District | Administrative Districts | string |
| 6 | Strata | Urban/ Rural | numerical |
| 7 | Living Quarter No | Living Quarter Number | string |
| 8 | Household No | Household Number | string |
| 9 | Person No | Number of Household Member | string |
| 10 | Relationship | Relationship to Head of Household | string |
| 11 | Gender | Gender | numerical |
| 12 | Age | Age | string |
| 13 | Age Group | Age (5 year group) | string |
| 14 | Marital Status | Marital Status | numerical |
| 15 | Ethnic Group | Ethnic Group | numerical |
| 16 | Birthplace | Birthplace | numerical |
| 17 | State of birth | State of birth in Malaysia | string |
| 18 | Country of birth | Country of birth | string |
| 19 | Citizenship | Residence Status | numerical |
| 20 | Country of Citizenship | Country of citizenship | string |
| 21 | Place of Residence 5 Years Ago | Usual Place of Residence 5 Years Ago | numerical |
| 22 | Coding State/Country | State/Country Code | string |
| 23 | Coding District | District Code | string |
| 24 | Read and Write | Refers to literacy | numerical |
| 25 | Use Computer | Refers to computer literacy | numerical |
| 26 | Ever Been to School | Ever Been to School/Polytechnic/College/University | numerical |
| 27 | Highest Education | Highest Level of Education | numerical |
| 28 | Highest Certificate | Highest Certificate/Diploma/Degree | numerical |
| 29 | Work during the last 7 days | Work for at least 1 hour during the last 7 days | numerical |
| 30 | Work to return to | persons who did not work during the reference week but had a job, farm, enterprise or other family enterprise to return to | numerical |
| 31 | Look for work during the last 7 days | Look for work during the last 7 days | numerical |
| 32 | Reason for not seeking work | Reason for not seeking work | string |
| 33 | Occupation (1 Digit) | Major group for occupation | numerical |
| 34 | Industry (1 Digit) | Major group for Industry | numerical |
| 35 | Occupation (3 Digit) | Minor group for Occupation | string |
| 36 | Industry (3 Digit) | Minor group for Industry | string |
| 37 | Occupation Status | Employment status | numerical |
| 38 | Religion | Religion | string |
| 39 | Migration Status | 5 Year of Migration Status | string |
| 40 | Labour Force Status | Labour Force Status | numerical |
A set of attributes from living quarters source file.
| No | Attributes | Description | Data type |
|---|---|---|---|
| 1 | Living Quarter ID | ID number for living quarters | string |
| 2 | State | States in Malaysia | string |
| 3 | District | Administrative District | string |
| 4 | Strata | Urban/ Rural | numerical |
| 5 | Living Quarter No | Living Quarter Number | string |
| 6 | Type of Living Quarter | Type of living quarter | numerical |
| 7 | Living Quarter Housing Unit | Category of housing unit | string |
| 8 | Construction Material of Outer Walls | Construction Material of Outer Walls | numerical |
| 9 | Number of Rooms | Number of rooms in living quarter | string |
| 10 | Number of Bedrooms | Number of bedrooms in living quarter | string |
| 11 | Ownership Status | Ownership status of living quarter | numerical |
| 12 | Water Supply | Drinking water supply facility | numerical |
| 13 | Electricity Supply | Electricity supply facility | numerical |
| 14 | Toilet Facility | Toilet facility | numerical |
| 15 | Garbage Collection | Garbage collection facility | numerical |
| 16 | Total Persons in Living Quarter | Total persons in living quarter | numerical |
| 17 | Total Households in Living Quarter | Total households in living quarter | numerical |
A set of attributes after unsupervised feature selection.
| No | Attributes | No | Attributes |
|---|---|---|---|
| 1 | Birthplace | 13 | Radio/Hi-Fi |
| 2 | Construction Material of Outer Walls | 14 | Reason for Not Seeking Work |
| 3 | Ever Been to School | 15 | Read and Write |
| 4 | Gender | 16 | Refrigerator |
| 5 | Highest Certificate | 17 | Strata |
| 6 | Highest Education | 18 | Toilet Facility |
| 7 | iPod/PDA | 19 | Type of Household |
| 8 | None of the Items | 20 | VCD/DVD Player |
| 9 | Occupation | 21 | Washing Machine |
| 10 | Occupation Status | 22 | Water Filter |
| 11 | Paid TV Channel | 23 | Work during the last seven days |
| 12 | Personal Computer | | |
Clustering performance based on Davies Bouldin, average within centroid distance and sum of squares for k = 2 to 15 based on Euclidean distance, correlation similarity, cosine similarity and dice similarity.
| k | DB (ED) | DB (CrS) | DB (CS) | DB (DS) | AWCD (ED) | AWCD (CrS) | AWCD (CS) | AWCD (DS) | SS (ED) | SS (CrS) | SS (CS) | SS (DS) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 2.00 | ∞ | 2.39 | 5.64 | 19.85 | 23.00 | 20.00 | 23.39 | 0.65 | 1.00 | 0.53 | 0.50 |
| 3 | 1.90 | 2.64 | 2.54 | ∞ | 18.19 | 19.08 | 18.73 | 23.39 | 0.50 | 0.36 | 0.33 | 0.50 |
| 4 | 2.03 | 2.34 | 2.50 | 6.58 | 17.25 | 17.40 | 17.37 | 23.40 | 0.41 | 0.28 | 0.26 | 0.27 |
| 5 | 2.08 | 2.43 | 2.23 | ∞ | 16.02 | 16.75 | 16.26 | 23.28 | 0.28 | 0.21 | 0.23 | 0.25 |
| 6 | 1.78 | ∞ | 2.21 | 5.80 | 15.05 | 16.34 | 15.63 | 24.08 | 0.29 | 0.23 | 0.18 | 0.18 |
| 7 | 1.86 | 2.15 | 2.18 | ∞ | 14.36 | 15.19 | 15.22 | 23.76 | 0.24 | 0.20 | 0.15 | 0.25 |
| 8 | 1.60 | 2.23 | 2.16 | 5.64 | 13.41 | 15.10 | 14.41 | 24.21 | 0.24 | 0.15 | 0.13 | 0.15 |
| 9 | 1.80 | 2.19 | 2.22 | ∞ | 12.97 | 14.31 | 14.20 | 24.64 | 0.16 | 0.13 | 0.11 | 0.15 |
| 10 | 1.56 | 2.12 | 2.15 | ∞ | 12.18 | 14.49 | 13.84 | 23.98 | 0.19 | 0.13 | 0.10 | 0.13 |
| 11 | 1.70 | ∞ | 2.17 | ∞ | 12.19 | 14.04 | 13.73 | 23.59 | 0.15 | 0.12 | 0.10 | 0.21 |
| 12 | 1.63 | 2.19 | 2.10 | ∞ | 11.21 | 13.47 | 13.27 | 23.95 | 0.12 | 0.11 | 0.09 | 0.14 |
| 13 | 1.68 | 1.82 | 2.03 | ∞ | 11.13 | 12.45 | 12.48 | 24.51 | 0.12 | 0.10 | 0.08 | 0.12 |
| 14 | 1.65 | ∞ | 1.97 | ∞ | 11.03 | 12.90 | 12.18 | 24.02 | 0.11 | 0.11 | 0.08 | 0.14 |
| 15 | 1.63 | 1.91 | 1.90 | ∞ | 10.83 | 12.06 | 12.08 | 23.97 | 0.10 | 0.09 | 0.08 | 0.11 |
| Average | 1.78 | 2.20 | 2.19 | 5.91 | 13.98 | 15.47 | 14.96 | 23.87 | 0.25 | 0.23 | 0.18 | 0.22 |
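The Average row appears to be computed over finite entries only; this is an inference from the numbers, not something the source states. A minimal sketch that checks this reading for two of the Davies Bouldin columns, with the hypothetical helper name `finite_mean`:

```python
import math

INF = math.inf

# Davies Bouldin values for k = 2..15, copied from the table above.
db_euclidean = [2.00, 1.90, 2.03, 2.08, 1.78, 1.86, 1.60,
                1.80, 1.56, 1.70, 1.63, 1.68, 1.65, 1.63]
db_correlation = [INF, 2.64, 2.34, 2.43, INF, 2.15, 2.23,
                  2.19, 2.12, INF, 2.19, 1.82, INF, 1.91]


def finite_mean(values):
    """Average the finite entries, skipping any infinite scores."""
    finite = [v for v in values if math.isfinite(v)]
    return sum(finite) / len(finite)


mean_ed = finite_mean(db_euclidean)
mean_crs = finite_mean(db_correlation)
```

Under this assumption the two means come out close to the 1.78 and 2.20 reported in the Average row.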
Comparison of average clustering performance based on distance measure.
| Distance Measure | DB | AWCD | SS |
|---|---|---|---|
| Euclidean Distance | 1.78 | 13.98 | 0.25 |
| Correlation Similarity | 2.20 | 15.47 | 0.23 |
| Cosine Similarity | 2.19 | 14.96 | 0.18 |
| Dice Similarity | 5.91 | 23.87 | 0.22 |
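The measures compared above differ in how they score a record against a cluster centroid. The sketch below illustrates two of them in plain Python, under the assumption that records are numeric vectors; the function names are hypothetical and this is not the tool or implementation used in the study.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_centroid(record, centroids):
    """Index of the centroid with the highest cosine similarity to the record."""
    return max(range(len(centroids)),
               key=lambda i: cosine_similarity(record, centroids[i]))
```

Cosine similarity depends only on direction, so a record such as `[1, 0.1]` is assigned to a centroid at `[5, 0]` rather than one at `[0, 1]`, even though the latter is closer in Euclidean terms.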
Final score ranking to select the best distance measure.
| Distance Measure | DB | AWCD | SS | Mean Rank | Ranking Position |
|---|---|---|---|---|---|
| Euclidean Distance | 1 | 1 | 4 | 2.00 | 2 |
| Correlation Similarity | 3 | 3 | 3 | 3.00 | 3 |
| Cosine Similarity | 2 | 2 | 1 | 1.67 | 1 |
| Dice Similarity | 4 | 4 | 2 | 3.33 | 4 |
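The ranking table can be reproduced from the average scores with a short calculation. This sketch assumes that a lower value is better for all three metrics and that there are no ties, which matches the ranks shown; the helper name `mean_ranks` is hypothetical.

```python
# Average scores copied from the comparison table of distance measures.
averages = {
    "Euclidean Distance": {"DB": 1.78, "AWCD": 13.98, "SS": 0.25},
    "Correlation Similarity": {"DB": 2.20, "AWCD": 15.47, "SS": 0.23},
    "Cosine Similarity": {"DB": 2.19, "AWCD": 14.96, "SS": 0.18},
    "Dice Similarity": {"DB": 5.91, "AWCD": 23.87, "SS": 0.22},
}


def mean_ranks(scores):
    """Rank each measure per metric (1 = lowest value) and average the ranks."""
    metrics = list(next(iter(scores.values())))
    ranks = {name: [] for name in scores}
    for metric in metrics:
        ordered = sorted(scores, key=lambda name: scores[name][metric])
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    return {name: sum(r) / len(r) for name, r in ranks.items()}


result = mean_ranks(averages)
best = min(result, key=result.get)  # measure with the lowest mean rank
```

With these inputs, cosine similarity has the lowest mean rank, in line with the ranking position reported in the table.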
Fig 2. Cluster performance plot.
Size of cluster.
| Cluster | No of Individual | Cluster Size (%) | Average within Centroid Distance |
|---|---|---|---|
| 0 | 46,430 | 16 | 12.347 |
| 1 | 31,076 | 11 | 15.329 |
| 2 | 45,459 | 16 | 9.064 |
| 3 | 26,540 | 9 | 20.192 |
| 4 | 28,437 | 10 | 16.909 |
| 5 | 30,950 | 11 | 14.972 |
| 6 | 42,710 | 15 | 13.028 |
| 7 | 35,496 | 12 | 18.218 |
| Total | 287,098 | 100 | |
| Average | | | 14.437 |
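The overall figure of 14.437 is consistent with a mean weighted by cluster size rather than a simple mean of the eight cluster values. That interpretation is an inference from the numbers, not a statement in the source. A minimal check:

```python
# (number of individuals, average within centroid distance) per cluster,
# copied from the cluster size table.
clusters = [
    (46430, 12.347),
    (31076, 15.329),
    (45459, 9.064),
    (26540, 20.192),
    (28437, 16.909),
    (30950, 14.972),
    (42710, 13.028),
    (35496, 18.218),
]

total_individuals = sum(n for n, _ in clusters)

# Mean weighted by cluster size.
weighted_awcd = sum(n * d for n, d in clusters) / total_individuals

# Unweighted mean of the eight cluster values, for comparison.
simple_awcd = sum(d for _, d in clusters) / len(clusters)
```

The weighted mean rounds to 14.437, while the unweighted mean is roughly 15.0, so larger, tighter clusters such as Cluster 2 pull the overall value down.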
Fig 3. Centroid chart.
Fig 4. Scatter plot of (a) Cluster 0; (b) Cluster 1; (c) Cluster 2; (d) Cluster 3; (e) Cluster 4; (f) Cluster 5; (g) Cluster 6; (h) Cluster 7.
Fig 5. List of important attributes for B40 clustering model from heat map analysis.
Descriptive statistics for B40 clustering model.
| Attributes | Cluster 0 | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 |
|---|---|---|---|---|---|---|---|---|
| Read and Write | 100% Yes | 100% Yes | 100% Yes | 55% Yes | 97% Yes | 92% No | 100% Yes | 99% Yes |
| 45% No | ||||||||
| Highest Education | 35% Primary | 66% Primary | 70% Secondary | 61% No education | 26% Not applicable | 99% No education | 64% Secondary | 62% Secondary |
| 61% Secondary | 34% Pre-School | 19% Primary | 34% Not applicable | 38% Secondary | 26% Primary | 17% Primary | ||
| 19% Primary | ||||||||
| Highest Certificate | 31% UPSR | 100% No certificate | 48% SPM/STPM | 61% No certificate | 40% Not applicable | 99% No certificate | 39% SPM/STPM | 42% SPM/STPM |
| 30% SPM/STPM | 19% UPSR | 34% Not applicable | 25% SPM/STPM | 24% UPSR | 18% UPSR | |||
| 19% PMR/SRP | 17% PMR/SRP | 17% UPSR | ||||||
| Strata | 79% Rural | 69% Urban | 89% Urban | 61% Urban | 68% Urban | 69% Urban | 97% Urban | 88% Urban |
| 21% Urban | 31% Rural | 11% Rural | 39% Rural | 32% Rural | 31% Rural | |||
| Birthplace | 99% Malaysia | 99% Malaysia | 99% Malaysia | 98% Malaysia | 99% Malaysia | 99% Malaysia | 99% Malaysia | 99% Malaysia |
| Toilet Facility | 71% Pour Flush | 69% Flush system | 91% Flush system | 59% Flush system | 64% Flush system | 68% Flush system | 87% Flush system | 90% Flush system |
| 28% Pour Flush | 37% Pour Flush | 29% Pour Flush | 29% Pour Flush | 13% Pour Flush | ||||
| Construction Material of Outer Walls | 35% Brick | 70% Brick | 88% Brick | 58% Brick | 64% Brick | 69% Brick | 86% Brick | 89% Brick |
| 32% Brick and Plank | 15% Brick and Plank | 7% Brick and Plank | 22% Plank | 22% Plank | 16% Plank | 7% Plank | 7% Brick and Plank | |
| 25% Plank | 11% Plank | 16% Brick and Plank | 7% Brick and Plank | 11% Brick and Plank | ||||
| Paid TV Channel | 78% No | 70% No | 84% Yes | 77% No | 96% No | 66% No | 92% No | 65% Yes |
| 34% Yes | 35% No | |||||||
| Water Filter | 94% No | 86% No | 64% No | 92% No | 99% No | 85% No | 95% No | 54% Yes |
| 36% Yes | 46% No | |||||||
| Refrigerator | 98% Yes | 78% Yes | 99% Yes | 91% Yes | 94% No | 82% Yes | 99% Yes | 99% Yes |
| Washing Machine | 82% Yes | 71% Yes | 97% Yes | 73% Yes | 96% No | 75% Yes | 93% Yes | 98% Yes |
| Occupation | 67% No | 100% No (below age 10 years) | 52% No | 54% No | 48% No | 99% No (below age 10 years) | 76% No | 64% No |
| Reason for Not Seeking Work | 34% Still schooling | 100% Not applicable | 48% Not applicable | 46% Not applicable | 52% Not applicable | 100% Not applicable | 38% Still schooling | 37% Not applicable |
| 34% Not applicable | 23% Still schooling | 28% Retired | 33% Still schooling | 25% Not applicable | 33% Still schooling | |||
| 18% Housewife | 17% Housewife | 18% Housewife | 20% Housewife | 18% Housewife | ||||
| Personal Computer | 99% No | 87% No | 100% No | 96% No | 99% No | 88% No | 99% No | 98% Yes |
| iPod/PDA | 100% No | 99% No | 100% No | 99% No | 99% No | 99% No | 100% No | 91% No 9% Yes |
| CLUSTER SIZE (individuals) | 46,430 | 31,076 | 45,459 | 26,540 | 28,437 | 30,950 | 42,710 | 35,496 |
Dimensions, indicators and measure attributes identified from the B40 clustering model.
| DIMENSIONS | INDICATORS | MEASURE ATTRIBUTES |
|---|---|---|
| EDUCATION | Literacy | Read and Write |
| | Highest education level and grade | Highest Education |
| | | Highest Certificate |
| LIVING STANDARDS | Sanitation | Toilet Facility |
| | Housing | Construction Material of Outer Walls |
| | Access to television services | Paid TV Channel |
| | Assets | Water Filter |
| | | Refrigerator |
| | | Washing Machine |
| | | Personal Computer |
| | | iPod/PDA |
| EMPLOYMENT | Work | Occupation |
| | | Reason for Not Seeking Work |
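One reading of the table above is a three-level hierarchy of dimension, indicator and measure attribute. The sketch below encodes that reading as a nested dictionary; the grouping of attributes under each indicator is an interpretation of the flattened table, and the names `MPI_STRUCTURE` and `locate` are hypothetical.

```python
# Dimension -> indicator -> measure attributes, as read from the table.
MPI_STRUCTURE = {
    "EDUCATION": {
        "Literacy": ["Read and Write"],
        "Highest education level and grade": [
            "Highest Education", "Highest Certificate"],
    },
    "LIVING STANDARDS": {
        "Sanitation": ["Toilet Facility"],
        "Housing": ["Construction Material of Outer Walls"],
        "Access to television services": ["Paid TV Channel"],
        "Assets": ["Water Filter", "Refrigerator", "Washing Machine",
                   "Personal Computer", "iPod/PDA"],
    },
    "EMPLOYMENT": {
        "Work": ["Occupation", "Reason for Not Seeking Work"],
    },
}

indicator_count = sum(len(ind) for ind in MPI_STRUCTURE.values())
attribute_count = sum(len(attrs)
                      for ind in MPI_STRUCTURE.values()
                      for attrs in ind.values())


def locate(attribute):
    """Return (dimension, indicator) for a measure attribute, or None."""
    for dimension, indicators in MPI_STRUCTURE.items():
        for indicator, attrs in indicators.items():
            if attribute in attrs:
                return dimension, indicator
    return None
```

Under this reading there are seven indicators across three dimensions, which agrees with the count given in the abstract.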
Comparison of MPI dimensions and indicators.
| Global MPI (2018) | Malaysia MPI (2016) | This study (2020) |
|---|---|---|
| EDUCATION | EDUCATION | EDUCATION |
| HEALTH | HEALTH | EMPLOYMENT |
| LIVING STANDARDS | LIVING STANDARDS | LIVING STANDARDS |
| | INCOME | |
Distribution of B40 group based on 2016’s PLI.
| Cluster | No of Individual | Poor (<RM 981) | Low-income (RM 981 to RM 2614) | Lower-middle income (>RM 2614) |
|---|---|---|---|---|
| 0 | 46430 | 4299 | 27127 | 15004 |
| 1 | 31076 | 4320 | 16819 | 9937 |
| 2 | 45459 | 3955 | 22275 | 19229 |
| 3 | 26540 | 6085 | 11418 | 9037 |
| 4 | 28437 | 6102 | 11970 | 10365 |
| 5 | 30950 | 3350 | 17027 | 10573 |
| 6 | 42710 | 7234 | 22081 | 13395 |
| 7 | 35496 | 3904 | 16203 | 15389 |
| Grand total | 287098 | 39249 | 144920 | 102929 |
| Percentage | | 14% | 50% | 36% |
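The income bands and the percentage row can be expressed as a small calculation. This is a sketch under the assumption that the bands apply to monthly household income in RM and that RM 981 and RM 2614 both fall in the low-income band, as the table header suggests; the function name `pli_category` is hypothetical.

```python
def pli_category(monthly_income):
    """Assign an income to one of the three bands used in the table."""
    if monthly_income < 981:
        return "Poor"
    if monthly_income <= 2614:
        return "Low-income"
    return "Lower-middle income"


# Grand totals per band, copied from the table above.
totals = {
    "Poor": 39249,
    "Low-income": 144920,
    "Lower-middle income": 102929,
}

grand_total = sum(totals.values())

# Share of each band, rounded to whole percent as in the table.
shares = {band: round(100 * count / grand_total)
          for band, count in totals.items()}
```

The rounded shares match the 14%, 50% and 36% reported in the Percentage row.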
Characteristics of the poor in Cluster 3.
| Attributes | Cluster 3 | Description |
|---|---|---|
| Read and Write | 55% Yes; 45% No | 55% of the poor can read and write, while 45% cannot |
| Highest Education | 61% No education; 34% Not applicable | 61% of the poor have no education, while 34% are not applicable |
| Highest Certificate | 61% No certificate; 34% Not applicable | 61% of the poor do not have a certificate, and 34% are not applicable |
| Strata | 61% Urban; 39% Rural | 61% of the poor live in urban areas, while 39% live in rural areas |
| Birthplace | 98% Malaysia | 98% of the poor were born in Malaysia |
| Toilet Facility | 59% Flush system; 37% Pour Flush | 59% of the poor use a flush system, and 37% use a pour-flush toilet system |
| Construction Material of Outer Walls | 58% Brick; 22% Plank; 16% Brick and Plank | 58% of the poor live in a brick house, 22% in a plank house and 16% in a mixed (brick and plank) house |
| Paid TV Channel | 77% No | 77% of the poor do not have a paid TV channel |
| Water Filter | 92% No | 92% of the poor do not have a water filter |
| Refrigerator | 91% Yes | 91% of the poor have a refrigerator |
| Washing Machine | 73% Yes | 73% of the poor have a washing machine |
| Occupation | 54% No | 54% of the poor are unemployed |
| Reason for Not Seeking Work | 46% Not applicable; 28% Retired; 18% Housewife | 28% of the poor do not look for a job because they have already retired, and 18% are housewives |
| Personal Computer | 96% No | 96% of the poor do not have a personal computer |
| iPod/PDA | 99% No | 99% of the poor do not have an iPod/PDA |
A set of attributes from household source file.
| No | Attributes | Description | Data type |
|---|---|---|---|
| 1 | Household ID | ID number for household | numerical |
| 2 | Living Quarter ID | ID number for living quarters | numerical |
| 3 | State | States in Malaysia | numerical |
| 4 | District | Administrative District | numerical |
| 5 | Strata | Urban/ Rural | numerical |
| 6 | Living Quarter No | Living Quarter Number | numerical |
| 7 | Household No | Household Number | numerical |
| 8 | 1 Motor Car | Owned 1 Motor Car | numerical |
| 9 | 2 Motor Car | Owned 2 Motor Car | numerical |
| 10 | 3 or More Motor Car | Owned 3 or more Motor Car | numerical |
| 11 | 1 Motorcycle | Owned 1 Motorcycle | numerical |
| 12 | 2 or more Motorcycle | Owned 2 or more Motorcycle | numerical |
| 13 | Bicycle | Owned Bicycle | numerical |
| 14 | Air-conditioner | Owned Air-conditioner | numerical |
| 15 | Washing Machine | Owned Washing Machine | numerical |
| 16 | Refrigerator | Owned Refrigerator | numerical |
| 17 | Television | Owned Television | numerical |
| 18 | VCD/DVD Player | Owned VCD/DVD Player | numerical |
| 19 | Personal Computer | Owned Personal Computer | numerical |
| 20 | Laptop | Owned Laptop | numerical |
| 21 | Fixed Telephone Line | Owned Fixed Telephone Line | numerical |
| 22 | Mobile Phone | Owned Mobile Phone | numerical |
| 23 | Paid TV Channel | Owned Paid TV Channel | numerical |
| 24 | Digital Camera | Owned Digital Camera | numerical |
| 25 | Microwave Oven | Owned Microwave Oven | numerical |
| 26 | Internet Subscription | Subscribed to Internet | numerical |
| 27 | i-pod/PDA | Owned i-pod/PDA | numerical |
| 28 | Water Filter | Owned Water Filter | numerical |
| 29 | Radio/Hi-Fi | Owned Radio/Hi-Fi | numerical |
| 30 | None of the Items | Owned None of the Items | numerical |
| 31 | Ownership of Living Quarter | Ownership of Living Quarter | numerical |
| 32 | Ownership of other Living Quarter in Malaysia | Ownership of other Living Quarter in Malaysia | numerical |
| 33 | Rental Payment | Whether the household pays rent for the living quarter | numerical |
| 34 | Monthly Rental | Monthly rental payment amount | numerical |
| 35 | Type of Household | Type of Household | numerical |
| 36 | Composition of Household | Composition of Household | numerical |
| 37 | Total Male in Household | Total Male in Household | numerical |
| 38 | Total Female in Household | Total Female in Household | numerical |
| 39 | Total Persons in Household | Total Persons in Household | numerical |
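The person, household and living quarters files share the Household ID and Living Quarter ID attributes, which suggests how the data integration step could link them. The sketch below is an assumption about that step, not the authors' procedure: the function name `integrate` and the toy records are hypothetical, and IDs are converted to strings because the attribute tables list them as string in one file and numerical in another.

```python
def integrate(persons, households, living_quarters):
    """Attach household and living quarter attributes to each person record."""
    hh_by_id = {str(h["Household ID"]): h for h in households}
    lq_by_id = {str(q["Living Quarter ID"]): q for q in living_quarters}
    merged = []
    for person in persons:
        household = hh_by_id.get(str(person["Household ID"]), {})
        quarter = lq_by_id.get(str(person["Living Quarter ID"]), {})
        # Person fields win when attribute names collide (for example Strata).
        merged.append({**quarter, **household, **person})
    return merged


# Toy records for illustration only; the values are made up.
persons = [{"Household ID": "H1", "Living Quarter ID": "Q1", "Age": "34"}]
households = [{"Household ID": "H1", "Living Quarter ID": "Q1", "Refrigerator": 1}]
living_quarters = [{"Living Quarter ID": "Q1", "Toilet Facility": 2}]

rows = integrate(persons, households, living_quarters)
```

Each resulting row then carries individual, household asset and housing attributes together, which is the shape a per-individual clustering would need.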