Appendix Within An Assignment Of Probability

Probability

Basic Concepts

Probability is a numerical measure of likelihood. If an event has a probability equal to 1 (or 100%), then it is certain to occur. If it has a probability equal to 0, then it will definitely not occur. And if it has a probability equal to 1/2 (or 50%), then it is as likely as not to occur.

You probably know that tossing a fair coin yields heads with probability 1/2, and that casting a fair die yields a 1 with probability 1/6. How do we know this?

There is a principle known as the principle of indifference, which states: if there are n mutually exclusive and jointly exhaustive possibilities, and if, as far as we know, there are no differences between the n possibilities apart from their names (such as "heads" or "tails"), then each possibility should be assigned a probability equal to 1/n. (Mutually exclusive: only one possibility can be realized in a single trial. Jointly exhaustive: at least one possibility is realized in a single trial. Mutually exclusive and jointly exhaustive: exactly one possibility is realized in a single trial.)

Since this principle appeals to what we know, it concerns epistemic probabilities (a.k.a. subjective probabilities) or degrees of belief. If you are certain of the truth of a proposition, then you assign to it a probability equal to 1. If you are certain that a proposition is false, then you assign to it a probability equal to 0. And if you have no information that makes you believe that the truth of a proposition is more likely (or less likely) than its falsity, then you assign to it probability 1/2. Subjective probabilities are therefore also known as ignorance probabilities: if you are ignorant of any differences between the possibilities, you assign to them equal probabilities.

If we assign probability 1 to a proposition because we believe that it is true, we assign a subjective probability, and if we assign probability 1 to an event because it is certain that it will occur, we assign an objective probability. Until the advent of quantum mechanics, the only objective probabilities known were relative frequencies.

The advantage of the frequentist definition of probability is that it allows us to measure probabilities, at least approximately. The trouble with it is that it refers to ensembles. You can't measure the probability of heads by tossing a single coin. You get better and better approximations to the probability of heads by tossing a larger and larger number $N$ of coins and dividing the number $N_H$ of heads by $N.$ The exact probability of heads is the limit

$$p(H) = \lim_{N\to\infty} \frac{N_H}{N}.$$

The meaning of this formula is that for any positive number $\epsilon,$ however small, you can find a (sufficiently large but finite) number $N_0$ such that

$$\left| p(H) - \frac{N_H}{N} \right| < \epsilon \quad \text{for all } N > N_0.$$
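The way the relative frequency settles down as the number of tosses grows can be illustrated with a small simulation. This is only a sketch: Python's pseudo-random generator stands in for a fair coin, and the function name `relative_frequency` and the fixed seed are assumptions made for the example.

```python
import random

def relative_frequency(num_tosses, seed=0):
    """Return the fraction of heads in num_tosses simulated fair-coin tosses."""
    rng = random.Random(seed)  # fixed seed so that a run is reproducible
    heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
    return heads / num_tosses

if __name__ == "__main__":
    # The fraction of heads tends to get closer to 1/2 as the ensemble grows.
    for n in (10, 1_000, 100_000):
        print(n, relative_frequency(n))
```

A finite run only ever yields an approximation; the simulation does not measure the limit itself.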

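The probability that one of $m$ events from a mutually exclusive and jointly exhaustive set of $n$ possible events happens is the sum of the probabilities of the $m$ events. Suppose, for example, you win if you cast either a 1 or a 6. The probability of winning is

$$p(1\vee 6) = p(1) + p(6) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}.$$

In frequentist terms, this is virtually self-evident. If $N_1$ and $N_6$ are the numbers of times a 1 and a 6 turn up in $N$ casts, then $\frac{N_1}{N}$ approximates $p(1),$ $\frac{N_6}{N}$ approximates $p(6),$ and $\frac{N_1+N_6}{N}$ approximates $p(1\vee 6).$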
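The sum rule for the die example can be checked with exact fractions. The sketch below assumes a fair six-sided die, so that the principle of indifference assigns 1/6 to each face; the variable names are chosen for illustration only.

```python
from fractions import Fraction

faces = range(1, 7)
# Principle of indifference: each of the six faces gets probability 1/6.
p = {face: Fraction(1, 6) for face in faces}

# Winning outcomes are mutually exclusive, so their probabilities add.
winning = {1, 6}
p_win = sum(p[face] for face in winning)

print(p_win)  # 1/3
```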

The probability that two independent events happen is the product of the probabilities of the individual events. Suppose, for example, you cast two dice and you win if the total is 12. Then

$$p(6\wedge 6) = p(6)\times p(6) = \frac{1}{36}.$$

By the principle of indifference, there are now $6\times 6 = 36$ equiprobable possibilities, and casting a total of 12 with two dice is one of them.
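The same result can be obtained by listing the equiprobable outcomes of two dice and counting. This is a minimal sketch assuming two fair, independent dice; the names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first die, second die): 6 x 6 = 36 possibilities.
outcomes = list(product(range(1, 7), repeat=2))

# Counting: exactly one pair, (6, 6), gives a total of 12.
p_total_12 = Fraction(sum(1 for a, b in outcomes if a + b == 12), len(outcomes))

# Product rule for two independent casts of a 6.
p_six = Fraction(1, 6)

print(len(outcomes), p_total_12, p_six * p_six)  # 36 1/36 1/36
```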

It is important to remember that the joint probability $p(A\wedge B)$ of two events $A$ and $B$ equals the product of the individual probabilities $p(A)$ and $p(B)$ only if the two events are independent, meaning that the probability of one does not depend on whether or not the other happens. In terms of propositions: the probability that the conjunction $A\wedge B$ is true is the probability that $A$ is true times the probability that $B$ is true only if the probability that either proposition is true does not depend on whether the other is true or false. Ignoring this can have the most tragic consequences.
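A short counting example shows what goes wrong when the product rule is applied to events that are not independent. The events below (first die shows 6, second die shows 6, total is 12) are chosen for illustration and assume two fair dice.

```python
from fractions import Fraction
from itertools import product

# All 36 equiprobable outcomes of casting two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on an outcome, by counting."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def joint(event_a, event_b):
    """Probability that both events happen."""
    return prob(lambda o: event_a(o) and event_b(o))

def first_is_6(o):
    return o[0] == 6

def second_is_6(o):
    return o[1] == 6

def total_is_12(o):
    return o[0] + o[1] == 12

# Independent events: the joint probability equals the product.
print(joint(first_is_6, second_is_6), prob(first_is_6) * prob(second_is_6))  # 1/36 1/36

# Dependent events: the product of the individual probabilities is wrong.
print(joint(first_is_6, total_is_12), prob(first_is_6) * prob(total_is_12))  # 1/36 1/216
```

Whether the first die shows 6 clearly affects whether the total can be 12, so multiplying the two individual probabilities understates the joint probability by a factor of six.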

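The general rule for the joint probability of two events is

$$p(A\wedge B) = p(B|A)\,p(A) = p(A|B)\,p(B).$$

$p(B|A)$ is a conditional probability: the probability of $B$ given that $A$ happens or is true.

To see this, let $N$ be the total number of trials, $N_A$ the number of trials in which $A$ happens or is true, and $N_{AB}$ the number of trials in which both $A$ and $B$ happen or are true. $\frac{N_{AB}}{N}$ approximates $p(A\wedge B),$ $\frac{N_{AB}}{N_A}$ approximates $p(B|A),$ and $\frac{N_A}{N}$ approximates $p(A).$ But

$$\frac{N_{AB}}{N} = \frac{N_{AB}}{N_A}\,\frac{N_A}{N}.$$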

An immediate consequence of this is Bayes' theorem:

$$p(B|A) = \frac{p(A|B)\,p(B)}{p(A)}.$$

The following is just as readily established:

$$p(A) = p(A|B)\,p(B) + p(A|\overline{B})\,p(\overline{B}),$$

where $\overline{B}$ happens or is true whenever $B$ does not happen or is false. The generalization to $n$ mutually exclusive and jointly exhaustive possibilities $B_1,\ldots,B_n$ should be obvious.
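The rule for joint probabilities, Bayes' theorem and the decomposition over an event and its negation can all be verified by counting outcomes. The sketch below assumes two fair dice and uses two illustrative events, "the total is at least 10" and "the first die shows 6"; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two fair dice

def count(event):
    """Number of outcomes in which the event (a predicate) happens."""
    return sum(1 for o in outcomes if event(o))

def p(event):
    return Fraction(count(event), len(outcomes))

def p_given(event, condition):
    """Conditional probability p(event | condition), by counting."""
    both = count(lambda o: event(o) and condition(o))
    return Fraction(both, count(condition))

def a(o):          # the total is at least 10
    return o[0] + o[1] >= 10

def b(o):          # the first die shows 6
    return o[0] == 6

def not_b(o):      # the first die does not show 6
    return not b(o)

joint = p(lambda o: a(o) and b(o))

# Joint probability written in both orders.
print(joint, p_given(b, a) * p(a), p_given(a, b) * p(b))  # 1/12 1/12 1/12

# Bayes' theorem.
print(p_given(b, a), p_given(a, b) * p(b) / p(a))  # 1/2 1/2

# Decomposition of p(a) over b and its negation.
print(p(a), p_given(a, b) * p(b) + p_given(a, not_b) * p(not_b))  # 1/6 1/6
```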



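Given a random variable, which is a set $x_1,\ldots,x_n$ of random numbers, we may want to know the arithmetic mean

$$\langle x\rangle = \frac{1}{n}\sum_{k=1}^{n} x_k = \frac{x_1+\cdots+x_n}{n},$$

as well as the standard deviation, which is the root-mean-square deviation from the arithmetic mean,

$$\sigma = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(x_k-\langle x\rangle\right)^2}.$$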

The standard deviation is an important measure of statistical dispersion.
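As a worked illustration, the arithmetic mean and the root-mean-square deviation can be computed directly from their definitions. The data set below is made up for the example, and the function names are illustrative.

```python
import math

def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

def std_dev(xs):
    """Root-mean-square deviation from the arithmetic mean."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up numbers
print(mean(data), std_dev(data))  # 5.0 2.0
```

The division is by the number of values, which is what Python's `statistics.pstdev` also computes; `statistics.stdev` divides by one less and gives a slightly larger value.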

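Given $n$ possible measurement outcomes $v_1,\ldots,v_n$ with probabilities $p_k = p(v_k),$ we have a probability distribution $\{p_k\},$ and we may want to know the expected value of $v,$ defined by

$$\langle v\rangle = \sum_{k=1}^{n} p_k\, v_k,$$

as well as the corresponding standard deviation

$$\Delta v = \sqrt{\sum_{k=1}^{n} p_k \left(v_k-\langle v\rangle\right)^2},$$

which is a handy measure of the fuzziness of $v.$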
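For a concrete case, take the outcomes of a single fair die as the possible values, each with probability 1/6. The sketch below computes the probability-weighted mean and the corresponding standard deviation; the die is an assumed example and the function names are illustrative.

```python
import math
from fractions import Fraction

values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6  # fair die: equal probabilities

def expected_value(values, probs):
    """Probability-weighted mean of the possible values."""
    return sum(p * v for p, v in zip(probs, values))

def spread(values, probs):
    """Standard deviation of the distribution around its expected value."""
    ev = expected_value(values, probs)
    variance = sum(p * (v - ev) ** 2 for p, v in zip(probs, values))
    return math.sqrt(variance)

print(expected_value(values, probs))  # 7/2
print(spread(values, probs))          # square root of 35/12, about 1.708
```

Note that the expected value 7/2 is not itself a possible outcome of a single cast.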

We have defined probability as a numerical measure of likelihood. So what is likelihood? What is probability apart from being a numerical measure? The frequentist definition covers some cases, the epistemic definition covers others, but which definition would cover all cases? It seems that probability is one of those concepts that are intuitively meaningful to us, but — just like time or the experience of purple — cannot be explained in terms of other concepts.

Results for the survey are based on face-to-face interviews conducted under the direction of Opinion Research Business in Iraq, Morocco and Tunisia and Princeton Survey Research Associates International in the other 36 countries. Findings are reported exclusively for Muslims; however, the survey is based on national samples that did not screen out non-Muslims, except in Thailand, where a sample of only Muslims was fielded in five southern provinces. In certain instances, regions of countries with high levels of insecurity or limited access were also excluded from the national samples. Oversamples of Muslims were conducted in two countries: Bosnia-Herzegovina and Russia. In both countries, oversampling was achieved by disproportionately sampling regions or territories known to have higher concentrations of Muslims.

In all countries, surveys were administered through face-to-face interviews conducted at a respondent’s place of residence. All samples are based on area probability designs, which typically entailed proportional stratification by region and urbanity, selection of primary sampling units (PSUs) proportional to population size, and random selection of secondary and tertiary sampling units within PSUs. Interview teams were assigned to designated random routes at the block or street level and followed predetermined skip patterns when contacting households. Within households, adult respondents were randomly selected by enumerating all adults in the household using a Kish grid or selecting the adult with the most recent birthday.

The questionnaire administered by survey interviewers was designed by the staff of the Pew Research Center’s Forum on Religion & Public Life in consultation with subject matter experts and advisers to the project. The questionnaire was translated into the vernacular language(s) of each country surveyed, checked through back-translation and pretested prior to fieldwork. In total, the survey was conducted in more than 80 languages.

Conducting opinion polls in diverse societies necessitates adapting the survey to local sensitivities. In some countries, pretest results indicated the need to suppress certain questions to avoid offending respondents and/or risking the security of the interviewers. In other countries, interviewers considered some questions too sensitive to pretest. Thus, not all questions were asked in all countries.

For example, interviewers in Afghanistan, Uzbekistan and Morocco indicated that certain questions about sexual preference and sexual behavior were too sensitive to be asked. Questions on these topics were either eliminated or modified in these countries.

Following fieldwork, survey performance for each country was assessed by comparing the results for key demographic variables with reliable, national-level population statistics. For each country, the data were weighted to account for different probabilities of selection among respondents in each sample. Additionally, where appropriate, data were weighted through an iterative procedure to more closely align the samples with official population figures for characteristics such as gender, age, education and ethnicity. The reported sampling errors and the statistical tests of significance used in analysis take into account the effect of both types of weighting. The reported sampling errors and statistical tests of significance also take into account the design effects associated with each sample.
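The text does not name the iterative weighting procedure. One common technique of this kind is raking (iterative proportional fitting), sketched below purely as an assumption about what such a procedure can look like. The respondents, categories and target shares are invented for the example and do not come from the survey.

```python
def rake(respondents, targets, iterations=50):
    """Adjust weights until weighted category shares match the target shares.

    respondents: list of dicts such as {"gender": "m", "age": "young"}
    targets: {variable: {category: target share}}, shares summing to 1
    Every category must occur at least once in the sample.
    """
    weights = [1.0] * len(respondents)
    for _ in range(iterations):
        for var, shares in targets.items():
            total = sum(weights)
            # Current weighted total of each category of this variable.
            current = {cat: 0.0 for cat in shares}
            for r, w in zip(respondents, weights):
                current[r[var]] += w
            # Rescale so this variable's weighted shares hit the targets.
            weights = [
                w * shares[r[var]] * total / current[r[var]]
                for r, w in zip(respondents, weights)
            ]
    return weights

def weighted_share(respondents, weights, var, cat):
    """Weighted share of respondents whose variable var equals cat."""
    hit = sum(w for r, w in zip(respondents, weights) if r[var] == cat)
    return hit / sum(weights)

# Hypothetical sample: 60% men and 40% young respondents.
sample = (
    [{"gender": "m", "age": "young"}] * 3
    + [{"gender": "m", "age": "old"}] * 3
    + [{"gender": "f", "age": "young"}] * 1
    + [{"gender": "f", "age": "old"}] * 3
)
# Hypothetical population figures.
targets = {"gender": {"m": 0.5, "f": 0.5}, "age": {"young": 0.5, "old": 0.5}}

weights = rake(sample, targets)
print(weighted_share(sample, weights, "gender", "m"))
print(weighted_share(sample, weights, "age", "young"))
```

Each pass fixes one characteristic at a time, which slightly disturbs the others; repeating the passes brings all of them close to the targets together.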

The table below shows the sample size and margin of sampling error for Muslim respondents in each country. For results based on the Muslim sample in the countries surveyed, one can say with 95% confidence that the error attributable to collecting data from some, rather than all, members of the Muslim population is plus or minus the margin of error. This means that in 95 out of 100 samples of the same size and type, the results obtained would vary by no more than plus or minus the margin of error for the country in question.
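To see how sample size and design effect feed into a margin of error, a common textbook approximation can be used: the 95% margin for a proportion is about 1.96 times the square root of the design effect times p(1 - p)/n. The sketch below assumes this formula and a hypothetical sample of 1,000 respondents with a design effect of 2.0; the text does not state the exact calculation used for the reported margins.

```python
import math

def margin_of_error(sample_size, design_effect=1.0, proportion=0.5, z=1.96):
    """Approximate margin of sampling error for a proportion.

    Assumed textbook formula: z * sqrt(design_effect * p * (1 - p) / n).
    proportion=0.5 gives the widest (most conservative) margin;
    z=1.96 corresponds to a 95% confidence level.
    """
    variance = design_effect * proportion * (1.0 - proportion) / sample_size
    return z * math.sqrt(variance)

# Hypothetical figures, not taken from the survey.
print(f"{margin_of_error(1000, 2.0):.1%}")  # 4.4%
print(f"{margin_of_error(1000, 1.0):.1%}")  # 3.1%
```

Under this approximation, a design effect of 2.0 widens the margin by a factor of the square root of 2 compared with a simple random sample of the same size.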

It should be noted that practical difficulties in conducting multinational surveys can introduce error or bias into the findings of opinion polls. In some countries, the achieved samples suffered from imbalances in the number of women or men interviewed, while in some countries a lack of adequate, national-level statistics made it difficult to assess the accuracy of educational characteristics among the sampled population. Specific difficulties encountered were:

Gender Imbalances: In Afghanistan and Niger, the survey respondents are disproportionately male, while in Thailand, Azerbaijan and Uzbekistan they are disproportionately female.

In each of these countries interviewers faced practical difficulties in reaching additional male or female respondents. In Afghanistan, despite strict gender matching, cultural norms frequently limited the ability of interviewers to contact women in certain areas. In Niger, difficulties associated with recruiting enough female interviewers affected gender matching and may have discouraged the participation of women in the survey.

Surveying in active conflict zones posed particular challenges for interviewers. In southern Thailand, security concerns limited the number of interviews that could be conducted in the evening hours, leading in part to fewer interviews with men, who often are out of the house during daytime hours.

In Azerbaijan and Uzbekistan, large-scale labor migration patterns may have contributed to fewer interviews with male respondents.

Education: In many countries, census statistics on education are unavailable, dated or disputed by experts. The lack of reliable national statistics limits the extent to which survey samples can be assessed for representativeness on this measure.

In Albania, the Palestinian territories and Tajikistan, the surveys appear to overrepresent highly educated respondents compared with the last available national census. In each of these cases, however, official education statistics are based on, or estimated from, censuses conducted five or more years prior to the survey and thus were not used for the purposes of weighting.

In Niger, the sample is disproportionately well-educated compared with the last available Demographic and Health Survey (2006), but no education census statistics are available to assess the representativeness of the sample.

In addition to sampling error and other practical difficulties, one should bear in mind that question wording can also have an impact on the findings of opinion polls.

For details about the surveys conducted in 15 sub-Saharan African countries in 2008-2009, see the Pew Research Center’s 2010 report “Tolerance and Tension: Islam and Christianity in Sub-Saharan Africa.”

The survey questionnaire and a topline with full results for the 24 countries surveyed in 2011-2012 are included in Appendix D (PDF).

Afghanistan
Sample design: Stratified area probability sample of all 34 Afghan provinces (excluding nomadic populations) proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Baloch, Dari, Hazara, Pashto, Uzbek
Fieldwork dates: Nov. 27–Dec. 17, 2011
Representative: Nationally representative of 94% of the adult population.
Design effect: 3.4

Albania
Sample design: Stratified area probability sample of all three regions proportional to population size and urban/rural population. Some difficult-to-reach areas were excluded.
Mode: Face-to-face adults 18+
Languages: Albanian
Fieldwork dates: Oct. 24–Nov. 13, 2011
Representative: Nationally representative of 98% of the adult population.
Design effect: 2.3

Azerbaijan
Sample design: Stratified area probability sample of eight of 11 oblasts (excluding Upper-Karabakh, Nakhchivan and Kalbacar-Lacin) and city of Baku proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Azeri, Russian
Fieldwork dates: Dec. 4–Dec. 25, 2011
Representative: Nationally representative of 85% of the adult population.
Design effect: 3.3

Bangladesh
Sample design: Stratified area probability sample of all seven administrative divisions proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Bangla
Fieldwork dates: Nov. 21, 2011–Feb. 5, 2012
Representative: Nationally representative of the adult population.
Design effect: 3.8

Bosnia-Herzegovina
Sample design: Stratified area probability sample of all seven regions proportional to population size and urban/rural population. In addition, an oversample of Muslims was conducted in majority-Bosniak areas. Some difficult-to-reach areas were excluded.
Mode: Face-to-face adults 18+
Languages: Bosnian, Croatian, Serbian
Fieldwork dates: Nov. 3–Nov. 20, 2011
Representative: Nationally representative of 98% of the adult population.
Design effect: 1.8

Egypt
Sample design: Stratified area probability sample of 24 of 29 governorates proportional to population size and urban/rural population. The five frontier provinces, containing 2% of the overall population, were excluded.
Mode: Face-to-face adults 18+
Languages: Arabic
Fieldwork dates: Nov. 14–Dec. 18, 2011
Representative: Nationally representative of 98% of the adult population.
Design effect: 2.6

Indonesia
Sample design: Stratified area probability sample of 19 provinces (excluding Papua and other remote areas and provinces with small populations) proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Bahasa Indonesian
Fieldwork dates: Oct. 28–Nov. 19, 2011
Representative: Nationally representative of 87% of the adult population.
Design effect: 2.3

Iraq
Sample design: Stratified area probability sample of all 18 governorates proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Arabic, Kurdish
Fieldwork dates: Nov. 4–Dec. 1, 2011
Representative: Nationally representative of the adult population.
Design effect: 4.9

Jordan
Sample design: Stratified area probability sample of all 12 governorates proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Arabic
Fieldwork dates: Nov. 3–Dec. 3, 2011
Representative: Nationally representative of the adult population.
Design effect: 3.5

Kazakhstan
Sample design: Stratified area probability sample of all 14 oblasts proportional to population size and urban/rural population. Three districts each in Almaty oblast and East Kazakhstan were excluded due to government restrictions.
Mode: Face-to-face adults 18+
Languages: Kazakh, Russian
Fieldwork dates: Nov. 24–Dec. 17, 2011
Representative: Nationally representative of 98% of the adult population.
Design effect: 2.5

Kosovo
Sample design: Stratified area probability sample of all eight KFOR-administered regions proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Albanian, Serbian
Fieldwork dates: Dec. 16, 2011–Jan. 20, 2012
Representative: Nationally representative of 99% of the adult population.
Design effect: 3.7

Kyrgyzstan
Sample design: Stratified area probability sample of all seven oblasts and the cities of Bishkek and Osh proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Kyrgyz, Russian, Uzbek
Fieldwork dates: Jan. 31–Feb. 25, 2012
Representative: Nationally representative of the adult population.
Design effect: 3.3

Lebanon
Sample design: Stratified area probability sample of all seven regions (excluding areas of Beirut controlled by a militia group and a few villages in the south near the border with Israel) proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Arabic
Fieldwork dates: Nov. 14–Dec. 8, 2011
Representative: Nationally representative of 98% of the adult population.
Design effect: 2.2

Malaysia
Sample design: Stratified area probability sample of Peninsular Malaysia, East Malaysia and the Federal Territory of Kuala Lumpur. In Peninsular Malaysia and Kuala Lumpur, interviews were conducted proportional to population size and urban/rural population. A disproportionately higher number of interviews were conducted in Sarawak and Sabah states in East Malaysia to adequately cover this geographically challenging region.
Mode: Face-to-face adults 18+
Languages: Mandarin Chinese, English, Malay
Fieldwork dates: Nov. 4, 2011–Jan. 25, 2012
Representative: Nationally representative of the adult population.
Design effect: 2.5

Morocco
Sample design: Stratified area probability sample of 15 regions proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Arabic, French
Fieldwork dates: Nov. 3–Dec. 1, 2011
Representative: Nationally representative of the adult population.
Design effect: 2.8

Niger
Sample design: Stratified area probability sample of seven of eight regions (Agadez was excluded) and city of Niamey proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: French, Hausa
Fieldwork dates: Dec. 5–Dec. 16, 2011
Representative: Nationally representative of 97% of the adult population.
Design effect: 3.1

Pakistan
Sample design: Stratified area probability sample of all four provinces (excluding the Federally Administered Tribal Areas, Gilgit-Baltistan, and Azad Jammu and Kashmir for reasons of security as well as areas of instability in Khyber Pakhtunkhwa and Balochistan) proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Balochi, Hindko, Pashto, Punjabi, Sindhi, Saraiki, Urdu
Fieldwork dates: Nov. 10–Nov. 30, 2011
Representative: Nationally representative of 82% of the adult population.
Design effect: 4.7

Palestinian territories
Sample design: Stratified area probability sample of all five regions (excluding Bedouins and some communities near Israeli settlements due to military restrictions) proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Arabic
Fieldwork dates: Dec. 4, 2011–Jan. 2, 2012
Representative: Nationally representative of 95% of the adult population.
Design effect: 4.1

Russia
Sample design: Area probability sample of all 80 oblasts proportional to population. In addition, an oversample of Muslims was conducted in oblasts with a higher concentration of ethnic Muslims.
Mode: Face-to-face adults 18+
Languages: Russian
Fieldwork dates: Oct. 27–Dec. 2, 2011
Representative: Nationally representative of 99% of the adult population, with a Muslim oversample.
Design effect: 0.9

Tajikistan
Sample design: Stratified area probability sample of all four oblasts and city of Dushanbe proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Russian, Tajik
Fieldwork dates: Dec. 28, 2011–Jan. 21, 2012
Representative: Nationally representative of 99% of the adult population.
Design effect: 4.4

Thailand
Sample design: Stratified area probability sample of Muslims in the provinces of Yala, Pattani, Narathiwat, Satun and Songkhla proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Thai, Yawee
Fieldwork dates: Nov. 12, 2011–Jan. 8, 2012
Representative: Representative of adult Muslims in five southern provinces.
Design effect: 3.3

Tunisia
Sample design: Stratified area probability sample of all 24 governorates proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Arabic, French
Fieldwork dates: Nov. 10–Dec. 7, 2011
Representative: Nationally representative of the adult population.
Design effect: 1.6

Turkey
Sample design: Stratified area probability sample of all 26 regions proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Turkish
Fieldwork dates: Nov. 18–Dec. 19, 2011
Representative: Nationally representative of the adult population.
Design effect: 5.2

Uzbekistan
Sample design: Stratified area probability sample of all 14 oblasts and city of Tashkent proportional to population size and urban/rural population.
Mode: Face-to-face adults 18+
Languages: Russian, Uzbek
Fieldwork dates: Feb. 2–Feb. 12, 2012
Representative: Nationally representative of 99% of the adult population.
Design effect: 2.2
