Multiple Regression Models On Playboy Models
In my Multivariate Statistics course, I elected to cover the section on multiple linear regression (MLR). As part of the class requirements, and by way of illustrating the technique, I applied MLR in a mock study on Playboy Playmates. This was written facetiously, but essentially the same study was published by someone else a couple of years later, and that one was admittedly a far superior piece of work.

My goal when I chose this topic was to test a few of the "models" the media spout during the election: "The taller candidate always wins," "Democrats are always elected when the economy is strong," and so on. I tried to seek out a database of such raw information (some conclusions would have been nice as well), but I couldn't find any such thing. I ran searches on several engines with several different approaches each (wild cards, etc.), but mostly I came up with out-of-date sites devoted to the most recent presidential election. Finally, I just ran a search for "candidate height weight" to try to get exactly what I was looking for. I found something even better instead.

The first site that came up was from some nut out there who had compiled in table format all of the information for the Playboy Playmates going all the way back to the original herself, Marilyn Monroe. For the earliest few years, some items were missing here and there, but for the most part I had bust, waist, and hip measurements, age, height, and weight. From these variables plus the date of publication, I could run an enormous number of statistical tests. For example, given that this is a biological sample of a highly selected population, I could check for correlations among the various aspects of figure and age, check for a relationship between height and weight, check for trends in measurements by year, etc. What follows is the report on those findings.

Multiple Regression Models On Playboy Models
Or *everything you ever wanted to know about Playboy Playmates, but never bothered to ask.

Alexplorer
Multivariate Statistics
Spring 2001

Preface

The Task: Find the story in the data.
The Tool: Multivariate regression analysis.
The Data: Well, sometimes things work out in ways you could never have expected.

Given that this was an election year, I thought, "The media have been spouting election models for the past 12 months. Let me see if I can put some of these to the test."

The World Wide Web being the vast repository of digital data it is, I tried looking for numbers concerning the heights, weights, etc. of the presidents. We've all heard these "rules" that "the taller candidate wins" and all such as that, so I had intended to model the percentage of the popular vote from them.

But that's not what the Web is for.

I found these data instead.

And here is the story they tell. Read on.

Introduction

From isolated (but frequent!) sampling of issues of Playboy Magazine, it is apparent that the overall "look" of the models featured as Playmates has changed over the years. It is my hypothesis that there is a relationship between the passage of time and the individual attributes (age, bust, waist, hips, weight, and height).

Note that pictures of models are often taken several months to years before they are ever published; thus, the date of publication of a particular model reflects the opinion of the editors of the magazine regarding the overall appeal of that model.

A second goal to be achieved through this data set is to develop a model which would:

1) Describe the influence of the aforementioned predictor variables on the year in which pictures of models were published.
2) Predict when a photo shoot of a model might conceivably be published.

Methods

Biometric information about Playboy Playmates was extracted from the web site "Playboy Playmate Data Statistics" and formatted for statistical analysis.
This information concerned the height, weight, age, bust, waist, and hips of each model (all of which were continuous variables). These data were recorded with the date of publication of each issue in which the model appeared.

Data on the web site covered the entire run of Playboy magazine (1953 to present). However, measurements of models were recorded and/or reported inconsistently in the earliest issues of Playboy. Therefore, observations prior to 1959 were excluded because these (sex) objects lacked data for one or more of the variables analyzed in this paper. As a result, of the total of 572 Playmates who have been featured in Playboy, 499 were included for analysis. Thus 87.2% of the Playmate "universe" was sampled.

The height data required some manipulation in order to perform the analyses. Specifically, measurements were initially reported in feet with remaining inches. These were transformed to total inches (e.g., 5' 10" -> 60" + 10" -> 70"). This was accomplished via the "search and replace" function in Microshaft Word.

All statistical analyses were accomplished via SAS v. 8.1.

Results

Simple Linear Regression

Prior to developing a multivariate regression model, simple linear regression (SLR) was employed to determine the relationships between the year and each of the following variables: age, bust, waist, hips, weight, and height.

Below are scatterplots of raw data of the individual variables; the regression line itself is not plotted. Note that some data points are not visible because many observations overlap at identical or nearly identical values. With each scatterplot, the SLR model is reported with its significance level and R².
For age (p < 0.0001, R² = 0.0735):
     year = 1952.95 + 1.24 (age, years)

For bust (p < 0.0001, R² = 0.1153):
     year = 2078.40 - 2.78 (bust, inches)

For waist (p < 0.0001, R² = 0.0755):
     year = 1919.24 + 2.60 (waist, inches)

For hips (p < 0.0001, R² = 0.0983):
     year = 2090.05 - 3.16 (hips, inches)

For weight (p = 0.9510, R² = 0.0000):
     year = 1980.38 - 0.004 (weight, pounds)

For height (p < 0.0001, R² = 0.1466):
     year = 1847.13 + 2.01 (height, inches)
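For readers who want to see what one of these single-predictor fits involves, here is a minimal sketch of ordinary least squares with one predictor. It is written in Python rather than the SAS used for the actual analysis, the function name `fit_slr` is my own, and the toy numbers are synthetic (generated from a known line), not Playmate data. The last lines simply plug a height of 70 inches into the height equation reported above.

```python
from typing import Sequence, Tuple


def fit_slr(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (intercept, slope, r_squared) for the least-squares line y = a + b*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary for the fit to be defined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared


# Synthetic check: points generated from y = 1 + 2x, so the fit should recover that line.
toy_x = [1.0, 2.0, 3.0]
toy_y = [3.0, 5.0, 7.0]
print(fit_slr(toy_x, toy_y))

# Using the reported height equation (coefficients taken from the text above):
predicted_year = 1847.13 + 2.01 * 70  # a 5' 10" model, i.e. 70 inches
print(round(predicted_year, 2))
```

By the height-only equation, a 70-inch model lands in late 1987, though with R² = 0.1466 that single predictor explains very little of the variation in year.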
The Correlation Matrix

A correlation matrix was generated to determine the magnitude of the relationships between the predictors. Of the 15 possible pairwise combinations of predictors, 13 were significant at an alpha level of 0.05. For statistically significant correlations between pairs of variables, R (absolute value) and its corresponding p are reported.

The Multivariate Model

Both the Stepwise Forward and Backward and the Maximum R² methods yielded the same highly significant (p < 0.0001, R² = 0.4807) model.

year = 1909.06 + 0.621*age - 1.17191*bust + 3.24521*waist - 3.21603*hips - 0.40623*weight + 2.75089*height

Both methods provided the same coefficients for all predictors. For both methods, the predictors were added in the same order and all six were used in the final model.

Beta Coefficients

These scores indicate the relative (not the absolute) contribution of each variable to the model.

Variable    Beta Coefficient
age         0.13539
bust        0.14323
waist       0.34238
hips        0.31868
weight      0.30739
height      0.52400

Discussion

One unexpected finding was the complete absence of a significant change (p = 0.9510) in the weight of the Playmates over the history of the publication. This can be explained by the highly statistically significant (p < 0.0001) opposing influences of an increase in height and waistline with declining bust and hip measurements. In effect they grew taller and less shapely over the years.

The relatively low predictive value of the model (R² = 0.4807) was likely due to several factors.

Naturally, the development of the model was limited to the variables for which data were collected. Other predictor variables which might have influenced the likelihood of publication of photos of a particular model include race/ethnicity/skin tone, hair length and color, and facial features such as eye color, shape of lips, etc.

Still other variables are hard to quantify.
For example, from time to time Playboy has published pictorials of some women because of a popularity outside of their appeal strictly as nude models, such as actresses and musicians.

Other potential predictors might also include those which were derived from other absolute variables. For example, in dealing with the topic of feminine beauty, absolutes are often of less concern than are proportions. Therefore, better indicators of the likelihood of publication might be those which were derived from ratios of height:weight and/or bust:waist:hips and/or other combinations. Several of these alternate models were attempted via the Maximum R² method with the following results:

When height and weight were replaced with htwt (ratio of ht:wt) the following highly significant model (p < 0.0001, R² = 0.3678) was produced:

year = 2045.63 + 0.888*age - 1.433*bust + 4.397*waist - 3.145*hips - 15.731*htwt

When bust, waist, and hips were replaced with bwh (ratio of bust:waist:hips) the following highly significant model (p < 0.0001, R² = 0.3408) was produced:

year = 1827.64 + 0.758*age - 11058*bwh - 0.649*weight + 3.541*height

When bust, waist, and hips were replaced with bw (ratio of bust:waist) and bh (ratio of bust:hips) the following highly significant model (p < 0.0001, R² = 0.4570) was produced:

year = 1856.36 + 0.643*age - 54.809*bw - 62.036*bh - 0.657*weight + 3.106*height

When height and weight were replaced with htwt (ratio of ht:wt) and bust, waist, and hips were replaced with bwh (ratio of bust:waist:hips) the following highly significant model (p < 0.0001, R² = 0.1797) was produced:

year = 2040.03 + 1.107*age - 21.303*bwh - 21990*htwt

Note that the R² for all of these manipulations was less than that of the original model which included all of the variables. It is therefore reasonable to conclude that all the individual predictors are indeed relevant in determining whether or not a pictorial will be published.
However, these factors are only a portion of the predictors.

References

Kachigan, Sam Kash. 1991. Multivariate Statistical Analysis: A Conceptual Introduction. Radius Press, New York.

Playboy Playmate Data Statistics By Year. 2001. http://members.home.net/sedford/pmdatamain.html (now defunct)

StatSoft, Inc. 2001. http://www.statsoft.com/textbook/stathome.html

Appendix: SAS Program and Data Set

The raw data set, SAS program, and SAS output for the analysis can be found on this page.
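For anyone without SAS handy, the two bits of arithmetic in the paper that are easiest to reproduce are the height conversion from Methods and the evaluation of the full six-predictor model. The sketch below is Python, not the author's code: the conversion was actually done by search and replace in a word processor, so the regular expression here is a stand-in technique, and the names `height_to_inches`, `predict_year`, and `FULL_MODEL` are my own. The coefficients are copied from the multivariate model reported in the Results; the example measurements are hypothetical and do not describe any real Playmate.

```python
import re

# Coefficients of the full model exactly as reported in the Results section.
FULL_MODEL = {
    "intercept": 1909.06,
    "age": 0.621,        # years
    "bust": -1.17191,    # inches
    "waist": 3.24521,    # inches
    "hips": -3.21603,    # inches
    "weight": -0.40623,  # pounds
    "height": 2.75089,   # inches
}


def height_to_inches(text: str) -> int:
    """Convert a feet-and-inches string such as 5' 10" into total inches."""
    match = re.fullmatch(r"\s*(\d+)'\s*(\d+)\"\s*", text)
    if match is None:
        raise ValueError(f"unrecognized height format: {text!r}")
    feet, inches = int(match.group(1)), int(match.group(2))
    return feet * 12 + inches


def predict_year(measurements: dict) -> float:
    """Evaluate the reported linear model for one set of measurements."""
    year = FULL_MODEL["intercept"]
    for name, coefficient in FULL_MODEL.items():
        if name != "intercept":
            year += coefficient * measurements[name]
    return year


# Hypothetical measurements, chosen only to show the arithmetic.
example = {
    "age": 22,
    "bust": 35,
    "waist": 23,
    "hips": 34,
    "weight": 115,
    "height": height_to_inches("5' 6\""),
}
print(height_to_inches("5' 10\""))
print(round(predict_year(example), 2))
```

Keep in mind that the model's R² of 0.4807 means a prediction like this carries a wide margin of error; it is an illustration of how the equation is applied, not a forecast.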