File talk:Trump vs. Clinton nationwide.svg

From Wikimedia Commons, the free media repository

Code not running

I tried to run the code this morning, and it seems 131 of the records in 'df' have an NA Date. Any ideas? Btyner (talk) 14:43, 31 July 2016 (UTC)

I changed the presentation on the polling page. I'll fix the code shortly. Abjiklɐm (tɐlk) 01:01, 1 August 2016 (UTC)
The code is now fixed. Abjiklɐm (tɐlk) 19:12, 1 August 2016 (UTC)
Took a closer look at the code, and in the call to ggplot() it is doing: weight=1/Error. If this is the weight for smoothing, then I don't think 1/Error is correct; I think the weight for smoothing should be proportional to Size instead. Btyner (talk) 12:02, 1 August 2016 (UTC)
True, but Error is defined as 1/sqrt(Size) so essentially Weight=sqrt(Size). Abjiklɐm (tɐlk) 18:58, 1 August 2016 (UTC)
Right...what I'm saying is, the sqrt should not be used. Btyner (talk) 22:32, 2 August 2016 (UTC)
Ok. I'm not an expert but I thought I had read somewhere that the square root was better for weighting. I could be completely wrong though. Do you have some resources on the topic? I've looked online again but couldn't find an answer. Abjiklɐm (tɐlk) 15:16, 3 August 2016 (UTC)
In the case of loess, the source code says

weights: vector of weights to be given to individual observations in the sum of squared residuals that forms the local fitting criterion. By default, an unweighted fit is carried out. If supplied, weights should be a non-negative numeric vector. If the different observations have non-equal variances, weights should be inversely proportional to the variances.

and I would imagine the same is true for gam. By the way, if you know how to make that indent properly above without resorting to a bunch of colons, I would be interested to learn. Btyner (talk) 10:59, 4 August 2016 (UTC)
That's helpful! I think I may have mixed up margin of error with variance in this case. From this page I saw that the margin of error is inversely proportional to $\sqrt{n}$ and I just went with that. I don't think I have the variance for every observation, but I would assume they are not equal, and I'm not sure variance is proportional to n. What do you think? (by the way, for the indent, I've only ever seen a bunch of colons ;) ) Abjiklɐm (tɐlk) 13:04, 4 August 2016 (UTC)
For the two-candidate case, assuming the sampling scheme was reasonable (a big assumption given the prevalence of cell phones, but I digress) the variance is equal to $p(1-p)/n$, where $p$ is the unknown probability being estimated by the survey. So yeah, just go with $n$ for the weights. Regards Btyner (talk) 22:04, 4 August 2016 (UTC)
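
To make the weighting argument above concrete, here is a minimal Python sketch (not the R code used for the chart). It assumes each poll is a simple random sample with variance p(1-p)/n, and it uses made-up sample sizes and p = 0.5 purely for illustration. The helper names `poll_variance` and `weighted_mean_variance` are hypothetical. It compares the variance of a weighted average of polls under equal weights, sqrt(n) weights (what 1/Error amounts to when Error = 1/sqrt(Size)), and n weights (inverse variance, up to a constant).

```python
# Hypothetical poll sample sizes, chosen only for illustration.
sizes = [400, 900, 1600, 2500]
p = 0.5  # assumed true support; any fixed p gives the same ranking


def poll_variance(n, p):
    """Sampling variance of a proportion from a simple random sample of size n."""
    return p * (1 - p) / n


def weighted_mean_variance(weights, variances):
    """Variance of sum(w_i * x_i) / sum(w_i) for independent estimates x_i."""
    total = sum(weights)
    return sum((w / total) ** 2 * v for w, v in zip(weights, variances))


variances = [poll_variance(n, p) for n in sizes]

schemes = {
    "equal": [1.0 for _ in sizes],
    "sqrt(n)": [n ** 0.5 for n in sizes],  # equivalent to 1/Error with Error = 1/sqrt(Size)
    "n": [float(n) for n in sizes],        # inversely proportional to the variance
}

results = {name: weighted_mean_variance(w, variances) for name, w in schemes.items()}

for name, var in results.items():
    print(f"{name:>8}: variance of weighted mean = {var:.3e}")
```

Under these assumptions the n weights give the smallest variance of the three schemes, with sqrt(n) in between and equal weights the largest, which is consistent with the loess guidance that weights should be inversely proportional to the variances. A smoother such as loess or gam is more than a single weighted mean, so this only illustrates the principle rather than reproducing the chart.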

Is there any basis for the particular trend curve methodology and span in reliable sources?

See the discussion here. Hugetim (talk) 20:48, 5 August 2016 (UTC)