Commons talk:Picture of the Year/2019/Committee


Let's get this going

@Christian Ferrer, -revi, Steinsplitter, and Moheen:

It's 2020 now :) Would you mind adding yourself to Commons:Picture_of_the_Year/2019/Committee/User so people know us?

Unfortunately I've still got a lot of IRL stuff going on, but I will get all the technical stuff like gadgets and modules ready in the next few weeks.

Start date: In Commons_talk:Picture_of_the_Year/Archive_1#POTY_date I think we agreed to do March and April. So how about March for R1 and April for R2, and make the start times close to the starts of the months and two weeks in between?

Categorization: Christian Ferrer, Moheen, would you mind handling Commons:Picture_of_the_Year/2019/Candidates?

It's fine for me. ~Moheen (keep talking) 18:17, 24 January 2020 (UTC)[reply]

Central notice: Steinsplitter, revi, would you mind handling that?

--Zhuyifei1999 (talk) 18:04, 24 January 2020 (UTC)[reply]

How about R1 March 8 - March 22, R2 April 5 - April 19? --Zhuyifei1999 (talk) 06:23, 1 February 2020 (UTC)[reply]

CN

Categories

I just went to Commons:Picture_of_the_Year/2019/Candidates, IMO the categories are fine now. Christian Ferrer (talk) 12:23, 6 February 2020 (UTC)[reply]

FYI

Commons:Village_pump/Technical#Looking_for_someone_to_take_my_part_in_POTY --Zhuyifei1999 (talk) 22:01, 22 February 2020 (UTC)[reply]

Cryptography proposal

The problem is:

Because we have to be transparent, do eligibility checking, and perform the entire voting process on-wiki, we have to release who voted for what after each round. We cannot simply compute the tally with homomorphic encryption, because that does not allow decrypting who each voter voted for.

I'm thinking of using asymmetric encryption with this scheme:

Instead of vote pages like Commons:Picture_of_the_Year/2019/R1/v/Foobar.png, we have pages like Commons:Picture_of_the_Year/2019/R1/v/AB where AB is the first two characters in the hex payload (explained later)

Each vote page contains entries of the form:

* [[User:Username|Username]] ABCDEF1234567890
  • [[User:Username|Username]] exists for abuse filter purposes: you can only delete the entries you created.
  • ABCDEF1234567890 is the entire hex-encoded payload, consisting of:
    • an asymmetrically encrypted random symmetric encryption key -- I'm thinking of using RSA here
      The purpose of this is en:IND-CPA
    • a symmetrically encrypted message, containing the voter and candidate -- I'm thinking of using AES here
      Ignore the entry if the voter is not the given plaintext username -- prevents someone from replaying someone else's vote
      Ignore the entry if candidate is invalid
      Other checks like eligibility, double voting, and more-than-three-votes-during-R2 still apply
    • a checksum of the preceding payload -- I'm thinking of using CRC here
      The purpose is just to prevent absolute junk from being considered at all, so no need for a cryptographically secure hash function
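
The entry format above could be framed roughly as follows. This is only a sketch of the payload layout and the junk filter, not an implementation of the proposal: the encryption itself is left out (the two byte strings stand in for the RSA-wrapped key and the AES ciphertext), and the fixed 256-byte key block, the choice of CRC-32, and all function names are assumptions made for illustration.

```python
import binascii
import zlib

# Assumption: the RSA-wrapped key has a fixed length (256 bytes for a 2048-bit key).
WRAPPED_KEY_LEN = 256
CRC_LEN = 4  # assumption: CRC-32, stored big-endian


def build_payload(wrapped_key: bytes, ciphertext: bytes) -> str:
    """Concatenate wrapped key and ciphertext, append a CRC-32, hex-encode."""
    if len(wrapped_key) != WRAPPED_KEY_LEN:
        raise ValueError("unexpected wrapped key length")
    body = wrapped_key + ciphertext
    crc = zlib.crc32(body).to_bytes(CRC_LEN, "big")
    return binascii.hexlify(body + crc).decode("ascii").upper()


def parse_payload(hex_payload: str):
    """Return (wrapped_key, ciphertext), or None if the entry is junk."""
    try:
        raw = binascii.unhexlify(hex_payload)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None
    if len(raw) <= WRAPPED_KEY_LEN + CRC_LEN:
        return None
    body, crc = raw[:-CRC_LEN], raw[-CRC_LEN:]
    if zlib.crc32(body).to_bytes(CRC_LEN, "big") != crc:
        return None
    return body[:WRAPPED_KEY_LEN], body[WRAPPED_KEY_LEN:]


def accept_vote(entry_username: str, voter: str, candidate: str,
                valid_candidates: set) -> bool:
    """Checks applied after decryption; eligibility and double voting are not covered."""
    if voter != entry_username:
        return False  # someone replayed another user's ciphertext
    return candidate in valid_candidates
```

The checksum only filters out entries that are not payloads at all; it provides no authenticity, which matches the stated intent of not needing a cryptographically secure hash.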

We publish the asymmetric encryption public key to the JS before the round, and publish the secret key along with the vote counting script output after each round so that results can be verified. Each round will use a new key pair for asymmetric encryption.

The JS will first attempt to use the Web Crypto API, and if that is not there, we polyfill it. For non-JS users, we provide a Toolforge tool to do the crypto: given a username and a candidate to vote for, it generates the line to add to the vote page and the page name.
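
As a rough illustration of the last step, here is how such a tool might derive the page name and the entry line once the hex payload exists. The function name and the validation are hypothetical; only the page prefix and the line format are taken from the description above.

```python
PAGE_PREFIX = "Commons:Picture_of_the_Year/2019/R1/v/"
HEX_DIGITS = set("0123456789ABCDEF")


def vote_entry(username: str, hex_payload: str):
    """Return (page_title, wikitext_line) for one encrypted vote."""
    payload = hex_payload.upper()
    if len(payload) < 2 or not set(payload) <= HEX_DIGITS:
        raise ValueError("payload must be a hex string")
    # The vote page is chosen by the first two hex characters of the payload.
    page_title = PAGE_PREFIX + payload[:2]
    line = "* [[User:{0}|{0}]] {1}".format(username, payload)
    return page_title, line
```

With the example values used above, `vote_entry("Username", "ABCDEF1234567890")` yields the page `Commons:Picture_of_the_Year/2019/R1/v/AB` and the line `* [[User:Username|Username]] ABCDEF1234567890`.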

How does this sound? Input welcome. CC @Eatcha: also. --Zhuyifei1999 (talk) 16:56, 12 March 2020 (UTC)[reply]

I'll also note that I have no time to implement it for this year so this proposal is for next POTY. --Zhuyifei1999 (talk) 16:59, 12 March 2020 (UTC)[reply]

Implementation will be tested on beta first prior to attempting here. --Zhuyifei1999 (talk) 17:31, 12 March 2020 (UTC)[reply]

Thinking about this, there's a significant issue: until phab:T128602 becomes a thing, if one changes browsers, the gadget forgets what they have voted for and cannot unvote. This cannot be restored easily without the unpublished secret key. Perhaps have a toolforge tool to tell what images a user has voted for after they themselves authenticate via oauth? --Zhuyifei1999 (talk) 01:30, 13 March 2020 (UTC)[reply]

Sounds good to me. --Steinsplitter (talk) 17:04, 12 March 2020 (UTC)[reply]
Outside my skills. Christian Ferrer (talk) 17:29, 12 March 2020 (UTC)[reply]
IMO the benefits of encryption do not outweigh the costs. Concerns about privacy can be addressed by making it clearer that votes are public. As for issues with outcomes influencing subsequent votes, I think the current/tentative compromise seems reasonable: keep a heavily qualified version in userspace that's not advertised in any way. I think the number of votes it would influence would be exceptionally small. On the other hand, I'm probably not the only one who prefers to err on the side of transparency and visibility except when there's potential personal harm (as with the political nature of some arbcom elections or WMF board elections). $0.02 — Rhododendrites talk17:59, 12 March 2020 (UTC)[reply]
Special:Diff/403647810 was not addressed. The page will be known no matter whether you advertise it or not. Regarding transparency and visibility: I will publish the plaintext of every valid ciphertext with the vote counting script. --Zhuyifei1999 (talk) 18:09, 12 March 2020 (UTC)[reply]
I don’t entirely understand how these steps prevent both issues, but the question is: if we go for privacy (I’d also rather support openness, but in case privacy wins), do the votes really have to be stored as wiki pages? Shouldn’t we rather use either Special:SecurePoll (if it fits our needs), or entirely a Toolforge tool (which can store data privately until the end of the round, and open source can prove the results haven’t been tampered with)? —Tacsipacsi (talk) 18:05, 12 March 2020 (UTC)[reply]
Openness wins IMO. Votes should not be known during the round, but should be known after the round. --Zhuyifei1999 (talk) 18:11, 12 March 2020 (UTC)[reply]
No, openness definitely loses with any change compared to the current situation. (I was speaking about current vs change, not change A vs change B, so, to make clear, I support keeping the current situation.) I’m sure a custom web application can be as open as the proposed cryptographic solution, although it may be more difficult to create that web application. —Tacsipacsi (talk) 23:22, 12 March 2020 (UTC)[reply]
The current situation does not work. Openness during a round and preventing someone from analyzing and publishing the vote statistics are mutually exclusive. So no that is a no-go. --Zhuyifei1999 (talk) 00:17, 13 March 2020 (UTC)[reply]
A custom web application is more difficult to trust than cryptography. The web application cannot prove the results have not been tampered with without crypto. I did ask a cryptographer about this issue and he recommended Helios, but it is homomorphic encryption and does not suit our use case perfectly. --Zhuyifei1999 (talk) 00:23, 13 March 2020 (UTC)[reply]
Regarding SecurePoll: this seems to have been discussed before, and the extension lacks some features we absolutely require. Commons_talk:Picture_of_the_Year/2009#Votine_system_:_about_SecurePoll Commons:Picture_of_the_Year/2009/Preparation#Final_-_using_SecurePoll? --Zhuyifei1999 (talk) 18:35, 12 March 2020 (UTC)[reply]
  • I do not want to comment or have any opinion. I will also try to abstain from discussions related to real-time results or POTY unless pinged. I am also on a wikibreak right now. // Eatcha (talk) 19:00, 12 March 2020 (UTC)[reply]

I'm a bit late to this discussion, after being pinged below. I didn't even notice POTY was running till I missed round 1. Anyway, here's my 2p on all this concern about encryption and public votes, as someone who (a) Won POTY in 2016 and (b) set up Photo Challenge. Firstly, both are popularity contests and the judges are not experts. My Tower of London Ravens is not my best photo, but it is amusing, educationally useful and clearly a lot of people enjoyed it. There's no prize and the only people who mention I won an international photo competition are my family when they are being ironic about my lack of fame and fortune :-). I think the main benefit to Commons is that a lot of Wikipedians get to see the best photos Commons has recently collected and promoted to FP and appreciate all the work everyone does here to take or upload great free educational images for the projects. No photo competition is guaranteed to select the best photos as winners; all you can hope for is that the winners are among the best.

I know from Featured Picture Candidates that previous votes can influence subsequent votes. So on Photo Challenge I hid the votes and asked people not to look until they had decided their top 3. It is purely a request and there's nothing stopping anyone looking at the previous votes to sway theirs. All I hope is that it is a bit more representative of individual votes than if you were faced with a long page of clear supports against certain photos.

So, let's not overthink this. There are so many photos to review for POTY that most folk wouldn't have the time to investigate existing votes. I agree that publishing a real-time list of the top entries is not a good idea but don't think we need to engineer a solution to make it impossible. -- Colin (talk) 17:05, 26 March 2020 (UTC)[reply]

R1 end

I messed up: I was 1.5 hours late re-adding the vote pages to titleblacklist. Do we count the late votes, or not? I can do it either way. @Christian Ferrer, Steinsplitter, and -revi: --Zhuyifei1999 (talk) 01:39, 23 March 2020 (UTC)[reply]

  • I would say 2 weeks + 1.5 hours is not such a big issue. We don't have to be perfect, and neither do you. On the contrary, I thank you for all your work. IMO we count the last votes. Christian Ferrer (talk) 05:53, 23 March 2020 (UTC)[reply]
    Ok. I did both just to check if there are any differences: without cut-off, with cut-off, diff. It looks like it is just a matter of whether the ranks are tied.
    There's another hiccup. Ignoring line: # [[User:Ammad 2019 (^ ^)|Ammad 2019]] refers to Special:Diff/404602118. I think the user used pipe completion # [[User:{{subst:REVISIONUSER}}|]] like the filter warning suggested, but pipe completion strips the (^ ^) part. My script doesn't recognize this. However, that user is ineligible anyways.
    Other than that, please check for issues. --Zhuyifei1999 (talk) 06:35, 23 March 2020 (UTC)[reply]
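
The pipe-completion hiccup above could be handled along these lines. This is a hedged sketch, not the actual poty-scripts parser: it assumes the parser compares the link label with the link target, and it only approximates the one pipe-trick rule relevant here (dropping a trailing parenthetical). The regexes and function names are made up for illustration.

```python
import re

# "# [[User:<target>|<label>]]" at the start of a vote line.
VOTE_LINE = re.compile(r"^#\s*\[\[User:(?P<target>[^|\]]+)\|(?P<label>[^\]]*)\]\]")
TRAILING_PAREN = re.compile(r"\s*\([^()]*\)$")


def pipe_trick_label(target: str) -> str:
    """Approximate the label pipe completion derives from a user name."""
    return TRAILING_PAREN.sub("", target)


def parse_voter(line: str):
    """Return the voter's user name, or None if the line is not a valid vote."""
    m = VOTE_LINE.match(line.strip())
    if m is None:
        return None
    target = m.group("target").strip()
    label = m.group("label").strip()
    # Accept both the explicit form and the pipe-completed form.
    if label in (target, pipe_trick_label(target)):
        return target
    return None
```

Under these assumptions, the ignored line `# [[User:Ammad 2019 (^ ^)|Ammad 2019]]` would parse to the voter `Ammad 2019 (^ ^)`; eligibility would still be checked separately.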
I don't understand: this candidate has no vote on 23 March but is listed in User:Zhuyifei1999/poty/potyvotesR1.py/2019/stricttime/diff Christian Ferrer (talk) 06:55, 23 March 2020 (UTC)[reply]
In the strict version it ties with Panoramaweg tussen Waltensburg-Vuorz en Breil-Brigels (actm) 10.jpg, both at 274 votes, ending at #16. In the non-strict version "Panoramaweg" gets an extra vote at 275 votes so "Panoramaweg" is #16 and "Jukung" is #17. --Zhuyifei1999 (talk) 07:55, 23 March 2020 (UTC)[reply]
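
To make the tie explicit, here is a minimal ranking sketch. It assumes standard competition ranking (equal counts share a rank), which is consistent with the #16/#17 description above but is not confirmed to be what the counting script does; the function name is hypothetical and only the two candidates mentioned are included, so the printed ranks are relative to each other rather than the real #16/#17.

```python
def competition_ranks(counts: dict) -> dict:
    """Rank by vote count, descending; equal counts share the same rank."""
    ranks = {}
    ordered = sorted(counts.items(), key=lambda kv: -kv[1])
    prev_votes, prev_rank = None, 0
    for position, (name, votes) in enumerate(ordered, start=1):
        rank = prev_rank if votes == prev_votes else position
        ranks[name] = rank
        prev_votes, prev_rank = votes, rank
    return ranks


strict = {"Panoramaweg": 274, "Jukung": 274}      # with cut-off
non_strict = {"Panoramaweg": 275, "Jukung": 274}  # late vote counted

print(competition_ranks(strict))      # both share rank 1
print(competition_ranks(non_strict))  # Panoramaweg 1, Jukung 2
```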
Ok understood, the rest of the scripts seems fine. Christian Ferrer (talk) 08:14, 23 March 2020 (UTC)[reply]
My opinion is that rule is a rule and there's a time limit listed in the rule page, so it should be discounted, but if majority of us think it should be fine, I don't mind either way. — regards, Revi 08:45, 23 March 2020 (UTC)[reply]
Both solutions are fine for me. Christian Ferrer (talk) 09:22, 23 March 2020 (UTC)[reply]
While we have set a deadline, !votes added later should be struck. But agree with Revi and Christian. :-) --Steinsplitter (talk) 12:49, 23 March 2020 (UTC)[reply]
@Zhuyifei1999: yes ok, do you need help? Christian Ferrer (talk) 21:26, 23 March 2020 (UTC)[reply]
No worries. I got it. Code:
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~/poty-scripts $ git diff
diff --git a/poty/eligibility/candidates.py b/poty/eligibility/candidates.py
index 86d8d25..4d52f08 100644
--- a/poty/eligibility/candidates.py
+++ b/poty/eligibility/candidates.py
@@ -145,7 +145,7 @@ class VoteTally(object):
 
     def process_candidates(self, year, candidates):
         candidates = list(candidates)
-        voters = concurrent_map(functools.partial(get_voters, self, year),
+        voters = map(functools.partial(get_voters, self, year),
                                 candidates)
 
         # not using MultiDicts here because we can neither remove a pair easily
diff --git a/poty/parsers/votepage.py b/poty/parsers/votepage.py
index 61d500f..39913b0 100644
--- a/poty/parsers/votepage.py
+++ b/poty/parsers/votepage.py
@@ -6,6 +6,8 @@ from __future__ import absolute_import, unicode_literals, print_function
 import logging
 import re
 
+import pywikibot
+
 from poty.eligibility.voter import get_voter
 from poty.utils.misc import warn_lineignore
 
@@ -17,7 +19,20 @@ def get_voters(votetally, year, candidate):
     votepage = year.subpage(votetally.page.format(c=candidate.nons_title))
     voters = set()
 
-    for line in votepage.text.split('\n'):
+    for rev in votepage.revisions():
+        if rev.timestamp < pywikibot.Timestamp(2020, 3, 23, 0, 0, 0):
+            break
+    else:
+        assert False
+
+    text = votepage.getOldVersion(rev.revid)
+    if votepage.text != text:
+        votepage.text = text
+        votepage.save(f'Revert to old revision before R1 end, [[Special:Permalink/{rev.revid}]]')
+
+    return voters
+
+    for line in votepage.getOldVersion(rev.revid).split('\n'):
         line = line.strip()
 
         if not line:
@@ -47,5 +62,3 @@ def get_voters(votetally, year, candidate):
                 ))
 
         voters.add(voter)
-
-    return voters
--Zhuyifei1999 (talk) 23:37, 23 March 2020 (UTC)[reply]

Issues with the current system I didn't get to address

  • The R1 vote pages should be shuffled in Lua for those who don't have JS. Currently they are FP-time-sorted, which can create a minor bias. If it only gets shuffled every purge I don't think that is too much of a concern.
  • The R2 set handling is super hacky. I simply transclude all the sets in Commons:Picture_of_the_Year/2019/R2/Gallery. I think this should be preprocessed in Lua to filter out any set candidate that isn't in R2.

@Jarekt: Would you like to help with the above?

  • The cryptography... I honestly don't know a common-er that has sufficient experience in crypto. I personally know a cryptographer IRL who I was thinking of asking for help when I implement it. I guess this is moot now
  • It should be documented in /Help somehow that the votes are public...
  • @revi, ping me on IRC if you have trouble running the POTY scripts, I'll try my best to make sure R2 goes as smoothly as if I were still here :)

Anyways, it was a nice experience organizing POTY with you. Wish you the best of luck. --Zhuyifei1999 (talk) 05:54, 26 March 2020 (UTC)[reply]

Zhuyifei1999 I can look at this this evening as I am not familiar enough with POTY to understand what is needed. I have not processed any wikipages with Lua code, as I mostly process parameters passed from templates. Some of the code for processing Commons:Photo challenge was written by user:Colin in C#, see Commons:Photo challenge/code/CreateVoting.cs and Commons:Photo challenge/code/voting.cs. Maybe we need something similar here. --Jarekt (talk) 15:14, 26 March 2020 (UTC)[reply]
The pages should preferably be generated on the fly from candidate lists like Commons:Picture_of_the_Year/2019/Candidates/warnbig, Commons:Picture_of_the_Year/2019/Candidates/R2, Commons:Picture_of_the_Year/2019/Candidates/Sets, so using a bot script to process them is mostly a no-go. You can do it, it's just that we'd have multiple sources of truth, and if something happens to a candidate (like disqualifying videos this year) there will be more places to change it. Module:POTY and Module:POTY/parser currently parse them and do stuff with them. --Zhuyifei1999 (talk) 16:42, 26 March 2020 (UTC)[reply]
I don't really see how I can help out. I always wished Photo Challenge had as good a UI as POTY. This is beyond my capabilities. Perhaps there are folk on Wikipedia who can help. Btw, the "shuffled for those who don't have JS" sounds a bit like overthinking again. This is a community vote among normal folk, not nerds using some locked down Linux browser with JavaScript disabled. But perhaps I misunderstand. -- Colin (talk) 17:10, 26 March 2020 (UTC)[reply]
There are plenty of people attempting to vote without the POTY JS. Most of them are mobile users. And they cause most of the ineligible votes. --Zhuyifei1999 (talk) 17:13, 26 March 2020 (UTC)[reply]
What about rather making the JavaScript mobile-friendly? It would improve the UX far more than shuffling in Lua. (By the way, if we go the Lua way anyway, maybe the shuffling can be removed from the gadget—AFAIK the cache isn’t much used for signed-in users, so nearly everybody would get different order even without JS. And even if the order changes only every day, it’s fair enough for all authors.) —Tacsipacsi (talk) 01:09, 27 March 2020 (UTC)[reply]
Making it mobile-friendly would definitely be a better solution for the long term, I agree. However, it would likely be a massive undertaking.
How the POTY JS gadget shuffling actually works seems to be in MediaWiki:Gadget-EnhancedPOTY.js, shuffleElements, seeded with the username. IIRC the seed's purpose is to make the page not shuffle itself when a user revisits the page. "if the order changes only every day" is not perfect but might be as much as we can do. POTY votes per day are very skewed due to CN, so I think there wouldn't be much of an issue keeping the JS shuffle. --Zhuyifei1999 (talk) 07:01, 27 March 2020 (UTC)[reply]
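
The idea of a seeded shuffle can be sketched as follows. This is an illustration in Python, not the gadget's shuffleElements code nor a Lua module; the function names are hypothetical. It shows why a username seed keeps the order stable across revisits, and how a date seed would give the "changes only every day" behaviour for users without JS.

```python
import random
from datetime import date


def shuffled(candidates, seed: str):
    """Return a new list in an order determined only by the seed."""
    rng = random.Random(seed)  # same seed, same sequence of random numbers
    result = list(candidates)
    rng.shuffle(result)
    return result


def daily_seed(day: date) -> str:
    """Seed that changes once per day, e.g. for a server-side shuffle."""
    return day.isoformat()


files = ["A.jpg", "B.jpg", "C.jpg", "D.jpg", "E.jpg"]
per_user = shuffled(files, "Username")                    # stable for this user
per_day = shuffled(files, daily_seed(date(2020, 3, 27)))  # stable for this day
```

Every voter still sees every candidate; only the order differs, which is what spreads out any position bias.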
Mobile-friendliness is mainly about screen size and lack of mouse, isn’t it? As far as I remember (although I didn’t vote enough to remember clearly :P), neither of these is really a concern for the vote tools. Much of the styling currently stored in MediaWiki namespace should be moved to TemplateStyles for it to be available even without CSS (for example the black-yellow menu is currently completely hidden without JS, which is certainly not a great UX), and be made mobile-friendly. This is the more difficult task; the remaining part (the really JavaScript-dependant things) seem for me pretty easy to make mobile-friendly. Is there a place to test things, apart from a local MediaWiki install? Testing would require access to both rounds and all intermediate time slots at once, interface administrator access etc. —Tacsipacsi (talk) 23:59, 27 March 2020 (UTC)[reply]
I guess in theory you can set it up on beta Commons. I don't think we have a replica there though. I can get you the access you need to the tools on beta if you want to work on this. --Zhuyifei1999 (talk) 00:13, 28 March 2020 (UTC)[reply]

Licensing Issues

Hello, the winner is File:Mud Cow Racing - Pacu Jawi - West Sumatra, Indonesia.jpg, but the author changed the license to NC. So I am sure he will not be amused to see this as the winning photo. Maybe someone (at the WMF? @Ed Erhart (WMF): did the blog post the last few times) can reach out to the photographer before we publish the official results. --Steinsplitter (talk) 08:30, 25 April 2020 (UTC)[reply]

CC: @Christian Ferrer and -revi: , CC also Communications Strategist @GVarnum-WMF: . --Steinsplitter (talk) 08:31, 25 April 2020 (UTC)[reply]
I don't think so. Let me explain: every year we have a lot of FPs that come from Flickr, a number of which have changed licenses. Does that mean that before starting the next contests we have to check whether the license has been changed? IMO a competition is fair when all candidates are equal, and I mean fair towards the voters too. What do we do if you contact the photographer and he answers, "I don't want this image to be the winner"? Do we cancel this candidate? Disrespect everyone who voted? That is not a concern of the contest IMO. If one day the community decides to delete it, then the second will be promoted to first, etc... but otherwise... Christian Ferrer (talk) 10:57, 25 April 2020 (UTC)[reply]
Either it is deleted or it is a candidate like the others. And a deletion request is a better place to discuss that than a contest talk page. I mean, I don't see how we can keep the file and at the same time cancel the votes of the majority. Therefore it is not up to us to decide the result of the POTY on the basis of a change of license; these are two different things IMO. Christian Ferrer (talk) 10:59, 25 April 2020 (UTC)[reply]
Well..., but in the last years we always created a blogpost, congratulated the user. So - when we or WMF reaches out to him - maybe the license can be adjusted. We won't strike it, this has never been discussed or proposed. No worry. Best --Steinsplitter (talk) 11:19, 25 April 2020 (UTC)[reply]
Of course there are no worries!! :) Christian Ferrer (talk) 11:27, 25 April 2020 (UTC)[reply]
(Via email:) GVarnum told us they have no immediate plan to reach out to the photographer. WMF Legal is now CCed for possible license dispute. — regards, Revi 08:13, 28 April 2020 (UTC)[reply]
PS: My personal opinion is: "While this is a non-ideal situation that {{Flickr-change-of-license}} has been chosen as a winner, we at the POTY Committee has no authority to disqualify a winner just because it is flickr-change-of-license." So if WMF Legal has nothing to say about this or they think it's fine, we should proceed. — regards, Revi 08:16, 28 April 2020 (UTC)[reply]

Results published

@Christian Ferrer and -revi: Results published. Feel free to make changes to the style etc. if necessary. --Steinsplitter (talk) 13:45, 1 May 2020 (UTC)[reply]

Thanks!

@Christian Ferrer, -revi, and Steinsplitter: Thanks for your hard work on POTY. I always look forward to it each year. Being a leader can sometimes be a thankless job, but I hope this reverses that a little. Glennfcowan (talk) 05:10, 3 May 2020 (UTC)[reply]