File:From Zero to Hero - Anticipating Zero Results From Query Features, Ignoring Content.pdf
Original file (1,275 × 1,650 pixels, file size: 1.81 MB, MIME type: application/pdf, 8 pages)
Description
English: The Discovery Department uses the zero results rate -- the proportion of searches that yield zero results -- to measure the performance of our search system. However, little is known about possible patterns that affect the quantity (and quality) of results our users see. In this report, we use random forest and logistic regression models to shed light on the types of queries that tend to yield zero results.
In particular, we found that whether the query has an even number of double quotes is one of the most important indicators of whether it will yield zero results. Other notable features that affect the quantity of results include whether the query consists only of punctuation and spaces, whether it ends with a question mark (?), and whether it contains logical operators. For a full list of features and their importance and impact, please see Figures 4 and 5.
Going forward, we may want to rewrite queries to omit quotes when the original query (with double quotation marks) yielded zero results. We may also want to address question queries (those that end with ?).
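As a rough illustration of the query features named above, and of the proposed fallback of retrying a quoted query without its quotes, here is a minimal Python sketch. It is not the report's own code. The function names, the `LOGICAL_OPERATORS` set (uppercase AND, OR, NOT), and the reading of "even number of double quotes" as "at least one quote, and an even count" are all assumptions made for this example. The `search` argument is a hypothetical callable standing in for a real search backend.

```python
import string

# Assumption: logical operators are the uppercase tokens AND, OR, NOT.
LOGICAL_OPERATORS = {"AND", "OR", "NOT"}


def query_features(query: str) -> dict:
    """Compute a few content-independent features of a search query."""
    n_quotes = query.count('"')
    return {
        # Assumption: a query with no quotes at all does not count as "even".
        "has_even_double_quotes": n_quotes > 0 and n_quotes % 2 == 0,
        # True only for non-empty queries made up of punctuation and whitespace.
        "only_punctuation_and_spaces": bool(query) and all(
            ch in string.punctuation or ch.isspace() for ch in query
        ),
        "ends_with_question_mark": query.strip().endswith("?"),
        "has_logical_operators": any(
            token in LOGICAL_OPERATORS for token in query.split()
        ),
    }


def strip_double_quotes(query: str) -> str:
    """Remove double quotes and collapse the remaining whitespace."""
    return " ".join(query.replace('"', " ").split())


def search_with_fallback(query: str, search):
    """Run `search`; if a quoted query yields nothing, retry without quotes.

    `search` is a hypothetical callable that takes a query string and
    returns a list of results.
    """
    results = search(query)
    if not results and '"' in query:
        results = search(strip_double_quotes(query))
    return results
```

For example, `query_features('"cat AND dog"')` flags both the even number of double quotes and the logical operator, while `search_with_fallback` only issues a second search when the first one returned nothing and the query contained a double quote.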
Copyright in this work is either owned/co-owned by the Wikimedia Foundation or the content has been licensed to the Wikimedia Foundation. The uploader asserts that they are acting as an agent for the Wikimedia Foundation in uploading this content. In reusing this media under the specified license, please attribute the creator.
|This file is licensed under the Creative Commons Attribution-Share Alike 4.0 International license.|
™ Wikimedia Foundation, Inc.
This file is (or includes) one of the official logos or designs used by the Wikimedia Foundation or by one of its projects. Use of the Wikimedia logos and trademarks is subject to the Wikimedia trademark policy and visual identity guidelines, and may require permission.
|current||23:17, 24 May 2016||1,275 × 1,650, 8 pages (1.81 MB)||MPopov (WMF)||User created page with UploadWizard|
|Image title||Anticipating Zero Results From Query Features, Ignoring Content|
|Author||Mikhail Popov (Analysis & Report); Trey Jones (Review)|
|Short title||From Zero to Hero|
|Software used||LaTeX with hyperref package|
|Conversion program||XeTeX 0.99992|
|Page size||612 x 792 pts (letter)|
|Version of PDF format||1.5|