File:Query Features and Search Performance.pdf
From Wikimedia Commons, the free media repository
Original file (1,275 × 1,650 pixels, file size: 3.37 MB, MIME type: application/pdf, 12 pages)
Summary
Description | Query Features and Search Performance.pdf |
English: Zero results rate (ZRR), the proportion of searches that yield zero results, is a metric for measuring the performance of our search system. In May 2016, we analyzed zero results rate and query features using random forest and logistic regression models. That analysis identified question marks as the most important predictor of whether a query will yield zero results, which led us to strip question marks from queries. With this analysis, we wanted to see which features rise to the top now that the question mark has been eliminated. We also joined search satisfaction event logging data with our Cirrus search logs to investigate the relationship between query features and two other search performance metrics: clickthrough rate and PaulScore. We used random forests and generalized linear models with an elastic net penalty to examine these relationships. For ZRR, we found that whether the query has an even number of double quotes and whether it consists only of punctuation and spaces are more important than other features when predicting zero results. For clickthrough rate and PaulScore, we found that query features have very little predictive power. |
Date | |
Source | Own work |
Author | CXie (WMF) |
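The description above defines ZRR and names two query features that mattered most for predicting zero results. As a rough illustration only, the following Python sketch computes ZRR from a list of per-search result counts and derives those two features from a query string. The function names, the rule that a query with no double quotes does not count as having an "even number", and the use of ASCII punctuation are all assumptions made for this example; the report's actual feature extraction may differ.

```python
import string


def zero_results_rate(result_counts):
    """Proportion of searches that returned zero results.

    `result_counts` is a hypothetical input: one integer per search,
    giving the number of results that search returned.
    """
    counts = list(result_counts)
    if not counts:
        raise ValueError("at least one search is required")
    return sum(1 for n in counts if n == 0) / len(counts)


def has_even_double_quotes(query):
    """True when the query contains a nonzero, even number of double quotes.

    Assumption: a query without any double quotes does not have this feature.
    """
    count = query.count('"')
    return count > 0 and count % 2 == 0


def is_only_punctuation_and_spaces(query):
    """True when every character is ASCII punctuation or whitespace.

    Assumption: the empty query does not have this feature, and only the
    characters in string.punctuation are treated as punctuation.
    """
    return len(query) > 0 and all(
        ch in string.punctuation or ch.isspace() for ch in query
    )


if __name__ == "__main__":
    print(zero_results_rate([0, 3, 0, 12]))
    print(has_even_double_quotes('"query features"'))
    print(is_only_punctuation_and_spaces("?! "))
```

Features like these would then be supplied as boolean predictors to a classifier such as a random forest, with the zero-results outcome as the label.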
Licensing
I, the copyright holder of this work, hereby publish it under the following license:
This file is licensed under the Creative Commons Attribution-Share Alike 4.0 International license.
- You are free:
- to share – to copy, distribute and transmit the work
- to remix – to adapt the work
- Under the following conditions:
- attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.
File history
Version | Date/Time | Dimensions | User | Comment |
---|---|---|---|---|
current | 04:16, 15 November 2016 | 1,275 × 1,650, 12 pages (3.37 MB) | CXie (WMF) | User created page with UploadWizard |
File usage on Commons
There are no pages that use this file.
Metadata
Short title | Query Features and Search Performance |
---|---|
Author | Chelsy Xie (Analysis & Report); Deb Tankersley (Product Management); Mikhail Popov (Review); Erik Bernhardson (Review); Trey Jones (Review); David Causse (Review) |
Software used | LaTeX with hyperref package |
Conversion program | XeTeX 0.99996 |
Encrypted | no |
Page size | 612 x 792 pts (letter) |
Version of PDF format | 1.5 |