Key Performance Indicators

  • User-perceived load time: If our search is fast and snappy, then more people will use it!
  • Zero Results Rate: If a user gets zero results for their query, they’ve by definition not found what they’re looking for.
  • API usage: We want people, both within our movement and outside it, to be able to easily access our information.
  • User Engagement (not quite User Satisfaction): This is an augmented version of the clickthrough rate that also includes the proportion of users' sessions whose dwell time exceeds a pre-specified threshold. Note that we deployed v2.0 of the satisfaction schema on 9/2/2015.

Additional information

In the case of a data outage, the medians will be computed from the non-missing data only. When this is the case, the value displayed will be approximate and will be marked with a '~'.
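
As an illustration, here is a minimal sketch (in Python; the function name and input format are made up for this example) of how a day's median might be computed from non-missing data and flagged as approximate:

    import statistics

    def summarize_daily_median(values):
        """Median of a day's metric values; flagged as approximate when
        some of the expected data points are missing (None)."""
        observed = [v for v in values if v is not None]
        if not observed:
            return "no data"
        median = statistics.median(observed)
        # If anything was missing, mark the displayed value with a '~'.
        prefix = "~" if len(observed) < len(values) else ""
        return f"{prefix}{median:g}"

    print(summarize_daily_median([120, 135, None, 128]))  # "~128"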

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#kpis_summary | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Monthly Metrics

This tab is meant to make it easier to update the Wikimedia Product wiki page and the accompanying slide decks for WMF metrics and activities meetings with last month's KPIs and month-over-month (MoM) & year-over-year (YoY) changes.
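
For reference, a minimal sketch (in Python, with hypothetical monthly totals) of how the MoM and YoY percentage changes can be computed:

    def pct_change(current, previous):
        """Percentage change from `previous` to `current`."""
        return 100.0 * (current - previous) / previous

    # Hypothetical monthly totals for one KPI, keyed by "YYYY-MM".
    monthly = {"2016-07": 125.0, "2017-06": 150.0, "2017-07": 156.0}

    mom = pct_change(monthly["2017-07"], monthly["2017-06"])  # month-over-month
    yoy = pct_change(monthly["2017-07"], monthly["2016-07"])  # year-over-year
    print(f"MoM: {mom:+.1f}%, YoY: {yoy:+.1f}%")  # MoM: +4.0%, YoY: +24.8%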

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#monthly_metrics | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Key Performance Indicator: User-perceived load time

If our search is fast and snappy, then more people will use it!

Notes

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#kpi_load_time | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Key Performance Indicator: Zero results rate

If a user gets zero results for their query, they’ve by definition not found what they’re looking for.

Outages and inaccuracies

  • Annotation “A”: On 13 April 2016 we switched to a new data format for zero results rate (see T132503) wherein we stopped lumping different query types into just two categories (“Full-Text Search” and “Prefix Search”). The ZRR data were backfilled from 1 February 2016 under the new format, which breaks ZRR down by individual query type. We also began filtering out irrelevant query types (see T131196#2200560) and requests with an unknown number of hits (“-1” in the database).
  • On 15 January 2016 there was an issue with Avro serialization that prevented data from entering the Hadoop cluster. A patch was deployed on 19 January 2016. As a result, there are no recorded zero results rates for 01/15-01/19. The values you may see on those dates are estimates computed with statistical models.
  • 'R': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of our data retrieval and processing codebase that we migrated to Wikimedia Analytics' Reportupdater infrastructure. See T150915 for more details.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#kpi_zero_results | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Key Performance Indicator: API usage

We want people, both within our movement and outside it, to be able to easily access our information.

Outages and inaccuracies

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#kpi_api_usage | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Key Performance Indicator: User Engagement (Augmented Clickthroughs)

We are still in the process of obtaining qualitative data from our users (their intent and satisfaction), so this metric is less akin to “user satisfaction” and more akin to the “user engagement” we can observe.

This metric combines the clickthrough rate and the proportion of users' session dwell times exceeding the threshold of 10s.
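
As a rough illustration, here is a minimal sketch (in Python) of how such an augmented clickthrough could be computed from per-session records; the field names and the equal weighting of the two proportions are assumptions for this example, not the production definition:

    # Hypothetical per-session records: whether the session had a clickthrough,
    # and the dwell time on visited results, in seconds.
    sessions = [
        {"clicked": True,  "dwell_seconds": 42.0},
        {"clicked": False, "dwell_seconds": 0.0},
        {"clicked": True,  "dwell_seconds": 7.5},
    ]

    DWELL_THRESHOLD = 10.0  # seconds, per the threshold mentioned above

    clickthrough_rate = sum(s["clicked"] for s in sessions) / len(sessions)
    long_dwell_rate = sum(s["dwell_seconds"] > DWELL_THRESHOLD for s in sessions) / len(sessions)

    # "Augmented clickthrough": combine the two proportions (equal weights assumed here).
    engagement = (clickthrough_rate + long_dwell_rate) / 2
    print(f"clickthrough={clickthrough_rate:.2f}, long dwell={long_dwell_rate:.2f}, engagement={engagement:.2f}")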

Notes

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#kpi_augmented_clickthroughs | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Desktop full-text search

User actions that we track around search on the desktop website generally fall into three categories:

  1. The start of a user's search session;
  2. The presentation of a results page to the user; and
  3. A user clicking through to an article on the results page.

These three things are tracked via the EventLogging 'TestSearchSatisfaction2' schema (previously 'Search', see note “A”), and stored in a database. The results are then aggregated and anonymised, and presented on this page. For performance/privacy reasons we randomly sample what we store, so the actual numbers are a vast understatement of how many user actions our servers receive - what's more interesting is how they change over time. In the case of desktop search, this sampling rate varies by project (see T163273 for more details), but does not change day-to-day.

* This number represents the median of the last 90 days.

Outages and inaccuracies

There are occasionally going to be outages that will affect the accuracy of data. To make it easier to rely on the data (or not!) they are listed here, from most- to least-recent.

  • Between 2 October 2015 and 28 October 2015 we were not logging any events from the Search schema. A change in MediaWiki core broke the code being inserted into pages; because those pages were cached in Varnish, an alternate fix was required, and that fix was delayed by deployment freezes. The change in core only broke our code because the way we had added it was technically wrong, though it had happened to work anyway.
  • Between 5 May and 6 May 2015, approximately 40% of incoming EventLogging data was lost due to a wider EventLogging outage. You can read more about the outage here.
  • Data in late September/early October 2015 is unavailable due to another bug in EventLogging as a whole, which impacted data collection.
  • 'A': we switched to using data from Schema:TestSearchSatisfaction2 instead of Schema:Search for Desktop event counts and load times on 12 July 2016.
  • 'R': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of our data retrieval and processing codebase that we migrated to Wikimedia Analytics' Reportupdater infrastructure. See T150915 for more details.
  • 'S': on 2017-04-25 we changed the rates at which users are put into event logging (see T163273). Specifically, we decreased the rate on English Wikipedia (“EnWiki”) and increased it everywhere else.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#desktop_events | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Desktop full-text result load times

When a user types in a search query, it's sent off to the servers which identify and rank probable articles and then return them in a list. In an ideal world this would be instantaneous, but realistically, even with all the caching in the world it's going to take some time.

One of the things we anonymously track is how long it takes for search results to be provided to the user after they've sent the request. Here we're displaying the mean (average), the median, the 95th percentile, and the 99th percentile. A caveat when interpreting the results: the mean may sometimes be higher than the 99th percentile, when the requests above that percentile take a truly ludicrously long time to display. Caveat emptor.
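
To make the caveat concrete, here is a small illustration (in Python, with made-up load times) showing how a tiny number of extreme requests can push the mean above the 99th percentile:

    import statistics

    # Made-up load times in milliseconds: 999 fast requests and one pathological one.
    load_times = [100.0] * 999 + [10_000_000.0]

    def percentile(values, p):
        """Linear-interpolation percentile (0 <= p <= 100)."""
        data = sorted(values)
        idx = (len(data) - 1) * p / 100.0
        lo, hi = int(idx), min(int(idx) + 1, len(data) - 1)
        return data[lo] + (data[hi] - data[lo]) * (idx - lo)

    print("mean:  ", statistics.mean(load_times))    # ~10,100 ms
    print("median:", statistics.median(load_times))  # 100 ms
    print("p95:   ", percentile(load_times, 95))     # 100 ms
    print("p99:   ", percentile(load_times, 99))     # 100 ms -- below the mean!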

General trends

Load times for results are remarkably consistent, absent the situation, mentioned above, where a tiny number of users face a really slow service. Other than that, there's little interesting to see here unless we decide to start focusing specifically on speeding up the service.

Notes, outages, and inaccuracies

There are occasionally going to be outages that will affect the accuracy of data. To make it easier to rely on the data (or not!) they will be listed here, from most- to least-recent.

  • Between 2 October 2015 and 28 October 2015 we were not logging any events from the Search schema. A change in MediaWiki core broke the code being inserted into pages; because those pages were cached in Varnish, an alternate fix was required, and that fix was delayed by deployment freezes. The change in core only broke our code because the way we had added it was technically wrong, though it had happened to work anyway.
  • Between 5 May and 6 May 2015, approximately 40% of incoming EventLogging data was lost due to a wider EventLogging outage. You can read more about the outage here.
  • Data in late September/early October 2015 is unavailable due to another bug in EventLogging as a whole, which impacted data collection.
  • 'A': we switched to using data from Schema:TestSearchSatisfaction2 instead of Schema:Search for Desktop event counts and load times on 12 July 2016.
  • 'R': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of our data retrieval and processing codebase that we migrated to Wikimedia Analytics' Reportupdater infrastructure. See T150915 for more details.
  • 'S': on 2017-04-25 we changed the rates at which users are put into event logging (see T163273). Specifically, we decreased the rate on English Wikipedia (“EnWiki”) and increased it everywhere else.
  • 'B': on 2017-06-15 we deployed the sister search feature to Wikipedia in all languages. This technically has a slight impact on load time since we are performing additional searches, but we have not seen any noticeable or alarming increases since the feature's deployment.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#desktop_load | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards


PaulScore Approximations

"PaulScore" is the name we've given to a metric proposed by Paul Nelson in a talk he gave at Elasticon. We use PaulScore to evaluate the quality of results provided by CirrusSearch or proposed modifications to CirrusSearch, based on historical click data. A big advantage of the PaulScore is that it relies on user click history to award points, so it is easy to compute.

This dashboard shows the PaulScore approximation for 3 values of $F$: 0.1, 0.5, and 0.9. The maximum possible score for each value of $F$ is $1/(1-F)$, so the dashboard has the option of showing relative PaulScores, which are the computed values divided by the maximum possible value for the given $F$.

For auto-completion suggestions, we expect a much lower score since most queries get no clicks -- i.e., while typing, many results are shown and ignored -- and most users will only click on one result, whereas full-text searchers can more easily go back to the results page or open multiple results in other windows.

For more details, please see Discovery's Search glossary.

PaulScore is computed via the following steps:

  1. Pick scoring factor $0 < F < 1$.
  2. For $i$-th search session $S_i$ $(i=1, \ldots, n)$ containing $m$ queries $Q_1, \ldots, Q_m$ and search result sets $\mathbf{R}_1, \ldots, \mathbf{R}_m$:
    1. For each $j$-th search query $Q_j$ with result set $\mathbf{R}_j$, let $\nu_j$ be the query score: $$\nu_j=\sum_{k~\in~\{\text{0-based positions of clicked results in}~\mathbf{R}_j\}} F^k.$$
    2. Let user's average query score $\bar{\nu}_{(i)}$ be $$\bar{\nu}_{(i)}=\frac{1}{m} \sum_{j=1}^m \nu_j.$$
  3. Then the PaulScore is the average of all users' average query scores: $$\text{PaulScore}~=~\frac{1}{n} \sum_{i=1}^n \bar{\nu}_{(i)}.$$
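
Here is a minimal sketch (in Python) of the computation above; the input format is hypothetical: each session is a list of queries, and each query is the list of 0-based positions of its clicked results:

    def paulscore(sessions, f):
        """PaulScore for scoring factor 0 < f < 1."""
        session_means = []
        for queries in sessions:
            # Per-query score: sum of f^k over the 0-based positions k of clicked results.
            query_scores = [sum(f ** k for k in clicked) for clicked in queries]
            session_means.append(sum(query_scores) / len(query_scores))
        # PaulScore is the average over sessions of the per-session average query score.
        return sum(session_means) / len(session_means)

    # One session with two queries: the first got clicks at positions 0 and 2, the second none.
    sessions = [[[0, 2], []]]
    f = 0.5
    score = paulscore(sessions, f)
    relative = score * (1 - f)  # divide by the maximum possible score, 1/(1-F)
    print(score, relative)      # 0.625 0.3125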

Outages and inaccuracies

  • 'R': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of our data retrieval and processing codebase that we migrated to Wikimedia Analytics' Reportupdater infrastructure. See T150915 for more details.
  • 'S': on 2017-04-25 we changed the rates at which users are put into event logging (see T163273). Specifically, we decreased the rate on English Wikipedia ("EnWiki") and increased it everywhere else, and since EnWiki generally has higher PaulScore than other projects, we effectively lowered the overall PaulScore by lessening EnWiki's contribution. See T168466 for more details.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#paulscore_approx | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Mobile search

User actions that we track around search on the mobile website generally fall into three categories:

  1. The start of a user's search session;
  2. The presentation of a results page to the user; and
  3. A user clicking through to an article on the results page.

These three things are tracked via the EventLogging 'MobileWebSearch' schema, and stored in a database. The results are then aggregated and anonymised, and presented on this page. For performance/privacy reasons we randomly sample what we store, so the actual numbers are a vast understatement of how many user actions our servers receive - what's more interesting is how they change over time. In the case of Mobile Web search, this sampling rate is going to be 0.1%: it's currently turned off entirely but should be enabled soon.

* This number represents the median of the last 90 days.

General trends

It's hard to tell because we have too little data and there's clearly something screwy in the data we're provided with.

Outages and inaccuracies

There are occasionally going to be outages that will affect the accuracy of data. To make it easier to rely on the data (or not!) they are listed here, from most- to least-recent.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#mobile_events | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Mobile web result load times

When a user types in a search query, it's sent off to the servers which identify and rank probable articles and then return them in a list. In an ideal world this would be instantaneous, but realistically, even with all the caching in the world it's going to take some time.

One of the things we anonymously track is how long it takes for search results to be provided to the user after they've sent the request. Here we're displaying the mean (average), the median, the 95th percentile, and the 99th percentile. A caveat when interpreting the results: the mean may sometimes be higher than the 99th percentile, when the requests above that percentile take a truly ludicrously long time to display. Caveat emptor.

General trends

It's hard to tell because we have too little data and there's clearly something screwy in the data we're provided with. Like: seriously screwy.

Outages and inaccuracies

There are occasionally going to be outages that will affect the accuracy of data. To make it easier to rely on the data (or not!) they are listed here, from most- to least-recent.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#mobile_load | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Mobile App search

User actions that we track around search on the mobile apps generally fall into three categories:

  1. The start of a user's search session;
  2. The presentation of a results page to the user; and
  3. A user clicking through to an article on the results page.

These three things are tracked via the EventLogging 'MobileWikiAppSearch' schema, and stored in a database. The results are then aggregated and anonymised, and presented on this page. For performance/privacy reasons we randomly sample what we store, so the actual numbers are a vast understatement of how many user actions our servers receive - what's more interesting is how they change over time. In the case of app search, this sampling rate is 1%.

Due to a bug in the iOS EventLogging system, iOS events are currently being tracked much more frequently than Android ones and so are displayed in a different graph to avoid confusion.

* This number represents the median of the last 90 days.

Notes

Outages and inaccuracies

  • Between 5 May and 6 May 2015, approximately 40% of incoming EventLogging data was lost due to a wider EventLogging outage. You can read more about the outage here.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#app_events | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Mobile app result load times

When a user types in a search query, it's sent off to the servers which identify and rank probable articles and then return them in a list. In an ideal world this would be instantaneous, but realistically, even with all the caching in the world it's going to take some time.

One of the things we anonymously track is how long it takes search results to be provided to the user, after they've sent the request. Here we're displaying the median, the 95th percentile and the 99th percentile.

Due to a bug in the iOS EventLogging system, iOS events are currently being tracked much more frequently than Android ones and so are displayed in a different graph to avoid confusion.

General trends

Outages and inaccuracies

There are occasionally going to be outages that will affect the accuracy of data. To make it easier to rely on the data (or not!) they will be listed here, from most- to least-recent.

  • Between 5 May and 6 May 2015, approximately 40% of incoming EventLogging data was lost due to a wider EventLogging outage. You can read more about the outage here.
  • A bug in the app implementation led to the clocks being off, hence some recorded completion times of -38000 seconds. This has now been patched.
  • 'R': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of our data retrieval and processing codebase that we migrated to Wikimedia Analytics' Reportupdater infrastructure. See T150915 for more details.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#app_load | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Click Position on Mobile App

The position of the search result that was selected, from the list that was presented to the user (used with the 'click' action).

Notes

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#app_click_position | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Invoke Source on Mobile App

The source from which the Search interface was invoked.

Notes

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#app_invoke_source | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

API Calls by Referrer Class

All types of API calls are aggregated by date and referrer class.

“Internal” is traffic referred by Wikimedia sites, specifically: mediawiki.org, wikibooks.org, wikidata.org, wikinews.org, wikimedia.org, wikimediafoundation.org, wikipedia.org, wikiquote.org, wikisource.org, wikiversity.org, wikivoyage.org, and wiktionary.org. (See the Webrequest source for more information.)
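
As an illustration only (not the actual Webrequest logic), a minimal sketch of how a referer might be bucketed using the domain list above:

    from urllib.parse import urlparse

    # Domains treated as internal, per the list above.
    WIKIMEDIA_DOMAINS = (
        "mediawiki.org", "wikibooks.org", "wikidata.org", "wikinews.org",
        "wikimedia.org", "wikimediafoundation.org", "wikipedia.org",
        "wikiquote.org", "wikisource.org", "wikiversity.org",
        "wikivoyage.org", "wiktionary.org",
    )

    def referer_class(referer):
        """Bucket a referer URL into 'none', 'internal', or 'external'."""
        if not referer:
            return "none"
        host = urlparse(referer).hostname or ""
        if any(host == d or host.endswith("." + d) for d in WIKIMEDIA_DOMAINS):
            return "internal"
        return "external"

    print(referer_class("https://en.wikipedia.org/wiki/Special:Search"))  # internal
    print(referer_class("https://example.com/"))                          # external
    print(referer_class(None))                                            # none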

Outages and inaccuracies

On 2017-06-29 we started to break down the API calls by referer class.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#referer_breakdown | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Search Queries with Zero Results

Sometimes, searches return zero results. What we're visualising here is the proportion of the time that happens.

Zero results doesn't actually mean a failure for the user, of course: some of these events are from “prefix search” in the search box, where the system attempts to match the user's already-typed characters to an existing page name. Others are from typos, resulting in a search page with no results, but also resulting in a spelling correction the user could use to get genuine results.

We've also broken out the daily rate of change - the failure proportion's increase or decrease per day. We expect the failure rate to monotonically decrease over time once we start projects and patches aimed at decreasing this rate.
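
For clarity, the daily rate of change is simply today's rate minus yesterday's; a minimal sketch (in Python, with made-up rates) follows:

    # Made-up daily zero results rates, as proportions.
    daily_zrr = {"2016-03-01": 0.31, "2016-03-02": 0.29, "2016-03-03": 0.30}

    dates = sorted(daily_zrr)
    # Day-over-day change, in percentage points.
    changes = {
        d: 100 * (daily_zrr[d] - daily_zrr[prev])
        for prev, d in zip(dates, dates[1:])
    }
    print(changes)  # roughly {'2016-03-02': -2.0, '2016-03-03': 1.0}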

Notes

  • 'Annotation A': On 13 April 2016 we switched to a new data format for zero results rate (see T132503) wherein we stopped lumping different query types into just two categories (“Full-Text Search” and “Prefix Search”). The ZRR data were backfilled from 1 February 2016 under the new format, which breaks ZRR down by individual query type. We also began filtering out irrelevant query types (see T131196#2200560) and requests with an unknown number of hits (“-1” in the database).
  • On 15 July 2015 we updated our heuristics to avoid counting maintenance tasks as search requests. The historic data on the dashboards is being backfilled to reflect this - until it's done, the dashboards may look somewhat strange.
  • On 15 January 2016 there was an issue with Avro serialization that prevented data from entering the Hadoop cluster. A patch was deployed on 19 January 2016. As a result, there are no recorded zero results rates for 01/15-01/19. The values you may see on those dates are estimates computed with statistical models.
  • 'R': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of our data retrieval and processing codebase that we migrated to Wikimedia Analytics' Reportupdater infrastructure. See T150915 for more details.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#failure_rate | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Search Result Rate by Query Type

Sometimes, searches return zero results - both full-text and prefix searches. What we're visualising here is the percentage of the time a search query returns zero results, split out for different query types (full-text, prefix, regex, more like, completion suggester, and geospatial). Zero results doesn't actually mean a failure for the user, of course: the “prefix search” events represent the system attempting to match the user's already-typed characters to an existing page name.

Notes

  • 'A': On 13 April 2016 we switched to a new data format for zero results rate (see T132503) wherein we stopped lumping different query types into just two categories (“Full-Text Search” and “Prefix Search”). The ZRR data were backfilled from 1 February 2016 under the new format, which breaks ZRR down by individual query type. We also began filtering out irrelevant query types (see T131196#2200560) and requests with an unknown number of hits (“-1” in the database).
  • On 15 July 2015 we updated our heuristics to avoid counting maintenance tasks as search requests.
  • On 15 January 2016 there was an issue with Avro serialization that prevented data from entering the Hadoop cluster. A patch was deployed on 19 January 2016. As a result, there are no recorded zero results rates for 01/15-01/19. The values you may see on those dates are estimates computed with statistical models.
  • 'R': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of our data retrieval and processing codebase that we migrated to Wikimedia Analytics' Reportupdater infrastructure. See T150915 for more details.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#failure_breakdown | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards

Zero Rate for Search Suggestions

With a lot of “full-text” searches, the system provides not only results but also suggestions - corrections for typos, for example. These are more prominent with searches that returned few or no results, for fairly obvious reasons (if the suggestion is being provided it's because the system thinks you got something wrong).

This graph shows the zero results rate for searches with suggestions, compared to the zero results rate for full-text searches overall.

Notes

  • 'A': On 13 April 2016 we switched to a new data format for zero results rate (see T132503) wherein we stopped lumping different query types into just two categories (“Full-Text Search” and “Prefix Search”). The ZRR data were backfilled from 1 February 2016 under the new format, which breaks ZRR down by individual query type. We also began filtering out irrelevant query types (see T131196#2200560) and requests with an unknown number of hits (“-1” in the database).
  • 'R': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of our data retrieval and processing codebase that we migrated to Wikimedia Analytics' Reportupdater infrastructure. See T150915 for more details.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#failure_suggestions | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards


* Users can click on a cross-wiki result or view all the results at the sister project
This excludes the language-less Wikimedia Commons

Sister project search results traffic

Sister project (cross-wiki) snippets is a feature that adds search results from sister projects of Wikipedia to a sidebar on the search engine results page (SERP). If a query results in matches from the sister projects, users will be shown snippets from Wiktionary, Wikisource, Wikiquote and/or other projects. See T162276 for more details.

When viewing traffic split by project, these statistics include all languages and any click into a sister project snippet (either article or more results). Also, these are actual pageviews, not events from event logging.

Notes, outages, and inaccuracies

  • English Wikipedia has a different display than all the other languages due to community feedback. Specifically, it does not show results from Commons/multimedia, Wikinews, and Wikiversity. Refer to T162276#3278689 for more details.
  • Some projects (e.g. French and Catalan Wikipedias) use a community-developed sister project search sidebar, which is why we see some sister traffic before the deployment of the sister search feature across all Wikipedias.
  • Wikisource had an unknown spike on 22 June 2017 that slightly skews that project's results and the overall results for that day. Specifically, we calculated 5129 pageviews on Desktop from English Wikipedia, which is an extreme outlier that we removed and imputed.
  • 'A': on 2017-06-15 we deployed the sister search feature to Wikipedia in all languages.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#sister_search_traffic | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards | Data available at Wikimedia Analytics


How long searchers stay on the visited search results

When someone is randomly selected for search satisfaction tracking (using our TSS2 schema), we use a check-in system and survival analysis to estimate how long users stay on visited pages. To summarize the results on a daily basis, we record a set of statistics based on a measure formally known as “median lethal dose”.

This graph shows the length of time that must pass before N% of the users leave the page (e.g. article) they visited. When the number goes up, we can infer that users are staying on the pages longer. In general, it appears it takes 15s to lose 10%, 25-35s to lose 25%, and 55-75s to lose 50%.

On most days, we retain at least 20% of the test population past the 7 minute mark (the point at which the user's browser stops checking in), so on those days we cannot calculate the time it takes to lose 90/95/99% of the users.

There are some days when we CAN calculate those times, and it can take anywhere between 270s (4m30s) and 390s (6m30s) for 90% of the users to have closed the page they clicked through to from the search results page.
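
To make the mechanics concrete, here is a minimal sketch (in Python; the check-in schedule is an assumption and the empirical approach is a simplification of the actual survival analysis) of estimating the time by which N% of users have left, given each user's last recorded check-in, capped at 420 seconds (7 minutes):

    # Assumed check-in schedule, in seconds; 420s is where the browser stops checking in.
    CHECKIN_TIMES = (10, 20, 30, 40, 50, 60, 90, 120, 150, 180, 210, 240, 300, 360, 420)

    # Hypothetical data: each user's last recorded check-in. A value of 420 is
    # censored -- the user was still on the page when tracking ended.
    last_checkins = [10, 20, 20, 50, 90, 120, 180, 420, 420, 420]

    def time_to_lose(last_checkins, pct):
        """First check-in time by which at least pct% of users have left,
        or None if that many users never leave within the 420s window."""
        n = len(last_checkins)
        for t in CHECKIN_TIMES[:-1]:
            departed = sum(1 for lc in last_checkins if lc <= t)
            if departed / n >= pct / 100:
                return t
        return None  # more than (100 - pct)% survived past the 7-minute mark

    for pct in (10, 25, 50, 90):
        print(pct, time_to_lose(last_checkins, pct))
    # 10 -> 10, 25 -> 20, 50 -> 90, 90 -> None (30% of users are censored at 420s)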

Annotations

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#survival | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards


How long Wikipedia searchers stay on the search result pages

When someone is randomly selected for search satisfaction tracking (using our TSS2 schema), we use a check-in system and survival analysis to estimate how long users stay on visited pages. When a Wikipedia visitor searches using autocomplete and ends up on a full-text search results page (SRP), we can track how long that page is “alive” before the user either closes the tab, clicks on a result, or navigates elsewhere.

To summarize the results on a daily basis, we record a set of statistics based on a measure formally known as “median lethal dose”. This graph shows the length of time that must pass before N% of the users leave the search results page. When the number goes up, we can infer that users are staying on the pages longer.

Notes

These summary statistics are the same across the three categories of languages we have.

  • Half of the searchers stay on the full-text SRP for 25 or more seconds – this is consistent day-to-day even after sister project search deployment.
  • ¾ of searchers have left the full-text SRP by the 35s mark. Since the deployment of the sister project search, we appear to have more days when this statistic is at 45s, indicating that users are staying on the page a little longer – possibly due to reading the cross-wiki snippets we're presenting them with.
  • 90% of searchers are done with the full-text SRP by the 1m15s mark, although on some days it's 1m45s. We've observed more days like that before the deployment of sister project search, although there was one day post-deployment in particular (2017-06-29) on which it took 2m45s for 90% of the searchers to leave the page.

Annotations

  • 'S': on 2017-04-25 we changed the rates at which users are put into event logging (see T163273). Specifically, we decreased the rate on English Wikipedia (“EnWiki”) and increased it everywhere else.
  • 'SS': on 2017-06-15 we deployed the sister search feature to Wikipedia in all languages. Sister project (cross-wiki) snippets is a feature that adds search results from sister projects of Wikipedia to a sidebar on the search engine results page (SERP). If a query results in matches from the sister projects, users will be shown snippets from Wiktionary, Wikisource, Wikiquote and/or other projects. See T162276 for more details.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#srp_surv | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards


Breakdown by Languages and Projects

On this page, we split out several metrics by language (e.g. English vs Russian) and project (e.g. Wikipedia vs Wiktionary) to help us understand the differences between wikis. See the other dashboard pages for more details on how we compute these metrics.

Notes/Tips

  • The percentages next to the language and project names represent the proportion of the total volume.
  • You can select multiple projects and multiple languages to compare simultaneously. (Hold down Ctrl on Windows or Command on Mac.)
  • For each arbitrary combination, the zero results rate is the overall rate (full-text AND prefix, web AND api).
  • The language picker will automatically choose “(None)” if you select a non-multilingual project such as Wikidata.
  • If you're interested in the overall metric for a multilingual project such as Wikipedia, make sure only “(None)” is selected in the languages picker.
  • Due to the high number of language-project combinations, we have restricted ourselves to only storing the last 30 days of data.

Questions, bug reports, and feature suggestions

For technical, non-bug questions, email Mikhail or Chelsy. If you experience a bug or notice something wrong or have a suggestion, open a ticket in Phabricator in the Discovery board or email Deb.


Link to this dashboard: https://discovery.wmflabs.org/metrics/#langproj_breakdown | Page is available under CC-BY-SA 3.0 | Code is licensed under MIT | Part of Discovery Dashboards