21 February 2020 Analysis of Wikipedia pageviews to identify popular chemicals
Yuru Cao, Hely Mehta, Ann E. Norcross, Masahiko Taniguchi, Jonathan S. Lindsey
Abstract
A new approach to assess popularity relies on analysis of the number of times a web article is viewed. Here, a strategy is described to identify chemicals of widespread interest. The strategy makes use of Wikipedia, a rapidly growing, publicly editable web encyclopedia that has become an influential knowledge base. While the total number of chemicals mentioned in Wikipedia is unknown, the Wikipedia Chemical Structure Explorer (WCSE) developed by Novartis enables identification of those that are described in an Infobox or Chembox along with a Simplified Molecular-Input Line-Entry System (SMILES) code. Using a Python script, all so-listed chemicals (16,243) in Wikipedia were identified and then sorted by their number of pageviews. Of the 16,243 chemicals, 846 (5.2%) appear on at least one of three lists: controlled substances (United States Drug Enforcement Administration), WHO essential medicines, or the top 300 US drugs. These 846 chemicals received 220 million pageviews, which is 41.4% of the pageviews for all members of the Wikipedia chemical list. The number of chemicals described in the entire corpus of Wikipedia remains a tiny fraction of the <10^7 known chemicals. Much remains to be done to make the venerable literature and data of chemistry readily accessible. Regardless, identification of popular chemicals in this manner can be used to create selected databases, to tailor educational curricula, or to create targeted informational materials (such as safety brochures); such considerations of public demand are likely to engender corresponding widespread interest.
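The ranking step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' script, and the abstract does not say how pageview counts were retrieved. The sketch assumes they could come from the public Wikimedia Pageviews REST API (per-article endpoint). The function names, the choice of monthly granularity, and the demo counts are all hypothetical.

```python
from urllib.parse import quote

# Public Wikimedia Pageviews REST endpoint (per-article metrics).
# Assumption: the paper does not state which data source its script used.
API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(title, start, end, project="en.wikipedia"):
    """Build the per-article request URL; start/end are YYYYMMDD strings."""
    article = quote(title.replace(" ", "_"), safe="")
    return f"{API_ROOT}/{project}/all-access/user/{article}/monthly/{start}/{end}"


def total_views(payload):
    """Sum the 'views' field over the 'items' list of a decoded JSON response."""
    return sum(item["views"] for item in payload.get("items", []))


def rank_by_views(views_by_title):
    """Return (title, views) pairs sorted from most to least viewed."""
    return sorted(views_by_title.items(), key=lambda kv: kv[1], reverse=True)


def subset_share(views_by_title, subset):
    """Percentage of all pageviews that falls on the titles in `subset`."""
    total = sum(views_by_title.values())
    if total == 0:
        return 0.0
    chosen = sum(v for t, v in views_by_title.items() if t in subset)
    return 100.0 * chosen / total


# Made-up counts for illustration only; these are not data from the paper.
demo_views = {"Caffeine": 500, "Aspirin": 300, "Benzene": 150, "Toluene": 50}
demo_ranking = rank_by_views(demo_views)
demo_share = subset_share(demo_views, {"Caffeine", "Aspirin"})
```

Applied to the figures in the abstract, the same share calculation gives 846 / 16,243 ≈ 5.2% of the listed chemicals, and `subset_share` corresponds to the reported 41.4% of pageviews once real counts replace the demo values.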
© (2020) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Yuru Cao, Hely Mehta, Ann E. Norcross, Masahiko Taniguchi, and Jonathan S. Lindsey "Analysis of Wikipedia pageviews to identify popular chemicals", Proc. SPIE 11256, Reporters, Markers, Dyes, Nanoparticles, and Molecular Probes for Biomedical Applications XII, 112560I (21 February 2020); https://doi.org/10.1117/12.2542835
KEYWORDS
Chemical analysis, Medicine, Databases, Chemistry, Absorption, Internet, Luminescence