The molecules database
About the database
The molecules database contains previously predicted GPS positions for about one million molecules, represented as SMILES strings. These data were collected from jobs submitted to ChemGPS-NP Web over a period of about ten years, and the database is now available for public search.
This database can also be queried/browsed using our JSON API.
This page gives some general help on using the web interface for the molecules database.
The search field
By default, the search string is matched against SMILES strings:
If you want the search string to match a specific column in the database, prefix it with the column name and a colon, in the form "column:value".
Some search strings might contain whitespace characters. In this case, quote the search string:
To match multiple columns at once, supply a space-separated list of search strings, each prefixed by its column name:
Multiple search criteria are combined with a logical AND when matching results. All searches are case-insensitive.
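The rules above (column prefixes, quoting, and space-separated criteria) can be sketched as a small helper that composes a search string for the search field. This is only an illustration: the function names are hypothetical, the colon separator follows the "ipaddr:xxx" form used elsewhere on this page, and the use of double quotes as the quoting character is an assumption.

```python
def format_criterion(value, column=None):
    """Return one search term, quoted if it contains whitespace."""
    # Assumption: double quotes are the accepted quoting character.
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    # Assumption: column and value are joined by a colon, as in "ipaddr:xxx".
    return f"{column}:{value}" if column else value


def build_search_string(criteria):
    """Join (column, value) pairs with spaces; the criteria are ANDed by the search."""
    return " ".join(format_criterion(value, column) for column, value in criteria)


query = build_search_string([("name", "acetic acid"), ("smiles", "CC(=O)O")])
print(query)  # name:"acetic acid" smiles:CC(=O)O
```

Passing `None` as the column leaves the term unprefixed, so it is matched against SMILES strings by default.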
When fuzzy matching is selected, a full-text index is used to match the search parameters.
This mode has the advantage that it finds different combinations of the supplied SMILES string, but it may return false positives, especially when searching in multiple columns.
The default search mode is more traditional and supports, for example, searching for molecule fragments within the complete SMILES string stored in the database.
Use wildcard characters to search for substrings. For example, this search will find all molecules collected in September 2017:
For fuzzy search there is no need to use wildcards.
These column names can be used as prefixes for search strings:
|Column||Type||Description|
|name||text||The molecule name (might be missing or random)|
|smiles||text||The SMILES string (always present)|
|pos1 - pos8||float||One of the eight predicted positions|
|created||datetime||The date and time when the molecule was added to the database|
|ipaddr||text||The IP address from which the molecule was submitted|
|hostid||text||The host ID from which the molecule was added|
|md5sum||char(32)||The computed checksum for this molecule|
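To make the prefix rules concrete, the following sketch splits a search string into (column, value) pairs on the client side, using the column names listed above. It is not the parser used by the server. The names `COLUMNS` and `parse_search_string` are hypothetical, and the colon separator and double-quote handling are assumptions. A prefix is honoured only when it is a known column name, because SMILES strings can themselves contain colons.

```python
import re

# Column names from the table: name, smiles, pos1..pos8, created, ipaddr, hostid, md5sum.
COLUMNS = {"name", "smiles", "created", "ipaddr", "hostid", "md5sum"}
COLUMNS |= {f"pos{i}" for i in range(1, 9)}

# A token is a run of non-space characters and/or double-quoted sections.
TOKEN = re.compile(r'(?:[^\s"]|"[^"]*")+')


def parse_search_string(text):
    """Split a search string into (column, value) pairs, defaulting to smiles."""
    criteria = []
    for token in TOKEN.findall(text):
        column, sep, value = token.partition(":")
        if not sep or column.lower() not in COLUMNS:
            # No recognised prefix: treat the whole token as a SMILES search.
            column, value = "smiles", token
        criteria.append((column.lower(), value.replace('"', "")))
    return criteria


print(parse_search_string('name:"acetic acid" c1ccccc1'))
# [('name', 'acetic acid'), ('smiles', 'c1ccccc1')]
```

A plain regular expression is used instead of a shell-style tokenizer because SMILES strings may contain backslashes, which a shell-style tokenizer would treat as escape characters.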
Selecting the option "restrict to molecules submitted from this computer" filters your search results to molecules collected from jobs submitted from your computer. This is equivalent to using "ipaddr:xxx" as one search criterion, with xxx replaced by your IP address.
If "restrict to current used queue name" is selected, then search results are also filtered on your currently used queue name. This is equivalent to using "hostid:xxx" as one search criteria, with xxx taken from the hostid cookie used to track your currently used job queue.
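The equivalence described for these two options can be sketched as follows: each option simply appends one more criterion to the search string. The function name `restrict_search` is hypothetical, and the IP address shown is a reserved documentation address, not a real submitter.

```python
def restrict_search(query, ipaddr=None, hostid=None):
    """Append the criteria that the two restrict options are equivalent to."""
    terms = [query] if query else []
    if ipaddr:
        # Same effect as "restrict to molecules submitted from this computer".
        terms.append(f"ipaddr:{ipaddr}")
    if hostid:
        # Same effect as "restrict to current used queue name".
        terms.append(f"hostid:{hostid}")
    return " ".join(terms)


print(restrict_search("c1ccccc1", ipaddr="192.0.2.10"))
# c1ccccc1 ipaddr:192.0.2.10
```

Because multiple criteria are combined with a logical AND, the added terms can only narrow the result set.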