The YFGdb searches may be used to find and view the data sets in the functional genomics collection. Data and annotation files are provided for download as either a tarball or a zip file. Note that these files may be in a variety of formats, including MAGE-ML, GEO soft format, pcl or cdt files, or tab-delimited files. For more information on these file formats, see the file formats section below. Our longer-term goal is to incorporate all of these data into the YFGdb PostgreSQL database, so that all data can be exported either as files in common formats or as a SQL dump of the database.
The Quick Search may currently be used to query YFGdb based on PubMed ID (e.g. 16963631). It is available at the top of every YFGdb page.
The Advanced Search may be used to query YFGdb based on study type, experimental technology, Gene Ontology (GO) biological process terms, file format and curation status in YFGdb. Please note that only the "study type" must be selected; the rest of the categories are optional and may be used to further refine your search.
Step 1 (required): Select one or more study types. You may sort the results by author, PubMed ID, file format or experimental technology. The default sort orders the results based on the first author's last name.
Step 2 (optional): To further narrow your results, you may also make the following selections:
Experimental technology: Select one. Please note that this is an "AND" search and the experimental technology selected should be consistent with the study type chosen in Step 1.
GO Biological Process Terms: Using the yeast Gene Ontology (GO) process slim terms (obtained from SGD), YFGdb curators manually associate GO processes addressed by each study, if appropriate. You may select one or more GO process terms to further refine your query. However, please note that each resulting study will match ALL of the selected GO terms.
File formats: To further narrow your results, select one or more file formats. Each resulting study will match ALL of the selected file formats in addition to any query selections made above.
Curation status: You may further restrict your query based on curation status in YFGdb. The current options are the following:
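The "AND" behaviour of the Step 2 refinements can be illustrated with a small in-memory sketch. This is not YFGdb's actual implementation: the `Study` record, the `search` function, and every sample value (authors, IDs, "type A", "tech X") are hypothetical. It also assumes that selecting several study types in Step 1 matches a study of any one of them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Study:
    """Hypothetical record; the field names are illustrative only."""
    first_author: str
    pubmed_id: int
    study_type: str
    technology: str
    go_terms: frozenset = field(default_factory=frozenset)
    file_formats: frozenset = field(default_factory=frozenset)


def search(studies, study_types, technology=None, go_terms=(), file_formats=()):
    """Filter studies the way the Advanced Search steps are described."""
    wanted_types = set(study_types)
    if not wanted_types:
        # Step 1 is required.
        raise ValueError("at least one study type is required")
    wanted_go = set(go_terms)
    wanted_formats = set(file_formats)
    hits = [
        s for s in studies
        if s.study_type in wanted_types                          # assumed: any selected type
        and (technology is None or s.technology == technology)   # ANDed with the study type
        and wanted_go <= s.go_terms                              # must match ALL selected GO terms
        and wanted_formats <= s.file_formats                     # must match ALL selected formats
    ]
    # Default sort: first author's last name.
    return sorted(hits, key=lambda s: s.first_author.lower())


# Made-up sample data, not real YFGdb entries.
studies = [
    Study("Smith", 111, "type A", "tech X",
          frozenset({"cell cycle", "transport"}), frozenset({"pcl", "soft"})),
    Study("Adams", 222, "type A", "tech X",
          frozenset({"cell cycle"}), frozenset({"pcl"})),
    Study("Baker", 333, "type B", "tech Y",
          frozenset({"cell cycle", "transport"}), frozenset({"pcl"})),
]

one_term = search(studies, ["type A"], go_terms=["cell cycle"])
two_terms = search(studies, ["type A"], go_terms=["cell cycle", "transport"])
```

With one GO term selected, both "type A" studies are returned; adding a second term narrows the result to the single study annotated with both, which is why selecting more terms or formats can only shrink the result list.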
Query results are organized by study type and list the studies that match your search criteria. In the publication column, links to the relevant YFGdb entry, PubMed entry, SGD curated paper and web supplements are provided when available:
Click on the YFGdb icon in order to access the YFGdb entry for that study.
Click on the arrow icon in order to access the author's or journal's web supplement for the paper.
To access the relevant PubMed entry for the paper, click on the PubMed icon.
To obtain more information on the relevant paper, click on the "SGD curated paper" icon.
Downloading Data Sets: Click on the tar or zip links in the archive column of the query results in order to download all files associated with a particular study. If the study has not yet been curated in YFGdb, then only the data files in their original format (e.g. text, pdf, Excel, soft, etc.) will be available. If the study has been curated in YFGdb, then the downloadable archive will contain the data files associated with the study and a README file describing in detail all of the downloaded files. The archives can be unpacked with any standard compression/decompression software (for example, Stuffit Expander). For help, and to download free programs that can unzip and uncompress these archives, see the gzip home page.
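If you prefer to inspect a downloaded archive from a script, the Python standard library is one option. The sketch below lists the files in either a tar or a zip archive and looks for the README. The helper names `list_archive` and `find_readme` are illustrative, and the assumption that the README's file name starts with "README" is ours; check the actual archive contents.

```python
import tarfile
import zipfile
from pathlib import Path


def list_archive(path):
    """Return the sorted names of the regular files in a zip or tar archive."""
    path = str(path)
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            # Names ending in "/" are directory entries, not files.
            return sorted(n for n in archive.namelist() if not n.endswith("/"))
    if tarfile.is_tarfile(path):
        # The default mode auto-detects gzip compression (.tar.gz / .tgz).
        with tarfile.open(path) as archive:
            return sorted(m.name for m in archive.getmembers() if m.isfile())
    raise ValueError(f"{path} is neither a zip nor a tar archive")


def find_readme(names):
    """Return the first name whose file part starts with README, else None."""
    for name in names:
        if Path(name).name.upper().startswith("README"):
            return name
    return None
```

For example, `find_readme(list_archive("study.tar.gz"))` returns the README's path inside the archive, or `None` for an uncurated study that ships without one (`study.tar.gz` is a placeholder file name).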
In addition to the README and data files, we also provide an archetype gene file for some data sets when appropriate. The archetype genes are meant to help indicate what constitutes a significant result for a particular study; for example, CLN2 is an archetype gene for the cell cycle data sets.
Each individual study associated with a paper has a study viewer page in YFGdb. Some papers have multiple studies associated with them, in which case a disambiguation page is provided. Clicking on any of the YFGdb study IDs on the disambiguation page will open up the relevant study viewer. The study viewer page contains the following information:
The full citation of the paper associated with the data set is provided at the top of the study viewer page. Links to the relevant PubMed entry, SGD curated paper and web supplements are also provided when available. If the paper is associated with any entries in a public repository such as GEO and/or ArrayExpress, then the accession ids are also provided and serve as direct links to the relevant entries in those repositories.
The YFGdb study ID is a unique accession id that corresponds to a single study for a particular paper. Please note that a single paper may have more than one study associated with it, and multiple studies associated with a publication may or may not be of the same study type. The format of the YFGdb study ID is the PubMed ID (e.g. 17314980) followed by the study id, e.g. 17314980id466. YFGdb may be searched based on study IDs using the Quick Search.
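The study ID format can be handled programmatically. The sketch below assumes, based only on the example 17314980id466, that an ID is a run of digits (the PubMed ID), the literal text "id", and a second run of digits. The names `StudyId`, `parse_study_id` and `format_study_id` are illustrative and not part of YFGdb.

```python
import re
from typing import NamedTuple

# Assumed pattern, inferred from the single documented example.
STUDY_ID_PATTERN = re.compile(r"([0-9]+)id([0-9]+)")


class StudyId(NamedTuple):
    pubmed_id: int
    study_number: int


def parse_study_id(text: str) -> StudyId:
    """Split a study ID such as '17314980id466' into its two numeric parts."""
    match = STUDY_ID_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a YFGdb-style study ID: {text!r}")
    return StudyId(int(match.group(1)), int(match.group(2)))


def format_study_id(study: StudyId) -> str:
    """Rebuild the textual form from the two numeric parts."""
    return f"{study.pubmed_id}id{study.study_number}"
```

Here `parse_study_id("17314980id466")` yields a PubMed ID of 17314980 and a study number of 466, while a bare PubMed ID such as "17314980" raises `ValueError` because it does not identify a single study.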
The current curation status in YFGdb is provided for each study:
The study type assigned by SGD or YFGdb curators is given for each study associated with a PubMed ID.
If the study has been curated by YFGdb, then overviews of the study design are also provided. Most of these study descriptions are written by curators as they curate the data set, whereas some are parsed from MAGE-ML files (i.e. provided by the authors).
When appropriate, YFGdb curators indicate which Gene Ontology (GO) biological process(es) are addressed in a particular study. YFGdb curators manually associate GO processes addressed by each study using the yeast GO process slim terms (obtained from SGD).
The experimental technology indicates the broad technique used to generate the data set. A complete list of experimental technologies curated in YFGdb is available here.
The main source(s) of the data files in the YFGdb entry. For more details on the file formats, please see the relevant file format section below. Current data set sources include the following:
The contact person for the data set with a link to their email address. Often this is the corresponding author on the original paper, although in some cases another author may serve as the contact for the data set. Authors may be contacted for more information.
Different visualization tools, such as Java TreeView, are available on study viewer pages, if applicable. A brief description of the tools and links to launch the application for viewing the results are also provided. More visualization and analysis tools will be added over time.
Click on the tar or zip links in order to download all files associated with a particular study. If the study has not yet been curated in YFGdb, then only the data files in their original format (e.g. text, pdf, Excel, soft, etc.) will be available. If the study has been curated in YFGdb, then the downloadable archive will contain the data files associated with the study and a README file written by YFGdb curators describing in detail all of the downloaded files. The archives can be unpacked with any standard compression/decompression software (for example, Stuffit Expander). For help, and to download free programs that can unzip and uncompress these archives, see the gzip home page.
In addition to the README and data files, we also provide an archetype gene file for some curated data sets when appropriate. The archetype genes are meant to help indicate what constitutes significant expression for a particular study; for example, CLN2 is an archetype gene for the cell cycle data sets. The individual files associated with the study are also listed in tabular format with their file size and type noted. They may be downloaded individually.
A detailed README file is written by YFGdb curators for each curated study. The README file includes the full citation, PubMed ID, study description, a brief description of the raw and/or processed data files, web supplement links and author contact information. For more information on the different file types, please see the file formats section below.
Archetype gene files are created for studies by YFGdb curators, if appropriate. Archetype genes are intended to indicate what constitutes a significant result for a particular experiment (e.g. the G1 cyclin CLN2 is an archetype gene for cell cycle data sets). Each archetype has a curated description indicating criteria used to identify significant genes in the data set and how the archetype genes meet these criteria. Archetype genes are meant to serve as benchmarks for biologists looking at their favorite genes in a data set and for computational biologists writing algorithms to find significance in an automated way.
The archetype gene files contain the following columns:
A sample archetype file is available here.
There are several other sources for functional genomics data available for both yeast and other species: