Interesting


Members of Sage-N Research’s Total Support Program (TSP) will want to read the following courtesy advisory bulletin carefully and contact our support team if deemed necessary.
Note:  If you are currently out of TSP, or not using newer hardware servers (e.g. Fujitsu), the following message will be of the utmost importance to you as well:

By far the most common problem we see in the field is hard drive failure. Because hard drives contain moving parts that spin at very high speed for years on end, it is common for them to fail eventually. If more than one drive fails, the result can be catastrophic: a total loss of your data. Monitoring your drives and taking the correct action when one fails will, in most cases, prevent data loss.

SORCERER™ systems are configured with a disk technology called RAID that allows the system to continue to function normally if a single hard drive fails. It is possible that your SORCERER system has a failed hard drive right now without your being aware of it. It is crucial that a failed drive be replaced as soon as possible. If it is not replaced, a second drive failure will lead to a total loss of data.
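For readers curious why one failure is survivable but a second is not, here is a minimal sketch of parity-based redundancy in Python. It is an illustration only: this bulletin does not say which RAID level SORCERER systems use, so the single-parity layout, the drive contents, and the helper name `xor_blocks` are all assumptions.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data drives and one parity drive (assumed layout).
data_drives = [b"SPEC", b"TRUM", b"DATA"]
parity = xor_blocks(data_drives)

# Simulate losing drive 1: XOR of the survivors and the parity block
# reproduces the missing block.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)

# If a second drive were lost before the rebuild, two unknown blocks would
# remain and a single parity block could not recover both.
```

In this toy layout the array keeps working after one failure only because the parity block can stand in for the missing drive, which is why prompt replacement matters.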

How do you know if a drive has failed or not?  The newer Fujitsu systems come equipped with a few options that make checking for hard drive failures easy and they allow automated email notification of hardware failures:

  • A quick way to tell if a hard drive has failed is by checking the lights on the front of the system. On Primergy-based SORCERER 2 and Lab systems, the hard drive bays are located at the bottom front of the tower. For Enterprise systems, they are in the separate disk subsystem. You should see a green light on every active drive in the system. A red light on any drive indicates that the drive has failed. If this is the case, please contact our support team immediately at support@sagenresearch.com.
  • SORCERERs built on newer server platforms also feature hardware monitoring software which can send out automated email alerts when a hardware problem arises. Given that most people do not physically check the lights on their system daily, we highly recommend that all customers who have not already done so set up the alerting software. Please contact our support team at support@sagenresearch.com for assistance in setting up the alerts. We would also like to offer our TSP members complimentary monitoring of email alerts by routing them to our support address.
  • For those experienced in using the Linux command line: if you run the command “PrimeCollect” as the root user, the Fujitsu system can generate a diagnostics report for your system. You may upload the report file at: http://dropbox.yousendit.com/SageN and we will get back to you once we have analyzed the results.

We hope that this advisory bulletin gives our TSP members peace of mind that their SORCERER system is running smoothly, that their data is protected, and that the Sage-N Research support team is at their fingertips.

We will continue to strive to offer our customers the very best in available advanced hardware features for performance, reliability and expansion by using enterprise-grade (vs. consumer-grade) components that are designed for years of continuous 24/7 peak operation.

**If you are not currently covered under our TSP maintenance plan, or if your SORCERER hardware is something other than the newer server (e.g. Fujitsu), please contact info@sagenresearch.com to discuss options for rejoining TSP and/or upgrading your hardware.


The newest release of Sorcerer Proteomics Edition (Sorcerer PE) software is now available in beta to supported Sorcerer customers. It introduces several new enhancements:

  • New native file formats based on MS2 and SQT for greater data handling efficiency
  • The obsolete DTA and OUT formats have been removed from the internal flows of Sorcerer but are still available for import and export to legacy applications
  • Improved system performance and efficiency throughout.
  • Support for the multiple biosample feature of Scaffold — spectra files can be pre-grouped in the search to become separate biosamples in the Scaffold file
  • Built-in processing for Raw files from Thermo LTQ Orbi Velos and Q Exactive mass spectrometers
  • Now bundles the most recent TPP 4.5.2 software
  • Support for Scaffold V3.4

Release 4.2 is the latest in the V4 series of Sorcerer Proteomics Edition (Sorcerer PE) software, and is immediately available for beta testing, which means that all the new features have now been implemented and tested internally, but that the software has not yet received full testing in real-world conditions. If you would like to try out the new features, then please contact support@sagenresearch.com to request the new beta software. If you are currently using version 3.5 or earlier releases, you will also need to enter new license keys.

Sorcerer PE V4’s new file formats offer greater performance

This release completes the transition to new file formats that was begun with v4.1 (which still used the old formats behind the scenes), and now all of Sorcerer PE’s internal use of the legacy Sequest DTA and OUT file formats has been replaced by the more modern MS2 and SQT formats for representing MS2 spectra and peptide matches respectively.  In these days in which tandem mass spectrometers can generate tens of thousands of spectra every hour, it is very inefficient to represent each data item in a separate file — there is a substantial overhead in opening and closing each file, and transfers in a network environment such as Sorcerer uses are typically slow. It also wastes a lot of disk space. So using MS2 and SQT natively throughout the Sorcerer search engine greatly improves the overall performance of the system.
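To make the "many spectra in one file" idea concrete, here is a minimal Python sketch that reads several spectra from a single MS2-style text stream. It assumes the commonly described MS2 layout (H header lines, an S line per scan carrying two scan numbers and the precursor m/z, Z lines for charge and mass, then m/z and intensity pairs). The function name `read_ms2` and the sample data are made up for illustration, and this is not Sorcerer's internal reader.

```python
import io


def read_ms2(handle):
    """Yield one dict per spectrum from an MS2-style text stream."""
    spectrum = None
    for line in handle:
        fields = line.split()
        if not fields or fields[0] in ("H", "I", "D"):
            continue  # skip blank lines, file headers and per-scan annotations
        if fields[0] == "S":
            if spectrum is not None:
                yield spectrum  # the previous spectrum is complete
            spectrum = {
                "scan": int(fields[1]),
                "precursor_mz": float(fields[3]),
                "charges": [],
                "peaks": [],
            }
        elif spectrum is None:
            continue  # ignore stray lines before the first S line
        elif fields[0] == "Z":
            spectrum["charges"].append((int(fields[1]), float(fields[2])))
        else:
            spectrum["peaks"].append((float(fields[0]), float(fields[1])))
    if spectrum is not None:
        yield spectrum


# Made-up example: two spectra stored in one stream instead of two DTA files.
SAMPLE = """H\tExtractor\texample
S\t1\t1\t500.25
Z\t2\t999.49
100.1 10.0
200.2 20.0
S\t2\t2\t600.30
Z\t3\t1798.88
150.5 5.0
"""

spectra = list(read_ms2(io.StringIO(SAMPLE)))
```

A single open and a single sequential read cover every spectrum in the run, which is where the savings over one-file-per-spectrum come from.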

However, although they work well internally to the system, we don’t recommend these formats for an end-user to work with directly: they are not standardized, not well supported by upstream and downstream processing tools, and not easily generalized to other search algorithms. Rather, for input to and output from Sorcerer, we’ve standardized on mzXML for spectra and pepXML for peptide matches, interchange formats that are more general and have extensive community support. PepXML is now generated by default, even if you do not select TPP postprocessing overall. Of course, Sorcerer supports other formats, too, such as Thermo’s Raw files, but these are converted to pass through the standard formats (mzXML in the case of Raw files).

One more word about DTA and OUT file legacy support: these files are no longer directly supported by the Sorcerer PE search engine, but you can still import DTAs, and we will have a script to generate OUT files from pepXML, if your downstream processing requires them. Please note that there is one spot in the TPP suite that expects .out files, and that is the “spectrum” hyperlink in the Peptide Viewer, which actually brings up a view of the out file, if any. Most of the scores, masses etc. for the spectrum match that are presented in that view can be added as columns directly to the peptide report. But if you do want to view these OUT files and you don’t mind the extra overhead, then consider running the OUT file compatibility script as a post-processing step. Please consult support@sagenresearch.com for further assistance with the compatibility script.

Multiple Biosample support for Scaffold

One common request from our clients who are keen Scaffold users is for enhanced support in the Sorcerer-Scaffold integration that can take advantage of Scaffold’s ability to group data into different biosamples, corresponding to different columns in the Scaffold view. We’re happy to announce a new feature in the Sorcerer PE software that speaks to this. The way it works is very simple, and requires only a minor difference to the way you have always searched data on Sorcerer.

Previously, if you selected multiple items for searching in the Web GUI, they would all be searched together and would wind up being a single biosample in the Scaffold file. Now, any separately selected item — either a single spectra file, or a folder of several files — will become its own biosample. Typically, the way this is used is to pre-group raw files in subfolders of the search data folder, and each of those subfolders will become a separate biosample, so long as they are each individually selected from within the search data folder. If, however, you select the search data folder itself at the top level, then all its contents will become a single biosample.
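The selection rule can be sketched in a few lines of Python. This is only a model of the behavior described above, not Sorcerer code: the function name `group_into_biosamples`, the folder layout, and the use of the selected path as the biosample name are all hypothetical.

```python
def group_into_biosamples(selected, tree):
    """Return {selected item: [spectra files]}, one biosample per selected item.

    `tree` maps each folder path to the files directly inside it; it is a
    hypothetical stand-in for the search data folder.
    """
    biosamples = {}
    for item in selected:
        if item in tree:
            # A folder: everything at or below it goes into one biosample.
            biosamples[item] = sorted(
                name
                for folder, files in tree.items()
                if folder == item or folder.startswith(item + "/")
                for name in files
            )
        else:
            # A single spectra file becomes a biosample on its own.
            biosamples[item] = [item]
    return biosamples


# Hypothetical search data folder with raw files pre-grouped in subfolders.
tree = {
    "search_data": [],
    "search_data/control": ["search_data/control/a.raw", "search_data/control/b.raw"],
    "search_data/treated": ["search_data/treated/c.raw"],
}

# Selecting each subfolder individually gives two biosamples.
per_folder = group_into_biosamples(["search_data/control", "search_data/treated"], tree)

# Selecting the top-level folder gives a single biosample with all three files.
top_level = group_into_biosamples(["search_data"], tree)
```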

Of course, the existing method of working with Scaffold Desktop to add new biosamples based on merging with another Scaffold file is still available, so you can choose whichever method is more suitable for your needs.

Do be aware, though, that searching more data in one run will add to the load of the Scaffold analysis. The system resources that Scaffold needs, particularly memory, depend primarily on the number of files and the number of spectra those files represent. We recommend that any Sorcerer that is used for intensive Scaffold analysis should be upgraded to a minimum of 24GB of system RAM, and that users should discuss their Scaffold analysis usage and possible upgrades to their system with Sage-N support in order to ensure the best performance.

New method for extracting Thermo RAW files in Sorcerer

When Thermo introduced XCalibur 2.1 and 2.2 supporting the Orbi Velos and Q-Exactive instruments, incompatibilities in their libraries meant that the method of extracting spectra from Raw files that Sorcerer then used suddenly stopped working. In response to this, Sage-N Research developed a solution based on a new software method, but that was Windows-specific, and not well suited to other platforms such as Linux. Nevertheless, at the cost of some complexity, particularly in terms of installation, we made it work on Sorcerer, and once again had an integrated flow with Sorcerer PE for XCalibur 2.1 and above.

Now we have implemented an alternative approach, based on a method developed by Dr. Patrick Pedrioli at the University of Dundee, that allows Sorcerer’s built-in extraction software to be used successfully with the latest XCalibur libraries. It is a lot easier to deploy on Sorcerer than the Windows-based solution, and just requires a few tweaks that Sage-N customer support can easily guide you through or do remotely. This method is now the default flow for the Sorcerer PE 4.2 release.

The Windows/msconvert method remains available for qualified customers who have the requirement to use its different feature set.

New versions of TPP and Scaffold software

The version of the bundled Trans-Proteomics Pipeline (TPP) software has been updated to the most recent 4.5.2 software, which provides several new enhancements and bug fixes. Also, the most recent version of Scaffold, V3.4, is now supported. Licensed users may obtain this software at the Proteome Software download site.

Other Sorcerer PE V4 enhancements

The new release rolls up other enhancements from earlier V4 releases including:

  • The SEQUEST 3G scoring module with new features to improve the sensitivity and thoroughness of peptide searches.
  • A new Web API for submitting and getting results from Sorcerer searches over the network has been implemented to help developers use Sorcerer as a search engine within their programs and scripts.
  • A component design for the Sorcerer-as-a-platform architecture, co-existing with other life science analysis software
  • Enhancements to the MUSE scripting framework to allow more powerful scripts to customize Sorcerer searching.

Please review an earlier posting for further details of these and other enhancements in Sorcerer PE V4.

 


Release 4.1 is an update to V4.0, which was only released as beta software to a limited number of users, so this release will be the first general release in the Sorcerer PE version 4 series.   The release is currently entering a beta-testing period, following which (probably in late summer), it will be made available to Sorcerer customers with active support arrangements, as well as installed on newly purchased Sorcerer systems.

This release contains enhancements in many different areas of the Sorcerer software:

  • The SEQUEST 3G scoring module has new features to improve the sensitivity and thoroughness of peptide searches.
  • The data flows for Sorcerer processing have been rearchitected to use MS2 and SQT data formats instead of the legacy SEQUEST DTA and OUT file formats.
  • As a solution for the issue of extracting from recent RAW files, an interface has been developed within the Sorcerer software to connect to a separate Windows system and to remotely run ProteoWizard’s new MSConvert extractor with instrument-specific libraries
  • The bundled version of Trans-Proteomic Pipeline software is updated to V4.4.1, which offers multiple enhancements.
  • The new Sorcerer software now supports Scaffold V3.1.2, with new features in TIC quantitation and batch file merging
  • The Scaffold flow has also been reworked on the Sorcerer side, enabling users to identify multiple biosamples for Scaffold in a single search.
  • A new Web API for submitting and getting results from Sorcerer searches over the network has been implemented to help developers use Sorcerer as a search engine within their programs and scripts.
  • This software release has been designed as a component for the Sorcerer-as-a-platform architecture, co-existing with other life science analysis software
  • Enhancements to the MUSE scripting framework to allow more powerful scripts to customize Sorcerer searching.


Mark Your Calendars! Sage-N Research
User Group Meeting ASMS – Denver
June 4th 2011!

The meeting is open to in-warranty Sorcerer customers and by invitation only. Pre-registration is required. A buffet dinner and refreshments are being provided, and there will be a drawing for customer door prizes. We will as usual have an ultra-cool door prize! (But make sure you come on time for the best chance to win!)

As usual, we will have great speakers, and also have training talks on the new SEQUEST 3G and the new SORCERER Proteomics Edition Software.

Date: Saturday Evening, June 4, 2011
Time: 5 PM to 8:30 PM
Address: Sheraton Denver Downtown Hotel,1550 Court Place, Denver, CO 80202, (303) 626-2517

Important Note: We are meeting on Saturday this year!

We have developed a new flow for processing Thermo RAW files that works with the most recent XCalibur V2.1 as well as with earlier versions. This flow has been giving good results in internal testing, and we are now releasing it for beta testing to any interested, actively supported Sorcerer customer.

Thermo LTQ Velos users will have noticed the major changes to the XCalibur software that were introduced at version 2.1. The installation process is different and requires a new component called Thermo Foundation, and some of the file names and locations have changed. These changes make XCalibur 2.1 incompatible with the ReAdW program that Sorcerer uses within the CrossOver environment. One workaround which has been commonly suggested in the Thermo field is to down-rev the XCalibur used on the instrument to V2.0 and to continue using the old software for analysis. This remains a viable option, but with our newly developed solution, it is now also possible to use 2.1 RAW files on Sorcerer.

We are moving to a new spectrum extractor called msconvert (part of the ProteoWizard suite), which works with a different version of the Thermo libraries, and for which we have developed a new integration in the CrossOver environment. We are offering this as a beta release to our in-warranty customers. This solution entails a few Linux operations to reinstall CrossOver with the latest release, to configure the required libraries and to install a new Sorcerer workflow script; it is fairly straightforward for people comfortable with the Linux environment, or alternatively, we can do it for you if you give us remote access to your system. Please contact us at support@sagenresearch.com for more information.



Prof. Josh Elias (left) of Stanford University receives a thank-you gift from David Chiang after his talk.

Ever wondered about target-decoy searching? Want to gain a better understanding and realistic expectation of this effective tool? Sage-N Research’s video “Addressing Peptide Identification Signal-to-noise With Target-Decoy Searching”, given by Professor Josh Elias of Stanford University at our “Translational Proteomics 2.0” meeting, can help. Dr. Elias is an Assistant Professor in Chemical and Systems Biology at Stanford University, and was part of the Steven Gygi Lab at Harvard Medical School before that. His lab is keenly interested in developing and applying methods to meet the current challenges facing scientists engaged in large scale proteome characterization.

Josh kicked off his talk with a stunning and very powerful visual to hit home the concept of what target-decoy database searching can do — you’ll never look at coffee beans in quite the same way. With this talk, you’ll know how to better find a happy medium for thresholds, smarter ways of designing your filtering criteria, when not to even consider using the method, how to get the most out of (really easy) decoy searching in SORCERER, and what’s so good about partial tryptic searches.

The 30-minute presentation is available at: http://www.scivee.tv/node/15544
To view slides, I recommend using the “full screen” mode. The slide set can also be downloaded as a Powerpoint file.



Prof. Alexey Nesvizhskii (left) of University of Michigan receives a thank-you gift from David Chiang after his talk.

If you really want to understand how peptide and protein identification is done, this video talk is a must-see!

Professor Alexey Nesvizhskii of the University of Michigan is one of the co-inventors (with Dr. Andy Keller) of the popular PeptideProphet/ProteinProphet algorithm for turning search engine results into statistically consistent peptide and protein identifications. (This algorithm is also the basis for the popular Scaffold software.)

At the “Translational Proteomics 2.0” meeting, we were privileged to have Alexey give his insightful talk that reviews the various steps involved in inferring peptide and protein identifications from large spectra datasets.

In this talk, you will learn why False Discovery Rates are preferred over P-values, why you probably should not run more than 4 replicates of a MudPIT experiment, how FDR estimations from decoy differ from Peptide/ProteinProphet, how “The Two Prophets” compute probabilities by curve-fitting the score distributions, how sensitivity and FDR are computed, and the what and why of some advanced TPP options.

The talk is available at: http://www.scivee.tv/node/12671 (45 minutes).

I recommend using the “full screen” mode so you can view the slides, which are also available as a download from the site. (Please be aware that the slideset order is different from that in the presentation.)

(Note: Both Trans-Proteomic Pipeline and Scaffold Batch software are integrated into the SORCERER platforms.)


by David.Chiang@SageNResearch.com

Proteomics mass spectrometry is finally sensitive and specific enough for robust translational medicine (at least in capable hands), and holds tremendous promise to revolutionize biology and medicine. For some, it holds the key to incredible research power for decades to come.

However, there is a chasm that continues to grow between the productive and unproductive labs, because too many proteomics practitioners focus too early on low-level issues (e.g. cost, automation, ease-of-use) without first resolving high-level ones (e.g. sensitivity in the presence of noise, quality of results, algorithmic suitability).

For many researchers experimenting with a new high-resolution instrument, the most common scenario is to select a workflow based on running a simple protein solution, usually a purified BSA solution or a commercial protein mixture.

Since different workflows will give basically identical protein ID results for these simple test cases, they may conclude that all search engines are equivalent. While that is true when there is almost no signal noise, it is largely irrelevant in translational research. In fact, the exact same test will likely show that low-resolution and high-resolution mass specs are equivalent, the lowest quality reagents will suffice, or maybe you don’t have to clean your glassware as often. These are also true when there is little or no signal noise, but again, that is irrelevant for real-world research.

Seeing that there is little difference in protein IDs, some focus on using protein coverage as the sole metric for evaluating search engines. However, this is actually the opposite of what is needed for sensitive discovery proteomics. For example, if you are hunting for new protein biomarkers (especially a “one-hit wonder”), you do not want the protein inference engine tuned to assigning any ambiguous peptides to already found proteins, thereby hiding them from further study.

Not surprisingly, a workflow selected based on low-noise experiments and focused on protein coverage will excel for simple mixtures, but is not sensitive enough to analyze complex mixtures with wide dynamic range, such as in translational research. Scientists will be able to see the abundant peptides and proteins, but probably little else. That is roughly what most proteomics researchers find today: nothing meaningful, but enough of the obvious that they see no reason to change their methodologies.

The result is that most labs are not getting the value commensurate with their investments in proteomics mass spectrometry. Under the current economic environment, this is both wasteful and dangerous.

Within the academic world, while many proteomics researchers have trouble getting any interest, a select few are swamped and have to turn away collaborators. Within drug discovery firms, while many are staring at their mostly idle mass spectrometers, a select few are running multiple mass spectrometers 24/7 sieving productively through millions of peptides.

So why is the majority of proteomics research not producing high-value results?

With our access into the world’s top academic and drug discovery proteomics labs, we have a unique bird’s eye view into the answer. (However, like attorneys, we never give out client-specific information.)

Please allow me to share some secrets to your future success.



“Translational Proteomics 2.0” 2009 Users Meeting in Philadelphia.
Guest speakers Jimmy Eng (UWashington), Alexey Nesvizhskii (UMichigan), Josh Elias (Stanford), along with SAB member John Yates (Scripps) are in the middle row.


Stanford’s Dr. Chris Adams (left) must be feeling pretty lucky!
He gets to use a SORCERER 2 for his research (as part of Allis Chien’s mass spec core facility), AND wins an Acer One netbook door prize from David Chiang!

Translational proteomics — aka Proteomics 2.0 — is high-sensitivity proteomics for translational research, whose mastery is your key to unimaginable fame and fortune in biology and medicine!

Whether you need to catch up or to keep up, you need to hear the leading proteomics technologists reveal their secrets!

We were fortunate to have three of the most accomplished technologists (Mr. Jimmy Eng, Prof Josh Elias, and Prof Alexey Nesvizhskii) at our “Translational Proteomics 2.0 Meeting” give their insider insights on high-sensitivity data analysis.

In addition, we were privileged to have Sage-N Research SAB advisor Prof John Yates, one of the fathers of proteomics, attend our meeting and join in our lively panel discussions regarding the present and future of translational proteomics.

From the talks, these are tips for best sensitivity and specificity:

* There are several equivalent ways to calculate precursor mass, all of which can result in several AMUs of mass error due to incorrect isotope assignment.
* Semi-tryptic settings for database searching give the best performance
* Use a wider mass tolerance than your experiments will yield
* However, you don’t need a wide mass tolerance for searching if (a) you use isotope shift check and (b) you have a decent source of noisy peptide, e.g. with semi-enzyme search
* Post-process peptide IDs with proper statistical tools (e.g. PeptideProphet, DTASelect or target-decoy analysis)
* Key is to monitor the false discovery rates (FDR) with different filtering criteria
* Use monoisotopic mass for fragment ions, and for precursor ions if using high-resolution instrument
* P-values or E-values are not good for large-scale proteomics, because they don’t give you estimated data rates for a given score cut-off, and they ignore other relevant factors (e.g. retention time, mass accuracy, etc.)
* The target-decoy method is a simple and effective means of FDR estimation. It gives scores more discriminatory power by improving signal-to-noise ratio.
* Can use search scores in combination with other characteristics to get more good IDs at a particular FDR than by using score alone
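As a rough illustration of the FDR monitoring the speakers recommended, here is a minimal Python sketch of target-decoy filtering. It uses one common estimator (decoy hits divided by target hits at or above a score cut-off); other formulations exist, and this is not necessarily the one used in the talks or in SORCERER. The function names and the scores are made up.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate FDR as decoy hits / target hits among matches scoring >= threshold.

    `psms` is a list of (score, is_decoy) pairs.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_threshold_for(psms, max_fdr):
    """Return the lowest observed score whose estimated FDR is within max_fdr."""
    for threshold in sorted({score for score, _ in psms}):
        if fdr_at_threshold(psms, threshold) <= max_fdr:
            return threshold
    return None


# Made-up peptide-spectrum matches: (search score, matched a decoy sequence?)
psms = [
    (4.0, False), (3.5, False), (3.0, False), (2.5, True),
    (2.0, False), (1.5, True), (1.0, True),
]

fdr_at_2 = fdr_at_threshold(psms, 2.0)       # 1 decoy / 4 targets
cutoff = lowest_threshold_for(psms, 0.25)    # loosest cut-off within 25% FDR
```

Trying several cut-offs this way shows directly how many identifications each filtering criterion buys and at what estimated error rate.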

We will be publishing the meeting talks online. Watch this space for details!


Hear Khatereh discuss her work and her success with the SORCERER 2 system!

Dr. Khatereh Motamedchaboki is currently the Manager of the Proteomics Facility at the Burnham Institute for Medical Research.

She is one of our increasing number of two-time SORCERER success stories, as a previous user at the Ebrahim Zandi Lab at the University of Southern California.

Reference: Laurence M. Brill, Khatereh Motamedchaboki, Shuangding Wu, and Dieter A. Wolf, “Comprehensive proteomic analysis of Schizosaccharomyces pombe by two-dimensional HPLC-tandem mass spectrometry”, Methods (2009), doi:10.1016/j.ymeth.2009.02.023.

Click Here to See Video


Our R&D team is busy working on the next major version of the Sorcerer-PE software, and expects to release it to then-in-warranty customers in the next few weeks.  Early previews and beta tests of some of the components will be made available by arrangement to qualified customer sites.

Highlights of the upcoming release include:

  • ETD fragmentation support and analysis
  • MUSE scripting modules for rescoring peptide matches with Olsen-Mann and Sadygov-Coon scores
  • Interoperation with major components of the Yates lab Sequest suite, including the DTASelect filtering and statistical analysis tool, and the Census quantitation application
  • Enhancements to the SEQUEST engine which provide first-pass cross-correlation scoring and E-values for greater accuracy and sensitivity


Three of the world’s leading experts on MS-MS protein identification came together recently at Sage-N Research’s annual user group meeting, and presented methods and results for the techniques and tools with which they are associated:

  • Jimmy Eng, co-inventor of Sequest and developer of many proteomics tools, presented tips for Sequest analysis
  • Josh Elias, who pioneered the systematic use of decoy databases for FDR estimation, gave a talk on how to use that technique to address Peptide ID signal-to-noise.
  • Alexey Nesvizhskii spoke about the tools he co-authored, in “Peptide identification and protein inference using PeptideProphet and ProteinProphet”

Their talks were very wide-ranging and full of practical insights for the proteomics user community, and they explored the different research interests, data sets, analysis methods and workflows in the individual labs.  However, they all had this in common: they had kept a careful eye on their search settings, monitored sensitivity and error rates, and come to a common, if perhaps not entirely intuitive, conclusion: the most sensitive search and the lowest error rates for shotgun proteomics are achieved when using semi-enzymatic searches — that is, when one end, but not both, of the peptide is allowed to diverge from the expected cleavage site.
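To show what "one end, but not both" means in practice, here is a small Python sketch that enumerates candidate peptides under fully tryptic and semi-tryptic rules. It assumes the usual simplified trypsin rule (cut after K or R unless the next residue is P) and places no limit on missed cleavages. The function names and the toy protein are hypothetical, and real search engines apply further mass and length filters.

```python
def cleavage_sites(protein):
    """Positions where a peptide may start or end: both termini, plus after K/R not followed by P."""
    sites = {0, len(protein)}
    for i, residue in enumerate(protein[:-1]):
        if residue in "KR" and protein[i + 1] != "P":
            sites.add(i + 1)
    return sites


def candidate_peptides(protein, min_len=3, semi=False):
    """Enumerate peptides; with semi=True only one end has to sit on a cleavage site."""
    sites = cleavage_sites(protein)
    found = set()
    for start in range(len(protein)):
        for end in range(start + min_len, len(protein) + 1):
            start_ok, end_ok = start in sites, end in sites
            if (start_ok and end_ok) or (semi and (start_ok or end_ok)):
                found.add(protein[start:end])
    return found


protein = "AAKCCRDD"  # toy sequence
fully_tryptic = candidate_peptides(protein)
semi_tryptic = candidate_peptides(protein, semi=True)
```

The semi-tryptic set contains every fully tryptic peptide plus those with one ragged end, such as "AKCCR" here, which is the extra search space the speakers found worth paying for.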
