FAQ

What is ORCHESTRA?

ORCHESTRA is an EU project, funded to disseminate recent research on computer-based in silico methods for evaluating the toxicity of chemicals.

In silico methods make it possible to test large numbers of chemicals (as required by EU REACH legislation) while reducing the numbers of tests on animals.

The aim of the project is therefore to promote wider understanding, awareness and appropriate use of in silico methods.

Learn more here
 

How can this project/site help me?

If you are not familiar with in silico methods, download the introductory leaflet, study the online course and read the ORCHESTRA e-book “Theory, guidance and applications on QSAR and REACH”.
If you already use in silico / QSAR methods, see the resource links on this site.
To hear discussion of the issues and priorities, watch the documentary video.
If you want to know more, keep an eye on the posted events & activities.
 

What are in silico methods?

In silico methods simply means computer-based methods.  In this context they are computer-based methods for assessing the toxicity of chemicals.  

In silico methods involve using databases of existing experimental data on the toxicity of chemicals.  That data is used to predict the toxicity of chemicals which are similar to those already tested.   The advantage of in silico methods is that they make it possible to assess the toxicity of large numbers of chemicals quickly using a database and software on a desktop computer, whereas assessments were formerly only possible from long and expensive tests in laboratories.  However, the main limitation is in the amount of existing data available; much is held by industry as confidential, and in some areas (such as long term toxicities for mammals) the amount of existing data of high quality is limited.  
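The idea of predicting toxicity from similar, already-tested chemicals can be sketched as a nearest-neighbour lookup. This is only an illustration: the chemicals, fragment labels and toxicity values are invented, and the Tanimoto similarity on fragment sets is one common choice assumed here, not necessarily what any particular tool uses.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fragment sets: shared / total distinct."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(query, database, k=2):
    """Return the k database entries most similar to the query fragments."""
    ranked = sorted(
        database,
        key=lambda entry: tanimoto(query, entry["fragments"]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical database of tested chemicals (all values invented)
DATABASE = [
    {"name": "chemical A", "fragments": {"aromatic ring", "nitro", "chloro"}, "toxicity": 4.0},
    {"name": "chemical B", "fragments": {"aromatic ring", "nitro"}, "toxicity": 3.0},
    {"name": "chemical C", "fragments": {"alkyl chain", "hydroxyl"}, "toxicity": 1.0},
]

# Untested chemical, described by the same kind of fragment labels
query = {"aromatic ring", "nitro", "methyl"}

neighbours = most_similar(query, DATABASE, k=2)
# Estimate: average toxicity of the most similar tested chemicals
estimate = sum(n["toxicity"] for n in neighbours) / len(neighbours)
```

The estimate is only as good as the tested chemicals it draws on, which is why the amount and quality of existing data matters so much.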

In silico methods are widely used (including by regulators in the US) to screen and identify chemicals for priority testing in the laboratory. They are also used in addition to available laboratory data to generate a stronger ‘weight of evidence’.  In silico methods include QSAR models, ‘Read across’ and Virtual Screening.

The term in silico contrasts with in vivo and in vitro.  In vivo methods involve testing chemicals on living organisms, including invertebrates, fish and other animals.  In vitro methods involve test-tube research, where (for example) instead of putting the chemical on the skin or in the eye of an animal, the chemical is put in a test tube with some living skin cells or eye cells taken from an animal.  Both in silico and in vitro methods are known as ‘alternative methods’.  In silico methods make use of existing data from in vivo and in vitro tests.

What are QSAR models?

QSAR models (sometimes called in silico models) are one important and sophisticated example of in silico methods.  Like other in silico methods, they are used to predict the properties of chemicals, including the potential toxicity of chemicals in the body and the environment.  

The rationale for all in silico methods is that there is a relationship between (i) the molecular structure and physical / chemical properties of chemicals, and (ii) their biological effects on plants, animals and the environment.  (This applies to carbon-based chemicals.)  This relationship is the basis for predicting the toxicity of a chemical from a knowledge of its molecular structure and physical/chemical properties.  It is called the ‘structure-activity relationship’ or SAR; it has been written about for around 100 years, and has become far more practical to apply thanks to computer technology and databases of available experimental results.

QSAR models take this innovation a step further.  QSAR stands for ‘Quantitative Structure-Activity Relationship’. Developing a QSAR model involves statistically analysing the existing data for a range of related chemicals, in order to identify exactly which structural/physical/chemical properties of the molecule best correlate with particular measured biological effects on organisms and the environment.  By using quantitative data (e.g. for levels of toxicity), QSAR models can produce quantified assessments of toxicity.  
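As a minimal sketch of the statistical step described here, the following fits a one-descriptor linear model by ordinary least squares. The descriptor (logP), the toxicity values and the function names are all invented for illustration; real QSAR models typically use many descriptors, curated experimental data and more careful statistics.

```python
from statistics import mean


def fit_linear_qsar(descriptor, activity):
    """Least-squares fit of: activity = slope * descriptor + intercept."""
    x_bar, y_bar = mean(descriptor), mean(activity)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(descriptor, activity))
    sxx = sum((x - x_bar) ** 2 for x in descriptor)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Quantified estimate for a chemical with descriptor value x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data: logP values and a made-up toxicity measure
LOGP = [1.0, 2.0, 3.0, 4.0]
TOXICITY = [2.1, 2.9, 4.2, 4.8]

model = fit_linear_qsar(LOGP, TOXICITY)
estimate = predict(model, 2.5)  # estimate for an untested chemical
```

The fitted slope expresses how strongly the chosen property correlates with the measured effect, which is the "quantitative" part of a QSAR.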

New QSAR models, like CAESAR and VEGA, provide the transparency required by regulators and industry. These models provide the user with full details about the basis for an assessment (i.e. why a toxicity assessment is reached) and the reliability or level of uncertainty within the assessment.  The user can see directly how well the model predicts the toxicity of similar chemicals for which there are experimental results. The user is provided with the outputs required by the REACH regulations.  

Why does industry need to know about in silico methods?

The REACH legislation puts the responsibility on industry to provide the necessary toxicity information on each substance which they manufacture, distribute and market, and to assess and manage the risks linked to those substances.  This is the principle of ‘no data, no market’.  Industry is therefore a key stakeholder in influencing the future use of in silico methods.  

For further information, see the page ‘Why use in silico methods?’ on this site.  It outlines why industry needs to use in silico methods, or at least know about them.

Why do toxicology researchers need to know about in silico methods?

Unlike industry working within a regulatory context, toxicology and ecotoxicology researchers have the freedom to select and/or develop QSAR models specifically for their scientific use, and do not need to meet the regulatory demands for transparency.  

In silico methods, and specifically QSAR models, can be used in a range of ways within research.  QSAR models provide statistical evidence of patterns of toxicity across a series of chemicals, and so, for example, can provide an agenda for empirical research into the mechanisms of action by which particular molecular properties generate those observed biological and environmental effects.  

In silico methods can identify structural alerts for toxicity not yet identified by human experts (see for instance Ferrari and Gini, 2011).  In silico methods should therefore not be seen as competing with human experts; that view is an outdated myth.  In reality, modern in silico methods provide valuable support to human experts, helping them to better explain the chemical features related to toxicity.

In silico methods will be central to toxicology in the future.  REACH has created a moment of potential transformation in regulatory toxicology which will impact on the wider research field.  Looking ahead, and beyond Europe, the Tox21 and ToxCast initiatives in the USA are involved in screening thousands of chemicals for toxicity, and so are expected to reshape the toxicological procedures for evaluation of chemicals.  By providing up-to-date data for thousands of chemicals, these major projects will put in silico models in a central role.  Experience and understanding of in silico models will be essential for analysing and making full use of that data in future substance evaluations.

Do in silico methods require expertise, or can anyone use them?

Using QSARs requires expertise, firstly in order to judge which model to use for a particular substance and endpoint, and secondly in order to evaluate the level of reliability in the results.  

Company managers may employ consultants or specialist employees to carry out the evaluation, or may develop the expertise to do it themselves.  We suggest all managers can benefit from a core understanding of what the models can produce, and of what factors will increase or decrease the reliability of a result, in order to be able to use the toxicity evaluation wisely as a basis for decision-making.   (They may also need this understanding if their advisors or consultants are attached to in vivo laboratories and potentially pre-disposed to prefer in vivo tests.)

Expertise and care in practice are vital for the future trajectory of QSAR models.  Now and in the future they may prove to be an increasingly valuable technology, with a potentially important function for protecting human health and the environment.  It is therefore vital to avoid the all-too-familiar trajectory of a developing technology being initially over-stated in terms of its capabilities, and then discredited when it is over-applied or used in an uninformed way.  

It is therefore in the public interest, and in the interests of all professional stakeholders, that QSAR models are explained openly, and are used appropriately and wisely in practice.  That principle underlies the ORCHESTRA project, and is why we offer critical understanding rather than promotion.

Furthermore, ORCHESTRA is devoted to making the evaluation of the results obtained from QSAR models more explicit, in particular through the VEGA platform.

ORCHESTRA is preparing a book on QSAR, which will be made freely available, describing in detail the theory and the practical use of the QSAR models.

Are in silico methods / QSAR models accepted by REACH?

REACH explicitly encourages innovation in toxicity evaluation.  The development of alternative methods is one of the purposes of REACH. 

The legislation sets out conditions specifically for the use of QSAR models, and the European Chemicals Agency (ECHA) offers detailed guidance.  Even the introductory ‘Guidance in a nutshell’ on substance registration advises industry to ‘collect QSAR estimated results for the substance if suitable models are available’ as an initial step.

However, acceptability depends on both the model and on how it is used in practice.  See ‘What makes a good QSAR model?’

Why are in silico methods not yet used widely in REACH?

QSARs have been used for decades in the development of pharmaceuticals, where a drug is to be developed to achieve a particular biological action.  Our interest here is in the reverse use, where the chemical is known, and the biological action is to be predicted.  That too has been the focus of research for decades.  But here we are concerned specifically with their use for evaluating toxicity within the regulatory framework of REACH, and it is early days.  

In practice, the use of in silico methods by European industry within REACH is still limited.  There are three key practical issues that can delay their use.  They are highly inter-connected.

(i) Progress in developing models for regulatory use:  The main limitation on model development is the lack of good available experimental data on which to base models.  The results of the many thousands of past animal experiments carried out or commissioned by industry are held as confidential by industry.   (The regulatory demand for the transparency of QSAR models inevitably requires that they not only reveal the experimental data used to develop the model, but also make available for independent review the details of how the experimental results were achieved.)  In his video interview (see documentary, part 4) Professor John Dearden made a plea to industry to release that data.  

A positive development is that it is now clearer to QSAR developers than ever before, exactly what is expected of QSAR models for regulatory use.  The regulatory framework of REACH, the OECD principles and the ECHA guidelines effectively work together to increase the demands on models in terms of rigour, reliability and transparency.    

(ii) Uptake by industry and consultants: uptake requires awareness of the models available, confidence in their reliability and confidence in regulatory acceptance if they are used.  (See the FAQ: ‘Do in silico methods require expertise, or can anyone use them?’)  The use of in silico methods for REACH also involves a simultaneous shift from simply using in vivo methods as an accepted form of evidence, to using different and complementary sources of evidence.  

(iii) Acceptance in practice by regulators (ECHA and the national competent authorities):  The ultimate responsibility of regulators is to protect human health and the environment, so there is some caution in accepting the use of alternative methods.  The experience of European regulators in using QSARs is also still limited.  

The regulation has made the shift to animal testing being ‘only as a last resort’, but the process requires confidence, shared expertise and communication between these three major stakeholders.  It may be encouraged by wider interest among industry shareholders, the public and policy makers, as they realise that there are alternatives to animal testing.  

Are in silico methods accepted by other chemical regulations?

For decades, QSAR models have been used in the USA to evaluate a series of properties of chemicals.  Indeed, Section 5 of TSCA (Toxic Substances Control Act) requires a manufacturer and/or importer of a new chemical substance to submit a premanufacture notice (PMN) to the US EPA 90 days before commencing the manufacture or import of a new chemical.  Decisions have often been taken without further experimental data.  The US EPA instigated and promoted the development and use of a series of QSAR models to predict properties of interest.

In Denmark, the Danish Environmental Protection Agency has developed and used QSARs for regulatory use.  (See FAQ: ‘Are QSARs expensive or free to use’.)

Some models have been developed for the EU pesticide regulation.  In particular the DEMETRA project developed QSAR models for 5 ecotoxicological endpoints. However, the Pesticide Directive explicitly requires experimental data on pesticides.  These models have therefore been developed for use on related compounds, such as degradation products and impurities, for which the legislation is more flexible.

The EU Cosmetic Directive is an example in the opposite direction, where alternative methods have a central role.  From 2013 no animal experiment should be used to generate data for cosmetics, even though many object that the alternative methods are not powerful enough to fully substitute for animal experiments.  

Further information

1)  Benfenati E, Ed: Quantitative structure-activity relationships (QSAR) for pesticides regulatory purposes. Amsterdam: Elsevier; 2007.

Are QSAR models available for all endpoints?

QSAR models are available and in use for some endpoints, while being limited for others (e.g. long term mammalian toxicity).  There are several reasons, including:

  1. the limited amount of high quality experimental data currently available (and released by industry) for some endpoints,
  2. the higher complexity of the chemical and biological processes involved in generating the toxicity for some endpoints, and
  3. the lower regulatory acceptance for certain endpoints.  

In general, QSAR models are more reliable for physico-chemical properties, because the phenomenon to be modelled is less complex and thousands of experimental results are available in the databases.  For models that predict chronic effects, by contrast, the data are much more limited and the phenomenon is much more complex.

The EC funded ANTARES project is evaluating existing models which could be used for REACH.  Hundreds of possible models have been identified and listed, which theoretically could be used for tens of REACH endpoints. ANTARES is currently checking the performance of the QSAR models for several endpoints.  (See: FAQs: ‘Is there an independent review of QSAR models’ and ‘What makes a good QSAR model’.)

What makes a good QSAR model?

According to REACH regulation (Annex XI) an assessment using a QSAR model is valid if:

  1. the model is recognised as scientifically valid;
  2. the substance is included in the applicability domain of the model;
  3. results are adequate for classification and labelling and for risk assessment;
  4. adequate documentation of the methods is provided.

These four ‘conditions’ are specific to the regulatory context, and address important scientific and practical aspects of the use of in silico models.  It is important to note that the four conditions are not just about the model, but also about the particular chemicals it is used for in an assessment, the particular regulatory function it is used for, and how well the model and its use are documented.  

A QSAR model is built on a set of experimental data for particular chemicals (a ‘training set’) and it is trialled by testing other chemicals for which there is also experimental data available (the ‘test set’).  In this way a QSAR model is designed to evaluate a particular set of chemicals in relation to a particular endpoint.  So while it could be used for other chemicals outside this ‘applicability domain’, the evaluation of those chemicals will have a higher level of uncertainty.  
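A very simple way to picture an applicability domain is as the range of descriptor values covered by the training set. The sketch below flags chemicals that fall outside that range. It is an assumption-laden illustration: the descriptors and values are invented, and real platforms use more refined domain definitions than a plain min/max box.

```python
def descriptor_ranges(training_set):
    """Return the (min, max) of each descriptor across the training set."""
    columns = zip(*training_set)
    return [(min(col), max(col)) for col in columns]


def in_domain(chemical, ranges):
    """True if every descriptor of the chemical lies within the training range."""
    return all(low <= value <= high for value, (low, high) in zip(chemical, ranges))


# Hypothetical training set: (logP, molecular weight) per chemical
TRAINING = [(1.2, 120.0), (2.5, 180.0), (3.1, 250.0), (0.8, 95.0)]

ranges = descriptor_ranges(TRAINING)

# A chemical resembling the training set, and one with a much higher logP
inside = in_domain((2.0, 150.0), ranges)
outside = in_domain((5.5, 150.0), ranges)
```

A prediction for the second chemical can still be calculated, but the check signals that it should be treated with a higher level of uncertainty.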

For this reason, REACH and ECHA do not provide approval for models: each use of QSAR models has to be evaluated on its own merit, on a case-by-case basis.  

The first of the four conditions above is nevertheless about the model itself.  In the implementation of REACH, ECHA and others usefully refer to the five OECD principles for QSAR models.  The OECD stated that ‘to facilitate the consideration of a (Q)SAR model for regulatory purposes, it should be associated with the following information:

  1. a defined endpoint;
  2. an unambiguous algorithm;
  3. a defined domain of applicability;
  4. appropriate measures of goodness-of-fit, robustness and predictivity;
  5. a mechanistic interpretation, if possible.’

These five principles refer to the rigour of the model and the transparency of the information provided by the model developer.  The REACH demand to provide ‘adequate documentation of the models’ and their use, is central to demanding and ensuring the quality of models and the quality of their use, now and in the future.  It is ultimately about enabling the regulator to evaluate the chemical safety assessments independently in the interest of human health and the environment.
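To make the fourth OECD principle more concrete, the sketch below computes two commonly reported statistics for a one-descriptor linear model: R² as a measure of goodness-of-fit, and a leave-one-out Q² as a simple internal check of robustness. The data are invented, the function names are hypothetical, and Q² is defined here relative to the mean of the full data set, which is one of several conventions in use. Predictivity would additionally require an external test set, which is not shown.

```python
from statistics import mean


def fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(xs, ys):
    """Goodness-of-fit: share of variance explained on the training data."""
    slope, intercept = fit(xs, ys)
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out Q2: refit without each chemical, then predict it."""
    y_bar = mean(ys)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        slope, intercept = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - press / ss_tot


# Hypothetical data: descriptor values and a made-up toxicity measure
X = [1.0, 2.0, 3.0, 4.0]
Y = [2.1, 2.9, 4.2, 4.8]

r2 = r_squared(X, Y)
q2 = q_squared_loo(X, Y)
```

A Q² well below R² would suggest that the model depends heavily on individual chemicals in its training set.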

Is there an independent review of QSAR models?

ECHA does not intend to produce a list of ‘approved’ models, because the value of a model depends on how it is used.  (See FAQ: ‘What makes a good QSAR model?’)  Every user of QSARs needs to be aware that QSAR models are only appropriate and reliable for specific sets of chemicals.  A highly reliable model will not produce reliable results for chemicals that lie outside the domain of applicability.  In addition, models may be suitable for different regulatory functions: risk assessment, classification and labelling or prioritisation, because each makes different demands on the model.  

In terms of the form of the outputs, the QSAR Model Reporting Format (QMRF) checks that a certain number of pieces of information are given.  However, such an assessment is carried out on the basis of the values provided by the developer of the model, and these are not checked independently.

The EC funded ANTARES project is evaluating existing models which could be used for REACH. Hundreds of possible models have been identified and listed, which theoretically could be used for tens of REACH endpoints. ANTARES is currently checking the performance of the QSAR models for several endpoints: carcinogenicity, mutagenicity, LD50, fish and daphnia acute toxicity, bioconcentration factor, ready biodegradability and logP.

Given the range of endpoints, the vast numbers of chemicals, the large palette of mathematical algorithms available, and the potential to use tens of thousands of chemical fragments and thousands of chemical descriptors to build a predictive model, it is clearly possible for future scientists to generate huge numbers of QSAR models.  It is easy to imagine an explosion in the number of useable models, many with quite similar performance.  

Further information

1) ANTARES project.  www.antares-life.eu

Are QSAR models expensive, or free to use?

The CAESAR and VEGA platforms have been produced from EC-funded research, and are therefore freely available for use.  The software can be downloaded. Furthermore, predicted values for more than four million chemicals will be made freely available.

The Danish Environmental Protection Agency has created and made available a database with predictions from more than 70 (Q)SAR models on endpoints for physico-chemical properties, fate, eco-toxicity, absorption, metabolism and toxicity.  More than half of all the estimates are for mammalian (human) toxicity endpoints and include commercial data sets from TOPKAT and MULTICASE as well as models developed in-house.  A structure set of about 166,000 discrete organic chemicals, including almost half of the (around) 100,000 EINECS substances, has been batched through the models, and the results are integrated in the database.  (EINECS: European INventory of Existing Commercial chemical Substances.)

The US Environmental Protection Agency made available a series of QSAR models, within the EPISuite and the T.E.S.T. platforms.

The EC funded ANTARES project is evaluating existing models which could be used for REACH, many of which are free to use.  Hundreds of possible models have been identified and listed, which theoretically could be used for tens of REACH endpoints.

The EU Joint Research Centre has promoted the availability of reliable computer-based estimation methods for use in the regulatory assessment of chemicals.  Within the JRC Institute for Health and Consumer Protection (IHCP), the European Chemicals Bureau (ECB) developed a range of freely available software.  One of the aims of the IHCP is to support the implementation of EU chemicals policy (including the safety assessment of industrial chemicals, chemicals in consumer products, pesticides and biocides) through the development, assessment and application of computational (in silico) methods. 

Typically in such QSAR models, the training set used to develop the model, the algorithm, and other aspects of the model are open for scrutiny by the regulator.   However, in commercially available software the full documentation of each model may not be available.  Indeed, the algorithm and the training set used to build the model are often confidential.  Despite this, the use of commercial programs has not been explicitly criticised by regulatory authorities.  In reality, commercial programs have been used in the USA and Europe, because of their ease of use, their availability and because they represent a major source of QSAR models.  We do not foresee restrictions in their use, and in our opinion it would be a pity to renounce these models.  However, it is likely that in the case of two similar models being available, one commercial and one freely available, the user will prefer the second one.  Within REACH, it also seems likely that the regulators will demand more transparency from the commercial models, and/or prefer the use of fully documented models.

How can I access QSAR platforms for our use?

Several platforms of QSAR models exist. The ANTARES project lists hundreds of QSAR models, tens of which are freely available.

Some well known platforms are:

EPISuite  www.epa.gov/oppt/exposure/pubs/episuite.htm

T.E.S.T.  www.epa.gov/nrmrl/std/qsar/qsar.html#TEST

CAESAR  www.caesar-project.eu/

JRC http://ihcp.jrc.ec.europa.eu/our_labs/computational_toxicology

VEGA:  The ORCHESTRA project, with ANTARES, has developed the VEGA platform, which incorporates CAESAR and T.E.S.T. models into a single framework. An advantage of the VEGA platform is the facilitated and supported user-access: supporting information, tutorials and videos are all available on the web site.  It has been developed from the point of view of the user, and of the REACH requirements. Each model produces not only a predicted value, but also many pages of explanation and assessment of its reliability.
