XLSTAT-Dose is a statistical analysis add-in for MS Excel, complementary to XLSTAT-Pro, developed for dose analysis in the chemical and pharmaceutical industries. Its main features are dose effect analysis with a wide choice of models (Logit, Probit, Gompertz, Log-log) and four-parameter logistic regression, which fits models of the form a+(d-a)/(1+(x/c)^b).

# Dose effect analysis

### What is dose effect analysis

Logistic regression (Logit, Probit, complementary Log-log, Gompertz models) is used to model the impact of doses of chemical components (for example a medicine or phytosanitary product) on a binary phenomenon (healing, death).

### Natural mortality in dose effect analysis

Natural mortality should be taken into account in order to model the phenomenon studied more accurately. For example, in an experiment carried out on insects, some will die because of the dose injected and others from unrelated causes. These unrelated causes are not what the experiment is measuring, but they must be accounted for. If p is the probability given by a logistic regression model that reflects only the effect of the dose, and m is the natural mortality, then the observed probability that an insect will succumb is:

P(obs) = m + (1- m) * p

Rearranging for p gives Abbott's formula (Finney, 1971):

p = (P(obs) - m) / (1 - m)

The natural mortality m may be entered by the user if it is known from previous experiments, or it can be determined by XLSTAT.
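
The two relations above can be checked numerically. The following is a minimal sketch; the function names and the numbers are hypothetical and are not part of XLSTAT.

```python
def observed_probability(p: float, m: float) -> float:
    """Observed response probability, given the dose-only probability p
    and the natural mortality m: P(obs) = m + (1 - m) * p."""
    return m + (1.0 - m) * p


def abbott_correction(p_obs: float, m: float) -> float:
    """Abbott's formula: recover the dose-only probability p from P(obs)."""
    if not 0.0 <= m < 1.0:
        raise ValueError("natural mortality m must lie in [0, 1)")
    return (p_obs - m) / (1.0 - m)


# Hypothetical experiment: 10% natural mortality, 46% observed mortality.
p = abbott_correction(0.46, 0.10)
print(f"dose-only probability: {p:.3f}")
print(f"round trip back to P(obs): {observed_probability(p, 0.10):.3f}")
```

The guard on m reflects that the correction is undefined when every subject would die without any dose (m = 1).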

### Compute effective doses from ED01 to ED99, including ED50 and ED90

XLSTAT calculates the ED50 (or median dose), ED90 and ED99, the doses that produce an effect in 50%, 90% and 99% of the population, respectively.
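
To illustrate how an effective dose follows from a fitted model, here is a sketch for the Logit case. It assumes the linear predictor is logit(p) = intercept + slope * log10(dose) and ignores natural mortality. That parameterization, the function name and the coefficients are assumptions made for the example, not a description of XLSTAT's internals.

```python
import math


def effective_dose(level: float, intercept: float, slope: float) -> float:
    """Dose at which the fitted logit model predicts probability `level`.

    Assumes logit(p) = intercept + slope * log10(dose) (illustrative only).
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    logit = math.log(level / (1.0 - level))
    return 10.0 ** ((logit - intercept) / slope)


# Hypothetical fitted coefficients.
intercept, slope = -2.0, 2.0
for level in (0.50, 0.90, 0.99):
    dose = effective_dose(level, intercept, slope)
    print(f"ED{round(level * 100)}: {dose:.3f}")
```

With a positive slope the effective dose grows with the target level, so ED50 < ED90 < ED99.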

### Results for dose effect analysis in XLSTAT

• Goodness of fit coefficients: This table displays a series of statistics for the independent model (corresponding to the case where the linear combination of explanatory variables reduces to a constant) and for the adjusted model:
  • Observations: The total number of observations taken into account (sum of the weights of the observations);
  • Sum of weights: The total number of observations taken into account (sum of the weights of the observations multiplied by the weights in the regression);
  • DF: Degrees of freedom;
  • -2 Log(Like.): Minus two times the logarithm of the likelihood function associated with the model;
  • R² (McFadden): Coefficient, like the R², between 0 and 1 which measures how well the model is adjusted. This coefficient is equal to 1 minus the ratio of the log-likelihood of the adjusted model to the log-likelihood of the independent model;
  • R² (Cox and Snell): Coefficient, like the R², between 0 and 1 which measures how well the model is adjusted. This coefficient is equal to 1 minus the ratio of the likelihood of the independent model to the likelihood of the adjusted model, raised to the power 2/Sw, where Sw is the sum of weights;
  • R² (Nagelkerke): Coefficient, like the R², between 0 and 1 which measures how well the model is adjusted. This coefficient is equal to the R² of Cox and Snell divided by 1 minus the likelihood of the independent model raised to the power 2/Sw;
  • AIC: Akaike’s Information Criterion;
  • SBC: Schwarz’s Bayesian Criterion.
• Test of the null hypothesis H0: Y=p0: The H0 hypothesis corresponds to the independent model which gives probability p0 whatever the values of the explanatory variables. We seek to check if the adjusted model is significantly more powerful than this model. Three tests are available: the likelihood ratio test (-2 Log(Like.)), the Score test and the Wald test. The three statistics follow a Chi2 distribution whose degrees of freedom are shown.
• Type III analysis: This table is only useful if there is more than one explanatory variable. Here, the adjusted model is tested against a test model where the variable in the row of the table in question has been removed. If the probability Pr > LR is less than a significance threshold which has been set (typically 0.05), then the contribution of the variable to the adjustment of the model is significant. Otherwise, it can be removed from the model.
• Model parameters: The parameter estimate, corresponding standard deviation, Wald's Chi2, the corresponding p-value and the confidence interval are displayed for the constant and each variable of the model. If the corresponding option has been activated, the "profile likelihood" intervals are also displayed.
• Model equation: The equation of the model is then displayed to make it easier to read or re-use the model.
• Standardized coefficients table: The table of standardized coefficients (also called beta coefficients) is used to compare the relative weights of the variables. The higher the absolute value of a coefficient, the greater the weight of the corresponding variable. When the confidence interval around a standardized coefficient includes 0 (this can easily be seen on the chart of standardized coefficients), the weight of the variable in the model is not significant.
• Predictions and residuals table: The predictions and residuals table shows, for each observation, its weight, the value of the qualitative explanatory variable, if there is only one, the observed value of the dependent variable, the model's prediction, the same values divided by the weights, the standardized residuals and a confidence interval.
• Probability analysis table: If only one quantitative variable has been selected, the probability analysis table shows which value of the explanatory variable corresponds to a given probability of success.
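
The pseudo-R² and information criteria listed above can be computed from two log-likelihoods. The sketch below uses the standard textbook definitions expressed in log-likelihoods; the function name, argument names and example numbers are hypothetical, and XLSTAT's exact computations may differ in detail.

```python
import math


def fit_statistics(loglik_model: float, loglik_null: float,
                   n_params: int, sum_weights: float) -> dict:
    """Pseudo-R2 values, AIC and SBC from the log-likelihoods of the
    adjusted model and of the independent (constant-only) model."""
    mcfadden = 1.0 - loglik_model / loglik_null
    # (L0 / L1) ** (2 / Sw), written with log-likelihoods for stability.
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / sum_weights)
    # Rescale Cox and Snell by its maximum attainable value.
    nagelkerke = cox_snell / (1.0 - math.exp(2.0 * loglik_null / sum_weights))
    aic = -2.0 * loglik_model + 2.0 * n_params
    sbc = -2.0 * loglik_model + n_params * math.log(sum_weights)
    return {
        "-2 Log(Like.)": -2.0 * loglik_model,
        "R2 McFadden": mcfadden,
        "R2 Cox and Snell": cox_snell,
        "R2 Nagelkerke": nagelkerke,
        "AIC": aic,
        "SBC": sbc,
    }


# Hypothetical values: 100 observations, intercept plus one dose variable.
for name, value in fit_statistics(-50.0, -69.3, 2, 100.0).items():
    print(f"{name}: {value:.4f}")
```

Working with log-likelihoods rather than raw likelihoods avoids underflow, since likelihoods of even moderately sized samples are extremely small numbers.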

# Four/Five-parameter parallel lines logistic regression

### What is four/five-parameter parallel lines logistic regression?

The four-parameter logistic model is written:

y = a + (d - a) / [1 + (x / c)^b]    model (1.1)

where a, b, c and d are the parameters of the model, x is the explanatory variable and y is the response variable. a and d are the parameters that represent the lower and upper asymptotes, respectively, and b is the slope parameter. c is the abscissa of the mid-height point, whose ordinate is (a+d)/2. With b positive, when a is lower than d the curve decreases from d to a, and when a is greater than d the curve increases from d to a.

The five-parameter logistic model is written:

y = a + (d - a) / [1 + (x / c)^b]^e    model (1.2)

where e is an additional parameter, the asymmetry factor.
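
Models 1.1 and 1.2 translate directly into code. The following sketch uses hypothetical function names and parameter values. It also shows that at x = c the four-parameter curve sits midway between its asymptotes, and that the five-parameter model reduces to the four-parameter one when e = 1.

```python
def logistic_4p(x: float, a: float, b: float, c: float, d: float) -> float:
    """Four-parameter logistic model (1.1)."""
    return a + (d - a) / (1.0 + (x / c) ** b)


def logistic_5p(x: float, a: float, b: float, c: float, d: float,
                e: float) -> float:
    """Five-parameter logistic model (1.2); e is the asymmetry factor."""
    return a + (d - a) / (1.0 + (x / c) ** b) ** e


# Hypothetical parameters: asymptotes 1 and 10, slope 2, mid-point at x = 5.
a, b, c, d = 1.0, 2.0, 5.0, 10.0
print(logistic_4p(0.0, a, b, c, d))        # value at x = 0
print(logistic_4p(c, a, b, c, d))          # value at the mid-height point
print(logistic_5p(c, a, b, c, d, e=1.0))   # same curve when e = 1
```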

The four-parameter parallel lines logistic model is written:

y = a + (d - a) / [1 + (s0 * x / c0 + s1 * x / c1)^b]    model (2.1)

where s0 is 1 if the observation comes from the standard sample and 0 otherwise, and s1 is 1 if the observation comes from the sample of interest and 0 otherwise. This is a constrained model because the observations from the standard sample influence the optimization of the values of a, b and d. Written this way, the model generates two parallel curves whose only difference is their horizontal position, the shift being given by (c1-c0). If c1 is greater than c0, the curve for the sample of interest is shifted to the right of the curve for the standard sample, and vice versa.

The five-parameter parallel lines logistic model is written:

y = a + (d - a) / [1 + (s0 * x / c0 + s1 * x / c1)^b]^e    model (2.2)
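
The parallel lines models can be sketched in the same way. In the hypothetical function below, a boolean flag plays the role of the two indicator variables, and e defaults to 1 so that the same code covers models 2.1 and 2.2. Names and numbers are illustrative only.

```python
def parallel_logistic(x: float, is_standard: bool, a: float, b: float,
                      c0: float, c1: float, d: float, e: float = 1.0) -> float:
    """Parallel lines logistic model (2.1 when e = 1, 2.2 otherwise).

    Both samples share a, b, d and e; only the mid-point (c0 or c1) differs.
    """
    s0 = 1.0 if is_standard else 0.0   # indicator of the standard sample
    s1 = 1.0 - s0                      # indicator of the sample of interest
    ratio = s0 * x / c0 + s1 * x / c1
    return a + (d - a) / (1.0 + ratio ** b) ** e


# Hypothetical parameters: the sample of interest is shifted to the right.
params = dict(a=1.0, b=2.0, c0=5.0, c1=8.0, d=10.0)
print(parallel_logistic(5.0, True, **params))    # standard sample at x = c0
print(parallel_logistic(8.0, False, **params))   # sample of interest at x = c1
```

Each curve reaches the same mid-height value at its own mid-point, which is what makes the two curves parallel.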

XLSTAT can fit:
•    model 1.1 or 1.2 to a standard sample or to the sample of interest,
•    model 2.1 or 2.2 to the standard sample and to the sample of interest at the same time.

XLSTAT can either fit model 1.1 or 1.2 to a given sample (case A), or fit model 1.1 or 1.2 to the standard (0) sample and then fit model 2.1 or 2.2 to both the standard sample and the sample of interest (case B).
If the Dixon’s test option is activated, XLSTAT tests, for each sample, whether outliers have too much influence on the fit of the model. In case A, a Dixon’s test is performed once model 1.1 or 1.2 has been fitted. If an outlier is detected, it is removed and the model is fitted again, and so on until no outlier is detected. In case B, a Dixon’s test is first performed on the standard sample, then on the sample of interest, and model 2.1 or 2.2 is then fitted to the merged samples without the outliers.
In case B, if the sum of the sample sizes is greater than 9, a Fisher’s F test is performed to check whether the a, b, d and e parameters obtained with model 1.1 or 1.2 differ significantly from those obtained with model 2.1 or 2.2.
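
The text does not give the exact F statistic that XLSTAT computes. As a rough illustration only, the generic extra-sum-of-squares F statistic for comparing a constrained least-squares fit with a less constrained one can be sketched as follows. The function name, its arguments and the numbers are hypothetical. The p-value, which comes from an F distribution with the two degrees of freedom shown, is left out because it needs a statistics library.

```python
def extra_sum_of_squares_f(sse_constrained: float, df_constrained: int,
                           sse_free: float, df_free: int) -> tuple:
    """Generic F statistic for two nested least-squares fits.

    Returns (F, numerator DF, denominator DF). A large F suggests that the
    constraints degrade the fit significantly.
    """
    if df_constrained <= df_free:
        raise ValueError("the constrained fit must have more degrees of freedom")
    if sse_free <= 0.0:
        raise ValueError("sse_free must be positive")
    df_num = df_constrained - df_free
    f_value = ((sse_constrained - sse_free) / df_num) / (sse_free / df_free)
    return f_value, df_num, df_free


# Hypothetical sums of squared errors and residual degrees of freedom.
f_value, df_num, df_den = extra_sum_of_squares_f(12.0, 15, 9.0, 12)
print(f"F = {f_value:.3f} with ({df_num}, {df_den}) degrees of freedom")
```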

### Results displayed by XLSTAT

If no group or a single sample was selected, the results are shown for the model and for this sample. If several sub-samples were defined (see sub-samples option in the dialog), the model is first adjusted to the standard sample, then each sub-sample is compared to the standard sample.

Fisher's test assessing parallelism between curves: The Fisher’s F test is used to determine whether the models corresponding to the standard sample and to the sample of interest are significantly different. If the probability corresponding to the F value is lower than the significance level, the difference can be considered significant.
Goodness of fit coefficients: This table shows the following statistics:
•    The number of observations;
•    The number of degrees of freedom (DF);
•    The determination coefficient R²;
•    The sum of squares of the errors (or residuals) of the model (SSE or SSR respectively);
•    The means of the squares of the errors (or residuals) of the model (MSE or MSR);
•    The root mean squares of the errors (or residuals) of the model (RMSE or RMSR).
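
These statistics can be reproduced from the observed and predicted values. The sketch below assumes the usual least-squares conventions: DF is the number of observations minus the number of fitted parameters, and MSE is SSE divided by DF. The function name and the data are hypothetical.

```python
import math


def goodness_of_fit(observed: list, predicted: list, n_params: int) -> dict:
    """Observations, DF, R2, SSE, MSE and RMSE for a least-squares fit."""
    n = len(observed)
    df = n - n_params
    if df <= 0:
        raise ValueError("need more observations than parameters")
    sse = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    mean_y = sum(observed) / n
    sst = sum((y - mean_y) ** 2 for y in observed)  # total sum of squares
    mse = sse / df
    return {
        "Observations": n,
        "DF": df,
        "R2": 1.0 - sse / sst,
        "SSE": sse,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
    }


# Hypothetical observed responses and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
for name, value in goodness_of_fit(observed, predicted, n_params=2).items():
    print(f"{name}: {value}")
```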

Model parameters: This table displays the estimator and the standard error of the estimator for each parameter of the model. It is followed by the equation of the model.
Predictions and residuals: This table gives, for each observation, the input data and the corresponding prediction and residual. The outliers detected by the Dixon’s test, if any, are displayed in bold.

Charts: The first chart displays the data and the curve for the standard sample in blue, and the data and the curve for the sample of interest in red. A chart comparing predictions with observed values and a bar chart of the residuals are also displayed.

## Benefits

• Easy and user-friendly
• Data and results shared seamlessly
• Modular
• Didactic
• Affordable
• Accessible - Available in many languages
• Automatable and customizable

## System configuration

• Windows:
  • Versions: 9x/Me/NT/2000/XP/Vista/Win 7/Win 8
  • Excel: 97 and later
  • Processor: 32 or 64 bits
  • Hard disk: 150 MB
• Mac OS X:
  • OS: OS X
  • Excel: X, 2004 and 2011
  • Hard disk: 150 MB