
Use of Trellis Graphics in the Analysis of Results from Field Experiments in Agriculture

Katarina Čobanović, Emilija Nikolić-Đorić, and Beba Mutavdžić¹

Abstract

Trellis graphics (Becker, Cleveland, and Shyu, 1996) is a very effective method for visualizing multidimensional data sets. The basic idea behind trellis graphics is to display any of a large variety of 1-D, 2-D or 3-D statistical plot types in a trellis layout of panels, where each panel displays a subset of the data for different values of one or more additional discrete or continuous conditioning variables.

The data that we use to illustrate different applications of trellis graphics are the results of a field experiment conducted at the Institute for Field and Vegetable Crops in Novi Sad in the period 1994-1998 (Čobanović et al., 2001) with three fertilizers (nitrogen, phosphorus and potassium), in three replications, with nine varieties of wheat. In the experiment, four quantities of each fertilizer (0, 50, 100, 150 kg/ha) were applied to plots of the same size in 20 of the 64 possible combinations, and the yield of wheat (t/ha) was the measured outcome.

1 Introduction

Modern computer technology has changed the way statistical analysis and summary are done. In particular, graphical methods of analysis and summary now play a more important role and deserve increased emphasis in scientific reports (Cleveland, 1993).

Statistical graphics generally have two major functions: analysis and presentation, the latter traditionally being their primary function. With the development of computer technology and the intensive use of statistical software, the analysis function of statistical graphics has assumed increasing importance (Schmid, 1983). Statistical graphics have recently become a very attractive means of visual communication and analysis.

1 Faculty of Agriculture, University of Novi Sad, 21000 Novi Sad, Trg Dositeja Obradovića 8, Serbia; katcob@polj.ns.ac.yu


Graphical methods have a central role in Exploratory Data Analysis (EDA), the approach to data analysis introduced by John Tukey (1977). Because the human brain is very powerful in processing visual information, a single glance at a graph is sometimes enough to identify a very complex structure in the data and the relations between the variables. Cleveland and his associates at Bell Laboratories did a lot of research on the theory of graphical perception, focusing on the psychophysical aspects of human graphical processing.

Appropriately drawn graphs help in understanding the data and in presenting the results to others. Improperly drawn graphs, on the other hand, can lead to wrong conclusions (Tufte, 2001).

Statistical graphics are used both in teaching statistics and in research (Nikolić-Đorić et al., 2006).

Graphical data analysis is the first step in a great many statistical investigations.

Trellis graphics were introduced by Becker, Cleveland, and Shyu (1996).

The basic idea behind trellis graphics is to display any of a large variety of 1-D, 2-D or 3-D statistical plot types in a trellis layout of panels, where each panel displays a subset of the data for different values of one or more additional discrete or continuous conditioning variables.

The name comes from the arrangement of the plots, which looks like a trellis (grid, lattice). In gardening, a trellis is an open structure used as a support for vines.

One important application of a trellis display is uncovering the structure of multivariate data and the relations among the variables (Becker, Cleveland, and Shyu, 1996). A trellis display can reveal important features that were not found in the initial analysis. Because all conditioned panels are drawn on the same scale, it is possible to explore not only whether a relationship between the variables exists, but also whether it holds for all levels of the conditioning variable.

Although the trellis was developed initially in the context of large data sets, it is also useful for modelling the data from the designed experiments, even small experiments, and it is a very powerful tool for revealing the structure of interactions in the studies of how a response depends on explanatory variables (Cleveland and Fuentes, 1997).

One of the first applications of the trellis display was to the yields of ten varieties of barley from a randomized block experiment carried out at six locations in the State of Minnesota in 1930 and 1931 (Fisher, 1966). The trellis display led to the conclusion that the data contain an error.


2 Software for Trellis

The trellis display was implemented in the S/S-Plus system (Becker and Cleveland, 1996).

A special package, "lattice", was developed for producing trellis graphs in the R language.

The trellis display was also implemented in the statistical packages GENSTAT 8 (VSN International, 2005) and STATISTICA (StatSoft, Inc.), where it is called a categorized graph, a term first used in 1990. The S-Plus Clinical Pack, developed by Insightful Corporation, enables easy integration of the powerful S-Plus graphics within the SAS environment.

2.1 Layout of a trellis display

A trellis display consists of panels laid out in a three-way rectangular array of columns, rows and pages. For small data sets one page is usually enough. For larger data sets, multi-page layouts are necessary to present the whole data set. By default, the panels of a trellis display are ordered left-to-right and bottom-to-top, but the user may change this ordering. Table ordering, for example, is left-to-right and top-to-bottom. The strip labels written at the tops of the panels indicate the conditioning (slicing) variable and its ranges, values or levels, depending on whether it is a numerical (continuous, discrete) or categorical variable.
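The two orderings can be made concrete with a small sketch. It is written in Python rather than S-Plus or R, and the function name `panel_position`, its parameters and the 4-column by 5-row page are illustrative assumptions, not part of any trellis library.

```python
def panel_position(index, n_cols, n_rows, table_order=False):
    """Map a 0-based panel index to (page, row_from_top, column), all 0-based.

    Default ordering fills a page left-to-right and bottom-to-top;
    table ordering fills it left-to-right and top-to-bottom.
    """
    per_page = n_cols * n_rows
    page, within = divmod(index, per_page)
    fill_row, col = divmod(within, n_cols)
    row_from_top = fill_row if table_order else n_rows - 1 - fill_row
    return page, row_from_top, col


# Hypothetical page of 4 columns by 5 rows, e.g. one panel per fertilizer variant.
print(panel_position(0, n_cols=4, n_rows=5))                    # bottom-left corner
print(panel_position(0, n_cols=4, n_rows=5, table_order=True))  # top-left corner
print(panel_position(20, n_cols=4, n_rows=5))                   # first panel of page 2
```

With 20 panels per page, the twenty-first panel starts a new page, which is the three-way (column, row, page) array described above.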

A trellis display may be created with a single command line. It is based on repeating the same graphical specification for each element of a Cartesian product of the levels of one or more factors. The formulas in the S-Plus and R trellis libraries have the structure:

    Y ~ X | a * b

where Y is a continuous variable, X is a continuous variable or a factor, and a and b are conditioning factors, variables or functions of the fitted model. A scatter-plot matrix also consists of panels defined by a Cartesian product of variables. Each of its panels plots a different pair of variables, but it is based on the entire set of observations and not on a subset, as in a trellis display. In S-Plus and R it is also possible to present a scatter-plot matrix conditioned on the values of a relevant variable, i.e. in trellis form.
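The conditioning step, one panel for each element of the Cartesian product of the factor levels, can be sketched as follows. This is a minimal Python illustration, not the S-Plus or R implementation. The function name `trellis_panels` is hypothetical and the five records are made-up numbers, not data from the experiment.

```python
from itertools import product

# Made-up records: (yield in t/ha, nitrogen in kg/ha, variety, year).
records = [
    (4.1, 0, "LASTA", "1994/5"),
    (6.3, 100, "LASTA", "1994/5"),
    (3.2, 0, "LASTA", "1996/7"),
    (5.8, 100, "POBEDA", "1994/5"),
    (4.9, 100, "POBEDA", "1996/7"),
]


def trellis_panels(rows, levels_a, levels_b):
    """Split (x, y) pairs into one panel per combination of levels of a and b."""
    panels = {key: [] for key in product(levels_a, levels_b)}
    for y, x, a, b in rows:
        panels[(a, b)].append((x, y))
    return panels


varieties = sorted({r[2] for r in records})
years = sorted({r[3] for r in records})
panels = trellis_panels(records, varieties, years)

for key, points in panels.items():
    print(key, points)
```

Each panel holds only its own subset of the observations, which is what distinguishes a trellis display from a scatter-plot matrix.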

The very flexible object-oriented S-Plus and R languages give the user control over the display so that it presents as much information about the data as possible. The aspect ratio, multi-panel layout, plotting symbols, lines, colours and character sizes can all be set.


3 Experimental data

The data that we use to illustrate different applications of trellis graphics are the results of a field experiment conducted at the Institute for Field and Vegetable Crops in Novi Sad in the period 1994-1998 (Čobanović et al., 2001) with three fertilizers (nitrogen, phosphorus and potassium), in three replications, with nine varieties of wheat. In the experiment, four quantities of each fertilizer (0, 50, 100, 150 kg/ha) were applied to plots of the same size in 20 of the 64 possible combinations (Table 1), and the yield of wheat (t/ha) was the measured outcome.

Table 1: Combinations of fertilizer applied in the experiment (kg/ha).

Variant     N     P     K
   1        0     0     0
   2      100     0     0
   3        0   100     0
   4        0     0   100
   5      100   100     0
   6      100     0   100
   7        0   100   100
   8       50    50    50
   9       50   100    50
  10       50   100   100
  11      100    50    50
  12      100   100    50
  13      100   100   100
  14      100   150   100
  15      100   150   150
  16      150    50    50
  17      150   100    50
  18      150   100   100
  19      150   150   100
  20      150   150   150
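The design can be checked with a few lines of Python. The sketch below assumes only the four levels and the 20 variants listed in Table 1. It confirms that the full factorial has 4 x 4 x 4 = 64 combinations and counts how many variants fall at each nitrogen level, which matters for displays conditioned on nitrogen. The variable names are illustrative.

```python
from collections import Counter
from itertools import product

levels = (0, 50, 100, 150)  # kg/ha, the four quantities of each fertilizer
full_factorial = set(product(levels, repeat=3))  # all (N, P, K) triples

# The 20 (N, P, K) variants of Table 1, in table order.
variants = [
    (0, 0, 0), (100, 0, 0), (0, 100, 0), (0, 0, 100), (100, 100, 0),
    (100, 0, 100), (0, 100, 100), (50, 50, 50), (50, 100, 50), (50, 100, 100),
    (100, 50, 50), (100, 100, 50), (100, 100, 100), (100, 150, 100),
    (100, 150, 150), (150, 50, 50), (150, 100, 50), (150, 100, 100),
    (150, 150, 100), (150, 150, 150),
]

# Number of variants applied at each nitrogen level.
per_nitrogen = Counter(n for n, _, _ in variants)

print(len(full_factorial), len(variants))
print(sorted(per_nitrogen.items()))
```

The nitrogen levels are unequally represented (for example, eight variants at 100 kg/ha and three at 50 kg/ha), so panels conditioned on nitrogen are based on different numbers of observations.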

4 Discussion

Several univariate, bivariate and higher-dimensional trellis displays were applied in the analysis of the results of the field experiment. For the same type of trellis display, the data were also partitioned in various ways in order to focus on different features of the data.

In order to explore the characteristics of the data distribution, one-dimensional trellis displays were applied: trellis dot-plots (Figure 1), box-plots (Figure 2), histograms with probability density plots (Figure 3), and quantile plots (Figure 4). With these diagrams the subpopulation structure, the distribution shape and the presence of outliers may be quickly revealed. Figure 1 illustrates that the wheat yield was highest in 1994/5 and lowest in 1996/7, regardless of the variety and the combination of fertilizers.

Trellis notched box-plots (Figure 2) show that Pobeda had the highest median wheat yield, while the yield of the variety Lasta had the highest variability. Each notch marks an approximate 95% confidence interval for the median, extending to $\text{median} \pm 1.58\,\mathrm{IQR}/\sqrt{n}$, where IQR is the interquartile range and $n$ is the number of observations, so the notches make multiple comparisons of the median yields possible. Notches that do not overlap indicate a significant difference between the medians.
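The notch rule can be illustrated numerically. The following Python sketch computes the interval median ± 1.58 IQR/√n and checks whether two notches overlap. The function names, the choice of the "inclusive" quartile method and the two samples are assumptions made for illustration. Statistical packages may compute quartiles slightly differently, and the samples are not data from the experiment.

```python
import math
import statistics


def notch_interval(values):
    """Approximate 95% interval for the median: median +/- 1.58 * IQR / sqrt(n)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.58 * (q3 - q1) / math.sqrt(len(values))
    med = statistics.median(values)
    return med - half_width, med + half_width


def notches_overlap(first, second):
    """True when two (low, high) notch intervals share at least one point."""
    return first[0] <= second[1] and second[0] <= first[1]


# Two made-up yield samples (t/ha).
sample_a = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
sample_b = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]

notch_a = notch_interval(sample_a)
notch_b = notch_interval(sample_b)
print(notch_a, notch_b, notches_overlap(notch_a, notch_b))
```

For these samples the notches do not overlap, which under the rule above would be read as a significant difference between the two medians.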

Figure 1: The yield against variety and year given the variant of fertilizer.
[Panels: fertilizer variants 1 to 20. Axes: yield (t/ha), 2 to 10, and variety (BALKAN, EVROPA, ITALIJA, LASTA, NOVRANA, POBEDA, PROTEINKA, RANANISKA, ZVEZDA). Legend: years 1994/5, 1995/6, 1996/7, 1997/8.]


The trellis histograms with probability density plots (Figure 3) show that the average wheat yield depends strongly on the level of nitrogen. The deviation from the normal distribution is greatest at the zero level of nitrogen (Figure 4).

In order to explore the presence of interactions, series of box-plots conditioned on the values of one or two factors were applied. If the two predictors are not of equal importance, the exposure should be taken as the panel variable and the confounder as the conditioning variable.

Figure 2: Trellis notched box-plots.
[Panels: varieties BALKAN, EVROPA, ITALIJA, LASTA, NOVRANA, POBEDA, ZVEZDA, RANANISKA, PROTEINKA. Axis: Y (yield).]

Figure 3: Trellis histograms with probability density plots.
[Panels: nitrogen levels 0, 50, 100, 150. Axes: Y (yield), 1.5 to 10.5, and density, 0.0 to 0.5.]


Figure 4: Trellis Q-Q plots.
[Panels: N: 0, N: 50, N: 100, N: 150. Axes: Theoretical Quantile, -4 to 4, and Observed Value, 1 to 11.]

Figure 5: A box-plot of the wheat yield against variant of fertilizer given variety.
[Panels: varieties EVROPA 90, BALKAN, PROTEINKA, POBEDA, ZVEZDA, LASTA, ITALIJA, NOVOSADSKA RANA 5, RANA NISKA. Axes: Fertilizer (variants 1 to 20) and Yield (t/ha), 2 to 10.]


Figure 6: A box-plot of the wheat yield against varieties given year.
[Panels: years 1994/5, 1995/6, 1996/7, 1997/8. Axes: Varieties (EVROPA, BALKAN, PROTEINKA, POBEDA, ZVEZDA, LASTA, ITALIJA, NOVRANA, RANANISKA) and Yield (t/ha), 1 to 11.]

Because we want to represent the effects of the two predictor variables (variety and fertilizer) equally, the panel variables were switched.

Figures 5 and 6 are examples of the use of trellis graphs in exploring the fertilizer-variety and variety-year interactions. The conditional dependence on two variables is presented in Figure 7, which shows the wheat yield against the variant of fertilizer given variety and year.

A two-dimensional trellis scatter plot is applied in order to fit a function that approximates the influence of the fertilizers on the wheat yield. Figure 8 displays the influence of nitrogen, which may be approximated by a quadratic regression. It is easy to see that the parameters of the regression models are not the same for all varieties.
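A per-panel quadratic fit of this kind can be sketched in plain Python. The code below solves the least-squares normal equations for yield = b0 + b1 u + b2 u², where u is the nitrogen dose in units of 100 kg/ha (a rescaling chosen here only for numerical convenience). The function names, the variety labels "A" and "B" and the yields are hypothetical, not values from the experiment.

```python
def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic(doses, yields):
    """Least-squares coefficients [b0, b1, b2] with u = dose / 100."""
    u = [d / 100.0 for d in doses]
    s = [sum(v ** k for v in u) for k in range(5)]  # power sums of u
    t = [sum(y * v ** k for v, y in zip(u, yields)) for k in range(3)]
    normal = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve_linear(normal, t)


doses = [0, 50, 100, 150]  # kg/ha of nitrogen
made_up_yields = {          # t/ha, one list per hypothetical variety
    "A": [3.0, 4.75, 6.0, 6.75],
    "B": [2.0, 4.0, 5.0, 5.0],
}
fits = {name: fit_quadratic(doses, ys) for name, ys in made_up_yields.items()}
for name, coeffs in fits.items():
    print(name, [round(c, 6) for c in coeffs])
```

Fitting each panel separately, as here, is what makes differences between the varieties' regression parameters visible.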

The simultaneous influence of nitrogen and phosphorus on the wheat yield is displayed by means of a 3-D trellis plot (Figure 9). The 3-D scatter plots for individual wheat varieties suggest quadratic regressions that contain linear and quadratic terms for nitrogen and phosphorus, and their interaction.

Better insight into the nitrogen-phosphorus interaction is given by the trellis contour plots based on the estimated quadratic response surfaces (Figure 10).
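Written out, a quadratic response surface with linear and quadratic terms for nitrogen and phosphorus and their interaction takes the following form for a single variety. The coefficient symbols are notation assumed here for illustration. The source does not give the fitted equations.

```latex
\hat{y} = \beta_0 + \beta_1 N + \beta_2 P + \beta_{11} N^2 + \beta_{22} P^2 + \beta_{12} N P
```

Here $\hat{y}$ is the estimated yield (t/ha) and $N$, $P$ are the applied quantities of nitrogen and phosphorus (kg/ha). A nonzero $\beta_{12}$ means that the effect of nitrogen depends on the level of phosphorus, which appears in a contour plot as contours tilted relative to the axes.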

Figure 7: The wheat yield against variant of fertilizer given variety and year.
[Panels: variety (EVROPA, BALKAN, PROTEINKA, POBEDA, ZVEZDA, LASTA, ITALIJA, NOV. RANA, RANA NISKA) by year (1994/5, 1995/6, 1996/7, 1997/8). Axes: Fertilizers (variants 1 to 20) and Yield (t/ha), 1 to 11.]


Figure 8: The wheat yield against level of nitrogen given variety.
[Panels: varieties EVROPA, BALKAN, PROTEINKA, POBEDA, ZVEZDA, LASTA, ITALIJA, NOV. RANA, RANA NISKA. Axes: level of nitrogen, 0 to 150, and Yield (t/ha), 0 to 10.]

Figure 9: 3-D trellis scatter plots and response surfaces for given variety.
[Panels: varieties EVROPA, BALKAN, PROTEINKA, POBEDA, ZVEZDA, LASTA, ITALIJA, NOVOSADSKA RANA, RANA NISKA.]


Figure 10: Trellis contour plots.
[Panels: varieties BALKAN, EVROPA, ITALIJA, LASTA, NOVRANA, POBEDA, ZVEZDA, RANANISKA, PROTEINKA. Axes: N and P, 0 to 120. Contour labels range from 3.3 to 7.7.]

5 Concluding remarks

Trellis graphics help in choosing an adequate regression model that reflects the influence of nitrogen, phosphorus and potassium on the wheat yield, both for individual varieties and for the whole experiment.

The separate influences of nitrogen and phosphorus may be approximated by quadratic functions that differ between the varieties of wheat. The simultaneous effect may be modelled with response surfaces, that is, quadratic regressions that contain linear and quadratic terms for nitrogen and phosphorus, and their interaction. The response surfaces also differ between the varieties of wheat.

The preliminary analysis based on trellis graphs suggested that the influence of the considered fertilizers on the wheat yield for the whole experiment may be modelled with a quadratic function that includes dummy variables. Each dummy variable quantifies the influence of a particular variety of wheat.
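One possible way to write such a model is sketched below. It assumes that the dummy variables enter as intercept shifts and that one of the nine varieties serves as the baseline, so that eight dummies are needed. The source states neither choice, and the symbols are illustrative.

```latex
y = q(N, P, K) + \sum_{j=1}^{8} \delta_j D_j + \varepsilon,
\qquad
D_j =
\begin{cases}
1 & \text{if the observation belongs to variety } j, \\
0 & \text{otherwise,}
\end{cases}
```

where $q(N, P, K)$ is a quadratic polynomial in the quantities of the three fertilizers, $\delta_j$ measures the difference in yield between variety $j$ and the baseline variety, and $\varepsilon$ is the error term.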

Acknowledgements

The paper has been supported by the Ministry of Science and Environmental Protection of the Republic of Serbia (Project No. 149007).


References

[1] Becker, R.A., Cleveland, W.S., and Shyu, M.J. (1996): The visual design and control of trellis graphics displays. Journal of Computational and Graphical Statistics, 5, 123-155.

[2] Becker, R.A. and Cleveland, W.S. (1996): S-PLUS Trellis Graphics User’s Manual, Trellis Versions 2.0 & 2.1. MathSoft, Inc.

[3] Cleveland, W.S. and Fuentes, M. (1997): Trellis Display: Modeling Data from Designed Experiments (Technical Report). Murray Hill, NJ: Bell Labs.

[4] Čobanović, K., Nikolić-Đorić, E., Jovanović, M., Malešević, M., and Mutavdžić, B. (2001): Ispitivanje sortnih karakteristika pšenice primenom proizvodnih funkcija. Zbornik radova sa XXVIII Jugoslovenskog simpozijuma o operacionim istraživanjima, SYMOP-IS 2001, 417-420, Beograd.

[5] Fisher, R. (1966): Design of Experiments. Edinburgh: Oliver and Boyd.

[6] Nikolić-Đorić, E., Čobanović, K., and Lozanov-Crvenković, Z. (2006): Statistical graphics and experimental data. ICOTS 7, Proceedings (Editors: Allan Rossman, Beth Chance), Brazil.

[7] STATISTICA 7.1 (2006): StatSoft, Inc. University Licence, Novi Sad.

[8] Tufte, E.R. (2001): The Visual Display of Quantitative Information, 2nd edn. Cheshire, CT: Graphics Press.

[9] Tukey, J.W. (1977): Exploratory Data Analysis. Reading, Mass.: Addison-Wesley.
