Outlier Detection Method for Identifying Outliers that are not in Gaussian Distribution


dc.contributor.author Adikaram, K.K.L.B.
dc.contributor.author Hussein, M.A.
dc.contributor.author Effenberger, M.
dc.contributor.author Becker, T.
dc.date.accessioned 2022-08-25T04:04:29Z
dc.date.available 2022-08-25T04:04:29Z
dc.date.issued 2015-03-04
dc.identifier.citation Adikaram, K. K. L. B., Hussein, M. A., Effenberger, M. & Becker, T. (2015). Outlier Detection Method for Identifying Outliers that are not in Gaussian Distribution. 12th Academic Sessions, University of Ruhuna, Matara, Sri Lanka, 86.
dc.identifier.issn 2362-0412
dc.identifier.uri http://ir.lib.ruh.ac.lk/xmlui/handle/iruor/7883
dc.description.abstract Most statistical methods require the outliers (noise) to follow a Gaussian distribution. When the outliers are not Gaussian, these methods produce biased results. We introduce an outlier detection method that performs best when the outliers follow a non-Gaussian distribution. The method is non-parametric and based on the properties of an arithmetic progression (AP). If the number of elements in the AP is n, the maximum element is a_max, the minimum element is a_min, and the sum of all elements is S_n, then R_max = (a_max - a_min) / (S_n - a_min*n) and R_min = (a_max - a_min) / (a_max*n - S_n) are always equal to 2/n. Usually, R_max > 2/n implies that the maximum element is an outlier, and R_min > 2/n implies that the minimum element is an outlier. The value 2/n is non-parametric and always between 0 and 1. If t is a threshold relevant to the considered domain, the value 2/n + t can be used to identify significant outliers, where 0 < t < 1 - 2/n. The method identifies one outlier at a time, and repeated application of the method allows detection of multiple outliers. The algorithm was tested using several artificial and real data sets. The real data were recorded automatically at a biogas plant at a frequency of twelve data points per day over a period of seven months. Among the different parameters, we selected the H2 content, which we expected to maintain linear behavior during stable operation. When the outliers were non-Gaussian, Grubbs' test located 0% to 17% as significant outliers at a significance level of 0.05. With the new method, there was a value of t that located more outliers than Grubbs' test. en_US
dc.language.iso en en_US
dc.publisher University of Ruhuna, Matara, Sri Lanka en_US
dc.subject Gaussian distribution en_US
dc.subject multiple outlier detection en_US
dc.subject non-parametric method en_US
dc.subject significant outliers en_US
dc.title Outlier Detection Method for Identifying Outliers that are not in Gaussian Distribution en_US
dc.type Article en_US
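
The procedure described in the abstract can be sketched in a few lines of Python. The identity behind it is that an AP has S_n = n*(a_min + a_max)/2, so both denominators reduce to n*(a_max - a_min)/2 and both ratios reduce to 2/n. This is a minimal illustration, not the authors' implementation: the function names `ap_ratios` and `detect_outliers` are hypothetical, the rule for choosing between the maximum and the minimum when both ratios exceed the limit is an assumption (the abstract does not specify one), and so is the stopping guard for fewer than three values or a constant series.

```python
from typing import List, Optional, Tuple


def ap_ratios(values: List[float]) -> Optional[Tuple[float, float]]:
    """Return (R_max, R_min) as defined in the abstract.

    Returns None when all values are equal, because both denominators
    are then zero (assumption: a constant series has no outlier).
    """
    n = len(values)
    a_max, a_min = max(values), min(values)
    s_n = sum(values)
    spread = a_max - a_min
    if spread == 0:
        return None
    r_max = spread / (s_n - a_min * n)
    r_min = spread / (a_max * n - s_n)
    return r_max, r_min


def detect_outliers(values: List[float], t: float) -> Tuple[List[float], List[float]]:
    """Remove one outlier at a time until neither ratio exceeds 2/n + t.

    Returns (outliers, remaining). The threshold t is domain specific;
    the abstract only constrains it to 0 < t < 1 - 2/n.
    """
    remaining = list(values)
    outliers: List[float] = []
    # Assumption: stop below three values, where 2/n reaches 1.
    while len(remaining) >= 3:
        ratios = ap_ratios(remaining)
        if ratios is None:
            break
        r_max, r_min = ratios
        limit = 2 / len(remaining) + t
        if r_max > limit and r_max >= r_min:
            # Assumption: when both ratios exceed the limit, take the larger.
            candidate = max(remaining)
        elif r_min > limit:
            candidate = min(remaining)
        else:
            break
        remaining.remove(candidate)  # drops the first occurrence only
        outliers.append(candidate)
    return outliers, remaining


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    found, kept = detect_outliers(data, t=0.05)
    print("outliers:", found)
    print("remaining:", kept)
```

With exact arithmetic a perfect AP gives ratios of exactly 2/n, so a strictly positive t also absorbs floating-point rounding; a t of 0 may flag elements of a clean series.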

