According to Bhatla et al. [1], credit card fraud happens when an individual uses another individual’s credit card for his or her own benefit while neither the card owner nor the card issuer is aware that the card is being used. Furthermore, the fraudster has no connection with any of the other parties involved and no intention of either contacting the card owner or repaying the purchases made.
While credit card fraud has been increasing worldwide, a significant effort is being made to reduce the losses of credit card companies and merchants. Detecting fraud poses several challenges, including the high number of transactions, the amount of data involved in each of them and the need to reduce the occurrence of false positives, which can overload the fraud department responsible for checking them, for example, via a phone call to the cardholder [2]. One way forward is to use the records of these phone calls, together with other features, to classify possibly fraudulent acts.
In this paper, we introduce an approach to the analysis of sound events in the fraud context, in which acoustic indices are computed to improve the performance of the classifiers. These indices are complemented by variables from the purchase data in an attempt to find a correlation between them and the fraudsters’ behavior. Moreover, we address the feature extraction task for fraud detection, using buyer-placed information combined with sound analysis (and its complexity), as a business intelligence tool.
An audio recording can be interpreted as a wave signal whose observations are indexed in time. Using the Fourier transform, it is possible to convert the signal into an equivalent representation in the frequency domain. This type of analysis therefore represents the signal as a sum of sinusoidal waves, that is, a composition of simple harmonic vibrations with specific frequencies. The frequency content is rich in information that can be used to differentiate between signal sources.
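The decomposition described above can be illustrated with a naive discrete Fourier transform. The following is a minimal sketch, not the processing pipeline used in this study: the function names, the test signal and the sampling rate are assumptions chosen only for illustration.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitude of each DFT bin of a real-valued signal (naive O(n^2) version)."""
    n = len(signal)
    magnitudes = []
    for k in range(n // 2 + 1):  # bins from 0 Hz up to the Nyquist frequency
        total = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        magnitudes.append(abs(total))
    return magnitudes


def dominant_frequency(signal, sample_rate):
    """Frequency in Hz of the strongest non-constant component."""
    mags = dft_magnitudes(signal)
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return k * sample_rate / len(signal)


# Hypothetical signal: a strong 50 Hz tone plus a weaker 120 Hz tone,
# sampled at 400 Hz for one second.
SAMPLE_RATE = 400
signal = [
    math.sin(2 * math.pi * 50 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 120 * t / SAMPLE_RATE)
    for t in range(SAMPLE_RATE)
]
mags = dft_magnitudes(signal)
peak_hz = dominant_frequency(signal, SAMPLE_RATE)
```

In this sketch the two tones appear as two separate peaks in `mags`, which is the sense in which the frequency content helps to tell signal sources apart. In practice a fast Fourier transform from a numerical library would replace the naive loop.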
Complementary to that, following Sueur et al. [3], the importance of sound as a raw material extends to many areas of science. The audio analysis can thus be summarized through indices that represent its complexity, as well as phenomena in both the time and frequency domains.
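One common way to summarize complexity in the frequency domain is the normalized Shannon entropy of the spectrum. The sketch below assumes this definition for illustration only; it is not necessarily the exact index computed in this work, and the function name is hypothetical.

```python
import math


def spectral_entropy(magnitudes):
    """Normalized Shannon entropy of a magnitude spectrum, between 0 and 1."""
    total = sum(magnitudes)
    if total <= 0 or len(magnitudes) < 2:
        return 0.0
    # Treat the spectrum as a probability mass function over frequency bins.
    probs = [m / total for m in magnitudes if m > 0]
    entropy = sum(-p * math.log(p) for p in probs)
    # Divide by the maximum possible entropy so the index does not depend on bin count.
    return entropy / math.log(len(magnitudes))


flat = spectral_entropy([1.0, 1.0, 1.0, 1.0])    # energy spread evenly over the bins
peaked = spectral_entropy([0.0, 4.0, 0.0, 0.0])  # all energy in a single bin
```

Under this definition, a spectrum with evenly spread energy scores close to 1 and a pure tone scores close to 0, so the index acts as a one-number description of how complex the sound is.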
This paper is organized as follows. In Section 2, we describe the data structure and the adopted audio processing methodology and establish the theoretical basis, introducing the concepts behind sound analysis, the classification techniques and their statistical performance measures. Section 3 summarizes related work on acoustic classification. Section 4 presents the empirical results and Section 5 concludes with final remarks and a discussion of future work.
3. Related Works
Statistical classification methods can be applied in many challenging contexts of consumer credit. Over the last two decades, the number of studies in this area has increased significantly. These studies have aimed not only at improving the models, given that fraud detection presents challenging conditions such as skewed distributions and non-uniform cost per error in the data, but also at improving pre-processing techniques [17]. Furthermore, such classification tasks have also been used in fraud detection for the ever-growing e-commerce credit card scenario; see Bolton [20].
Additionally, enriching the database with tools that describe unstructured data related to fraudulent activity is a new area to be explored. For example, companies increasingly record and collect linguistic and acoustic data that need to be processed and modeled [21]. For instance, Burgoon [24] applied this kind of analysis to detect fraudulent statements in company conference calls.
Burgoon et al. [24] presented a case in which linguistic analysis combined with acoustic indices was used to describe false communication in negotiations of public and private companies, in order to investigate fraudulent actions of interest to investors, researchers and lenders. These cases appeared more relevant in situations where the interlocutors needed to improvise, which is closer to the analysis of telephone calls for fraud detection. In this setting, the vocal quality of speech is related to whether or not the speaker is lying [24]: the vibration of the vocal folds is facilitated by the muscles that surround the larynx. These muscles contract during states of stress and arousal, which increases the tension and thereby affects the frequency of vocal fold vibration.
Many companies analyze discourse (linguistic and vocal analysis); however, this work is commonly done manually and is exhausting and tedious [25]. This problem motivated the present work, which studies the viability of entropy, in the form of acoustic indices, in these analyses. Although acoustic indices are commonly applied in biology, for example in Sueur [11], where they were used to assess biodiversity in forests, they can also be applied in many other contexts to extract audio characteristics that help specialists make decisions (usually in a classification task).
The next section presents the three points mentioned here (credit/behavior score, linguistic and vocal analysis, and acoustic indices) from the perspective of a case study of a company located in São Paulo state, Brazil, which analyzes the quality of e-commerce purchases in order to flag possible fraud.
The data analysis considered only purchases from 2018, with a fraud rate of approximately 50% in the data set of a company that operates throughout Brazil. In the first part, we conducted a descriptive analysis of the data to gain insight into the relationships among the characteristics informed by the buyer.
It is worth mentioning that the database provided by the company under study labeled each purchase as fraudulent or not (a labeled data set). The fraudulent operations had been reviewed by an operator (assisted by the company’s model) but were not classified as fraud at that time; they were subsequently confirmed as fraudulent once the occurrence materialized.
Some places have a higher number of reported frauds. For example, the cities of São Paulo and Rio de Janeiro had a high rate of fraud (56.6% and 65%, respectively).
Regarding the domain of the informed email address, this variable helped to classify bad payers (fraudulent activities), serving as a warning flag. For instance, different quantiles were observed for the Yahoo.com.br and Uol.com.br domains, as shown in Figure 2.
Regarding product category, we found a significant difference between cell phones (which experience the most fraud attempts) and clothes. It was also possible to identify relations between gender and product category. Cell phones, games, sporting goods and electronics were more likely to involve fraud among male buyers, while among female buyers, clothing, footwear and beauty products were the most targeted. Important patterns emerged in certain categories when the gender dimension was added. For example, in the mobile phone category, males were associated with a higher risk even though they were the most frequent consumers. In categories such as games, however, purchases made by males had a much lower chance of fraud than purchases made by females.
Regarding the recurrence of fraud across the days of the week, the highest rates were observed on Thursdays (52.1%), Fridays (54.6%) and Sundays (50.9%), while the other days showed a smaller difference between fraud and non-fraud activities. Thus, greater care should be taken when combining this variable with the other purchase information in order to better classify these transactions in the analyses.
The recurrence of fraud by period of the day refers to the period in which the order was placed. Figure 3 shows some differences, most notably a higher fraud incidence in orders placed at dawn, where fraudulent operations represent 67.4% of the activities, compared to the other times of day.
Another informative variable was the number of installments of the purchase, which in practice also depends on the value of the product. Purchases with five or fewer installments had a relatively higher order value. For six or more installments, there was no significant difference in values.
In the second part of the case analysis, we examined the audio-related features. For example, the buyer-related audio in fraudulent cases tends to be longer (178.12 s on average). In contrast, the average duration of the non-fraud audio is 118.62 s, that is, 33.43% shorter.
The corresponding figure shows the spectrograms of two audios, one related to a fraudulent activity and the other to a non-fraudulent one, where each audio has two channels. The left channel corresponds to the buyer’s voice and the right to the operator (company). Comparing the left channels, a difference in the dynamics of the two audios can be observed: one is longer than the other and they differ in their dynamism.
Prior to the modeling procedure, the database was divided into training (80%) and testing (20%).
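The 80/20 partition can be sketched as a simple shuffled holdout split. This is an illustrative assumption: the text does not state whether the split was random or stratified, and the function name, seed and placeholder records are hypothetical.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


orders = list(range(100))  # hypothetical stand-in for the labeled purchase records
train, test = train_test_split(orders)
```

Every record ends up in exactly one of the two sets, so the test set can be used to assess the models on purchases they were not fitted on.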
Thus, using only 41 features (after transforming some variables into dummies), two techniques were adopted: logistic regression (LR) and random forest (RF). Since a linear model was adopted, the categorical variables were transformed into dummy features, yielding 17 “new” features. The acoustic indices resulting from the audio processing/extraction were incorporated into the classification models and yielded relevant results in the case study.
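The transformation of a categorical variable into dummy features can be sketched as follows. The column names and records are hypothetical, and dropping the first level as the reference category is an assumption that is usual for linear models rather than a detail stated in the text.

```python
def dummy_encode(rows, column, drop_first=True):
    """Replace a categorical column with 0/1 indicator columns."""
    levels = sorted({row[column] for row in rows})
    # The dropped level becomes the reference category of the linear model.
    kept = levels[1:] if drop_first else levels
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for level in kept:
            new_row[f"{column}_{level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded


purchases = [  # hypothetical records
    {"price": 900.0, "category": "cellphone"},
    {"price": 120.0, "category": "clothes"},
    {"price": 250.0, "category": "games"},
]
encoded = dummy_encode(purchases, "category")
```

A variable with k levels therefore contributes k − 1 indicator columns, which is how a handful of categorical variables expands into a larger feature count.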
Four models were fitted in this analysis: a full model for each of logistic regression and random forest (using all the features) and a reduced model for each classifier. After fitting the full logistic regression and random forest models with all the variables, we tried to reduce the number of variables according to the importance of each one in the full models.
The fit of the classification models can be represented and compared by means of the Kolmogorov-Smirnov test for classification models, as shown in Figure 5.
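For a classifier, the Kolmogorov-Smirnov statistic is usually taken as the largest gap between the empirical distribution functions of the scores of the two classes. The sketch below assumes this usual definition; the function name and the scores are hypothetical and are not the values behind Figure 5.

```python
def ks_statistic(scores_fraud, scores_legit):
    """Largest gap between the empirical CDFs of the two groups of scores."""
    thresholds = sorted(set(scores_fraud) | set(scores_legit))
    best = 0.0
    for t in thresholds:
        cdf_fraud = sum(s <= t for s in scores_fraud) / len(scores_fraud)
        cdf_legit = sum(s <= t for s in scores_legit) / len(scores_legit)
        best = max(best, abs(cdf_fraud - cdf_legit))
    return best


# Hypothetical predicted fraud probabilities for each true class.
fraud_scores = [0.9, 0.8, 0.7, 0.4]
legit_scores = [0.1, 0.2, 0.3, 0.6]
ks = ks_statistic(fraud_scores, legit_scores)
```

A value near 1 means the model scores the two classes almost without overlap, while a value near 0 means the scores do not separate fraud from non-fraud.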
Considering the optimal cut-off of each model, some performance statistics were calculated (based on the prediction set), as shown in Table 4.
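Given a cut-off, statistics such as sensitivity and specificity follow directly from the confusion matrix. A minimal sketch with hypothetical labels, scores and cut-off (the values of Table 4 are not reproduced here, and the criterion used to choose the optimal cut-off is not assumed):

```python
def confusion_metrics(y_true, scores, cutoff):
    """Sensitivity, specificity and accuracy when score >= cutoff flags fraud (label 1)."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= cutoff else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # frauds correctly flagged
        "specificity": tn / (tn + fp) if tn + fp else 0.0,  # non-frauds correctly cleared
        "accuracy": (tp + tn) / len(y_true) if y_true else 0.0,
    }


y_true = [1, 1, 1, 0, 0, 0]              # hypothetical labels
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]  # hypothetical fraud probabilities
metrics = confusion_metrics(y_true, scores, cutoff=0.5)
```

Moving the cut-off trades sensitivity against specificity, which is why the statistics are reported at a cut-off chosen separately for each model.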
The full logistic regression classifier assigns greater relevance to the acoustic indices, especially acoustic richness. In fact, it was the most relevant feature in all the classifiers tested. The random forest classifier presented a better performance, with acoustic richness again the most important feature. Therefore, the next step was to test the classifiers again using only the indices as features in the model, as shown in Figure 6. The adopted function is “varImp.RandomForest”, implemented in the “caret” package. This function relates to the variables’ predictive power according to the accuracy-based importance (prediction accuracy on the out-of-bag sample) and the Gini-based importance (the contribution of each variable to the node split decisions). For further information, please see Reference [26].
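The Gini-based importance accumulates, over the trees of the forest, the impurity decrease of the splits made on each variable. The sketch below computes that decrease for a single split with binary labels. It is only an illustration of the quantity, not the implementation in the “caret” package, and all names are hypothetical.

```python
def gini_impurity(labels):
    """Gini impurity of a list of binary labels (1 = fraud, 0 = non-fraud)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def gini_decrease(labels, goes_left):
    """Impurity reduction obtained by splitting labels according to goes_left."""
    left = [y for y, g in zip(labels, goes_left) if g]
    right = [y for y, g in zip(labels, goes_left) if not g]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


labels = [1, 1, 0, 0]
informative = gini_decrease(labels, [True, True, False, False])    # separates the classes
uninformative = gini_decrease(labels, [True, False, True, False])  # leaves both sides mixed
```

A variable whose splits repeatedly produce large decreases, as in the first case, receives a high Gini-based importance, whereas splits like the second contribute nothing.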
It is important to point out that acoustic richness remained the most important feature in all four fitted models. After selecting the reduced RF model, the company under study provided two new unlabeled audios in order to estimate the likelihood of fraud. The final model requires information from the client features (gender, email domain, frequent consumer, ordered day, ordered time, city, product, installments and price), as well as the features extracted from the recorded audio.
Similarly, the same models were fitted excluding the audio-related features. In all scenarios (logistic regression and random forest), the metrics showed a reduction in performance. For instance, for the best performing model (again random forest), the area under the curve (AUC) decreased by 3.98%, the sensitivity (SEN) by 10.1% and the specificity (SPE) by 12.05%. These findings corroborate the results, showing that acoustic analysis combined with buyer-placed information enhances the fraud detection model.
Finally, a last test was conducted, in which two new audios were scored by the developed model, which gave the probability of each being a fraudulent action (using the random forest classifier). For the first purchase tested, the model gave a 71.91% probability of fraud, and in this case the fraud did occur. For the second purchase, the model gave a 3.31% probability of fraud, and it was indeed not a fraudulent operation. The results therefore showed a satisfactory and promising performance, and further tests shall be conducted in order to implement the model in the detection process of the company studied.
This study provides evidence that acoustic analysis combined with the common credit score approach gives good results in e-commerce fraud detection. In addition, acoustic index analysis applied to speech interpretation (linguistic and vocal) opens a new area of application of entropy to industry problems, for example, the detection of fraudulent claims or the distinction between good buyers and fraudsters in purchases in which fraudsters hold legitimate data/information from other people.
The acoustic indices were, in fact, capable of improving the accuracy of the models, most notably with the random forest. To calculate the acoustic indices, we used the original audios without any pre-processing of the signals and without observing the dynamics of their complexity over time (for example, calculating the entropy of the signals every 10 s). These points are suggestions that may be incorporated into future work.
This work has a limitation regarding the generalization of the obtained results, which must be addressed before the approach is implemented by the company (in the business intelligence tool). The findings of this manuscript should therefore be extended. We showed that models without acoustic features could be outperformed by using acoustic analysis; nevertheless, validation methods (cross-validation and holdout) can be incorporated into the estimation process as a major component to enhance the models’ predictive ability. For further details, please see Reference [27]. In the meantime, the results presented remain valid given that the sample size used to obtain them is large, ensuring good asymptotic performance of the obtained estimates [28].
The results also indicate that the developed technology could be implemented to support the human analysis of purchases, although further analysis is still required to confirm the accuracy of the adopted model. Using this technology combined with the operators’ perception could help them make better decisions when in doubt about whether the buyer is lying. This study helped the company from two perspectives: first, the developed model may reduce the number of frauds by improving the classification of fraudulent activities and, second, it suggests adopting the analysis of unstructured data as a way to enrich the database.