Paper 5
Introduction to data
Major domestic and foreign companies have invested billions in Vietnam’s e-commerce
industry. In 2018, five Vietnamese companies were among Southeast Asia’s top ten most-visited e-commerce websites. The ecosystem has seen rapid
development in terms of product diversity, technical infrastructure, and support
services. E-commerce models for consumer-to-consumer, business-to-business,
and consumer-to-business transactions have developed gradually over time.
Emerging business ideas include selling through social media and mobile phones.
The evolution of selling from websites to live broadcasts demonstrates the
worldwide e-commerce industry’s growth. Electronic businesses may potentially
benefit from the growing digital content industry. Additionally, many domestic and
foreign businesses have begun to develop cross-border e-commerce. This market
will continue to expand in the coming years.
The data set consists of 11 questions that measure customer perception in the online
environment. The sample size is 485. The questionnaire was answered by
participants who had bought from e-commerce websites.
We want to group the questions of this research into factors that can be
interpreted. The two multivariate methods used for the analysis are exploratory and
confirmatory factor analysis. We start with exploratory factor analysis to extract
the appropriate number of factors.
Exploratory Data Analysis
Exploratory data analysis (EDA), which was popularized by John Tukey in the 1970s,
is a data analysis approach that emphasizes pattern recognition and hypothesis
generation from raw data. It is suggested as the first step of any data analysis task
for exploring and understanding data, and has been applied in many disciplines such
as geography, marketing, and operations management. However, even though EDA
techniques, such as data visualization and data mining, have been used in some
procedures in auditing, EDA has not been employed in auditing in a systematic way. This
dissertation consists of three essays to investigate the application of EDA in audit
research. The study contributes to the auditing literature by identifying the
importance of EDA in auditing, proposing a framework to describe how auditors
could apply EDA to auditing, and using two cases to demonstrate the benefits that
auditors can gain from EDA by following the proposed framework.
First of all, we checked the data for missing values and found none. To detect
outliers, we applied Rosner’s test, which showed that the data contain no outliers.
The scree plot obtained from principal component analysis (PCA) is shown below:
A common method for determining the number of principal components (PCs) to retain
is a graphical representation known as a scree plot. A scree plot is a simple line
plot that shows the eigenvalue of each individual PC, with the eigenvalues on the
y-axis and the number of factors on the x-axis. Because the eigenvalues are sorted
from largest to smallest, it always displays a downward curve.
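The eigenvalues behind a scree plot can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name `jacobi_eigenvalues` and the 4-item correlation matrix are hypothetical and are not taken from the survey data, and the classical Jacobi rotation method stands in for whatever routine the R analysis used.

```python
import math


def jacobi_eigenvalues(matrix, max_sweeps=100, tol=1e-18):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    n = len(matrix)
    a = [list(row) for row in matrix]
    for _ in range(max_sweeps):
        off_diagonal = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off_diagonal < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes a[p][q], kept within [-pi/4, pi/4].
                if a[q][q] == a[p][p]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Hypothetical correlation matrix for four items (NOT the survey data).
corr = [
    [1.0, 0.6, 0.5, 0.1],
    [0.6, 1.0, 0.4, 0.2],
    [0.5, 0.4, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
]
eigenvalues = jacobi_eigenvalues(corr)

# Text version of a scree plot: component number against eigenvalue.
for rank, value in enumerate(eigenvalues, start=1):
    print(f"component {rank}: eigenvalue {value:.3f} " + "#" * round(value * 10))
```

The eigenvalues of a correlation matrix sum to the number of variables, so each bar can also be read as a share of the total variance.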
Our scree plot suggests that three factors can be extracted from the data to measure
customer perception in the online environment. Before extracting them, we examined
the sampling adequacy and the validity of the data. The Kaiser-Meyer-Olkin (KMO)
statistic measures sampling adequacy by comparing the correlations between the
variables with their partial correlations. KMO values closer to 1.0 are considered
ideal, while values less than 0.5 are unacceptable. For our data the statistic is
0.74, which indicates adequate sampling.
Bartlett’s test of sphericity compares the correlation matrix (a matrix of Pearson
correlations) to the identity matrix. In other words, it checks whether there is
redundancy between the variables that can be summarized with a smaller number of
factors. The p-value of this test is less than 0.05, so the correlation matrix
differs significantly from the identity matrix and factor analysis is appropriate.
The three factors are shown in the table below:
Factor   Item   Loading
1        H1     0.747
         H2     0.806
         H3     0.792
         H4     0.852
2        H5     0.733
         H6     0.785
         H7     0.851
         H8     0.538
3        H9     0.936
         H10    0.772
         H11    0.412
All three factors are treated as latent variables. We name them as follows:
1. Hedonic value
2. Perceived Mental Benefits
3. Electronic Loyalty
These are the three factors that could measure customer perception in the online
environment.
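The two adequacy checks used in this section, the KMO statistic and Bartlett’s test of sphericity, can be sketched in plain Python as follows. This is a minimal illustration and not the R code used for the study: the helper names (`invert_with_det`, `kmo`, `bartlett_statistic`) and the small example correlation matrix are assumptions, and only the sample size of 485 comes from the text. The chi-square p-value is left out because the Python standard library has no chi-square distribution.

```python
import math


def invert_with_det(matrix):
    """Gauss-Jordan inversion with partial pivoting; returns (inverse, determinant)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        p = aug[col][col]
        det *= p
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug], det


def kmo(corr):
    """Overall KMO: squared correlations against squared partial correlations."""
    inv, _ = invert_with_det(corr)
    n = len(corr)
    r2 = p2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                p2 += partial ** 2
    return r2 / (r2 + p2)


def bartlett_statistic(corr, n_obs):
    """Bartlett's chi-square statistic and its degrees of freedom."""
    p = len(corr)
    _, det = invert_with_det(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    dof = p * (p - 1) // 2
    return chi2, dof


# Hypothetical 3-item correlation matrix (NOT the survey data); n = 485 as in the text.
example_corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
print(f"KMO = {kmo(example_corr):.3f}")
chi2, dof = bartlett_statistic(example_corr, n_obs=485)
print(f"Bartlett chi-square = {chi2:.1f} on {dof} degrees of freedom")
```

A large chi-square statistic relative to its degrees of freedom corresponds to a small p-value, which is the condition reported for the survey data.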
Confirmatory Factor Analysis
Factor analysis is a family of statistical strategies used to model unmeasured
sources of variability in a set of scores. Confirmatory factor analysis (CFA),
otherwise referred to as restricted factor analysis, structural factor analysis, or the
measurement model, typically is used in a deductive mode to test hypotheses
regarding unmeasured sources of variability responsible for the commonality
among a set of scores. It can be contrasted with exploratory factor analysis (EFA),
which addresses the same basic question but in an inductive, or discovery-oriented,
mode. Although CFA can be used as the sole statistical strategy for testing
hypotheses about the relations among a set of variables, it is best understood as
an instance of the general structural equation modeling (SEM). In that model, a
useful distinction is made between the measurement model and the structural
model. The measurement model (i.e., CFA) concerns the relations between
measures of constructs, indicators, and the constructs they were designed to
measure (i.e., factors). The structural model concerns the directional relations
between constructs. In a full application of SEM, the measurement model is used
to model constructs, between which directional relations are modeled and tested
in the structural model.
In the previous section, we extracted the three factors. We can now build a
structural equation model and state some hypotheses to investigate the
relationships between the factors. The model is shown as follows:
Hypotheses:
H1: PMB significantly impacts HV
H2: HV significantly impacts ELOY
H3: PMB significantly impacts ELOY
The standardized regression coefficients are shown below:
Path            Coefficient   T statistic
PMB --> HV      0.424         8.581
HV --> ELOY     0.275         4.302
PMB --> ELOY    0.564         8.606
After analyzing the survey data in R, the results in the table above show that all
criteria for assessing the mediating role of hedonic value are satisfied.
Therefore, hedonic value is a partial mediator in the relationship between perceived
mental benefits and electronic loyalty. Because all the t-statistics are higher than
1.96, all three hypotheses are supported at the 5% significance level.
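The reasoning behind the significance check and the partial-mediation conclusion can be sketched as follows. The coefficients and t-statistics are the ones reported above. The product-of-coefficients decomposition into indirect and total effects, and the helper name `mediation_type`, are assumptions added for illustration and are not results reported by the study.

```python
# Standardized path coefficients and t-statistics as reported in the text.
paths = {
    ("PMB", "HV"): (0.424, 8.581),
    ("HV", "ELOY"): (0.275, 4.302),
    ("PMB", "ELOY"): (0.564, 8.606),
}
CRITICAL_T = 1.96  # two-sided 5% threshold used in the text

# A path is treated as significant when |t| exceeds the critical value.
significant = {path: abs(t) > CRITICAL_T for path, (_, t) in paths.items()}


def mediation_type(a_significant, b_significant, direct_significant):
    """Classify mediation from the significance of the three paths (illustrative rule)."""
    if not (a_significant and b_significant):
        return "none"
    return "partial" if direct_significant else "full"


a = paths[("PMB", "HV")][0]        # predictor -> mediator
b = paths[("HV", "ELOY")][0]       # mediator -> outcome
direct = paths[("PMB", "ELOY")][0]  # predictor -> outcome, controlling for the mediator

indirect = a * b           # effect of PMB on ELOY that passes through HV
total = direct + indirect  # direct plus indirect effect

print(mediation_type(significant[("PMB", "HV")],
                     significant[("HV", "ELOY")],
                     significant[("PMB", "ELOY")]))
print(f"indirect = {indirect:.4f}, total = {total:.4f}")
```

Under these assumptions the indirect effect is 0.424 x 0.275 = 0.1166 and the total effect is 0.6806, so most of the effect of PMB on ELOY is direct, which is consistent with partial rather than full mediation.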
Reliability and validity assessment of the model
A Cronbach’s alpha coefficient (CA) higher than 0.7 was used to evaluate scale
reliability. A scale has convergent validity when the composite reliability (CR) is
equal to or more than 0.7.
Variables   Cronbach's alpha (Alpha > 0.7)   Composite reliability (CR > 0.7)   Validity (AVE > -)
PMB
HV
ELOY
We see that all the measurements meet their thresholds. This means that our model is
reliable and valid.
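The reliability measures named in this section can be sketched as follows. This is an illustration under stated assumptions, not the study's computation: the function names are hypothetical, the loadings are the exploratory loadings from the factor table (treated here as if they were standardized CFA loadings), and the tiny item-score data set for Cronbach's alpha is invented for the example.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, all over the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    error = sum(1 - loading ** 2 for loading in loadings)
    return squared_sum / (squared_sum + error)


def average_variance_extracted(loadings):
    """AVE = mean of the squared loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


# Loadings copied from the factor table, keyed by factor number.
# Assumption: exploratory loadings stand in for standardized CFA loadings.
factor_loadings = {
    1: [0.747, 0.806, 0.792, 0.852],
    2: [0.733, 0.785, 0.851, 0.538],
    3: [0.936, 0.772, 0.412],
}
for factor, loadings in factor_loadings.items():
    cr = composite_reliability(loadings)
    ave = average_variance_extracted(loadings)
    print(f"factor {factor}: CR = {cr:.3f}, AVE = {ave:.3f}, CR >= 0.7: {cr >= 0.7}")

# Invented scores of five respondents on three items (NOT the survey data).
toy_items = [[4, 5, 3, 4, 2], [4, 4, 3, 5, 2], [5, 5, 2, 4, 3]]
print(f"Cronbach's alpha (toy data) = {cronbach_alpha(toy_items):.3f}")
```

Cronbach's alpha needs the raw item scores, whereas CR and AVE need only the loadings, which is why the two are computed from different inputs here.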
We performed an exploratory factor analysis to reduce the dimension of the data and
to understand which latent factors affect customer perception in the online
environment. In addition, confirmatory factor analysis was performed to examine
how well the data fit the model. CFA was also used to test the hypotheses that there
are relationships between the observed variables and the latent factors or constructs.
Some other multivariate methods could be used to extend the research. Canonical
correlation analysis is used to identify and measure the associations between two
sets of variables. Canonical correlation is appropriate in the same situations where
multiple regression would be, but where there are multiple intercorrelated outcome
variables. Canonical correlation analysis determines a set of canonical variates,
orthogonal linear combinations of the variables within each set that best explain
the variability both within and between sets.
References
1. Fabrigar, L. R., & Wegener, D. T. (2011). Exploratory factor analysis. Oxford
University Press.
2. Gorsuch, R. L. (1988). Exploratory factor analysis. In Handbook of multivariate
experimental psychology (pp. 231-258). Springer, Boston, MA.
3. Khoa, B. T. (2022). Dataset for the electronic customer relationship
management based on SOR model in electronic commerce. Data in Brief, 42,
108039.