IET Image Processing
Review Article
Urdu handwritten text recognition: a survey
ISSN-
Received on 8th April 2019
Revised 26th March 2020
Accepted on 14th April 2020
doi: 10.1049/iet-ipr-
www.ietdl.org
Mujtaba Husnain1, Malik Muhammad Saad Missen1, Shahzad Mumtaz1, Mickaël Coustaty2, Muzzamil
Luqman2, Jean-Marc Ogier2
1Department of Computer Science & IT, The Islamia University of Bahawalpur 63100, Pakistan
2L3i Lab, Université of La Rochelle, Av. Michel Crépeau, 17000 La Rochelle, France
E-mail:-
Abstract: Handwritten text recognition in Urdu script has been an active research area, and significant
progress has been made in this interesting and challenging field in the last few years. In this study, the authors present a
comprehensive survey of offline and online handwritten text recognition systems for Urdu script written in the Nastaliq
font style from 2004 to 2019. The following features make the contribution worthwhile and unique among reviews of a similar
kind: (i) the review classifies the existing studies based on the types of recognition systems used for Urdu handwritten text, (ii) it
offers a different outlook on the recognition process of Urdu handwritten text at different granularity levels (e.g.
character, word, ligature, or sentence level), (iii) it presents each of the surveyed articles along the following
dimensions: the task performed, its granularity level, the dataset used, the results obtained, and future directions, and (iv) lastly it
summarises the surveyed articles according to granularity levels, publishing years, related tasks or subtasks, and
types of classifiers used. In the end, major challenges and tasks related to Urdu handwritten text recognition approaches are
also discussed in detail.
1 Introduction
Handwriting recognition is an active area of research in the field of
pattern recognition and has various industrial and
professional applications. Some of these applications include forms
processing in government, administrative, health and academic
institutes, postal address recognition, processing of bank cheques
etc. Handwriting recognition is concerned with automatically transforming
a source language into its symbolic representation. The source
language can be represented either in its spatial (offline) or
temporal (online) [1] form in graphical marks. In-depth analysis of
handwritten text gives rise to a number of useful applications such
as author profiling [2], named entity recognition [3], recognition of
overlapped characters [4] etc.
In the late 1950s, the first optical character recognition (OCR)
system was developed for the recognition of Latin text [5, 6]; it
dealt with the recognition of numerals only. With the
advancements in OCR, the systems available nowadays have been
expanded to recognise Latin script and the characters of a variety of
other languages such as Chinese, Japanese, Arabic, Persian etc.
Optical character recognition of Urdu script started in late 2000
and the first work on Urdu OCR was published in 2004. The
literature review shows that there has been a lack of
research effort in Urdu handwritten text recognition as compared
to recognition of the scripts of other languages [7–9]. Furthermore, there
are a few commercially available Urdu OCR systems for printed text
[10, 11], but there is no system available for Urdu
handwritten text recognition to date.
In verbal communication, the Urdu language has adopted many
dialects across regions, but in formal writing Urdu script follows a
standard form. Furthermore, written Urdu
script shares similarities with other languages such as Arabic and
Persian [12]. Therefore, automatic interpretation of Urdu
handwritten text would have widespread benefits.
Urdu handwriting is used in writing official assignments and
documents in almost all organisations in Pakistan. Most of the
time, all this data (provided in the form of handwritten applications
or forms) is typed into computers for further processing, which
requires considerable manpower, processing equipment, time, money, and
other resources. For example, data entry in National Database and
Registration Authority offices in Pakistan for processing requests
for National Identity Cards, students' applications in government
institutes, signatures on bank cheques etc. require an automatic
Urdu handwritten text recognition tool to recognise the text and
process the documentation in a real-time environment. Furthermore,
the system should provide a facility to save the information in an
appropriate database. This would save a significant amount of
resources in daily official matters considering the nature of this
task.
IET Image Process.
© The Institution of Engineering and Technology 2020
The development of an Urdu handwritten recognition system can
assist in reading historic Urdu manuscripts to make the content of
these manuscripts available. The content of manuscripts is written
in a clear and readable way as compared to ordinary handwritten text,
which makes the task of recognising the contents of a manuscript
much simpler. On the other hand, some issues associated with
handwritten text make the task of developing a system for
recognition of handwritten text more challenging and complicated.
Some of these issues are differences in writing style (even from the
same author), image degradation due to the cursive nature of the script,
poor quality or illegible handwriting etc. [5, 6, 13].
This survey article gives a comprehensive survey of Urdu
handwritten text recognition. The most recent survey articles on this
topic were published in 2013 [7, 14, 15]. As mentioned above, this
survey article focuses on the literature related to offline Urdu
handwritten text recognition only. Furthermore, some recent
notable work on recognition of printed Urdu text can be found in
[16–19].
This paper is organised as follows: a brief introduction of the
Urdu script is given in Section 2. In Section 3, we describe basic
steps in intelligent character recognition (ICR) by describing the
usual process of Urdu text recognition. Furthermore, datasets
associated with Urdu handwritten text are also discussed. Then in
Section 4, state-of-the-art discussion on the existing work related to
Urdu ICR at different granularity levels is given in detail. Open
issues, future directions, and detailed analysis of the existing work
are presented in Section 5. The conclusion is given in Section 6.
2 Urdu script
Urdu is the national language of Pakistan and is also one of its
two official languages [20] (the other
being English). It is widely spoken and understood as a second
language by a majority of the people of Pakistan [21, 22] and is
increasingly being adopted as a first language by people living
in urban areas of Pakistan.
Urdu script is written from right to left while numerals are
written from left to right; for this reason, Urdu can be considered
a bidirectional language. Urdu script consists of 38
basic letters and 10 numerals, as shown in Fig. 1. This alphabet set
is also considered a superset of the alphabets of all Urdu script-based
languages, i.e. the Arabic script contains 28 while the Persian script
contains 32 alphabets [23]. Furthermore, the Urdu script also
contains some additional alphabets to express the Hindi phonemes.
Both the Hindi and Urdu languages [23] have the same phonology,
with the only difference being the writing style of the script. All the
Urdu-script-based languages such as Arabic and Persian have
some unique characteristics, i.e. (i) the script of these languages is
written from right to left in cursive style and (ii) the script of these
languages is context-sensitive, i.e. written in the form of ligatures,
each of which is a combination of one or more alphabets. Owing to
this context-sensitivity, most of the alphabets have different shapes
depending on their position and the adjoining character in the word
[10]. The connectivity of alphabets [24] has enriched the Urdu
vocabulary with almost 24,000 ligatures.
In Urdu script, alphabets are classified into two groups: joiner
and non-joiner [21–23]. The joiner characters join to other
characters at the initial, middle, and end positions in the ligature,
while non-joiners appear in isolated form. Fig. 2 shows a list of the
joiner and non-joiner alphabets of Urdu. It is observed that
writers may write the words of the Urdu script such that the
individual letters overlap with each other. This kind of overlapping
makes the segmentation process of splitting the words into characters
more complex and challenging. Furthermore, the writing style of
Urdu is also one of the reasons that make the script even more
complex to recognise.
Naskh and Nastaliq are two widely used fonts for writing
Urdu script. Nastaliq is more complex than Naskh [18, 23, 25]
since the Nastaliq font has more variations in the shape of an
alphabet depending on its position in the word than
Naskh, e.g. the alphabet bay has different shapes according to
its position in a word, as shown in Fig. 3.
The above-mentioned issues make the recognition of Urdu
handwritten text more challenging as compared to any other script.
There are some inherent characteristic features of Urdu alphabets
that may help in recognising Urdu handwritten text. These
features include accent and diacritical marks that help in
differentiating one character from another (number and position
of dots), number and position of loops and arcs etc.
It is pertinent to mention that the non-availability of
appropriate resources and required techniques is one of the reasons
that make the recognition of Urdu handwritten text a complex
task.
Fig. 1 Basic alphabets and numerals of Urdu script
Fig. 2 List of non-joiner and joiner alphabets of Urdu script
(a) Non-joiner, (b) Joiner alphabets in Urdu script
Fig. 3 Different shapes of Urdu alphabet at a different position in a word
3 Intelligent character recognition (ICR)
In the field of computer vision and pattern recognition, ICR is the
process of recognition of given handwritten text. However, a
complete handwriting recognition system also includes correct
segmentation of words into characters, formatting of the extracted
segments, and finding the most reasonable words. One of the
significant issues in processing the handwritten forms and
applications is that the data samples processed by the ICR systems
do not necessarily contain the characters in isolated form. In this
situation, the segmentation process is used to extract the specific
area of interest from the whole image. To resolve this issue and to
reduce the segmentation errors, the handwritten forms and
applications are designed in a way that the correct position of each
character can be calculated easily. The work reported in [26]
investigated the extent to which the reliability
provided by the ICR system allows the user to make use of a reject
option, i.e. to reject text images having more than one character
overlapping due to the cursive nature of handwriting, as well as
images containing single characters that are not correctly
recognised. This helps the system
designer to enhance the efficiency of the system as the volume of
forms containing less constrained data fields increases.
For the rest of the paper, the term ICR will be used for
handwritten text recognition. There are some issues in ICR such as
a change in font, the slope of the line, different writing styles even
from a single writer, overlapping joining letters, missing placement
of dots and diacritics (aka secondary strokes) etc. that make the
process of ICR more challenging than the recognition of printed
text. Furthermore, the cursive nature of Urdu script makes these
issues even more complex and challenging while recognising
handwritten Urdu text. These issues are discussed in detail in the
following subsections.
3.1 Urdu-based ICR systems
Urdu-based ICR systems can be divided into two types, i.e. online
and offline [7, 15, 23]. In online ICR, real-time text recognition is
performed using sensors to detect and analyse the pen tip
movement, stroke position, baseline etc. In offline ICR,
character recognition involves the automatic conversion of
handwritten text from a scanned image of the paper. Researchers have
also observed that offline character recognition is a
more complex process than online character recognition [27–29].
In practice, the textual data is given as input in the form of a
scanned image, which is analysed and recognised as machine-readable
characters. The text data may have different fonts and handwriting
styles that need to be preprocessed to produce an ideal and clear
view of the input data. ICR is different from OCR in the sense that
ICR is associated with handwritten text recognition while OCR
works on recognition of printed text. A typical ICR system
comprises three phases: (i) preprocessing, (ii) segmentation, and
(iii) recognition. In the first phase, a set of operations is
performed to reduce the ink-noise ratio because the input set of
handwritten documents may include inconsistent text owing to
different writing styles. These operations include skewness
smoothing, chain coding, and baseline removal [7, 8]. In the
segmentation phase, the scanned image is segmented at three levels
mentioned in [27, 28, 30]. In the first level, the text image is
segmented to extract the baseline by using a profile along the
Y-axis, known as vertical projection. In the second level, the
horizontal projection profile is calculated for every row as the sum
of all column pixel values inside the row. In the last level, the
image segments are further segmented to extract the basic
graphical and geometrical components of the text image. To get
finer results from segmentation, one has to move through all
three levels. In the last phase, recognition is performed, in
which the segmented text data is scanned and matched with the
stored training set data.
Fig. 4 Basic geometrical strokes of Urdu script [12]
Table 1 Details of Urdu datasets
The application of ICR has increased its efficacy towards
automatic recognition of real-world handwritten documents to
make them useful for various business and academic applications.
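The projection profiles used in the segmentation phase of Section 3.1 can be illustrated with a minimal sketch. It assumes a binarised image stored as a list of rows (1 = ink, 0 = background); the function names and the toy image are hypothetical and are not taken from the surveyed systems.

```python
def horizontal_projection(image):
    """Sum of ink pixels in every row (one value per row)."""
    return [sum(row) for row in image]


def vertical_projection(image):
    """Sum of ink pixels in every column (one value per column)."""
    return [sum(col) for col in zip(*image)]


def split_runs(profile, threshold=0):
    """Return (start, end) index pairs of consecutive entries above threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i                      # a run of ink begins
        elif value <= threshold and start is not None:
            runs.append((start, i))        # the run ends at a blank entry
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


# Toy binary image: two text lines separated by one blank row.
image = [
    [0, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
]
row_profile = horizontal_projection(image)
col_profile = vertical_projection(image)
line_spans = split_runs(row_profile)
```

Blank entries in the row profile separate text lines; the same run-splitting applied to the column profile of a single line would give candidate ligature boundaries, although overlapping Nastaliq strokes often defeat such a simple rule.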
3.2 Datasets related to Urdu handwritten text
One of the biggest issues that an artificial intelligence expert
encounters, besides designing a model to solve a specific problem,
is having an appropriate dataset that directly relates to the
problem at hand. Furthermore, the dataset should be processed in a
way that lets the designed model make sense of the information.
In this subsection, the datasets associated with Urdu
handwritten text that are used in this survey article are discussed in
detail.
To the best of our knowledge, there are five datasets available for Urdu
handwritten text, namely Urdu Nastaliq Handwritten Dataset
(UNHD) [31], Urdu Printed Text Image Database (UPTI) [12],
Centre for Pattern Recognition and Machine Intelligence
(CENPARMI) [32], Cursive and Language Adaptive Methodology
(CALAM) [33] and Prince Mohammad Bin Fahd University-Urdu/
Arabic Database (PMU-UD) [34]. Among these datasets, UNHD
[31] and UPTI [12] are the most promising since both
corpora contain an appropriate amount of data
instances at the word and character level. These basic instances can
help researchers to perform recognition and classification
operations using state-of-the-art machine learning approaches such
as deep networks. Furthermore, these two datasets may be
regarded as benchmark datasets since no other dataset
has as many data points. Table 1 summarises the details of
the Urdu handwritten data corpora discussed in the literature.
4 Exploratory analysis
As mentioned earlier, the current survey is conducted along four broad
dimensions, namely the designated task completed by the Urdu ICR
system; its granularity level; the dataset used; and the quality of results
obtained. Our aim is to assess in detail the different
approaches used along the above-mentioned dimensions in the domain
of Urdu ICR. In general, we categorised the tasks and issues
discussed above based on different levels, as given below.
• Character level recognition
• Word level recognition
• Ligature level recognition
• Sentence level recognition
Dataset | Statistics | Price
UNHD [31] | total number of writers 500; text lines per page 8; total number of text lines 10,000; total number of words 312,000; total number of characters 187,200 | public
UPTI [12] | total number of writers 250; total number of text images 10,000; text lines per page 6; total number of characters 970,650 | US250
CENPARMI [32] | total number of writers 343; total number of text images 19,432; total samples of Urdu handwritten digits 180 | US500
CALAM [33] | total number of writers 725; total number of images 1200; total number of text lines 3,043; total number of words 46,664; total number of ligatures 101,181 | US400
PMU-UD [34] | total number of writers 70; total number of text images 5,180; total number of Urdu digits 10 | public
4.1 Character level recognition
It is noteworthy that the efficacy of the approach used in Urdu ICR
is dependent on the performance of the method proposed.
Therefore, an effective combination of the methods used in Urdu
ICR can eventually help in recognising the alphabets correctly. The
following two analysis tasks come under the umbrella of Urdu ICR
at character level recognition: (i) Urdu alphabet recognition and (ii)
Urdu numerals recognition. This section will cover the approaches
that focus on the above-mentioned tasks.
4.1.1 Urdu alphabet recognition: Handwritten text recognition at
the character level is a challenging task because of the large
number of variations in writing styles (even from a single author).
The literature on character level
recognition in Urdu script shows that the artificial neural network
(ANN) and its different variants are widely used. An ANN [35] is a
collection of nodes (aka artificial neurons) linked with each other.
The links between artificial neurons transmit signals from one
neuron to another within the network. Each neuron processes the
signals it receives and then propagates the result to the neurons
connected in subsequent layers. The structure of the ANN may be affected by
the kind of information flowing through it because a neural
network usually trains itself using the input and labelled output.
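The signal propagation described above can be made concrete with a minimal forward pass through a small feed-forward network. This is only a sketch: the 3-2-1 layout, the sigmoid activation, and the hand-picked weights are illustrative assumptions and do not correspond to any network in the surveyed work.

```python
import math


def sigmoid(x):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then sigmoid, per neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Propagate a signal through the layers in order."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal


# Hypothetical 3-2-1 network: each layer is (weights per neuron, biases).
layers = [
    ([[0.5, -0.5, 0.0], [0.0, 0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, -1.0]], [0.0]),
]
output = network_forward([1.0, 1.0, 1.0], layers)
```

Training adjusts the weights and biases so that the output matches the labelled target; only the forward direction is shown here.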
The problem of developing a generic type of ICR that can
resolve the issues associated with any language is challenging since
different languages exhibit different characteristic features and thus
generalising this type of system is difficult. To overcome this
problem, a novel approach is proposed in [29], based on the idea that the
character set of any language can be represented by primitive
geometrical strokes. One of the promising features of the approach
is that the recogniser (an artificial neural network) has to be trained
only once. The data structure of the character set should be
represented in the form of geometrical strokes in an XML file.
This file helps in training the neural network only once for each
word in the language. Fig. 4 shows a set of 13 basic geometrical
strokes. For evaluation purposes, a set of 25 handwritten Urdu text
samples was tested and a success rate of 75–80% was achieved. One
of the limitations of this approach is that it does not apply to
words having dots and diacritics.
Owing to having a large character (or alphabet) set, there is an
inherent similarity among some major strokes as shown in Fig. 4.
This similarity is one of the challenging issues in Urdu ICR.
Fig. 5 Urdu alphabets
(a) Redundancy of strokes among Urdu alphabets, (b) Grouping of characters based on the number of strokes [36]
Fig. 6 Features
(a) Initial trend for Alif, (b) Character-box-slope for seen, (c) Cusp in hey, (d) Feature
vector for intersections of Daal with axes (U: up, D: down, R: right, L: left), (e)
Anticlockwise finishing trend of Aen [36]
Fig. 7 Alphabets ‘De’ (left) and ‘Daal’ (right) were repeatedly misrecognised
Keeping in view the above-mentioned fact, in [36], the authors
divided the Urdu alphabet set in four groups according to the
number of strokes as shown in Fig. 5b.
The authors performed online Urdu ICR considering single-stroke
characters only. Some novel features (shown in Fig. 6) are
extracted and then fed to three different classifiers, namely a
back-propagation neural network (BPNN), a probabilistic neural network
(PNN), and a correlation-based classifier. The proposed approach is
tested on 85 instances of single-stroke characters taken from 35
writers of different age groups. The results showed that the PNN
classifier achieved a higher accuracy of 95% as compared to the
other two classifiers. Unlike BPNN, PNN-based classifiers
require no initial training. This is the reason PNN-based classifiers
achieved higher accuracy than BPNN.
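The remark that a PNN needs no iterative training can be seen in a minimal sketch of the classifier: the stored training samples act as the pattern layer, and classification is a kernel-weighted vote. The two-dimensional feature vectors, the labels, and the smoothing parameter `sigma` below are hypothetical and are not the features of [36].

```python
import math


def pnn_classify(sample, training_set, sigma=0.5):
    """Classify `sample` with a probabilistic neural network.

    training_set: list of (feature_vector, label) pairs kept as-is;
    no weights are learned.
    """
    sums, counts = {}, {}
    for features, label in training_set:
        # Pattern layer: Gaussian kernel centred on each stored sample.
        sq_dist = sum((a - b) ** 2 for a, b in zip(sample, features))
        activation = math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Summation layer: accumulate activations per class.
        sums[label] = sums.get(label, 0.0) + activation
        counts[label] = counts.get(label, 0) + 1
    scores = {label: sums[label] / counts[label] for label in sums}
    # Decision layer: pick the class with the largest average activation.
    return max(scores, key=scores.get), scores


training_set = [
    ([0.0, 0.0], "alif"), ([0.1, 0.2], "alif"),
    ([1.0, 1.0], "daal"), ([0.9, 1.1], "daal"),
]
label, scores = pnn_classify([0.2, 0.1], training_set)
```

The only tunable quantity is `sigma`, which controls how far the influence of each stored sample reaches.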
For isolated character recognition, the authors in [37] proposed
a technique in which a feature vector is built by analysing the
primary and secondary strokes made while writing Urdu characters in
isolated form. Some of the stroke features that were used to train
the classifier are the length of the bounding box diagonal; the angle
of the bounding box diagonal; the distance between the first and
last point; the sine and cosine of the angle between the first and last
point; the total length of the primary stroke; and the total angle
traversed. A linear classifier is applied to a data set of five
samples each of 38 Urdu alphabets, i.e. a total of 190 characters,
provided by two different writers who can write Urdu characters
smoothly. The classifier recognised the characters with an error
rate of almost 6% because some characters share
quite similar shapes (see Fig. 7) and are not correctly recognised.
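The stroke features listed above can be computed from the sampled pen positions of a single stroke. The sketch below is one possible reading of those features, not the implementation of [37]: the dictionary keys are hypothetical names, and the "total angle traversed" is assumed to be the signed sum of turning angles between consecutive segments.

```python
import math


def stroke_features(points):
    """Compute simple geometric features for one pen stroke.

    points: list of (x, y) pen positions in writing order (at least two).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    width, height = max(xs) - min(xs), max(ys) - min(ys)

    # Vector from the first to the last point of the stroke.
    dx = points[-1][0] - points[0][0]
    dy = points[-1][1] - points[0][1]
    end_distance = math.hypot(dx, dy)

    total_length = 0.0
    total_angle = 0.0
    for i in range(1, len(points)):
        ax = points[i][0] - points[i - 1][0]
        ay = points[i][1] - points[i - 1][1]
        total_length += math.hypot(ax, ay)
        if i + 1 < len(points):
            bx = points[i + 1][0] - points[i][0]
            by = points[i + 1][1] - points[i][1]
            # Signed turning angle between consecutive segments.
            total_angle += math.atan2(ax * by - ay * bx, ax * bx + ay * by)

    return {
        "bbox_diagonal_length": math.hypot(width, height),
        "bbox_diagonal_angle": math.atan2(height, width),
        "end_distance": end_distance,
        "end_cosine": dx / end_distance if end_distance else 0.0,
        "end_sine": dy / end_distance if end_distance else 0.0,
        "total_length": total_length,
        "total_angle": total_angle,
    }


# An L-shaped toy stroke: 3 units right, then 4 units up.
features = stroke_features([(0, 0), (3, 0), (3, 4)])
```

A feature vector for a linear classifier would simply be the values of this dictionary in a fixed order.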
A similar work is reported in [38] by considering the initial half
forms of different Urdu characters. In this work, only those characters
were considered that change their shapes with respect to their
position and context in a word. Fig. 8 depicts the Urdu alphabets in
their initial half forms, classified on the basis of the number
of strokes. Almost 100 native Urdu writers and speakers were
invited to write in Urdu script. The writers were provided with a
stylus and a digitising tablet, yielding a dataset of 3600 instances of
Urdu letters in the initial half form. A combination of multilevel
one-dimensional wavelet analysis with the Daubechies wavelet [39–
42] is applied to extract features from these instances. A number of
neural networks with different configurations were trained for
recognition purposes. Among these networks, BPNN provided a
maximum recognition rate of 92%.
Fig. 8 Classification of initial half forms based on the number of strokes
[38]
In [43], the authors introduced
a novel similar character discrimination method for recognition of
handwritten Urdu characters in online mode. The proposed
methodology is a three-step model comprising pre-classification,
feature extraction, and fine classification. In the
pre-classification phase, the discrimination of similar characters is
performed by making smaller subsets based on stroke number and
diacritics. Secondly, the structural features and wavelet features are
extracted manually. Finally, a group of different machine learning
classifiers such as support vector machines (SVMs), ANNs, and
recurrent neural network (RNN) classifiers are applied separately
and the obtained results are compared for fine classification within
subsets. The RNN classifier was also applied
without the proposed pre-classifier and features in order to
check its end-to-end capability.
Furthermore, the experimental results show that the
proposed method is efficient and achieves an overall accuracy of
96% on a large-scale self-collected dataset. It is pertinent to
mention that the proposed approach can also be applied to
other Arabic-based scripts.
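The multilevel one-dimensional wavelet analysis used for feature extraction in [38] can be illustrated with the Haar wavelet, which is the simplest member of the Daubechies family (db1). The order of the wavelet and the signal being decomposed are not stated in the text above, so the following is a minimal sketch that assumes a pen-coordinate sequence whose length is a power of two; the function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the Haar (db1) transform: scaled pairwise sums and differences."""
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail


def haar_decompose(signal, levels):
    """Multilevel decomposition: [approx_L, detail_L, ..., detail_1]."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        details.insert(0, detail)   # coarsest detail band first
    return [approx] + details


# Hypothetical x-coordinates sampled along one pen stroke.
signal = [4, 6, 10, 12, 8, 6, 5, 5]
coeffs = haar_decompose(signal, 2)
# Flatten all bands into a single feature vector for a classifier.
feature_vector = [value for band in coeffs for value in band]
```

Higher-order Daubechies wavelets use longer filters than the two-tap Haar filter but follow the same repeated low-pass and high-pass splitting.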
The multidimensional long short term memory (MDLSTM) neural
network is a type of RNN that is used for sequence
learning and segmentation in a multidimensional environment [44–
46]. This model was used for the first time in the work of [47] for
Urdu script recognition. One of the promising features of the model
is that it can scan the input image in all four directions, thus
reducing the chance of ambiguity. For evaluation purposes, the UPTI
(Urdu Printed Text Image) dataset [12] is used, which contains 10,000
scanned images of both Urdu handwritten and printed text.
MDLSTM is a supervised technique; therefore, each
input sample in the dataset is tagged and labelled with the
appropriate information, as shown in Fig. 9. The dataset is
divided according to the following ratio: 68% for training and 16% each
for validation and testing. To evaluate the accuracy of
the proposed approach, the Levenshtein edit distance [48] is computed
between the output text and the base-line results; the approach achieved an
accuracy of 98%, compared with the 88.94 and 89.00% reported in
[49, 50], respectively. Table 2
shows a comparison of the proposed approach on the UPTI dataset
[12] with other techniques.
Fig. 9 Tagged sample from UPTI dataset
(a) Text line image, (b) Ground truth or transcription [18]
Table 2 Comparison of the results obtained on UPTI
dataset using different methods
Reference | Features | Approach | UPTI dataset
[47] | pixels | MDLSTM | 68% training, 16% validation, 16%
[49] | pixels | BLSTM | 46%, 44% validation, 10%
[50] | pixels | BLSTM | 46% training, 34% validation, 20%
[51] | statistical features | MDLSTM | 46%, 16% validation, 16% test
Accuracy, %: -, 94.85, 94.7
A promising work is reported in [31] in which Urdu
handwritten text is recognised using UNHD [52]. This dataset on
the Urdu language
(https://sites.google.com/site/researchonurdulanguageresearch)
can be accessed publicly. The dataset
contains 312,000 words (including both Urdu script and Urdu
numerals) written on a total of 10,000 lines by 500 writers of
different age groups. The writers were directed to write on white
pages of size A4. Each individual was provided with six blank pages
labelled with the author ID and the page number. One of the samples
of written pages is shown in Fig. 10. Furthermore, to maintain
uniformity in the data, the writers were asked to copy the provided
printed text. To recognise the text, a bidirectional long short term
memory (BLSTM)-based approach is proposed, which is based on an
RNN capable of retaining the previous sequence information. For
evaluation purposes, the dataset is divided into 50% for training,
30% for validation, and 20% for testing; the approach achieved a 6-8% error
rate, which can be improved using two-dimensional BLSTM as
proposed by the authors. Table 3 gives the summary of accuracy
reported on common datasets in the Urdu domain.
Fig. 10 Different samples of input images with noise and without noise
[31]
Fig. 11 Grouping of Urdu characters according to shape similarity. Each
group is numbered from right to left [55]
Table 3 Accuracy reported on common datasets
Dataset | Reference | Accuracy, % | Approaches
UPTI [12] | [53] | - | MDLSTM (Leven. Dist)
UPTI [12] | [54] | - | MDLSTM (CTC)
UNHD | [31] | - | BLSTM
CENPARMI [32] | [9] | - | SVM
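The Levenshtein edit distance used to score the recognised text of [47] counts the minimum number of character insertions, deletions, and substitutions needed to turn one string into another. The sketch below implements the standard dynamic-programming form; the normalisation into an accuracy figure (one minus the distance divided by the reference length) is an assumption, since the surveyed text does not state how the distance is converted into a percentage.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits that turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_accuracy(predicted, reference):
    """Assumed normalisation: 1 - edit distance / reference length, floored at 0."""
    if not reference:
        return 1.0 if not predicted else 0.0
    return max(0.0, 1.0 - levenshtein(predicted, reference) / len(reference))


distance = levenshtein("kitten", "sitting")
accuracy = character_accuracy("abcd", "abce")
```

Because Python strings are sequences of Unicode code points, the same functions work unchanged on Urdu transcriptions, although diacritics stored as separate combining characters are then counted as individual symbols.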
In [53], the authors proposed a novel approach for character-level
recognition of Urdu text written in the Nastaliq font by
combining a convolutional neural network (CNN) and MDLSTM
(convo-recursive deep learning model). In the first phase, the CNN is
deployed to extract the characteristic features, which are then fed to
the MDLSTM in the second phase. This approach outperformed the
state-of-the-art systems on the UPTI data set.
In a very recent work [55], the authors proposed the use of the
convolutional neural network to recognise the multi-font offline
Urdu handwritten characters in an unconstrained environment.
They also introduced a novel dataset of Urdu handwritten
characters and numerals by inviting a number of native Urdu
speaking people from different age groups and academic level.
They grouped the Urdu isolated characters based on the shape
similarity, as shown in Fig. 11. A series of experiments were
performed on their proposed dataset. The accuracy achieved for
character recognition is among the best while comparing with the
ones reported in the literature for the same task.
4.1.2 Urdu numeral recognition: It is quite easy for a human
being to recognise the handwritten digit data but for the computer
system, there is a need for an intelligent approach based on some
machine learning algorithms developed for this kind of job. The
digit writing stroke, length, width, orientation, and other
geometrical features tend to change while writing the same digit
even by the same author. These different writing styles may
introduce shape variations of Urdu numerals that may break the
strokes primitives and also change their topology. These issues
make Urdu handwritten digit recognition one of the active research
areas in the field of image processing.
Unfortunately, there is no commercially available standard
dataset of Urdu numerals. Owing to this lack of resources,
researchers have developed their own datasets and reported results on them.
This section covers some notable work related to handwritten digit
recognition in the Urdu domain.
In [56], different transformations of the Daubechies wavelet [39–
42] are applied for feature extraction from a dataset of about
2150 samples of handwritten Urdu digits. A sample of the dataset
is shown in Fig. 12. For evaluation purposes, 2000 samples
were used for training the neural network and 150 instances for
testing. To decompose the images into different frequency bands,
both low-pass and high-pass filtering are applied at each phase
of Daubechies wavelet [39–42] filtering. For classification
purposes, BPNN is used and achieved an average recognition rate
of 92.05% as shown in Fig. 13. The recognition times (in seconds)
are 1.0247, 1.1015, 2.003, 1.4659, 1.501, 1.1163, 1.4792, 1.508,
1.5725, and 1.4754, respectively, for each of the ten wavelet filters,
giving an average recognition time of about 1.42 s.
Fig. 12 Handwritten Urdu digit samples [56]
Fig. 13 Recognition rate using different Daubechies wavelets for
handwritten digits [56]
In [25, 57], the authors presented the similarities and
dissimilarities between Urdu and Arabic script in the recognition of
handwritten numeric data. A hybrid technique of hidden Markov
model (HMM) and fuzzy rules is used to recognise the
handwritten digits of both Arabic and Urdu script. The dataset was
prepared by inviting 30 trained users to write both the Urdu and
Arabic numerals, giving 900 samples in total. The system
obtained 97, 96, and 97.8% recognition rates using the fuzzy rule,
HMM, and hybrid approaches, respectively. The authors also
conclude that the separation of numerals from Urdu text in a
handwritten document is still a challenging issue due to shape
similarity, e.g. the first alphabet of Urdu script (Alif) and the Urdu
numeral (One) are both represented by the same symbol.
Similarly, in [34], a multi-language handwritten numeral
recognition system is proposed using novel structural features. A
total of 65 local structural features are extracted and several
classifiers are used for testing numeral recognition. Random forest
was found to achieve the best results with an average recognition rate
of 96.73%. The proposed method is tested on six different popular
languages, including Arabic Western, Arabic Eastern, Persian,
Urdu, Devanagari, and Bangla. It is pertinent to mention that in this
study, only those digits in the languages are chosen that do not
resemble each other. Yet using the novel feature extraction method,
a high recognition accuracy rate is achieved. The experiments are
performed on well-known available datasets of each language. A
dataset for the Urdu language is also developed in this study and
introduced as PMU-UD
(https://archive.ics.uci.edu/ml/datasets/PMU-UD). The results indicate
that the proposed method gives
higher recognition accuracy than other methods.
Fig. 14 Grayscale and gradient strength of Urdu ligature at different level
(a) Greyscale image of size 128 × 128, (b) Gradient strength, (c) Gradient direction,
(d) Division of gradient image into 9 × 9 blocks [9]
Table 4 Experimental results with different image sizes and
gradient features [9]
Image size: 64 × 64, 64 × 128, 128 × 64, 64 × 128
Feature size: -
Results: 94.96% (3580/-% (3590/-% (3585/-% (3621/3770)
Fig. 15 Sample sentence in Urdu script. Underlined words are showing
named entities
4.2 Word level recognition
Recognition of a cursive script such as Urdu is performed by
analysing its fundamental units, i.e. the individual letters that
combine to form words. Several techniques have been used so far in
this field to enhance recognition results. This section covers notable
work related to the recognition of Urdu handwritten text at the
word level.
In [9], a set of gradient and structural features was extracted
and fed to an SVM to recognise Urdu handwritten words from a given
text. The Roberts filter mask [58] is applied to extract gradient
features from a 128 × 128 grey-level image of handwritten text. To
reduce the effect of noise and spatial resolution, Gaussian filters
[59] are used to normalise the gradient image into 9 × 9 blocks, as
shown in Fig. 14. Structural features of an image, including the
projection profile, topological features, and upper and lower
profiles, are extracted by calculating the distance of pixel positions
from the upper right and upper left corners of the image. These
calculations are then normalised using Gaussian-like estimations. For
evaluation purposes, the CENPARMI dataset [32] is divided into
training (60%), validation (30%), and testing (20%) sets. Table 4
depicts the experimental results with different image sizes and
gradient features.
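The gradient part of such a pipeline can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in [9]: it assumes the Roberts cross operator for the gradient, omits the Gaussian smoothing step, and simply averages gradient strength over a 9 × 9 grid of blocks. The function names `roberts_gradient` and `block_means` are hypothetical.

```python
import math


def roberts_gradient(img):
    """Return (strength, direction) maps using the Roberts cross operator.

    img is a 2-D list of grey levels. Each 2x2 neighbourhood yields one
    gradient value, so the outputs are one row and one column smaller
    than the input.
    """
    h, w = len(img), len(img[0])
    strength = [[0.0] * (w - 1) for _ in range(h - 1)]
    direction = [[0.0] * (w - 1) for _ in range(h - 1)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x] - img[y + 1][x + 1]  # first diagonal difference
            gy = img[y][x + 1] - img[y + 1][x]  # second diagonal difference
            strength[y][x] = math.hypot(gx, gy)
            direction[y][x] = math.atan2(gy, gx)
    return strength, direction


def block_means(values, blocks=9):
    """Average a 2-D map over a blocks x blocks grid (row-major list).

    Assumes the map has at least `blocks` rows and columns, so that no
    block is empty.
    """
    h, w = len(values), len(values[0])
    # Integer boundaries; block sizes differ by at most one pixel.
    ys = [(i * h) // blocks for i in range(blocks + 1)]
    xs = [(j * w) // blocks for j in range(blocks + 1)]
    feats = []
    for i in range(blocks):
        for j in range(blocks):
            cells = [values[y][x]
                     for y in range(ys[i], ys[i + 1])
                     for x in range(xs[j], xs[j + 1])]
            feats.append(sum(cells) / len(cells))
    return feats


# A 128 x 128 horizontal intensity ramp stands in for a word image.
image = [[x for x in range(128)] for _ in range(128)]
strength, direction = roberts_gradient(image)
features = block_means(strength)  # 81 values, one per block
```

With a 128 × 128 input, the strength map is 127 × 127 and the feature vector has 9 × 9 = 81 entries; a real system would typically also bin the gradient direction within each block before passing the features to a classifier.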
Named-entity recognition (NER) is the task of extracting and
identifying real-world objects, such as persons, organisations, and
locations, that can be denoted by a proper name [60]. For example, in
Fig. 15, the underlined words, glossed as Bahawalpur (name of a city)
and Punjab (name of a state of Pakistan), are the named entities in
Urdu. Two approaches are used for the development of NER systems in
the Urdu domain [61]: (i) rule-based approaches and (ii) statistical
approaches. Rule-based approaches rely on gazetteer lists written
manually by linguists. This approach is language-dependent, i.e. the
techniques used to develop a rule-based system for one language cannot
be applied to other languages. The authors of [61] also performed
online recognition of Urdu handwritten text using a tree-based
dictionary approach and achieved a 96% accuracy rate.
In [62], the authors used a rule-based system and achieved an
accuracy rate of 75% in extracting 13 different named entities from
two different sets of Urdu documents collected from news
resources. A similar approach is proposed in [63] in which about
2000 Urdu documents were analysed for extracting different named
entities thus achieving 73% accuracy.
IET Image Process.
© The Institution of Engineering and Technology 2020
In statistical approaches for handwritten text recognition,
different machine learning models, such as HMMs, SVMs, and Bayesian
decision theory, are also used. These approaches are
language-independent and easily applicable to multiple languages. In
[64], the authors used two different models, maximum entropy (ME) and
conditional random field (CRF), for information extraction. For
evaluation purposes, a corpus of 50,000 tokens was carefully analysed
to extract a number of different named entities. The F-measure was
55.30% for the ME-based approach and 68.90% for the CRF-based
approach.
A multilingual approach is proposed in [65], in which the
authors developed a statistical CRF model for extracting named
entities from South and Southeast Asian languages, particularly
Bengali, Hindi, Telugu, Oriya, and Urdu. For Bengali and Hindi,
gazetteer lists were also used. The system achieved F-measures of
59.39% for Bengali, 33.12% for Hindi, 28.71% for Oriya, 4.749% for
Telugu, and 35.52% for Urdu. Table 5 summarises different approaches
used in developing the Urdu NER system.
Modern mobile phones are equipped with pointing devices that allow
script in any language to be written rather than typed. This has
attracted researchers to develop systems that can read such text in
real time. To recognise Urdu handwritten text in this setting, a
mobile-based system was proposed in [66] using a number of
classifiers, such as HMM, fuzzy logic, k-nearest neighbour (k-NN),
hybrid HMM fuzzy, hybrid k-NN fuzzy, and CNN. Statistical parameters
are checked and validated to examine the performance of the
classifiers. The experimental results showed that applying the
proposed preprocessing algorithms with a rule-based approach
considerably enhanced the recognition rates. The vector quantisation
method provided better results for the recognition of online Urdu
handwritten characters, while the hybrid k-NN and fuzzy classifier
showed promising results for the recognition of ligatures and words on
mobile phones.
In [67], the authors presented a quite different approach for
reading cursive handwriting. The method recovers the order of writing
from static images of handwriting: a novel segmentation algorithm
decomposes the ‘unfolded’ ink into strokes. An ink-matching step then
compares the ink of an unknown handwriting sample with a set of
reference words whose transcripts are given, and a graph search
algorithm searches for the best interpretation among the possible
ones. The novelty lies in the fact that the proposed method involves
neither a feature extraction nor a classification stage and may
benefit from a linguistic context, if available. The authors reported
the results of experiments on 8000 samples written in different
cursive languages, drew some conclusions, and outlined further
developments.
Recognition at the word level requires Unicode formatting and the
use of a proper font because of the cursive nature of Urdu script.
Furthermore, new techniques, such as deep neural networks with
enhanced mapping functions, are needed to upgrade the existing ones
and resolve the issues in recognising Urdu text at the word level.
4.3 Ligature level recognition
As mentioned earlier, ligatures are formed by combining letters
in Urdu script. Some examples of ligatures are given in Fig. 3.
When a ligature is written in Urdu script, letters may change their
shape based on their position in the word (or ligature). In this
section, work done in Urdu ICR at the ligature level is reviewed in
detail.
In [68], the authors proposed a novel segmentation free
approach for online recognition of Urdu handwritten script by
using different features of ligatures. These features include position
and number of the loop(s), number of intersections, the direction of
ligature etc. The ligatures can be written by using primary and
secondary strokes. For stroke identification, the authors used the
backpropagation neural network. The proposed system was trained
on a set of 240 ligatures written by different native Urdu writers. A
total of 860 ligatures were tested, for evaluation purposes, with an
accuracy rate of 93% (base ligatures) and 98% (ligature of
secondary strokes).
A similar task is performed in [69] using two different methods,
the HMM and a fuzzy logic classifier. Input strokes were initially
divided into 62 classes based on the starting and ending style of
the ligature, using fuzzy rules. For the base stroke ligatures, 26
temporal features were extracted. To reduce time complexity, these
were reduced to 16 characteristic features. About 1800 ligatures were
collected from 15 trained native Urdu writers and were classified
with an accuracy rate of 87.65%.
A novel approach is proposed in a recent work reported in [70]
in which the bottom-up technique is used to group the connected
components of ligatures written on the same line of the cursive text
document. The traced sequence of the connected components is
stored as a feature vector and labelled with appropriate indices. The
proposed approach is tested on various samples of the dataset
generated by inviting native Urdu writers. Accuracy of about 96%
is achieved from experimental results in recognising ligatures. In a
notable work of [4], the authors performed a similar task on Urdu
printed text using a cross-correlation technique on a small subset of
Nastaliq text.
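The bottom-up grouping of connected components mentioned above can be illustrated with a small sketch. It is not the algorithm of [70]: it assumes a binarised image (1 = ink), 8-connectivity, and a simple right-to-left ordering of the components by bounding box, and the function names `connected_components` and `right_to_left` are hypothetical.

```python
from collections import deque


def connected_components(binary):
    """Find 8-connected ink regions in a binary image (1 = ink).

    Returns one bounding box (top, left, bottom, right) per component,
    in the order in which the components are first met during a
    row-by-row scan.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:  # breadth-first flood fill of one component
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def right_to_left(boxes):
    """Order components by right edge, right-most first (Urdu reading order)."""
    return sorted(boxes, key=lambda b: b[3], reverse=True)


# Toy text-line image with three separate ink blobs.
line = [
    [1, 1, 0, 0, 0, 1],
    [1, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 0],
]
ordered = right_to_left(connected_components(line))
```

In a ligature-level system, diacritic components (secondary strokes) would additionally have to be attached to the nearest base component; that association step is left out here.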
In a recent work [71], a holistic OCR system is proposed for the
printed Urdu Nastaliq font, using statistical features and hidden
Markov models to recognise Urdu ligatures. For training purposes,
1525 unique high-frequency Urdu ligature clusters are selected from
the standard UPTI database. The ligature dataset is first split into
primary and secondary ligatures, which are recognised separately. In
the second phase, a heuristic approach is applied to recognise the
complete ligature by associating secondary ligatures with the primary
ligature. The model is assessed through various experiments and
achieves an accuracy of 92.20%, in comparison with recent studies on
this problem.
In a noteworthy work [72], the authors developed an Urdu OCR
system that claims to recognise Urdu handwritten text at the
ligature level. The important achievement of this approach is that it
resolves the segmentation problems that arise while pre-processing
Urdu handwritten text with overlapping characters. The proposed
approach is based solely on semi-supervised multi-level clustering to
categorise and recognise the ligatures. The classification is then
performed using four widely used machine learning techniques, i.e.
decision trees, linear discriminant analysis, naive Bayes, and
k-nearest neighbour (k-NN). The results showed an overall accuracy of
62, 61, 73, and 90% for the decision tree, linear discriminant
analysis, naive Bayes, and k-NN, respectively.
4.4 Sentence level recognition
Table 5 Summary of approaches used for developing Urdu NER system
Reference    Approach used    F-measure, %    Corpus
[62]         rule-based       -               1,62,275 tokens
[63]         rule-based       -               36,000 tokens
[64]         ME, CRF          -               55,000 tokens
[65]         CRF              -               36,000 tokens
In [54], the authors utilised an implicit segmentation-based
recognition technique for Urdu text in the Nastaliq style. A
multidimensional RNN (MD-RNN), in combination with long short-term
memory (LSTM) and a connectionist temporal classification (CTC) output
layer, is trained on features extracted by a sliding window over
Urdu text lines. The reason for using MD-RNNs is to replace
the single recurrent connection found in conventional RNNs with
as many recurrent connections as there are dimensions in the
data. During the forward pass at each point in the data sequence,
Fig. 16 MD-RNN-based recognition system for Urdu Nastaliq [54]
Table 6 Distribution of the articles published year-wise
Year
References
No. of articles-
total
[29]
[61]
[17, 18, 68]
[4, 74]
[25, 36, 37, 57]
[9, 19, 63–65, 69]
[3, 24, 62]
[38, 60, 70]
[56]
[47, 54, 73]
[11, 31, 53, 71]
[43, 72, 75]
[55, 66]
-
the hidden layer of the network receives both an external input and
its own activation from one step back along all dimensions. Fig. 16
depicts the overall system, which is divided into three stages: input,
MD-LSTM, and output layers.
In the first phase, overlapping sliding windows are used to
extract features from a greyscale image of the Urdu text line.
As a result, a sequence of feature vectors is generated for each text
line. Secondly, the sequence of feature vectors of each text line
is given to the MD-LSTM for training. Finally, the CTC layer assigns
appropriate labels to this sequence. This methodology was applied to
the UPTI dataset [12] and achieved an accuracy rate of 96.4%, in
comparison with some other recognition systems using the same
technique.
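The first and last of these stages can be sketched in a few lines. The sketch makes several assumptions that are not taken from [54]: raw pixel values stand in for the extracted features, the windows are taken right to left, and the CTC output is decoded greedily. The names `sliding_windows` and `ctc_collapse` are hypothetical, and the MD-LSTM itself is not shown.

```python
def sliding_windows(line_img, width=4, step=2):
    """Cut a text-line image into overlapping vertical windows.

    line_img is a 2-D list (rows x columns) of grey levels. Windows are
    taken right to left (an assumption matching Urdu writing direction).
    Each window is flattened column by column into one feature vector
    of raw pixel values.
    """
    h, w = len(line_img), len(line_img[0])
    sequence = []
    start = w - width
    while start >= 0:
        vec = [line_img[y][x]
               for x in range(start, start + width)
               for y in range(h)]
        sequence.append(vec)
        start -= step  # step < width gives overlapping windows
    return sequence


def ctc_collapse(frame_labels, blank=0):
    """Greedy CTC decoding step: merge repeated labels, then drop blanks."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out


# Toy 2 x 8 "text line" and a toy per-frame label sequence.
toy_line = [list(range(0, 8)), list(range(10, 18))]
frames = sliding_windows(toy_line, width=4, step=2)
decoded = ctc_collapse([0, 3, 3, 0, 3, 5, 5, 0])
```

The blank label is what lets CTC emit the same character twice in a row: in the toy sequence, the two runs of label 3 are separated by a blank and therefore survive as two outputs.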
A similar task is performed in [73] by introducing zoning features
(a variant of the implicit segmentation approach), which are
extracted using an n × n sliding window of uniform size. These
features are computed from the density of pixels and from statistical
features of the pixels that carry relevant information about the
character. To obtain the same number of zoning regions of the
same size, the greyscale images of Urdu text lines were normalised
to a fixed height and width (30, 50, 70, and 90).
Table 7 Distribution of articles based on classifier used
Classification technique    Ref.                                         No. of articles
neural networks             [31, 36, 47, 53–56, 68, 73], [43, 66, 75]    11
rule based on HMM           [25, 57, 61–64, 69, 76]                      8
SVM                         [9, 65]                                      2
others                      [29, 37, 70, 72, 74]                         5
For each zone $i$, two features $f_{1i}$ and $f_{2i}$ are computed. The
value of $f_{1i}$ is the ratio between the sum of all pixel intensity
values in the $i$th zone ($n_i$) and the sum of all pixel intensities
in the image ($\sum n_i$), as given in (1). Similarly, the value of
$f_{2i}$ is the ratio between the sum of all pixel intensity values in
the $i$th zone ($n_i$) and the number of pixels in the $i$th zone
($f_{\mathrm{size}}$), as described in (2)
$$ f_{1i} = \frac{n_i}{\sum n_i} \tag{1} $$
$$ f_{2i} = \frac{n_i}{f_{\mathrm{size}}} \tag{2} $$
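As a worked illustration of (1) and (2), the following sketch computes both features for every zone of a small grey-level image. It is an illustrative reading of the two formulas rather than the code of [73]; the function name `zoning_features` and the square, non-overlapping zone layout are assumptions.

```python
def zoning_features(img, zone=2):
    """Compute (f1, f2) for each zone x zone block of a grey-level image.

    f1 = zone intensity sum / whole-image intensity sum   (Eq. (1))
    f2 = zone intensity sum / number of pixels in zone    (Eq. (2))
    The image height and width are assumed to be multiples of `zone`.
    Returns a 2-D list of (f1, f2) tuples, one per zone.
    """
    h, w = len(img), len(img[0])
    total = sum(sum(row) for row in img)
    feats = []
    for top in range(0, h, zone):
        row_feats = []
        for left in range(0, w, zone):
            n_i = sum(img[y][x]
                      for y in range(top, top + zone)
                      for x in range(left, left + zone))
            f1 = n_i / total if total else 0.0  # guard against a blank image
            f2 = n_i / (zone * zone)
            row_feats.append((f1, f2))
        feats.append(row_feats)
    return feats


# 4 x 4 toy image split into four 2 x 2 zones.
toy = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 2, 2],
]
grid = zoning_features(toy, zone=2)
```

Here the top-left zone holds 4 of the 12 intensity units, so its $f_{1i}$ is 1/3 and its $f_{2i}$ (mean intensity) is 1.0; the bottom-right zone gives 2/3 and 2.0.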
The authors also conclude that the values of zoning features are
prone to distortion and noise due to the cursive property of the
Nastaliq style. Equation (3) gives the feature vector, which is built
from top-to-bottom and right-to-left scan values using the LSTM
sequence learning property
$$ FV = F_{1,1}, F_{1,2}, \ldots, F_{1,n};\; F_{2,1}, F_{2,2}, \ldots, F_{2,n} \tag{3} $$
where $FV$ is the feature vector and $F_{x,y}$ is the zoning feature of
the zone $(x, y)$. Experiments were performed on the UPTI dataset [12],
achieving a recognition rate of 93.38%.
Tables 6 and 7 depict the distribution of articles based on the
publishing year and the classification techniques discussed in this
study, respectively.
In a very recent work reported in [75], a group of researchers
proposed a novel gated BLSTM (GBLSTM) model for the recognition
of both printed and handwritten Urdu Nastaleeq text based on
ligature information. The novelty lies in using the raw pixel values
as the characteristic features instead of human-crafted features,
since the latter are more error-prone. For evaluation purposes, the
proposed model is trained and tested on un-degraded images and also
tested on novel, artificially degraded versions of a dataset of Urdu
printed and handwritten text images. The recognition accuracy of the
proposed GBLSTM model is 96.71%, i.e. higher than that of the
prevalent Urdu OCR systems.
Table 8 Distribution of articles based on the Urdu handwritten recognition tasks performed at different granular levels
Level of analysis            References                                             No. of articles
character level (alphabet)   [29, 31, 36–38, 47, 49, 50, 53], [43, 55]              11
character level (numeric)    [25, 55–57]                                            4
word level                   [9, 60–66]                                             8
ligature level               [4, 68–72]                                             6
sentence level               [54, 73, 75]                                           3
Table 8 shows the distribution of articles based on the
granularity level discussed in this study.
5 Discussion
This study reviews a number of interesting and useful works that
primarily focused on challenging issues associated with Urdu
handwritten text recognition. Table 6 depicts the year-wise
distribution of articles surveyed for this state-of-the-art review and
it is quite obvious that researchers showed more interest in Urdu
ICR during-.
Considering granularity-based Urdu handwritten text recognition,
most of the studies focused on the character and word levels, as shown
in Table 8, whereas very limited work has been done at the sentence
and ligature levels. It is evident from Table 7 that ANN is the most
widely used approach to resolve various tasks, since the ANN approach
is more efficient for problems with high-dimensional data sets than
other techniques.
5.1 Future directions and open challenges
In this section, we will discuss some open issues related to Urdu
ICR.
The data sets developed and collected from various sources are so
noisy and unstructured that preprocessing consumes a great deal of
time. One of the major challenges is the validation of the methods
proposed in the surveyed articles: researchers need access to
standard benchmark datasets and results against which to validate
their proposed methods.
Furthermore, very few attempts have been made to extract
optimised features that might help in achieving higher recognition
accuracy. Both structural and statistical pixel-based approaches
remain under-utilised for extracting characteristic features.
Therefore, various hybrids of machine learning and optimisation
techniques can be developed for feature subset selection, so that the
correct and targeted features are extracted.
Some more open issues are the recognition of ambiguous,
overlapped, and illegible words and the separation of numeric data
from the script in handwritten text.
6 Conclusion
Urdu script is categorised as a cursive and bidirectional script
derived from Arabic, which is why it shares almost the same
challenges and issues as Arabic script, but with higher complexity.
This complexity is due to the usage of diacritics, which change
letters of the Arabic alphabet set into letters of the Urdu alphabet
set, e.g. Kaaf can be misinterpreted and recognised as Gaaf because
the two have quite similar shapes. The character Gaaf is not in the
Arabic alphabet set. Similar issues are associated with the
recognition of Urdu letters that share similar shapes in both the
Urdu and Arabic character sets.
This study presents a comprehensive, state-of-the-art review
of the research published on various aspects of Urdu handwritten
text recognition from 2004 to 2019. The literature is reviewed at
four granularity levels, namely character level, word level, ligature
level, and sentence level. It has also been observed that much of the
significant progress reported in the literature concerns the
recognition of printed Urdu text rather than handwritten Urdu script.
This is due to the inherent cursive nature and different writing
styles of Urdu handwritten text. One of the main contributions of the
survey is to highlight a list of available public datasets for Urdu
handwritten text recognition. Furthermore, Urdu character recognition
in Nastaliq script is a two-level process: first, the text is
segmented into lines, and then the lines are segmented into ligatures
and isolated characters. Segmentation of ligatures into basic
character shapes is more challenging and still an open field for
researchers in this area.
We also drew some significant conclusions from the review
regarding the techniques and approaches applied. Apart from ANN
and SVM, we observed the use of ensemble learning approaches and
linear-time algorithms such as random forest, connected components,
and pixel-based structural and statistical feature extraction using
formal concept analysis. Furthermore, deep networks and their
different variations must be applied to address the challenging
issues in a disciplined way. Some of these issues are uncertainty in
recognition and the limited number of training samples. Therefore,
there must be an attempt to apply the above-mentioned approaches to
resolve these issues.
7 References
[1] Bahlmann, C.: ‘Directional features in online handwriting recognition’, Pattern Recognit., 2006, 39, (1), pp. 115–125
[2] Anwar, W., Bajwa, I.S., Ramzan, S.: ‘Design and implementation of a machine learning-based authorship identification model’, Sci. Program., 2019, 2019, pp. 98–110
[3] Jahangir, F., Anwar, W., Bajwa, U.I., et al.: ‘N-gram and gazetteer list based named entity recognition for Urdu: a scarce resourced language’. Proc. 10th Workshop on Asian Language Resources, Mumbai, India, 2012, pp. 95–104
[4] Sattar, S.A., Haque, S., Pathan, M.K., et al.: ‘Implementation challenges for Nastaliq character recognition’. Int. Multi-Topic Conf., Jamshoro, Pakistan, 2008, pp. 279–285
[5] Mori, S., Nishida, H., Yamada, H.: ‘Optical character recognition’ (John Wiley & Sons Inc., USA, 1999)
[6] Herbert, H.F.: ‘The history of OCR, optical character recognition’ (Recognition Technologies Users Association, Manchester Center, VT, 1982)
[7] Khan, N.H., Adnan, A., Basar, S.: ‘An analysis of off-line and on-line approaches in Urdu character recognition’. Proc. 15th Int. Conf. on Artificial Intelligence, Knowledge Engineering and Data Bases (AIKED'16), Venice, Italy, 2016, pp. 29–31
[8] Jan, Z., Shabir, M., Khan, M.A., et al.: ‘Online Urdu handwriting recognition system using geometric invariant features’, Nucleus, 2016, 53, (2), pp. 89–98
[9] Sagheer, M.W., He, C.L., Nobile, N., et al.: ‘Holistic Urdu handwritten word recognition using support vector machine’. 2010 20th Int. Conf. on Pattern Recognition, Istanbul, Turkey, 2010, pp-
[10] Wali, A., Hussain, S.: ‘Context sensitive shape-substitution in Nastaliq writing system: analysis and formulation’. Innovations and Advanced Techniques in Computer and Information Sciences and Engineering, Bridgeport, CT, USA, 2007, pp. 53–58
[11] Akram, Q.U.A., Hussain, S.: ‘Ligature-based font size independent OCR for Noori Nastalique writing style’. 2017 1st Int. Workshop on Arabic Script Analysis and Recognition (ASAR), Nancy, France, 2017, pp. 129–133
[12] Sabbour, N., Shafait, F.: ‘A segmentation-free approach to Arabic and Urdu OCR’ in ‘Document recognition and retrieval XX’, vol. 8658 (International Society for Optics and Photonics, USA, 2013), p. 86580N
[13] Baldominos, A., Saez, Y., Isasi, P.: ‘A survey of handwritten character recognition with MNIST and EMNIST’, Appl. Sci., 2019, 9, (15), p. 3169
[14] Pal, U., Jayadevan, R., Sharma, N.: ‘Handwriting recognition in Indian regional scripts: a survey of offline techniques’, ACM Lang. Inf. Process. (TALIP), 2012, 11, (1), p. 1
[15] Jameel, M., Kumar, S., Karim, A.: ‘A review on recognition of handwritten Urdu characters using neural networks’, Int. J. Adv. Res. Comput. Sci., 2017, 8, (9), pp. 727–730
[16] Pal, U., Sarkar, A.: ‘Recognition of printed Urdu script’. 2003 Proc. Seventh Int. Conf. on Document Analysis and Recognition, Edinburgh, UK, 2003, pp-
[17] Shamsher, I., Ahmad, Z., Orakzai, J.K., et al.: ‘OCR for printed Urdu script using feed forward neural network’, Proc. World Acad. Sci. Eng. Technol., 2007, 23, pp. 172–175
[18] Ahmad, Z., Orakzai, J.K., Shamsher, I., et al.: ‘Urdu Nastaleeq optical character recognition’, Proc. World Acad. Sci. Eng. Technol., 2007, 26, pp. 249–252
[19] Javed, S.T., Hussain, S., Maqbool, A., et al.: ‘Segmentation free Nastalique Urdu OCR’, World Acad. Sci. Eng. Technol., 2010, 46, pp. 456–461
[20] Simons, G.F., Fennig, C.D.: ‘Ethnologue: languages of Asia’ (Sil International, USA, 2017)
[21] Rahman, T.: ‘Language and politics in Pakistan’, Language, 1998, 133, p. 9
[22] Mahboob, A., Jain, R.: ‘Bilingual education in India and Pakistan’ in ‘Bilingual and multilingual education’ (Springer, Cham, USA, 2016), pp. 1–14
[23] Razzak, M.I.: ‘Online Urdu character recognition in unconstrained environment’. PhD thesis, International Islamic University, Islamabad, 2011
[24] Lehal, G.S.: ‘Choice of recognizable units for Urdu OCR’. Proc. Workshop on document analysis and recognition, Mumbai, India, 2012, pp. 79–85
[25] Razzak, M.I., Hussain, S.A., Sher, M.: ‘Numeral recognition for Urdu script in unconstrained environment’. 2009 Int. Conf. on Emerging Technologies, Islamabad, Pakistan, 2009, pp. 44–47
[26] De-Stefano, C., Fontanella, F., Marcelli, A., et al.: ‘Rejecting both segmentation and classification errors in handwritten form processing’. 2014 14th Int. Conf. on Frontiers in Handwriting Recognition, Heraklion, Greece, 2014, pp. 569–574
[27] dos Santos, R.P., Clemente, G.S., Ren, T.I., et al.: ‘Text line segmentation based on morphology and histogram projection’. 2009 10th Int. Conf. on Document Analysis and Recognition, Barcelona, Spain, 2009, pp. 651–655
[28] Marinai, S., Nesi, P.: ‘Projection based segmentation of musical sheets’. Proc. of the Fifth Int. Conf. on Document Analysis and Recognition (ICDAR'99) (Cat. No. PR00318), Bangalore, India, 1999, pp. 515–518
[29] Ali, A., Ahmad, M., Rafiq, N., et al.: ‘Language independent optical character recognition for hand written text’. Proc. 8th Int. Multitopic Conf. 2004 (INMIC 2004), Lahore, Pakistan, 2004, pp. 79–84
[30] Malik, S.A., Maqsood, M., Aadil, F., et al.: ‘An efficient segmentation technique for Urdu optical character recognizer (OCR)’. Future of Information and Communication Conf., San Francisco, CA, USA, 2019, pp. 131–141
[31] Ahmed, S.B., Naz, S., Swati, S., et al.: ‘Handwritten Urdu character recognition using one-dimensional BLSTM classifier’, Neural Comput. Appl., 2019, 31, (4), pp-
[32] Sagheer, M.W., He, C.L., Nobile, N., et al.: ‘A new large Urdu database for off-line handwriting recognition’. Int. Conf. on Image Analysis and Processing, Vietri sul Mare, Italy, 2009, pp. 538–546
[33] Choudhary, P., Nain, N.: ‘Calam: linguistic structure to annotate handwritten text image corpus’, ‘Computational intelligence in data mining’, vol. 2 (Springer, India, 2015), pp. 449–460
[34] Alghazo, J.M., Latif, G., Alzubaidi, L., et al.: ‘Multi-language handwritten digits recognition based on novel structural features’, J. Imaging Sci. Technol., 2019, 63, (2), pp-
[35] Van Gerven, M., Bohte, S.: ‘Artificial neural networks as models of neural information processing’, Front. Comput. Neurosci., 2017, 11, p. 114
[36] Haider, I., Khan, K.U.: ‘Online recognition of single stroke handwritten Urdu characters’. 2009 IEEE 13th Int. Multitopic Conf., Islamabad, Pakistan, 2009, pp. 1–6
[37] Shahzad, N., Paulson, B., Hammond, T.: ‘Urdu qaeda: recognition system for isolated Urdu characters’. Proc. UI Workshop on Sketch Recognition, Sanibel Island, Florida, 2009
[38] Safdar, Q.-T.-A., Khan, K.U.: ‘Online Urdu handwritten character recognition: initial half form single stroke characters’. 2014 12th Int. Conf. on Frontiers of Information Technology, Islamabad, Pakistan, 2014, pp. 292–297
[39] Meurant, G.: ‘Wavelets: a tutorial in theory and applications’, vol. 2 (Academic Press, USA, 2012)
[40] Yelampalli, P.K.R., Nayak, J., Gaidhane, V.H.: ‘Daubechies wavelet-based local feature descriptor for multimodal medical image registration’, IET Image Process., 2018, 12, (10), pp-
[41] Daubechies, I.: ‘The wavelet transform, time-frequency localization and signal analysis’, IEEE Trans. Inf. Theory, 1990, 36, (5), pp-
[42] Daubechies, I.: ‘Ten lectures on wavelets’, vol. 61 (SIAM, USA, 1992)
[43] Safdar, A., Ullah Khan, K., Peng, L., et al.: ‘A novel similar character discrimination method for online handwritten Urdu character recognition in half forms’, Sci. Iranica, 2018, to appear
[44] Kalchbrenner, N., Danihelka, I., Graves, A.: ‘Grid long short-term memory’, arXiv preprint arXiv:-, 2015
[45] Hochreiter, S., Schmidhuber, J.: ‘Long short-term memory’, Neural Comput., 1997, 9, (8), pp-
[46] Sak, H., Senior, A., Beaufays, F.: ‘Long short-term memory recurrent neural network architectures for large scale acoustic modeling’. Fifteenth Annual Conf. of the Int. Speech Communication Association, Singapore, Singapore, 2014
[47] Naz, S., Umar, A.I., Ahmed, R., et al.: ‘Urdu Nasta'liq text recognition using implicit segmentation based on multi-dimensional long short term memory neural networks’, SpringerPlus, 2016, 5, (1), p. 2010
[48] Yujian, L., Bo, L.: ‘A normalized Levenshtein distance metric’, IEEE Trans. Pattern Anal. Mach. Intell., 2007, 29, (6), pp-
[49] Ahmed, S.B., Naz, S., Razzak, M.I., et al.: ‘Evaluation of cursive and non-cursive scripts using recurrent neural networks’, Neural Comput. Appl., 2016, 27, (3), pp. 603–613
[50] Ul-Hasan, A., Ahmed, S.B., Rashid, F., et al.: ‘Offline printed Urdu Nastaleeq script recognition with bidirectional LSTM networks’. 2013 12th Int. Conf. on Document Analysis and Recognition, Washington, DC, USA, 2013, pp-
[51] Naz, S., Umar, A.I., Shirazi, S.H., et al.: ‘Segmentation techniques for recognition of Arabic-like scripts: a comprehensive survey’, Educ. Inf. Technol., 2016, 21, (5), pp-
[52] Gosselin, B.: ‘Multilayer perceptrons combination applied to handwritten character recognition’, Neural Process. Lett., 1996, 3, (1), pp. 3–10
[53] Naz, S., Umar, A.I., Ahmad, R., et al.: ‘Urdu Nasta'liq text recognition system based on multi-dimensional recurrent neural network and statistical features’, Neural Comput. Appl., 2017, 28, (2), pp. 219–231
[54] Naz, S., Umar, A.I., Ahmad, R., et al.: ‘Offline cursive Urdu-Nastaliq script recognition using multidimensional recurrent neural networks’, Neurocomputing, 2016, 177, pp. 228–241
[55] Husnain, M., Saad Missen, M.M., Mumtaz, S., et al.: ‘Recognition of Urdu handwritten characters using convolutional neural network’, Appl. Sci., 2019, 9, (13), p. 2758
[56] Borse, R., Ansari, I.: ‘Offline handwritten and printed Urdu digits recognition using Daubechies wavelet’ (ER Publication, New Delhi, India, 2015)
[57] Razzak, M.I., Hussain, S.A., Belaïd, A., et al.: ‘Multi-font numerals recognition for Urdu script based languages’ (International Journal of Recent Trends in Engineering (IJRTE), Academy publisher, Inria-, 2009)
[58] Davis, L.S.: ‘A survey of edge detection techniques’, Comput. Graph. Image Process., 1975, 4, (3), pp. 248–270
[59] Young, I.T., Van Vliet, L.J.: ‘Recursive implementation of the Gaussian filter’, Signal Process., 1995, 44, (2), pp. 139–151
[60] Naz, S., Umar, A.I., Shirazi, S.H., et al.: ‘Challenges of Urdu named entity recognition: a scarce resourced language’, Res. J. Appl. Sci., Eng. Technol., 2014, 8, (10), pp-
[61] Malik, S., Khan, S.A.: ‘Urdu online handwriting recognition’. 2005 Proc. IEEE Symp. on Emerging Technologies, Islamabad, Pakistan, 2005, pp. 27–31
[62] Singh, U.P., Goyal, V., Lehal, G.S.: ‘Named entity recognition system for Urdu’. Proc. COLING 2012, Mumbai, India, 2012, pp-
[63] Riaz, K.: ‘Rule-based named entity recognition in Urdu’. Proc. 2010 Named Entities Workshop, Stroudsburg, PA, USA, 2010, pp. 126–135
[64] Mukund, S., Srihari, R., Peterson, E.: ‘An information-extraction system for Urdu - a resource-poor language’, ACM Trans. Asian Lang. Inf. Process. (TALIP), 2010, 9, (4), p. 15
[65] Ekbal, A., Bandyopadhyay, S.: ‘Named entity recognition using support vector machine: a language independent approach’, Int. J. Electr. Comput. Syst. Eng., 2010, 4, (2), pp. 155–170
[66] Anwar, F.: ‘Online Urdu handwritten text recognition for mobile devices using intelligent techniques’. PhD thesis, International Islamic University, Islamabad, 2019
[67] De Stefano, C., Marcelli, A., Parziale, A., et al.: ‘Reading cursive handwriting’. 2010 12th Int. Conf. on Frontiers in Handwriting Recognition, Kolkata, India, 2010, pp. 95–100
[68] Husain, S.A., Sajjad, A., Anwar, F.: ‘Online Urdu character recognition system’. Machine Vision Applications (MVA), Tokyo, Japan, 2007, pp. 98–101
[69] Razzak, M.I., Anwar, F., Husain, S.A., et al.: ‘HMM and fuzzy logic: a hybrid approach for online Urdu script-based languages’ character recognition’, Knowl.-Based Syst., 2010, 23, (8), pp. 914–923
[70] Panwar, S., Ahamed, M., Nain, N.: ‘Ligature segmentation approach for Urdu handwritten text documents’. Proc. 2014 Int. Conf. on Information and Communication Technology for Competitive Strategies, Rajasthan, India, 2014, p. 1
[71] Din, I.U., Siddiqi, I., Khalid, S., et al.: ‘Segmentation-free optical character recognition for printed Urdu text’, EURASIP J. Image Video Process., 2017, 2017, (1), p. 62
[72] Khan, N.H., Adnan, A., Basar, S.: ‘Urdu ligature recognition using multi-level agglomerative hierarchical clustering’, Cluster Comput., 2018, 21, (1), pp. 503–514
[73] Naz, S., Ahmed, S.B., Ahmad, R., et al.: ‘Zoning features and 2DLSTM for Urdu text-line recognition’, Proc. Comput. Sci., 2016, 96, pp. 16–22
[74] Sattar, S.A., Haque, S., Pathan, M.K.: ‘Nastaliq optical character recognition’, Cluster Comput., 2008, pp. 329–331
[75] Ahmad, I., Wang, X., Mao, Y.H., et al.: ‘Ligature based Urdu Nastaleeq sentence recognition using gated bidirectional long short term memory’, Cluster Comput., 2018, 21, (1), pp. 703–714
[76] Choudhary, P., Nain, N.: ‘A four-tier annotated Urdu handwritten text image dataset for multidisciplinary research on Urdu script’, ACM Trans. Asian Low-Resource Lang. Inf. Process. (TALLIP), 2016, 15, (4), p. 26